Diamonds Dataset Visualization

Regression on the classic diamonds dataset.

  • Target distribution
  • Continuous Feature vs Target, F=9.63E-01, F=1.72E-01, F=1.00E-02
  • Categorical Feature vs Target, F=3.61E-01, F=2.83E-01, F=1.04E-01
/home/circleci/project/~/miniconda/envs/testenv/lib/python3.11/site-packages/sklearn/datasets/_openml.py:322: UserWarning: Multiple active versions of the dataset matching the name diamonds exist. Versions may be fundamentally different, returning version 1. Available versions:
- version 1, status: active
  url: https://www.openml.org/search?type=data&id=42225
- version 2, status: active
  url: https://www.openml.org/search?type=data&id=43998

  warn(warning_msg)
Target looks like regression
/home/circleci/project/dabl/plot/supervised.py:107: UserWarning: Not plotting highly correlated (0.9961166041570525) feature carat. Set prune_correlations_threshold=0 to keep.
  warn(f"Not plotting highly correlated ({corr.max()})"
/home/circleci/project/dabl/plot/supervised.py:107: UserWarning: Not plotting highly correlated (0.9978949275849379) feature y. Set prune_correlations_threshold=0 to keep.
  warn(f"Not plotting highly correlated ({corr.max()})"
/home/circleci/project/dabl/plot/supervised.py:107: UserWarning: Not plotting highly correlated (0.9873553172140505) feature z. Set prune_correlations_threshold=0 to keep.
  warn(f"Not plotting highly correlated ({corr.max()})"
/home/circleci/project/dabl/plot/supervised.py:214: FutureWarning: The default of observed=False is deprecated and will be changed to True in a future version of pandas. Pass observed=False to retain current behavior or observed=True to adopt the future default and silence this warning.
  medians = X_new.groupby(col)[target_col].median()
/home/circleci/project/~/miniconda/envs/testenv/lib/python3.11/site-packages/seaborn/categorical.py:641: FutureWarning: The default of observed=False is deprecated and will be changed to True in a future version of pandas. Pass observed=False to retain current behavior or observed=True to adopt the future default and silence this warning.
  grouped_vals = vals.groupby(grouper)
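The FutureWarning in the output concerns the `observed` argument of pandas `groupby` on categorical columns. A minimal sketch of the difference follows, using a small made-up frame (the column names and values are illustrative assumptions, not taken from the diamonds data):

```python
import pandas as pd

# Hypothetical toy data: the category "Fair" is declared but never occurs.
df = pd.DataFrame({
    "cut": pd.Categorical(
        ["Ideal", "Good", "Ideal", "Good"],
        categories=["Fair", "Good", "Ideal"],
    ),
    "price": [400.0, 300.0, 600.0, 500.0],
})

# observed=False keeps every declared category, including empty ones (NaN median).
medians_all = df.groupby("cut", observed=False)["price"].median()

# observed=True keeps only the categories that actually appear in the data.
medians_seen = df.groupby("cut", observed=True)["price"].median()

print(medians_all)
print(medians_seen)
```

Passing `observed` explicitly, whichever value is intended, is what silences the warning.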

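# sphinx_gallery_thumbnail_number = 2
import matplotlib.pyplot as plt
from sklearn.datasets import fetch_openml
from dabl import plot

# Fetch the diamonds data from OpenML as a DataFrame (X) and a target Series (y).
X, y = fetch_openml('diamonds', as_frame=True, return_X_y=True)

# Plot the target distribution and each feature against the target.
plot(X, y)
plt.show()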
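The output reports that `carat`, `y` and `z` were not plotted because they are highly correlated with another feature. The sketch below illustrates the general idea of correlation pruning on synthetic data. It is not dabl's implementation: the helper `prune_correlated`, its default threshold and the column names are hypothetical, and treating a threshold of 0 as "keep everything" is an assumption modelled on the hint in the warning text.

```python
import numpy as np
import pandas as pd


def prune_correlated(frame, threshold=0.9):
    """Drop columns whose absolute correlation with an earlier kept column
    exceeds `threshold` (hypothetical helper, not part of dabl)."""
    if threshold == 0:
        # Assumption: 0 disables pruning, as the warning's hint suggests.
        return frame.copy()
    corr = frame.corr().abs()
    kept = []
    for col in frame.columns:
        # Keep a column only if it is not too correlated with any kept one.
        if all(corr.loc[col, k] <= threshold for k in kept):
            kept.append(col)
    return frame[kept]


# Synthetic data: "a_scaled" is nearly a linear copy of "a"; "b" is independent.
rng = np.random.default_rng(0)
a = rng.normal(size=200)
demo = pd.DataFrame({
    "a": a,
    "a_scaled": 2.0 * a + 0.01 * rng.normal(size=200),
    "b": rng.normal(size=200),
})

pruned = prune_correlated(demo)
print(list(pruned.columns))
```

In the example script itself, the warning indicates that `plot(X, y, prune_correlations_threshold=0)` would keep the pruned features in the plots.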

Total running time of the script: (0 minutes 13.201 seconds)
