Diamonds Dataset Visualization

Regression on the classic diamonds dataset.

  • Target distribution
  • Continuous Feature vs Target, F=9.63E-01, F=1.72E-01, F=1.00E-02
  • Categorical Feature vs Target, F=3.61E-01, F=2.84E-01, F=1.02E-01
/home/circleci/project/~/miniconda/envs/testenv/lib/python3.10/site-packages/sklearn/datasets/_openml.py:292: UserWarning: Multiple active versions of the dataset matching the name diamonds exist. Versions may be fundamentally different, returning version 1.
  warn(
/home/circleci/project/~/miniconda/envs/testenv/lib/python3.10/site-packages/sklearn/datasets/_openml.py:932: FutureWarning: The default value of `parser` will change from `'liac-arff'` to `'auto'` in 1.4. You can set `parser='auto'` to silence this warning. Therefore, an `ImportError` will be raised from 1.4 if the dataset is dense and pandas is not installed. Note that the pandas parser may return different data types. See the Notes Section in fetch_openml's API doc for details.
  warn(
Target looks like regression
/home/circleci/project/dabl/plot/supervised.py:107: UserWarning: Not plotting highly correlated (0.9961166041570525) feature carat. Set prune_correlations_threshold=0 to keep.
  warn(f"Not plotting highly correlated ({corr.max()})"
/home/circleci/project/dabl/plot/supervised.py:107: UserWarning: Not plotting highly correlated (0.9978949275849379) feature y. Set prune_correlations_threshold=0 to keep.
  warn(f"Not plotting highly correlated ({corr.max()})"
/home/circleci/project/dabl/plot/supervised.py:107: UserWarning: Not plotting highly correlated (0.9873553172140505) feature z. Set prune_correlations_threshold=0 to keep.
  warn(f"Not plotting highly correlated ({corr.max()})"

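# sphinx_gallery_thumbnail_number = 2
import matplotlib.pyplot as plt
from sklearn.datasets import fetch_openml
from dabl import plot

# Fetch the diamonds data from OpenML as a pandas DataFrame (X) and target Series (y).
X, y = fetch_openml('diamonds', as_frame=True, return_X_y=True)

# dabl detects the regression target and plots its distribution along with
# the continuous and categorical features against it.
plot(X, y)
plt.show()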
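
The warnings in the output report that carat, y and z were not plotted because each is highly correlated with another feature. The following is a rough, standalone sketch of that kind of check on synthetic data. It is not dabl's implementation: the helper names `pearson` and `flag_correlated` are hypothetical, the toy columns are randomly generated stand-ins rather than the real diamonds data, and Pearson correlation is used only for simplicity (dabl's actual correlation measure and pruning order may differ). The 0.95 threshold is likewise an assumption chosen for illustration.

```python
import math
import random


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


def flag_correlated(columns, threshold=0.95):
    """Hypothetical helper (not part of dabl).

    Walk the columns in order and flag any column whose absolute
    correlation with an already kept column exceeds the threshold.
    Returns {column name: max absolute correlation}.
    """
    kept, flagged = [], {}
    for name, values in columns.items():
        max_corr = max(
            (abs(pearson(values, columns[k])) for k in kept), default=0.0
        )
        if max_corr > threshold:
            flagged[name] = max_corr
        else:
            kept.append(name)
    return flagged


# Synthetic stand-in data: "y" is a noisy copy of "x", "depth" is independent.
rng = random.Random(0)
x = [rng.gauss(5.0, 1.0) for _ in range(500)]
toy = {
    "x": x,
    "y": [v + rng.gauss(0.0, 0.05) for v in x],
    "depth": [rng.gauss(60.0, 1.5) for _ in range(500)],
}

flagged = flag_correlated(toy)
for name, corr in flagged.items():
    print(f"Not plotting highly correlated ({corr:.4f}) feature {name}")
```

On this toy data only the near-duplicate column `y` should be flagged, while `x` and `depth` are kept. Raising the threshold to 1.0 in this sketch keeps every column.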

Total running time of the script: (0 minutes 11.616 seconds)
