mfeat-factors dataset visualization

A multiclass dataset with 10 classes. Linear discriminant analysis works surprisingly well!

  • Figure 1: Target distribution
  • Figure 2: top 10 continuous features; panel titles: F=1.06E+03, F=7.18E+02, F=7.16E+02, F=6.95E+02, F=6.64E+02, F=6.60E+02, F=6.58E+02, F=6.29E+02, F=6.12E+02, F=5.59E+02
  • Figure 3: Top feature interactions; panel titles: 0.540, 0.538, 0.535, 0.528
  • Figure 4: Discriminating PCA directions; panel titles: 0.565, 0.562, 0.518; plus a scree plot (PCA explained variance)
  • Figure 5: Discriminating LDA directions; panel titles: 0.763, 0.735, 0.720, 0.720
/home/circleci/project/~/miniconda/envs/testenv/lib/python3.10/site-packages/sklearn/datasets/_openml.py:292: UserWarning: Multiple active versions of the dataset matching the name mfeat-factors exist. Versions may be fundamentally different, returning version 1.
  warn(
/home/circleci/project/~/miniconda/envs/testenv/lib/python3.10/site-packages/sklearn/datasets/_openml.py:932: FutureWarning: The default value of `parser` will change from `'liac-arff'` to `'auto'` in 1.4. You can set `parser='auto'` to silence this warning. Therefore, an `ImportError` will be raised from 1.4 if the dataset is dense and pandas is not installed. Note that the pandas parser may return different data types. See the Notes Section in fetch_openml's API doc for details.
  warn(
Target looks like classification
Showing only top 10 of 216 continuous features
Linear Discriminant Analysis training set score: 0.993

# sphinx_gallery_thumbnail_number = 5
import matplotlib.pyplot as plt
from sklearn.datasets import fetch_openml
from dabl import plot

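# Fetch mfeat-factors from OpenML as a pandas DataFrame (X) and Series (y).
X, y = fetch_openml('mfeat-factors', as_frame=True, return_X_y=True)

# Let dabl choose and draw suitable plots for this classification target.
plot(X, y)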
plt.show()

Total running time of the script: (0 minutes 13.546 seconds)
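
The "Linear Discriminant Analysis training set score" reported in the output is an accuracy measured on the same data the model was fit on, so it can be optimistic. The sketch below shows one way such a score could be reproduced with scikit-learn and compared against a cross-validated estimate. It is not dabl's internal code: the helper name `lda_scores` is hypothetical, and the bundled `load_digits` data is used as an offline stand-in for mfeat-factors, so the numbers will differ from the 0.993 shown above.

```python
from sklearn.datasets import load_digits
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score


def lda_scores(X, y, cv=5):
    """Return (training-set accuracy, mean cross-validated accuracy) for LDA.

    Hypothetical helper for illustration; not part of dabl or scikit-learn.
    """
    # Fit on all samples, then score on those same samples (training-set score).
    train_score = LinearDiscriminantAnalysis().fit(X, y).score(X, y)
    # Refit on each training fold and score on the held-out fold.
    cv_score = cross_val_score(LinearDiscriminantAnalysis(), X, y, cv=cv).mean()
    return train_score, cv_score


# Stand-in data (assumption): scikit-learn's bundled digits, NOT mfeat-factors.
X_demo, y_demo = load_digits(return_X_y=True)
train_score, cv_score = lda_scores(X_demo, y_demo)
print(f"training: {train_score:.3f}, cross-validated: {cv_score:.3f}")
```

To run the same comparison on the dataset from this example, pass the `X` and `y` returned by `fetch_openml` to `lda_scores` instead of the stand-in data.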
