
class dabl.AnyClassifier(n_jobs=None, min_resources='exhaust', verbose=0, type_hints=None, portfolio='baseline')

Classifier with automatic model selection.

This model uses successive halving on a portfolio of complex models (HistGradientBoosting, RandomForest, SVC, LogisticRegression) to pick the best model family and hyper-parameters.

AnyClassifier internally applies EasyPreprocessor, so no preprocessing is necessary.
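The successive-halving idea described above can be sketched in plain Python. This is a minimal illustration of the general technique, not dabl's or scikit-learn's implementation: the function `successive_halving`, the scorer `toy_score`, the candidate names, and their quality values are all hypothetical.

```python
def successive_halving(candidates, evaluate, min_resources, max_resources, factor=3):
    """Keep the best 1/factor of the candidates each round while growing the budget."""
    survivors = list(candidates)
    resources = min_resources
    while len(survivors) > 1:
        budget = min(resources, max_resources)
        # Score every surviving candidate on the current budget (higher is better).
        ranked = sorted(survivors, key=lambda c: evaluate(c, budget), reverse=True)
        # Keep the top 1/factor, rounding up so at least one candidate survives.
        n_keep = max(1, -(-len(ranked) // factor))
        survivors = ranked[:n_keep]
        resources *= factor
    return survivors[0]


def toy_score(candidate, budget):
    # Hypothetical scorer: a made-up quality value, penalized less as the budget grows.
    return candidate["quality"] - 1.0 / budget


# Toy portfolio with invented quality values, for illustration only.
portfolio = [
    {"name": "model_a", "quality": 0.91},
    {"name": "model_b", "quality": 0.89},
    {"name": "model_c", "quality": 0.85},
    {"name": "model_d", "quality": 0.80},
]
best = successive_halving(portfolio, toy_score, min_resources=20, max_resources=540)
print(best["name"])
```

Weak candidates are discarded after cheap evaluations, so most of the budget is spent on the few candidates that survive to the later rounds.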

n_jobs : int, default=None

Number of processes to spawn for parallelizing the search.

min_resources : {‘exhaust’, ‘smallest’} or int, default=‘exhaust’

The minimum amount of resource that any candidate is allowed to use for a given iteration. Equivalently, this defines the amount of resources r0 that are allocated for each candidate at the first iteration. See the documentation of HalvingGridSearchCV for more information.
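To make the role of r0 concrete, the sketch below lists the per-candidate budget at each iteration, assuming the budget is multiplied by a fixed factor (3 here, an assumption of this sketch) until it would exceed the maximum available resources. The helper `resource_schedule` is hypothetical and ignores the additional limit that the number of candidates places on the number of iterations in HalvingGridSearchCV.

```python
def resource_schedule(min_resources, max_resources, factor=3):
    """List the budgets r_i = r0 * factor**i that fit within max_resources."""
    schedule = []
    resources = min_resources
    while resources <= max_resources:
        schedule.append(resources)
        resources *= factor
    return schedule


# With r0 = 20 samples and 1000 samples available:
print(resource_schedule(min_resources=20, max_resources=1000))
# → [20, 60, 180, 540]
```

A larger r0 gives each candidate a more reliable first evaluation but leaves room for fewer iterations before the resources run out.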

verbose : int, default=0

Verbosity. Higher means more output.

type_hints : dict or None

If dict, provide type information for columns. Keys are column names, values are types as provided by detect_types.

portfolio : str, default=‘baseline’

Lets you choose a portfolio. Choose ‘baseline’ for multiple classifiers with default parameters, ‘hgb’ for high-performing HistGradientBoostingClassifiers, ‘svc’ for high-performing support vector classifiers, ‘rf’ for high-performing random forest classifiers, ‘lr’ for high-performing logistic regression classifiers, ‘mixed’ for a portfolio of different high-performing classifiers.

search_ : HalvingGridSearchCV instance

Fitted HalvingGridSearchCV instance for inspection.

est_ : sklearn estimator

Best estimator (pipeline) found during search.

__init__(n_jobs=None, min_resources='exhaust', verbose=0, type_hints=None, portfolio='baseline')

Initialize self. See help(type(self)) for accurate signature.

fit(X, y=None, *, target_col=None)

Fit estimator.

Specify the target either as a separate 1d array or Series y (in scikit-learn fashion) or as a column of the dataframe X named by target_col. If y is specified, X is assumed not to contain the target.


X : DataFrame

Input features. If target_col is specified, X also includes the target.

y : Series or numpy array, optional

Target. You need to specify either y or target_col.

target_col : str or int, optional

Column name of target if included in X.
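The two ways of passing the target can be illustrated with a small stand-alone helper. This is a sketch of the documented contract only, not dabl's code: `split_target` is a hypothetical function, rows are plain dicts rather than a DataFrame, and raising an error when both y and target_col are given is an assumption of the sketch.

```python
def split_target(rows, y=None, target_col=None):
    """Return (features, target) from rows given as a list of dicts."""
    # Assumption of this sketch: exactly one of y / target_col must be given.
    if (y is None) == (target_col is None):
        raise ValueError("Specify exactly one of y or target_col.")
    if y is not None:
        # y was passed separately, so rows are assumed to hold features only.
        return rows, list(y)
    # The target lives inside the rows: separate it from the features.
    features = [{k: v for k, v in row.items() if k != target_col} for row in rows]
    target = [row[target_col] for row in rows]
    return features, target


rows = [{"age": 31, "label": "yes"}, {"age": 47, "label": "no"}]
features, target = split_target(rows, target_col="label")
print(features, target)
```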


get_params(deep=True)

Get parameters for this estimator.

deep : bool, default=True

If True, will return the parameters for this estimator and contained subobjects that are estimators.


params : dict

Parameter names mapped to their values.

score(X, y, sample_weight=None)

Return the mean accuracy on the given test data and labels.

In multi-label classification, this is the subset accuracy, which is a harsh metric because each sample's entire label set must be predicted correctly.

X : array-like of shape (n_samples, n_features)

Test samples.

y : array-like of shape (n_samples,) or (n_samples, n_outputs)

True labels for X.

sample_weight : array-like of shape (n_samples,), default=None

Sample weights.


score : float

Mean accuracy of self.predict(X) with respect to y.
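As a rough illustration of the metric, the unweighted case can be computed in a few lines of plain Python. The helper `mean_accuracy` is a hypothetical stand-in, not the library's implementation, and it ignores sample_weight. In the multi-label case each element is a whole label set, so a sample counts as correct only when every label matches.

```python
def mean_accuracy(y_true, y_pred):
    """Fraction of samples whose prediction equals the true label (or label set) exactly."""
    matches = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return matches / len(y_true)


# Single-label: three of four predictions are correct.
single = mean_accuracy(["a", "b", "a", "b"], ["a", "b", "b", "b"])

# Multi-label: the second sample gets one of its two labels right,
# but still counts as wrong under subset accuracy.
multi = mean_accuracy([[1, 0], [0, 1], [1, 1]], [[1, 0], [0, 0], [1, 1]])

print(single, multi)
```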


set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.


**params : dict

Estimator parameters.

self : estimator instance

Estimator instance.
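The `<component>__<parameter>` convention can be sketched with a toy class. `ToyEstimator` is a hypothetical stand-in written for this illustration, not scikit-learn's BaseEstimator; it only shows how a double underscore can route a value to a nested object and why the method returns the instance itself.

```python
class ToyEstimator:
    """Hypothetical stand-in for an estimator that may hold nested estimators."""

    def __init__(self, **params):
        self.params = dict(params)

    def set_params(self, **params):
        for key, value in params.items():
            name, sep, sub_key = key.partition("__")
            if sep:
                # "<component>__<parameter>": delegate to the nested object.
                self.params[name].set_params(**{sub_key: value})
            else:
                # Plain name: set the parameter on this object.
                self.params[name] = value
        # Returning self allows calls to be chained.
        return self


pipeline = ToyEstimator(scaler=ToyEstimator(with_mean=True), clf=ToyEstimator(C=1.0))
result = pipeline.set_params(clf__C=10.0, verbose=1)
print(pipeline.params["clf"].params["C"], pipeline.params["verbose"])
```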