dabl.SimpleClassifier

class dabl.SimpleClassifier(refit=True, random_state=None, verbose=1, type_hints=None, shuffle=True)

Automagic anytime classifier. The best estimator found during fitting is stored in est_.

Parameters
refit : boolean, default=True

Whether to refit the model on the full dataset.

random_state : random state, int or None, default=None

Random state or seed.

verbose : integer, default=1

Verbosity (higher is more output).

type_hints : dict or None, default=None

If dict, provide type information for columns. Keys are column names, values are types as provided by detect_types.

shuffle : boolean, default=True

Whether to shuffle the training set in cross-validation.

Attributes
est_ : sklearn estimator

Best estimator found.

fit(X, y=None, *, target_col=None)

Fit classifier.

The target must be specified either as a separate 1d array or Series y (in scikit-learn fashion) or as a column of the dataframe X named by target_col. If y is specified, X is assumed not to contain the target.

Parameters
X : DataFrame

Input features. If target_col is specified, X also includes the target.

y : Series or numpy array, optional

Target class labels. You need to specify either y or target_col.

target_col : string or int, optional

Column name of target if included in X.
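
The two calling conventions described above can be sketched with a small helper. This is an illustration only: resolve_target is a hypothetical function, not part of dabl, and the rule that exactly one of y or target_col must be given is an assumption drawn from the parameter descriptions. The column names are made up for the example.

```python
import pandas as pd


def resolve_target(X, y=None, target_col=None):
    """Return (features, target) from either calling convention.

    Hypothetical helper for illustration; not dabl's implementation.
    """
    # Assumption: exactly one of y / target_col must be supplied.
    if (y is None) == (target_col is None):
        raise ValueError("Specify either y or target_col.")
    if target_col is not None:
        # Target is a column of X: split it off so the features exclude it.
        return X.drop(columns=[target_col]), X[target_col]
    # Target passed separately: X is assumed not to contain it.
    return X, pd.Series(y)


df = pd.DataFrame({
    "height": [1.2, 3.4, 2.2, 0.9],
    "color": ["red", "blue", "red", "blue"],
    "label": ["a", "b", "a", "b"],
})

# Convention 1: the target is a column of X, named by target_col.
X_a, y_a = resolve_target(df, target_col="label")

# Convention 2: the target is a separate Series (scikit-learn fashion).
X_b, y_b = resolve_target(df.drop(columns=["label"]), y=df["label"])
```

With dabl installed, the corresponding calls would be SimpleClassifier().fit(df, target_col="label") and SimpleClassifier().fit(X, y), following the signature above.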

get_params(deep=True)

Get parameters for this estimator.

Parameters
deep : bool, default=True

If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns
params : mapping of string to any

Parameter names mapped to their values.
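
The effect of deep is easiest to see on an estimator that contains other estimators. The sketch below uses a scikit-learn Pipeline rather than SimpleClassifier itself, on the assumption that get_params follows the standard scikit-learn estimator interface; the step names "scale" and "clf" are arbitrary choices for the example.

```python
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# A nested estimator: the pipeline contains two sub-estimators.
pipe = Pipeline([
    ("scale", StandardScaler()),
    ("clf", LogisticRegression(C=2.0)),
])

# deep=False: only the pipeline's own constructor parameters.
shallow = pipe.get_params(deep=False)

# deep=True: also the parameters of contained estimators,
# keyed as <component>__<parameter>.
deep = pipe.get_params(deep=True)

print(sorted(shallow))
print(deep["clf__C"])
```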

score(X, y, sample_weight=None)

Return the mean accuracy on the given test data and labels.

In multi-label classification, this is the subset accuracy, a harsh metric because it requires that the entire label set of each sample be predicted correctly.

Parameters
X : array-like of shape (n_samples, n_features)

Test samples.

y : array-like of shape (n_samples,) or (n_samples, n_outputs)

True labels for X.

sample_weight : array-like of shape (n_samples,), default=None

Sample weights.

Returns
score : float

Mean accuracy of self.predict(X) with respect to y.
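
To make the difference between ordinary and subset accuracy concrete, here is a minimal pure-Python sketch. mean_accuracy is a hypothetical helper written for this example (unweighted, so it ignores sample_weight); it is not the library's implementation, and the labels are made up.

```python
def mean_accuracy(y_true, y_pred):
    """Fraction of samples whose prediction equals the true label exactly."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equally long")
    hits = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return hits / len(y_true)


# Single-label case: 3 of 4 predictions are correct.
single = mean_accuracy(["a", "b", "a", "b"], ["a", "b", "b", "b"])

# Multi-label case: each sample is a tuple of label indicators, and a
# sample counts as correct only if the whole tuple matches.
y_true = [(1, 0, 1), (0, 1, 0), (1, 1, 0)]
y_pred = [(1, 0, 1), (0, 1, 1), (1, 0, 0)]
subset = mean_accuracy(y_true, y_pred)

# For comparison: the fraction of individual labels predicted correctly.
n_labels = sum(len(t) for t in y_true)
per_label = sum(
    t == p for ts, ps in zip(y_true, y_pred) for t, p in zip(ts, ps)
) / n_labels

print(single, subset, per_label)
```

In the multi-label data only the first sample matches in full, so the subset accuracy is 1/3 even though 7 of the 9 individual labels are correct.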

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters
**params : dict

Estimator parameters.

Returns
self : object

Estimator instance.
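
The <component>__<parameter> form can be illustrated as follows. The sketch uses a scikit-learn Pipeline rather than SimpleClassifier itself, on the assumption that set_params follows the standard scikit-learn estimator interface; the step names "scale" and "clf" are arbitrary choices for the example.

```python
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Simple estimator: parameters are set by their plain names.
logreg = LogisticRegression().set_params(C=5.0)

# Nested object: address each component as <component>__<parameter>.
pipe = Pipeline([
    ("scale", StandardScaler()),
    ("clf", LogisticRegression()),
])
returned = pipe.set_params(clf__C=0.1, scale__with_mean=False)

# set_params returns the estimator itself, so calls can be chained.
print(returned is pipe)
print(pipe.named_steps["clf"].C, pipe.named_steps["scale"].with_mean)
```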