dabl.EasyPreprocessor
- 
class dabl.EasyPreprocessor(scale=True, force_imputation=True, verbose=0, types=None)
- A simple preprocessor.
- Detects variable types and encodes everything as floats for use with sklearn.
- Applies one-hot encoding, missing value imputation, and scaling.
- Parameters
- scale : bool, default=True
- Whether to scale continuous data. 
- force_imputation : bool, default=True
- Whether to create imputers even if no training data is missing. 
- verbose : int, default=0
- Control output verbosity. 
 
- Attributes
- ct_ : ColumnTransformer
- Main container for all transformations. 
- columns_ : pandas columns
- Columns of training data. 
- dtypes_ : Series of dtypes
- Dtypes of training data columns. 
- types_ : something
- Inferred input types. 
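
The three steps named in the class description (imputation, one-hot encoding, scaling) can be sketched with plain scikit-learn components. This is only an illustration of the general technique, not dabl's implementation: the toy data, the column names, the imputation strategies, and the manual assignment of columns to types are all assumptions, whereas EasyPreprocessor detects the types itself.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical toy data: one continuous column with a missing value
# and one categorical column without missing values.
df = pd.DataFrame({
    "age": [22.0, 35.0, np.nan, 58.0],
    "city": ["Oslo", "Rome", "Oslo", "Rome"],
})

# Continuous columns: impute the median, then scale.
continuous = make_pipeline(SimpleImputer(strategy="median"), StandardScaler())

# Categorical columns: an imputer is created even though nothing is missing
# (loosely mirroring force_imputation=True), followed by one-hot encoding.
categorical = make_pipeline(
    SimpleImputer(strategy="most_frequent"),
    OneHotEncoder(handle_unknown="ignore"),
)

# The column-to-type assignment is done by hand here.
ct = ColumnTransformer(
    [
        ("continuous", continuous, ["age"]),
        ("categorical", categorical, ["city"]),
    ],
    sparse_threshold=0,  # always return a dense array
)

X = ct.fit_transform(df)  # all-float array: 1 scaled column + 2 one-hot columns
```

The result contains only floats and no missing values, which is the form most scikit-learn estimators expect.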
 
 - 
__init__(scale=True, force_imputation=True, verbose=0, types=None)
- Initialize self. See help(type(self)) for accurate signature. 
 - 
fit(X, y=None)
- A reference implementation of a fitting function for a transformer.
- Parameters
- X : array-like or sparse matrix of shape (n_samples, n_features)
- The training input samples. 
- y : None
- A transformer does not need a target, but the pipeline API requires this parameter. 
 
- Returns
- self : object
- Returns self. 
 
 
 - 
fit_transform(X, y=None, **fit_params)
- Fit to data, then transform it.
- Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.
- Parameters
- X : {array-like, sparse matrix, dataframe} of shape (n_samples, n_features)
- y : ndarray of shape (n_samples,), default=None
- Target values. 
- **fit_params : dict
- Additional fit parameters. 
 
- Returns
- X_new : ndarray of shape (n_samples, n_features_new)
- Transformed array. 
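
The relationship between fit and fit_transform can be sketched with any transformer that follows the scikit-learn API. The example uses StandardScaler as a stand-in and hypothetical arrays; it shows the calling convention only and says nothing about what EasyPreprocessor computes internally.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical training data and one new sample.
X_train = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])
X_new = np.array([[2.0, 40.0]])

scaler = StandardScaler()

# fit returns the transformer itself, so calls can be chained.
# y is accepted for pipeline compatibility and is not used.
fitted = scaler.fit(X_train, y=None)

# Two-step form: fit, then transform the same data.
two_step = fitted.transform(X_train)

# One-step form on a fresh instance gives the same result.
one_step = StandardScaler().fit_transform(X_train)

# New data is transformed with the statistics learned from X_train,
# without fitting again.
X_new_scaled = scaler.transform(X_new)
```

Fitting on the training data and calling only transform on later data keeps the learned statistics fixed.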
 
 
 - 
get_params(deep=True)
- Get parameters for this estimator.
- Parameters
- deep : bool, default=True
- If True, return the parameters for this estimator and for contained subobjects that are estimators. 
 
- Returns
- params : mapping of string to any
- Parameter names mapped to their values. 
 
 
 - 
set_params(**params)
- Set the parameters of this estimator.
- The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter>, so that it is possible to update each component of a nested object.
- Parameters
- **params : dict
- Estimator parameters. 
 
- Returns
- self : object
- Estimator instance.
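
The <component>__<parameter> naming and the effect of deep can be sketched on a small scikit-learn Pipeline. The step names "impute" and "scale" are hypothetical labels chosen for this example; the same get_params and set_params calls apply to any estimator that follows this API.

```python
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# A nested object: a pipeline with two named components.
pipe = Pipeline([
    ("impute", SimpleImputer(strategy="mean")),
    ("scale", StandardScaler()),
])

# deep=False lists only the pipeline's own parameters.
shallow = pipe.get_params(deep=False)

# deep=True also lists the parameters of the contained estimators,
# keyed as <component>__<parameter>.
deep = pipe.get_params(deep=True)

# set_params uses the same keys and returns the estimator instance.
returned = pipe.set_params(impute__strategy="median", scale__with_mean=False)

current_strategy = pipe.get_params()["impute__strategy"]
```

Because set_params returns the estimator, it can be chained directly with fit or fit_transform.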