dabl.plot.plot_regression_continuous

dabl.plot.plot_regression_continuous(X, *, target_col, types=None, scatter_alpha='auto', scatter_size='auto', drop_outliers=True, correlation='spearman', prune_correlations_threshold=0.95, find_scatter_categoricals=True, jitter_ordinal=True, **kwargs)

Plots for continuous features in regression.

Creates plots of all the continuous features vs the target. Relevant features are determined using F statistics.

Parameters:
X : dataframe

Input data including features and target.

target_col : str or int

Identifier of the target column in X.

types : dataframe of types, optional

Output of detect_types on X. Can be used to avoid recomputing the types.

scatter_alpha : float, default='auto'

Alpha value for scatter plots. With 'auto', the value is chosen by an internal heuristic.

scatter_size : float, default='auto'

Marker size for scatter plots. With 'auto', the value is chosen by an internal heuristic.

drop_outliers : bool, default=True

Whether to drop outliers (in the target column) when plotting.

correlation : str, default='spearman'

Correlation to use for ranking plots, passed to pd.DataFrame.corrwith. Valid values are 'pearson', 'kendall', 'spearman'.
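As a rough illustration of correlation-based ranking, the sketch below orders features by the absolute value of their Spearman correlation with the target. It is not dabl's internal code: the column names and data are made up, and the Spearman coefficient is computed as a Pearson correlation on ranks (which gives the same value as passing method='spearman' to corrwith) so that only pandas is needed.

```python
import pandas as pd

# Hypothetical data: three continuous features and a target.
features = pd.DataFrame({
    "weak": [3, 1, 2, 6, 4, 5],
    "strong": [1, 8, 27, 64, 125, 216],  # strictly increasing with the target
    "inverse": [6, 5, 4, 2, 3, 1],       # mostly decreasing with the target
})
target = pd.Series([1, 2, 3, 4, 5, 6], name="target")

# Spearman correlation is the Pearson correlation of the ranks.
spearman = features.rank().corrwith(target.rank())

# Rank features by the strength of the relationship, ignoring its sign.
ranking = spearman.abs().sort_values(ascending=False)
print(ranking)
```

Sorting on the absolute value treats a strongly decreasing feature as just as informative as a strongly increasing one; whether dabl orders its plots in exactly this way is an assumption here.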

jitter_ordinal : bool, default=True

Whether to add jitter, i.e. apply noise, to ordinal features, to reduce overlap.
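To make the idea of jitter concrete, here is a minimal sketch using NumPy. The noise distribution and its width (uniform, at most 0.2 in either direction) are assumptions chosen for illustration; the page does not say what noise dabl actually applies.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical ordinal feature, e.g. a 1-5 rating: many points share a value.
ratings = np.array([1, 1, 2, 3, 3, 3, 4, 5, 5])

# Assumed jitter: uniform noise of at most +/-0.2, small enough that
# every point stays closest to its original level.
jittered = ratings + rng.uniform(-0.2, 0.2, size=ratings.shape)

print(np.round(jittered, 2))
```

In a scatter plot, points that would otherwise sit on top of each other at the same ordinal level are spread slightly apart, while rounding the jittered values still recovers the original levels.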

prune_correlations_threshold : float, default=.95

Correlation threshold above which highly correlated features are pruned from the plot. Set to 0 to disable pruning.
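One way to picture threshold-based pruning is the greedy sketch below. The helper `prune_correlated` and the data are hypothetical, and the greedy strategy (keep a feature only if its absolute Pearson correlation with every already-kept feature is at most the threshold) is an assumption; dabl's own pruning procedure may differ.

```python
import pandas as pd


def prune_correlated(features, threshold=0.95):
    """Return column names, skipping near-duplicates of earlier columns.

    Hypothetical helper for illustration, not part of dabl.
    """
    if threshold == 0:
        # Mirrors the documented convention: 0 disables pruning.
        return list(features.columns)
    corr = features.corr().abs()
    kept = []
    for col in features.columns:
        # Keep the column only if it is not too correlated with any kept one.
        if all(corr.loc[col, other] <= threshold for other in kept):
            kept.append(col)
    return kept


# Hypothetical data: "b" is a linear function of "a", "c" is only loosely related.
features = pd.DataFrame({
    "a": [1, 2, 3, 4, 5, 6],
    "b": [3, 5, 7, 9, 11, 13],
    "c": [3, 1, 2, 6, 4, 5],
})

kept = prune_correlated(features, threshold=0.95)
print(kept)
```

With the default threshold, "b" is dropped as redundant with "a"; with a threshold of 0 all three columns are returned unchanged.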

find_scatter_categoricals : bool, default=True

Whether to find categorical features to use as hue in scatter plots.