dvb.datascience.predictor package

Module contents

class dvb.datascience.predictor.CostThreshold(costFalseNegative: float = 1.0, costFalsePositive: float = 1.0)

Bases: dvb.datascience.predictor.ThresholdBase
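The reference gives no description for CostThreshold beyond its two cost parameters. As a rough illustration of the general idea of cost-based thresholding, the sketch below picks the probability cut-off that minimises the total misclassification cost. It is not the library's implementation: the helper names `total_cost` and `cheapest_threshold`, the candidate set, and the tie-breaking rule are all assumptions made for this example.

```python
from typing import Sequence, Tuple


def total_cost(
    y_true: Sequence[int],
    y_pred_proba: Sequence[float],
    threshold: float,
    cost_false_negative: float = 1.0,
    cost_false_positive: float = 1.0,
) -> float:
    """Sum the cost of all errors made when predicting 1 for proba >= threshold."""
    cost = 0.0
    for truth, proba in zip(y_true, y_pred_proba):
        predicted = 1 if proba >= threshold else 0
        if truth == 1 and predicted == 0:
            cost += cost_false_negative  # missed positive
        elif truth == 0 and predicted == 1:
            cost += cost_false_positive  # false alarm
    return cost


def cheapest_threshold(
    y_true: Sequence[int],
    y_pred_proba: Sequence[float],
    cost_false_negative: float = 1.0,
    cost_false_positive: float = 1.0,
) -> Tuple[float, float]:
    """Return (threshold, cost) with the lowest total cost.

    Assumption: candidates are the observed probabilities plus +inf
    (predict nothing as positive); ties go to the lowest threshold.
    """
    candidates = sorted(set(y_pred_proba)) + [float("inf")]

    def cost_at(threshold: float) -> float:
        return total_cost(
            y_true, y_pred_proba, threshold,
            cost_false_negative, cost_false_positive,
        )

    best = min(candidates, key=lambda t: (cost_at(t), t))
    return best, cost_at(best)


# Toy data, made up for this example.
y_true = [0, 0, 1, 1]
y_pred_proba = [0.1, 0.4, 0.35, 0.8]
print(cheapest_threshold(y_true, y_pred_proba, cost_false_positive=5.0))
```

Raising the cost of false positives pushes the chosen cut-off upward, and raising the cost of false negatives pushes it downward.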

class dvb.datascience.predictor.GridSearchCVProgressBar(estimator, param_grid, scoring=None, fit_params=None, n_jobs=None, iid='warn', refit=True, cv='warn', verbose=0, pre_dispatch='2*n_jobs', error_score='raise-deprecating', return_train_score='warn')

Bases: sklearn.model_selection._search.GridSearchCV

Monkey patch that shows a progress bar during grid search.
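A progress bar for a grid search needs a total to count towards. In a cross-validated grid search that total is the number of parameter combinations multiplied by the number of folds. The sketch below computes it with the standard library only. The helpers `expand_grid` and `total_fits` are hypothetical names for this example and are not part of dvb.datascience or scikit-learn. How the class itself reports progress is not documented here.

```python
from itertools import product
from typing import Any, Dict, List, Sequence


def expand_grid(param_grid: Dict[str, Sequence[Any]]) -> List[Dict[str, Any]]:
    """Expand a {name: [values]} grid into one dict per combination."""
    names = sorted(param_grid)
    return [
        dict(zip(names, values))
        for values in product(*(param_grid[name] for name in names))
    ]


def total_fits(param_grid: Dict[str, Sequence[Any]], n_folds: int) -> int:
    """Number of model fits a progress bar would have to track."""
    return len(expand_grid(param_grid)) * n_folds


# Example grid, made up for illustration.
grid = {"n_neighbors": [3, 5, 7], "weights": ["uniform", "distance"]}
print(total_fits(grid, n_folds=5))
```

The final refit on the whole training set, which happens when refit=True, is one extra fit on top of this count.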

class dvb.datascience.predictor.PrecisionRecallThreshold

Bases: dvb.datascience.predictor.ThresholdBase

class dvb.datascience.predictor.SklearnClassifier(clf, **kwargs)

Bases: dvb.datascience.classification_pipe_base.ClassificationPipeBase

Wrapper for inclusion of sklearn classifiers in the pipeline.

fit(data: Dict[str, Any], params: Dict[str, Any])

Train on the dataset df and store what was learned, so that transform can be called later to apply it.

fit_attributes = [('clf', 'pickle', 'pickle'), ('threshold', None, None)]
input_keys = ('df', 'df_metadata')
output_keys = ('predict', 'predict_metadata')
threshold = None
transform(data: Dict[str, Any], params: Dict[str, Any]) → Dict[str, Any]

Perform an operation on df using the kwargs and the learnings from training. Transform returns a tuple with the transformed dataset and some output. The transformed dataset is the input for the next plumber. The output is collected and shown to the user.
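To make the fit/transform contract concrete, here is a minimal stand-in that follows the signatures and the input_keys / output_keys listed above. It is a sketch, not library code. `MajorityClassPipe` is hypothetical, plain lists of dicts stand in for the dataframe, and the `"target"` entry in df_metadata and the `"n_rows"` entry in predict_metadata are assumptions made for this example. The return value is a dict keyed by output_keys, in line with the Dict[str, Any] annotation in the signature.

```python
from collections import Counter
from typing import Any, Dict, Optional


class MajorityClassPipe:
    """Hypothetical stand-in that always predicts the most common training label."""

    input_keys = ("df", "df_metadata")
    output_keys = ("predict", "predict_metadata")

    def __init__(self) -> None:
        self.majority: Optional[Any] = None  # state learned by fit()

    def fit(self, data: Dict[str, Any], params: Dict[str, Any]) -> None:
        # Assumption: the metadata names the target column under "target".
        target = data["df_metadata"]["target"]
        labels = [row[target] for row in data["df"]]
        self.majority = Counter(labels).most_common(1)[0][0]

    def transform(self, data: Dict[str, Any], params: Dict[str, Any]) -> Dict[str, Any]:
        if self.majority is None:
            raise RuntimeError("fit() must be called before transform()")
        predictions = [self.majority for _ in data["df"]]
        return {
            "predict": predictions,
            "predict_metadata": {"n_rows": len(predictions)},
        }


# Toy rows, made up for this example.
train = {
    "df": [
        {"x": 1, "label": "a"},
        {"x": 2, "label": "b"},
        {"x": 3, "label": "b"},
    ],
    "df_metadata": {"target": "label"},
}
pipe = MajorityClassPipe()
pipe.fit(train, params={})
print(pipe.transform(train, params={}))
```

Keeping all learned state on the instance is what allows transform to be called later on data that was not seen during fit.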

class dvb.datascience.predictor.SklearnGridSearch(clf, param_grid, scoring: str = 'roc_auc')

Bases: dvb.datascience.predictor.SklearnClassifier

fit(data: Dict[str, Any], params: Dict[str, Any])

Train on the dataset df and store what was learned, so that transform can be called later to apply it.

input_keys = ('df', 'df_metadata')
output_keys = ('predict', 'predict_metadata')

class dvb.datascience.predictor.TPOTClassifier(**kwargs)

Bases: dvb.datascience.predictor.SklearnClassifier

fit(data: Dict[str, Any], params: Dict[str, Any])

Train on the dataset df and store what was learned, so that transform can be called later to apply it.

class dvb.datascience.predictor.ThresholdBase

Bases: abc.ABC

Abstract base class for the threshold classes CostThreshold and PrecisionRecallThreshold.

set_y(y_true, y_pred_proba, y_pred_proba_labels, **kwargs)
y_pred_proba = None
y_pred_proba_labels = None
y_true = None