linear_fit_operator

Operator to fit features to a target column using sklearn’s linear or logistic regression

class tasrif.processing_pipeline.custom.linear_fit_operator.LinearFitOperator(feature_names, target, target_type='continuous', **model_kwargs)

Operator to fit the given features to a target column, using linear regression for continuous targets and logistic regression otherwise

Examples

>>> import pandas as pd
>>> from tasrif.processing_pipeline.custom import LinearFitOperator
>>> df = pd.DataFrame([
...     [1, "2020-05-01 00:00:00", 10, 'poor'],
...     [1, "2020-05-01 01:00:00", 15, 'poor'],
...     [1, "2020-05-01 03:00:00", 23, 'good'],
...     [2, "2020-05-02 00:00:00", 17, 'good'],
...     [2, "2020-05-02 01:00:00", 11, 'poor']],
...     columns=['logId', 'timestamp', 'sleep_level', 'sleep_quality'])
>>>
>>> op = LinearFitOperator(feature_names='sleep_level',
...                        target='sleep_quality',
...                        target_type='categorical')
>>> print(op.process(df))
[(array(['poor', 'poor', 'good', 'good', 'poor'], dtype=object), 1.0, array([12.71063824]))]
>>> df = pd.DataFrame([
...     [15, 10, 'poor'],
...     [13, 15, 'poor'],
...     [11, 23, 'good'],
...     [25, 17, 'good'],
...     [20, 11, 'poor']],
...     columns=['feature1', 'feature2', 'target'])
>>>
>>> op = LinearFitOperator(feature_names='all',
...                        target='target',
...                        target_type='categorical')
>>> op.process(df)
[(array(['poor', 'poor', 'good', 'good', 'poor'], dtype=object),
  1.0,
  array([17.78134321]))]

__init__(feature_names, target, target_type='continuous', **model_kwargs)

Creates a new instance of LinearFitOperator

Parameters
  • feature_names (list, str) – names of the feature columns in the given dataframe to fit. If ‘all’, all numerical columns except the target are selected

  • target (str) – dependent target column in the dataframe

  • target_type (str) –

    • If target_type is continuous, LinearRegression will be used

    • If target_type is categorical, LogisticRegression will be used

    • For any other value, LogisticRegression will also be used

  • **model_kwargs – keyword arguments passed to sklearn LinearRegression or LogisticRegression
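
The parameter rules above can be sketched with pandas and scikit-learn directly. This is only an illustration of the documented behaviour, not the operator's actual implementation: the helper name fit_linear and its return value (the fitted model and its score) are assumptions made for this example, and the operator itself may return something different.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression, LogisticRegression


def fit_linear(df, feature_names, target, target_type="continuous", **model_kwargs):
    """Hypothetical helper mirroring the documented parameters (not Tasrif code)."""
    if isinstance(feature_names, str) and feature_names == "all":
        # 'all': every numerical column except the target
        numeric = df.select_dtypes(include="number").columns
        features = [col for col in numeric if col != target]
    elif isinstance(feature_names, str):
        features = [feature_names]
    else:
        features = list(feature_names)

    # 'continuous' selects LinearRegression; any other value selects LogisticRegression
    model_cls = LinearRegression if target_type == "continuous" else LogisticRegression
    model = model_cls(**model_kwargs)  # extra keyword arguments go to the sklearn model

    X, y = df[features], df[target]
    model.fit(X, y)
    # score() is R^2 for LinearRegression and mean accuracy for LogisticRegression
    return model, model.score(X, y)


df = pd.DataFrame(
    [[15, 10, "poor"], [13, 15, "poor"], [11, 23, "good"], [25, 17, "good"], [20, 11, "poor"]],
    columns=["feature1", "feature2", "target"],
)
model, score = fit_linear(df, "all", "target", target_type="categorical")
print(type(model).__name__, score)
```

Passing, for example, fit_intercept=False through **model_kwargs would forward it unchanged to the chosen sklearn estimator in this sketch.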