add_duration_operator
Operator to add a duration feature to a time series dataframe
class tasrif.processing_pipeline.custom.add_duration_operator.AddDurationOperator(groupby_feature_names, date_feature_name='timestamp', duration_feature_name='duration')
Given a 2D dataframe representing a time series where each row is a time event, this operator adds a new feature (named duration by default) holding the time elapsed since the previous event in the same group. As the example shows, the first event of each group gets a duration of zero.
Examples
>>> import pandas as pd
>>>
>>> from tasrif.processing_pipeline.custom import AddDurationOperator
>>>
>>> df0 = pd.DataFrame([[1, "2020-05-01 00:00:00", 1], [1, "2020-05-01 01:00:00", 1],
...                     [1, "2020-05-01 03:00:00", 2], [2, "2020-05-02 00:00:00", 1],
...                     [2, "2020-05-02 01:00:00", 1]],
...                    columns=['logId', 'timestamp', 'sleep_level'])
>>> df0['timestamp'] = pd.to_datetime(df0['timestamp'])
>>>
>>> operator = AddDurationOperator(
...     groupby_feature_names="logId",
...     date_feature_name="timestamp",
...     duration_feature_name="duration")
>>> df0 = operator.process(df0)
>>>
>>> print(df0)
[   logId           timestamp  sleep_level        duration
0      1 2020-05-01 00:00:00            1 0 days 00:00:00
1      1 2020-05-01 01:00:00            1 0 days 01:00:00
2      1 2020-05-01 03:00:00            2 0 days 02:00:00
3      2 2020-05-02 00:00:00            1 0 days 00:00:00
4      2 2020-05-02 01:00:00            1 0 days 01:00:00]
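For readers who want to see what the duration values in the example correspond to, the following is a minimal sketch in plain pandas that reproduces the same duration column. It is not taken from the Tasrif source and may differ from the operator's actual implementation; the helper name add_duration is hypothetical, and the sketch assumes that duration means the time since the previous row in the same group, with zero for the first row of each group.

```python
import pandas as pd


def add_duration(df, groupby_feature_names, date_feature_name="timestamp",
                 duration_feature_name="duration"):
    """Hypothetical helper (not part of Tasrif): time since the previous event per group."""
    result = df.copy()
    # Difference between each timestamp and the previous one within the same group.
    deltas = result.groupby(groupby_feature_names)[date_feature_name].diff()
    # The first event of each group has no predecessor, so assume a zero duration.
    result[duration_feature_name] = deltas.fillna(pd.Timedelta(0))
    return result


df0 = pd.DataFrame(
    [[1, "2020-05-01 00:00:00", 1], [1, "2020-05-01 01:00:00", 1],
     [1, "2020-05-01 03:00:00", 2], [2, "2020-05-02 00:00:00", 1],
     [2, "2020-05-02 01:00:00", 1]],
    columns=["logId", "timestamp", "sleep_level"],
)
df0["timestamp"] = pd.to_datetime(df0["timestamp"])

df1 = add_duration(df0, "logId")
print(df1)
```

Unlike the operator in the example above, whose printed result is wrapped in a list, this sketch returns a single dataframe and leaves the input dataframe unchanged.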
__init__(groupby_feature_names, date_feature_name='timestamp', duration_feature_name='duration')
Creates a new instance of AddDurationOperator.
Parameters
groupby_feature_names (str) – Name of the feature used to group related events into one time series (logId in the example above)
date_feature_name (str) – Name of the feature representing the timestamp
duration_feature_name (str) – Name of the new feature that will hold the computed duration