resample_operator
Operator to resample a time-series-based data frame.
-
class tasrif.processing_pipeline.custom.resample_operator.ResampleOperator(rule, aggregation_definition, **resample_args)
Group and aggregate the rows of a 2D data frame by resampling its datetime index at the frequency given by rule. The operator works on 2D data frames whose columns represent the features. The returned data frame contains the aggregated values as column features, indexed by the resampled timestamps.
Examples
>>> import pandas as pd
>>> from tasrif.processing_pipeline.custom import ResampleOperator
>>> df = pd.DataFrame([
...     [1, "2020-05-01 00:00:00", 1],
...     [1, "2020-05-01 01:00:00", 1],
...     [1, "2020-05-01 03:00:00", 2],
...     [2, "2020-05-02 00:00:00", 1],
...     [2, "2020-05-02 01:00:00", 1]],
...     columns=['logId', 'timestamp', 'sleep_level'])
>>>
>>> df['timestamp'] = pd.to_datetime(df['timestamp'])
>>> df = df.set_index('timestamp')
>>> op = ResampleOperator('D', {'sleep_level': 'mean'})
>>> op.process(df)
[            sleep_level
timestamp
2020-05-01     1.333333
2020-05-02     1.000000]
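For comparison, the values in the example above can be reproduced with plain pandas. This is a minimal sketch that assumes the operator forwards rule to DataFrame.resample and aggregation_definition to the resampler's agg method; that mapping is inferred from the parameter descriptions on this page, not taken from the tasrif source.

```python
import pandas as pd

df = pd.DataFrame(
    [
        [1, "2020-05-01 00:00:00", 1],
        [1, "2020-05-01 01:00:00", 1],
        [1, "2020-05-01 03:00:00", 2],
        [2, "2020-05-02 00:00:00", 1],
        [2, "2020-05-02 01:00:00", 1],
    ],
    columns=["logId", "timestamp", "sleep_level"],
)
df["timestamp"] = pd.to_datetime(df["timestamp"])
df = df.set_index("timestamp")

# Assumed plain-pandas counterpart of ResampleOperator('D', {'sleep_level': 'mean'}):
# bin the rows into calendar days, then average sleep_level within each bin.
daily = df.resample("D").agg({"sleep_level": "mean"})
print(daily)
```

Only the columns named in the aggregation dictionary appear in the result, which is why logId is absent from the output.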
-
__init__(rule, aggregation_definition, **resample_args)
Creates a new instance of ResampleOperator.
- Parameters
rule (DateOffset, Timedelta, str) – The offset string or object representing the target conversion.
aggregation_definition (dict, str) – Dictionary mapping features to aggregation functions, or the name of a single function defining the aggregation behavior ('sum', 'mean', 'ffill', etc.).
**resample_args – Keyword arguments passed to the pandas DataFrame.resample method.
-
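The accepted parameter forms can be illustrated with plain pandas calls. The sketch below assumes the three arguments map onto DataFrame.resample(rule, **resample_args).agg(aggregation_definition); the helper resample_and_aggregate is hypothetical and is not part of tasrif.

```python
import pandas as pd


def resample_and_aggregate(df, rule, aggregation_definition, **resample_args):
    """Hypothetical stand-in for the operator, built only on pandas calls."""
    return df.resample(rule, **resample_args).agg(aggregation_definition)


index = pd.to_datetime(
    ["2020-05-01 00:00", "2020-05-01 12:00", "2020-05-02 00:00", "2020-05-03 06:00"]
)
df = pd.DataFrame(
    {"steps": [10, 20, 30, 40], "heart_rate": [60, 70, 80, 90]},
    index=index,
)

# rule as an offset string, aggregation_definition as a dict of per-feature functions.
per_feature = resample_and_aggregate(df, "D", {"steps": "sum", "heart_rate": "mean"})

# rule as a Timedelta, aggregation_definition as one function name for all columns.
two_day_sum = resample_and_aggregate(df, pd.Timedelta(days=2), "sum")

# Extra keyword arguments are forwarded to DataFrame.resample, here closed and label.
right_closed = resample_and_aggregate(df, "D", "sum", closed="right", label="right")

print(per_feature)
print(two_day_sum)
print(right_closed)
```

With closed="right", a row stamped exactly at midnight belongs to the bin that ends at that midnight, so the choice of closed and label changes which day a boundary reading is counted in.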