aggregate_operator
Operator to aggregate column features based on a column
-
class tasrif.processing_pipeline.custom.aggregate_operator.AggregateOperator(groupby_feature_names, aggregation_definition, observers=None)
Group and aggregate rows in a 2D data frame based on a column feature. This operator works on 2D data frames whose columns represent features. The returned data frame contains the aggregated values as column features, together with the base feature used for grouping.
Examples
>>> import pandas as pd
>>>
>>> from tasrif.processing_pipeline.custom import AggregateOperator
>>> from tasrif.processing_pipeline.custom import LinearFitOperator
>>>
>>> df = pd.DataFrame([['001', 25, 30], ['001', 17, 50], ['002', 20, 40], ['002', 21, 42]],
...                   columns=['pid', 'min_activity', 'max_activity'])
>>>
>>> # Aggregate 'min_activity' per 'pid' and fit max_activity against min_activity
>>> operator = AggregateOperator(
...     groupby_feature_names="pid",
...     aggregation_definition={"min_activity": ["mean", "std"],
...                             "r2,_,intercept": LinearFitOperator(feature_names='min_activity',
...                                                                 target='max_activity')})
>>> df0 = operator.process(df)
>>>
>>> print(df0)
[   pid  min_activity_mean  min_activity_std   r2     intercept
0  001               21.0          5.656854  1.0  9.250000e+01
1  002               20.5          0.707107  1.0  7.105427e-15]
-
__init__(groupby_feature_names, aggregation_definition, observers=None)
Creates a new instance of AggregateOperator.
- Parameters
groupby_feature_names (str) – Name of the feature to base the grouping on. If groupby_feature_names includes a non-string value, such as the result of a call like pd.Grouper(), that column is not shown in the result.
aggregation_definition (dict) – Dictionary mapping each feature to its aggregation functions. In the class example, a value is either a list of function names (["mean", "std"]) or an operator such as LinearFitOperator.
observers (list[Observer]) – Python list of observers
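The shape of aggregation_definition can be illustrated without tasrif or pandas. The following is a minimal sketch in plain Python, not the library's implementation: the aggregate function, the FUNCTIONS lookup, and the row dictionaries are all hypothetical names introduced here. It assumes that "std" means the sample standard deviation, which is consistent with the values printed in the class example, and it covers only the list-of-function-names form of the mapping.

```python
from collections import defaultdict
from statistics import mean, stdev

# Hypothetical stand-in for the rows of the data frame in the class example.
rows = [
    {"pid": "001", "min_activity": 25, "max_activity": 30},
    {"pid": "001", "min_activity": 17, "max_activity": 50},
    {"pid": "002", "min_activity": 20, "max_activity": 40},
    {"pid": "002", "min_activity": 21, "max_activity": 42},
]

# Hypothetical lookup from function name to callable.
# Assumption: "std" is the sample standard deviation.
FUNCTIONS = {"mean": mean, "std": stdev}


def aggregate(rows, groupby_feature_name, aggregation_definition):
    """Group rows by one feature and apply the named functions per feature."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[groupby_feature_name]].append(row)

    result = []
    for key, members in groups.items():
        # Keep the grouping feature, then add one column per (feature, function).
        out = {groupby_feature_name: key}
        for feature, function_names in aggregation_definition.items():
            values = [member[feature] for member in members]
            for name in function_names:
                out[f"{feature}_{name}"] = FUNCTIONS[name](values)
        result.append(out)
    return result


aggregated = aggregate(rows, "pid", {"min_activity": ["mean", "std"]})
for record in aggregated:
    print(record)
```

Each output record carries the grouping feature plus one entry per feature and function pair, named feature_function, which mirrors the min_activity_mean and min_activity_std columns in the class example.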
-
set_observers(observers)
Function to store the observers for the given operator.
- Parameters
observers (list of Observer) – Observer objects that observe the operator
-