read_nested_csv_operator
Operator to read the CSV files whose names are stored in a dataframe column
class tasrif.processing_pipeline.custom.read_nested_csv_operator.ReadNestedCsvOperator(folder_path, field, pipeline: tasrif.processing_pipeline.sequence_operator.SequenceOperator = None)
Operator that returns a generator. Each iteration yields one record of the input dataframe together with the contents of the CSV file named in that record.
Example
>>> import pandas as pd
>>> import numpy as np
>>>
>>> from tasrif.processing_pipeline.custom import ReadNestedCsvOperator
>>>
>>> df = pd.DataFrame({"name": ['Alfred', 'Roy'],
...                    "age": [43, 32],
...                    "file_details": ['details1', 'details2']})
>>>
>>> details1 = pd.DataFrame({'calories': [360, 540],
...                          'time': [pd.Timestamp("2015-04-25"), pd.Timestamp("2015-04-26")]
...                          })
>>>
>>> details2 = pd.DataFrame({'calories': [420, 250],
...                          'time': [pd.Timestamp("2015-05-16"), pd.Timestamp("2015-05-17")]
...                          })
>>>
>>>
>>> # Save File 1 and File 2
>>> details1.to_csv('details1.csv', index=False)
>>> details2.to_csv('details2.csv', index=False)
>>>
>>> operator = ReadNestedCsvOperator(folder_path='./', field='file_details', pipeline=None)
>>> generator = operator.process(df)
>>>
>>> # Iterates twice
>>> for record, details in generator:
...     print('Subject information:')
...     print(record)
...     print('')
...     print('Subject details:')
...     print(details)
...     print('============================')
Subject information:
name              Alfred
age                   43
file_details    details1
Name: 0, dtype: object
...
Subject details:
   calories        time
0       360  2015-04-25
1       540  2015-04-26
============================
Subject information:
name                 Roy
age                   32
file_details    details2
Name: 1, dtype: object
...
Subject details:
   calories        time
0       420  2015-05-16
1       250  2015-05-17
============================
__init__(folder_path, field, pipeline: tasrif.processing_pipeline.sequence_operator.SequenceOperator = None)
Creates a new instance of ReadNestedCsvOperator.
Parameters
- folder_path (str) – path to the folder that contains the CSV files
- field (str) – name of the column that contains the CSV file names
- pipeline (SequenceOperator) – optional pipeline to apply on the dataframe record before yielding it (defaults to None)
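The pattern described by these parameters can be sketched in plain pandas. This is an illustration only and not tasrif's implementation: `iter_nested_csv` is a hypothetical helper, its `transform` argument is a plain callable standing in for `pipeline`, and the file names stored in the column are assumed to include the `.csv` extension.

```python
import tempfile
from pathlib import Path

import pandas as pd


def iter_nested_csv(df, folder_path, field, transform=None):
    """Yield (record, details) pairs, reading one CSV file per row of df.

    Hypothetical helper, not part of tasrif. `transform` is an assumption
    standing in for `pipeline`: any callable that takes a DataFrame and
    returns a DataFrame.
    """
    folder = Path(folder_path)
    for _, record in df.iterrows():
        # The file is read only when the consumer requests the next pair,
        # so at most one nested CSV is loaded at a time.
        details = pd.read_csv(folder / record[field])
        if transform is not None:
            details = transform(details)
        yield record, details


with tempfile.TemporaryDirectory() as tmp:
    # Write two small nested files into a temporary folder.
    pd.DataFrame({"calories": [360, 540]}).to_csv(Path(tmp) / "details1.csv", index=False)
    pd.DataFrame({"calories": [420, 250]}).to_csv(Path(tmp) / "details2.csv", index=False)

    subjects = pd.DataFrame({
        "name": ["Alfred", "Roy"],
        "file_details": ["details1.csv", "details2.csv"],
    })

    # Consume the generator: one subject and one file per iteration.
    totals = {
        record["name"]: int(details["calories"].sum())
        for record, details in iter_nested_csv(subjects, tmp, "file_details")
    }

print(totals)  # {'Alfred': 900, 'Roy': 670}
```

Because the helper is a generator function, creating it performs no file I/O. Files are opened one by one as the loop advances, which keeps memory use bounded when each record points to a large file.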