sequence_operator
Module that defines the SequenceOperator class
-
class tasrif.processing_pipeline.sequence_operator.SequenceOperator(processing_operators, observers=None)
Class representing a pipeline of processing operators. The pipeline is defined by the list of ProcessingOperator objects passed to the constructor. Data flows through the operators in a chained fashion: the output of one operator is the input of the next.
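The chained data flow described above can be illustrated with a small stand-alone sketch. It does not use tasrif and is not the library's implementation; `run_chain`, `drop_duplicates`, and `drop_missing` are hypothetical names chosen only to show how each step's output feeds the next step.

```python
from functools import reduce


def drop_duplicates(rows):
    # Hypothetical step: keep only the first occurrence of each row.
    seen, unique = set(), []
    for row in rows:
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


def drop_missing(rows):
    # Hypothetical step: discard rows that contain a None value.
    return [row for row in rows if None not in row.values()]


def run_chain(steps, data):
    # Feed the output of each step into the next one, in list order.
    return reduce(lambda current, step: step(current), steps, data)


rows = [
    {"pid": "001", "height": None},
    {"pid": "002", "height": 188.0},
    {"pid": "002", "height": 188.0},
]
result = run_chain([drop_duplicates, drop_missing], rows)
```

Because the steps are applied in list order, reordering the list changes which intermediate data each step sees, even when the final result happens to be the same.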
-
__init__(processing_operators, observers=None)
Constructs a sequence operator from a list of operators.
- Parameters
processing_operators (list[ProcessingOperator]) – Python list of processing operators
observers (list[Observer]) – Python list of observers
- Raises
ValueError – Raised when one of the objects in the specified list is not a ProcessingOperator
Examples
>>> import numpy as np
>>> import pandas as pd
>>> from tasrif.processing_pipeline import SequenceOperator
>>> from tasrif.processing_pipeline.pandas import DropDuplicatesOperator, DropNAOperator
>>> df = pd.DataFrame({"pid": ['001', '002', '003'],
...                    "height": [np.nan, 188, 170],
...                    "born": [pd.NaT, pd.Timestamp("1940-04-25"),
...                             pd.NaT]})
>>> pipeline = SequenceOperator([DropDuplicatesOperator(), DropNAOperator()])
>>> pipeline.process(df)
(   pid  height       born
1  002   188.0 1940-04-25,)
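The type check described under Raises can be sketched without tasrif. The classes below (`Step`, `Double`, `Pipeline`) are hypothetical stand-ins, not the library's code; they only illustrate rejecting a list that contains something other than an operator, and then running the accepted operators in order.

```python
class Step:
    """Hypothetical base class standing in for ProcessingOperator."""

    def process(self, data):
        raise NotImplementedError


class Double(Step):
    """Hypothetical step that doubles every value."""

    def process(self, data):
        return [x * 2 for x in data]


class Pipeline(Step):
    """Hypothetical stand-in for a sequence of steps."""

    def __init__(self, steps):
        # Validate every element before accepting the list.
        for step in steps:
            if not isinstance(step, Step):
                raise ValueError(f"Not a Step: {step!r}")
        self.steps = list(steps)

    def process(self, data):
        # Pass the data through each step in order.
        for step in self.steps:
            data = step.process(data)
        return data


try:
    Pipeline([Double(), "not a step"])
    error_message = None
except ValueError as exc:
    error_message = str(exc)
```

Validating in the constructor means a malformed pipeline fails when it is built rather than partway through processing.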
-
set_observers(observers)
Stores the observers for the given operator.
- Parameters
observers (list[Observer]) – Observer objects that observe the operator
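As a rough illustration of storing observers on an operator, the following sketch uses only hypothetical classes. The `observe` method name, the `RecordingObserver` class, and the point at which observers are notified are all assumptions made for this example; the actual Observer interface in tasrif may differ.

```python
class RecordingObserver:
    """Hypothetical observer that records what it is shown."""

    def __init__(self):
        self.seen = []

    def observe(self, operator, data):
        # Assumed callback name; stores the operator name and result size.
        self.seen.append((type(operator).__name__, len(data)))


class ObservableStep:
    """Hypothetical operator that keeps a list of observers."""

    def __init__(self, observers=None):
        self._observers = []
        self.set_observers(observers)

    def set_observers(self, observers):
        # Store the observers; None is treated as "no observers".
        self._observers = list(observers) if observers else []

    def process(self, data):
        result = [x for x in data if x is not None]
        # Assumption: observers are notified after the step has run.
        for observer in self._observers:
            observer.observe(self, result)
        return result


recorder = RecordingObserver()
step = ObservableStep()
step.set_observers([recorder])
output = step.process([1, None, 3])
```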
-
is_functional()
Returns whether the operator is functional or infrastructure.
- Returns
True if the operator is functional, False if it is an infrastructure operator
- Return type
is_functional (bool)
-