compose_operator

Module that defines the ComposeOperator class

class tasrif.processing_pipeline.compose_operator.ComposeOperator(processing_operators, observers=None, num_processes=1)

Class representing a composition of processing operators. The same data flows to every operator, and the order of the operators is not important. The output of the process function is a list holding the results of all the contained operators.
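The fan-out behaviour described above can be illustrated with a minimal, library-free sketch. It is not the tasrif implementation: `fan_out`, `drop_duplicates`, and `drop_missing` are hypothetical stand-ins. Wrapping each result in a tuple is an assumption made only to mirror the output shape shown in the documented example.

```python
def fan_out(operators, *data):
    """Hypothetical stand-in for a compose step: every operator sees the same input."""
    results = []
    for operator in operators:
        output = operator(*data)
        # Assumption: wrap single results in a tuple, mirroring the documented output shape.
        if not isinstance(output, tuple):
            output = (output,)
        results.append(output)
    return results


def drop_duplicates(rows):
    """Keep the first occurrence of each row."""
    seen, unique = set(), []
    for row in rows:
        if row not in seen:
            seen.add(row)
            unique.append(row)
    return unique


def drop_missing(rows):
    """Drop rows that contain a missing (None) value."""
    return [row for row in rows if None not in row]


rows = [("001", None), ("002", 188), ("002", 188), ("003", 170)]
results = fan_out([drop_duplicates, drop_missing], rows)

for result in results:
    print(result)
```

Each operator receives the original `rows`, so the second operator's result does not depend on what the first operator removed.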

__init__(processing_operators, observers=None, num_processes=1)

Constructs a compose operator from a list of operators

Parameters
  • processing_operators (list[ProcessingOperator]) – Python list of processing operators

  • observers (list[Observer]) – Python list of observers

  • num_processes (int) – number of logical processes to use to process the operators

Raises

ValueError – Occurs when one of the objects in the specified list is not a ProcessingOperator
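The type check described above might look roughly like the following sketch. The `ProcessingOperator` and `Identity` classes and the `validate_operators` helper are hypothetical stand-ins defined here so the snippet runs on its own. The error message text is likewise an assumption, not the library's wording.

```python
class ProcessingOperator:
    """Hypothetical stand-in for the library's base operator class."""

    def process(self, *data):
        return data


class Identity(ProcessingOperator):
    """Hypothetical operator that returns its input unchanged."""


def validate_operators(processing_operators):
    """Raise ValueError if any item is not a ProcessingOperator (assumed behaviour)."""
    for operator in processing_operators:
        if not isinstance(operator, ProcessingOperator):
            raise ValueError(f"Not a ProcessingOperator: {operator!r}")
    return list(processing_operators)


try:
    validate_operators([Identity(), "not an operator"])
except ValueError as error:
    message = str(error)
    print(message)
```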

Examples

>>> import numpy as np
>>> import pandas as pd
>>> from tasrif.processing_pipeline.pandas import DropDuplicatesOperator, DropNAOperator
>>> from tasrif.processing_pipeline import ComposeOperator
>>> pipeline = ComposeOperator([DropDuplicatesOperator(), DropNAOperator()])
>>> df = pd.DataFrame({"pid": ['001', '002', '003'],
...                    "height": [np.nan, 188, 170],
...                    "born": [pd.NaT, pd.Timestamp("1940-04-25"),
...                             pd.NaT]})
>>> pipeline.process(df)
[(   pid  height       born
  0  001     NaN        NaT
  1  002   188.0 1940-04-25
  2  003   170.0        NaT,),
 (   pid  height       born
  1  002   188.0 1940-04-25,)]

set_observers(observers)

Function to store the observers for the given operator.

Parameters

observers (list[Observer]) – Observer objects that observe the operator

is_functional()

Function that returns whether the operator is functional or infrastructure

Returns

True if the operator is functional, False if it is an infrastructure operator

Return type

bool