set_index_operator
Set the DataFrame index using existing columns.
Set the DataFrame index (row labels) using one or more existing columns or arrays (of the correct length). The index can replace the existing index or expand on it.
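The difference between replacing the index and expanding on it can be sketched with plain pandas. This is a minimal illustration that calls `DataFrame.set_index` directly rather than going through the operator; the small frame and the variable names `replaced` and `expanded` are made up for the example.

```python
import pandas as pd

df = pd.DataFrame(
    [
        [1, "2020-05-01 00:00:00", 1],
        [1, "2020-05-01 01:00:00", 1],
        [2, "2020-05-02 00:00:00", 1],
    ],
    columns=["logId", "timestamp", "sleep_level"],
)

# Replace the default integer index with the 'timestamp' column.
# The column is moved out of the data and becomes the row labels.
replaced = df.set_index("timestamp")

# Expand on the existing index: append=True keeps the old index as the
# outer level and adds 'timestamp' as an inner level of a MultiIndex.
expanded = df.set_index("timestamp", append=True)

print(replaced.index.name)
print(expanded.index.nlevels)
```

With `append=True` the original integer labels are kept, so rows can still be addressed by their old position-like label together with the timestamp.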
class tasrif.processing_pipeline.pandas.set_index_operator.SetIndexOperator(keys, **kwargs)
Examples
>>> import pandas as pd
>>> from tasrif.processing_pipeline.pandas import SetIndexOperator
>>> df = pd.DataFrame([
...     [1, "2020-05-01 00:00:00", 1],
...     [1, "2020-05-01 01:00:00", 1],
...     [1, "2020-05-01 03:00:00", 2],
...     [2, "2020-05-02 00:00:00", 1],
...     [2, "2020-05-02 01:00:00", 1]],
...     columns=['logId', 'timestamp', 'sleep_level'])
>>> df
   logId            timestamp  sleep_level
0      1  2020-05-01 00:00:00            1
1      1  2020-05-01 01:00:00            1
2      1  2020-05-01 03:00:00            2
3      2  2020-05-02 00:00:00            1
4      2  2020-05-02 01:00:00            1

>>> op = SetIndexOperator('timestamp')
>>> op.process(df)
[                     logId  sleep_level
 timestamp
 2020-05-01 00:00:00      1            1
 2020-05-01 01:00:00      1            1
 2020-05-01 03:00:00      1            2
 2020-05-02 00:00:00      2            1
 2020-05-02 01:00:00      2            1]
__init__(keys, **kwargs)
Initializes the operator.
Parameters
keys (str or list) – This parameter can be either a single column key, a single array of the same length as the calling DataFrame, or a list containing an arbitrary combination of column keys and arrays.
**kwargs – keyword arguments passed to the pandas DataFrame.set_index method
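The accepted forms of `keys` can be sketched with plain pandas, assuming the operator forwards `keys` and `**kwargs` to `DataFrame.set_index` (an assumption based on the operator's name and description, not confirmed here). The frame, the `labels` index, and the result names below are hypothetical.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "logId": [1, 1, 2],
        "timestamp": [
            "2020-05-01 00:00:00",
            "2020-05-01 01:00:00",
            "2020-05-02 00:00:00",
        ],
        "sleep_level": [1, 1, 2],
    }
)

# A single column key.
by_column = df.set_index("timestamp")

# A list of column keys produces a MultiIndex.
by_columns = df.set_index(["logId", "timestamp"])

# A single array-like of the same length as the DataFrame.
labels = pd.Index(["a", "b", "c"], name="label")
by_array = df.set_index(labels)

# A list mixing a column key and an array-like.
mixed = df.set_index(["logId", labels])

# Extra keyword arguments such as drop=False keep the column in the data
# as well as using it for the index.
kept = df.set_index("timestamp", drop=False)

print(list(by_columns.index.names))
print(list(mixed.index.names))
print(list(kept.columns))
```

Under the same assumption, the last case would correspond to `SetIndexOperator('timestamp', drop=False)`.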