Transformers

A transformer converts a set of columns and arbitrary values into a single transformed column. Each transformer has an input type and an output column type.

Custom transformers can be implemented in Python or PySpark. See the implementation docs for a detailed guide.

Config

- kind: transformer
  name: <string>  # transformer name (required)
  path: <string>  # path to the implementation file, relative to the cortex root (default: implementations/transformers/<name>.py)
  output_type: <column_type>  # the type of column that will be generated by this transformer (required)
  input: <input_type>  # the input type of the transformer (required)

See Data Types for details about input and column types.

Example

- kind: transformer
  name: normalize
  output_type: FLOAT_COLUMN
  input:
    num: INT_COLUMN|FLOAT_COLUMN
    mean: FLOAT
    stddev: FLOAT
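
To make the config above more concrete, here is a minimal sketch of what the Python side of the normalize transformer might look like. The function name transform_python and the dict-shaped argument keyed by the input names (num, mean, stddev) are assumptions made for illustration; the implementation docs define the actual interface.

```python
# Hypothetical implementation file: implementations/transformers/normalize.py
# Assumption: the transformer receives a dict whose keys match the names
# declared under `input` in the config (num, mean, stddev).

def transform_python(inputs):
    """Scale a single numeric value to zero mean and unit variance."""
    return (inputs["num"] - inputs["mean"]) / inputs["stddev"]


# Example call with made-up values.
row = {"num": 10, "mean": 4.0, "stddev": 2.0}
print(transform_python(row))  # prints 3.0
```

The result is a float, which matches the FLOAT_COLUMN output type declared in the config. This sketch does not guard against a stddev of zero.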

Built-in Transformers

Cortex includes common transformers that can be used out of the box (see transformers.yaml). To use a built-in transformer, prefix its name with the cortex namespace (e.g. cortex.normalize).