OrderedUniformEncoder
Compatibility: categorical or boolean data
The OrderedUniformEncoder transforms data that represents ordered categorical values into a uniform distribution in the [0,1] interval. It preserves the frequencies of each category with high accuracy.
(required) order: Specify an order for the category values
[list <value>]: An ordered list of the categories that appear in the real data
The transformer assigns each category to a unique, non-overlapping subset of the [0,1] interval. The order of the intervals follows your custom order. The length of each interval equals the category's frequency. For example, if category 'AGREE' occurs with 20% frequency, its interval has length 0.2, such as [0.5, 0.7].
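The interval assignment described above can be sketched in plain Python. This is an illustration of the idea only, not the library's implementation: the helper name `build_intervals` and the sample data are assumptions made for this example.

```python
from collections import Counter


def build_intervals(values, order):
    """Map each category in `order` to a (start, end) slice of [0, 1].

    Hypothetical helper for illustration; it is not part of the library.
    """
    if not values:
        raise ValueError('values must not be empty')

    counts = Counter(values)
    total = len(values)
    intervals = {}
    start = 0.0
    for category in order:
        # The interval length equals the category's observed frequency.
        frequency = counts[category] / total
        end = start + frequency
        intervals[category] = (start, end)
        # The next category starts where this one ends, so intervals never overlap.
        start = end
    return intervals


# Made-up survey data: 50% DISAGREE, 20% AGREE, 30% STRONGLY AGREE.
responses = ['DISAGREE'] * 5 + ['AGREE'] * 2 + ['STRONGLY AGREE'] * 3
order = ['DISAGREE', 'AGREE', 'STRONGLY AGREE']
intervals = build_intervals(responses, order)
```

With this data, 'AGREE' is placed right after 'DISAGREE' and covers an interval of length 0.2, approximately [0.5, 0.7], which matches the example above.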
After fitting the transformer, you can access the learned values through the following attributes.
frequencies: A dictionary that maps each category value to its observed frequency, as a float between 0 and 1
intervals: A dictionary that maps each category value to its interval within [0,1]. This allows you to determine the exact rules used for transforming and reverse transforming.
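To show how an interval mapping of this kind can drive both directions, here is a minimal sketch in plain Python. It assumes the mapping stores (start, end) pairs and that a transformed value is drawn uniformly from inside the category's interval. The dictionary contents and both function names are hypothetical; the library's internal logic may differ.

```python
import random

# Assumed shape of an intervals mapping: category -> (start, end).
example_intervals = {
    'DISAGREE': (0.0, 0.5),
    'AGREE': (0.5, 0.7),
    'STRONGLY AGREE': (0.7, 1.0),
}


def transform_value(category, intervals, rng):
    """Draw a float uniformly from the category's interval."""
    start, end = intervals[category]
    return rng.uniform(start, end)


def reverse_transform_value(number, intervals):
    """Return the category whose interval contains the number."""
    # Clamp so that out-of-range inputs still map to a valid category.
    number = min(max(number, 0.0), 1.0)
    for category, (start, end) in intervals.items():
        if start <= number < end:
            return category
    # Only number == 1.0 reaches this point; it belongs to the last category.
    return list(intervals)[-1]


rng = random.Random(0)  # Seeded so the sketch is reproducible.
encoded = transform_value('AGREE', example_intervals, rng)
decoded = reverse_transform_value(encoded, example_intervals)
```

Because every category owns a distinct slice of [0,1], the reverse step recovers the original category from any value drawn inside its interval.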
This transformer is only defined for ordinal categorical data. If the categories have no order, your data is nominal; use a transformer designed for nominal data instead.