OrderedUniformEncoder
Compatibility: categorical or boolean data
The OrderedUniformEncoder transforms data that represents ordered categorical values into a uniform distribution in the [0,1] interval. It preserves the frequencies of each category with high accuracy.

from rdt.transformers.categorical import OrderedUniformEncoder
transformer = OrderedUniformEncoder(order=['STRONGLY DISAGREE', 'DISAGREE', 'NEUTRAL',
'AGREE', 'STRONGLY AGREE'])
Parameters
(required) order: Specify an order for the category values
[list <value>]: An ordered list of the categories that appear in the real data
Examples
from rdt.transformers.categorical import OrderedUniformEncoder
transformer = OrderedUniformEncoder(order=['STRONGLY DISAGREE', 'DISAGREE', 'NEUTRAL',
'AGREE', 'STRONGLY AGREE'])
The transformer assigns each category to a unique, non-overlapping subset of the [0,1] interval. The order of the intervals follows your custom order, and the length of each interval equals the category's frequency. For example, if category 'AGREE' occurs with 20% frequency, its subset has length 0.2, such as [0.5, 0.7].
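The rule above can be sketched in plain Python. This is only an illustration of the idea, not the library's implementation: the helper name `build_intervals` is hypothetical, and the frequencies are the example values used elsewhere on this page.

```python
from itertools import accumulate


def build_intervals(order, frequencies):
    """Map each category to a [start, end] slice of [0, 1], following `order`.

    Hypothetical helper for illustration; not part of RDT.
    """
    # Running totals of the frequencies give the right edge of each slice.
    ends = list(accumulate(frequencies[category] for category in order))
    # Each slice starts where the previous one ended; the first starts at 0.
    starts = [0.0] + ends[:-1]
    return {
        category: [start, end]
        for category, start, end in zip(order, starts, ends)
    }


order = ['STRONGLY DISAGREE', 'DISAGREE', 'NEUTRAL', 'AGREE', 'STRONGLY AGREE']
frequencies = {
    'STRONGLY DISAGREE': 0.1,
    'DISAGREE': 0.2,
    'NEUTRAL': 0.2,
    'AGREE': 0.2,
    'STRONGLY AGREE': 0.3,
}

intervals = build_intervals(order, frequencies)
start, end = intervals['AGREE']
# Rounded because summing floats introduces tiny errors.
print(round(start, 2), round(end, 2))  # 0.5 0.7
```

Because the slices are laid out in the order you supply, a category that comes earlier in `order` always maps to smaller values than one that comes later.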
Attributes
After fitting the transformer, you can access the learned values through the attributes.
frequencies: A dictionary that maps each category value to its observed frequency, as a float between 0 and 1
>>> transformer.frequencies
{
'STRONGLY DISAGREE': 0.1,
'DISAGREE': 0.2,
'NEUTRAL': 0.2,
'AGREE': 0.2,
'STRONGLY AGREE': 0.3
}
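For intuition, an observed frequency is a category's count divided by the total number of rows. The sketch below shows that arithmetic on a small made-up column. The helper `observed_frequencies` is hypothetical, and giving a frequency of 0 to categories that are listed in the order but absent from the data is an assumption of this sketch, not documented library behavior.

```python
from collections import Counter


def observed_frequencies(values, order):
    """Return the share of rows taken by each category, keyed in `order`.

    Hypothetical helper for illustration; not part of RDT.
    """
    counts = Counter(values)
    total = len(values)
    # Counter returns 0 for categories that never appear (sketch assumption).
    return {category: counts[category] / total for category in order}


order = ['STRONGLY DISAGREE', 'DISAGREE', 'NEUTRAL', 'AGREE', 'STRONGLY AGREE']
# Made-up survey responses, used only for this example.
responses = ['AGREE', 'NEUTRAL', 'AGREE', 'DISAGREE', 'STRONGLY AGREE']

example_frequencies = observed_frequencies(responses, order)
print(example_frequencies['AGREE'])  # 0.4
```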
intervals: A dictionary that maps each category value to an interval within [0,1]. This allows you to determine the exact rules used for transforming and reverse transforming.
>>> transformer.intervals
{
'STRONGLY DISAGREE': [0, 0.1],
'DISAGREE': [0.1, 0.3],
'NEUTRAL': [0.3, 0.5],
'AGREE': [0.5, 0.7],
'STRONGLY AGREE': [0.7, 1.0]
}
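As a rough sketch of how such intervals could drive both directions, the code below draws a uniform value inside a category's interval for the forward step and looks up the containing interval for the reverse step. The function names `to_uniform` and `from_uniform` are hypothetical, and uniform sampling within the interval and the half-open boundary handling are assumptions of this sketch rather than confirmed library behavior.

```python
import random

# Same values as the example output above, copied here so the sketch is self-contained.
intervals = {
    'STRONGLY DISAGREE': [0, 0.1],
    'DISAGREE': [0.1, 0.3],
    'NEUTRAL': [0.3, 0.5],
    'AGREE': [0.5, 0.7],
    'STRONGLY AGREE': [0.7, 1.0],
}


def to_uniform(category, intervals, rng):
    """Forward step (assumed): draw a value uniformly from the category's interval."""
    start, end = intervals[category]
    return rng.uniform(start, end)


def from_uniform(value, intervals):
    """Reverse step (assumed): return the category whose interval contains `value`."""
    for category, (start, end) in intervals.items():
        # Treat intervals as half-open so shared edges belong to one category only.
        if start <= value < end:
            return category
    # Values at or beyond the top edge (such as 1.0) fall into the last category.
    return list(intervals)[-1]


rng = random.Random(0)  # seeded so the example is reproducible
value = to_uniform('AGREE', intervals, rng)
print(from_uniform(value, intervals))  # AGREE
```

Under these assumptions, reverse transforming any value drawn for a category recovers that category, which is why the intervals fully describe the round trip.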