OrderedUniformEncoder
Compatibility: categorical or boolean data
The OrderedUniformEncoder transforms data that represents ordered categorical values into a uniform distribution in the [0,1] interval. It preserves the frequencies of each category with high accuracy.

from rdt.transformers.categorical import OrderedUniformEncoder
transformer = OrderedUniformEncoder(order=['STRONGLY DISAGREE', 'DISAGREE', 'NEUTRAL',
                                           'AGREE', 'STRONGLY AGREE'])

Parameters
(required) order: Specify the order of the category values
[list <value>]
An ordered list of the categories that appear in the real data
Examples
from rdt.transformers.categorical import OrderedUniformEncoder
transformer = OrderedUniformEncoder(order=['STRONGLY DISAGREE', 'DISAGREE', 'NEUTRAL',
                                           'AGREE', 'STRONGLY AGREE'])

The transformer assigns each category to a unique, non-overlapping subset of the [0,1] interval. The intervals are arranged according to your custom order, and the length of each interval equals the category's frequency. For example, if the category 'AGREE' occurs with 20% frequency, its interval has length 0.2, such as [0.5, 0.7].
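The interval assignment described above can be sketched in plain Python. This is an illustration only, not the library's implementation: the helper name build_intervals and the sample data are hypothetical, and the sketch assumes every value in the data appears in the order list.

```python
from collections import Counter


def build_intervals(values, order):
    """Map each category in `order` to a [start, end] slice of [0, 1].

    Hypothetical helper for illustration; not part of RDT.
    """
    counts = Counter(values)
    total = len(values)
    intervals = {}
    start = 0.0
    for category in order:
        # The interval length equals the category's observed frequency.
        end = start + counts[category] / total
        intervals[category] = [start, end]
        # The next category begins where this one ends.
        start = end
    return intervals


order = ['STRONGLY DISAGREE', 'DISAGREE', 'NEUTRAL', 'AGREE', 'STRONGLY AGREE']

# Made-up sample with frequencies 0.1, 0.2, 0.2, 0.2 and 0.3.
values = (['STRONGLY DISAGREE'] * 1 + ['DISAGREE'] * 2 + ['NEUTRAL'] * 2
          + ['AGREE'] * 2 + ['STRONGLY AGREE'] * 3)

intervals = build_intervals(values, order)
for category in order:
    start, end = intervals[category]
    print(f'{category}: [{start:.1f}, {end:.1f}]')
```

Because the interval boundaries are cumulative sums of floats, they may differ from the rounded values by a tiny amount, which is why the printout formats them to one decimal place.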
Attributes
After fitting the transformer, you can access the learned values through the attributes.
frequencies: A dictionary that maps each category value to the observed frequency, as a float between 0 and 1
>>> transformer.frequencies
{
'STRONGLY DISAGREE': 0.1,
'DISAGREE': 0.2,
'NEUTRAL': 0.2,
'AGREE': 0.2,
'STRONGLY AGREE': 0.3
}

intervals: A dictionary that maps each category value to its interval within [0,1]. This allows you to determine the exact rules used for transforming and reverse transforming.
>>> transformer.intervals
{
'STRONGLY DISAGREE': [0, 0.1],
'DISAGREE': [0.1, 0.3],
'NEUTRAL': [0.3, 0.5],
'AGREE': [0.5, 0.7],
'STRONGLY AGREE': [0.7, 1.0]
}
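As a rough sketch of how such an intervals dictionary could drive transforming and reverse transforming, the plain-Python example below draws a number uniformly from a category's interval and maps a number back to the category whose interval contains it. The function names are hypothetical, and the treatment of shared boundaries (half-open intervals, with 1.0 assigned to the last category) is an assumption; the library's actual behavior may differ.

```python
import random

# Same values as the intervals attribute shown above.
intervals = {
    'STRONGLY DISAGREE': [0, 0.1],
    'DISAGREE': [0.1, 0.3],
    'NEUTRAL': [0.3, 0.5],
    'AGREE': [0.5, 0.7],
    'STRONGLY AGREE': [0.7, 1.0],
}


def transform_value(category, rng):
    """Draw a number uniformly from the category's interval (illustrative)."""
    start, end = intervals[category]
    return start + (end - start) * rng.random()


def reverse_transform_value(number):
    """Return the category whose interval contains `number` (illustrative)."""
    for category, (start, end) in intervals.items():
        # Assumption: intervals are treated as half-open, [start, end).
        if start <= number < end:
            return category
    # Assumption: a value at the upper bound belongs to the last category.
    return list(intervals)[-1]


rng = random.Random(0)  # seeded so repeated runs give the same draws
number = transform_value('AGREE', rng)
print(number, reverse_transform_value(number))
```

Since each category's interval length matches its frequency, applying this kind of mapping to a whole column spreads the numbers roughly uniformly over [0,1].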
