# OrderedUniformEncoder

**Compatibility:** `categorical` or `boolean` data

The `OrderedUniformEncoder` transforms data that represents ordered categorical values into a uniform distribution in the `[0,1]` interval. It preserves the frequencies of each category with high accuracy.

<figure><img src="https://2225246359-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FVGX92M819eIp0rMg5elc%2Fuploads%2FZXTbdkN7BsWHivuXFGLP%2Frdt_transformers-glossary-categorical-ordered-uniform-encoder_June%2002%202025.png?alt=media&#x26;token=87f000f1-4e59-445c-95e0-913645d8cdd6" alt=""><figcaption></figcaption></figure>

```python
from rdt.transformers.categorical import OrderedUniformEncoder

transformer = OrderedUniformEncoder(order=['STRONGLY DISAGREE', 'DISAGREE', 'NEUTRAL',
                                           'AGREE', 'STRONGLY AGREE'])
```

## Parameters

(required) **`order`**: Specify the order of the category values

<table data-header-hidden><thead><tr><th width="220.5"></th><th></th></tr></thead><tbody><tr><td><code>[list &#x3C;value>]</code></td><td>An ordered list of the categories that appear in the real data</td></tr></tbody></table>

### Examples

```python
from rdt.transformers.categorical import OrderedUniformEncoder

transformer = OrderedUniformEncoder(order=['STRONGLY DISAGREE', 'DISAGREE', 'NEUTRAL',
                                           'AGREE', 'STRONGLY AGREE'])
```

The transformer assigns each category to a unique, non-overlapping subset of the `[0,1]` interval. The intervals are arranged according to your custom order, and the length of each interval equals the category's frequency. For example, if category `'AGREE'` occurs with 20% frequency, its subset has length `0.2`, such as `[0.5, 0.7]`.
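The interval assignment described above can be sketched in plain Python. This is a minimal illustration, not RDT's implementation: the helper name `sketch_intervals` and the sample column `values` are hypothetical, and the sketch assumes every value in the column appears in `order`.

```python
from collections import Counter


def sketch_intervals(values, order):
    """Illustrative only: map each category to a slice of [0, 1]."""
    counts = Counter(values)
    total = len(values)
    intervals = {}
    seen = 0  # running count of rows already assigned to an interval
    for category in order:
        start = seen / total
        seen += counts[category]
        intervals[category] = [start, seen / total]
    return intervals


order = ['STRONGLY DISAGREE', 'DISAGREE', 'NEUTRAL', 'AGREE', 'STRONGLY AGREE']

# Hypothetical column with frequencies 10%, 20%, 20%, 20%, 30%
values = (['STRONGLY DISAGREE'] * 1 + ['DISAGREE'] * 2 + ['NEUTRAL'] * 2
          + ['AGREE'] * 2 + ['STRONGLY AGREE'] * 3)

intervals = sketch_intervals(values, order)
print(intervals['AGREE'])  # [0.5, 0.7]
```

Each interval starts where the previous one ends, so the position of a category in `order` determines where its interval sits, while its frequency determines how wide the interval is.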

## Attributes

After fitting the transformer, you can access the learned values through the attributes.

**`frequencies`**: A dictionary that maps each category value to the observed frequency, as a float between 0 and 1

```python
>>> transformer.frequencies
{
  'STRONGLY DISAGREE': 0.1, 
  'DISAGREE': 0.2,
  'NEUTRAL': 0.2,
  'AGREE': 0.2,
  'STRONGLY AGREE': 0.3
}
```

**`intervals`**: A dictionary that maps each category value to its interval within `[0,1]`. This allows you to determine the exact rules used for transforming and reverse transforming.

```python
>>> transformer.intervals
{
  'STRONGLY DISAGREE': [0, 0.1], 
  'DISAGREE': [0.1, 0.3],
  'NEUTRAL': [0.3, 0.5],
  'AGREE': [0.5, 0.7],
  'STRONGLY AGREE': [0.7, 1.0]
}
```
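To show how such intervals can act as transformation rules, here is a minimal sketch in plain Python. It is not RDT's code: the function names `sketch_transform` and `sketch_reverse_transform` are hypothetical, and the sketch assumes that a value is drawn uniformly from the category's interval and that each boundary belongs to the interval that starts at it.

```python
import random

# Same intervals as in the example output above
intervals = {
    'STRONGLY DISAGREE': [0, 0.1],
    'DISAGREE': [0.1, 0.3],
    'NEUTRAL': [0.3, 0.5],
    'AGREE': [0.5, 0.7],
    'STRONGLY AGREE': [0.7, 1.0],
}


def sketch_transform(category, rng):
    """Draw a number uniformly from the category's interval (assumption)."""
    low, high = intervals[category]
    return low + rng.random() * (high - low)


def sketch_reverse_transform(number):
    """Return the category whose interval contains the number."""
    for category, (low, high) in intervals.items():
        if low <= number < high:
            return category
    # A value of exactly 1.0 falls through; assign it to the last category
    return list(intervals)[-1]


rng = random.Random(0)  # seeded for reproducibility
encoded = sketch_transform('AGREE', rng)
print(sketch_reverse_transform(encoded))  # AGREE
print(sketch_reverse_transform(0.05))     # STRONGLY DISAGREE
```

Because the intervals do not overlap, any number in `[0,1]` maps back to exactly one category under these assumptions.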

## FAQs

<details>

<summary>When should I use this transformer?</summary>

The `OrderedUniformEncoder` preserves the frequency of each category value with high accuracy. This is especially useful if your data is imbalanced, meaning some categories occur much more often than others.

</details>

<details>

<summary>What if my categorical column does not have an order?</summary>

This transformer is only defined for ordinal categorical data. If your categories have no order, your data is *nominal*. Use the [**UniformEncoder**](https://docs.sdv.dev/rdt/transformers-glossary/categorical/uniformencoder) instead.

</details>

<details>

<summary>What happens to missing values?</summary>

If there are missing values in your data, include them in your order. Use the `None` keyword to denote a missing value.

In the example below, the missing value is added as the last item.

```python
OrderedUniformEncoder(order=['STRONGLY DISAGREE', 'DISAGREE',
                             'NEUTRAL', 'AGREE', 'STRONGLY AGREE',
                             None])
```

Add the missing value to whatever ordering position makes sense for your data. If you are unsure, consider adding it to the beginning or the end of the list.

</details>
