# ＊ OutlierEncoder

{% hint style="info" %}
**＊SDV Enterprise Feature.** This feature is available to our licensed users and is not currently in our public library. For more information, visit our page to [Explore SDV](https://docs.sdv.dev/sdv/explore/sdv-enterprise/compare-features).
{% endhint %}

**Compatibility:** `numerical` data

The `OutlierEncoder` identifies outliers to the left and right of the main data and encodes this information in a new column. It then removes the outliers from the original column, which makes the column easier to use in downstream data science work.

<figure><img src="https://2225246359-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FVGX92M819eIp0rMg5elc%2Fuploads%2FsadQ2YdZKzOHFGRGiG3Y%2Frdt_transformers-glossary-numerical-outlier-encoder_June%2003%202025.png?alt=media&#x26;token=eb5ac516-a636-4a2f-b7a0-b9ed2edc68e2" alt=""><figcaption></figcaption></figure>

```python
from rdt.transformers.numerical import OutlierEncoder

transformer = OutlierEncoder()
```

## Parameters

**`distribution`**: The transformer approximates the shape (aka distribution) of the main values as well as the outliers. Use this parameter to specify the shape.

<table data-header-hidden><thead><tr><th width="226.5"></th><th></th></tr></thead><tbody><tr><td>(default) <code>'uniform'</code></td><td>Estimate the main values and outliers as uniform distributions.</td></tr><tr><td><code>'truncnorm'</code></td><td>Estimate the main values and outliers as truncated Gaussian distributions.</td></tr></tbody></table>

## Attributes

After fitting the transformer, you can access the learned values through the following attributes.

**`box_plot_summary`**: A dictionary that stores the min, max and quartile values for the overall column

```python
>>> transformer.box_plot_summary
{
  'min': 0.0,
  'Q1': 5.0,
  'Q2': 10.5,
  'Q3': 25.0,
  'max': 10000.0
}
```

**`iqr`**: A float that represents the [Interquartile Range](https://en.wikipedia.org/wiki/Interquartile_range#Outliers)

```python
>>> transformer.iqr
20.0
```
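
The summary and IQR values shown above can be reproduced outside the transformer. The sketch below uses Python's standard `statistics` module on a small hypothetical dataset chosen to match the example values. The helper name `box_plot_summary` and the `'inclusive'` quartile method are assumptions for illustration; the transformer's internal quartile computation may differ.

```python
import statistics


def box_plot_summary(values):
    """Return the min, quartiles and max of a list of numbers."""
    # Assumption: quartiles use linear interpolation over the observed data
    # (the 'inclusive' method); the transformer may compute them differently.
    q1, q2, q3 = statistics.quantiles(values, n=4, method='inclusive')
    return {'min': min(values), 'Q1': q1, 'Q2': q2, 'Q3': q3, 'max': max(values)}


# Hypothetical column values, chosen to match the example output above
values = [0.0, 5.0, 5.0, 10.0, 11.0, 25.0, 25.0, 10000.0]

summary = box_plot_summary(values)

# The interquartile range is the distance between the third and first quartiles
iqr = summary['Q3'] - summary['Q1']
```

With this data, `iqr` is `20.0`: the gap between `Q3` (25.0) and `Q1` (5.0).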

**`outlier_ranges`**: A dictionary that maps `'left_outliers'` to the left outlier range and `'right_outliers'` to the right outlier range. Either value may be `None` if there are no outliers on that side.

```python
>>> transformer.outlier_ranges
{
  'left_outliers': None,
  'right_outliers': [55.0, 10000.0]
}
```
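
The example ranges are consistent with the common 1.5 × IQR rule described in the Interquartile Range article linked above. The sketch below applies that rule to a box plot summary. The 1.5 multiplier and the helper name `compute_outlier_ranges` are assumptions for illustration, not a confirmed description of the transformer's internals.

```python
def compute_outlier_ranges(summary, multiplier=1.5):
    """Derive left and right outlier ranges from a box plot summary."""
    iqr = summary['Q3'] - summary['Q1']

    # Fences: values beyond these points are treated as outliers
    lower_fence = summary['Q1'] - multiplier * iqr
    upper_fence = summary['Q3'] + multiplier * iqr

    # A side has no range (None) when no value falls beyond its fence
    left = [summary['min'], lower_fence] if summary['min'] < lower_fence else None
    right = [upper_fence, summary['max']] if summary['max'] > upper_fence else None
    return {'left_outliers': left, 'right_outliers': right}


# Same values as the box_plot_summary example
summary = {'min': 0.0, 'Q1': 5.0, 'Q2': 10.5, 'Q3': 25.0, 'max': 10000.0}
ranges = compute_outlier_ranges(summary)
```

Here the lower fence is 5.0 − 1.5 × 20.0 = −25.0, which is below the minimum of 0.0, so there are no left outliers. The upper fence is 25.0 + 1.5 × 20.0 = 55.0, so the right outlier range is `[55.0, 10000.0]`.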

**`learned_distributions`**: A dictionary that maps `'left_outliers'`, `'main'` and `'right_outliers'` to the learned distribution for each area. These may be `None` if there are no values in the area.

```python
>>> transformer.learned_distributions
{
  'LEFT_OUTLIER': None,
  'MAIN': { 
    'distribution': 'uniform',
    'learned_parameters': { 'scale': 1.2, 'loc': 25.0 },
  },
  'RIGHT_OUTLIER': { 
    'distribution': 'uniform',
    'learned_parameters': { 'scale': 1.2, 'loc': 40.0 }
  }
}
```
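
To give a sense of how learned parameters like these could be turned back into values, here is a minimal sketch that samples from a uniform distribution. It assumes `loc` and `scale` follow the SciPy-style convention, where the uniform distribution covers `[loc, loc + scale]`. That convention, and the helper name `sample_uniform`, are assumptions for illustration only.

```python
import random


def sample_uniform(learned_parameters, size, seed=None):
    """Draw values from a uniform distribution described by loc and scale."""
    rng = random.Random(seed)

    # Assumption: the distribution spans [loc, loc + scale]
    low = learned_parameters['loc']
    high = low + learned_parameters['scale']
    return [rng.uniform(low, high) for _ in range(size)]


# Parameters copied from the 'RIGHT_OUTLIER' entry in the example above
right_outlier_parameters = {'scale': 1.2, 'loc': 40.0}
samples = sample_uniform(right_outlier_parameters, size=5, seed=0)
```

Under this assumption, every sampled value falls between 40.0 and 41.2.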

## FAQs

<details>

<summary>When should I use this transformer?</summary>

This transformer is designed for numerical columns that contain outliers. The outliers may be on the left, right or both sides.

</details>

<details>

<summary>Will I see outliers again when I reverse transform the data?</summary>

Yes! If the initial data had outliers, the transformer will recreate outliers when reverse transforming the data. The outlier values it generates are estimates based on the learned parameters.

</details>
