* OutlierEncoder
*SDV Enterprise Feature. This feature is available to our licensed users and is not currently in our public library. To learn more about SDV Enterprise and its extra features, get in touch with us.
Compatibility: numerical data
The OutlierEncoder identifies the outliers to the left and right of the main data and encodes this information in a new column. It then removes the outliers from the original column, which makes the column easier to use in later data science work.
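The idea described above can be illustrated with a minimal sketch in plain Python. This is not the transformer's implementation: the function name `split_outliers`, the flag labels, the `lower`/`upper` cut-offs and the choice to replace outliers with `None` are all assumptions made for illustration.

```python
def split_outliers(values, lower, upper):
    """Split a numeric column into a cleaned column and a flag column.

    `lower` and `upper` are hypothetical cut-offs that bound the main data.
    Values outside them are treated as left or right outliers.
    """
    cleaned, flags = [], []
    for value in values:
        if value < lower:
            flags.append('left_outlier')
            cleaned.append(None)  # assumption: outliers become missing values
        elif value > upper:
            flags.append('right_outlier')
            cleaned.append(None)
        else:
            flags.append('main')
            cleaned.append(value)
    return cleaned, flags


ages = [-40, 21, 25, 30, 34, 38, 250]
cleaned, flags = split_outliers(ages, lower=0, upper=120)
print(cleaned)
print(flags)
```

The flag list plays the role of the new column, while the cleaned list keeps only the main values.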
Parameters
distribution: The transformer approximates the shape (aka distribution) of the main values as well as the outliers. Use this parameter to specify the shape.
| (default) | Estimate the main values and outliers as uniform distributions. |
| | Estimate the main values and outliers using a truncated Gaussian distribution. |
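To make the two shapes more concrete, here is a rough sketch of what approximating a set of values with each one could look like. The helper names and the returned dictionary keys are hypothetical, and estimating the truncated Gaussian from the sample mean and standard deviation is a simplification, not necessarily how the transformer fits it.

```python
import random
import statistics


def fit_uniform(values):
    # A uniform distribution is fully described by its bounds.
    return {'low': min(values), 'high': max(values)}


def fit_truncated_gaussian(values):
    # Simplified estimate: sample mean and standard deviation, plus the bounds.
    return {
        'mean': statistics.fmean(values),
        'std': statistics.pstdev(values),
        'low': min(values),
        'high': max(values),
    }


def sample_truncated_gaussian(params, rng):
    # Rejection sampling: draw from the Gaussian until the draw is in bounds.
    while True:
        draw = rng.gauss(params['mean'], params['std'])
        if params['low'] <= draw <= params['high']:
            return draw


main_values = [10, 12, 14, 16, 18]
uniform_params = fit_uniform(main_values)
gaussian_params = fit_truncated_gaussian(main_values)

rng = random.Random(0)
samples = [sample_truncated_gaussian(gaussian_params, rng) for _ in range(5)]
print(uniform_params)
print(gaussian_params)
print(samples)
```

A uniform shape spreads values evenly between the bounds, while a truncated Gaussian concentrates them around the mean and still never leaves the bounds.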
Attributes
After fitting the transformer, you can access the learned values through the attributes.
box_plot_summary: A dictionary that stores the min, max and quartile values for the overall column.
iqr: A float that represents the interquartile range (IQR).
outlier_ranges: A dictionary that maps 'left_outliers' to the left outlier range and 'right_outliers' to the right outlier range. These may be None if there are no outliers.
learned_distributions: A dictionary that maps 'left_outliers', 'main' and 'right_outliers' to the learned distribution for each area. These may be None if there are no values in the area.
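As a rough illustration of what these learned values could contain, the sketch below computes a box plot summary, the interquartile range and the outlier ranges with the Python standard library. It assumes the common 1.5 × IQR rule for deciding what counts as an outlier. The source does not state which rule the transformer uses, and the function name `summarize` and the summary keys ('min', 'q1', 'median', 'q3', 'max') are hypothetical.

```python
import statistics


def summarize(values, k=1.5):
    """Return (box_plot_summary, iqr, outlier_ranges) for a numeric column.

    Assumption: values further than k * IQR from the quartiles are outliers.
    """
    q1, median, q3 = statistics.quantiles(values, n=4, method='inclusive')
    iqr = q3 - q1
    low_fence = q1 - k * iqr
    high_fence = q3 + k * iqr

    left = [v for v in values if v < low_fence]
    right = [v for v in values if v > high_fence]

    box_plot_summary = {
        'min': min(values),
        'q1': q1,
        'median': median,
        'q3': q3,
        'max': max(values),
    }
    # A side with no outliers maps to None, as described above.
    outlier_ranges = {
        'left_outliers': (min(left), max(left)) if left else None,
        'right_outliers': (min(right), max(right)) if right else None,
    }
    return box_plot_summary, iqr, outlier_ranges


column = [1, 2, 3, 4, 5, 6, 7, 8, 100]
box_plot_summary, iqr, outlier_ranges = summarize(column)
print(box_plot_summary)
print(iqr)
print(outlier_ranges)
```

In this example only the right side has an outlier, so the entry for the left side is None.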