❖ DPDiscreteECDFNormalizer
Compatibility: categorical data
The DPDiscreteECDFNormalizer uses differential privacy techniques to normalize your categorical values into a numerical column that is uniform or normal. To do this, it estimates the empirical distribution of the categories and adds differentially private noise to your data. (On the reverse transform, this transformer converts the data back into the original category values.)
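The general idea can be sketched in plain Python. This is an illustration only, not the transformer's actual implementation: it assumes Laplace noise with scale 1/epsilon is added to each category count, and the helper names (laplace_noise, fit_noisy_intervals, transform, reverse_transform) are hypothetical. Each category receives a slice of [0, 1] sized by its noisy frequency; the forward transform draws a number from that slice, and the reverse transform looks up which slice a number falls in.

```python
import random
from collections import Counter


def laplace_noise(scale, rng):
    """Draw from Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def fit_noisy_intervals(values, epsilon, rng):
    """Give each category a slice of [0, 1] sized by its noisy frequency."""
    counts = Counter(values)
    # Assumption: Laplace noise with scale 1/epsilon on every count,
    # clamped so that no category ends up with a non-positive weight.
    noisy = {
        category: max(count + laplace_noise(1 / epsilon, rng), 1e-9)
        for category, count in counts.items()
    }
    total = sum(noisy.values())
    intervals, low = {}, 0.0
    for category, weight in noisy.items():
        high = low + weight / total
        intervals[category] = (low, high)
        low = high
    return intervals


def transform(values, intervals, rng):
    """Replace each category with a uniform draw from its slice."""
    return [rng.uniform(*intervals[value]) for value in values]


def reverse_transform(numbers, intervals):
    """Map each number back to the category whose slice contains it."""
    categories = list(intervals)
    result = []
    for number in numbers:
        match = categories[-1]  # fallback for values outside every slice
        for category, (low, high) in intervals.items():
            if low <= number < high:
                match = category
                break
        result.append(match)
    return result


rng = random.Random(0)  # seeded only to make the sketch repeatable
data = ['red'] * 50 + ['green'] * 30 + ['blue'] * 20
intervals = fit_noisy_intervals(data, epsilon=1.0, rng=rng)
numbers = transform(data, intervals, rng)
restored = reverse_transform(numbers, intervals)
```

In this sketch, a smaller epsilon means a larger noise scale, so the slice widths drift further from the true category frequencies.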
(required) epsilon: A float > 0 that represents the privacy loss budget you are willing to accommodate.
order_by: Apply a prescribed ordering scheme. Use this if the discrete categorical values have an order.
(default) None
Do not apply a particular order.
'numerical_value'
If the data is represented by integers or floats, order by those values.
'alphabetical'
If the data is represented by strings, order them alphabetically.
normalized_distribution: Add this argument to control the shape of the transformed data. Choose whatever is easiest for your downstream use case.
(default) 'uniform'
Transform the data into a uniform distribution, between 0 and 1.
'norm'
Transform the data into a standard normal distribution, that is, a bell curve with a mean of 0 and a standard deviation of 1.
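To make the order_by and normalized_distribution options more concrete, here is a minimal sketch of what they could mean in practice. It is not the transformer's own code: the function names (order_categories, to_distribution, from_distribution) are hypothetical, and it assumes that None keeps the categories in the order given and that 'norm' is obtained by passing a uniform value through the inverse CDF of the standard normal distribution.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist(mu=0, sigma=1)


def order_categories(categories, order_by=None):
    """Return the categories in the order used to lay them out."""
    if order_by is None:
        return list(categories)  # assumption: keep the order given
    if order_by == 'numerical_value':
        return sorted(categories)  # expects integers or floats
    if order_by == 'alphabetical':
        return sorted(categories, key=str)  # plain string sort
    raise ValueError(f'Unknown order_by: {order_by!r}')


def to_distribution(u, normalized_distribution='uniform'):
    """Map a value u strictly between 0 and 1 onto the requested shape."""
    if normalized_distribution == 'uniform':
        return u
    if normalized_distribution == 'norm':
        # inv_cdf raises StatisticsError unless 0 < u < 1
        return STANDARD_NORMAL.inv_cdf(u)
    raise ValueError(f'Unknown distribution: {normalized_distribution!r}')


def from_distribution(x, normalized_distribution='uniform'):
    """Undo to_distribution, returning a value between 0 and 1."""
    if normalized_distribution == 'uniform':
        return x
    if normalized_distribution == 'norm':
        return STANDARD_NORMAL.cdf(x)
    raise ValueError(f'Unknown distribution: {normalized_distribution!r}')


sizes = order_categories([3, 1, 2], order_by='numerical_value')
fruits = order_categories(['pear', 'apple', 'fig'], order_by='alphabetical')
z = to_distribution(0.975, 'norm')  # roughly 1.96 on the bell curve
u = from_distribution(z, 'norm')    # back to roughly 0.975
```

Under these assumptions, ordering only changes where each category sits in the numerical range, while the distribution choice only changes the final scale of the numbers.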