Compatibility: boolean data

The BinaryEncoder transforms boolean data into numerical values: True becomes 1 and False becomes 0.

from rdt.transformers.boolean import BinaryEncoder
transformer = BinaryEncoder()
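To make the mapping concrete, here is a minimal pure-Python sketch of the idea. The helper names encode_booleans and decode_booleans are hypothetical and are not part of RDT, and the 0.5 threshold used when decoding is an assumption made for illustration, not a statement of how the library rounds.

```python
def encode_booleans(values):
    """Map True -> 1.0 and False -> 0.0, leaving missing values (None) untouched."""
    return [None if v is None else float(bool(v)) for v in values]


def decode_booleans(numbers):
    """Map numbers back to booleans; values of 0.5 or more count as True (assumed threshold)."""
    return [None if n is None else n >= 0.5 for n in numbers]


encoded = encode_booleans([True, False, True])
decoded = decode_booleans(encoded)
```

The round trip returns the original booleans, which is the property a reverse transform is expected to preserve.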


missing_value_replacement: Add this argument to replace missing values during the transform phase

(deprecated) model_missing_values: Use the missing_value_generation parameter instead.

missing_value_generation: Add this argument to determine how to recreate missing values during the reverse transform phase


from rdt.transformers.boolean import BinaryEncoder
transformer = BinaryEncoder(missing_value_replacement='mode')


Should I replace missing values?

The decision to replace missing values is based on how you plan to use your data. For example, you might be using RDT to clean your data for machine learning (ML). Check to see whether the ML techniques you plan to use allow missing values.

What methods are the best for replacing missing values?

The best method for replacing missing values depends on what they mean in your dataset. For example, if a missing value is equivalent to False, replace it with 0.
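The two common choices, a fixed value or the most frequent value, can be sketched in plain Python as follows. The function replace_missing and its replacement argument are hypothetical names used only for illustration; this is not RDT's implementation, and it assumes missing values are represented as None.

```python
from collections import Counter


def replace_missing(values, replacement="mode"):
    """Fill None entries with a fixed value, or with the most frequent observed value."""
    present = [v for v in values if v is not None]
    if replacement == "mode":
        if not present:
            raise ValueError("cannot compute a mode when every value is missing")
        # most_common(1) returns a list holding the single most frequent (value, count) pair
        fill = Counter(present).most_common(1)[0][0]
    else:
        fill = replacement
    return [fill if v is None else v for v in values]


filled_with_mode = replace_missing([1, 0, None, 1])
filled_with_zero = replace_missing([1, None, 0], replacement=0)
```

Filling with 0 treats every missing entry as False, so it only makes sense when that interpretation matches the data.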

When is it necessary to model missing values?

When setting the missing_value_generation parameter (which replaces the deprecated model_missing_values), consider whether the "missingness" of the data is important. For example, maybe the user opted out of supplying the info on purpose, or maybe a missing value is highly correlated with another column in your dataset. If "missingness" is something you want to account for, you should model missing values.
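The difference between modeling missingness and ignoring it can be sketched in plain Python. All function names below are hypothetical and not part of RDT, and missing values are assumed to be represented as None. The first approach records which rows were missing in a separate indicator column so they can be restored exactly; the second only reproduces missing values at a given rate, with no link to the original rows.

```python
import random


def split_missingness(values, fill):
    """Return the filled values plus a 0/1 column marking where values were missing."""
    filled = [fill if v is None else v for v in values]
    is_missing = [1 if v is None else 0 for v in values]
    return filled, is_missing


def restore_from_column(filled, is_missing):
    """Put None back exactly where the indicator column says a value was missing."""
    return [None if flag else v for v, flag in zip(filled, is_missing)]


def restore_at_random(filled, missing_rate, seed=None):
    """Blank out each value independently with probability missing_rate."""
    rng = random.Random(seed)
    return [None if rng.random() < missing_rate else v for v in filled]


filled, flags = split_missingness([1, None, 0], fill=0)
restored = restore_from_column(filled, flags)
```

Keeping the indicator column lets a downstream model learn how missingness relates to other columns, at the cost of one extra column per transformed field.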
