Preparation
Use a HyperTransformer to manage all the transformers you're applying to a multi-column dataset.
Create one by importing it from the rdt library. There are no parameters.
from rdt import HyperTransformer
ht = HyperTransformer()
The RDT library uses pandas, a popular open source library for data manipulation. The HyperTransformer expects your data to be a pandas DataFrame object.
There are several ways to load your data into the expected format. The most common case is a dataset stored as a CSV file:
import pandas as pd
customers = pd.read_csv('./datasets/customers.csv')
Refer to the pandas documentation for more information about reading CSV files or other file types.
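As a quick sanity check before using the HyperTransformer, you can confirm that the loaded data really is a pandas DataFrame. The sketch below is only an illustration: the column names and values are hypothetical, and the CSV is read from an in-memory string so the snippet runs without a file on disk. With a real file, you would pass its path to pd.read_csv as shown earlier.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for a file such as
# './datasets/customers.csv'; the columns are made up for illustration.
csv_text = """customer_id,age,has_membership
1,34,True
2,51,False
3,27,True
"""

# pd.read_csv accepts a file path or any file-like object.
customers = pd.read_csv(io.StringIO(csv_text))

# Confirm the result is a DataFrame and inspect the inferred column types.
assert isinstance(customers, pd.DataFrame)
print(customers.dtypes)
```

Inspecting the inferred dtypes at this stage is optional, but it can help spot columns that pandas read differently than you intended.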