Transformation
After you have set the customization options, you can finish processing the transformer and begin using it to transform between raw and numerical data.
In the fit stage, the HyperTransformer references the config you set in the previous step while learning from your data values.
Parameters
(required) data: A pandas DataFrame object that contains your data
Output (None)
Examples
If you ever change your config, you must re-run fit to see any changes to the transformations.
Use the transform method to transform all the columns in your dataset at once.
Parameters
Output A new pandas DataFrame with the transformed data. This DataFrame has fully numerical data that can be used for your data science projects.
In many cases you will want to fit and transform the same data. As a shortcut, you can use the fit_transform method to do both at once.
Parameters
Output A new pandas DataFrame with the transformed data. This DataFrame has fully numerical data that can be used for your data science projects.
Examples
Use the reverse_transform method to recover data in the same format as the original. This method works just like transform but in reverse.
Parameters
Output A pandas DataFrame with the reverse transformed data. This data has the same column names and format as the original data.
Examples
In some cases, you may only have access to a subset of columns from the original dataset. In this case, you can use the HyperTransformer to transform only a few columns, and not the full dataset.
Use the transform_subset method to transform a dataset that contains only a subset of the columns that were in the original data.
Parameters
Output A new pandas DataFrame with the transformed data. This DataFrame has fully numerical data that can be used for your data science projects.
Use the reverse_transform_subset method to reverse transform a dataset that contains only a subset of the columns that were in the original data.
Parameters
Output A pandas.DataFrame with the reverse transformed data. This data has the same column names and format as the original data.
Examples
Instead of transforming data, your use case might require fully anonymizing certain columns. You may also need to control the randomness during this process.
Use the create_anonymized_columns method to anonymize columns from scratch.
Parameters
(required) num_rows: An integer >0 that describes the number of rows you want to create
Examples
Transformers may require some randomness during any of the methods above. In some cases, you may want to control this to guarantee that you get the same data for different runs.
Use the reset_randomization method to reset the random seed that the transformers use. After using this method, any fitting, transformation or anonymization request you make will produce the same results as it did the first time.
Parameters None
Output None
Examples
In this example, calling reset_randomization means that reversed_data1 and reversed_data3 are equivalent. The same is true for any other call, such as transform, fit or create_anonymized_columns.
transform: (required) data: A pandas DataFrame object that contains your data.
fit_transform: (required) data: A pandas DataFrame object that contains your data.
reverse_transform: (required) data: A pandas DataFrame object containing transformed data.
transform_subset: (required) data: A pandas DataFrame object that contains your data. The data contains a subset of the columns that were in the original dataset.
reverse_transform_subset: (required) data: A pandas DataFrame object containing transformed data. The data contains a subset of the overall columns.
create_anonymized_columns: (required) column_names: A list of strings representing the column names that you want to create. Each column in this list must be assigned to one of the two transformers that support this method in order to work. If you want to use other transformers, you'll need to reverse transform the data instead.
create_anonymized_columns: Output A pandas DataFrame that contains anonymized data for each column name for the desired number of rows.