PseudoAnonymizedFaker
Compatibility: pii data
The PseudoAnonymizedFaker pseudo-anonymizes private or sensitive data. When transforming the column, it converts the original data to numerical values. When reversing the transform, it pseudo-anonymizes the column by mapping each value to a completely new, fake value using the Python Faker library. Note that the mapping is consistent (the same original value always maps to the same fake value), so the real, sensitive values can be recovered.
You can specify the exact Faker method to use for more realistic data.
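To make the idea of a consistent mapping concrete, here is a minimal sketch that uses only the Python standard library. It is not the transformer's actual implementation: the class name `ConsistentPseudonymizer` and its methods are hypothetical, and the random 4-letter values only loosely imitate the default random text described below.

```python
import random
import string


class ConsistentPseudonymizer:
    """Hypothetical helper: maps each distinct value to one stable fake value."""

    def __init__(self, seed=None):
        self._rng = random.Random(seed)
        self._mapping = {}  # real value -> fake value
        self._used = set()  # fake values already handed out

    def _new_fake(self):
        # Draw random 4-letter text until we find one that is not taken yet.
        while True:
            candidate = "".join(self._rng.choices(string.ascii_letters, k=4))
            if candidate not in self._used:
                self._used.add(candidate)
                return candidate

    def pseudonymize(self, values):
        """Replace each value with its fake counterpart, reusing earlier choices."""
        result = []
        for value in values:
            if value not in self._mapping:
                self._mapping[value] = self._new_fake()
            result.append(self._mapping[value])
        return result

    def recover(self, fake_values):
        """Invert the stored mapping to get the real values back."""
        reverse = {fake: real for real, fake in self._mapping.items()}
        return [reverse[fake] for fake in fake_values]


emails = ["ann@example.com", "bob@example.com", "ann@example.com"]
pseudonymizer = ConsistentPseudonymizer(seed=42)
fake_emails = pseudonymizer.pseudonymize(emails)
print(fake_emails)
print(pseudonymizer.recover(fake_emails))
```

Because the mapping is stored, repeated values stay linked to the same fake value, and anyone holding the mapping can reverse it. That is the difference between pseudo-anonymization and full anonymization.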
Parameters
provider_name: The name of the provider to use from the Faker library.
(default) | Use the BaseProvider from Faker, which is capable of creating random text. |
function_name: The name of the function to use within the Faker provider.
(default) | Use the lexify method to create random 4-character text. |
| Use the function from the specified provider to generate fake data. |
Together, the provider_name and function_name parameters specify exactly how to create fake data. Some common values are:
A full address: provider_name="address", function_name="address"
A basic bank account number: provider_name="bank", function_name="bban"
A full credit card number: provider_name="credit_card", function_name="credit_card_number"
Latitude/longitude coordinates: provider_name="geo", function_name="local_latlng"
A phone number: provider_name="phone_number", function_name="phone_number"
To browse for more options, visit the Faker library's docs.
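The pairs listed above can be tried out directly against Faker before configuring the transformer. The sketch below is only an illustration: it assumes the third-party Faker package is installed (it returns an empty result otherwise), and the helper `sample_common_values` is a hypothetical name, not part of any library. A default Faker instance exposes the functions of its standard providers as methods, so the sketch looks up each one by its function_name.

```python
try:
    from faker import Faker  # third-party package: pip install Faker
except ImportError:
    Faker = None  # lets the sketch run (with no output) when Faker is absent

# (provider_name, function_name) pairs taken from the list above.
COMMON_VALUES = [
    ("address", "address"),
    ("bank", "bban"),
    ("credit_card", "credit_card_number"),
    ("geo", "local_latlng"),
    ("phone_number", "phone_number"),
]


def sample_common_values(seed=0):
    """Return one fake value per pair, keyed by (provider_name, function_name)."""
    if Faker is None:
        return {}
    Faker.seed(seed)  # seed Faker's shared random generator
    fake = Faker()
    return {
        (provider_name, function_name): getattr(fake, function_name)()
        for provider_name, function_name in COMMON_VALUES
    }


samples = sample_common_values()
for (provider_name, function_name), value in samples.items():
    print(f"{provider_name}.{function_name} -> {value!r}")
```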
function_kwargs: Optional parameters to pass into the function that you're specifying to create fake data.
(default) | Do not specify any additional parameters. |
| Additional parameters to add. These are unique to the function name and should be represented as a dictionary. |
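As a rough illustration of how such a dictionary is applied, the sketch below unpacks function_kwargs into a Faker call. It uses Faker's lexify function with its `text` and `letters` arguments purely as an example. The `generate` helper is hypothetical, and the sketch assumes the Faker package is installed (it returns None otherwise).

```python
try:
    from faker import Faker  # third-party package: pip install Faker
except ImportError:
    Faker = None

function_name = "lexify"
# Keys must match the keyword arguments of the chosen Faker function.
function_kwargs = {"text": "ID-????", "letters": "ABCDEF"}


def generate(function_name, function_kwargs=None):
    """Call the named Faker function, passing the dictionary as keyword arguments."""
    if Faker is None:
        return None
    fake = Faker()
    return getattr(fake, function_name)(**(function_kwargs or {}))


value = generate(function_name, function_kwargs)
print(value)  # "ID-" followed by four letters drawn from "ABCDEF"
```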
locales: An optional list of locales to use when generating the fake data.
(default) | Use the default locale, which is usually set to the country you are in. |
Setting a locale might leak information about the original data. Anyone with access to the anonymized data will be able to tell which countries and locales are included in the original data.
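The sketch below gives a sense of why locale settings are visible in the output: addresses generated for different locales look recognizably different. The locale codes are arbitrary examples, the helper `sample_addresses` is hypothetical, and the sketch assumes the Faker package is installed (it returns an empty result otherwise).

```python
try:
    from faker import Faker  # third-party package: pip install Faker
except ImportError:
    Faker = None

LOCALES = ["en_US", "fr_FR", "ja_JP"]  # arbitrary example locales


def sample_addresses(locales, seed=0):
    """Return one fake address per locale so the formats can be compared."""
    if Faker is None:
        return {}
    Faker.seed(seed)
    return {locale: Faker(locale).address() for locale in locales}


addresses = sample_addresses(LOCALES)
for locale, address in addresses.items():
    # Addresses may span several lines; flatten them for printing.
    print(locale, "->", address.replace("\n", ", "))
```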
Examples
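A minimal configuration sketch using the parameters described above. The import path `rdt.transformers.pii` is an assumption about how the package is laid out and may differ between versions. The parameter values are illustrative, and the sketch leaves the transformer unset if the package is not installed.

```python
try:
    # Assumed import path; check it against the version you have installed.
    from rdt.transformers.pii import PseudoAnonymizedFaker
except ImportError:
    PseudoAnonymizedFaker = None

# Parameter names come from the Parameters section above; values are examples.
settings = {
    "provider_name": "address",
    "function_name": "address",
    "locales": ["en_US"],
}

transformer = PseudoAnonymizedFaker(**settings) if PseudoAnonymizedFaker else None
print(transformer)
```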
FAQs