PseudoAnonymizedFaker
Compatibility: pii data
The PseudoAnonymizedFaker pseudo-anonymizes private or sensitive data. When transforming the column, it converts the original data to numerical values. When reversing the transform, it pseudo-anonymizes the column by mapping each value to a completely new, fake value using the Python Faker library. Note that the mapping is consistent, so the real, sensitive values can be recovered.
You can specify the exact faker method to use for more realistic data.
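The idea of a consistent, recoverable mapping can be illustrated with a minimal standard-library sketch. This is not the transformer's implementation: the class `ConsistentPseudonymizer` and its methods are hypothetical names, and random 4-letter strings stand in for the values Faker would produce.

```python
import random
import string


class ConsistentPseudonymizer:
    """Toy stand-in (hypothetical): maps each real value to one unique fake value."""

    def __init__(self, seed=0, length=4):
        self._rng = random.Random(seed)
        self._length = length
        self._forward = {}   # real value -> fake value
        self._backward = {}  # fake value -> real value

    def _new_fake(self):
        # Keep drawing until the fake value is unused, so the mapping stays one-to-one.
        while True:
            fake = "".join(self._rng.choices(string.ascii_letters, k=self._length))
            if fake not in self._backward:
                return fake

    def pseudonymize(self, value):
        # Reuse the existing fake value if this real value was seen before.
        if value not in self._forward:
            fake = self._new_fake()
            self._forward[value] = fake
            self._backward[fake] = value
        return self._forward[value]

    def recover(self, fake):
        # Only someone holding the mapping can get the real value back.
        return self._backward[fake]


names = ["alice@example.com", "bob@example.com", "alice@example.com"]
pseudonymizer = ConsistentPseudonymizer(seed=42)
fakes = [pseudonymizer.pseudonymize(name) for name in names]
recovered = [pseudonymizer.recover(fake) for fake in fakes]
```

Because the same real value always maps to the same fake value, repeated entries stay linked in the anonymized column, and the stored mapping is what makes recovery possible.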
provider_name: The name of the provider to use from the Faker library.
(default) None
<string>
function_name: The name of the function to use within the Faker provider.
(default) 'lexify'
<string>
Together, the provider_name and function_name parameters specify exactly how to create fake data. Some common values are:
A full address: provider_name="address", function_name="address"
A basic bank account number: provider_name="bank", function_name="bban"
A full credit card number: provider_name="credit_card", function_name="credit_card_number"
Latitude/longitude coordinates: provider_name="geo", function_name="local_latlng"
A phone number: provider_name="phone_number", function_name="phone_number"
To browse for more options, visit the Faker library's docs.
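The common combinations listed above can be kept in a small lookup table so that each sensitive column is configured from one place. This is only a sketch: `COMMON_FAKER_SETTINGS` and `faker_settings` are hypothetical helper names, and the short keys are arbitrary labels. Only the `provider_name` and `function_name` values come from the list above.

```python
# Hypothetical lookup table: a short label -> the two parameter values listed above.
COMMON_FAKER_SETTINGS = {
    "address": {"provider_name": "address", "function_name": "address"},
    "bank account": {"provider_name": "bank", "function_name": "bban"},
    "credit card": {"provider_name": "credit_card", "function_name": "credit_card_number"},
    "coordinates": {"provider_name": "geo", "function_name": "local_latlng"},
    "phone number": {"provider_name": "phone_number", "function_name": "phone_number"},
}


def faker_settings(kind):
    """Return a copy of the parameter values for a known kind of sensitive data."""
    try:
        return dict(COMMON_FAKER_SETTINGS[kind])
    except KeyError:
        known = ", ".join(sorted(COMMON_FAKER_SETTINGS))
        raise ValueError(f"Unknown kind {kind!r}; expected one of: {known}") from None


settings = faker_settings("bank account")
```

The returned dictionary holds exactly the two parameter names described in this section, so it could be unpacked as keyword arguments wherever those parameters are accepted.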
function_kwargs: Optional parameters to pass into the function that you're specifying to create fake data.
(default) None
Do not specify any additional parameters
<dictionary>
locales: An optional list of locales to use when generating the fake data.
(default) None
Use the default locale, which is usually set to the country you are in.
<list>
Setting a locale might leak information about the original data. Anyone with access to the anonymized data will be able to tell which countries and locales are included in the original data.
The options for each parameter behave as follows:
provider_name
(default) None: Use the base provider from Faker, which is capable of creating random text.
<string>: Use the provider for a specific context, for example "address" or "bank".
function_name
(default) 'lexify': Use the lexify function to create random 4-character text.
<string>: Use the function from the specified provider to generate fake data. For example, address from the address provider or bban from the bank provider.
function_kwargs
<dictionary>: Additional parameters to add. These are unique to the function name and should be represented as a dictionary. For example, for the banking function, you can specify: {"length": 11, "primary": True}.
locales
<list>: Create data from the list of locales. These are specified as strings representing the language and country from Faker.
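How a function name and a dictionary of extra parameters might combine into a single call can be sketched with the standard library alone. The sketch assumes the named function is looked up on a provider object and the dictionary is forwarded as keyword arguments. `ToyBankProvider`, its `code` method, and `generate` are hypothetical stand-ins, not Faker or transformer APIs.

```python
import random
import string


class ToyBankProvider:
    """Hypothetical stand-in for a provider that exposes one function."""

    def __init__(self, seed=0):
        self._rng = random.Random(seed)

    def code(self, length=8, primary=False):
        # Build a random uppercase code; a "primary" code ends in XXX in this toy example.
        body = "".join(self._rng.choices(string.ascii_uppercase, k=length))
        if primary and length > 3:
            return body[:-3] + "XXX"
        return body


def generate(provider, function_name, function_kwargs=None):
    """Look up the function by name and forward the optional parameters to it."""
    function = getattr(provider, function_name)
    # None means "do not specify any additional parameters".
    return function(**(function_kwargs or {}))


provider = ToyBankProvider(seed=1)
default_code = generate(provider, "code")
custom_code = generate(provider, "code", {"length": 11, "primary": True})
```

Since the dictionary keys must match the chosen function's own parameter names, a dictionary written for one function will generally not work for another.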