PseudoAnonymizedFaker
Compatibility: pii data
The PseudoAnonymizedFaker pseudo-anonymizes private or sensitive data. When transforming the column, it converts the original data to numerical values. When reversing the transform, it pseudo-anonymizes the column by mapping each value to a completely new, fake value generated with the Python Faker library. Note that the mapping is consistent, so the real, sensitive values can be recovered.

from rdt.transformers.pii import PseudoAnonymizedFaker
transformer = PseudoAnonymizedFaker()
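To make the idea of a consistent, recoverable mapping concrete, here is a minimal standard-library sketch. It is not how RDT implements the transformer: the ConsistentPseudonymizer class, its method names, and the random 4-letter fake values are all hypothetical and exist only for illustration.

```python
import random
import string


class ConsistentPseudonymizer:
    """Toy illustration of a consistent mapping (not RDT's implementation)."""

    def __init__(self, seed=0):
        self._rng = random.Random(seed)
        self._forward = {}   # real value -> fake value
        self._backward = {}  # fake value -> real value

    def _new_fake(self):
        # Draw random 4-letter strings until one is unused, so fakes stay unique.
        while True:
            fake = "".join(self._rng.choices(string.ascii_letters, k=4))
            if fake not in self._backward:
                return fake

    def pseudonymize(self, value):
        # Reuse the existing fake value if this real value was seen before.
        if value not in self._forward:
            fake = self._new_fake()
            self._forward[value] = fake
            self._backward[fake] = value
        return self._forward[value]

    def recover(self, fake):
        # Anyone holding the mapping can get the real value back.
        return self._backward[fake]


mapper = ConsistentPseudonymizer()
real = ["alice@example.com", "bob@example.com", "alice@example.com"]
fake = [mapper.pseudonymize(value) for value in real]
recovered = [mapper.recover(value) for value in fake]
```

Repeated real values receive the same fake value, and the stored mapping is what allows the originals to be recovered. This is why the result is pseudo-anonymized rather than fully anonymized.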
You can specify the exact Faker method to use for more realistic data.
Parameters
provider_name: The name of the provider to use from the Faker library.
(default) None: Use the BaseProvider from Faker, which is capable of creating random text.
function_name: The name of the function to use within the Faker provider.
(default) 'lexify': Use the lexify method to create random 4-character text.
<string>: Use the function from the specified provider to generate fake data. For example, "street_address" from the address provider or "swift" from the bank provider.
function_kwargs: Optional parameters to pass into the function that you're specifying to create fake data.
(default) None: Do not specify any additional parameters.
<dictionary>: Additional parameters to add. These are unique to the function name and should be represented as a dictionary. For example, for the "swift" function from the bank provider, you can specify: {"length": 11, "primary": True}.
locales: An optional list of locales to use when generating the fake data.
Setting a locale might leak information about the original data. Anyone with access to the anonymized data will be able to tell which countries and locales are included in the original data.
Examples
from rdt.transformers.pii import PseudoAnonymizedFaker

# create more realistic-looking data by specifying a provider and function
transformer = PseudoAnonymizedFaker(
    provider_name="person",
    function_name="name")

FAQs