
AnonymizedGeoExtractor

The AnonymizedGeoExtractor performs Contextual Anonymization on phone number data. It transforms phone numbers by extracting their geographical context. When reversing the transform, it generates new, fake phone numbers that belong to the same geographical context.
from rdt_plus.transformers.phone_number import AnonymizedGeoExtractor
age = AnonymizedGeoExtractor()
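
To make the idea of extracting context and then regenerating numbers more concrete, here is a minimal sketch in plain Python. It is not the rdt_plus implementation. It assumes every number is a "+1" number with a 3-digit area code, treats the (country code, area code) pair as the geographical context, and ignores real numbering-plan rules. The helpers extract_geo and fake_number are hypothetical names used only for illustration.

```python
import random


def extract_geo(phone):
    """Return (country_code, area_code) for a number shaped like '+1AAANNNNNNN'."""
    digits = "".join(ch for ch in phone if ch.isdigit())
    if len(digits) != 11 or not digits.startswith("1"):
        raise ValueError(f"expected an 11-digit +1 number, got {phone!r}")
    return digits[0], digits[1:4]


def fake_number(geo, rng):
    """Build a new number that keeps the context but has a random local part."""
    country_code, area_code = geo
    local = "".join(str(rng.randrange(10)) for _ in range(7))
    return f"+{country_code}{area_code}{local}"


rng = random.Random(0)  # seeded only so the sketch is repeatable
originals = ["+16175550123", "+1 (415) 555-0199"]

# "Transform": keep only the geographical context.
contexts = [extract_geo(p) for p in originals]

# "Reverse transform": create fake numbers in the same context.
fakes = [fake_number(geo, rng) for geo in contexts]
print(contexts)
print(fakes)
```

The fake numbers share the country code and area code of the originals, while the remaining digits are unrelated to the original data.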

Parameters

default_country: The country code to assume when a phone number does not include an international country code.
(default) None
No default country. All phone numbers must have international country codes.
<string>
A string representing an ISO 3166-1 alpha-2 country code (for example, "US")
match_unique_numbers_per_region: Limit the number of new phone numbers created to the number originally found in the dataset.
(default) False
Create a variety of new phone numbers based on the geography
True
Put a limit on the number of new phone numbers created. Phone numbers will be recycled after the limit is reached.
Setting this to True will leak information about the number of phone numbers within each geographical region. However, these numbers will be newly created numbers that may not appear in the original data. Always evaluate the risk of a data leak before sharing your transformed data.
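
The recycling behavior described for match_unique_numbers_per_region=True can be pictured with the sketch below. It is an illustration under stated assumptions, not the library's algorithm: regions are represented as plain string prefixes, and build_pools and generate are hypothetical helper names. Each region gets a pool of fake numbers whose size equals the count of unique original numbers in that region, and requests beyond that size reuse the pool.

```python
import itertools
import random


def build_pools(originals_by_region, rng):
    """Create one pool of fake numbers per region, sized to the unique originals."""
    pools = {}
    for region, numbers in originals_by_region.items():
        limit = len(set(numbers))  # number of unique originals in this region
        pool = set()
        while len(pool) < limit:
            local = "".join(str(rng.randrange(10)) for _ in range(7))
            pool.add(region + local)
        pools[region] = sorted(pool)
    return pools


def generate(region, count, pools):
    """Return `count` numbers for a region, recycling the pool once it runs out."""
    return list(itertools.islice(itertools.cycle(pools[region]), count))


rng = random.Random(0)  # seeded only so the sketch is repeatable
originals_by_region = {
    "+1617": ["+16175550123", "+16175550177", "+16175550123"],  # 2 unique
    "+1415": ["+14155550199"],                                  # 1 unique
}
pools = build_pools(originals_by_region, rng)
boston = generate("+1617", 5, pools)
print(len(set(boston)))  # distinct numbers among the 5 generated
```

Because the pool size mirrors the original data, anyone who sees the output can infer how many unique numbers each region had, which is the leak the warning above refers to.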

Examples

from rdt_plus.transformers.phone_number import AnonymizedGeoExtractor
# the phone numbers are domestic US phone numbers
age = AnonymizedGeoExtractor(default_country="US")
# the phone numbers are international; place a limit
# on the new phone numbers created
age = AnonymizedGeoExtractor(match_unique_numbers_per_region=True)
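
As a rough picture of what a default country changes, the sketch below normalizes numbers before any geographical context is extracted. It is an assumption-laden illustration rather than the transformer's own logic: CALLING_CODES is a hypothetical two-entry lookup, to_international is a hypothetical helper, and domestic trunk prefixes are ignored.

```python
# Hypothetical lookup from Alpha-2 country code to calling code (illustration only).
CALLING_CODES = {"US": "1", "IN": "91"}


def to_international(phone, default_country=None):
    """Return the number as '+<digits>', adding the default country's code if needed."""
    digits = "".join(ch for ch in phone if ch.isdigit())
    if phone.strip().startswith("+"):
        # The number already carries an international country code.
        return "+" + digits
    if default_country is None:
        # Mirrors the documented default: without a default country,
        # every number must include its own country code.
        raise ValueError(f"{phone!r} has no country code and no default_country was given")
    return "+" + CALLING_CODES[default_country] + digits


print(to_international("(617) 555-0123", default_country="US"))  # +16175550123
print(to_international("+44 20 7946 0018"))                      # +442079460018
```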

Is this right for my use case?

Privacy. Extracting geographical information may leak some PII about the phone numbers, especially if you set match_unique_numbers_per_region to True. However, the privacy risk is lowered because the original phone numbers are not present in the transformed data.
Always evaluate the risk of a data leak before sharing your transformed or reverse transformed data.
Quality. Deleting the original phone numbers may reduce data quality, but extracting geographical information provides valuable insight to anyone using the transformed data. If you set match_unique_numbers_per_region to True, the data also preserves information about unique and repeating phone numbers.