＊SDV Enterprise Feature. This feature is available to our licensed users and is not currently in our public library. To learn more about SDV Enterprise and its extra features, get in touch with us.
AnonymizedGeoExtractor performs Contextual Anonymization on phone number data. It transforms phone numbers by extracting their geographical context. When reversing the transform, it generates new, fake phone numbers in the correct context.
from rdt.transformers.phone_number import AnonymizedGeoExtractor
transformer = AnonymizedGeoExtractor()
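To make the idea of extracting geographical context more concrete, here is a minimal pure-Python sketch of the forward and reverse steps. It does not use or reproduce the SDV Enterprise implementation: the two-entry area-code table and the helper names `extract_region` and `generate_fake_number` are hypothetical and exist only for illustration.

```python
import random

# Hypothetical lookup table (illustrative subset): US area code -> (country, region).
AREA_CODE_TO_REGION = {
    '510': ('US', 'Berkeley,CA'),
    '617': ('US', 'Cambridge,MA'),
}
REGION_TO_AREA_CODE = {region: code for code, region in AREA_CODE_TO_REGION.items()}


def extract_region(phone_number):
    """Forward step (sketch): keep only the geographical context of a number."""
    digits = ''.join(ch for ch in phone_number if ch.isdigit())
    # Drop a leading US country code ("1") if present.
    if len(digits) == 11 and digits.startswith('1'):
        digits = digits[1:]
    # Returns None when the area code is not in the table.
    return AREA_CODE_TO_REGION.get(digits[:3])


def generate_fake_number(region, rng):
    """Reverse step (sketch): create a new number that belongs to the same region."""
    area_code = REGION_TO_AREA_CODE[region]
    # 555-0100 through 555-0199 is reserved for fictional use in North America.
    line = rng.randint(0, 99)
    return f'+1{area_code}55501{line:02d}'


rng = random.Random(0)
region = extract_region('+1 (510) 642-1234')
fake = generate_fake_number(region, rng)
print(region)  # ('US', 'Berkeley,CA')
print(fake)    # a new number that still starts with +1510
```

The original number never appears in the output; only its region survives the forward step, and the reverse step draws a fresh number for that region.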
default_country: If a phone number does not include an international country code, provide the country code to use.
match_unique_numbers_per_region: Limit the number of new phone numbers created to the number originally found in the dataset.
Setting this to True will leak information about the number of phone numbers within each geographical region. However, these numbers will be newly created numbers that may not appear in the original data. Always evaluate the risk of a data leak before sharing your transformed data.
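One way to picture what matching unique numbers per region implies is the sketch below: count the distinct numbers seen in each region, then create exactly that many distinct fake numbers per region. This is an assumption about the general idea, not the SDV Enterprise implementation; `count_unique_per_region`, `build_capped_pool`, and `make_number` are hypothetical helpers, and the sample rows are made up.

```python
import random
from collections import defaultdict


def count_unique_per_region(rows):
    """Count distinct phone numbers per region from (region, number) pairs."""
    unique = defaultdict(set)
    for region, number in rows:
        unique[region].add(number)
    return {region: len(numbers) for region, numbers in unique.items()}


def make_number(region, rng):
    """Hypothetical generator: region name plus a random 4-digit suffix."""
    return f'{region[1]}-{rng.randint(0, 9999):04d}'


def build_capped_pool(region_to_unique_count, make_number, rng):
    """Create as many distinct fake numbers per region as were originally observed."""
    pools = {}
    for region, count in region_to_unique_count.items():
        pool = set()
        # Keep drawing until the pool holds exactly `count` distinct values.
        while len(pool) < count:
            pool.add(make_number(region, rng))
        pools[region] = sorted(pool)
    return pools


rows = [
    (('US', 'Berkeley,CA'), '510-555-0101'),
    (('US', 'Berkeley,CA'), '510-555-0101'),  # duplicate, counted once
    (('US', 'Berkeley,CA'), '510-555-0102'),
    (('US', 'Cambridge,MA'), '617-555-0103'),
]
counts = count_unique_per_region(rows)
pools = build_capped_pool(counts, make_number, random.Random(0))
print(counts)  # {('US', 'Berkeley,CA'): 2, ('US', 'Cambridge,MA'): 1}
```

The pool sizes mirror the original per-region counts, which is exactly the information the warning above says can leak.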
from rdt.transformers.phone_number import AnonymizedGeoExtractor
transformer = AnonymizedGeoExtractor(
After fitting the transformer, you can access the learned values through the attributes.
region_to_unique_count: The number of unique phone numbers that belong to every region found in the original data
('US', 'Berkeley,CA'): 15,
('US', 'Cambridge,MA'): 34,
Note: If you have not selected to match unique numbers per region, then the transformer will not store these values and you'll see