* AnonymizedGeoExtractor
*SDV Enterprise Feature. This feature is available to our licensed users and is not currently in our public library. To learn more about SDV Enterprise and its extra features, get in touch with us.
The AnonymizedGeoExtractor performs Contextual Anonymization on phone number data. It transforms phone numbers by extracting geographical context. When reversing the transform, it generates new, fake phone numbers in the correct context.
from rdt.transformers.phone_number import AnonymizedGeoExtractor
transformer = AnonymizedGeoExtractor()
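To make the idea of extracting geographical context and then generating a new number in that context concrete, here is a minimal, standalone sketch. It does not use the SDV Enterprise code and is not how the AnonymizedGeoExtractor is implemented. The lookup table and the helper names extract_region and generate_fake_number are hypothetical, and the sketch assumes US-style numbers with a +1 country code.

```python
import random

# Hypothetical lookup: area code -> region label. A real implementation would
# rely on complete phone number metadata, not a hand-written table.
AREA_CODE_TO_REGION = {
    "510": ("US", "Berkeley,CA"),
    "617": ("US", "Cambridge,MA"),
}
REGION_TO_AREA_CODE = {region: code for code, region in AREA_CODE_TO_REGION.items()}


def extract_region(phone_number):
    """Forward step: keep only the geographical context of a +1 number."""
    digits = "".join(ch for ch in phone_number if ch.isdigit())
    area_code = digits[1:4]  # skip the leading country code "1"
    return AREA_CODE_TO_REGION[area_code]


def generate_fake_number(region, rng):
    """Reverse step: build a new random number inside the same region."""
    area_code = REGION_TO_AREA_CODE[region]
    subscriber = rng.randrange(2_000_000, 10_000_000)  # always 7 digits
    return f"+1{area_code}{subscriber}"


rng = random.Random(0)
original = "+1 (510) 555-0123"
region = extract_region(original)        # geographical context only
fake = generate_fake_number(region, rng)  # new number, same region
```

The fake number shares only the region with the original, so extracting the region from it again yields the same context.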
default_country: If a phone number does not have an international country code, provide the country code to use.
(default) None | No default country. All phone numbers must have international country codes. |
<string> | Use this country code for phone numbers that lack an international country code, for example "US". |
match_unique_numbers_per_region: Limit the number of new phone numbers created to the number originally found in the dataset.
(default) False | Create a variety of new phone numbers based on the geography. |
True | Limit the number of new phone numbers created. Phone numbers are recycled after the limit is reached. |
Setting this to True will leak information about the number of phone numbers within each geographical region. However, these numbers will be newly created numbers that may not appear in the original data. Always evaluate the risk of a data leak before sharing your transformed data.
from rdt.transformers.phone_number import AnonymizedGeoExtractor
transformer = AnonymizedGeoExtractor(
default_country="US",
match_unique_numbers_per_region=True
)
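The recycling behavior described for match_unique_numbers_per_region=True can be pictured with a small standalone sketch. This is an illustration under stated assumptions, not the library's code: the helper build_pool is hypothetical, and the order in which the real transformer reuses numbers is not documented here, so simple cycling is assumed.

```python
import itertools
import random


def build_pool(area_code, unique_count, rng):
    """Create `unique_count` distinct fake numbers for one region (hypothetical helper)."""
    pool = set()
    while len(pool) < unique_count:
        pool.add(f"+1{area_code}{rng.randrange(2_000_000, 10_000_000)}")
    return sorted(pool)


rng = random.Random(0)

# Suppose the original data held 3 unique numbers for this region.
pool = build_pool("510", unique_count=3, rng=rng)

# Asking for 7 values exceeds the limit, so the pool is reused from the start.
# Cycling in order is an assumption made for this sketch.
generated = list(itertools.islice(itertools.cycle(pool), 7))
```

However many rows are requested, the output never contains more than 3 distinct numbers for the region, which is exactly the count that becomes visible to anyone inspecting the data.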
After fitting the transformer, you can access the learned values through the attributes.
region_to_unique_count: The number of unique phone numbers that belong to every region found in the original data.
>>> transformer.region_to_unique_count
{
('US', 'Berkeley,CA'): 15,
('US', 'Cambridge,MA'): 34,
...
}
Note: If you have not selected to match unique numbers per region, then the transformer will not store these values and you'll see None instead.
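Because the attribute is either a dictionary keyed by (country, region) or None, code that reads it should handle both cases. The sketch below is one possible way to do that. It uses only the two example entries shown in the output above in place of a fitted transformer, and the helper count_per_country is hypothetical.

```python
from collections import Counter

# Example values copied from the output shown above. With a fitted transformer
# you would read transformer.region_to_unique_count instead.
example_counts = {
    ("US", "Berkeley,CA"): 15,
    ("US", "Cambridge,MA"): 34,
}


def count_per_country(region_to_unique_count):
    """Sum unique phone numbers per country; return {} when nothing was stored."""
    if region_to_unique_count is None:
        return {}
    totals = Counter()
    for (country, _region), count in region_to_unique_count.items():
        totals[country] += count
    return dict(totals)


per_country = count_per_country(example_counts)
unset = count_per_country(None)  # the case where matching was not enabled
```

A summary like this can help when evaluating how much information the per-region counts reveal before sharing transformed data.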