
GeoExtractor (+Maps)

The GeoExtractor extracts geographical context from phone numbers. It keeps the original phone numbers so that the exact same numbers can be recovered during the reverse transform.
from rdt_plus.transformers.phone_number import GeoExtractor
ge = GeoExtractor()
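
To give a rough idea of what "geographical context" can mean for a phone number, the sketch below reads the area code from a 10-digit US number and looks it up in a small hand-written table. The `AREA_CODE_REGIONS` table and the `area_code_region` helper are hypothetical names made up for this illustration. They are not part of rdt_plus, and the actual output of the GeoExtractor may look different.

```python
# Hypothetical illustration only; this is not the rdt_plus implementation.
from typing import Optional

# A tiny hand-written lookup of US area codes to the areas they serve.
AREA_CODE_REGIONS = {
    "408": "San Jose area, CA",
    "310": "Los Angeles area, CA",
    "415": "San Francisco area, CA",
}


def area_code_region(number: str) -> Optional[str]:
    """Return the region for a 10-digit US number, or None if it is unknown."""
    # Keep digits only, so "(415) 820-0978" and "4158200978" behave the same.
    digits = "".join(ch for ch in number if ch.isdigit())
    if len(digits) != 10:
        return None
    return AREA_CODE_REGIONS.get(digits[:3])


print(area_code_region("4086581972"))
print(area_code_region("(415) 820-0978"))
```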

Parameters

default_country: The country code to use for any phone number that does not include an international country code.
(default) None
No default country. All phone numbers must have international country codes.
<string>
A string representing an Alpha-2 country code ("US")
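
The rule described by this parameter can be sketched in plain Python. This is a simplified illustration under stated assumptions, not how rdt_plus parses numbers: the `to_international` helper and the small `CALLING_CODES` table are hypothetical, and the sketch does not handle national trunk prefixes (such as a leading 0).

```python
# Hypothetical sketch of the default_country rule; this is not rdt_plus code.
from typing import Optional

# Small illustrative subset of Alpha-2 country codes and their calling codes.
CALLING_CODES = {"US": "1", "GB": "44", "FR": "33"}


def to_international(number: str, default_country: Optional[str] = None) -> str:
    """Return the number with an international prefix, e.g. '+14086581972'."""
    digits = "".join(ch for ch in number if ch.isdigit())
    if number.strip().startswith("+"):
        # The number already carries an international country code; keep it.
        return "+" + digits
    if default_country is None:
        # Mirrors the documented default: without a default country,
        # every number must include its own country code.
        raise ValueError(f"{number!r} has no country code and no default_country is set")
    return "+" + CALLING_CODES[default_country] + digits


print(to_international("+44 20 7946 0958"))
print(to_international("4086581972", default_country="US"))
```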

Examples

from rdt_plus.transformers.phone_number import GeoExtractor
# the phone numbers are domestic US phone numbers
ge = GeoExtractor(default_country="US")

Combining with a map

To increase privacy, we recommend mapping the original phone numbers to new numbers before using this transformer.
from rdt_plus.transformers.phone_number import NewNumberMap
# Create a consistent map between original phone numbers and new ones
map = NewNumberMap(default_country='US')
map.fit(data, column=['phone_number'])
mapped_data = map.transform(data)
Now, you can use the GeoExtractor on the mapped data, which won't leak the PII of the original phone numbers.
ge = GeoExtractor(default_country='US')
ge.fit(mapped_data, column=['phone_number.phone_number'])
Anyone who has access to the NewNumberMap object also has access to the original phone numbers.
map.get_mapping()
{ '4086581972': '4081123345',
'3106591150': '3105551234',
'4158200978': '4156789100' }
Separating the mapping from the extractor allows you to control who has access to the real phone number data.
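
The reason the map object needs to be guarded can be seen with a plain Python dictionary. Assuming `get_mapping()` returns a dict shaped like the example output above (original number as key, new number as value), inverting it recovers every original number. The variable names below are illustrative.

```python
# Example mapping copied from the output shown above (original -> new).
mapping = {
    "4086581972": "4081123345",
    "3106591150": "3105551234",
    "4158200978": "4156789100",
}

# Anyone holding the mapping can invert it (new -> original).
inverse = {new: original for original, new in mapping.items()}

# Mapped numbers, as they would appear in the shared data.
mapped_numbers = ["4081123345", "4156789100"]
recovered = [inverse[number] for number in mapped_numbers]
print(recovered)  # ['4086581972', '4158200978']
```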

Is this right for my use case?

Privacy. By itself, the GeoExtractor keeps (and leaks) all the original phone numbers. This should only be used if the phone number data is not PII. If you use the GeoExtractor with the NewNumberMap, then the chances of leaking PII are lower.
Always evaluate the risk of a data leak before sharing your transformed or reverse transformed data.
Quality. The GeoExtractor with a mapping produces the highest quality transforms and reverse transforms in the Phone Number Add-On:
  • All geographical information is preserved
  • Information about individual, repeating phone numbers is also preserved

FAQs

There are two mapping transformers available in the Phone Number Add-On. Anyone with access to the map object also has access to the original phone numbers.
The NewNumberMap creates new phone numbers and consistently maps them to the original ones.
from rdt_plus.transformers.phone_number import NewNumberMap
map = NewNumberMap(default_country='US')
map.fit(data, columns=['phone_number'])
mapped_data = map.transform(data)
# get the consistent mapping (original --> new number)
map.get_mapping('phone_number')
Example output: A mapping between the original numbers and the new ones. The new numbers were not in the original dataset, but they are from the same geographical region.
{ '4086581972': '4081123345',
'3106591150': '3105551234',
'4158200978': '4156789100' }
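
The idea of a consistent map to new numbers can be sketched without the library. The code below is a minimal illustration, assuming 10-digit US numbers and treating the first three digits (the area code) as the geographical region. The `build_new_number_map` helper is hypothetical and is not the algorithm used by NewNumberMap. It also makes no attempt to generate numbers that are valid under real numbering rules.

```python
# Hypothetical sketch of a "new number" map; this is not the rdt_plus algorithm.
import random


def build_new_number_map(numbers, seed=0):
    """Map each unique 10-digit number to an unseen number with the same area code."""
    rng = random.Random(seed)
    originals = set(numbers)
    used = set()
    mapping = {}
    for number in numbers:
        if number in mapping:
            continue  # a repeated number reuses the same replacement
        while True:
            # Keep the area code, randomize the remaining seven digits.
            candidate = number[:3] + f"{rng.randrange(10**7):07d}"
            if candidate not in originals and candidate not in used:
                break
        used.add(candidate)
        mapping[number] = candidate
    return mapping


data = ["4086581972", "3106591150", "4086581972", "4158200978"]
mapping = build_new_number_map(data)
mapped = [mapping[n] for n in data]
print(mapped)
```

Because the map is built once and then applied by lookup, a number that repeats in the data is replaced by the same new number every time.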
The ScrambledMap is another option, but it may expose more sensitive information. It consistently maps each original phone number to another phone number that already exists in the data.
from rdt_plus.transformers.phone_number import ScrambledMap
map = ScrambledMap(default_country='US')
map.fit(data, columns=['phone_number'])
mapped_data = map.transform(data)
# get the consistent mapping (original --> another original number)
map.get_mapping('phone_number')
Example output: Notice that the new numbers are pulled from the original data.
{ '4081234567': '4082223344',
'4082223344': '4081001000',
'4081001000': '4081234567' }
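
A scrambled map of this kind can be sketched as a shuffle of the unique numbers. The `build_scrambled_map` helper below is hypothetical and is not the algorithm used by ScrambledMap. In particular, rotating the shuffled list so that no number maps to itself is a choice made for this sketch, not a documented guarantee of the library.

```python
# Hypothetical sketch of a "scrambled" map; this is not the rdt_plus algorithm.
import random


def build_scrambled_map(numbers, seed=0):
    """Map each unique number to a different number from the same dataset."""
    unique = sorted(set(numbers))
    if len(unique) < 2:
        raise ValueError("need at least two distinct numbers to scramble")
    rng = random.Random(seed)
    rng.shuffle(unique)
    # Rotating the shuffled list by one position guarantees that
    # no number is mapped to itself.
    rotated = unique[1:] + unique[:1]
    return dict(zip(unique, rotated))


data = ["4081234567", "4082223344", "4081001000", "4081234567"]
scrambled = build_scrambled_map(data)
print(scrambled)
```

Every value in the resulting dictionary is a real number from the input, which is why this kind of map carries more privacy risk than one that generates new numbers.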