* RegionalAnonymizer

*SDV Enterprise Feature. This feature is available to our licensed users and is not currently in our public library. To learn more about SDV Enterprise and its extra features, visit our website.

The RegionalAnonymizer performs Contextual Anonymization on address data. It preserves the broad regions in your original data, while anonymizing the precise street address.

It transforms the data by dropping the precise location information. When reverse transforming, it generates new, fake street addresses for the same overall regions as your real data.
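The two directions described above can be illustrated with a small, self-contained sketch. This is not the RegionalAnonymizer implementation: the functions `transform` and `reverse_transform`, the sample rows, and the pool of street names are all hypothetical, and the real transformer draws on actual regional data rather than a hard-coded list. The sketch only shows the idea that the street address is dropped on the way in and a fake one is generated for the same region on the way out.

```python
import random

# Hypothetical column names, mirroring the example further down this page.
REGION_COLUMNS = ("city_name", "state")
STREET_COLUMN = "addr_1"


def transform(rows):
    """Drop the precise street address and keep only the broad region."""
    return [{col: row[col] for col in REGION_COLUMNS} for row in rows]


def reverse_transform(region_rows, seed=0):
    """Attach a newly generated, fake street address to each region."""
    rng = random.Random(seed)
    street_names = ["Main St.", "Oak Ave.", "Maple Dr."]  # made-up pool
    result = []
    for region in region_rows:
        fake_street = f"{rng.randint(1, 999)} {rng.choice(street_names)}"
        result.append({STREET_COLUMN: fake_street, **region})
    return result


# Made-up input rows for illustration only.
original = [
    {"addr_1": "48 Harbor Way", "city_name": "Boston", "state": "MA"},
    {"addr_1": "9 Sunset Blvd.", "city_name": "Los Angeles", "state": "CA"},
]

regions_only = transform(original)            # street addresses are gone
anonymized = reverse_transform(regions_only)  # fake streets, same regions
```

In this toy version, every row in `anonymized` keeps its original city and state, while `addr_1` is replaced with a generated value.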

from rdt.transformers.address import RegionalAnonymizer

transformer = RegionalAnonymizer(locales=['en_US'])

Parameters

locales: An optional list of locales to use when generating the precise street addresses. These will be chosen from the list of available countries with the provided languages.

Examples

This transformer takes multiple columns as input. Make sure that each column involved in your address has a supported sdtype, such as city, state or postcode. For more information, see the supported sdtypes list.

from rdt.transformers.address import RegionalAnonymizer

transformer = RegionalAnonymizer(
    locales=['en_US', 'fr_CA']
)

# `ht` is assumed to be an existing HyperTransformer whose config has already been set
# in the HyperTransformer, ensure that each column has a supported sdtype
ht.update_sdtypes(column_name_to_sdtype={
    'addr_1': 'street_address',
    'city_name': 'city',
    'state': 'state_abbr',
})

# in the HyperTransformer, assign the set of address columns to your transformer
ht.update_transformers(column_name_to_transformer={
    ('addr_1', 'city_name', 'state'): transformer
})

FAQs

Will the generated locations be real places?

The general region is guaranteed to be real. That is, the combination of the city, administrative unit, country, and postcode can be found on a map. For example: Boston, MA USA 02116.

However, anything more precise than that will be fake, including street address and secondary address. For example: 123 Main St., Suite #204 is completely made up and not necessarily located in Boston. This is by design to help protect the privacy of real homes and businesses.

Will this transformer create broad areas that are not in my original data?

This transformer is designed to generate only the broader areas that are in your original data. For example, if your original data only contains addresses from California and New York, then the anonymized addresses will also only contain these regions. They will not contain a location from any other region, such as Boston, MA.
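If you want to confirm this behavior on your own output, a check along the following lines may help. It is a minimal sketch in plain Python that does not use the RDT API: the helper `unexpected_regions`, the column names, and the sample rows are all hypothetical.

```python
def unexpected_regions(original_rows, anonymized_rows, region_columns):
    """Return regions found in the anonymized rows but not in the original rows."""
    def regions(rows):
        return {tuple(row[col] for col in region_columns) for row in rows}

    return regions(anonymized_rows) - regions(original_rows)


# Made-up rows for illustration only.
original = [
    {"addr_1": "9 Sunset Blvd.", "city_name": "Los Angeles", "state": "CA"},
    {"addr_1": "200 Park Row", "city_name": "New York", "state": "NY"},
]
anonymized = [
    {"addr_1": "512 Oak Ave.", "city_name": "Los Angeles", "state": "CA"},
    {"addr_1": "31 Main St.", "city_name": "New York", "state": "NY"},
]

# An empty set means no new broad regions were introduced.
new_regions = unexpected_regions(original, anonymized, ("city_name", "state"))
```

A non-empty result would list each (city, state) pair that appears only in the anonymized data.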

If you'd like to create addresses from new locations, use the RandomLocationGenerator.

Worldwide, regional data is provided by www.geonames.org.
