* RegionalAnonymizer

Last updated 6 months ago

The RegionalAnonymizer performs Contextual Anonymization on address data. It preserves the broad regions in your original data, while anonymizing the precise street address.

It transforms the data by dropping the precise location information. When reverse transforming, it generates new, fake street addresses for the same overall regions as your real data.

from rdt.transformers.address import RegionalAnonymizer

transformer = RegionalAnonymizer(locales=['en_US'])
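The drop-then-regenerate behavior described above can be illustrated with a small, self-contained sketch. This is not the RegionalAnonymizer implementation: the function names, the column names, and the tiny list of street names are hypothetical, chosen only to show the idea of keeping the broad region and inventing a new street address on the way back.

```python
import random

# Hypothetical toy data, NOT the library's source of street names.
STREET_NAMES = ["Main St.", "Oak Ave.", "Maple Dr.", "Cedar Ln."]

# Assumed column roles for this sketch only.
REGION_COLUMNS = ("city", "state")      # broad region: preserved
PRECISE_COLUMNS = ("street_address",)   # precise location: dropped


def toy_transform(rows):
    """Keep only the broad-region columns, dropping the precise ones."""
    return [{col: row[col] for col in REGION_COLUMNS} for row in rows]


def toy_reverse_transform(region_rows, seed=None):
    """Attach a newly generated, fake street address to each region."""
    rng = random.Random(seed)
    result = []
    for region in region_rows:
        fake_street = f"{rng.randint(1, 999)} {rng.choice(STREET_NAMES)}"
        result.append({"street_address": fake_street, **region})
    return result


original = [
    {"street_address": "1 Real Rd.", "city": "Boston", "state": "MA"},
    {"street_address": "2 Actual Ave.", "city": "Albany", "state": "NY"},
]

transformed = toy_transform(original)                    # regions only
anonymized = toy_reverse_transform(transformed, seed=0)  # fake streets, same regions
```

In this sketch the city and state of every row survive the round trip, while the street address is replaced by a made-up one that has no connection to the original.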

Parameters

locales: An optional list of locales to use when generating the precise street addresses. The addresses will be generated for the countries and languages that the provided locales specify.

(default) ["en_US"]: Create precise, street-level data from the US in English.

<list>: A custom list of locales to use instead of the default.

Examples

from rdt.transformers.address import RegionalAnonymizer

transformer = RegionalAnonymizer(
    locales=['en_US', 'fr_CA']
)

# `ht` is an existing HyperTransformer instance with its config already in place
# in the HyperTransformer, ensure that each address column has a supported sdtype
ht.update_sdtypes(column_name_to_sdtype={
    'addr_1': 'street_address',
    'city_name': 'city',
    'state': 'state_abbr',
})

# in the HyperTransformer, assign the set of address columns to your transformer
ht.update_transformers(column_name_to_transformer={
    ('addr_1', 'city_name', 'state'): transformer
})

FAQs

Will the generated locations be real places?

The general region is guaranteed to be real. That is, the combination of the city, administrative unit, country, and post code can be found on a map. For example: Boston, MA USA 02116.

However, anything more precise than that will be fake, including street address and secondary address. For example: 123 Main St., Suite #204 is completely made up and not necessarily located in Boston. This is by design to help protect the privacy of real homes and businesses.

Will this transformer create broad areas that are not in my original data?

This transformer is designed to only generate the broader areas that are in your original data. For example, if your original data only contained addresses from California and New York, then the anonymized addresses will also only contain these regions. They will not contain a city from any other region, such as Boston.
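If you want to confirm this property on your own output, one possible check is to compare the set of regions before and after anonymization. The sketch below is plain Python on lists of dictionaries; the helper names, the column names, and the sample rows are hypothetical and are not part of the RDT API.

```python
def regions(rows, region_columns=("city", "state")):
    """Collect the distinct broad regions present in a list of row dicts."""
    return {tuple(row[col] for col in region_columns) for row in rows}


def new_regions(original_rows, anonymized_rows, region_columns=("city", "state")):
    """Return regions that appear in the anonymized rows but not in the original."""
    return regions(anonymized_rows, region_columns) - regions(original_rows, region_columns)


# Made-up sample rows for illustration only.
original_rows = [
    {"street_address": "1 Real Rd.", "city": "Los Angeles", "state": "CA"},
    {"street_address": "2 Actual Ave.", "city": "Albany", "state": "NY"},
]
anonymized_rows = [
    {"street_address": "123 Main St.", "city": "Los Angeles", "state": "CA"},
    {"street_address": "456 Oak Ave.", "city": "Albany", "state": "NY"},
]

# An empty set means no new broad areas were introduced.
unexpected = new_regions(original_rows, anonymized_rows)
```

A non-empty result would list exactly the regions that were not in the original data, such as a row for Boston, MA in this example.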

When locales is a list, the transformer creates precise, street-level data from the specified countries in the specified languages. For example, ["en_US", "fr_CA"] creates a mix of locations from the US in English and from Canada in French.

This transformer takes multiple columns as input. Make sure that each column involved in your address is a supported sdtype such as city, state and postcode. For more information, see the supported sdtypes list.

If you'd like to create addresses from new locations, use the RandomLocationGenerator.

Worldwide, regional data is provided by www.geonames.org.

*SDV Enterprise Feature. This feature is available to our licensed users and is not currently in our public library. For more information, visit our page to Explore SDV.