* DomainExtractor


Last updated 6 months ago

The DomainExtractor extracts domains from emails so that they can be used later for data science. It keeps the original emails so that the same exact emails can be recovered during the reverse transform.

from rdt.transformers.email import DomainExtractor

transformer = DomainExtractor()
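
To make the reversible behavior concrete, here is a minimal plain-Python sketch. It does not use RDT and is not the library's implementation. It assumes that the transformed data carries the original email column alongside a derived domain column; the helper names and the '.domain' column suffix are hypothetical.

```python
def transform(table, column):
    """Add a derived domain column while keeping the original emails (illustrative only)."""
    out = dict(table)
    # Hypothetical output column name; the real transformer may name it differently.
    out[f"{column}.domain"] = [email.rpartition("@")[2] for email in table[column]]
    return out


def reverse_transform(table, column):
    """Drop the derived column; the untouched emails are recovered exactly."""
    return {name: values for name, values in table.items() if name != f"{column}.domain"}


data = {"email": ["info@datacebo.com", "user@example.org"]}
transformed = transform(data, "email")
recovered = reverse_transform(transformed, "email")
```

Because the emails are never modified, the round trip in this sketch returns the input unchanged.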

Parameters

extracted_domain: Which parts of the overall email domain to extract during the transformation phase

(default) 'full'

Extract the full domain, which is everything after the @ sign. For example if the email is 'info@datacebo.com', the full domain is 'datacebo.com'.

'top'

Extract only the top domain, which is everything after the . character. For example if the email is 'info@datacebo.com', the top domain is 'com'.
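
The two options can be illustrated with a short standalone sketch. The function below is a hypothetical helper, not part of RDT, and it assumes that "everything after the . character" means the text after the last '.' in the domain.

```python
def extract_domain(email, extracted_domain="full"):
    """Hypothetical helper mirroring the 'full' and 'top' options (not the RDT API)."""
    # The full domain is everything after the @ sign.
    full = email.rpartition("@")[2]
    if extracted_domain == "full":
        return full
    if extracted_domain == "top":
        # Assumption: the top domain is the text after the last '.' character.
        return full.rpartition(".")[2]
    raise ValueError("extracted_domain must be 'full' or 'top'")


full_domain = extract_domain("info@datacebo.com")         # 'datacebo.com'
top_domain = extract_domain("info@datacebo.com", "top")   # 'com'
```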

Examples

from rdt.transformers.email import DomainExtractor

transformer = DomainExtractor(extracted_domain='top')

*SDV Enterprise Feature. This feature is available to our licensed users and is not currently in our public library. For more information, visit our page to Explore SDV.