DomainBasedMapper

SDV Enterprise Feature. This feature is available to our licensed users and is not currently in our public library. To learn more about SDV Enterprise and its extra features, visit our website.

The DomainBasedMapper creates and applies a consistent mapping between real emails and fake, synthetic emails. The mapping preserves the domain of the email. That is, emails belonging to a domain are mapped to fake emails in the same domain.

from rdt.transformers.email import DomainBasedMapper

mapper = DomainBasedMapper()
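
To make the idea of a consistent, domain-preserving mapping concrete, here is a minimal pure-Python sketch. It is not the SDV Enterprise implementation; the helper name `build_domain_preserving_mapping` and the random username scheme are assumptions made for illustration only.

```python
import random
import string


def build_domain_preserving_mapping(emails, seed=0):
    """Hypothetical helper: map each distinct email to a fake one in the same domain."""
    rng = random.Random(seed)
    alphabet = string.ascii_lowercase + string.digits
    mapping = {}
    used = set()
    for email in emails:
        if email in mapping:
            # Already seen: reusing the entry is what keeps the mapping consistent.
            continue
        domain = email.rsplit('@', 1)[1]
        while True:
            username = ''.join(rng.choices(alphabet, k=10))
            fake = f'{username}@{domain}'
            # Avoid reusing a fake email or reproducing the real one.
            if fake not in used and fake != email:
                break
        used.add(fake)
        mapping[email] = fake
    return mapping


emails = ['info@datacebo.com', 'admissions@mit.edu', 'info@datacebo.com']
mapping = build_domain_preserving_mapping(emails)
fake_emails = [mapping[email] for email in emails]
```

In this sketch the repeated email 'info@datacebo.com' receives the same fake value both times, and every fake email keeps the domain of the real email it replaces.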

Parameters

preserved_domain: Which parts of the overall email domain to preserve during the transformation phase

(default) 'full'

Preserve the full domain, which is everything after the @ sign. For example, if the email is 'info@datacebo.com', it will be mapped to a fake email that also ends with 'datacebo.com'.

'top'

Preserve only the top domain, which is everything after the . character. For example, if the email is 'info@datacebo.com', it will be mapped to a fake email that also ends with '.com'.
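
The difference between the two options can be sketched as a small function that extracts the part of the email to keep. The function `extract_preserved_domain` is hypothetical and not part of the library. Splitting on the last `.` for `'top'` is an assumption, since the text above only covers a domain with a single dot.

```python
def extract_preserved_domain(email, preserved_domain='full'):
    """Hypothetical helper: return the part of the email domain that would be kept."""
    # Everything after the @ sign is the full domain.
    domain = email.rsplit('@', 1)[1]
    if preserved_domain == 'full':
        return domain
    if preserved_domain == 'top':
        # Assumption: "top" means the text after the last '.' in the domain.
        return domain.rsplit('.', 1)[-1]
    raise ValueError("preserved_domain must be 'full' or 'top'")


full_part = extract_preserved_domain('info@datacebo.com', 'full')
top_part = extract_preserved_domain('info@datacebo.com', 'top')
```

Here `full_part` is `'datacebo.com'` and `top_part` is `'com'`, so a fake email under `'top'` only needs to end with `'.com'`.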

obfuscate_emails: Control whether the synthetic email looks realistic or follows random patterns.

(default) False

Create realistic-looking usernames and emails such as 'johndoe@gmail.com'.

True

Obfuscate the usernames and emails to create random values such as 'dkep22ocp2@sdv-example.com'.

Setting this to False may produce emails that match real users' emails purely by coincidence. If you are worried about creating emails that accidentally correspond to real users, please set this to True.
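
One partial way to check for such coincidences is to look for fake emails that also appear among the real emails in your own data. The sketch below uses a hand-written example mapping and a hypothetical helper, `find_collisions`. It cannot detect matches with real addresses outside your dataset.

```python
def find_collisions(mapping):
    """Hypothetical helper: list fake emails that are also real emails in the mapping."""
    real_emails = set(mapping)
    return sorted(fake for fake in mapping.values() if fake in real_emails)


# Illustrative mapping only; these values were not produced by the transformer.
example_mapping = {
    'johndoe@gmail.com': 'janesmith@gmail.com',
    'janesmith@gmail.com': 'alexlee@gmail.com',
}
collisions = find_collisions(example_mapping)
```

In this example `collisions` contains `'janesmith@gmail.com'`, because that realistic-looking fake email is also a real email in the data.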

Examples

from rdt.transformers.email import DomainBasedMapper

mapper = DomainBasedMapper(
    preserved_domain='full',
    obfuscate_emails=True
)

Attributes

After fitting the transformer, you can access the learned values through the attributes.

mapping: A dictionary that maps the original, real emails to the new, fake emails

>>> mapper.mapping
{
    'info@datacebo.com': 'dkep22ocp2@datacebo.com',
    'admissions@mit.edu': '9p3ocoo1pf@mit.edu',
    ...
}
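
Because the mapping is a plain dictionary, it can be inspected with ordinary Python. The sketch below copies the two entries shown above into a literal dictionary and checks that each fake email keeps the full domain of its real email. The helper `domains_preserved` is hypothetical, and the check assumes `preserved_domain='full'`.

```python
def domains_preserved(mapping):
    """Hypothetical check: every fake email shares the full domain of its real email."""
    return all(
        real.rsplit('@', 1)[1] == fake.rsplit('@', 1)[1]
        for real, fake in mapping.items()
    )


# The two entries shown in the example output above.
learned_mapping = {
    'info@datacebo.com': 'dkep22ocp2@datacebo.com',
    'admissions@mit.edu': '9p3ocoo1pf@mit.edu',
}
all_preserved = domains_preserved(learned_mapping)

# Applying the mapping to a list of real emails is a dictionary lookup.
fake_column = [learned_mapping[email] for email in ['admissions@mit.edu', 'info@datacebo.com']]
```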
