# Contextual Anonymization

**Contextual Anonymization** is a novel technique invented at DataCebo to anonymize data types that have particular meanings in your business.

Contextual Anonymization anonymizes Personally Identifiable Information (PII) while **preserving the format and the broader context of the original data**.

<table><thead><tr><th width="150"></th><th width="189.79365045840106" align="center">Contextual Anonymization</th><th align="center">Existing Technique: Faking or Mapping</th><th align="center">Existing Technique: Generalization or Masking</th></tr></thead><tbody><tr><td>Preserves format?</td><td align="center"><span data-gb-custom-inline data-tag="emoji" data-code="2705">✅</span></td><td align="center"><span data-gb-custom-inline data-tag="emoji" data-code="2705">✅</span></td><td align="center"><span data-gb-custom-inline data-tag="emoji" data-code="274c">❌</span></td></tr><tr><td>Preserves context?</td><td align="center"><span data-gb-custom-inline data-tag="emoji" data-code="2705">✅</span></td><td align="center"><span data-gb-custom-inline data-tag="emoji" data-code="274c">❌</span></td><td align="center"><span data-gb-custom-inline data-tag="emoji" data-code="2705">✅</span></td></tr></tbody></table>

## Existing Anonymization Techniques

Existing techniques preserve either the original format or the broader context of the data, but not both.

### Fake data doesn't capture the context

A common anonymization technique is to completely fake new PII values that match the format of the original. We can see how this works for phone number PII data.

![](https://2225246359-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FVGX92M819eIp0rMg5elc%2Fuploads%2Fje5fThGu97AoQZJWRXkT%2Frdt_resources_contextual-anonymization-1_June%2002%202025.png?alt=media\&token=1310546c-c749-4e37-a4e5-730510dd557e)

However, you might notice that the context of the phone numbers is not preserved. Phone numbers have geographical context: the country and region codes indicate where the caller resides. This geographical context is lost in the fake numbers, so you cannot use the anonymized data if the context matters.
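To make the loss of context concrete, here is a minimal sketch of format-only faking in plain Python. It is an illustration, not the RDT implementation: the function name `fake_phone_number`, the sample number, and its `+1 (617) 555-0142` layout are all assumptions made for this example.

```python
import random


def fake_phone_number(original: str, rng: random.Random) -> str:
    """Replace every digit with a random digit and keep all punctuation.

    Hypothetical helper for illustration only. The layout of the number
    survives, but the country and area codes are randomized along with
    everything else.
    """
    return "".join(
        str(rng.randrange(10)) if ch.isdigit() else ch
        for ch in original
    )


rng = random.Random(0)  # fixed seed so the sketch is repeatable
original = "+1 (617) 555-0142"  # made-up sample value
fake = fake_phone_number(original, rng)
print(fake)  # same layout, but the leading digits no longer carry any meaning
```

Because every digit is redrawn, nothing ties the fake number to the original country or region.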

### Generalized data doesn't capture the format

The generalization technique explicitly extracts the context. In our example with phone numbers, one approach would be to anonymize the phone numbers by extracting the geographical areas. Alternatively, we could mask the non-contextual digits with the letter `'X'`.

![](https://2225246359-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FVGX92M819eIp0rMg5elc%2Fuploads%2FPWrfVC4OiOIYC0Zq11uk%2Frdt_resources_contextual-anonymization-2_June%2002%202025.png?alt=media\&token=b086c22f-81ff-47ec-a5ce-e93aa442b94b)

Careful generalization can preserve the context but not the original format. The results are no longer phone numbers that you could actually call. You cannot use the anonymized data if the format matters, for example if you want to put it through a QA testing suite that expects valid phone numbers.
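The masking variant can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the RDT implementation: the helper `mask_phone_number` is hypothetical, and it assumes the first four digits (a one-digit country code plus a three-digit area code) hold the geographical context.

```python
def mask_phone_number(original: str, keep_digits: int = 4) -> str:
    """Keep the first `keep_digits` digits and mask the rest with 'X'.

    Hypothetical helper for illustration only. Assumes the leading digits
    are the country and area codes.
    """
    seen = 0
    out = []
    for ch in original:
        if ch.isdigit():
            seen += 1
            out.append(ch if seen <= keep_digits else "X")
        else:
            out.append(ch)  # punctuation and spaces pass through unchanged
    return "".join(out)


masked = mask_phone_number("+1 (617) 555-0142")  # made-up sample value
print(masked)  # +1 (617) XXX-XXXX
```

The region is still readable, but a validator that expects digits would reject the masked value.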

## Introducing: Contextual Anonymization

**Contextual Anonymization** is a novel anonymization technique that produces contextually fake data. This preserves both the format of the original data and its context.

In our phone number data, this means that the anonymized data has the same geographical context *and* the same format as your original phone numbers.

![](https://2225246359-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FVGX92M819eIp0rMg5elc%2Fuploads%2FpwuKeWNo5PvJBqwddwUP%2Frdt_resources_contextual-anonymization-3_June%2003%202025.png?alt=media\&token=fa84c23d-f31b-4de9-a4d4-ff2cc7675442)

This allows you to use the anonymized data wherever you would use the real data. You don't have to worry about missing geographical context or incorrect formatting in the anonymized dataset.
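As a rough illustration of the idea, the sketch below keeps the contextual digits and replaces only the remaining ones with random digits. It is a toy under stated assumptions, not how the RDT transformers work internally: `contextually_fake` is a hypothetical helper, the context is assumed to be the first four digits, and no numbering-plan validity rules are applied.

```python
import random


def contextually_fake(original: str, rng: random.Random,
                      context_digits: int = 4) -> str:
    """Keep the leading context digits and randomize the remaining digits.

    Hypothetical helper for illustration only. A production tool would
    also need to respect the numbering rules of each country.
    """
    seen = 0
    out = []
    for ch in original:
        if ch.isdigit():
            seen += 1
            keep = seen <= context_digits
            out.append(ch if keep else str(rng.randrange(10)))
        else:
            out.append(ch)  # punctuation and spaces pass through unchanged
    return "".join(out)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
original = "+1 (617) 555-0142"  # made-up sample value
anonymized = contextually_fake(original, rng)
print(anonymized)  # still starts with "+1 (617) " and has the same layout
```

The output keeps both the layout and the country and area codes, which are the two properties the table at the top of this page compares.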

### Layering other techniques: Mapping

Contextual Anonymization is not an isolated technique. You can layer it with others.

For example, the RDT phone number Add-On provides mapping functionality. Using it, you can contextually anonymize phone numbers in a consistent way: a repeated phone number is always mapped to the same contextually fake number.
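The layering can be pictured as a lookup table placed in front of a contextual faker. The sketch below is a toy in plain Python, not the Add-On's API: the class `ConsistentPhoneMapper` and its methods are hypothetical, and the context is assumed to be the first four digits.

```python
import random


class ConsistentPhoneMapper:
    """Toy mapper: a repeated original always gets the same fake number.

    Hypothetical class for illustration only.
    """

    def __init__(self, seed: int = 0, context_digits: int = 4):
        self._rng = random.Random(seed)
        self._context_digits = context_digits
        self._mapping = {}  # original number -> contextually fake number

    def _contextually_fake(self, original: str) -> str:
        seen = 0
        out = []
        for ch in original:
            if ch.isdigit():
                seen += 1
                keep = seen <= self._context_digits
                out.append(ch if keep else str(self._rng.randrange(10)))
            else:
                out.append(ch)
        return "".join(out)

    def anonymize(self, original: str) -> str:
        # Generate a fake value only the first time an original is seen.
        if original not in self._mapping:
            self._mapping[original] = self._contextually_fake(original)
        return self._mapping[original]


mapper = ConsistentPhoneMapper(seed=0)
# Made-up sample column with one repeated value.
column = ["+1 (617) 555-0142", "+1 (415) 555-0199", "+1 (617) 555-0142"]
anonymized = [mapper.anonymize(value) for value in column]
print(anonymized)  # the first and third entries are identical
```

This toy does not guarantee that two different originals receive different fake numbers. A real mapper would likely need to handle such collisions.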

{% hint style="info" %}
Read more about contextual anonymization in our blog post: <https://datacebo.com/blog/anonymization-techniques/>
{% endhint %}

## Try out Contextual Anonymization!

The RDT library offers transformers for contextually anonymizing different types of data.

* **＊ Physical/mailing addresses**: Use the [RandomLocationGenerator](https://docs.sdv.dev/rdt/transformers-glossary/deep-data-understanding/address/randomlocationgenerator) and [RegionalAnonymizer](https://docs.sdv.dev/rdt/transformers-glossary/deep-data-understanding/address/regionalanonymizer) to create anonymous addresses, while making sure the broader regions make sense.
* **＊ Email addresses**: Use the [DomainBasedAnonymizer](https://docs.sdv.dev/rdt/transformers-glossary/deep-data-understanding/email/domainbasedanonymizer) and [DomainBasedMapper](https://docs.sdv.dev/rdt/transformers-glossary/deep-data-understanding/email/domainbasedmapper) to create anonymous email addresses, while making sure the email domains (like `gmail.com`) make sense.
* **＊ GPS coordinates** (latitude/longitude): Use the [RandomLocationGenerator](https://docs.sdv.dev/rdt/transformers-glossary/deep-data-understanding/gps-coordinates/randomlocationgenerator), [GPSNoiser](https://docs.sdv.dev/rdt/transformers-glossary/deep-data-understanding/gps-coordinates/gpsnoiser), and [MetroAreaAnonymizer](https://docs.sdv.dev/rdt/transformers-glossary/deep-data-understanding/gps-coordinates/metroareaanonymizer) to create new, anonymous latitude/longitude pairs within realistic regions.
* **＊ Phone numbers**: Use the [AnonymizedGeoExtractor](https://docs.sdv.dev/rdt/transformers-glossary/deep-data-understanding/phone-number/anonymizedgeoextractor) and [NewNumberMapper](https://docs.sdv.dev/rdt/transformers-glossary/deep-data-understanding/phone-number/newnumbermapper) to create anonymous phone numbers in the same overall countries/regions.

{% hint style="info" %}
**＊SDV Enterprise Feature.** This feature is available to our licensed users and is not currently in our public library. For more information, visit our page to [Explore SDV](https://docs.sdv.dev/sdv/explore/sdv-enterprise/compare-features).
{% endhint %}


