
IndexGenerator


Last updated 1 month ago

In previous versions of RDT, this transformer was called IDGenerator.

Compatibility: id data

The IndexGenerator creates an indexed ID column, such as one you may be using as a primary key. When transforming the data, it removes the column. When reversing the transform, it creates an indexed ID column that starts at a specified counter value and includes any prefix or suffix you provide.

from rdt.transformers.text import IndexGenerator

transformer = IndexGenerator(prefix='ID_', starting_value=0, suffix='-synthetic')
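
To make the round trip described above concrete, here is a minimal plain-Python sketch. It does not call RDT. The helpers drop_id_column and rebuild_id_column are hypothetical names invented for this illustration, and representing each generated ID as a string is an assumption.

```python
# Plain-Python illustration of the round trip; this is NOT RDT's implementation.
# drop_id_column and rebuild_id_column are hypothetical helper names.

def drop_id_column(rows, id_column):
    """Mimic the transform step: return copies of the rows without the ID column."""
    return [{key: value for key, value in row.items() if key != id_column}
            for row in rows]


def rebuild_id_column(rows, id_column, starting_value=0):
    """Mimic the reverse step: add a fresh indexed ID column to copies of the rows."""
    # Assumption: IDs are stored as strings so a prefix or suffix could be attached.
    return [{id_column: str(starting_value + offset), **row}
            for offset, row in enumerate(rows)]


rows = [
    {'user_id': 'A-17', 'age': 34},
    {'user_id': 'B-02', 'age': 51},
]

transformed = drop_id_column(rows, 'user_id')
restored = rebuild_id_column(transformed, 'user_id')

print(transformed)  # [{'age': 34}, {'age': 51}]
print(restored)     # [{'user_id': '0', 'age': 34}, {'user_id': '1', 'age': 51}]
```

Note that the original ID values are not recovered. The reverse step only produces a new index.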

Parameters

prefix: A string with the prefix to use for the counter. All generated IDs will have the prefix.

  • (default) None: Do not add a prefix.
  • <string>: Add the prefix to every ID.

starting_value: The starting value for the counter.

  • (default) 0: Start the counter at 0.
  • <integer>: Use the integer as the starting value. This must be >=0.

suffix: A string with the suffix to use for the counter. All generated IDs will have the suffix.

  • (default) None: Do not add a suffix.
  • <string>: Add the suffix to every ID.
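
The following sketch shows one way the three parameters could combine into an ID. It is plain Python rather than RDT code. The function name make_ids is hypothetical, and the prefix + counter + suffix layout is an assumption based on the parameter descriptions above.

```python
# Hypothetical helper illustrating the parameters; not part of RDT.

def make_ids(count, prefix=None, starting_value=0, suffix=None):
    """Build `count` IDs of the assumed form <prefix><counter><suffix>."""
    if starting_value < 0:
        raise ValueError('starting_value must be >= 0')
    prefix = '' if prefix is None else prefix  # None means "no prefix"
    suffix = '' if suffix is None else suffix  # None means "no suffix"
    return [f'{prefix}{starting_value + offset}{suffix}' for offset in range(count)]


ids = make_ids(3, prefix='ID_', starting_value=0, suffix='-synthetic')
print(ids)  # ['ID_0-synthetic', 'ID_1-synthetic', 'ID_2-synthetic']

plain_ids = make_ids(2, starting_value=10)
print(plain_ids)  # ['10', '11']
```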

FAQs

Will the generated IDs always be unique?

Yes. The generated IDs start at the starting_value parameter and increment by 1 for each new ID. Since there is no maximum value, the counter never wraps around, so every generated ID is unique. (This is useful for primary keys.)

When should I use this transformer?

The IndexGenerator is useful for ID columns that have no mathematical meaning, such as indexed IDs in a primary key column.

Tip: Use the reset_randomization method to reset the counter back to the original starting_value.
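
As a rough illustration of the counter behavior, the plain-Python sketch below keeps a running counter and offers a reset. The class CounterSketch and its methods are hypothetical stand-ins, not RDT's API. It assumes the counter carries over between generation calls, which is what makes a reset meaningful.

```python
# Hypothetical stand-in for the counter logic; not RDT's implementation.

class CounterSketch:
    def __init__(self, starting_value=0):
        self.starting_value = starting_value
        self._next_value = starting_value  # next integer to hand out

    def generate(self, count):
        """Return `count` consecutive integers and advance the counter."""
        values = list(range(self._next_value, self._next_value + count))
        self._next_value += count
        return values

    def reset(self):
        """Send the counter back to the original starting_value."""
        self._next_value = self.starting_value


counter = CounterSketch(starting_value=100)
first_batch = counter.generate(3)
second_batch = counter.generate(2)  # continues where the first batch stopped

counter.reset()
after_reset = counter.generate(2)   # starts again at starting_value

print(first_batch)   # [100, 101, 102]
print(second_batch)  # [103, 104]
print(after_reset)   # [100, 101]
```

In this sketch, values generated after a reset repeat earlier ones, so uniqueness holds only within a run that is not reset partway through.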