# Basic Concepts

The RDT library is a collection of objects that can understand your raw data and convert it into cleaned, numerical data.

## Transformers

Transformers are the basic building blocks. They are designed to modify a single column of your dataset. All transformers can also be reversed.

![](https://2225246359-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FVGX92M819eIp0rMg5elc%2Fuploads%2FLfzUKzoY8FqJCpB6gcFE%2Frdt_basic-concepts-transformers_June%2002%202025.png?alt=media\&token=39e18cec-210e-414a-aa2b-371779c76e7f)

Transformers are designed to work on specific types of data using different techniques. You can determine which strategies to use for your data, including handling missing values.

{% hint style="info" %}
The [Transformers Glossary](https://docs.sdv.dev/rdt/transformers-glossary) contains a full list of available transformers and their settings.
{% endhint %}

## HyperTransformer

The **HyperTransformer** manages all the transformers you need for an entire, multi-column dataset. You can mix and match your favorite transformers on different columns of your data.

![](https://2225246359-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FVGX92M819eIp0rMg5elc%2Fuploads%2Fl9PgfcHAYkfbWyiG2Yn5%2Frdt_transform.gif?alt=media\&token=0922689f-3469-494e-ae89-c98eca2b64ed)

You can also reverse the process to recover the original data format.

{% hint style="info" %}
Read the [HyperTransformer usage guide](https://docs.sdv.dev/rdt/usage/hypertransformer) to learn more.
{% endhint %}

## Sdtypes

The RDT library uses sdtypes to keep track of what each column in your data represents. You can think of an sdtype as representing the **semantic** (or statistical) meaning of a datatype.

The valid sdtypes in the public RDT library are: `'boolean'`, `'categorical'`, `'datetime'`, `'numerical'`, `'pii'` and `'id'`. More are available to licensed, Enterprise users.

*RDT versions before 1.13.0 included an sdtype called `'text'`. In newer versions, use `'id'` instead.*

{% hint style="info" %}
An sdtype is a high-level concept that does not depend on how a computer stores the data. A single sdtype (such as `'categorical'`) can be stored by a computer in several ways (text, integer, etc.).
{% endhint %}

## Config

The config is the plan for transforming all the columns in a dataset. It lists the columns in your dataset, their sdtypes and the transformer that will be applied to each one.

```python
from rdt.transformers import FloatFormatter, LabelEncoder, UnixTimestampEncoder

config = {
  'sdtypes': {
    'last_login': 'datetime',
    'email_optin': 'boolean',
    'credit_card': 'categorical',
    'age': 'numerical',
    'dollars_spent': 'numerical'
  },
  'transformers': {
    'last_login': UnixTimestampEncoder(),
    'email_optin': LabelEncoder(add_noise=True),
    'credit_card': None, # do not do anything with this column
    'age': None, # do not do anything with this column
    'dollars_spent': FloatFormatter(missing_value_replacement='random')
  }
}
```

In the example above, each column is assigned a transformer based on its sdtype. Columns whose transformer is `None` are left untransformed.

{% hint style="success" %}
**Some transformers work on a combination of columns.** For example, an address may be spread across multiple columns, each corresponding to a different sdtype such as city or postcode. You can supply multiple columns to a transformer by using a tuple of column names as the key.

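```python
from rdt.transformers import AnonymizedFaker, FloatFormatter
# RandomLocationGenerator must also be imported; its import path is not given here.

config = {
    'sdtypes': {
        'name': 'pii',
        'age': 'numerical',
        'addr_1': 'street_address',
        'addr_2': 'secondary_address',
        'city': 'city',
        'state': 'state_abbr'
    },
    'transformers': {
        'name': AnonymizedFaker(),
        'age': FloatFormatter(missing_value_replacement='random'),
        # one transformer applied to a tuple of columns
        ('addr_1', 'addr_2', 'city', 'state'): RandomLocationGenerator()
    }
}
```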

{% endhint %}
