# Configuration

Before your HyperTransformer can work, you need to provide it with a config that describes:

* the columns in your dataset and
* the transformers that should be applied to turn them into numerical data.

## Creating the config

To create the config, you can either let the HyperTransformer detect it automatically from your data or write it by hand.

### detect\_initial\_config()

This method automatically detects the config from your data and sets it. It overrides any existing config you may have previously set or detected.

**Parameters**

* (required) **`data`**: a [pandas DataFrame](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html) containing your data.

**Output** (None) This function prints out the status and detected config. The config describes the sdtypes of each column and the transformer objects that will be used for each. For more details, see the [Basic Concepts](https://docs.sdv.dev/rdt/basic-concepts#config) guide.

**Examples**

```python
ht.detect_initial_config(data=customers)
```

```python
Detecting a new config from the data ... SUCCESS
Setting the new config ... SUCCESS

Config:
{
  'sdtypes': {
    'last_login': 'datetime',
    'email_optin': 'boolean',
    'credit_card': 'categorical',
    'age': 'numerical',
    'dollars_spent': 'numerical'
  },
  'transformers': {
    'last_login': UnixTimestampEncoder(missing_value_replacement="mean"),
    'email_optin': UniformEncoder(),
    'credit_card': UniformEncoder(),
    'age': FloatFormatter(),
    'dollars_spent': FloatFormatter(missing_value_replacement="mean")
  }
}
```

### set\_config()

This method sets the config. Use this as an alternative to `detect_initial_config` if you want to write and set the config manually.

**Parameters**

* (required) **`config`**: A nested dictionary that describes the config. It must follow the format shown below.

```python
{
  'sdtypes': {
    'column_name': <sdtype>,
    'column_name': <sdtype>,
    ...
  },
  'transformers': {
    'column_name': <transformer object>,
    'column_name': <transformer object>,
    ...
  } 
}
```

{% hint style="info" %}
The public RDT supports the following sdtypes: `'boolean'`, `'categorical'`, `'datetime'`, `'numerical'`, `'pii'` and `'id'`

You can use any transformer object from the RDT (or specify `None` if you do not want to transform the column). Visit the [Transformers Glossary](https://docs.sdv.dev/rdt/transformers-glossary) to browse through the available transformers and their settings.

See the [Config guide](https://docs.sdv.dev/rdt/basic-concepts#config) for more details.
{% endhint %}

**Output** (None)

**Examples**

You must provide the full config that describes all the columns in your dataset.

```python
from rdt.transformers.datetime import UnixTimestampEncoder
from rdt.transformers.categorical import UniformEncoder
from rdt.transformers.numerical import FloatFormatter

ht.set_config(config={
  'sdtypes': {
    'last_login': 'datetime',
    'email_optin': 'boolean',
    'credit_card': 'categorical',
    'age': 'numerical',
    'dollars_spent': 'numerical'
  },
  'transformers': {
    'last_login': UnixTimestampEncoder(missing_value_replacement="mean"),
    'email_optin': UniformEncoder(),
    'credit_card': UniformEncoder(),
    'age': None,
    'dollars_spent': FloatFormatter(missing_value_replacement="mean")
  }
})
```
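Because the config must describe every column, it can help to check a hand-written draft against your column names before calling `set_config`. The sketch below is one way to do that in plain Python. `find_config_mismatches` is a hypothetical helper, not part of RDT, and it assumes only the two-section format shown above. The `None` values are placeholders so that the snippet runs without RDT installed.

```python
def find_config_mismatches(config, column_names):
    """Return a dict of problems found in a hand-written config.

    Hypothetical helper (not part of RDT). It only checks that the
    'sdtypes' and 'transformers' sections both describe every column.
    """
    expected = set(column_names)
    problems = {}
    for section in ('sdtypes', 'transformers'):
        described = set(config.get(section, {}))
        missing = sorted(expected - described)
        unknown = sorted(described - expected)
        if missing:
            problems[f'{section}_missing'] = missing
        if unknown:
            problems[f'{section}_unknown'] = unknown
    return problems


columns = ['last_login', 'email_optin', 'credit_card', 'age', 'dollars_spent']

# Draft config that forgets to give 'dollars_spent' an sdtype.
draft_config = {
    'sdtypes': {
        'last_login': 'datetime',
        'email_optin': 'boolean',
        'credit_card': 'categorical',
        'age': 'numerical',
    },
    'transformers': {
        'last_login': None,  # placeholders; a real config holds transformer objects
        'email_optin': None,
        'credit_card': None,
        'age': None,
        'dollars_spent': None,
    },
}

problems = find_config_mismatches(draft_config, columns)
print(problems)  # {'sdtypes_missing': ['dollars_spent']}
```

With a real dataset, you could pass `list(data.columns)` as the column names. An empty result means both sections cover the same columns as your data.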

## Viewing the config

### get\_config()

At any point, you can use this method to retrieve the current config.

**Parameters** (None)

**Output** A nested dictionary that describes the config. It follows the format shown below.

```python
{
  'sdtypes': {
    'column_name': <sdtype>,
    'column_name': <sdtype>,
    ...
  },
  'transformers': {
    'column_name': <transformer object>,
    'column_name': <transformer object>,
    ...
  }
}
```

See the [Config guide](https://docs.sdv.dev/rdt/basic-concepts#config) for more details.

**Examples**

```python
config = ht.get_config()
```

```python
{
  'sdtypes': {
    'last_login': 'datetime',
    'email_optin': 'boolean',
    'credit_card': 'categorical',
    'age': 'numerical',
    'dollars_spent': 'numerical'
  },
  'transformers': {
    'last_login': UnixTimestampEncoder(missing_value_replacement="mean"),
    'email_optin': UniformEncoder(),
    'credit_card': UniformEncoder(),
    'age': None,
    'dollars_spent': FloatFormatter(missing_value_replacement="mean")
  }
}
```
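Since the returned config is a nested dictionary, you can inspect it with ordinary Python. As a small sketch, the hypothetical helper below (not part of RDT) lists the columns whose transformer is `None`, meaning they will be left untransformed. A stand-in dictionary with placeholder strings is used instead of a real `ht.get_config()` result so that the snippet runs on its own.

```python
def untransformed_columns(config):
    """List columns whose transformer is None.

    Hypothetical helper (not part of RDT); it only reads the
    'transformers' section of the config dictionary.
    """
    return [
        column
        for column, transformer in config['transformers'].items()
        if transformer is None
    ]


# Stand-in for the dictionary returned by ht.get_config(). The strings
# are placeholders for real transformer objects.
example_config = {
    'sdtypes': {
        'last_login': 'datetime',
        'age': 'numerical',
        'dollars_spent': 'numerical',
    },
    'transformers': {
        'last_login': 'UnixTimestampEncoder',
        'age': None,
        'dollars_spent': 'FloatFormatter',
    },
}

skipped = untransformed_columns(example_config)
print(skipped)  # ['age']
```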

## Modifying the config

Customize your HyperTransformer by modifying the config.

### update\_sdtypes()

This method modifies the sdtypes. It also automatically assigns a new transformer that's compatible with the new sdtype.

**Parameters**

* (required) **`column_name_to_sdtype`**: A dictionary that maps a column name to its new sdtype. The public RDT supports `'boolean'`, `'categorical'`, `'datetime'`, `'numerical'`, `'pii'` and `'id'` sdtypes. More are available for licensed users.

**Output** (None) After using this method, you can use `get_config()` to verify the changes.

**Examples**

```python
ht.update_sdtypes(column_name_to_sdtype={
  'last_login': 'datetime',
  'email_optin': 'categorical'
})
```

### update\_transformers()

This method updates the transformers that will be used on specific columns. Use it to customize your HyperTransformer, for example by changing a transformer setting or swapping out one transformer for another.

**Parameters**

* (required) **`column_name_to_transformer`**: A dictionary that maps a column name to the new transformer that will be used on it.

{% hint style="info" %}
You can use any transformer object from the RDT. Visit the [Transformers Glossary](https://docs.sdv.dev/rdt/transformers-glossary) to browse through the available transformers and their settings.
{% endhint %}

**Output** (None) After using this method, you can use `get_config()` to verify the changes.

**Examples**

To update transformers, you must first create the transformers you want to use and then apply the method.

```python
from rdt.transformers.datetime import OptimizedTimestampEncoder
from rdt.transformers.categorical import LabelEncoder

# create new transformer objects
login_transformer = OptimizedTimestampEncoder(missing_value_replacement='random')
credit_transformer = LabelEncoder(add_noise=True)

# update the columns to use the new transformers
ht.update_transformers(column_name_to_transformer={
  'last_login': login_transformer,
  'credit_card': credit_transformer
})
```

### remove\_transformers()

This method removes transformers for specific columns. Use this if you do not want the HyperTransformer to modify certain columns at all. The HyperTransformer will skip these columns and transform only the remaining columns that still have transformers.

**Parameters**

* (required) **`column_names`**: A list of column names. The transformers for these column names are removed.

**Output** (None) After using this method, you can use `get_config()` to verify the changes.

**Examples**

```python
# do not transform the credit_card or age columns
ht.remove_transformers(column_names=['credit_card', 'age'])
```

### update\_transformers\_by\_sdtype()

This method updates all columns of a given sdtype to use a specific transformer.

**Parameters**

* (required) **`sdtype`**: An sdtype. This method will select all columns that match the sdtype.
* (required) **`transformer_name`**: A string with the name of the transformer to use.
* **`transformer_parameters`**: A dictionary that maps the name of the transformer parameter (string) to the parameter value. Use this if you want to override the default settings.

{% hint style="info" %}
Visit the [Transformers Glossary](https://docs.sdv.dev/rdt/transformers-glossary) to browse through the available transformers and their settings.
{% endhint %}

**Output** (None) After using this method, you can use `get_config()` to verify the changes.

**Examples**

```python
# update all numerical columns to use a specific transformer
ht.update_transformers_by_sdtype(
  sdtype='numerical',
  transformer_name='FloatFormatter',
  transformer_parameters={'missing_value_generation': 'from_column',
                          'enforce_min_max_values': True}
)
```
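Before updating by sdtype, you may want to see which columns will be affected. A minimal sketch, assuming the config format described on this page: the hypothetical helper `columns_with_sdtype` (not part of RDT) reads the `'sdtypes'` section of a config dictionary and returns the matching column names. The dictionary here is a stand-in for the result of `ht.get_config()`.

```python
def columns_with_sdtype(config, sdtype):
    """Return the column names whose sdtype matches the given one.

    Hypothetical helper (not part of RDT); it only reads the
    'sdtypes' section of the config dictionary.
    """
    return [
        column
        for column, column_sdtype in config['sdtypes'].items()
        if column_sdtype == sdtype
    ]


# Stand-in for the 'sdtypes' section returned by ht.get_config().
example_config = {
    'sdtypes': {
        'last_login': 'datetime',
        'email_optin': 'boolean',
        'credit_card': 'categorical',
        'age': 'numerical',
        'dollars_spent': 'numerical',
    },
}

numerical_columns = columns_with_sdtype(example_config, 'numerical')
print(numerical_columns)  # ['age', 'dollars_spent']
```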

### remove\_transformers\_by\_sdtype()

This method removes transformers for all columns of a given sdtype. Use this method if you do not want to transform any columns of a particular sdtype.

**Parameters**

* (required) **`sdtype`**: An sdtype. This method will remove the transformer for all columns that match the given sdtype.

**Output** (None) After using this method, you can use `get_config()` to verify the changes.

**Examples**

```python
# do not transform any categorical columns in the dataset
ht.remove_transformers_by_sdtype(sdtype='categorical')
```
