Custom Logic
If the predefined constraint classes don't meet your needs, you can write your own custom business logic.
Custom constraints are a last resort. Adding a custom constraint requires you to specify and maintain your own logic. The SDV team does not offer debugging support to public users for their custom logic.
In many cases, it's possible to achieve your result more easily with existing SDV features:
- Metadata. The SDV metadata supports many semantic data types such as emails, phone numbers and credit card numbers. When you specify these sdtypes, the SDV automatically creates valid data for them. For more info, see the Metadata Spec and sdtype definition.
- Preprocessing. Tuning the pre- and post-processing leads to higher quality data. For more information, read about transformation for single and multi-table usages.
- Predefined Constraints. The SDV team has created and tested predefined constraints so we recommend using them when possible. For more information, see predefined constraints.
To write your custom logic, you'll need to include:
- Validity Check: A test that determines whether the logic is valid for all rows of the data, and
- (optional) Transformation Functions: Functions to modify the data before & after modeling
The SDV uses the functionality you provide to meet the constraint, as shown in the diagram below.

The SDV checks to see if the real data is valid and then transforms it before passing it into the model. The synthetic data needs to be reverse transformed. Finally, the SDV will filter it through the validity check to only return valid synthetic data.
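The flow described above can be sketched in plain pandas. This only illustrates the order of operations and is not the SDV's internal implementation. The rule (`low` must not exceed `high`), the function names and the `fake_model_sample` stand-in for a real model are all assumptions made for this example.

```python
import pandas as pd

def is_valid(column_names, data):
    # Rule for this sketch: the first column must not exceed the second.
    low, high = column_names
    return data[low] <= data[high]

def transform(column_names, data):
    # Replace the second column with its gap above the first column.
    low, high = column_names
    out = data.copy()
    out[high] = out[high] - out[low]
    return out

def reverse_transform(column_names, transformed):
    # Rebuild the second column; abs() keeps the gap non-negative.
    low, high = column_names
    out = transformed.copy()
    out[high] = out[low] + out[high].abs()
    return out

def fake_model_sample(transformed, num_rows):
    # Hypothetical stand-in for a model: resample rows and shift the gap.
    sampled = transformed.sample(num_rows, replace=True, random_state=0)
    sampled = sampled.reset_index(drop=True)
    sampled['high'] = sampled['high'] - 1.5  # may push some gaps below zero
    return sampled

columns = ['low', 'high']
real = pd.DataFrame({'low': [1.0, 2.0, 3.0], 'high': [2.0, 2.0, 7.0]})

assert is_valid(columns, real).all()                 # 1. check the real data
modeled = transform(columns, real)                   # 2. transform before modeling
sampled = fake_model_sample(modeled, 5)              # 3. model and sample
synthetic = reverse_transform(columns, sampled)      # 4. reverse transform
synthetic = synthetic[is_valid(columns, synthetic)]  # 5. keep only valid rows
```

Because the reverse transformation already enforces the rule, the final filter in this sketch removes nothing. Without the transformation pair, that filter would be the only safeguard.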
Should I provide transformation functions? What happens if I don't?
Providing transformation functions is highly encouraged. The SDV always attempts to transform and reverse transform your data. This is the most efficient way to ensure that your constraint is met. If you do not provide these functions (or if they crash), the SDV will fall back to using only the validity check.
The validity check and transformations must be implemented in a separate Python file, for example example_custom_constraint.py. Make sure you always provide this file as an attachment.
Inside the file, define the validity check and the transformations, and create the custom constraint class.
To check for validity, write a function with the following signature.
Parameters
- (required) column_names: A list of column names to check the validity for. If your logic is defined only for a single column, you can use only the first element of the list.
- (required) data: A pandas DataFrame containing the data to check.
- **kwargs: Any other parameters that you need.
Output: A pandas Series object of True/False values that specify whether each row is valid. There should be exactly 1 True/False value for every row in the data.
import pandas as pd

def is_valid(column_names, data, extra_parameter):
    # replace with your custom logic
    validity = [True]*data.shape[0]
    return pd.Series(validity)
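As an illustration of filling in this template, the sketch below checks that every value in the listed columns is a whole multiple of a step size. The rule, the parameter name `increment` and the sample data are assumptions chosen for this example. They are not part of the SDV.

```python
import pandas as pd

def is_valid(column_names, data, increment):
    # Start by assuming every row is valid, then rule rows out column by column.
    validity = pd.Series(True, index=data.index)
    for name in column_names:
        # A value is valid when dividing by the increment leaves no remainder.
        validity &= (data[name] % increment == 0)
    return validity

data = pd.DataFrame({'price': [10, 15, 22], 'fee': [5, 5, 10]})
result = is_valid(['price', 'fee'], data, increment=5)
```

The returned Series has one boolean per row and shares the index of the input data, so it can be used directly as a row filter.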
Optionally, you can provide a transformation function that modifies the data. The modification should allow the model to learn the rule. It should be paired with a complementary function that reverses the transformation.
Parameters
- (required) column_names: A list of column names to transform. Note that the column names must be present in the original data, but you can modify or delete them in the function.
- (required) data: A pandas DataFrame containing the data to transform.
- **kwargs: Any other parameters that you need. These should be the same as those of the validity function.
Output: A pandas DataFrame that represents the transformed version of the data.
def transform(column_names, data, extra_parameter):
    # replace with your custom logic
    transformed_data = data.copy()
    return transformed_data

def reverse_transform(column_names, transformed_data, extra_parameter):
    # replace with your custom logic
    reversed_data = transformed_data.copy()
    return reversed_data
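One way to fill in the pair is sketched here, assuming the rule is that values must be a whole multiple of a step size. The transformation divides by the step. The reverse rounds to the nearest whole number before multiplying back, so any value a model produces maps to a valid one. The parameter name `increment` and the sample values are hypothetical.

```python
import pandas as pd

def transform(column_names, data, increment):
    # Express each value as a count of increments.
    transformed_data = data.copy()
    for name in column_names:
        transformed_data[name] = transformed_data[name] / increment
    return transformed_data

def reverse_transform(column_names, transformed_data, increment):
    # Round to a whole count, then scale back up; the result is always a multiple.
    reversed_data = transformed_data.copy()
    for name in column_names:
        reversed_data[name] = reversed_data[name].round() * increment
    return reversed_data

real = pd.DataFrame({'price': [10, 15, 20]})
round_trip = reverse_transform(
    ['price'], transform(['price'], real, increment=5), increment=5
)

# Values a model might produce in the transformed space need not be whole numbers.
modeled = pd.DataFrame({'price': [1.2, 2.9, 4.4]})
synthetic = reverse_transform(['price'], modeled, increment=5)
```

Both functions take the same extra parameter as the validity check would, which matches the requirement that all three functions share one signature.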
Note that all functions accept the same signature. The **kwargs can be defined however you want, but all functions should use the same ones.
When you have your functions defined, use the create_custom_constraint_class factory method to create your constraint class.
Parameters
- (required) is_valid_fn: The validity check
- transform_fn: The transformation function. If this is not provided, no transformation is applied.
- reverse_transform_fn: The reverse transformation function. This is only required if you provided the transform function.
Output: A Python class for your constraint
from sdv.constraints import create_custom_constraint_class

MyCustomConstraintClass = create_custom_constraint_class(
    is_valid_fn=is_valid,
    transform_fn=transform,
    reverse_transform_fn=reverse_transform
)
Download the template file below to get started with creating your custom constraint class.
custom_constraint_template.py
In a separate Python file, you'll create a synthesizer. There, you can load and apply your custom logic. The synthesizer you use will have more information about how to use your custom constraint. A general example for a single table synthesizer is shown below.
# load the constraint from the file
synthesizer.load_custom_constraint_classes(
    filepath='custom_constraint_template.py',
    class_names=['MyCustomConstraintClass']
)
Once you've loaded the file, you can create your custom constraint using the logic.
When creating the constraints, you'll create a dictionary object with the constraint class and constraint parameters. The parameters are the column names and any extra parameters you've added.
{
    'constraint_class': 'MyCustomConstraintClass',
    'constraint_parameters': {
        'column_names': ['column_A', 'column_B'],
        'extra_parameter': 10.00
    }
}
Parameters
- (required) column_names: A list of one or more column names involved in the constraint
- <other parameters>: Any other parameters you defined in your functions
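The dictionary can also be assembled programmatically. The helper below is hypothetical (it is not part of the SDV) and simply merges the column names with any extra parameters. The commented-out call at the end assumes a single table synthesizer that exposes an `add_constraints` method accepting a list of such dictionaries. Check the documentation of the synthesizer you use before relying on it.

```python
def build_constraint(class_name, column_names, **extra_parameters):
    # Hypothetical helper: package the class name and parameters in the
    # dictionary layout described in this section.
    return {
        'constraint_class': class_name,
        'constraint_parameters': {
            'column_names': list(column_names),
            **extra_parameters,
        },
    }

my_constraint = build_constraint(
    'MyCustomConstraintClass',
    ['column_A', 'column_B'],
    extra_parameter=10.00,
)

# Assumption: the synthesizer accepts a list of constraint dictionaries.
# synthesizer.add_constraints(constraints=[my_constraint])
```

The keyword names passed as extra parameters must match the parameter names used in the validity and transformation functions.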
Finally, you'll need to (re)train your synthesizer. All synthetic data it produces will be valid for the constraint.
synthesizer.fit(data)
synthetic_data = synthesizer.sample(10)