Custom Logic
If the predefined constraint classes don't meet your needs, you can write your own custom business logic.
Compatibility: Any type of column except for primary and foreign keys
Custom constraints are a last resort. Adding a custom constraint requires you to specify and maintain your own logic. The SDV team does not offer debugging support to public users for their custom logic.
In many cases, it's possible to achieve your result more easily with existing SDV features:
Metadata. The SDV metadata supports many semantic data types such as emails, phone numbers and credit card numbers. When you specify these sdtypes, the SDV automatically creates valid data for them. For more information, see the SDV metadata documentation.
Preprocessing. Tuning the pre- and post-processing leads to higher quality data. For more information, see the SDV documentation on preprocessing and transformations.
Predefined Constraints. The SDV team has created and tested predefined constraints, so we recommend using them when possible. For more information, see the predefined constraints documentation.
If you have any questions, please reach out to the SDV team and we'll be happy to point you in the right direction.
To write your custom logic, you'll need to include:
Validity Check: A test that determines whether the logic is valid for all rows of the data, and
(optional) Transformation Functions: Functions to modify the data before & after modeling
The SDV uses the functionality you provide to meet the constraint, as shown in the diagram below.
The validity check and transformations must be implemented in a separate Python file, for example example_custom_constraint.py. Make sure you always provide this file as an attachment.
Inside the file, define the validity check and any transformations, then create the custom constraint class.
To check for validity, write a function with the following signature.
Parameters
(required) column_names: A list of column names to check the validity for. If your logic is defined only for a single column, you can use only the first element of the list.
**kwargs: Any other parameters that you need.
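As an illustration of the validity check described above, here is a minimal sketch. The rule (values must be a multiple of a given number), the extra parameter increment and the column name price are hypothetical and chosen only for this example; the function takes the column names, the data and the extra parameter, and returns one True/False value per row.

```python
import pandas as pd


def is_valid(column_names, data, increment):
    """Return True for each row whose value is a multiple of `increment`."""
    column = column_names[0]  # this hypothetical rule involves a single column
    return data[column] % increment == 0


# Quick check on a toy table (hypothetical data)
example = pd.DataFrame({'price': [10, 15, 22]})
validity = is_valid(['price'], example, increment=5)
print(validity.tolist())  # [True, True, False]
```

The result is a pandas Series with exactly one boolean per row, which makes it easy to inspect which rows break the rule before involving a synthesizer.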
Optionally, you can provide a transformation function that modifies the data. The modification should allow the model to learn the rule. It should be paired with a complementary function that reverses the transformation.
Parameters
(required) column_names: A list of column names to transform. Note that the column names must be present in the original data, but you can modify or delete them in the function.
**kwargs: Any other parameters that you need. These should be the same as the validity function.
Output: A pandas DataFrame that represents the transformed version of the data.
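A transformation and its reverse might look like the sketch below. It assumes the same hypothetical rule as a "multiple of increment" check: the forward function divides the column by increment so the model only sees the multiplier, and the reverse function rounds the modeled value and multiplies it back, so the output is always a valid multiple. The function and parameter names are illustrative, not prescribed by the SDV.

```python
import pandas as pd


def transform(column_names, data, increment):
    """Divide the column by `increment` so the model learns the multiplier."""
    column = column_names[0]
    transformed = data.copy()  # avoid modifying the caller's table
    transformed[column] = transformed[column] / increment
    return transformed


def reverse_transform(column_names, transformed_data, increment):
    """Round to the nearest whole multiplier and scale back up."""
    column = column_names[0]
    reversed_data = transformed_data.copy()
    reversed_data[column] = reversed_data[column].round().astype(int) * increment
    return reversed_data


# Round trip on a toy table (hypothetical data)
original = pd.DataFrame({'price': [10, 20, 35]})
modeled = transform(['price'], original, increment=5)
restored = reverse_transform(['price'], modeled, increment=5)
print(restored['price'].tolist())  # [10, 20, 35]
```

Because the reverse step rounds before scaling, any value the model produces maps back to a multiple of increment, which is how the pair helps the synthetic data satisfy the rule.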
When you have your functions defined, use the create_custom_constraint_class factory method to create your constraint class.
Parameters
(required) is_valid_fn: The validity check
transform_fn: The transformation function. If this is not provided, no transformation is applied.
reverse_transform_fn: The reverse transformation function. This is only required if you provided the transform function.
Output: A Python class for your constraint
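The factory call could be sketched as follows. This assumes an SDV 1.x release in which create_custom_constraint_class can be imported from sdv.constraints; check the import path against your installed version. The class name MultipleOfConstraint and the increment parameter are hypothetical, and only the required validity function is passed here.

```python
from sdv.constraints import create_custom_constraint_class  # assumed import path (SDV 1.x)


def is_valid(column_names, data, increment):
    """Hypothetical rule: values must be a multiple of `increment`."""
    return data[column_names[0]] % increment == 0


# transform_fn and reverse_transform_fn are optional and omitted in this sketch
MultipleOfConstraint = create_custom_constraint_class(is_valid_fn=is_valid)
```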
Download the template file below to get started with creating your custom constraint class.
In a separate Python file, you'll create a synthesizer. There, you can load and apply your custom logic. The synthesizer you use will have more information about how to use your custom constraint. A general example for a single table synthesizer is shown below.
Once you've loaded the file, you can create your custom constraint using the logic.
Parameters
When creating the constraints, you'll create a dictionary object with the constraint class and constraint parameters. The parameters are the column names and any extra parameters you've added.
(required) column_names: A list of one or more column names involved in the constraint
<other parameters>: Any other parameters you defined in your functions
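The constraint dictionary described above might look like the sketch below. The class name MultipleOfConstraint, the column price and the extra parameter increment are hypothetical. The key names and the synthesizer calls mentioned in the comments reflect the SDV 1.x single table API as best understood here, so verify them against the documentation for your synthesizer.

```python
# Constraint definition: the class to use plus the parameters passed to its functions
my_constraint = {
    'constraint_class': 'MultipleOfConstraint',  # hypothetical class name
    'constraint_parameters': {
        'column_names': ['price'],  # required: columns involved in the rule
        'increment': 5,             # extra parameter defined in the custom functions
    },
}

# With a synthesizer in hand, the file is loaded first and the constraint is then
# added (assumed SDV 1.x calls, shown as comments so this sketch runs without SDV):
#   synthesizer.load_custom_constraint_classes(
#       filepath='example_custom_constraint.py',
#       class_names=['MultipleOfConstraint'])
#   synthesizer.add_constraints(constraints=[my_constraint])
```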
Finally, you'll need to (re)train your synthesizer. All synthetic data it produces will be valid for the constraint.
(required) data: A table of data to check with the validity function, represented as a pandas DataFrame object
Output (validity function): A pandas Series of True/False values that specify whether each row is valid. There should be exactly 1 True/False value for every row in the data.
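(required) data or transformed_data: A table of data, represented as a pandas DataFrame object. The transformation function receives data, and the reverse transformation function receives transformed_data.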
Unfortunately, the SDV team is unable to offer individualized custom constraint support to public SDV users. For SDV Enterprise users, we offer help with debugging and may prioritize creating a new, predefined constraint related to your logic. To learn more about the SDV Enterprise features and purchasing a license, reach out to the SDV team.