Constraint Logic
Do you have rules that every row in the data must follow? Are these the same regardless of how much data there is? You can use constraints to describe this business logic in your metadata.
Predefined Constraint Classes
The SDV has 9 predefined constraint classes that cover rules commonly found in enterprise data. For example, when the value in one column must always be greater than the value in another, use the Inequality constraint.
my_constraint = {
    'constraint_class': 'Inequality',
    'table_name': 'guests',  # for multi table synthesizers
    'constraint_parameters': {
        'low_column_name': 'checkin_date',
        'high_column_name': 'checkout_date',
        'strict_boundaries': True
    }
}

my_synthesizer.add_constraints(constraints=[
    my_constraint
])
Browse the predefined constraints to learn more.
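To make the rule concrete, here is a minimal plain-Python sketch of the row-level check that an inequality rule like the one above describes. It is not SDV code. The helper `violates_inequality` and the sample rows are hypothetical, and the sketch assumes that `strict_boundaries=True` means the high value must be strictly greater than the low value.

```python
from datetime import date


def violates_inequality(row, low_column_name, high_column_name, strict_boundaries=True):
    """Return True if the row breaks the rule low < high (or low <= high when not strict)."""
    low, high = row[low_column_name], row[high_column_name]
    if strict_boundaries:
        return not (low < high)
    return not (low <= high)


# Hypothetical sample rows, for illustration only
guests = [
    {'checkin_date': date(2024, 3, 1), 'checkout_date': date(2024, 3, 4)},  # valid
    {'checkin_date': date(2024, 3, 5), 'checkout_date': date(2024, 3, 5)},  # same day
    {'checkin_date': date(2024, 3, 9), 'checkout_date': date(2024, 3, 7)},  # reversed
]

# Indices of rows that break the strict rule
violations = [
    index for index, row in enumerate(guests)
    if violates_inequality(row, 'checkin_date', 'checkout_date', strict_boundaries=True)
]
print(violations)
```

Under these assumptions, the same-day row only counts as a violation when the boundaries are strict.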
Custom Business Logic
If your dataset includes business logic that cannot be covered by the predefined constraints, then you can create your own custom constraint. The logic must be defined in a separate Python file that you can load.
my_synthesizer.load_custom_constraint_classes(
    filepath='custom_constraint_template.py',
    class_names=['MyCustomConstraintClass']
)
Then, you can create a custom constraint just like a predefined constraint.
my_custom_constraint = {
    'constraint_class': 'MyCustomConstraintClass',
    'table_name': 'guests',  # for multi table synthesizers
    'constraint_parameters': {
        'column_names': ['column_A', 'column_B'],
        'extra_parameter': 10.00
    }
}

my_synthesizer.add_constraints(constraints=[
    my_custom_constraint
])
See the Custom Business Logic guide for more details.
FAQs
Do you need constraints? Before adding a constraint to your model, carefully consider whether it is necessary. Here are a few questions to ask:
How do I plan to use the synthetic data? Without the constraint, the rule may still be valid a majority of the time. Only add the constraint if you require 100% adherence.
Who do I plan to share the synthetic data with? Consider whether they will be able to use the business rule to uncover sensitive information about the real data.
How did the rule come to be? In some cases, the rule is a byproduct of how the dataset was prepared, and another data source may be available that does not have the extra columns and rules.
In the ideal case, you are applying only a handful of constraints to your model.
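One way to approach the first question is to measure how often the rule already holds in data sampled without the constraint. The following is a minimal plain-Python sketch, not part of SDV. The helper `adherence_rate`, the column names, and the sample rows are all hypothetical.

```python
def adherence_rate(rows, rule):
    """Return the fraction of rows for which rule(row) is True (1.0 for empty input)."""
    rows = list(rows)
    if not rows:
        return 1.0
    return sum(1 for row in rows if rule(row)) / len(rows)


# Hypothetical rows standing in for data sampled without any constraint
synthetic_rows = [
    {'checkin_day': 1, 'checkout_day': 3},
    {'checkin_day': 4, 'checkout_day': 6},
    {'checkin_day': 8, 'checkout_day': 7},  # breaks the rule
    {'checkin_day': 2, 'checkout_day': 5},
]

# Rule under test: checkout must come strictly after checkin
rate = adherence_rate(
    synthetic_rows,
    lambda row: row['checkin_day'] < row['checkout_day'],
)
print(f'{rate:.0%} of rows follow the rule')
```

If the measured rate is already high enough for your use case, the constraint may be unnecessary. If you require 100% adherence, add the constraint.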