Constraint-Augmented Generation (CAG)
Do you have business rules in your dataset? These are deterministic rules that every single row in your data must follow in order to be considered valid. By default, SDV synthesizers are probabilistic, so they may not learn to follow your rules 100% of the time.
The good news is that you can input your business rules into your synthesizer using constraints. Our constraint-augmented generation ensures that your synthetic data meets your constraints 100% of the time.
Constraint Example
One example of a business rule is when the values in one column must always be greater than the values in another column. The rule holds for every single row of data.

checkout_date must be greater than the checkin_date for all rows.
You can supply this business rule to a synthesizer using an Inequality constraint.
from sdv.cag import Inequality

# create a constraint that corresponds to your business rule
my_constraint = Inequality(
    low_column_name='checkin_date',
    high_column_name='checkout_date'
)

# add the constraint to your SDV synthesizer
my_synthesizer.add_constraints(constraints=[
    my_constraint
])
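After sampling, you may want to confirm for yourself that the rule holds in your synthetic data. The sketch below is one way to do that with plain pandas. It is not part of the SDV API: the helper find_inequality_violations and the small synthetic_data table are hypothetical stand-ins for illustration. It uses a strict "greater than" comparison to match the wording of the rule above, and it skips rows where either date is missing.

```python
import pandas as pd

# Hypothetical stand-in for data sampled from your synthesizer
synthetic_data = pd.DataFrame({
    'checkin_date': pd.to_datetime(['2024-01-03', '2024-02-10', '2024-03-21']),
    'checkout_date': pd.to_datetime(['2024-01-05', '2024-02-11', '2024-03-28']),
})


def find_inequality_violations(data, low_column_name, high_column_name):
    """Return the rows where the high column is not strictly greater than the low column.

    Rows with a missing value in either column are not reported.
    """
    both_present = data[[low_column_name, high_column_name]].notna().all(axis=1)
    rule_holds = data[high_column_name] > data[low_column_name]
    return data[both_present & ~rule_holds]


violations = find_inequality_violations(
    synthetic_data,
    low_column_name='checkin_date',
    high_column_name='checkout_date'
)
print(f'{len(violations)} of {len(synthetic_data)} rows violate the rule')
```

If the constraint was applied, the returned table should be empty. Any rows that do appear are worth inspecting before you use the data.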
Predefined Constraint Classes
Use predefined constraints to apply logic within a single table or between multiple tables. Predefined constraints represent common business rules that may appear in your dataset.
Browse the predefined constraints to learn more. We also recommend going through our tutorial.
Program Your Own Constraint
If your logic cannot be described by predefined constraints, program your own constraint. The logic must be defined in a separate Python file that you can load and add to any synthesizer.
See the Program Your Own Constraint guide for more details. We also recommend going through our tutorial.
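Before programming a constraint, it can help to write your rule down as a plain validity check so the logic is unambiguous. The sketch below does this with pandas only. It does not use the SDV custom-constraint interface, which is described in the guide. The rule itself, the rate_code column, and the is_valid function are hypothetical examples.

```python
import pandas as pd


def is_valid(data):
    """Hypothetical rule: stays longer than 7 nights must use the 'LONG_STAY' rate.

    Returns a boolean Series with one entry per row (True means the row is valid).
    """
    nights = (data['checkout_date'] - data['checkin_date']).dt.days
    long_stay = nights > 7
    return ~long_stay | (data['rate_code'] == 'LONG_STAY')


# Hypothetical bookings table used only to exercise the rule
bookings = pd.DataFrame({
    'checkin_date': pd.to_datetime(['2024-01-03', '2024-02-01', '2024-03-01']),
    'checkout_date': pd.to_datetime(['2024-01-05', '2024-02-15', '2024-03-12']),
    'rate_code': ['STANDARD', 'LONG_STAY', 'STANDARD'],
})

valid_mask = is_valid(bookings)
print(bookings[~valid_mask])  # rows that break the rule
```

A rule like this combines a computed value (length of stay) with a condition on another column, which is the kind of logic that may not map onto a single predefined constraint.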
Constraints API
Create constraint objects and add them to your synthesizer using the add_constraints function. For more details, see the API Reference.