Range
The Range constraint enforces that for all rows, the value of one of the columns is bounded by the values in the other two columns.
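As a rough illustration of the rule itself (not of how SDV implements it), the sketch below checks a small table row by row and reports which rows break the ordering. The rows are made up, `violating_rows` is a hypothetical helper that is not part of SDV, and the comparison is assumed to be strict.

```python
# Hypothetical rows; the column names mirror the example used on this page.
rows = [
    {"child_age": 8, "parent_age": 35, "grandparent_age": 64},
    {"child_age": 12, "parent_age": 40, "grandparent_age": 39},  # parent is older than grandparent
    {"child_age": 5, "parent_age": 30, "grandparent_age": 61},
]


def violating_rows(rows, low, middle, high):
    """Return the indices of rows where middle is not strictly between low and high."""
    return [
        i for i, row in enumerate(rows)
        if not (row[low] < row[middle] < row[high])
    ]


bad = violating_rows(rows, "child_age", "parent_age", "grandparent_age")
print(bad)
```

A check like this can be handy for confirming that the real data already follows the rule before it is declared as a constraint.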
Constraint API
Create a Range constraint.
(required) low_column_name: The name of the column that contains the lowest value. This must be a numerical or datetime column.
(required) middle_column_name: The name of the column that must be between the low and the high columns. This must be a numerical or datetime column.
(required) high_column_name: The name of the column that contains the highest value. This must be a numerical or datetime column.
strict_boundaries: Whether the boundaries between each of the comparisons are strict.
    (default) True: The middle column must be strictly greater than the low column and strictly less than the high column.
    False: The middle column must be greater than or equal to the low column and less than or equal to the high column.
table_name: A string with the name of the table to apply this to. Required if you have a multi-table dataset.
from sdv.cag import Range

# Require child_age < parent_age < grandparent_age in every row
my_constraint = Range(
    low_column_name='child_age',
    middle_column_name='parent_age',
    high_column_name='grandparent_age',
    strict_boundaries=True
)
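To make the effect of strict_boundaries concrete, here is a minimal sketch of the comparison each setting describes. The `in_range` function is a hypothetical helper written for this illustration. It is not SDV code, and SDV's internal handling may differ. The same comparisons apply to datetime values as to numbers.

```python
from datetime import date


def in_range(low, middle, high, strict_boundaries=True):
    """Check a single row's values against the Range rule (illustrative only)."""
    if strict_boundaries:
        # Strict: equality with either bound is not allowed
        return low < middle < high
    # Non-strict: the middle value may equal either bound
    return low <= middle <= high


# A middle value equal to the low value only passes when boundaries are not strict
strict_result = in_range(10, 10, 20)
relaxed_result = in_range(10, 10, 20, strict_boundaries=False)

# Datetime values are compared the same way
date_result = in_range(date(2020, 1, 1), date(2020, 6, 1), date(2021, 1, 1))

print(strict_result, relaxed_result, date_result)
```

If the real data contains rows where the middle value equals one of its bounds, strict_boundaries=False is likely the setting that matches it.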
Usage
Apply the constraint to any SDV synthesizer. Then fit and sample as usual.
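from sdv.single_table import GaussianCopulaSynthesizer

# metadata and data are assumed to be loaded already
synthesizer = GaussianCopulaSynthesizer(metadata)
synthesizer.add_constraints([my_constraint])
synthesizer.fit(data)
synthetic_data = synthesizer.sample()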
For more information about using predefined constraints, please see the Constraint-Augmented Generation tutorial.