Inequality
The Inequality constraint enforces an inequality relationship between a pair of columns. For every row, the value in one column must be greater than the value in the other.
Constraint API
Parameters
(required) low_column_name: The name of the column whose values must be lower. Only numerical and datetime columns are allowed.
(required) high_column_name: The name of the column whose values must be greater. Only numerical and datetime columns are allowed.
strict_boundaries: Whether the high column must be strictly greater than the low column.
(default) True: The value in the high column must be strictly greater than the value in the low column.
False: The value in the high column must be greater than or equal to the value in the low column.
table_name: A string with the name of the table to apply this to. Required if you have a multi-table dataset.
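To illustrate what strict_boundaries controls, the following plain-Python sketch applies the rule to a single pair of values. The helper name satisfies_inequality is hypothetical and is not part of SDV; it only mirrors the rule described in the parameter list.

```python
from datetime import date


def satisfies_inequality(low, high, strict_boundaries=True):
    """Hypothetical helper (not part of SDV): apply the rule to one row."""
    if strict_boundaries:
        return high > low   # equal values are rejected
    return high >= low      # equal values are accepted


same_day = date(2024, 5, 1)
next_day = date(2024, 5, 2)

# A same-day check-in and check-out only passes with strict_boundaries=False.
strict_equal = satisfies_inequality(same_day, same_day, strict_boundaries=True)
relaxed_equal = satisfies_inequality(same_day, same_day, strict_boundaries=False)
ordered = satisfies_inequality(same_day, next_day)
```

In this sketch, the only case where the two settings disagree is when both columns hold the same value.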
from sdv.cag import Inequality

my_constraint = Inequality(
    low_column_name='checkin_date',
    high_column_name='checkout_date',
    strict_boundaries=True
)
Usage
Apply the constraint to any SDV synthesizer. Then fit and sample as usual.
For more information about using predefined constraints, please see the Constraint-Augmented Generation tutorial.
❖ Auto-Detection
❖ SDV Enterprise Bundle. This feature is available as part of the CAG Bundle, an optional add-on to SDV Enterprise. For more information, please visit the CAG Bundle page.
Auto-detection is allowed for this constraint but it is not enabled by default. We recommend auto-detecting the ChainedInequality constraint instead, as it covers all the logic of this constraint and more!
If you'd like to auto-detect this constraint specifically, you can supply it in the auto-detection parameters. This will detect all instances of the constraint throughout the dataset.
Detection Parameters: By default, SDV detects this constraint for datetime columns only. Use the sdtypes parameter to update this. Provide a list of sdtypes to detect (datetime, numerical, or both).
FAQs
What happens to missing values?
This constraint ignores missing values. The constraint is considered valid as long as the non-missing values follow the inequality.
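One way to picture this behavior is the sketch below, which treats a row as valid whenever either value is missing. It is an illustration in plain Python, not SDV's internal logic: the names is_missing and row_is_valid are hypothetical, and missing values are assumed to be represented as None or NaN.

```python
import math


def is_missing(value):
    """Treat None and float NaN as missing (an assumption for this sketch)."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def row_is_valid(low, high, strict_boundaries=True):
    """Hypothetical check: rows with a missing value are never flagged."""
    if is_missing(low) or is_missing(high):
        return True
    return high > low if strict_boundaries else high >= low


# (low, high) pairs: only the last row has two present values that break the rule.
rows = [(1.0, 2.0), (None, 5.0), (3.0, float("nan")), (4.0, 1.0)]
validity = [row_is_valid(low, high) for low, high in rows]
```

Under these assumptions, only rows where both values are present can violate the constraint.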
What if I want to compare a column to a single, fixed value?
Many of our SDV synthesizers are already designed to learn the min/max values in every column and replicate the ranges in the synthetic data. This parameter is often called enforce_min_max_values and it applies to all numerical/datetime columns. For more information, check your synthesizer's API guide.
You can also control the enforcement on a per-column basis. Turn on/off the enforcement on individual columns by accessing and updating the transformers. For more information, see the Preprocessing guide.
Both of these options will allow you to fix the range (as observed in the real data) or expand it (by not enforcing it). If you'd like to further restrict the range, we encourage you to model the data as-is and use conditional sampling to get the range you need.