FixedCombinations
Compatibility: 2 or more categorical columns
The FixedCombinations constraint enforces that the combinations between a set of columns are fixed. That is, no permutations or shuffling are allowed other than what is already observed in the data.
(required) column_names: A list of two or more columns whose combinations are fixed. The SDV will not further shuffle the data between these columns.
Define your constraint using the parameters and then add it to a synthesizer.
my_constraint = {
    'constraint_class': 'FixedCombinations',
    'table_name': 'locations',  # for multi table synthesizers
    'constraint_parameters': {
        'column_names': ['city', 'country']
    }
}

my_synthesizer.add_constraints(constraints=[
    my_constraint
])
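The comment on 'table_name' above suggests that the key is only needed for multi table synthesizers. Under that assumption, a single table version of the same constraint might look like the sketch below; the variable name single_table_constraint is illustrative only.

```python
# Sketch of the same constraint for a single table synthesizer.
# Assumption: 'table_name' is left out because there is only one table.
single_table_constraint = {
    'constraint_class': 'FixedCombinations',
    'constraint_parameters': {
        'column_names': ['city', 'country']
    }
}
```

It would then be passed to add_constraints in the same way as my_constraint above.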
This constraint requires at least two columns because it ensures that the synthetic data only contains combinations that exist in the real data. If there is only one column, there are no combinations to fix.
For a single column, the SDV already guarantees that the synthetic data contains the same categorical values as the real data.
Repeated combinations are allowed. This constraint prevents the SDV from creating additional permutations between columns, but the same combination may appear multiple times.
For example, it will prevent the SDV from inventing new city, country pairs, but a valid pair such as Boston, USA may appear more than once.
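The rule described above can be checked with a few lines of plain Python. This is a minimal sketch for illustration, not part of the SDV: the helper unseen_combinations and the sample rows are hypothetical, and only the Boston, USA pair comes from the text above.

```python
def unseen_combinations(real_rows, synthetic_rows, column_names):
    """Return synthetic rows whose value combination never occurs in the real rows."""
    # Collect every combination observed in the real data as a tuple.
    observed = {tuple(row[c] for c in column_names) for row in real_rows}
    # Keep only the synthetic rows whose combination was not observed.
    return [
        row for row in synthetic_rows
        if tuple(row[c] for c in column_names) not in observed
    ]


# Illustrative data only.
real_rows = [
    {'city': 'Boston', 'country': 'USA'},
    {'city': 'Paris', 'country': 'France'},
]
synthetic_rows = [
    {'city': 'Boston', 'country': 'USA'},
    {'city': 'Boston', 'country': 'USA'},  # repeat of an observed pair: allowed
    {'city': 'Paris', 'country': 'USA'},   # new pair: what the constraint prevents
]

violations = unseen_combinations(real_rows, synthetic_rows, ['city', 'country'])
print(violations)  # [{'city': 'Paris', 'country': 'USA'}]
```

With the constraint applied, synthetic data would be expected to produce an empty list from a check like this, while repeated valid pairs pass without being flagged.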