❖ MixedScales
The MixedScales constraint enforces that the value of a categorical column (or a combination of categorical columns) determines the scale of a numerical column.
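To make the idea concrete, the following is a minimal sketch in plain Python. It assumes a made-up table with the columns test_type, units, and test_results; the rows and the helper name segment_ranges are hypothetical and are not part of SDV. It only illustrates what "a different scale per segment" means and is not how SDV implements the constraint.

```python
from collections import defaultdict

# Hypothetical rows: one numerical column holds values on very different scales.
rows = [
    {"test_type": "height", "units": "inches", "test_results": 64.0},
    {"test_type": "height", "units": "inches", "test_results": 70.5},
    {"test_type": "height", "units": "cm", "test_results": 162.0},
    {"test_type": "height", "units": "cm", "test_results": 180.0},
    {"test_type": "weight", "units": "kg", "test_results": 58.0},
    {"test_type": "weight", "units": "kg", "test_results": 91.0},
]


def segment_ranges(rows, segment_column_names, mixed_scale_column_name):
    """Return {segment: (min, max)} of the numerical column (illustrative helper)."""
    values = defaultdict(list)
    for row in rows:
        # A segment is the combination of values in the categorical columns.
        segment = tuple(row[name] for name in segment_column_names)
        values[segment].append(row[mixed_scale_column_name])
    return {segment: (min(v), max(v)) for segment, v in values.items()}


ranges = segment_ranges(rows, ["test_type", "units"], "test_results")
for segment, (low, high) in ranges.items():
    print(segment, low, high)
```

Looking at test_results as a whole would suggest a single range of 58.0 to 180.0, which describes none of the three segments well. Grouping by test_type and units recovers a separate range for each one.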

For example, test_type and units together determine the scale of the numerical column test_results. If the test type is 'height' and the units are 'inches', the constraint forces the synthesizer to learn the scale specifically for this segment.
Constraint API
Create a MixedScales constraint.
Parameters
(required) segment_column_names: A list of one or more categorical columns that ultimately segment the data into different groups of rows. Each group will have a different scale.
(required) mixed_scale_column_name: A numerical column whose scale depends on the segment.
table_name: A string with the name of the table to apply this to. Required if you have a multi-table dataset.
from sdv.cag import MixedScales
my_constraint = MixedScales(
    segment_column_names=['test_type', 'units'],
    mixed_scale_column_name='test_results'
)
Usage
Apply the constraint to any SDV synthesizer. Then fit and sample as usual.
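from sdv.single_table import GaussianCopulaSynthesizer

# Assumes `metadata` and `data` have already been loaded for a single-table dataset
synthesizer = GaussianCopulaSynthesizer(metadata)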
synthesizer.add_constraints([my_constraint])
synthesizer.fit(data)
synthetic_data = synthesizer.sample()
For more information about using predefined constraints, please see the Constraint-Augmented Generation tutorial.
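After sampling, one rough sanity check is to compare each synthetic value against the range seen in the real data for the same segment. The sketch below is plain Python and makes several assumptions: the function name find_out_of_range_rows is hypothetical, the rows are hand-written stand-ins rather than actual SDV output, and missing values are not handled. A flagged row is a prompt for inspection, not necessarily an error.

```python
def find_out_of_range_rows(real_rows, synthetic_rows,
                           segment_column_names, mixed_scale_column_name):
    """Return synthetic rows whose value lies outside the real min/max of their
    segment, or whose segment never appears in the real data (illustrative helper)."""
    real_ranges = {}
    for row in real_rows:
        segment = tuple(row[name] for name in segment_column_names)
        value = row[mixed_scale_column_name]
        low, high = real_ranges.get(segment, (value, value))
        real_ranges[segment] = (min(low, value), max(high, value))

    flagged = []
    for row in synthetic_rows:
        segment = tuple(row[name] for name in segment_column_names)
        value = row[mixed_scale_column_name]
        if segment not in real_ranges:
            flagged.append(row)  # segment combination unseen in the real data
            continue
        low, high = real_ranges[segment]
        if not (low <= value <= high):
            flagged.append(row)  # value is off-scale for this segment
    return flagged


# Hand-written stand-ins for real and synthetic data (not actual SDV output).
real_rows = [
    {"test_type": "height", "units": "inches", "test_results": 64.0},
    {"test_type": "height", "units": "inches", "test_results": 70.5},
    {"test_type": "height", "units": "cm", "test_results": 162.0},
    {"test_type": "height", "units": "cm", "test_results": 180.0},
]
synthetic_rows = [
    {"test_type": "height", "units": "inches", "test_results": 66.2},
    {"test_type": "height", "units": "inches", "test_results": 175.0},
    {"test_type": "height", "units": "cm", "test_results": 171.3},
]

flagged = find_out_of_range_rows(
    real_rows, synthetic_rows, ["test_type", "units"], "test_results"
)
print(flagged)
```

With pandas DataFrames, the row dictionaries can be obtained with data.to_dict('records') and synthetic_data.to_dict('records').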