❖ MixedScales
The MixedScales constraint enforces that the value of a categorical column (or a combination of categorical columns) determines the scale of a numerical column.
Create a MixedScales constraint.
Parameters
(required) segment_column_names: A list of one or more categorical columns that ultimately segment the data into different groups of rows. Each group will have a different scale.
(required) mixed_scale_column_name: A numerical column whose scale depends on the segment.
table_name: A string with the name of the table to apply this to. Required if you have a multi-table dataset.
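To make the idea of segments concrete, here is a minimal sketch in plain Python (it does not use SDV). It assumes a handful of hypothetical rows and shows how a list of segment columns splits them into groups, one group per distinct combination of values. The helper names segment_key and count_segments are illustrative only and are not part of the SDV API.

```python
from collections import Counter

# Hypothetical rows; the column names follow the example on this page.
rows = [
    {"test_type": "height", "units": "inches", "test_results": 68.0},
    {"test_type": "height", "units": "cm", "test_results": 172.5},
    {"test_type": "height", "units": "inches", "test_results": 64.5},
    {"test_type": "weight", "units": "kg", "test_results": 70.2},
    {"test_type": "weight", "units": "lbs", "test_results": 154.0},
]


def segment_key(row, segment_column_names):
    """Return the tuple of categorical values that identifies a row's segment."""
    return tuple(row[name] for name in segment_column_names)


def count_segments(rows, segment_column_names):
    """Count how many rows fall into each distinct segment."""
    return Counter(segment_key(row, segment_column_names) for row in rows)


# Two segment columns: each distinct (test_type, units) pair is its own segment.
segments = count_segments(rows, ["test_type", "units"])
for key, n_rows in sorted(segments.items()):
    print(key, n_rows)
```

Passing a single column, such as ["test_type"], produces coarser segments. Passing several columns produces one segment per combination that appears in the data.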
from sdv.cag import MixedScales
my_constraint = MixedScales(
    segment_column_names=['test_type', 'units'],
    mixed_scale_column_name='test_results'
)
Apply the constraint to any SDV synthesizer. Then fit and sample as usual.
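from sdv.single_table import GaussianCopulaSynthesizer
synthesizer = GaussianCopulaSynthesizer(metadata)
synthesizer.add_constraints([my_constraint])
synthesizer.fit(data)
# Single-table synthesizers need a row count; 100 is an arbitrary choice
synthetic_data = synthesizer.sample(num_rows=100)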
In this example, the combination of test_type and units determines the scale of the numerical column test_results. For instance, if test_type is 'height' and units is 'inches', the synthesizer learns the scale of test_results specifically for that segment.
For more information about using predefined constraints, please see the Constraint-Augmented Generation tutorial.
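The following plain-Python sketch illustrates why a per-segment scale matters. It uses each segment's observed (min, max) range as a rough stand-in for "scale". That is an assumption made for illustration, not a description of how SDV handles the constraint internally. The data and the helpers segment_ranges and within_segment_range are hypothetical.

```python
def segment_ranges(rows, segment_column_names, mixed_scale_column_name):
    """Map each segment to the (min, max) of the numerical column within it."""
    ranges = {}
    for row in rows:
        key = tuple(row[name] for name in segment_column_names)
        value = row[mixed_scale_column_name]
        low, high = ranges.get(key, (value, value))
        ranges[key] = (min(low, value), max(high, value))
    return ranges


def within_segment_range(row, ranges, segment_column_names, mixed_scale_column_name):
    """Check whether a row's value lies inside the range seen for its segment."""
    key = tuple(row[name] for name in segment_column_names)
    if key not in ranges:
        return False
    low, high = ranges[key]
    return low <= row[mixed_scale_column_name] <= high


# Hypothetical real data: heights recorded in two different units.
real_rows = [
    {"test_type": "height", "units": "inches", "test_results": 64.5},
    {"test_type": "height", "units": "inches", "test_results": 70.0},
    {"test_type": "height", "units": "cm", "test_results": 160.0},
    {"test_type": "height", "units": "cm", "test_results": 185.0},
]
segment_columns = ["test_type", "units"]
ranges = segment_ranges(real_rows, segment_columns, "test_results")

# 170.0 is a plausible height in cm, but not in inches.
candidate = {"test_type": "height", "units": "inches", "test_results": 170.0}
print(within_segment_range(candidate, ranges, segment_columns, "test_results"))
```

Viewed without segments, the column spans 64.5 to 185.0, so a value of 170.0 looks unremarkable. Viewed per segment, it clearly falls outside what was observed for heights in inches.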