Constraints

Do you have rules that every row in the data must follow? Are these the same regardless of how much data there is? You can use constraints to describe this business logic to your synthesizer.

Adding Predefined Logic

The SDV library comes with some predefined constraint logic that is ready to use. You can define each constraint using a dictionary format.

  • (required) 'constraint_class': A string with the name of the predefined constraint class

  • (required) 'constraint_parameters': A dictionary with the parameters. Each class has different parameters.

checkin_checkout_constraint = {
    'constraint_class': 'Inequality',
    'constraint_parameters': {
        'low_column_name': 'checkin_date',
        'high_column_name': 'checkout_date'
    }
}
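
Because a constraint is an ordinary dictionary, a malformed one (for instance, a misspelled key) can be caught before it reaches the synthesizer. The sketch below uses a hypothetical helper, check_constraint_dict, which is not part of SDV. It only checks the two required keys described above and does not validate the class-specific parameters.

```python
REQUIRED_KEYS = {'constraint_class', 'constraint_parameters'}


def check_constraint_dict(constraint):
    """Hypothetical helper (not an SDV function): validate the dictionary shape."""
    missing = REQUIRED_KEYS - set(constraint)
    if missing:
        raise ValueError(f'Missing required keys: {sorted(missing)}')
    if not isinstance(constraint['constraint_class'], str):
        raise TypeError("'constraint_class' must be a string")
    if not isinstance(constraint['constraint_parameters'], dict):
        raise TypeError("'constraint_parameters' must be a dictionary")
    return True


checkin_checkout_constraint = {
    'constraint_class': 'Inequality',
    'constraint_parameters': {
        'low_column_name': 'checkin_date',
        'high_column_name': 'checkout_date'
    }
}

# Passes: both required keys are present and have the expected types
is_well_formed = check_constraint_dict(checkin_checkout_constraint)
```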

For more details about the classes and parameters, browse the Predefined Constraint Classes.

add_constraints

Use this function to add your constraints to the synthesizer.

Parameters

  • (required) constraints: A list of dictionaries that describe your constraints. These can be predefined constraints or your own custom constraints.

Output (None)

synthesizer.add_constraints(
    constraints=[checkin_checkout_constraint, my_custom_constraint]
)

get_constraints

Use this function to inspect all of the constraints your synthesizer contains.

Parameters (None)

Output A list of dictionaries that describe all the constraints applied to your synthesizer

Note that the returned list is a representation of the constraints. Changing it will not modify the constraints in any way.

synthesizer.get_constraints()
[{
    'constraint_class': 'Inequality',
    'constraint_parameters': {
        'low_column_name': 'checkin_date',
        'high_column_name': 'checkout_date'
    }
},{
    'constraint_class': ...,
    'constraint_parameters': ...,
}]
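
The note about the returned list being only a representation can be illustrated with a small stand-in. ToySynthesizer below is a hypothetical class written for this sketch, not SDV's implementation. It assumes the copy-on-return behavior the note describes and shows that editing the returned list leaves the stored constraints untouched.

```python
import copy


class ToySynthesizer:
    """Stand-in used only for illustration; this is not SDV's implementation."""

    def __init__(self):
        self._constraints = []

    def add_constraints(self, constraints):
        # Toy choice: keep private copies of the dictionaries that were passed in.
        self._constraints.extend(copy.deepcopy(constraints))

    def get_constraints(self):
        # Return a copy, so the caller receives a representation of the state.
        return copy.deepcopy(self._constraints)


toy = ToySynthesizer()
toy.add_constraints([{
    'constraint_class': 'Inequality',
    'constraint_parameters': {
        'low_column_name': 'checkin_date',
        'high_column_name': 'checkout_date'
    }
}])

returned = toy.get_constraints()
returned.clear()  # edit the returned list

# The stand-in still holds its one constraint
still_registered = len(toy.get_constraints())
```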

Training the synthesizer

After adding constraints, you'll need to (re)train the synthesizer with real data.

synthesizer.fit(data)

The synthesizer will now learn from your real data with the added constraints. All synthetic data it produces will satisfy the constraints.

synthetic_data = synthesizer.sample(num_rows=10)
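
To confirm that sampled rows respect a rule such as the check-in/check-out inequality, the rows can be checked directly. The helper find_inequality_violations and the example rows below are hypothetical and made up for this sketch. It assumes that equal values are allowed (a non-strict comparison) and uses plain dictionaries instead of real sampled data.

```python
from datetime import date


def find_inequality_violations(rows, low_column_name, high_column_name):
    """Hypothetical helper: return the rows where the low value exceeds the high value."""
    return [
        row for row in rows
        if row[low_column_name] > row[high_column_name]
    ]


# Made-up rows standing in for sampled data; the last one breaks the rule on purpose
example_rows = [
    {'checkin_date': date(2023, 1, 5), 'checkout_date': date(2023, 1, 8)},
    {'checkin_date': date(2023, 2, 1), 'checkout_date': date(2023, 2, 1)},
    {'checkin_date': date(2023, 3, 9), 'checkout_date': date(2023, 3, 4)},
]

violations = find_inequality_violations(
    example_rows, 'checkin_date', 'checkout_date'
)
```

If synthetic_data is a pandas DataFrame, synthetic_data.to_dict('records') yields a list of row dictionaries in this shape, provided the two columns hold comparable values.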

Adding Custom Logic

In some cases, you may need to add some custom constraints. We recommend you define the logic in a separate Python file. See the Custom Logic Reference for more details.

load_custom_constraint_classes

Use this function to load your custom logic from a separate Python file.

Parameters

  • (required) filepath: A string describing the filepath of your Python file. If your constraints are defined in the current file, you can specify None.

  • (required) class_names: A list of strings, describing the class names to load

Output (None) After using this function, you can apply your custom constraints to your synthesizer.

synthesizer.load_custom_constraint_classes(
    filepath='my_business_logic/my_custom_constraints.py',
    class_names=['CustomClassA', 'CustomClassB']
)
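
To make the roles of filepath and class_names concrete, here is one common way to load named classes from a Python file using the standard library. This is a sketch under assumptions: load_classes_from_file is a hypothetical function, the stand-in file is generated on the fly so the example runs on its own, and SDV may load classes differently.

```python
import importlib.util
import os
import tempfile


def load_classes_from_file(filepath, class_names):
    """Hypothetical loader: import a Python file and return the named classes."""
    spec = importlib.util.spec_from_file_location(
        'custom_constraints_module', filepath
    )
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    # Raises AttributeError if a requested class is not defined in the file
    return {name: getattr(module, name) for name in class_names}


# Write a tiny stand-in file with two empty placeholder classes
source = 'class CustomClassA:\n    pass\n\nclass CustomClassB:\n    pass\n'
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, 'my_custom_constraints.py')
    with open(path, 'w') as handle:
        handle.write(source)
    loaded = load_classes_from_file(path, ['CustomClassA', 'CustomClassB'])

loaded_names = sorted(loaded)
```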

Use a Custom Constraint

Once you have loaded your custom logic, you can define a constraint using your custom class name and parameters. You can then apply it to your synthesizer just like any other constraint.

my_custom_constraint = {
    'constraint_class': 'CustomClassA',
    'constraint_parameters': {
        'column_names': ['age', 'weight'],
        'my_custom_parameter': 10
    }
}

synthesizer.add_constraints(
    constraints=[my_custom_constraint]
)

synthesizer.fit(data)
synthetic_data = synthesizer.sample(num_rows=10)

Can I write my custom logic in the same file as the synthesizer?

We recommend writing your custom logic in a different file.

But if you need to write it in the same file, then use the add_custom_constraint_class method instead of loading it from a file.

# register the class object directly to the synthesizer
synthesizer.add_custom_constraint_class(
    class_object=CustomClassA, # your constraint class
    class_name='CustomClassA' # the name you use to refer to it
)

# define your constraint and apply it
my_custom_constraint = {
    'constraint_class': 'CustomClassA', # use the class_name
    'constraint_parameters': {
        'column_names': ['age', 'weight'],
        'my_custom_parameter': 10
    }
}

synthesizer.add_constraints(
    constraints=[my_custom_constraint]
)

synthesizer.fit(data)
synthetic_data = synthesizer.sample(num_rows=10)
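
The link between class_name and the 'constraint_class' value in the dictionary can be pictured as a name lookup. ToyRegistry below is a hypothetical stand-in written for this sketch, not SDV's implementation, and CustomClassA is an empty placeholder. The point it illustrates is that the string in the dictionary must match the name used at registration.

```python
class CustomClassA:
    """Empty placeholder for a custom constraint class."""


class ToyRegistry:
    """Stand-in for illustration only; not SDV's implementation."""

    def __init__(self):
        self._classes = {}

    def add_custom_constraint_class(self, class_object, class_name):
        # Remember the class under the name that constraint dictionaries will use.
        self._classes[class_name] = class_object

    def resolve(self, constraint):
        # Look up the class named by the dictionary's 'constraint_class' value.
        name = constraint['constraint_class']
        if name not in self._classes:
            raise KeyError(f'No constraint class registered as {name!r}')
        return self._classes[name]


registry = ToyRegistry()
registry.add_custom_constraint_class(
    class_object=CustomClassA,
    class_name='CustomClassA'
)

resolved = registry.resolve({
    'constraint_class': 'CustomClassA',  # must match the registered class_name
    'constraint_parameters': {
        'column_names': ['age', 'weight'],
        'my_custom_parameter': 10
    }
})
```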

Copyright (c) 2023, DataCebo, Inc.