Synthetic Data Vault

Copyright (c) 2023, DataCebo, Inc.

Constraints

Last updated 8 months ago

Do you have rules that every row in the data must follow? Are these the same regardless of how much data there is? You can use constraints to describe this business logic to your synthesizer.

Adding Predefined Logic

The SDV library comes with some predefined constraint logic that is ready to use. You can define each constraint using a dictionary format.

  • (required) 'constraint_class': A string with the name of the predefined constraint class

  • (required) 'table_name': A string with the name of the table for the constraint

  • (required) 'constraint_parameters': A dictionary with the parameters. Each class has different parameters.

checkin_checkout_constraint = {
    'constraint_class': 'Inequality',
    'table_name': 'guests',
    'constraint_parameters': {
        'low_column_name': 'checkin_date',
        'high_column_name': 'checkout_date'
    }
}

For more details about the classes and parameters, browse the Predefined Constraint Classes.
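
The three required keys can be checked before a dictionary is passed to the synthesizer. The sketch below is illustrative only: REQUIRED_KEYS and missing_constraint_keys are hypothetical helpers written for this example and are not part of the SDV API.

```python
# Hypothetical helper for illustration; SDV performs its own validation.
REQUIRED_KEYS = ('constraint_class', 'table_name', 'constraint_parameters')


def missing_constraint_keys(constraint):
    """Return the required keys that are absent from a constraint dictionary."""
    return [key for key in REQUIRED_KEYS if key not in constraint]


checkin_checkout_constraint = {
    'constraint_class': 'Inequality',
    'table_name': 'guests',
    'constraint_parameters': {
        'low_column_name': 'checkin_date',
        'high_column_name': 'checkout_date'
    }
}

# The same constraint with 'table_name' left out by mistake
incomplete_constraint = {
    'constraint_class': 'Inequality',
    'constraint_parameters': {
        'low_column_name': 'checkin_date',
        'high_column_name': 'checkout_date'
    }
}

print(missing_constraint_keys(checkin_checkout_constraint))
print(missing_constraint_keys(incomplete_constraint))
```

A check like this only covers the shape of the dictionary. Whether the parameters are valid for a given class is still decided by the synthesizer.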

add_constraints

Use this function to apply your constraints to the synthesizer.

Parameters

  • (required) constraints: A list of dictionaries, each describing one constraint. These can be predefined constraints or your own custom constraints.

Output (None)

synthesizer.add_constraints(
    constraints=[checkin_checkout_constraint, my_custom_constraint]
)

get_constraints

Use this function to inspect all of the constraints your synthesizer contains.

Parameters None

Output A list of dictionaries that describe all the constraints applied to your synthesizer

Note that the returned list is a representation of the constraints. Changing it will not modify the constraints in any way.

synthesizer.get_constraints()
[{
    'constraint_class': 'Inequality',
    'table_name': 'guests',
    'constraint_parameters': {
        'low_column_name': 'checkin_date',
        'high_column_name': 'checkout_date'
    }
},{
    'constraint_class': ...,
    'table_name': ...,
    'constraint_parameters': ...
}]
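
The note about the returned list being only a representation can be pictured with a small stand-in. This is a sketch under assumptions: ToySynthesizer is a hypothetical class that mimics the documented behavior by returning deep copies. It is not SDV code and may not reflect how SDV stores constraints internally.

```python
import copy


class ToySynthesizer:
    """Hypothetical stand-in that mimics the documented behavior; not SDV code."""

    def __init__(self):
        self._constraints = []

    def add_constraints(self, constraints):
        # Store copies so later changes by the caller do not leak in
        self._constraints.extend(copy.deepcopy(constraints))

    def get_constraints(self):
        # Return a copy, so the caller only ever sees a representation
        return copy.deepcopy(self._constraints)


toy = ToySynthesizer()
toy.add_constraints([{
    'constraint_class': 'Inequality',
    'table_name': 'guests',
    'constraint_parameters': {
        'low_column_name': 'checkin_date',
        'high_column_name': 'checkout_date'
    }
}])

returned = toy.get_constraints()
returned.clear()  # editing the returned list ...

# ... leaves the stored constraints untouched
print(len(toy.get_constraints()))
```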

Training the synthesizer

After adding constraints, you'll need to (re)train the synthesizer with real data.

synthesizer.fit(data)

The synthesizer will now learn from your real data with the added constraints. All synthetic data it produces will satisfy those constraints.

synthetic_data = synthesizer.sample()
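
To make "valid for the constraint" concrete, the rule behind the Inequality example can be checked by hand. The sketch below uses made-up rows rather than real synthesizer output, and rows_violating_inequality is a hypothetical helper. It assumes a non-strict comparison, so equal dates count as valid.

```python
from datetime import date


def rows_violating_inequality(rows, low_column_name, high_column_name):
    """Return the positions of rows where the low value exceeds the high value."""
    return [
        index for index, row in enumerate(rows)
        if row[low_column_name] > row[high_column_name]
    ]


# Made-up rows for illustration; these are not SDV output
valid_rows = [
    {'checkin_date': date(2024, 1, 3), 'checkout_date': date(2024, 1, 5)},
    {'checkin_date': date(2024, 2, 10), 'checkout_date': date(2024, 2, 10)},
]

# The last row checks out before it checks in
mixed_rows = valid_rows + [
    {'checkin_date': date(2024, 3, 9), 'checkout_date': date(2024, 3, 1)},
]

print(rows_violating_inequality(valid_rows, 'checkin_date', 'checkout_date'))
print(rows_violating_inequality(mixed_rows, 'checkin_date', 'checkout_date'))
```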

Adding Custom Logic

load_custom_constraint_classes

Use this function to load your custom logic from a separate Python file.

Parameters

  • (required) filepath: A string describing the filepath of your Python file. If your constraints are defined in the current file, you can specify None.

  • (required) class_names: A list of strings, describing the class names to load

Output (None) After using this function, you can apply your custom constraints to your synthesizer.

synthesizer.load_custom_constraint_classes(
    filepath='my_business_logic/my_custom_constraints.py',
    class_names=['CustomClassA', 'CustomClassB']
)
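
Loading classes by name from a file path is a general Python technique. The sketch below shows one way to do it with the standard library. It is not SDV's implementation: load_classes_from_file is a hypothetical helper, and the temporary file stands in for your own constraints file.

```python
import importlib.util
import tempfile
from pathlib import Path


def load_classes_from_file(filepath, class_names):
    """Import a Python file by its path and return the requested classes by name."""
    path = Path(filepath)
    spec = importlib.util.spec_from_file_location(path.stem, path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return {name: getattr(module, name) for name in class_names}


# Write a throwaway file that stands in for a real constraints file
with tempfile.TemporaryDirectory() as folder:
    source_file = Path(folder) / 'my_custom_constraints.py'
    source_file.write_text(
        'class CustomClassA:\n    pass\n\n'
        'class CustomClassB:\n    pass\n'
    )
    loaded = load_classes_from_file(
        source_file, ['CustomClassA', 'CustomClassB']
    )

print(sorted(loaded))
```

In this sketch a misspelled class name raises an AttributeError at load time, which is one reason to list the class names explicitly.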

Use a Custom Constraint

Once you have loaded your custom logic, you can define a constraint using your custom class name and parameters. Then you can apply it to your synthesizer just like any other constraint.

my_custom_constraint = {
    'constraint_class': 'CustomClassA',
    'table_name': 'guests',
    'constraint_parameters': {
        'column_names': ['has_rewards', 'amenities_fee'],
        'my_custom_parameter': 10
    }
}

synthesizer.add_constraints(
    constraints=[my_custom_constraint]
)

synthesizer.fit(data)
synthetic_data = synthesizer.sample(num_rows=10)

Can I write my custom logic in the same file as the synthesizer?

We recommend writing your custom logic in a different file.

But if you need to write it in the same file, then use the add_custom_constraint_class method instead of loading it from a file.

# register the class object directly to the synthesizer
synthesizer.add_custom_constraint_class(
    class_object=CustomClassA, # your constraint class
    class_name='CustomClassA' # the name you use to refer to it
)

# define your constraint and apply it
my_custom_constraint = {
    'constraint_class': 'CustomClassA', # use the class_name
    'table_name': 'guests',
    'constraint_parameters': {
        'column_names': ['has_rewards', 'amenities_fee'],
        'my_custom_parameter': 10
    }
}

synthesizer.add_constraints(
    constraints=[my_custom_constraint]
)

synthesizer.fit(data)
synthetic_data = synthesizer.sample()

In some cases, you may need to add custom constraints. We recommend defining the logic in a separate Python file. See the Custom Logic Reference for more details.

Predefined Constraint Classes
Custom Logic Reference