FixedIncrements

The FixedIncrements constraint enforces that every value in a column is a multiple of a particular, fixed increment. That is, each value must be evenly divisible by the increment.
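To make the rule concrete, here is a minimal sketch in plain Python of the divisibility check the constraint describes. It is only an illustration, not the SDV's implementation; the helper name `follows_fixed_increments` and the sample salaries are hypothetical.

```python
def follows_fixed_increments(values, increment):
    """Return True if every value is a whole multiple of `increment`.

    Hypothetical helper for illustration only; not part of the SDV API.
    """
    if increment <= 0 or int(increment) != increment:
        raise ValueError("increment must be a positive integer")
    # A value follows the rule when dividing by the increment leaves no remainder
    return all(value % increment == 0 for value in values)


salaries = [52000, 61000, 75000]
print(follows_fixed_increments(salaries, 1000))            # True
print(follows_fixed_increments(salaries + [48500], 1000))  # False
```

A check like this can be handy for confirming that your real data already follows the rule before you add the constraint.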

Constraint API

Parameters

  • (required) column_name: The name of the column that must follow the constraint. This must be a numerical column.

  • (required) increment: The size of the increment. This must be a positive integer.

  • table_name: A string with the name of the table to apply this to. Required if you have a multi-table dataset.

from sdv.cag import FixedIncrements

my_constraint = FixedIncrements(
    column_name='salary',
    increment=1000
)

Usage

Apply the constraint to any SDV synthesizer. Then fit and sample as usual.

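from sdv.single_table import GaussianCopulaSynthesizer

# Create the synthesizer and attach the constraint before fitting
synthesizer = GaussianCopulaSynthesizer(metadata)
synthesizer.add_constraints([my_constraint])

# Fit on the real data, then sample synthetic rows
synthesizer.fit(data)
synthetic_data = synthesizer.sample(num_rows=100)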

For more information about using predefined constraints, please see the Constraint-Augmented Generation tutorial.

FAQs

What happens to missing values?

This constraint ignores missing values in the dataset. The constraint is valid as long as the numerical values (non-missing values) are increments of the provided integer value.
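The behavior described above can be sketched in plain Python as follows. This is an illustration under assumptions, not the SDV's internal logic: the helper names `is_missing` and `valid_ignoring_missing` are hypothetical, and missing values are assumed to appear as `None` or `NaN`.

```python
import math


def is_missing(value):
    """Treat None and float NaN as missing (an assumption for this sketch)."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def valid_ignoring_missing(values, increment):
    """Check divisibility on the non-missing values only."""
    present = [value for value in values if not is_missing(value)]
    return all(value % increment == 0 for value in present)


salaries = [52000, None, float("nan"), 75000]
print(valid_ignoring_missing(salaries, 1000))  # True: missing entries are skipped
```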

What if I want to specify an increment smaller than 1 to round decimal digits?

The increments in this constraint must be integers greater than 1.

The SDV already learns the number of digits in your real data and matches them to your synthetic data. For example, if you have a column that represents US dollar amounts, then the real data will only contain values with 2 decimal digits. The SDV will learn this and ensure the synthetic data also has 2 decimal digits. No constraints are needed in this case.
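As a rough illustration of the idea of learning decimal digits from data, the sketch below infers the largest number of decimal places seen in some real values and rounds new values to match. It is not how the SDV implements this internally; the helper `count_decimal_digits` and the sample numbers are hypothetical.

```python
from decimal import Decimal


def count_decimal_digits(value):
    """Count the digits after the decimal point, ignoring trailing zeros."""
    exponent = Decimal(str(value)).normalize().as_tuple().exponent
    return max(0, -exponent)


# Hypothetical US dollar amounts from the real data
real_prices = [19.99, 5.5, 100.0]
digits = max(count_decimal_digits(price) for price in real_prices)
print(digits)  # 2

# Round hypothetical raw model output to the learned precision
raw_values = [12.34567, 88.1]
print([round(value, digits) for value in raw_values])  # [12.35, 88.1]
```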
