Synthetic Data Vault
GitHubSlackDataCebo
  • Welcome to the SDV!
  • Tutorials
  • Explore SDV
    • SDV Community
    • SDV Enterprise
      • ⭐Compare Features
    • SDV Bundles
      • ❖ AI Connectors
      • ❖ CAG
      • ❖ Differential Privacy
      • ❖ XSynthesizers
  • Single Table Data
    • Data Preparation
      • Loading Data
      • Creating Metadata
    • Modeling
      • Synthesizers
        • GaussianCopulaSynthesizer
        • CTGANSynthesizer
        • TVAESynthesizer
        • ❖ XGCSynthesizer
        • ❖ BootstrapSynthesizer
        • ❖ SegmentSynthesizer
        • * DayZSynthesizer
        • ❖ DPGCSynthesizer
        • ❖ DPGCFlexSynthesizer
        • CopulaGANSynthesizer
      • Customizations
        • Constraints
        • Preprocessing
    • Sampling
      • Sample Realistic Data
      • Conditional Sampling
    • Evaluation
      • Diagnostic
      • Data Quality
      • Visualization
      • Privacy
        • Empirical Differential Privacy
        • SDMetrics: Privacy Metrics
  • Multi Table Data
    • Data Preparation
      • Loading Data
        • Demo Data
        • CSV
        • Excel
        • ❖ AlloyDB
        • ❖ BigQuery
        • ❖ MSSQL
        • ❖ Oracle
        • ❖ Spanner
      • Cleaning Your Data
      • Creating Metadata
    • Modeling
      • Synthesizers
        • * DayZSynthesizer
        • * IndependentSynthesizer
        • HMASynthesizer
        • * HSASynthesizer
      • Customizations
        • Constraints
        • Preprocessing
      • * Performance Estimates
    • Sampling
    • Evaluation
      • Diagnostic
      • Data Quality
      • Visualization
  • Sequential Data
    • Data Preparation
      • Loading Data
      • Cleaning Your Data
      • Creating Metadata
    • Modeling
      • PARSynthesizer
      • Customizations
    • Sampling
      • Sample Realistic Data
      • Conditional Sampling
    • Evaluation
  • Concepts
    • Metadata
      • Sdtypes
      • Metadata API
      • Metadata JSON
    • Constraint-Augmented Generation (CAG)
      • Predefined Constraints
        • FixedCombinations
        • FixedIncrements
        • Inequality
        • OneHotEncoding
        • Range
        • ❖ CarryOverColumns
        • * ChainedInequality
        • ❖ CompositeKey
        • ❖ FixedNullCombinations
        • ❖ ForeignToForeignKey
        • ❖ ForeignToPrimaryKeySubset
        • ❖ MixedScales
        • ❖ PrimaryToPrimaryKey
        • ❖ PrimaryToPrimaryKeySubset
        • ❖ ReferenceTable
        • ❖ SelfReferentialHierarchy
        • ❖ UniqueBridgeTable
      • Program Your Own Constraint
      • Constraints API
  • Support
    • Troubleshooting
      • Help with Installation
      • Help with SDV
    • Versioning & Backwards Compatibility Policy
Powered by GitBook

Copyright (c) 2023, DataCebo, Inc.

On this page
  • Constraint API
  • Usage
  • FAQs
  1. Concepts
  2. Constraint-Augmented Generation (CAG)
  3. Predefined Constraints

FixedIncrements

The FixedIncrements constraint enforces that all the values in a column are increments of a particular, fixed value. That is, all the data must be divisible by the value.

Constraint API

Parameters

  • (required) column_name: The name of the column that must follow the constraint. This must be a numerical column.

  • (required) increment: The size of the increment. This must be a positive integer

  • table_name: A string with the name of the table to apply this to. Required if you have a multi-table dataset.

from sdv.cag import FixedIncrements

my_constraint = FixedIncrements(
    column_name='salary',
    increment=1000
)

Usage

Apply the constraint to any SDV synthesizer. Then fit and sample as usual.

synthesizer = GaussianCopulaSynthesizer(metadata)
synthesizer.add_constraints([my_constraint])

synthesizer.fit(data)
synthetic_data = synthesizer.sample()

FAQs

What happens to missing values?

This constraint ignores missing values in the dataset. The constraint is valid as long as the numerical values (non-missing values) are increments of the provided integer value.

What if I want to specify a smaller increment 1 for rounding decimal digits?

The increments in this constraint must be integers greater than 1.

The SDV already learns the number of digits in your real data and matches them to your synthetic data. For example, if you have a column that represents US dollar amounts, then the real data will only contain values with 2 decimal digits. The SDV will learn this and ensure the synthetic data also has 2 decimal digits. No constraints are needed in this case.

PreviousFixedCombinationsNextInequality

Last updated 1 day ago

For more information about using predefined constraints, please see the .

Constraint-Augmented Generation tutorial