Synthetic Data Vault
GitHubSlackDataCebo
  • Welcome to the SDV!
  • Tutorials
  • Explore SDV
    • SDV Community
    • SDV Enterprise
      • ⭐Compare Features
    • SDV Bundles
      • ❖ AI Connectors
      • ❖ CAG
      • ❖ Differential Privacy
      • ❖ XSynthesizers
  • Single Table Data
    • Data Preparation
      • Loading Data
      • Creating Metadata
    • Modeling
      • Synthesizers
        • GaussianCopulaSynthesizer
        • CTGANSynthesizer
        • TVAESynthesizer
        • ❖ XGCSynthesizer
        • ❖ BootstrapSynthesizer
        • ❖ SegmentSynthesizer
        • * DayZSynthesizer
        • ❖ DPGCSynthesizer
        • ❖ DPGCFlexSynthesizer
        • CopulaGANSynthesizer
      • Customizations
        • Constraints
        • Preprocessing
    • Sampling
      • Sample Realistic Data
      • Conditional Sampling
    • Evaluation
      • Diagnostic
      • Data Quality
      • Visualization
      • Privacy
        • Empirical Differential Privacy
        • SDMetrics: Privacy Metrics
  • Multi Table Data
    • Data Preparation
      • Loading Data
        • Demo Data
        • CSV
        • Excel
        • ❖ AlloyDB
        • ❖ BigQuery
        • ❖ MSSQL
        • ❖ Oracle
        • ❖ Spanner
      • Cleaning Your Data
      • Creating Metadata
    • Modeling
      • Synthesizers
        • * DayZSynthesizer
        • * IndependentSynthesizer
        • HMASynthesizer
        • * HSASynthesizer
      • Customizations
        • Constraints
        • Preprocessing
      • * Performance Estimates
    • Sampling
    • Evaluation
      • Diagnostic
      • Data Quality
      • Visualization
  • Sequential Data
    • Data Preparation
      • Loading Data
      • Cleaning Your Data
      • Creating Metadata
    • Modeling
      • PARSynthesizer
      • Customizations
    • Sampling
      • Sample Realistic Data
      • Conditional Sampling
    • Evaluation
  • Concepts
    • Metadata
      • Sdtypes
      • Metadata API
      • Metadata JSON
    • Constraint-Augmented Generation (CAG)
      • Predefined Constraints
        • FixedCombinations
        • FixedIncrements
        • Inequality
        • OneHotEncoding
        • Range
        • ❖ CarryOverColumns
        • * ChainedInequality
        • ❖ CompositeKey
        • ❖ FixedNullCombinations
        • ❖ ForeignToForeignKey
        • ❖ ForeignToPrimaryKeySubset
        • ❖ MixedScales
        • ❖ PrimaryToPrimaryKey
        • ❖ PrimaryToPrimaryKeySubset
        • ❖ ReferenceTable
        • ❖ SelfReferentialHierarchy
        • ❖ UniqueBridgeTable
      • Program Your Own Constraint
      • Constraints API
  • Support
    • Troubleshooting
      • Help with Installation
      • Help with SDV
    • Versioning & Backwards Compatibility Policy
Powered by GitBook

Copyright (c) 2023, DataCebo, Inc.

On this page
  • Constraint API
  • Usage
  1. Concepts
  2. Constraint-Augmented Generation (CAG)
  3. Predefined Constraints

❖ CarryOverColumns

PreviousRangeNext* ChainedInequality

Last updated 1 day ago

Use the CarryOverColumns constraint when you have the same columns in different tables, and the values of those columns have to match up according to a key (ID) column.

Constraint API

This functionality is in Beta. At this time, select SDV Enterprise users have been invited to use this feature.

Create a CarryOverColumns constraint

Parameters:

  • (required) common_column_info: A list of dictionaries that describe all the columns within your dataset that are carried over. Each dictionary should have the following keys:

    • table_name: The name of the table that contains the column

    • carryover_column_name: The name of the column that is repeated across your dataset

    • key_column_name: The name of the column to use as the key for the carried over column. Must be a PII or ID sdtype.

Please supply 2 or more columns for this constraint.

from sdv.cag import CarryOverColumns

my_constraint = CarryOverColumns(
    common_column_info=[{
        'table_name': 'Account',
        'carryover_column_name': 'Type',
        'key_column_name': 'ID'
    }, {
        'table_name': 'Transaction',
        'carryover_column_name': 'Type',
        'key_column_name': 'Account ID'
    }]
)

Usage

Apply the constraint to any SDV synthesizer. Then fit and sample as usual.

synthesizer = HSASynthesizer(metadata)
synthesizer.add_constraints([my_constraint])

synthesizer.fit(data)
synthetic_data = synthesizer.sample()

Make sure that all the tables and columns you provide are in your .

Metadata
The 'Type' column is carried over based on the ID/Account ID information.

For more information about using predefined constraints, please see the .

Constraint-Augmented Generation tutorial

❖ SDV Enterprise Bundle. This feature is available as part of the CAG Bundle, an optional add-on to SDV Enterprise. For more information, please visit the page.

CAG Bundle