Synthetic Data Vault

Copyright (c) 2023, DataCebo, Inc.

Synthesizers

Last updated 17 days ago

The SDV offers a variety of synthesizers, which use different algorithms to generate synthetic data.

Basic Single Table Synthesizers

These synthesizers are available in the SDV Community package. They build a generative AI model using your real data, and use it to create synthetic data.

We recommend starting with the GaussianCopulaSynthesizer for fast performance, good quality, and customization.
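As a rough illustration of that starting point, the sketch below fits a synthesizer on real data and then samples from it. It assumes a recent SDV 1.x release in which `sdv.metadata.Metadata` and `sdv.single_table.GaussianCopulaSynthesizer` are available. The helper names `fit_and_sample` and `make_gaussian_copula` are hypothetical and not part of the SDV.

```python
def fit_and_sample(synthesizer, real_data, num_rows):
    """Fit an SDV-style single table synthesizer, then draw synthetic rows.

    Works with any object exposing fit(data) and sample(num_rows=...).
    """
    synthesizer.fit(real_data)
    return synthesizer.sample(num_rows=num_rows)


def make_gaussian_copula(real_data):
    """Build a GaussianCopulaSynthesizer with auto-detected metadata.

    The imports are local so this file still loads where sdv is not installed.
    Assumption: a recent SDV 1.x release that provides sdv.metadata.Metadata.
    """
    from sdv.metadata import Metadata
    from sdv.single_table import GaussianCopulaSynthesizer

    # Auto-detected metadata is a starting point; review it before modeling.
    metadata = Metadata.detect_from_dataframe(data=real_data)
    return GaussianCopulaSynthesizer(metadata)
```

With a pandas DataFrame `df` of real data, a call such as `fit_and_sample(make_gaussian_copula(df), df, num_rows=100)` would be expected to return a DataFrame of 100 synthetic rows.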

For higher fidelity, try a neural network-based synthesizer such as the CTGANSynthesizer or the TVAESynthesizer. Modeling and sampling may be slower with these synthesizers, especially if you have categorical columns with many different values (high cardinality).
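One way to check for high cardinality before choosing a neural network-based synthesizer is to count the distinct values in each categorical column. The sketch below uses plain Python on a list of row dictionaries. The function name `high_cardinality_columns` and the default threshold of 50 are assumptions for illustration, not SDV settings.

```python
def high_cardinality_columns(rows, categorical_columns, max_unique=50):
    """Return {column: distinct_count} for columns with more than max_unique values.

    rows is a list of dicts (one per record); missing values (None) are ignored.
    The max_unique threshold is an illustrative assumption, not an SDV default.
    """
    flagged = {}
    for column in categorical_columns:
        distinct = {row[column] for row in rows if row.get(column) is not None}
        if len(distinct) > max_unique:
            flagged[column] = len(distinct)
    return flagged
```

Columns flagged this way are the ones most likely to slow down modeling and sampling, so they are worth reviewing before training.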

Experimental synthesizer: The CopulaGANSynthesizer combines classical statistics with GAN-based modeling.

Specialty Synthesizers

Specialty synthesizers are available for special situations — such as improving speed, enhancing quality, or providing privacy guarantees.

Specialty synthesizers are available to licensed SDV Enterprise users (denoted by *) or through the purchase of additional bundles (denoted by ❖). For more information, see SDV Enterprise and SDV Bundles.

Basic single table synthesizers:
  • GaussianCopulaSynthesizer: Use a classical ML algorithm to learn from real data. This is fast, transparent, and customizable.
  • CTGANSynthesizer: Use a GAN-based ML algorithm to learn from real data. This may take longer to learn and be harder to debug.
  • TVAESynthesizer: Use a variational autoencoder ML model to learn from real data. This may take longer to learn and be harder to debug.
  • CopulaGANSynthesizer: [Experimental] Combine classical statistics with GAN-based modeling.

Specialty synthesizers (* = SDV Enterprise, ❖ = SDV Bundles):
  • * DayZSynthesizer: Generate synthetic data from scratch. Use this when you don't have a lot of real data.
  • ❖ XGCSynthesizer: Use extra features on top of Gaussian Copula for higher quality synthetic data and improved performance.
  • ❖ SegmentSynthesizer: Use this synthesizer when your real data is highly segmented, with different patterns in each segment.
  • ❖ DPGCSynthesizer: Use Gaussian Copula while guaranteeing differential privacy.
  • ❖ DPGCFlexSynthesizer: [Experimental] Use and customize Gaussian Copula while guaranteeing differential privacy.
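The guidance on this page can be summarized as a small decision helper. The sketch below is only an illustration: the function name, its parameters, and the order in which the conditions are checked are assumptions, and it returns class names as strings rather than importing anything from the SDV.

```python
def recommend_synthesizer(little_real_data=False, needs_differential_privacy=False,
                          highly_segmented=False, want_higher_fidelity=False):
    """Map a few needs to a synthesizer name, following the page's descriptions.

    The priority order of the checks is an assumption made for this sketch.
    """
    if little_real_data:
        return "DayZSynthesizer"      # SDV Enterprise (*)
    if needs_differential_privacy:
        return "DPGCSynthesizer"      # SDV Bundles (❖)
    if highly_segmented:
        return "SegmentSynthesizer"   # SDV Bundles (❖)
    if want_higher_fidelity:
        return "CTGANSynthesizer"     # TVAESynthesizer is the other neural option
    return "GaussianCopulaSynthesizer"  # recommended starting point
```

Calling `recommend_synthesizer()` with no arguments returns the recommended starting point, the GaussianCopulaSynthesizer.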