Synthesizers
Copyright (c) 2023, DataCebo, Inc.
The SDV offers a variety of synthesizers, which use different algorithms to generate synthetic data.
These synthesizers are available in the SDV Community package. They build a generative AI model using your real data, and use it to create synthetic data.
We recommend starting with GaussianCopulaSynthesizer. This synthesizer models data quickly with high statistical quality. It also supports customizations and many options for sampling.
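As a rough illustration of that starting point, the sketch below wraps the usual fit-then-sample workflow in a helper. The helper name `synthesize_with_gaussian_copula` is hypothetical, and the code assumes the SDV 1.x single-table API (`SingleTableMetadata`, `GaussianCopulaSynthesizer`); newer releases may prefer a different metadata class, so check the version you have installed.

```python
def synthesize_with_gaussian_copula(real_data, num_rows):
    """Fit a GaussianCopulaSynthesizer on a pandas DataFrame and sample from it.

    Hypothetical helper for illustration; not part of the SDV package.
    """
    if num_rows <= 0:
        raise ValueError("num_rows must be a positive integer")

    # Imported lazily so this function can be defined without SDV installed.
    from sdv.metadata import SingleTableMetadata
    from sdv.single_table import GaussianCopulaSynthesizer

    # Assumption: column types are auto-detected from the DataFrame.
    # Review the detected metadata before relying on it for real data.
    metadata = SingleTableMetadata()
    metadata.detect_from_dataframe(real_data)

    # Learn from the real data, then draw new synthetic rows.
    synthesizer = GaussianCopulaSynthesizer(metadata)
    synthesizer.fit(real_data)
    return synthesizer.sample(num_rows=num_rows)
```

Given a pandas DataFrame `real_data`, calling `synthesize_with_gaussian_copula(real_data, num_rows=100)` should return a DataFrame of 100 synthetic rows with the same columns.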
Experimental synthesizer: The CopulaGANSynthesizer combines classical statistics with GAN-based modeling.
Specialty synthesizers are available for special situations. You can also use them to improve speed and performance.
Some specialty synthesizers are available to licensed SDV Enterprise users (denoted by *) or through purchasing additional bundles (denoted by ❖). For more information, see our Explore SDV page.
Generate synthetic data from scratch. Use this when you don't have a lot of real data.
Use a classical ML algorithm to learn from real data. This is fast, transparent, and customizable.
Use a GAN-based ML algorithm to learn from real data. This may take longer to learn and be harder to debug.
Use a variational autoencoder ML model to learn from real data. This may take longer to learn and be harder to debug.
Use extra features on top of Gaussian Copula for higher quality synthetic data and improved performance.
Use this synthesizer when your real data is highly segmented, with different patterns in each segment.
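Because the community synthesizers share the same fit and sample workflow, switching between the options above can be a small change. The sketch below is one way to express that choice. The function `build_synthesizer` and its `kind` labels are hypothetical. Mapping the GAN-based and variational autoencoder descriptions to `CTGANSynthesizer` and `TVAESynthesizer` in `sdv.single_table` is an assumption, since this page does not name them.

```python
def build_synthesizer(kind, metadata):
    """Return an unfitted SDV single-table synthesizer for the given kind.

    Hypothetical helper for illustration; the kind labels are not SDV names.
    """
    # Assumed mapping from the descriptions above to SDV class names.
    class_names = {
        "copula": "GaussianCopulaSynthesizer",  # classical, fast, transparent
        "gan": "CTGANSynthesizer",              # GAN-based, slower to train
        "vae": "TVAESynthesizer",               # variational autoencoder
    }
    if kind not in class_names:
        raise ValueError(f"unknown synthesizer kind: {kind!r}")

    # Imported lazily so this function can be defined without SDV installed.
    import sdv.single_table as single_table

    synthesizer_class = getattr(single_table, class_names[kind])
    return synthesizer_class(metadata)
```

Whichever class is returned, the next steps would be the same: call `fit` on the real data, then `sample` with the number of rows you want.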