Synthesizers
The SDV offers a variety of synthesizers, which use different algorithms to generate synthetic data.
Basic Single Table Synthesizers
These synthesizers are available in the SDV Community package. They build a generative AI model using your real data, and use it to create synthetic data.
We recommend starting with GaussianCopulaSynthesizer for fast performance, good quality, and customization.
For higher fidelity, try a neural network-based synthesizer such as CTGANSynthesizer or TVAESynthesizer. Modeling and sampling performance may be slower for these synthesizers, especially if you have categorical columns with many different values (high cardinality).
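Before choosing a neural network-based synthesizer, it can help to check how many distinct values each categorical column has. The sketch below is not part of the SDV; it is a plain-Python helper with hypothetical names (`column_cardinality`, `high_cardinality_columns`) and an arbitrary threshold of 50, which you would tune for your own data.

```python
from collections import defaultdict


def column_cardinality(rows):
    """Count distinct non-missing values per column in a list of row dicts."""
    distinct = defaultdict(set)
    for row in rows:
        for column, value in row.items():
            if value is not None:
                distinct[column].add(value)
    return {column: len(values) for column, values in distinct.items()}


def high_cardinality_columns(rows, categorical_columns, threshold=50):
    """Return the categorical columns whose distinct-value count exceeds the threshold.

    The threshold is an assumption for illustration, not an SDV recommendation.
    """
    counts = column_cardinality(rows)
    return [c for c in categorical_columns if counts.get(c, 0) > threshold]


# Made-up data: 'city' has 200 distinct values, 'plan' has 2.
rows = [
    {'city': f'city_{i % 200}', 'plan': 'basic' if i % 2 else 'premium'}
    for i in range(1000)
]

flagged = high_cardinality_columns(rows, ['city', 'plan'], threshold=50)
print(flagged)
```

Columns flagged this way are the ones most likely to slow down modeling and sampling with CTGANSynthesizer or TVAESynthesizer.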
CTGANSynthesizer: Use a GAN-based ML algorithm to learn from real data. This may take longer to train and be harder to debug.
Experimental synthesizer: The CopulaGANSynthesizer combines classical statistics with GAN-based modeling.
Specialty Synthesizers
Specialty synthesizers address specific needs, such as improving speed, enhancing quality, or providing privacy guarantees.
Use extra features on top of Gaussian Copula for higher quality synthetic data and improved performance.
A synthesizer that is optimized to learn from a smaller number of training rows.
Generate synthetic data from scratch. Use this when you don't have a lot of real data.