The SDV creates synthetic data using machine learning. A synthesizer is an object that you can use to accomplish this task.
1. You'll start by creating a synthesizer based on your metadata.
2. Next, you'll train the synthesizer using real data. In this phase, the synthesizer will learn patterns from the real data.
3. Once your synthesizer is trained, you can use it to generate new, synthetic data.
from sdv.single_table import GaussianCopulaSynthesizer
# Step 1: Create the synthesizer
synthesizer = GaussianCopulaSynthesizer(metadata)
# Step 2: Train the synthesizer
# (`data` is assumed to be your real data, loaded as a pandas DataFrame)
synthesizer.fit(data)
# Step 3: Generate synthetic data
synthetic_data = synthesizer.sample(num_rows=100)
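To make the create, train, and sample steps concrete without installing anything, here is a minimal sketch of the same workflow using only the Python standard library. `ToyGaussianSynthesizer` is a hypothetical stand-in written for illustration and is not part of the SDV. It assumes numeric columns, learns only each column's mean and standard deviation, and treats columns independently, so it ignores the relationships between columns that real synthesizers generally try to capture.

```python
import random
import statistics


class ToyGaussianSynthesizer:
    """Hypothetical illustration of the synthesizer workflow; NOT an SDV class."""

    def __init__(self, columns):
        # Step 1: create the synthesizer from a description of the table.
        # A plain list of numeric column names stands in for real metadata.
        self.columns = list(columns)
        self._params = None

    def fit(self, rows):
        # Step 2: learn a simple pattern from the real data:
        # the mean and standard deviation of each column.
        params = {}
        for col in self.columns:
            values = [row[col] for row in rows]
            params[col] = (statistics.mean(values), statistics.stdev(values))
        self._params = params

    def sample(self, num_rows, seed=None):
        # Step 3: generate new rows by drawing from the learned distributions.
        if self._params is None:
            raise RuntimeError("Call fit() before sample().")
        rng = random.Random(seed)
        return [
            {col: rng.gauss(mu, sigma) for col, (mu, sigma) in self._params.items()}
            for _ in range(num_rows)
        ]


# Made-up example rows, used only to exercise the sketch.
real_data = [
    {"age": 34, "income": 52000},
    {"age": 45, "income": 61000},
    {"age": 29, "income": 48000},
    {"age": 52, "income": 75000},
]

toy = ToyGaussianSynthesizer(columns=["age", "income"])
toy.fit(real_data)
toy_synthetic = toy.sample(num_rows=100, seed=0)
```

The toy class mirrors the three calls in the SDV snippet: a constructor, `fit`, and `sample(num_rows=...)`. The `seed` parameter is an addition for reproducibility in this sketch and is not meant to describe the SDV's `sample` signature.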
Choose from a variety of synthesizers. Each synthesizer uses a different machine learning technique for training.
Want to improve your synthetic data? You can control the pre- and post-processing steps in your synthesizer and set up custom anonymization controls. You can also enforce logical rules in the form of constraints. See Advanced Features for more options.