Modeling

The SDV creates synthetic data using machine learning. A synthesizer is an object that you can use to accomplish this task.

  1. You'll start by creating a synthesizer based on your metadata.

  2. Next, you'll train the synthesizer using real data. In this phase, the synthesizer will learn patterns from the real data.

  3. Once your synthesizer is trained, you can use it to generate new, synthetic data.

from sdv.multi_table import HMASynthesizer

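# Step 1: Create the synthesizer from your metadata object
synthesizer = HMASynthesizer(metadata)

# Step 2: Train the synthesizer on real data
# (a dictionary mapping each table name to a pandas DataFrame)
synthesizer.fit(real_data)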

# Step 3: Generate synthetic data
synthetic_data = synthesizer.sample()
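
To make the three steps more concrete, here is a minimal sketch of the same create, fit, and sample workflow using only the Python standard library. `ToySynthesizer`, `toy_metadata`, and `toy_real_data` are hypothetical names invented for illustration and are not part of the SDV. The sketch also assumes a much simpler model than a real synthesizer: it learns each column independently and ignores any relationships between tables.

```python
import random
import statistics
from collections import Counter


class ToySynthesizer:
    """Hypothetical stand-in that mirrors the create / fit / sample steps.

    Not part of the SDV: it models every column independently and
    ignores relationships between tables.
    """

    def __init__(self, metadata, seed=0):
        # Step 1: metadata maps table name -> {column: "numerical" | "categorical"}
        self.metadata = metadata
        self._rng = random.Random(seed)
        self._params = None

    def fit(self, real_data):
        # Step 2: learn simple per-column statistics from the real rows.
        self._params = {}
        for table, columns in self.metadata.items():
            rows = real_data[table]
            learned = {}
            for column, kind in columns.items():
                values = [row[column] for row in rows]
                if kind == "numerical":
                    # Summarize numbers by their mean and standard deviation.
                    learned[column] = (statistics.mean(values), statistics.pstdev(values))
                else:
                    # Summarize categories by how often each one appears.
                    counts = Counter(values)
                    learned[column] = (list(counts), list(counts.values()))
            self._params[table] = (learned, len(rows))

    def sample(self):
        # Step 3: draw new rows from the learned statistics.
        if self._params is None:
            raise RuntimeError("Call fit() before sample().")
        synthetic = {}
        for table, (learned, n_rows) in self._params.items():
            rows = []
            for _ in range(n_rows):
                row = {}
                for column, kind in self.metadata[table].items():
                    if kind == "numerical":
                        mean, std = learned[column]
                        row[column] = self._rng.gauss(mean, std)
                    else:
                        categories, weights = learned[column]
                        row[column] = self._rng.choices(categories, weights=weights)[0]
                rows.append(row)
            synthetic[table] = rows
        return synthetic


# Made-up example data: one table with one numerical and one categorical column.
toy_metadata = {"guests": {"age": "numerical", "room_type": "categorical"}}
toy_real_data = {
    "guests": [
        {"age": 31, "room_type": "BASIC"},
        {"age": 45, "room_type": "SUITE"},
        {"age": 27, "room_type": "BASIC"},
        {"age": 52, "room_type": "BASIC"},
    ]
}

toy_synthesizer = ToySynthesizer(toy_metadata)      # Step 1
toy_synthesizer.fit(toy_real_data)                  # Step 2
toy_synthetic_data = toy_synthesizer.sample()       # Step 3
```

The toy returns the same number of rows per table as it was trained on, with values drawn from the learned statistics rather than copied from the real rows. A real synthesizer replaces the per-column statistics with a learned model of the data.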

What's next?

Explore the synthesizers. Create multi-table synthetic data using a variety of synthesizers.

Want to improve your synthetic data? You can control the pre- and post-processing steps in your synthesizer and set up custom anonymization controls. You can also enforce logical rules in the form of constraints. See the Advanced Features for more options.

Copyright (c) 2023, DataCebo, Inc.