Modeling

The SDV creates synthetic data using machine learning. A synthesizer is an object that you can use to accomplish this task.

  1. You'll start by creating a synthesizer based on your metadata.

  2. Next, you'll train the synthesizer using real data. In this phase, the synthesizer will learn patterns from the real data.

  3. Once your synthesizer is trained, you can use it to generate new, synthetic data.

from sdv.sequential import PARSynthesizer

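# Step 1: Create the synthesizer
# (assumes `metadata` already describes your sequential data)
synthesizer = PARSynthesizer(metadata)

# Step 2: Train the synthesizer
# (assumes `real_data` is a pandas DataFrame containing the real sequences)
synthesizer.fit(real_data)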

# Step 3: Generate synthetic data
synthetic_data = synthesizer.sample(num_sequences=100)
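
To make the create, fit, sample pattern concrete without installing anything, here is a minimal toy sketch in plain Python. The class `ToySequenceSynthesizer`, the dictionary-style metadata, and the column names `patient_id` and `heart_rate` are all hypothetical and exist only for illustration. This is not how PARSynthesizer works internally: the toy simply memorizes sequence lengths and observed values, then resamples them.

```python
import random
from collections import defaultdict


class ToySequenceSynthesizer:
    """Hypothetical stand-in that mimics the create / fit / sample workflow.

    NOT the SDV implementation: it only remembers sequence lengths and
    observed values from the real data, then resamples them at random.
    """

    def __init__(self, metadata, seed=0):
        # Step 1: the synthesizer is configured from metadata alone.
        self.sequence_key = metadata["sequence_key"]
        self.value_column = metadata["value_column"]
        self._rng = random.Random(seed)
        self._lengths = []
        self._values = []
        self._fitted = False

    def fit(self, rows):
        # Step 2: learn simple patterns (lengths and values) from real rows.
        if not rows:
            raise ValueError("Cannot fit on empty data.")
        sequences = defaultdict(list)
        for row in rows:
            sequences[row[self.sequence_key]].append(row[self.value_column])
        self._lengths = [len(values) for values in sequences.values()]
        self._values = [v for values in sequences.values() for v in values]
        self._fitted = True

    def sample(self, num_sequences):
        # Step 3: generate new sequences from what was learned.
        if not self._fitted:
            raise RuntimeError("Call fit() before sample().")
        synthetic = []
        for i in range(num_sequences):
            length = self._rng.choice(self._lengths)
            for _ in range(length):
                synthetic.append({
                    self.sequence_key: f"synthetic-{i}",
                    self.value_column: self._rng.choice(self._values),
                })
        return synthetic


# Hypothetical metadata and data, invented for this example only.
toy_metadata = {"sequence_key": "patient_id", "value_column": "heart_rate"}
toy_real_data = [
    {"patient_id": "A", "heart_rate": 70},
    {"patient_id": "A", "heart_rate": 72},
    {"patient_id": "A", "heart_rate": 71},
    {"patient_id": "B", "heart_rate": 65},
    {"patient_id": "B", "heart_rate": 68},
]

toy = ToySequenceSynthesizer(toy_metadata)       # Step 1: create
toy.fit(toy_real_data)                           # Step 2: train
toy_synthetic = toy.sample(num_sequences=100)    # Step 3: generate
```

As in the SDV snippet above, sampling is requested in units of sequences rather than rows, so the number of generated rows depends on the sequence lengths the synthesizer has learned.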

What's next?

Explore the PARSynthesizer. This sequential synthesizer is available in the open source SDV.

Want to improve your synthetic data? You can control the pre- and post-processing steps in your synthesizer and set up custom anonymization controls. See Advanced Features for more options.

Copyright (c) 2023, DataCebo, Inc.