SDGym

© Copyright 2023, DataCebo, Inc.

Basic Synthesizers

Last updated 9 months ago

The SDGym library includes some basic synthesizers that you can use for benchmarking purposes. Pass the string names into the synthesizers parameter.

import sdgym

sdgym.benchmark_single_table(
    synthesizers=['DataIdentity', 'UniformSynthesizer']
)

Use basic synthesizers for comparison purposes only! The basic synthesizers listed below are unlikely to produce usable synthetic data. Use them as baselines to compare against other synthesizers, such as the SDV Synthesizers.

| Basic Synthesizer | Description |
| --- | --- |
| DataIdentity | This synthesizer* returns the same data that it receives. It serves as an identity function. |
| UniformSynthesizer | This synthesizer learns the numerical range or the set of categories of each column. Then, it creates synthetic data by randomly generating values within those boundaries. |
| ColumnSynthesizer | This synthesizer learns the marginal distribution of each column independently to generate synthetic data. For numerical columns, it learns a Gaussian Mixture. For categorical columns, it learns the frequency of each category. It does not learn any correlations between the different columns. |

*Technically, this technique doesn't really count as a synthesizer, as it does not create new data.
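To make the three descriptions above concrete, here is a minimal standard-library sketch of the ideas behind them. It is not SDGym's implementation: the toy table, the column names, and the function names (`identity_sample`, `uniform_sample`, `column_sample`) are all hypothetical, and for numerical columns the marginal step substitutes a single normal distribution for the Gaussian Mixture mentioned above.

```python
import random
import statistics
from collections import Counter

# Toy table stored column-wise; the column names and values are made up.
REAL = {
    "age": [23, 35, 41, 29, 52, 47],
    "plan": ["free", "pro", "free", "free", "team", "pro"],
}


def is_numeric(values):
    """Treat a column as numerical when every value is an int or float."""
    return all(isinstance(v, (int, float)) for v in values)


def identity_sample(table):
    """Idea behind DataIdentity: return a copy of the data that was received."""
    return {name: list(values) for name, values in table.items()}


def uniform_sample(table, n_rows, rng):
    """Idea behind UniformSynthesizer: draw uniformly within learned boundaries."""
    out = {}
    for name, values in table.items():
        if is_numeric(values):
            low, high = min(values), max(values)
            out[name] = [rng.uniform(low, high) for _ in range(n_rows)]
        else:
            categories = sorted(set(values))
            out[name] = [rng.choice(categories) for _ in range(n_rows)]
    return out


def column_sample(table, n_rows, rng):
    """Idea behind ColumnSynthesizer: sample each column from its own marginal."""
    out = {}
    for name, values in table.items():
        if is_numeric(values):
            # Assumption: one normal fit stands in for a Gaussian Mixture.
            mean = statistics.mean(values)
            stdev = statistics.stdev(values)
            out[name] = [rng.gauss(mean, stdev) for _ in range(n_rows)]
        else:
            # Sample categories in proportion to their observed frequencies.
            counts = Counter(values)
            out[name] = rng.choices(
                list(counts), weights=list(counts.values()), k=n_rows
            )
    return out


rng = random.Random(0)
same = identity_sample(REAL)
uniform = uniform_sample(REAL, 100, rng)
marginal = column_sample(REAL, 100, rng)
```

In this sketch every column is sampled on its own, so any relationship between `age` and `plan` in the toy table is lost in both `uniform` and `marginal`. That is the behavior the ColumnSynthesizer description refers to when it says no correlations are learned.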

FAQs

What if I have an idea for another basic synthesizer?

If there are other basic techniques you'd like to see included in the SDGym library, please create a Feature Request with your ideas.

In the meantime, you can create a Custom Synthesizer where you can implement the techniques yourself.
