Synthetic Data Vault

Copyright (c) 2023, DataCebo, Inc.


SDV Community


Last updated 22 days ago

SDV Community is our publicly available synthetic data product. Use SDV Community to start exploring the benefits of synthetic data.

Access the powerful SDV platform

SDV Community uses features from across the SDV platform ecosystem. The platform includes a suite of libraries and features that work together to form a one-stop shop for your synthetic data needs. You can also browse and use each platform feature on its own.

Installation

SDV Community is available as a Python SDK that you can install and use on-prem. It is distributed under the Business Source License, and has been developed on Python 3.8-3.13. As with most Python libraries, we recommend using a virtual environment (such as virtualenv) to avoid conflicts with other software on your device.
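Before installing, it can help to confirm that your interpreter falls within the Python range stated on this page (3.8-3.13). The following is a minimal sketch using only the standard library; the helper name `is_supported` is hypothetical and is not part of SDV.

```python
import sys

# Supported range as stated on this page: Python 3.8 through 3.13.
MIN_SUPPORTED = (3, 8)
MAX_SUPPORTED = (3, 13)


def is_supported(version_info=None):
    """Return True if the (major, minor) version is within the stated range.

    Hypothetical helper for illustration; not part of the SDV API.
    """
    if version_info is None:
        version_info = sys.version_info
    major_minor = tuple(version_info[:2])
    return MIN_SUPPORTED <= major_minor <= MAX_SUPPORTED


if __name__ == "__main__":
    current = "{}.{}".format(*sys.version_info[:2])
    if is_supported():
        print(f"Python {current} is within the stated range.")
    else:
        print(f"Python {current} is outside the stated range (3.8-3.13).")
```

If the check reports a version outside the range, consider creating the virtual environment with a different interpreter before installing.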

We recommend downloading SDV Community using pip (or alternatively conda).

pip install sdv

Then, open up Python and verify that SDV has installed correctly.

import sdv
print(sdv.version.public)
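If the import above fails, a gentler check is to ask the packaging metadata whether `sdv` is installed at all. This is a sketch that relies only on the standard library's `importlib.metadata` (available from Python 3.8); the helper name `installed_version` is hypothetical.

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version string, or None if the package is missing.

    Hypothetical helper for illustration; not part of the SDV API.
    """
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("sdv")
if version is None:
    print("sdv is not installed in this environment; try `pip install sdv`.")
else:
    print(f"sdv {version} is installed.")
```

Unlike `import sdv`, this check does not load the library, so it can separate a missing installation from an installation that fails to import.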

For more information about the latest version of SDV Community, see the Release Notes.

Having trouble? Visit our troubleshooting section to diagnose any issues. You can also ask a question on our GitHub or Slack channel.

What's next?

Once you've installed SDV, you're ready to create synthetic data.

import pandas as pd
from sdv.single_table import GaussianCopulaSynthesizer
from sdv.metadata import Metadata

data = pd.read_csv('my_data_file.csv')
metadata = Metadata.detect_from_dataframe(data)

synthesizer = GaussianCopulaSynthesizer(metadata)
synthesizer.fit(data)
synthetic_data = synthesizer.sample(num_rows=1000)
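The snippet above assumes that a file named `my_data_file.csv` already exists. If you only want to try the workflow, the sketch below writes a small placeholder file using the standard library. The helper `write_sample_csv`, the column names, and the values are all hypothetical and exist only for illustration; they do not come from SDV or from any real dataset.

```python
import csv
import random


def write_sample_csv(path="my_data_file.csv", num_rows=50, seed=0):
    """Write a small made-up table to `path` and return the path.

    Hypothetical helper: the columns and values are invented placeholders.
    """
    rng = random.Random(seed)  # seeded so repeated runs give the same file
    fieldnames = ["customer_id", "age", "plan", "monthly_spend"]
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        for i in range(num_rows):
            writer.writerow({
                "customer_id": i,
                "age": rng.randint(18, 80),
                "plan": rng.choice(["basic", "plus", "premium"]),
                "monthly_spend": round(rng.uniform(5.0, 120.0), 2),
            })
    return path


if __name__ == "__main__":
    print("Wrote", write_sample_csv())
```

After running it, the `pd.read_csv('my_data_file.csv')` call above has a file to load. A table this small is only useful for checking that the steps run, not for judging synthetic data quality.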

First time here? Check out our tutorials to explore the features. The tutorials will walk you through creating synthetic data for single table, multi-table, and sequential data.

Tutorials

RDT: Preprocess and anonymize your data using reversible data transformers.

SDMetrics: Evaluate synthetic data for quality and privacy. Visualize & share results.

SDGym: Benchmark synthetic data generators across SDV and other libraries.