SDV Community

SDV Community is our publicly available synthetic data product. Use SDV Community to start exploring the benefits of synthetic data.

Access the powerful SDV platform

SDV Community uses features from across the SDV platform ecosystem. The platform includes a suite of libraries and features that work together to form a one-stop shop for your synthetic data needs. You can also browse and use the platform features on their own.

Installation

SDV Community is available as a Python SDK that you can install and use on-prem. It is distributed under the Business Source License and has been developed on Python 3.8 through 3.13. As with most Python libraries, we recommend using a virtual environment (such as virtualenv) to avoid conflicts with other software on your device.
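If you are unsure whether your interpreter falls in the stated range, a quick check before installing can save a failed install. The following is a minimal sketch using only the standard library. The helper name `is_supported_python` and the two constants are hypothetical, chosen for illustration, and the bounds simply mirror the 3.8 to 3.13 range mentioned above.

```python
import sys

# Bounds taken from the range stated in this section (inclusive).
# These constants are illustrative and are not part of SDV.
MIN_SUPPORTED = (3, 8)
MAX_SUPPORTED = (3, 13)


def is_supported_python(version_info=None):
    """Return True if the (major, minor) version lies in the stated range.

    Hypothetical helper for illustration; defaults to the running interpreter.
    """
    if version_info is None:
        version_info = sys.version_info
    major_minor = tuple(version_info[:2])
    return MIN_SUPPORTED <= major_minor <= MAX_SUPPORTED


current = "{}.{}".format(*sys.version_info[:2])
if is_supported_python():
    print(f"Python {current} is within the stated range")
else:
    print(f"Python {current} is outside the stated range")
```

Running this inside your virtual environment, rather than with the system interpreter, checks the Python that pip will actually install into.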

We recommend installing SDV Community using pip (or alternatively conda).

pip install sdv

Then, open up Python and verify that SDV has installed correctly.

import sdv
print(sdv.version.public)
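If the import above fails, it can help to confirm whether the package is present in the active environment at all. The sketch below uses the standard library's `importlib.metadata` to look up the installed distribution without importing it. The helper name `installed_version` is hypothetical, and the sketch assumes the distribution is registered under the name `sdv`, matching the pip command above.

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version string, or None if the package is absent.

    Hypothetical helper for illustration; it reads package metadata only
    and does not import the package itself.
    """
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


# Assumes the distribution name is "sdv", as in `pip install sdv`.
version = installed_version("sdv")
if version is None:
    print("sdv is not installed in this environment")
else:
    print(f"sdv {version} is installed")
```

A result of `None` usually means pip installed into a different environment than the one running Python, which is a common cause of import errors.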

For more information about the latest version of SDV Community, see the Release Notes.

Having trouble? Visit our troubleshooting section to diagnose any issues. You can also ask a question on our GitHub or Slack channel.

What's next?

Once you've installed SDV, you're ready to create synthetic data.

import pandas as pd
from sdv.single_table import GaussianCopulaSynthesizer
from sdv.metadata import Metadata

data = pd.read_csv('my_data_file.csv')
metadata = Metadata.detect_from_dataframe(data)

synthesizer = GaussianCopulaSynthesizer(metadata)
synthesizer.fit(data)
synthetic_data = synthesizer.sample(num_rows=1000)

First time here? Check out our Tutorials to explore the features. The tutorials will walk you through creating synthetic data for single table, multi-table, and sequential data.
