SDV Community
SDV Community is our publicly available synthetic data product. Use it to start exploring the benefits of synthetic data.
Access the powerful SDV platform
SDV Community uses features from across the SDV platform ecosystem. The platform includes a suite of libraries and features that work together to form a one-stop shop for your synthetic data needs. You can also browse and use the platform features on their own.
Installation
SDV Community is available as a Python SDK that you can install and use on-prem. It is distributed under the Business Source License, and has been developed on Python 3.8-3.13. As with most Python libraries, we recommend using a virtual environment (such as virtualenv) to avoid conflicts with other software on your device.
We recommend installing SDV Community using pip (or, alternatively, conda).
pip install sdv
Then, open up Python and verify that SDV has installed correctly.
import sdv
print(sdv.version.community)
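If the import fails, it can help to confirm that the interpreter falls inside the supported range and that pip actually registered the package. The following is a minimal sketch using only the standard library; the helper names `python_is_supported` and `installed_sdv_version` are hypothetical and not part of SDV, and the version bounds are simply the 3.8-3.13 range stated above.

```python
import sys
from importlib import metadata, util

# Range taken from the installation notes above (Python 3.8-3.13).
MIN_SUPPORTED = (3, 8)
MAX_SUPPORTED = (3, 13)


def python_is_supported(version_info=None):
    """Return True if the major.minor version is within the documented range."""
    if version_info is None:
        version_info = sys.version_info
    major_minor = tuple(version_info[:2])
    return MIN_SUPPORTED <= major_minor <= MAX_SUPPORTED


def installed_sdv_version():
    """Return the installed 'sdv' distribution version, or None if it is absent."""
    if util.find_spec("sdv") is None:
        return None
    try:
        return metadata.version("sdv")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("Python supported:", python_is_supported())
    print("sdv distribution version:", installed_sdv_version())
```

A result of `None` from the second helper suggests the package was installed into a different environment than the one Python is running in.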
For more information about the latest version of SDV Community, see the Release Notes.
What's next?
Once you've installed SDV, you're ready to create synthetic data.
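import pandas as pd
from sdv.single_table import GaussianCopulaSynthesizer
from sdv.metadata import Metadata
# Load the real data into a pandas DataFrame
data = pd.read_csv('my_data_file.csv')
# Auto-detect metadata that describes the table's columns
metadata = Metadata.detect_from_dataframe(data)
# Create a synthesizer and fit it to the real data
synthesizer = GaussianCopulaSynthesizer(metadata)
synthesizer.fit(data)
# Generate 1000 rows of synthetic data
synthetic_data = synthesizer.sample(num_rows=1000)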
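The snippet above reads `my_data_file.csv`. If you do not have a dataset at hand, a small placeholder file is enough to try it out. The sketch below writes one using only the standard library; the function name `write_sample_csv`, the column names, and the values are all made up for illustration and do not come from SDV.

```python
import csv
import random


def write_sample_csv(path="my_data_file.csv", num_rows=200, seed=0):
    """Write a small, made-up table so the quickstart has a file to read."""
    rng = random.Random(seed)  # seeded so the file is reproducible
    plans = ["basic", "plus", "premium"]  # hypothetical categories
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        # Hypothetical columns: an ID, a numeric, a categorical, and a float
        writer.writerow(["customer_id", "age", "plan", "monthly_spend"])
        for i in range(num_rows):
            writer.writerow([
                i,
                rng.randint(18, 80),
                rng.choice(plans),
                round(rng.uniform(5.0, 120.0), 2),
            ])
    return path


if __name__ == "__main__":
    write_sample_csv()
```

Running this once in your working directory creates the file that `pd.read_csv('my_data_file.csv')` expects.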
First time here? Check out our Tutorials to explore the features. The tutorials will walk you through creating synthetic data for single table, multi-table, and sequential data.