Welcome to the SDV!

The Synthetic Data Vault (SDV) is a Python library designed to be your one-stop shop for creating tabular synthetic data. It is available to the public under the Business Source License.

Key Features

🧠 Train your own Generative AI Model
Choose from a variety of AI models meant for tabular data. Browse options for single table and multi-table (relational) data.
📊 Evaluate & Visualize Synthetic Data
Diagnose problems and measure statistical quality. For even more insight, visualize synthetic vs. real data.
⚙️ Customize your Synthesizer
Add business logic, control the data pre-processing rules, and select anonymization options for sensitive values.

Ready to take SDV to the next level?

SDV Enterprise* offers more scalable synthesizers, deeper data understanding, and integrations. You can also deploy synthetic data applications enterprise-wide.
*SDV Enterprise is currently in Early Access. Contact Us to be part of our Beta program.

Owned & Maintained by DataCebo

The SDV library is part of the greater Synthetic Data Vault Project, first created at MIT's Data to AI Lab in 2016. After four years of research and traction with enterprises, we created DataCebo in 2020 with the goal of growing the project.
Today, DataCebo is the proud developer of the SDV, the largest ecosystem for synthetic data generation & evaluation.
Copyright (c) 2023, DataCebo, Inc.