Welcome to the SDV!

The Synthetic Data Vault (SDV) is a Python library designed to be your one-stop shop for creating tabular synthetic data. It is available to the public under the Business Source License. Additional plans are also available.

Key Features

🧠 Train your own Generative AI Model

Choose from a variety of AI models meant for tabular data. Browse options for single table and multi-table (relational) data.
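As a rough illustration of what training a single-table model can look like, here is a minimal sketch. It assumes the SDV 1.x Python API (`SingleTableMetadata` and `GaussianCopulaSynthesizer`); newer releases may expose a different metadata class, so check the API reference for your installed version. The helper name `train_and_sample` is hypothetical and not part of SDV.

```python
def train_and_sample(real_data, num_rows=5):
    """Fit a single-table synthesizer on a pandas DataFrame and sample rows.

    Hypothetical helper for illustration; assumes the SDV 1.x API.
    """
    # Imported inside the function so this sketch loads even where SDV is absent.
    from sdv.metadata import SingleTableMetadata
    from sdv.single_table import GaussianCopulaSynthesizer

    # Infer column types from the real data.
    metadata = SingleTableMetadata()
    metadata.detect_from_dataframe(real_data)

    # Learn the patterns in the real data, then generate new rows.
    synthesizer = GaussianCopulaSynthesizer(metadata)
    synthesizer.fit(real_data)
    return synthesizer.sample(num_rows=num_rows)
```

Calling `train_and_sample(df, num_rows=100)` on a pandas DataFrame `df` should return a new DataFrame with the same columns. Gaussian Copula is only one of the available models and is used here because it trains quickly.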

📊 Evaluate & Visualize Synthetic Data

Diagnose problems and measure statistical quality. For even more insight, visualize synthetic vs. real data.
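A quality check could be sketched along these lines. This assumes the SDV 1.x evaluation module (`sdv.evaluation.single_table.evaluate_quality`) and a report object with a `get_score()` method; the helper name `quality_score` is hypothetical, and the exact signature may differ in your installed version.

```python
def quality_score(real_data, synthetic_data):
    """Return an overall statistical quality score for synthetic data.

    Hypothetical helper for illustration; assumes the SDV 1.x API.
    """
    # Imported inside the function so this sketch loads even where SDV is absent.
    from sdv.evaluation.single_table import evaluate_quality
    from sdv.metadata import SingleTableMetadata

    # Describe the table so the report knows each column's type.
    metadata = SingleTableMetadata()
    metadata.detect_from_dataframe(real_data)

    # Compare the real and synthetic tables and summarize the result.
    report = evaluate_quality(real_data, synthetic_data, metadata, verbose=False)
    return report.get_score()
```

Higher scores indicate that the synthetic data is statistically closer to the real data.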

⚙️ Customize your Synthesizer

Add business logic, control the data pre-processing rules, and select anonymization options for sensitive values.

Ready to take SDV to the next level?

SDV Enterprise offers more scalable synthesizers, deeper data understanding, and integrations. You can also deploy synthetic data applications enterprise-wide. To learn more about pricing and plans, visit our website.

Owned & Maintained by DataCebo

The SDV library is part of the greater Synthetic Data Vault Project, first created at MIT's Data to AI Lab in 2016. After four years of research and traction with enterprises, we created DataCebo in 2020 with the goal of growing the project.

Today, DataCebo is the proud developer of the SDV, the largest ecosystem for synthetic data generation & evaluation.

Copyright (c) 2023, DataCebo, Inc.