Welcome to the SDV!
The Synthetic Data Vault (SDV) is a Python library designed to be your one-stop shop for creating tabular synthetic data.
SDV Community is great for exploring the benefits of synthetic data. Train a generative AI model on your own simple datasets as a proof of concept, then create synthetic data that has the same patterns as your real data.
SDV Enterprise is available to licensed users. With SDV Enterprise, you'll have access to everything in SDV Community plus the ability to ...
Train your own generative AI model. Choose from a variety of AI algorithms designed for tabular data — single table, sequential, or multi-table (relational) data. Train your own synthesizer using your real data, and create any amount of synthetic data on-demand. SDV is designed to work on-prem, with standard CPUs.
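To make the fit-then-sample workflow concrete, here is a minimal, stdlib-only sketch. It is not the SDV API or any SDV algorithm: `ToyColumnSynthesizer`, its methods, and the sample rows are all hypothetical. The toy model learns each column independently, so it ignores the relationships between columns that a real synthesizer is meant to capture.

```python
import random
import statistics


class ToyColumnSynthesizer:
    """Hypothetical toy model that learns every column on its own."""

    def fit(self, rows):
        # Learn one simple description per column from the real rows.
        self.columns = {}
        for name in rows[0]:
            values = [row[name] for row in rows]
            if all(isinstance(v, (int, float)) for v in values):
                # Numeric column: remember its mean and standard deviation.
                self.columns[name] = (
                    "numeric",
                    statistics.mean(values),
                    statistics.pstdev(values),
                )
            else:
                # Any other column: remember the observed values so that
                # sampling reproduces their frequencies.
                self.columns[name] = ("categorical", values)
        return self

    def sample(self, num_rows, seed=None):
        # Draw each column independently from what was learned in fit().
        rng = random.Random(seed)
        synthetic = []
        for _ in range(num_rows):
            row = {}
            for name, spec in self.columns.items():
                if spec[0] == "numeric":
                    row[name] = rng.gauss(spec[1], spec[2])
                else:
                    row[name] = rng.choice(spec[1])
            synthetic.append(row)
        return synthetic


# Made-up "real" data, used only for illustration.
real_rows = [
    {"age": 34, "plan": "basic"},
    {"age": 45, "plan": "premium"},
    {"age": 29, "plan": "basic"},
    {"age": 52, "plan": "basic"},
]

synthesizer = ToyColumnSynthesizer().fit(real_rows)
synthetic_rows = synthesizer.sample(num_rows=1000, seed=0)
```

Once fitted, the same model can produce as many rows as you ask for, which is the idea behind creating synthetic data on demand.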
Evaluate & visualize synthetic data. Measure the statistical quality of your synthetic data and diagnose problems. For even more insight, create visualizations that compare your synthetic data with your real data.
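One simple way to put a number on statistical quality is to compare how often each category appears in a real column and in its synthetic counterpart. The sketch below scores a single categorical column as 1 minus the total variation distance between the two frequency distributions, so 1.0 means identical frequencies and 0.0 means no overlap. The function name and the sample values are hypothetical, and this illustrates the idea rather than the SDV's own evaluation code.

```python
from collections import Counter


def category_similarity(real_values, synthetic_values):
    """Return 1 minus the total variation distance between two columns."""
    real_freq = Counter(real_values)
    synth_freq = Counter(synthetic_values)
    categories = set(real_freq) | set(synth_freq)
    # Total variation distance: half the summed absolute difference
    # between the two relative-frequency distributions.
    distance = 0.5 * sum(
        abs(real_freq[c] / len(real_values) - synth_freq[c] / len(synthetic_values))
        for c in categories
    )
    return 1.0 - distance


# Made-up columns, used only for illustration.
real_plans = ["basic", "basic", "basic", "premium"]
good_synthetic = ["basic"] * 6 + ["premium"] * 2  # same 75/25 split
poor_synthetic = ["premium"] * 4                  # only one category

good_score = category_similarity(real_plans, good_synthetic)  # 1.0
poor_score = category_similarity(real_plans, poor_synthetic)  # 0.25
```

A low score on one column is a useful pointer for diagnosis: it tells you which column to plot against the real data first.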
Customize your synthesizer. The SDV platform offers powerful features for creating higher-quality synthetic data. You can add constraints, adjust the data preprocessing, and select anonymization options for any SDV synthesizer.
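A constraint is a rule that every synthetic row must obey, for example "checkout happens after check-in". One generic way to enforce such a rule is reject sampling: keep drawing rows and discard the ones that break it. The sketch below is a stdlib-only illustration under that assumption. The helper `sample_with_constraint`, the column names, and the random row generator are hypothetical, and nothing here is meant to describe how the SDV implements constraints.

```python
import random


def sample_with_constraint(sample_row, is_valid, num_rows, max_tries=10_000):
    """Draw rows from sample_row() and keep only those where is_valid(row)."""
    rows = []
    tries = 0
    while len(rows) < num_rows:
        if tries >= max_tries:
            raise RuntimeError("constraint is too hard to satisfy by rejection")
        tries += 1
        row = sample_row()
        if is_valid(row):
            rows.append(row)
    return rows


rng = random.Random(0)


def sample_stay():
    # Stand-in for an unconstrained synthesizer: two independent days.
    return {
        "checkin_day": rng.randint(1, 28),
        "checkout_day": rng.randint(1, 28),
    }


def checkout_after_checkin(row):
    return row["checkout_day"] > row["checkin_day"]


stays = sample_with_constraint(sample_stay, checkout_after_checkin, num_rows=100)
```

Rejection is easy to reason about but wasteful when valid rows are rare, which is why the `max_tries` guard is there.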
Get started with the publicly available , distributed under the .
Get started now! Check out the SDV Community and .
SDV Enterprise adds the ability to:
Create synthetic data for large numbers of complex, interconnected data tables using scalable synthesizers
Improve the quality of your synthetic data with more advanced data preprocessing, deeper data understanding, and enhanced AI algorithms
Easily integrate data sources and deploy synthetic data applications enterprise-wide
To learn more, visit the SDV Enterprise page.
The SDV library is a part of the greater , first created at MIT's Data to AI Lab in 2016. After 4 years of research and traction with enterprise, we created DataCebo in 2020 with the goal of growing the project.
Today, DataCebo is the proud developer of the SDV, the largest ecosystem for synthetic data generation & evaluation.