# Welcome to the SDV!

The **Synthetic Data Vault** (SDV) is a Python library designed to be your one-stop shop for creating tabular synthetic data.

<figure><img src="https://1967107441-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FfNxEeZzl9uFiJ4Zf4BRZ%2Fuploads%2FOBG7qwnU9TkQQpIwga5i%2FSDV%20Intro.png?alt=media&#x26;token=3bf4ade2-2943-44d8-8048-bbc1fba6d690" alt=""><figcaption></figcaption></figure>

## Key Features

:brain: **Train your own generative AI model.** Choose from a variety of AI algorithms designed for tabular data: single table, sequential, or multi-table (relational). Train your own synthesizer using your real data, and create any amount of synthetic data on demand. SDV is designed to work on-prem, with standard CPUs.

:bar\_chart: **Evaluate & visualize synthetic data.** Measure the statistical quality of your synthetic data and diagnose problems. For even more insight, create visualizations that compare your synthetic data with your real data.

:gear: **Customize your synthesizer.** The SDV platform offers powerful features for creating higher quality synthetic data. You can add constraints, adjust the data preprocessing, and select anonymization options for any SDV synthesizer.

## Get started with SDV Community

Get started with the publicly available [**SDV Community**](https://docs.sdv.dev/sdv/explore/sdv-community), distributed under the [Business Source License](https://github.com/sdv-dev/SDV/blob/main/LICENSE).

```bash
pip install sdv
```

SDV Community is great for exploring the benefits of synthetic data. Train a generative AI model on your own simple datasets as a proof of concept, then create synthetic data that has the same patterns as your real data.

```python
import pandas as pd
from sdv.single_table import GaussianCopulaSynthesizer
from sdv.metadata import Metadata

# Load your real data and let SDV detect column types from it
data = pd.read_csv('my_data_file.csv')
metadata = Metadata.detect_from_dataframe(data)

# Train a synthesizer on the real data
synthesizer = GaussianCopulaSynthesizer(metadata)
synthesizer.fit(data)

# Create as many synthetic rows as you need
synthetic_data = synthesizer.sample(num_rows=1000)
```

**Get started now!** Check out the SDV Community [installation guide](https://docs.sdv.dev/sdv/explore/sdv-community#installation) and [tutorials](https://docs.sdv.dev/sdv/tutorials).

## Take synthetic data to the next level with SDV Enterprise

**SDV Enterprise** is available to licensed users. With SDV Enterprise, you'll have access to everything in SDV Community *plus the ability to ...*

:white\_check\_mark: Create synthetic data for large numbers of complex, interconnected data tables using scalable synthesizers

:white\_check\_mark: Improve the quality of your synthetic data with more advanced data preprocessing, deeper data understanding, and enhanced AI algorithms

:white\_check\_mark: Easily integrate data sources and deploy synthetic data applications enterprise-wide

To learn more, visit the [**SDV Enterprise**](https://docs.sdv.dev/sdv/explore/sdv-enterprise) page.

## Owned & Maintained by DataCebo

The SDV library is part of the greater [Synthetic Data Vault Project](https://sdv.dev/), first created at MIT's Data to AI Lab in 2016. After four years of research and traction with enterprises, we created DataCebo in 2020 with the goal of growing the project.

Today, [DataCebo](https://datacebo.com/) is the proud developer of the SDV, the largest ecosystem for synthetic data generation & evaluation.
