Explore SDV

SDV is available in Public or Enterprise formats. Use this page to determine which one is right for your project needs.

Public SDV

Explore synthetic data. Train a generative AI model on your own simple datasets as a proof of concept, then create synthetic data that has the same patterns.

Publicly available with a Business Source License. Get started today!

SDV Enterprise

Ready for scale? Expand synthetic data solutions in your enterprise. Create generative AI models for more complex datasets.

To learn more about pricing and plans, visit our website.

Features

AI-Based Synthesizers

These synthesizers use AI to learn patterns from your data and then use those patterns to create synthetic data.

Public SDV | SDV Enterprise

GaussianCopula: statistical AI

CTGAN, TVAE, CopulaGAN: neural networks

PAR: for sequential data

HMA: multi-table for limited tables (<5)

HSA: multi-table for unlimited tables

Independent: multi-table for unlimited tables
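To make "learning patterns and using them to create synthetic data" concrete, here is a minimal sketch of the idea behind a Gaussian copula for two numerical columns. It uses only the Python standard library and is not the SDV implementation; the function names (`fit`, `sample`, `normal_scores`) and the example columns are assumptions made for illustration.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()


def normal_scores(values):
    """Map each value to a standard-normal score based on its rank."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    scores = [0.0] * n
    for rank, idx in enumerate(order):
        # (rank + 0.5) / n keeps the probability strictly inside (0, 1).
        scores[idx] = STD_NORMAL.inv_cdf((rank + 0.5) / n)
    return scores


def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def fit(col_a, col_b):
    """Learn the marginals (sorted values) and the copula correlation."""
    rho = pearson(normal_scores(col_a), normal_scores(col_b))
    rho = max(-1.0, min(1.0, rho))  # guard against rounding past +/-1
    return {"a": sorted(col_a), "b": sorted(col_b), "rho": rho}


def quantile(sorted_values, u):
    """Empirical quantile of a sorted column for a probability u."""
    index = min(int(u * len(sorted_values)), len(sorted_values) - 1)
    return sorted_values[index]


def sample(model, num_rows, seed=0):
    """Draw correlated normal pairs and map them back to the marginals."""
    rng = random.Random(seed)
    rho = model["rho"]
    rows = []
    for _ in range(num_rows):
        z1 = rng.gauss(0, 1)
        z2 = rho * z1 + math.sqrt(max(0.0, 1 - rho ** 2)) * rng.gauss(0, 1)
        u1, u2 = STD_NORMAL.cdf(z1), STD_NORMAL.cdf(z2)
        rows.append((quantile(model["a"], u1), quantile(model["b"], u2)))
    return rows


# Illustrative "real" data: two positively related columns.
ages = [23, 35, 41, 29, 52, 60, 33, 47]
incomes = [31000, 44000, 55000, 39000, 72000, 80000, 45000, 63000]

model = fit(ages, incomes)
synthetic_rows = sample(model, num_rows=200)
```

The synthetic rows reuse the observed value ranges of each column while keeping the positive relationship between them. Production synthesizers handle many columns, mixed data types, and missing values, which this sketch does not attempt.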

Test Data Synthesizers

These synthesizers create random test data based on metadata alone. They do not use AI so you do not need to input any training data.

Public SDV | SDV Enterprise

DayZSynthesizer: single table

DayZSynthesizer: multi-table
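The idea of creating random test data from metadata alone can be sketched in a few lines. This is not the DayZSynthesizer API; the metadata layout, the column types, and the function names below are hypothetical and exist only for this illustration.

```python
import random
import string

# Hypothetical metadata layout, invented for this sketch.
metadata = {
    "user_id": {"type": "id", "length": 8},
    "age": {"type": "numerical", "min": 18, "max": 90},
    "plan": {"type": "categorical", "values": ["free", "pro", "team"]},
}


def random_value(spec, rng):
    """Create one random value that matches a column description."""
    if spec["type"] == "id":
        letters = (rng.choice(string.ascii_uppercase) for _ in range(spec["length"]))
        return "".join(letters)
    if spec["type"] == "numerical":
        return rng.randint(spec["min"], spec["max"])
    if spec["type"] == "categorical":
        return rng.choice(spec["values"])
    raise ValueError(f"Unsupported column type: {spec['type']}")


def generate_test_rows(metadata, num_rows, seed=0):
    """Generate rows from the metadata only; no training data is read."""
    rng = random.Random(seed)
    return [
        {name: random_value(spec, rng) for name, spec in metadata.items()}
        for _ in range(num_rows)
    ]


test_rows = generate_test_rows(metadata, num_rows=5)
```

Because nothing is learned from real data, the output respects the declared types and ranges but carries no statistical patterns.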

Integrate Your Data

These features make it easy to integrate the SDV into your application and pipeline.

Public SDV | SDV Enterprise

Auto-detect metadata from CSV files or DataFrames

Auto-detect metadata with a DDL file from an SQL schema
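As a rough picture of what auto-detecting metadata from a CSV involves, the sketch below infers a coarse type for each column using only the standard library. It is an assumption-laden illustration, not SDV's detection logic: the type names, the single supported date format (`%Y-%m-%d`), and the function names are all hypothetical.

```python
import csv
import io
from datetime import datetime


def infer_type(values):
    """Guess a coarse column type from its string values."""
    non_missing = [v for v in values if v != ""]
    if not non_missing:
        return "unknown"

    def all_parse(parser):
        try:
            for value in non_missing:
                parser(value)
            return True
        except ValueError:
            return False

    if all_parse(float):
        return "numerical"
    if all_parse(lambda v: datetime.strptime(v, "%Y-%m-%d")):
        return "datetime"
    return "categorical"


def detect_metadata(csv_text):
    """Return a {column name: inferred type} mapping for CSV text."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    columns = list(rows[0].keys()) if rows else []
    return {col: infer_type([row[col] for row in rows]) for col in columns}


csv_text = """user_id,age,signup_date,plan
1,34,2023-01-15,free
2,,2023-02-01,pro
3,51,2023-03-20,free
"""

detected = detect_metadata(csv_text)
```

Note that `user_id` is detected as numerical here; automatic detection is a starting point, and columns such as keys usually need to be reviewed and corrected by hand.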

Pre-Process Statistical Information

Transformers are used to pre-process your data, which can improve data quality. SDV synthesizers select transformers by default, but you can always customize these to your dataset.

Public SDV | SDV Enterprise

FloatFormatter: missing value imputation, numerical columns

ClusterBased and Gaussian Normalizers: statistical transforms

Uniform, Label, and OneHot Encoding: for discrete variables

Datetime Encoding: including datetime format parsing

OutlierEncoder: for numerical outliers
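Pre-processing transformers are generally reversible: data is converted to a numerical form for modeling and converted back afterwards. The sketch below shows that round trip for a simple label encoding. The class name `LabelEncoderSketch` and its methods are hypothetical and written for this page; they are not the transformer classes listed above.

```python
class LabelEncoderSketch:
    """Map each category to an integer code and back again."""

    def fit(self, values):
        # Sort so that the assigned codes are deterministic.
        self.categories = sorted(set(values))
        self.to_code = {cat: code for code, cat in enumerate(self.categories)}
        return self

    def transform(self, values):
        return [self.to_code[value] for value in values]

    def reverse_transform(self, codes):
        return [self.categories[code] for code in codes]


plans = ["pro", "free", "team", "free"]
encoder = LabelEncoderSketch().fit(plans)

codes = encoder.transform(plans)              # [1, 0, 2, 0]
restored = encoder.reverse_transform(codes)   # ["pro", "free", "team", "free"]
```

A synthesizer would model the integer codes and then apply the reverse transform to its samples so that the synthetic data contains the original category labels.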

Understand & Anonymize Real-World Concepts

These transformers pre-process columns that correspond to industry- or domain-specific concepts, such as emails, addresses, and phone numbers. The structure of these columns may be human-created.

Public SDV | SDV Enterprise

RegexGenerator, IDGenerator: for keys and IDs

AnonymizedFaker: general-purpose anonymization

PseudoAnonymizedFaker: general pseudo-anonymization with a mapping

Emails: understanding domains

Addresses: understanding locations

Phone Numbers: understanding country and area codes

[Coming soon!] GPS Coordinates: understanding geographical areas and distances
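Pseudo-anonymization with a mapping means that each real value is consistently replaced by the same fake value, and the mapping is kept so the replacement can be traced back. A minimal sketch of that behavior follows. The class `PseudoAnonymizerSketch` and the pool of fake values are hypothetical; this is not the PseudoAnonymizedFaker implementation.

```python
import random


class PseudoAnonymizerSketch:
    """Replace real values with fake ones using a stored one-to-one mapping."""

    def __init__(self, fake_values, seed=0):
        pool = list(fake_values)
        random.Random(seed).shuffle(pool)
        self._pool = pool
        self.mapping = {}  # real value -> fake value

    def transform(self, values):
        result = []
        for value in values:
            if value not in self.mapping:
                if not self._pool:
                    raise ValueError("Not enough fake values for a one-to-one mapping")
                self.mapping[value] = self._pool.pop()
            result.append(self.mapping[value])
        return result

    def reverse_transform(self, values):
        reverse = {fake: real for real, fake in self.mapping.items()}
        return [reverse[value] for value in values]


names = ["Alice", "Bob", "Alice", "Carol"]
anonymizer = PseudoAnonymizerSketch(["User-A", "User-B", "User-C", "User-D"])
pseudonyms = anonymizer.transform(names)
```

Both occurrences of "Alice" receive the same pseudonym. Because the mapping makes the substitution reversible, it should be stored as carefully as the original data.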

Constraints

Constraints represent business rules and logic that you can apply to your synthesizer.

Public SDV | SDV Enterprise

Predefined logic for individual columns: FixedIncrements, Negative, Positive, ScalarInequality, ScalarRange

Predefined logic for multiple columns: FixedCombinations, Inequality, OneHotEncoding, Range

Write your own custom constraints

Advanced, predefined logic: ChainedInequality

Support for custom constraints and additional predefined logic
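To illustrate what a business rule looks like in practice, the sketch below expresses two rules as row-level checks and filters out rows that violate them. The helper names borrow from the constraint names above for readability, but the functions, their signatures, and the inclusive bounds are assumptions of this sketch rather than the SDV constraint API, and filtering is only one of several possible ways to enforce a rule.

```python
def inequality(low_column, high_column):
    """Rule: the value in low_column must not exceed the value in high_column."""
    return lambda row: row[low_column] <= row[high_column]


def scalar_range(column, low, high):
    """Rule: the value in column must lie between low and high (inclusive here)."""
    return lambda row: low <= row[column] <= high


def filter_valid(rows, constraints):
    """Keep only the rows that satisfy every constraint."""
    return [row for row in rows if all(check(row) for check in constraints)]


rows = [
    {"checkin_day": 3, "checkout_day": 5, "room_rate": 120},
    {"checkin_day": 9, "checkout_day": 7, "room_rate": 150},   # checkout before checkin
    {"checkin_day": 1, "checkout_day": 2, "room_rate": 5000},  # rate out of range
]

constraints = [
    inequality("checkin_day", "checkout_day"),
    scalar_range("room_rate", 50, 1000),
]

valid_rows = filter_valid(rows, constraints)
```

Only the first row passes both rules. Applying constraints to a synthesizer serves the same purpose: the generated data should never contain rows like the second or third.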

Synthetic Data Evaluation

Evaluate your synthetic data by comparing it against the real data.

Public SDV | SDV Enterprise

Access to SDMetrics library: vendor-agnostic, open source

Diagnostic Report: basic data validity checks, single and multi-table

Quality Report: statistical similarity, single and multi-table

Visualization: 1D and 2D bars, scatterplots, heatmaps and more

Use case-specific metrics: OutlierCoverage, SmoothnessSimilarity
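Comparing synthetic data against real data usually comes down to scoring how similar two distributions are. As one hedged example, the sketch below computes the complement of the total variation distance between the category frequencies of a real and a synthetic column, giving a score between 0 (no overlap) and 1 (identical frequencies). The function `tv_complement` is written for this page and is not a call into the SDMetrics library.

```python
from collections import Counter


def tv_complement(real_values, synthetic_values):
    """Return 1 minus the total variation distance between category frequencies."""
    real_counts = Counter(real_values)
    synthetic_counts = Counter(synthetic_values)
    n_real, n_synthetic = len(real_values), len(synthetic_values)

    categories = set(real_counts) | set(synthetic_counts)
    distance = 0.5 * sum(
        abs(real_counts[cat] / n_real - synthetic_counts[cat] / n_synthetic)
        for cat in categories
    )
    return 1 - distance


real_plans = ["free", "free", "pro", "pro"]
synthetic_plans = ["free", "pro", "pro", "pro"]

score = tv_complement(real_plans, synthetic_plans)  # 0.75
```

Here the real column is half "free" and half "pro", while the synthetic column is one quarter "free", so the score drops from 1.0 to 0.75. A full quality report combines many such column-level and column-pair scores.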

Copyright (c) 2023, DataCebo, Inc.