Explore SDV

SDV is available in Community and Enterprise editions.

SDV Community

Explore synthetic data. Train a generative AI model on your own simple datasets as a proof of concept, then create synthetic data that follows the same patterns.

Publicly available with a Business Source License. Get started today!

SDV Enterprise

Ready for scale? Expand synthetic data solutions across your enterprise. Create generative AI models for more complex datasets.

Reach out to ask about pricing and plans. Visit our website.

SDV Bundles

SDV Enterprise users have the option to purchase any of the bundles below for access to additional features. Reach out to ask about pricing and plans. Visit our website.

Use Constraint Augmented Generation (CAG) to apply complex logic between multiple tables.

Take your gen AI models to the next level with enhanced synthesizers and transformers.

More coming soon!

Check back for additional bundle offerings.

Full Feature List

AI-Based Synthesizers

These synthesizers use AI to learn patterns from your data and then use those patterns to create synthetic data.

SDV Community | SDV Enterprise

GaussianCopula statistical AI

CTGAN, TVAE, CopulaGAN neural networks

XGC advanced Copula modeling with flexible shapes, faster runtime and more

💠 XSynthesis bundle

SegmentSynthesizer for separately modeling highly segmented data

💠 XSynthesis bundle

PAR for sequential data

HMA multi-table for limited tables (<5)

HSA multi-table for unlimited tables

Independent multi-table for unlimited tables

Performance estimates for multi-table synthesizers with various dataset sizes
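
To give a feel for what "statistical AI" means for a copula-based synthesizer, here is a minimal sketch of the general Gaussian copula idea for two numerical columns. It uses only the Python standard library and is an illustration under simplifying assumptions, not SDV's implementation; the helper names (`to_normal_scores`, `sample_two_columns`) and the example data are hypothetical.

```python
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def to_normal_scores(values):
    """Map each value to a standard-normal score via its rank (empirical CDF)."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    scores = [0.0] * n
    for rank, idx in enumerate(order):
        # (rank + 0.5) / n keeps the probability strictly inside (0, 1)
        scores[idx] = STD_NORMAL.inv_cdf((rank + 0.5) / n)
    return scores


def pearson(xs, ys):
    """Pearson correlation of two equally long lists."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def from_normal_score(score, sorted_values):
    """Map a normal score back to an empirical quantile of the real column."""
    u = STD_NORMAL.cdf(score)
    idx = min(int(u * len(sorted_values)), len(sorted_values) - 1)
    return sorted_values[idx]


def sample_two_columns(col_a, col_b, num_rows, seed=0):
    """Learn the correlation in normal space, then sample correlated rows."""
    rng = random.Random(seed)
    rho = pearson(to_normal_scores(col_a), to_normal_scores(col_b))
    residual = max(0.0, 1 - rho ** 2) ** 0.5  # guard against float round-off
    sorted_a, sorted_b = sorted(col_a), sorted(col_b)
    rows = []
    for _ in range(num_rows):
        z1 = rng.gauss(0, 1)
        z2 = rho * z1 + residual * rng.gauss(0, 1)
        rows.append((from_normal_score(z1, sorted_a),
                     from_normal_score(z2, sorted_b)))
    return rows


# Hypothetical example data
ages = [22, 35, 41, 29, 58, 63, 47, 31]
incomes = [28, 52, 61, 40, 90, 95, 70, 45]
synthetic_rows = sample_two_columns(ages, incomes, num_rows=200)
```

The sketch resamples observed values through empirical quantiles, so it never produces values outside the real data. A production synthesizer would typically fit parametric distributions per column instead.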

Test Data Synthesizers

These synthesizers create random test data based on metadata alone. They do not use AI, so you do not need to provide any training data.

SDV Community | SDV Enterprise

DayZSynthesizer single table

DayZSynthesizer multi table
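
The idea of generating test data from metadata alone can be sketched as follows. This is a conceptual illustration in plain Python, not the DayZSynthesizer API; the metadata dictionary is a simplified, hypothetical format and the value ranges are arbitrary assumptions.

```python
import random
from datetime import date, timedelta

# Hypothetical, simplified metadata: column name -> data type.
# This is not SDV's full metadata schema.
METADATA = {
    'user_id': 'id',
    'age': 'numerical',
    'is_active': 'boolean',
    'signup_date': 'datetime',
}


def random_value(sdtype, row_index, rng):
    """Create one random value for a column, using only its declared type."""
    if sdtype == 'id':
        return f'id_{row_index:06d}'  # sequential, so always unique
    if sdtype == 'numerical':
        return rng.randint(0, 100)  # assumed range, no real data involved
    if sdtype == 'boolean':
        return rng.random() < 0.5
    if sdtype == 'datetime':
        return date(2020, 1, 1) + timedelta(days=rng.randint(0, 365))
    raise ValueError(f'Unsupported type: {sdtype}')


def generate_test_rows(metadata, num_rows, seed=0):
    """Generate rows from metadata alone; no training data is required."""
    rng = random.Random(seed)
    return [
        {column: random_value(sdtype, i, rng) for column, sdtype in metadata.items()}
        for i in range(num_rows)
    ]


test_rows = generate_test_rows(METADATA, num_rows=5)
```

Because no patterns are learned, the output is structurally valid but statistically unrelated to any real dataset, which is usually what test data needs.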

Integrate Your Data

These features make it easy to integrate SDV into your application and pipeline.

SDV Community | SDV Enterprise

Auto-detect metadata using data CSVs or DataFrames

Directly connect to a database for importing real data and creating metadata

Connect to a database for exporting synthetic data
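
Metadata auto-detection boils down to inferring a type for each column from the data itself. The sketch below shows the general idea for CSV text using only the standard library. It is a simplified illustration, not SDV's detection logic; the function names, the type labels and the single supported date format are assumptions.

```python
import csv
import io
from datetime import datetime


def infer_type(values):
    """Guess a column type from its string values (empty strings = missing)."""
    present = [v for v in values if v != '']
    if not present:
        return 'unknown'

    def all_parse(parser):
        try:
            for value in present:
                parser(value)
            return True
        except ValueError:
            return False

    if all_parse(float):
        return 'numerical'
    # Assumption: only ISO-style dates (YYYY-MM-DD) are recognized here
    if all_parse(lambda v: datetime.strptime(v, '%Y-%m-%d')):
        return 'datetime'
    return 'categorical'


def detect_metadata(csv_text):
    """Return a {column: inferred type} mapping for CSV content."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    columns = list(rows[0].keys()) if rows else []
    return {column: infer_type([row[column] for row in rows]) for column in columns}


# Hypothetical example data
SAMPLE_CSV = (
    'age,signup_date,plan\n'
    '31,2021-03-04,basic\n'
    ',2022-11-20,premium\n'
    '45,2020-07-15,basic\n'
)
detected = detect_metadata(SAMPLE_CSV)
```

Auto-detected types are a starting point. It is worth reviewing them, since a numeric-looking column may really be an ID or a category.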

Pre-Process Statistical Information

Transformers are used to pre-process your data, which can improve data quality. SDV synthesizers select transformers by default, but you can always customize these to your dataset.

SDV Community | SDV Enterprise

FloatFormatter for missing value imputation, numerical columns

ClusterBasedNormalizer and GaussianNormalizer statistical transforms

XGaussianNormalizer with support for 100+ statistical distributions

💠 XSynthesis bundle

Uniform, Label, and OneHot Encoding for discrete variables

Datetime Encoding including datetime format parsing

OutlierEncoder for numerical outliers
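
As a rough picture of what pre-processing transformers do, the sketch below imputes missing numerical values and one-hot encodes a discrete column, then reverses the encoding. It is a plain-Python illustration of the concepts, not the API or exact behavior of the transformers listed above; all function names are hypothetical.

```python
from statistics import mean


def impute_missing(values):
    """Fill None with the mean of observed values and flag which rows were missing."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    filled = [fill if v is None else v for v in values]
    was_missing = [v is None for v in values]
    return filled, was_missing


def one_hot_encode(values):
    """Turn each category into a 0/1 vector with one position per category."""
    categories = sorted(set(values))
    encoded = [[1 if v == c else 0 for c in categories] for v in values]
    return encoded, categories


def one_hot_decode(encoded, categories):
    """Reverse the encoding by picking the category with the largest entry."""
    return [categories[row.index(max(row))] for row in encoded]


# Hypothetical example data
filled, was_missing = impute_missing([1.0, None, 3.0])
encoded, categories = one_hot_encode(['red', 'blue', 'red'])
decoded = one_hot_decode(encoded, categories)
```

Keeping the missing-value flag and the category list is what makes the transformation reversible, so synthetic data can be converted back into the original format.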

Understand & Anonymize Real-World Concepts

These transformers are geared towards columns that correspond to industry or domain-specific concepts. Such columns often follow a human-defined structure, such as an email address or a phone number.

SDV Community | SDV Enterprise

RegexGenerator, IDGenerator for keys and IDs

AnonymizedFaker general-purpose anonymization

PseudoAnonymizedFaker for general pseudo-anonymization with a mapping

Emails understanding domains

Addresses understanding locations

Phone Numbers understanding country and area codes

GPS Coordinates understanding geographical areas and distances
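
Pseudo-anonymization with a mapping means that each real value is consistently replaced by the same fake value. A minimal sketch of that idea is shown below. The `PseudoAnonymizer` class and its token format are hypothetical and do not reflect the behavior of any specific SDV transformer.

```python
import random
import string


class PseudoAnonymizer:
    """Replace each real value with a stable, randomly generated token."""

    def __init__(self, seed=0):
        self._rng = random.Random(seed)
        self._used = set()
        self.mapping = {}  # real value -> fake token

    def _new_token(self):
        # Retry until the token is unused so two real values never collide
        while True:
            suffix = ''.join(self._rng.choices(string.ascii_lowercase, k=8))
            token = f'user_{suffix}'
            if token not in self._used:
                self._used.add(token)
                return token

    def transform(self, values):
        anonymized = []
        for value in values:
            if value not in self.mapping:
                self.mapping[value] = self._new_token()
            anonymized.append(self.mapping[value])
        return anonymized


# Hypothetical example data
anonymizer = PseudoAnonymizer(seed=42)
tokens = anonymizer.transform(
    ['alice@example.com', 'bob@example.com', 'alice@example.com']
)
```

The stored mapping is itself sensitive: anyone who holds it can link tokens back to real values, so it should be protected like the original data.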

Constraints

Constraints represent business rules and logic that you can apply to your synthesizer.

SDV Community | SDV Enterprise

Predefined logic for individual columns: FixedIncrements, Negative, Positive, ScalarInequality, ScalarRange

Predefined logic for multiple columns: FixedCombinations, Inequality, OneHotEncoding, Range

Write your own custom constraints

Advanced, predefined logic: ChainedInequality

Support for custom constraints and additional predefined logic

Advanced predefined logic: FixedNullCombinations, MixedScales

💠 CAG bundle

Advanced, multi-table logic & algorithms: CarryOverColumns, CompositeKey, ForeignToPrimaryKeySubset, UniqueBridgeTable, and more.

💠 CAG bundle
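
One common way to enforce a business rule such as "column A must not exceed column B" is to transform the data so that the rule cannot be broken, model the transformed data, and then reverse the transformation. The sketch below illustrates this for an inequality between two columns. It is one possible approach under stated assumptions, not necessarily how SDV implements its constraints; the function names and the `_log_diff` helper column are hypothetical.

```python
import math

DIFF_COLUMN = '_log_diff'  # hypothetical helper column name


def transform(rows, low, high):
    """Replace the high column with log(high - low + 1), which is unconstrained."""
    transformed = []
    for row in rows:
        if row[high] < row[low]:
            raise ValueError(f'Row violates {low} <= {high}: {row}')
        new_row = {k: v for k, v in row.items() if k != high}
        new_row[DIFF_COLUMN] = math.log(row[high] - row[low] + 1)
        transformed.append(new_row)
    return transformed


def reverse_transform(rows, low, high):
    """Rebuild the high column; the gap is clipped at 0 so low <= high always holds."""
    restored = []
    for row in rows:
        new_row = {k: v for k, v in row.items() if k != DIFF_COLUMN}
        gap = max(0.0, math.exp(row[DIFF_COLUMN]) - 1)
        new_row[high] = row[low] + gap
        restored.append(new_row)
    return restored


# Hypothetical example data
real_rows = [
    {'check_in': 3, 'check_out': 8},
    {'check_in': 10, 'check_out': 10},
]
round_trip = reverse_transform(
    transform(real_rows, 'check_in', 'check_out'), 'check_in', 'check_out'
)

# Even an arbitrary modeled value in the helper column yields a valid row
modeled = reverse_transform(
    [{'check_in': 5, DIFF_COLUMN: -2.7}], 'check_in', 'check_out'
)
```

The alternative is to sample freely and reject invalid rows, which is simpler but can become slow when the rule is rarely satisfied by chance.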

Synthetic Data Evaluation

Evaluate your synthetic data by comparing it against the real data.

SDV Community | SDV Enterprise

Access to SDMetrics library vendor-agnostic, open source

Diagnostic Report basic data validity checks, single and multi-table

Quality Report statistical similarity, single and multi-table

Visualization 1D and 2D bars, scatterplots, heatmaps and more

Use case-specific metrics: OutlierCoverage, SmoothnessSimilarity
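
To make "statistical similarity" concrete, the sketch below scores how closely a synthetic numerical column matches the real one, using the complement of the two-sample Kolmogorov-Smirnov statistic. It is a standard-library illustration of one possible similarity score, not the SDMetrics API; the function name and example data are hypothetical.

```python
from bisect import bisect_right


def ks_complement(real, synthetic):
    """Return 1 minus the largest gap between the two empirical CDFs.

    1.0 means the distributions look identical; 0.0 means they do not overlap.
    """
    real_sorted = sorted(real)
    synth_sorted = sorted(synthetic)
    max_gap = 0.0
    # The empirical CDFs only change at observed values, so checking those suffices
    for x in set(real) | set(synthetic):
        cdf_real = bisect_right(real_sorted, x) / len(real_sorted)
        cdf_synth = bisect_right(synth_sorted, x) / len(synth_sorted)
        max_gap = max(max_gap, abs(cdf_real - cdf_synth))
    return 1.0 - max_gap


# Hypothetical example data
real_ages = [23, 31, 35, 42, 58]
good_synthetic = [24, 30, 36, 41, 57]
poor_synthetic = [80, 85, 90, 95, 99]

good_score = ks_complement(real_ages, good_synthetic)
poor_score = ks_complement(real_ages, poor_synthetic)
```

A score like this covers a single column. A full report would combine scores across columns and also compare relationships between pairs of columns.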

Copyright (c) 2023, DataCebo, Inc.