Explore SDV
SDV is available in Community and Enterprise editions.
SDV Community
Explore synthetic data. Train a generative AI model on your own simple datasets as a proof of concept. Create synthetic data that has the same patterns as your real data.
Publicly available with a Business Source License. Get started today!
SDV Enterprise
Ready for scale? Expand synthetic data solutions in your enterprise. Create generative AI models for more complex datasets.
Reach out to ask about pricing and plans. Visit our website.
SDV Bundles
SDV Enterprise users have the option to purchase any of the bundles below for access to additional features. Reach out to ask about pricing and plans. Visit our website.
Use Constraint Augmented Generation (CAG) to apply complex logic between multiple tables.
Take your gen AI models to the next level with enhanced synthesizers and transformers.
More coming soon!
Check back for additional bundle offerings.
Full Feature List
AI-Based Synthesizers
These synthesizers use AI to learn patterns from your data and then use those patterns to create synthetic data.
Feature | SDV Community | SDV Enterprise |
---|---|---|
GaussianCopula statistical AI | ✅ | ✅ |
✅ | ||
XGC advanced Copula modeling with flexible shapes, faster runtime and more | ❌ | 💠 XSynthesis bundle |
SegmentSynthesizer for separately modeling highly segmented data | ❌ | 💠 XSynthesis bundle |
PAR for sequential data | ✅ | ✅ |
HMA multi-table for limited tables (<5) | ✅ | ✅ |
HSA multi-table for unlimited tables | ❌ | ✅ |
Independent multi-table for unlimited tables | ❌ | ✅ |
Performance estimates for multi-table synthesizers with various dataset sizes | ❌ | ✅ |
Test Data Synthesizers
These synthesizers create random test data based on metadata alone. They do not use AI so you do not need to input any training data.
Feature | SDV Community | SDV Enterprise |
---|---|---|
DayZSynthesizer single table | ❌ | ✅ |
DayZSynthesizer multi table | ❌ | ✅ |
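
To illustrate the idea of creating test data from metadata alone, here is a minimal sketch in plain Python. It is not the DayZSynthesizer API: the metadata layout (`sdtype`, `min`, `max`, `values`) and the function name `random_rows` are hypothetical and chosen only for illustration.

```python
import random

# Hypothetical metadata: column name -> description (not SDV's real format).
METADATA = {
    "user_id": {"sdtype": "id"},
    "age": {"sdtype": "numerical", "min": 18, "max": 90},
    "plan": {"sdtype": "categorical", "values": ["free", "pro", "team"]},
}


def random_rows(metadata, num_rows, seed=None):
    """Create random rows from column descriptions only; no training data is used."""
    rng = random.Random(seed)
    rows = []
    for i in range(num_rows):
        row = {}
        for name, spec in metadata.items():
            if spec["sdtype"] == "id":
                row[name] = i  # sequential, unique key
            elif spec["sdtype"] == "numerical":
                row[name] = rng.randint(spec["min"], spec["max"])
            elif spec["sdtype"] == "categorical":
                row[name] = rng.choice(spec["values"])
            else:
                raise ValueError(f"unsupported sdtype: {spec['sdtype']}")
        rows.append(row)
    return rows


rows = random_rows(METADATA, num_rows=5, seed=0)
```

Because the values are drawn at random within the declared bounds, the output is valid for testing an application but carries no statistical patterns from real data.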
Integrate Your Data
These features make it easy to integrate the SDV into your application and pipeline.
Feature | SDV Community | SDV Enterprise |
---|---|---|
Auto-detect metadata using data CSVs or DataFrames | ✅ | ✅ |
Directly connect to a database for importing real data and creating metadata | ❌ | ✅ |
Connect to a database for exporting synthetic data | ❌ | ✅ |
Pre-Process Statistical Information
Transformers are used to pre-process your data, which can improve data quality. SDV synthesizers select transformers by default, but you can always customize these to your dataset.
Feature | SDV Community | SDV Enterprise |
---|---|---|
FloatFormatter for missing value imputation, numerical columns | ✅ | ✅ |
ClusterBasedNormalizer and GaussianNormalizer statistical transforms | ✅ | ✅ |
XGaussianNormalizer with support for 100+ statistical distributions | ❌ | 💠 XSynthesis bundle |
✅ | ✅ | |
Datetime Encoding including datetime format parsing | ✅ | ✅ |
OutlierEncoder for numerical outliers | ❌ | ✅ |
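
As a rough picture of what a pre-processing transformer does, the sketch below converts datetime strings into numbers that a model can learn from, and converts numbers back into formatted strings afterwards. This is an illustration in plain Python, not the SDV transformer API: the class name `DatetimeEncoder` and its methods are assumptions made for this example.

```python
from datetime import datetime, timezone


class DatetimeEncoder:
    """Hypothetical reversible transformer: datetime strings <-> float timestamps."""

    def __init__(self, datetime_format):
        self.datetime_format = datetime_format

    def transform(self, values):
        """Parse each string into seconds since the Unix epoch (UTC); None stays None."""
        encoded = []
        for value in values:
            if value is None:
                encoded.append(None)
                continue
            parsed = datetime.strptime(value, self.datetime_format)
            encoded.append(parsed.replace(tzinfo=timezone.utc).timestamp())
        return encoded

    def reverse_transform(self, numbers):
        """Turn timestamps back into strings in the original format."""
        decoded = []
        for number in numbers:
            if number is None:
                decoded.append(None)
                continue
            moment = datetime.fromtimestamp(number, tz=timezone.utc)
            decoded.append(moment.strftime(self.datetime_format))
        return decoded


encoder = DatetimeEncoder("%Y-%m-%d")
original = ["2024-01-31", None, "1999-12-01"]
encoded = encoder.transform(original)
restored = encoder.reverse_transform(encoded)
```

The key property is reversibility: whatever the model generates in numeric form can be mapped back into the format of the original column.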
Understand & Anonymize Real-World Concepts
Transformers can also pre-process columns that correspond to industry- or domain-specific concepts, whose structure may be human-created. SDV synthesizers select these transformers by default, but you can always customize them for your dataset.
Feature | SDV Community | SDV Enterprise |
---|---|---|
RegexGenerator, IDGenerator for keys and IDs | ✅ | ✅ |
AnonymizedFaker general-purpose anonymization | ✅ | ✅ |
PseudoAnonymizedFaker for general pseudo-anonymization with a mapping | ✅ | ✅ |
Emails understanding domains | ❌ | ✅ |
Addresses understanding locations | ❌ | ✅ |
Phone Numbers understanding country and area codes | ❌ | ✅ |
GPS Coordinates understanding geographical areas and distances | ❌ | ✅ |
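
The difference between plain anonymization and pseudo-anonymization with a mapping can be sketched as follows: each real value is always replaced by the same fake value, and the mapping is kept so the replacement can be traced. This is a plain Python illustration, not the PseudoAnonymizedFaker implementation; the class name `PseudoAnonymizer` and the pool of fake values are hypothetical.

```python
import random


class PseudoAnonymizer:
    """Hypothetical sketch: replace each real value with a consistent fake value."""

    def __init__(self, fake_pool, seed=None):
        self._pool = list(fake_pool)
        random.Random(seed).shuffle(self._pool)
        self.mapping = {}  # real value -> fake value, kept for later lookup

    def transform(self, values):
        anonymized = []
        for value in values:
            if value not in self.mapping:
                if not self._pool:
                    raise ValueError("not enough fake values for the distinct real values")
                # Each fake value is handed out once, so distinct inputs stay distinct.
                self.mapping[value] = self._pool.pop()
            anonymized.append(self.mapping[value])
        return anonymized


anonymizer = PseudoAnonymizer(["user_a", "user_b", "user_c"], seed=0)
result = anonymizer.transform(["alice", "bob", "alice"])
```

Because the mapping is stored, anyone with access to it can link fake values back to real ones, so it needs to be protected like the real data.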
Constraints
Constraints represent business rules and logic that you can apply to your synthesizer.
Feature | SDV Community | SDV Enterprise |
---|---|---|
Predefined logic for individual columns: FixedIncrements, Negative, Positive, ScalarInequality, ScalarRange | ✅ | ✅ |
Predefined logic for multiple columns: FixedCombinations, Inequality, OneHotEncoding, Range | ✅ | ✅ |
Write your own custom constraints | ✅ | ✅ |
Advanced, predefined logic: ChainedInequality | ❌ | ✅ |
Support for custom constraints and additional predefined logic | ❌ | ✅ |
Advanced predefined logic: FixedNullCombinations, MixedScales | ❌ | 💠 CAG bundle |
Advanced, multi-table logic & algorithms: CarryOverColumns, CompositeKey, ForeignToPrimaryKeySubset, UniqueBridgeTable, and more. | ❌ | 💠 CAG bundle |
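
To make the notion of a business rule concrete, the sketch below expresses two rules as checks on a row: one in the spirit of an Inequality rule between two columns and one in the spirit of a Positive rule on a single column. It is plain Python for illustration only. The function names and sample columns are hypothetical, and this page does not describe how SDV enforces constraints internally; the simple filtering approach shown here is an assumption.

```python
def satisfies_inequality(row, low_column, high_column):
    """Return True when row[low_column] <= row[high_column]."""
    return row[low_column] <= row[high_column]


def enforce(rows, checks):
    """Keep only the rows that pass every check."""
    return [row for row in rows if all(check(row) for check in checks)]


candidates = [
    {"checkin": 3, "checkout": 5, "fee": 10.0},
    {"checkin": 7, "checkout": 2, "fee": 4.0},   # violates checkin <= checkout
    {"checkin": 1, "checkout": 1, "fee": -2.0},  # violates fee > 0
]

checks = [
    lambda row: satisfies_inequality(row, "checkin", "checkout"),
    lambda row: row["fee"] > 0,
]

valid_rows = enforce(candidates, checks)
```

Here only the first candidate row survives, since it is the only one that satisfies both rules.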
Synthetic Data Evaluation
Evaluate your synthetic data by comparing it against the real data.
Feature | SDV Community | SDV Enterprise |
---|---|---|
Access to SDMetrics library vendor-agnostic, open source | ✅ | ✅ |
Diagnostic Report basic data validity checks, single and multi-table | ✅ | ✅ |
Quality Report statistical similarity, single and multi-table | ✅ | ✅ |
Privacy Metrics: CategoricalCAP, NewRowSynthesis, Inference Attacks | ✅ | ✅ |
Visualization 1D and 2D bars, scatterplots, heatmaps and more | ✅ | ✅ |
Use case-specific metrics: OutlierCoverage, SmoothnessSimilarity | ❌ | ✅ |
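
As an example of what comparing synthetic data against real data can mean, the sketch below scores how closely the category frequencies of one column match, using one minus the total variation distance. This is a plain Python illustration under that assumption, not a call into SDMetrics, and the function name `tv_complement` is chosen for this example.

```python
from collections import Counter


def tv_complement(real_values, synthetic_values):
    """Score in [0, 1]: 1 means identical category frequencies, 0 means no overlap."""
    if not real_values or not synthetic_values:
        raise ValueError("both columns must contain at least one value")
    real_counts = Counter(real_values)
    synthetic_counts = Counter(synthetic_values)
    categories = set(real_counts) | set(synthetic_counts)
    # Total variation distance: half the sum of absolute frequency differences.
    distance = 0.5 * sum(
        abs(
            real_counts[category] / len(real_values)
            - synthetic_counts[category] / len(synthetic_values)
        )
        for category in categories
    )
    return 1.0 - distance


real = ["a", "a", "b", "c"]
synthetic = ["a", "b", "b", "c"]
score = tv_complement(real, synthetic)
```

For these two columns the frequencies of `a` and `b` each differ by 0.25, so the distance is 0.25 and the score is 0.75.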