Explore SDV
SDV is available in Public or Enterprise formats. Use this page to determine which one is right for your project needs.
Public SDV
Explore synthetic data. Train a generative AI model on your own simple datasets as a proof of concept, then create synthetic data that has the same patterns.
Publicly available under the Business Source License. Get started today!
SDV Enterprise
Ready for scale? Expand synthetic data solutions in your enterprise. Create generative AI models for more complex datasets.
To learn more about pricing and plans, visit our website.
Features
AI-Based Synthesizers
These synthesizers use AI to learn patterns from your data, then use those patterns to create synthetic data.
Feature | Public SDV | SDV Enterprise |
---|---|---|
GaussianCopula: statistical AI | | |
PAR: for sequential data | | |
HMA: multi-table, for limited tables (<5) | | |
HSA: multi-table, for unlimited tables | | |
Independent: multi-table, for unlimited tables | | |
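
To make "learning patterns and recreating them" concrete, here is a toy two-column Gaussian copula written with the Python standard library only. It is a conceptual sketch, not SDV's GaussianCopula implementation: the function names are hypothetical, and it assumes empirical marginals and a single correlation value.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()

def to_normal_scores(column):
    """Map values to standard-normal scores through their empirical ranks."""
    n = len(column)
    order = sorted(range(n), key=lambda i: column[i])
    scores = [0.0] * n
    for rank, i in enumerate(order):
        scores[i] = STD_NORMAL.inv_cdf((rank + 0.5) / n)
    return scores

def correlation(xs, ys):
    """Pearson correlation of two equally long lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def from_normal_score(z, sorted_column):
    """Map a normal score back to a real value via the empirical quantile."""
    u = STD_NORMAL.cdf(z)
    index = min(int(u * len(sorted_column)), len(sorted_column) - 1)
    return sorted_column[index]

def fit_and_sample(col_a, col_b, num_rows, seed=0):
    """Learn the dependence between two columns, then sample new rows."""
    rho = correlation(to_normal_scores(col_a), to_normal_scores(col_b))
    sorted_a, sorted_b = sorted(col_a), sorted(col_b)
    rng = random.Random(seed)
    rows = []
    for _ in range(num_rows):
        z_a = rng.gauss(0, 1)
        # Correlated second score; max() guards against rounding above 1.
        z_b = rho * z_a + math.sqrt(max(0.0, 1 - rho ** 2)) * rng.gauss(0, 1)
        rows.append((from_normal_score(z_a, sorted_a),
                     from_normal_score(z_b, sorted_b)))
    return rows

# Made-up example data: heights (cm) and weights (kg).
heights = [150, 155, 160, 165, 170, 175, 180, 185]
weights = [52, 50, 61, 66, 63, 74, 79, 88]
synthetic_rows = fit_and_sample(heights, weights, num_rows=200)
```

The sampled rows reuse values seen in the training columns and keep the positive relationship between them. A real synthesizer would also model the marginal distributions so that it can produce unseen values.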
Test Data Synthesizers
These synthesizers create random test data based on metadata alone. They do not use AI, so you do not need to provide any training data.
Feature | Public SDV | SDV Enterprise |
---|---|---|
DayZSynthesizer: single table | | |
DayZSynthesizer: multi table | | |
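
The idea of generating test data from metadata alone can be sketched in a few lines of plain Python. The metadata layout and the `make_test_rows` helper below are hypothetical and exist only for illustration. They are not the DayZSynthesizer API or SDV's metadata format.

```python
import random
import string

# Hypothetical metadata layout, for illustration only.
metadata = {
    "user_id": {"type": "id", "length": 8},
    "age": {"type": "numerical", "min": 18, "max": 90},
    "plan": {"type": "categorical", "values": ["free", "pro", "team"]},
}

def make_test_rows(metadata, num_rows, seed=0):
    """Create random rows that respect the declared column types and bounds."""
    rng = random.Random(seed)
    rows = []
    for _ in range(num_rows):
        row = {}
        for name, spec in metadata.items():
            if spec["type"] == "id":
                row[name] = "".join(
                    rng.choices(string.ascii_uppercase, k=spec["length"])
                )
            elif spec["type"] == "numerical":
                row[name] = rng.randint(spec["min"], spec["max"])
            elif spec["type"] == "categorical":
                row[name] = rng.choice(spec["values"])
            else:
                raise ValueError(f"Unsupported column type: {spec['type']}")
        rows.append(row)
    return rows

rows = make_test_rows(metadata, num_rows=5)
```

No real data is read at any point: every value comes from the column descriptions, which is why this kind of data is suited to testing rather than to analysis.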
Integrate Your Data
These features make it easy to integrate the SDV into your application and pipeline.
Feature | Public SDV | SDV Enterprise |
---|---|---|
Auto-detect metadata using data CSVs or DataFrames | | |
Auto-detect metadata with a DDL file from an SQL schema | | |
Directly connect to a database for importing real data and exporting synthetic data | | |
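
As a rough picture of what auto-detecting metadata from a CSV involves, the sketch below guesses a type for each column from its values. The detection rules, the single supported date format, and the function names are assumptions made for this example. SDV's own detection is more thorough.

```python
import csv
import io
from datetime import datetime

def infer_column_type(values):
    """Guess a column type from its non-empty string values."""
    present = [v for v in values if v != ""]
    if not present:
        return "unknown"
    try:
        for v in present:
            float(v)
        return "numerical"
    except ValueError:
        pass
    try:
        for v in present:
            datetime.strptime(v, "%Y-%m-%d")  # assumed date format
        return "datetime"
    except ValueError:
        pass
    return "categorical"

def detect_metadata(csv_text):
    """Return a {column name: guessed type} mapping for CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    return {
        name: infer_column_type([row[name] for row in rows])
        for name in reader.fieldnames
    }

sample = (
    "age,signup_date,plan\n"
    "34,2023-01-15,pro\n"
    ",2023-02-01,free\n"
    "51,2023-03-09,pro\n"
)
detected = detect_metadata(sample)
```

Missing values are skipped during detection, so the `age` column is still recognized as numerical even though one row is empty.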
Pre-Process Statistical Information
Transformers pre-process your data, which can improve the quality of the synthetic data. SDV synthesizers select transformers by default, but you can always customize them for your dataset.
Feature | Public SDV | SDV Enterprise |
---|---|---|
FloatFormatter: for missing value imputation, numerical columns | | |
ClusterBased and Gaussian Normalizers: statistical transforms | | |
Datetime Encoding, including datetime format parsing | | |
OutlierEncoder: for numerical outliers | | |
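
A transformer in this sense is a reversible step: it turns a column into numbers a model can learn from, then turns model output back into the original format. The class below is a hypothetical, minimal datetime encoder that illustrates the pattern. It is not one of the transformers listed above, and it assumes all values are in UTC.

```python
from datetime import datetime, timezone

class DatetimeEncoder:
    """Hypothetical reversible transformer: datetime strings <-> timestamps."""

    def __init__(self, datetime_format):
        self.datetime_format = datetime_format

    def transform(self, values):
        """Parse each string with the given format; keep missing values as None."""
        return [
            None if v is None else
            datetime.strptime(v, self.datetime_format)
            .replace(tzinfo=timezone.utc)
            .timestamp()
            for v in values
        ]

    def reverse_transform(self, numbers):
        """Convert timestamps back into strings in the original format."""
        return [
            None if n is None else
            datetime.fromtimestamp(n, tz=timezone.utc).strftime(self.datetime_format)
            for n in numbers
        ]

original = ["15/01/2023", None, "09/03/2023"]
encoder = DatetimeEncoder("%d/%m/%Y")
encoded = encoder.transform(original)
decoded = encoder.reverse_transform(encoded)
```

Because the format string is stored on the encoder, the decoded values come back in the same day/month/year layout as the input.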
Understand & Anonymize Real-World Concepts
These transformers are geared towards columns that correspond to industry- or domain-specific concepts, such as emails or addresses, whose structure may be human-created. As with other transformers, SDV synthesizers select them by default, but you can always customize them for your dataset.
Feature | Public SDV | SDV Enterprise |
---|---|---|
RegexGenerator, IDGenerator: for keys and IDs | | |
AnonymizedFaker: general-purpose anonymization | | |
PseudoAnonymizedFaker: for general pseudo-anonymization with a mapping | | |
Emails: understanding domains | | |
Addresses: understanding locations | | |
Phone Numbers: understanding country and area codes | | |
GPS Coordinates: understanding geographical areas and distances | | |
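
Pseudo-anonymization with a mapping means each real value is always replaced by the same fake value, and the mapping can be used to trace a fake value back. The sketch below shows the idea with a hypothetical `PseudoAnonymizer` class and a made-up pool of fake values. It is not the PseudoAnonymizedFaker implementation.

```python
import random

class PseudoAnonymizer:
    """Hypothetical sketch: map each real value to one stable fake value."""

    def __init__(self, fake_pool, seed=0):
        self._unused = list(fake_pool)
        random.Random(seed).shuffle(self._unused)
        self.mapping = {}

    def transform(self, values):
        """Replace real values, reusing the same fake value for repeats."""
        anonymized = []
        for value in values:
            if value not in self.mapping:
                if not self._unused:
                    raise ValueError("Not enough fake values for a one-to-one mapping")
                self.mapping[value] = self._unused.pop()
            anonymized.append(self.mapping[value])
        return anonymized

    def reverse_transform(self, fake_values):
        """Recover the real values using the stored mapping."""
        reverse = {fake: real for real, fake in self.mapping.items()}
        return [reverse[fake] for fake in fake_values]

names = ["alice", "bob", "alice"]
anonymizer = PseudoAnonymizer(["user_a", "user_b", "user_c"])
fake_names = anonymizer.transform(names)
recovered = anonymizer.reverse_transform(fake_names)
```

Anyone holding `anonymizer.mapping` can reverse the replacement, so the mapping itself should be treated as sensitive.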
Constraints
Constraints represent business rules and logic that you can apply to your synthesizer.
Feature | Public SDV | SDV Enterprise |
---|---|---|
Predefined logic for individual columns: FixedIncrements, Negative, Positive, ScalarInequality, ScalarRange | | |
Predefined logic for multiple columns: FixedCombinations, Inequality, OneHotEncoding, Range | | |
Write your own custom constraints | | |
Advanced, predefined logic: ChainedInequality | | |
Support for custom constraints and additional predefined logic | | |
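
One simple way to picture a constraint is as a validity check applied to sampled rows, with invalid rows discarded and redrawn. The sketch below applies an inequality-style rule this way. The sampler, the column names, and the reject-sampling strategy are assumptions for illustration and do not describe how SDV enforces its constraints.

```python
import random

def sample_row(rng):
    """Stand-in for a synthesizer: draws start and end days independently."""
    return {"start_day": rng.randint(1, 30), "end_day": rng.randint(1, 30)}

def is_valid(row):
    """Inequality-style rule: start_day must not come after end_day."""
    return row["start_day"] <= row["end_day"]

def sample_with_constraint(num_rows, seed=0, max_tries=10_000):
    """Keep drawing rows until enough of them satisfy the rule."""
    rng = random.Random(seed)
    rows = []
    for _ in range(max_tries):
        if len(rows) == num_rows:
            break
        row = sample_row(rng)
        if is_valid(row):
            rows.append(row)
    if len(rows) < num_rows:
        raise RuntimeError("Could not generate enough valid rows")
    return rows

valid_rows = sample_with_constraint(20)
```

Reject sampling is easy to reason about but wasteful when valid rows are rare, which is one reason to encode the rule in the data transformation instead.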
Synthetic Data Evaluation
Evaluate your synthetic data by comparing it against the real data.
Feature | Public SDV | SDV Enterprise |
---|---|---|
Access to SDMetrics library: vendor-agnostic, open source | | |
Diagnostic Report: basic data validity checks, single and multi-table | | |
Quality Report: statistical similarity, single and multi-table | | |
Privacy Metrics: CategoricalCAP, NewRowSynthesis, Inference Attacks | | |
Visualization: 1D and 2D bars, scatterplots, heatmaps and more | | |
Use case-specific metrics: OutlierCoverage, SmoothnessSimilarity | | |
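
To give a feel for what comparing synthetic data against real data means, the sketch below scores how closely the category frequencies of one column match, using one minus the total variation distance. The `tv_complement` function is a standalone illustration written for this page. It is not a call into SDMetrics.

```python
from collections import Counter

def tv_complement(real_values, synthetic_values):
    """Return 1 - total variation distance between two categorical columns."""
    real_freq = Counter(real_values)
    synth_freq = Counter(synthetic_values)
    categories = set(real_freq) | set(synth_freq)
    distance = 0.5 * sum(
        abs(real_freq[c] / len(real_values) - synth_freq[c] / len(synthetic_values))
        for c in categories
    )
    return 1.0 - distance

real = ["a", "a", "b", "b"]
synthetic = ["a", "a", "a", "b"]
score = tv_complement(real, synthetic)
```

A score of 1.0 means the two columns have identical category frequencies, and 0.0 means they share no categories at all. In this example the frequencies differ by a quarter in each category, giving a score of 0.75.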