SDMetrics

Last updated 3 months ago

Synthetic Data Metrics (SDMetrics) is an open source Python library for evaluating tabular synthetic data. Compare synthetic data against real data using a variety of metrics, generate visual reports, and share them with your team.

Flexible, Intuitive Evaluation

The SDMetrics library is model-agnostic, meaning you can use it with synthetic data created by any model at any time.

📊 Visualize & share your results with reports

Easily generate reports for your project. Each report focuses on a particular aspect of synthetic data, such as data quality. Use them to drill down visually until you get answers.

We are also here to help with custom reports tailored to your enterprise needs.

⚖️ Choose from a variety of metrics

You'll find many different types of metrics for evaluating synthetic data. The SDMetrics docs explain the relevant mathematical concepts and help you decide which metrics to apply.
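To give a flavor of the math behind a metric, here's a minimal sketch of a total variation complement for a single categorical column: one minus half the sum of absolute differences between the category frequencies in the real and synthetic data. It's written from scratch with only the standard library and is not the SDMetrics implementation. The function name `tv_complement` and the sample values are hypothetical.

```python
from collections import Counter


def tv_complement(real_values, synthetic_values):
    """Return 1 minus the total variation distance between two categorical columns."""
    if not real_values or not synthetic_values:
        raise ValueError('both columns must contain at least one value')

    real_counts = Counter(real_values)
    synthetic_counts = Counter(synthetic_values)

    # Compare frequencies over every category seen in either column.
    # A Counter returns 0 for categories it has never seen.
    categories = set(real_counts) | set(synthetic_counts)
    distance = 0.5 * sum(
        abs(real_counts[c] / len(real_values)
            - synthetic_counts[c] / len(synthetic_values))
        for c in categories
    )
    return 1.0 - distance


real = ['basic', 'basic', 'premium', 'premium']
synthetic = ['basic', 'basic', 'basic', 'premium']

score = tv_complement(real, synthetic)
print(score)  # 0.75
```

A score of 1.0 means the category frequencies match exactly, and a score of 0.0 means the two columns share no categories at all.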

📚 Participate in cutting-edge research

The SDMetrics library welcomes contributions from active research areas! Browse our Metrics in Beta and experiment with cutting-edge methods to evaluate your data.

Owned & Maintained by DataCebo

The SDMetrics library is part of the Synthetic Data Vault Project, first created at MIT's Data to AI Lab in 2016. After four years of research and traction with enterprises, we created DataCebo in 2020 with the goal of growing the project.

Today, DataCebo is the proud developer of the SDV, the largest ecosystem for synthetic data generation & evaluation.

This is an example of a visualization from the SDMetrics Quality Report.
This is an example illustrating the DisclosureProtection metric, which measures privacy.