Quality Metrics


Last updated 1 month ago

Quality metrics capture the statistical similarity between real data and synthetic data. If the synthetic and real data are statistically similar, we refer to the synthetic data as being high quality.

We intend the quality metrics to be aspirational. While it may not always be possible to achieve 100% quality on all metrics, optimizing them can benefit your downstream synthetic data use case.

Measure the quality of your entire dataset. The Quality Report is designed to capture quality measurements across multiple tables and columns. It determines which metrics to apply based on each column's type and provides a consolidated score.
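
As a rough illustration of choosing a metric by column type and consolidating the results, the sketch below scores each column with a simplified version of KSComplement (numerical) or TVComplement (categorical) and averages the scores. It is a plain-Python approximation, not the SDMetrics implementation: the helper names (`tv_complement`, `ks_complement`, `column_shape_scores`), the `METRIC_BY_TYPE` mapping, and the simple averaging are assumptions made for this example.

```python
from bisect import bisect_right


def tv_complement(real, synthetic):
    """1 minus the total variation distance between category frequencies."""
    distance = 0.0
    for category in set(real) | set(synthetic):
        real_freq = real.count(category) / len(real)
        synth_freq = synthetic.count(category) / len(synthetic)
        distance += abs(real_freq - synth_freq)
    return 1.0 - 0.5 * distance


def ks_complement(real, synthetic):
    """1 minus the two-sample Kolmogorov-Smirnov statistic."""
    real_sorted, synth_sorted = sorted(real), sorted(synthetic)
    max_gap = 0.0
    # The largest gap between two empirical CDFs occurs at an observed value.
    for value in real_sorted + synth_sorted:
        real_cdf = bisect_right(real_sorted, value) / len(real_sorted)
        synth_cdf = bisect_right(synth_sorted, value) / len(synth_sorted)
        max_gap = max(max_gap, abs(real_cdf - synth_cdf))
    return 1.0 - max_gap


# Hypothetical mapping from column type to metric (an assumption of this sketch).
METRIC_BY_TYPE = {'numerical': ks_complement, 'categorical': tv_complement}


def column_shape_scores(real, synthetic, column_types):
    """Score every column with the metric that matches its declared type."""
    scores = {}
    for column, column_type in column_types.items():
        metric = METRIC_BY_TYPE[column_type]
        scores[column] = metric(real[column], synthetic[column])
    return scores


real = {'age': [20, 30, 40, 50], 'plan': ['a', 'a', 'b', 'b']}
synthetic = {'age': [20, 30, 40, 50], 'plan': ['a', 'a', 'a', 'b']}
column_types = {'age': 'numerical', 'plan': 'categorical'}

scores = column_shape_scores(real, synthetic, column_types)
overall = sum(scores.values()) / len(scores)  # consolidated score (simple mean)
print(scores, overall)
```

In this toy data the identical `age` columns score 1.0, the shifted `plan` frequencies score 0.75, and the averaged score is 0.875.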

Browse

Apply these metrics to individual columns and tables in your data:

  • KSComplement, TVComplement: compare column shapes (aka marginal distributions, histograms)

  • ContingencySimilarity, CorrelationSimilarity: compare 2D distributions & pairwise correlations

  • CardinalityShapeSimilarity: compare the frequency of parent/child connections (aka cardinality)

  • CategoryCoverage, RangeCoverage: measure whether the synthetic data covers the full set of values present in the real data

  • SequenceLengthSimilarity, StatisticMSAS: compare the quality of real and synthetic data that represents sequential information

  • MissingValueSimilarity, StatisticSimilarity: compare individual statistics of the data
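
To make the idea of coverage concrete, here is a minimal plain-Python sketch of how CategoryCoverage and RangeCoverage style scores can be computed for a single column. It follows a simplified reading of the two metrics and is not the library's code: the function names, the error raised for a constant real column, and the sample values are assumptions made for this example.

```python
def category_coverage(real, synthetic):
    """Fraction of the real column's categories that appear in the synthetic column."""
    real_categories = set(real)
    covered = real_categories & set(synthetic)
    return len(covered) / len(real_categories)


def range_coverage(real, synthetic):
    """Share of the real column's [min, max] range spanned by the synthetic column."""
    real_min, real_max = min(real), max(real)
    span = real_max - real_min
    if span == 0:
        # Choice made for this sketch: a constant real column has no range to cover.
        raise ValueError('real column has zero range')
    # Portion of the real range left uncovered at the low and high ends.
    low_gap = max((min(synthetic) - real_min) / span, 0.0)
    high_gap = max((real_max - max(synthetic)) / span, 0.0)
    return max(1.0 - (low_gap + high_gap), 0.0)


real_colors = ['red', 'green', 'blue', 'red']
synthetic_colors = ['red', 'red', 'green']
color_score = category_coverage(real_colors, synthetic_colors)

real_amounts = [0, 10, 20, 40]
synthetic_amounts = [10, 15, 30]
amount_score = range_coverage(real_amounts, synthetic_amounts)

print(color_score, amount_score)
```

Here the synthetic colors cover two of the three real categories, and the synthetic amounts miss a quarter of the real range at each end, giving a range score of 0.5.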