Privacy Metrics

Last updated 1 month ago

Privacy metrics capture how much protection synthetic data provides, especially when you'd like to disclose the synthetic data instead of the real data. Safety can be defined in many ways, depending on what type of information is valuable to protect and what you assume about how it may be leaked.

Privacy metrics are based on statistical patterns across your real and synthetic datasets. For each metric, you can decide whether divulging the pattern (via synthetic data) is worth it for your project.

Browse

Apply these metrics to evaluate the privacy of your data tables:

  • DisclosureProtection, DisclosureProtectionEstimate: Measure the risk of disclosing sensitive information about specific, sensitive columns in your dataset.

  • DCROverfittingProtection: Measure the distance between the real and synthetic data, ensuring that your synthetic data doesn't match the real data too closely. (Overfitting refers to your synthesizer being overfit on the real data.)

  • DCRBaselineProtection: Measure the distance between the real and synthetic data, comparing it against random data as a baseline. (Random data provides the highest privacy.)
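
To make the baseline idea concrete, here is a minimal sketch of a distance-to-closest-record comparison in plain Python. It is not the SDMetrics implementation: the function names, the Euclidean distance, the uniform random baseline, and the `min(ratio, 1)` scoring are all assumptions chosen for illustration, and it only handles numerical columns.

```python
import math
import random
import statistics


def closest_distance(row, reference_rows):
    """Euclidean distance from `row` to its nearest neighbor in `reference_rows`."""
    return min(math.dist(row, ref) for ref in reference_rows)


def median_dcr(candidate_rows, real_rows):
    """Median distance-to-closest-record of the candidates against the real rows."""
    return statistics.median(
        closest_distance(row, real_rows) for row in candidate_rows
    )


def baseline_protection_sketch(real_rows, synthetic_rows, seed=0):
    """Hypothetical score in [0, 1]: higher means the synthetic rows sit about
    as far from the real rows as uniformly random rows do."""
    rng = random.Random(seed)

    # Assumption: the random baseline is drawn uniformly within each real
    # column's observed min/max range.
    bounds = [(min(col), max(col)) for col in zip(*real_rows)]
    random_rows = [
        tuple(rng.uniform(low, high) for low, high in bounds)
        for _ in synthetic_rows
    ]

    synthetic_median = median_dcr(synthetic_rows, real_rows)
    random_median = median_dcr(random_rows, real_rows)

    if random_median == 0:
        # Degenerate case (e.g. constant columns): the ratio is undefined,
        # so this sketch arbitrarily reports full protection.
        return 1.0
    return min(synthetic_median / random_median, 1.0)


real = [(0.0, 0.0), (10.0, 10.0), (0.0, 10.0), (10.0, 0.0)]
copied = list(real)  # synthetic rows that duplicate the real rows
shifted = [(2.0, 3.0), (7.0, 6.0), (4.0, 9.0), (8.0, 1.0)]

copied_score = baseline_protection_sketch(real, copied)
shifted_score = baseline_protection_sketch(real, shifted)
```

In this sketch, synthetic rows copied verbatim from the real data have a median distance of zero and therefore get the lowest score, while rows that are at least as far from the real data as the random baseline get the highest score.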