Synthetic Data Vault

Copyright (c) 2023, DataCebo, Inc.


❖ Differential Privacy


Last updated 15 days ago

The Differential Privacy bundle allows you to create synthetic data that is private, according to methods that are backed by mathematically-rigorous findings. The differential privacy framework enforces a limit on how much one individual record can affect the synthesizer and, ultimately, leak into the synthetic data.

Share your synthetic data broadly. Our differential privacy synthesizers guarantee that a single row of data will not unduly affect the patterns that the synthesizer learns. We use ε-differential privacy, which allows you to provide a privacy loss budget, ε (epsilon). This budget lets you control the trade-off between privacy and quality.
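To make the role of the privacy loss budget ε more concrete, here is a minimal sketch of the textbook Laplace mechanism applied to a column mean. It is not the bundle's implementation. The function names (`laplace_noise`, `private_mean`, `mean_abs_error`) and the toy age data are assumptions made up for this illustration. It only shows the general pattern: a smaller ε means more noise (more privacy, lower quality), and a larger ε means less noise.

```python
import random
import statistics


def laplace_noise(scale, rng):
    # The difference of two independent Exp(1) draws follows Laplace(0, 1).
    return scale * (rng.expovariate(1.0) - rng.expovariate(1.0))


def private_mean(values, lower, upper, epsilon, rng):
    """Release the mean of `values` with Laplace noise calibrated to epsilon."""
    # Clip so that one record can only move the mean by a bounded amount.
    clipped = [min(max(v, lower), upper) for v in values]
    # Replacing a single record changes the clipped mean by at most this much.
    sensitivity = (upper - lower) / len(clipped)
    return statistics.fmean(clipped) + laplace_noise(sensitivity / epsilon, rng)


def mean_abs_error(epsilon, trials=2000, seed=0):
    """Average distance between the noisy mean and the true mean."""
    rng = random.Random(seed)
    ages = [rng.randint(18, 89) for _ in range(500)]  # hypothetical toy column
    true_mean = statistics.fmean(ages)
    errors = [
        abs(private_mean(ages, 18, 89, epsilon, rng) - true_mean)
        for _ in range(trials)
    ]
    return statistics.fmean(errors)


if __name__ == "__main__":
    for eps in (0.1, 1.0, 10.0):
        print(f"epsilon={eps}: mean absolute error = {mean_abs_error(eps):.4f}")
```

Running the script prints one error figure per budget; the error shrinks as ε grows, which is the privacy/quality trade-off in miniature.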

Upscale your synthetic data. Once you've fit your synthesizer, use it to create differentially private synthetic data of any size, even 10x or 100x the original size. Privacy guarantees apply to all data your synthesizer creates.
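The claim that the guarantee covers any amount of sampled data is consistent with the standard post-processing property of differential privacy. The statement below is the textbook formulation, not something taken from the bundle's documentation. The symbols M (the randomized fitting procedure), D and D' (datasets differing in one record), and g (the sampling step) are notation assumed here for illustration, as is the premise that sampling does not access the real data again.

```latex
% epsilon-differential privacy: D and D' differ in a single record.
\Pr[\,M(D) \in S\,] \;\le\; e^{\varepsilon}\,\Pr[\,M(D') \in S\,]
\quad \text{for all measurable sets } S.

% Post-processing: any function g applied to the output of M, without
% further access to the real data, inherits the same bound.
\Pr[\,g(M(D)) \in T\,] \;\le\; e^{\varepsilon}\,\Pr[\,g(M(D')) \in T\,]
\quad \text{for all measurable sets } T.
```

Under these assumptions, drawing 10x or 100x more rows from a fitted synthesizer is simply a different choice of g, so it spends no additional privacy budget.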

Save and share your synthesizer. Save your synthesizer and load it later to sample more synthetic data at any time. No real data or sensitive statistics are saved in the synthesizer, so you can share it without worry.

Included Features

Synthesizers for generating differentially private data.

  • The DPGCSynthesizer creates differentially private data using the GaussianCopula method.
  • The experimental DPGCFlexSynthesizer runs a similar method but offers more flexibility in the data preprocessing that you can use.

Preprocessing methods for generating differentially private columns. Under the hood, the synthesizers use preprocessing techniques for generating differentially private columns of data. You can also apply these transformers on their own.

  • Noise the column using differential privacy: DPLaplaceNoiser, DPTimestampLaplaceNoiser, DPResponseRandomizer, DPWeightedResponseRandomizer
  • Normalize the column into numerical data of a specific shape, using differential privacy: DPECDFNormalizer, DPDiscreteECDFNormalizer

Installation

Purchase the Differential Privacy bundle and install it separately.

% pip install -U bundle-differential-privacy --index-url https://pypi.datacebo.com

This command prompts you for your SDV Enterprise credentials.
Figure: The differential privacy framework enforces a limit on how much one individual record (a row of data, outlined in orange) can affect what the synthesizer learns (dots, colored in orange).
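The preprocessing methods under Included Features include response randomizers, but this page does not describe how they work. As a rough illustration of what noising a categorical column can look like, here is a sketch of classic k-ary randomized response. That the bundle's randomizers follow this scheme is an assumption. The function names `randomized_response` and `randomize_column` are hypothetical and are not part of the SDV API.

```python
import math
import random


def randomized_response(value, categories, epsilon, rng):
    """Report the true category with a probability set by epsilon, else another one."""
    k = len(categories)
    # Standard k-ary randomized response: keep probability e^eps / (e^eps + k - 1).
    keep_prob = math.exp(epsilon) / (math.exp(epsilon) + k - 1)
    if rng.random() < keep_prob:
        return value
    # Otherwise report one of the other categories uniformly at random.
    others = [c for c in categories if c != value]
    return rng.choice(others)


def randomize_column(column, categories, epsilon, seed=0):
    """Apply randomized response independently to every entry of a column."""
    rng = random.Random(seed)
    return [randomized_response(v, categories, epsilon, rng) for v in column]


if __name__ == "__main__":
    categories = ["basic", "plus", "premium"]  # hypothetical category set
    column = ["basic"] * 10
    print(randomize_column(column, categories, epsilon=1.0))
```

With a small ε the reported values are close to uniformly random; with a large ε almost every value is reported truthfully.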