Synthetic Data Vault

Copyright (c) 2023, DataCebo, Inc.


ScalarInequality

Compatibility: A single numerical or datetime column

The ScalarInequality constraint enforces that all values in a column are greater than or less than a fixed (scalar) value. That is, it enforces a lower or upper bound on the synthetic data.

Some models already learn the min and max values of every column in the real dataset and enforce the bounds in the synthetic dataset. For such models, you do not need to add this constraint.

Parameters

(required) column_name: The name of the column that must follow the constraint

(required) relation: The inequality relation between the column name and the value

  • '>': The column is greater than the value
  • '>=': The column is greater than or equal to the value
  • '<': The column is less than the value
  • '<=': The column is less than or equal to the value

(required) value: The value that the column should be compared against

  • <int or float>: A numerical value
  • <string>: A string representing a datetime value
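
The parameter rules above can be sketched as a small pre-flight check. This is only an illustration using the standard library: `check_scalar_inequality_params` and `ALLOWED_RELATIONS` are hypothetical names, not part of SDV, and SDV performs its own validation when you add the constraint.

```python
# Hypothetical helper (not part of SDV): checks a parameter dictionary
# against the rules listed in the Parameters section.
ALLOWED_RELATIONS = {'>', '>=', '<', '<='}


def check_scalar_inequality_params(params):
    """Return a list of problems found in a ScalarInequality parameter dict."""
    problems = []

    # All three parameters are required.
    for key in ('column_name', 'relation', 'value'):
        if key not in params:
            problems.append(f'missing required parameter: {key}')

    # The relation must be one of the four supported inequality symbols.
    if 'relation' in params and params['relation'] not in ALLOWED_RELATIONS:
        problems.append(f"unsupported relation: {params['relation']!r}")

    # The value is either a number or a string representing a datetime.
    # bool is a subclass of int in Python, so it is rejected explicitly.
    if 'value' in params:
        value = params['value']
        if isinstance(value, bool) or not isinstance(value, (int, float, str)):
            problems.append('value must be an int, float, or datetime string')

    return problems


good = {'column_name': 'checkin_date', 'relation': '>=', 'value': '01 Jan 2020'}
bad = {'column_name': 'checkin_date', 'relation': '==', 'value': None}

print(check_scalar_inequality_params(good))
print(check_scalar_inequality_params(bad))
```

The first call reports no problems. The second reports two: an unsupported relation and an invalid value type.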

Example

Define your constraint using the parameters and then add it to a synthesizer.

my_constraint = {
    'constraint_class': 'ScalarInequality',
    'table_name': 'guests', # only needed for multi table synthesizers
    'constraint_parameters': {
        'column_name': 'checkin_date',
        'relation': '>=',
        'value': '01 Jan 2020'
    }
}

# my_synthesizer is a synthesizer object that you have already created
my_synthesizer.add_constraints(constraints=[
    my_constraint
])
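
To make the meaning of the datetime bound in this example concrete, here is a minimal standard-library sketch of the comparison it describes. The format string `'%d %b %Y'` and the sample check-in dates are assumptions for illustration only; in practice the constraint is enforced by the synthesizer, not by code like this.

```python
from datetime import datetime

# Assumption: the column stores dates written like '01 Jan 2020'.
DATETIME_FORMAT = '%d %b %Y'

# The scalar value from the constraint, parsed into a datetime.
lower_bound = datetime.strptime('01 Jan 2020', DATETIME_FORMAT)

# Made-up check-in dates, used only to illustrate the '>=' relation.
checkin_dates = ['27 Dec 2019', '01 Jan 2020', '15 Mar 2021']

satisfies = [
    datetime.strptime(date, DATETIME_FORMAT) >= lower_bound
    for date in checkin_dates
]
print(satisfies)  # [False, True, True]
```

Only the first date falls before the bound, so it is the only one that would violate the constraint.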

FAQs

Why am I getting a privacy warning?

Adding this constraint exposes the min or max value of the data. This may be private information. Here are a few questions to ask yourself:

  • Are the min and max values of the real data well-known, or did I discover them by looking at the real data?

  • Can someone use the min and max values to uncover other sensitive attributes?

  • Who do I plan to share the synthetic data with?

Always evaluate the privacy risk before sharing your synthetic data broadly.

What happens to missing values?

This constraint ignores missing values. The constraint is considered valid as long as the non-missing values follow the scalar inequality.
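
The missing-value behavior described here can be sketched in plain Python. This is an illustrative check under stated assumptions, not SDV's implementation: `column_is_valid` and `is_missing` are hypothetical helpers, and missing values are assumed to appear as `None` or `NaN`.

```python
import math
import operator

# Map each supported relation symbol to its comparison function.
RELATIONS = {
    '>': operator.gt,
    '>=': operator.ge,
    '<': operator.lt,
    '<=': operator.le,
}


def is_missing(value):
    """Treat None and NaN as missing (an assumption for this sketch)."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def column_is_valid(values, relation, scalar):
    """Return True if every non-missing value satisfies the inequality."""
    compare = RELATIONS[relation]
    return all(compare(v, scalar) for v in values if not is_missing(v))


# Missing entries are skipped, so only 3 and 7.5 are compared to 0.
print(column_is_valid([3, None, 7.5, float('nan')], '>', 0))  # True

# The value -1 is present and violates the bound.
print(column_is_valid([3, None, -1], '>', 0))  # False
```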

What if I don't have a single, fixed value to compare to?

Use the Inequality constraint to compare values between two different columns.

Shortcuts Available. If you want to enforce a lower bound of 0, use the Positive constraint. For an upper bound of 0, use the Negative constraint. If you want to enforce both upper and lower bounds, use the ScalarRange constraint.

Last updated 11 months ago