Synthetic Data Vault

Copyright (c) 2023, DataCebo, Inc.

ScalarRange

Compatibility: A single numerical or datetime column

The ScalarRange constraint enforces that all the values in a column fall between two known, fixed values. That is, it places a lower and an upper bound on the column.

Some models already learn the min and max values of every column in the real dataset and enforce the bounds in the synthetic dataset. For such models, you do not need to add this constraint.

Parameters

(required) column_name: The name of the column that must follow the constraint

(required) low_value: The lower bound of the range

(required) high_value: The upper bound of the range

strict_boundaries: Whether the column must be strictly in between the low and high values

  • (default) True: The column must be strictly greater than the low value, and strictly less than the high value.

  • False: The column must be greater than or equal to the low value, and less than or equal to the high value.
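
The difference between the two settings only shows up for values that sit exactly on a bound. The sketch below illustrates the rule in plain Python. The helper in_scalar_range is hypothetical and written for illustration only; it is not part of SDV and does not reflect how SDV implements the constraint.

```python
def in_scalar_range(value, low_value, high_value, strict_boundaries=True):
    """Hypothetical helper: check one value against the range rule described above."""
    if strict_boundaries:
        # Strict: values equal to either bound are rejected
        return low_value < value < high_value
    # Not strict: values equal to either bound are accepted
    return low_value <= value <= high_value


fees = [0.0, 125.5, 500.0]

strict_result = [in_scalar_range(v, 0.0, 500.0) for v in fees]
inclusive_result = [
    in_scalar_range(v, 0.0, 500.0, strict_boundaries=False) for v in fees
]

print(strict_result)     # [False, True, False]
print(inclusive_result)  # [True, True, True]
```

If your real data contains values equal to a bound (for example, a fee of exactly 0.0), strict_boundaries=False is likely the setting you want.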

Example

Define your constraint using the parameters and then add it to a synthesizer.

my_constraint = {
    'constraint_class': 'ScalarRange',
    'table_name': 'guests', # for multi table synthesizers
    'constraint_parameters': {
        'column_name': 'amenities_fee',
        'low_value': 0.0,
        'high_value': 500.0,
        'strict_boundaries': False
    }
}

my_synthesizer.add_constraints(constraints=[
    my_constraint
])

FAQs

Why am I getting a privacy warning?

Adding this constraint exposes the min and max values of the data. This may be private information. Here are a few questions to ask yourself:

  • Are the min and max values of the real data well-known, or did I discover them by looking at the real data?

  • Can someone use the min and max values to uncover other sensitive attributes?

  • Who do I plan to share the synthetic data with?

Always evaluate the privacy risk before sharing your synthetic data broadly.

What happens to missing values?

This constraint ignores missing values. The constraint is considered valid as long as the non-missing values are within the range.
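
As a rough illustration of this behavior, the sketch below validates a column while skipping missing entries. The helpers is_missing and column_is_valid are hypothetical names used only for this example, and treating None and NaN as "missing" is an assumption. This is not SDV's implementation.

```python
import math


def is_missing(value):
    """Hypothetical helper: treat None and NaN as missing (an assumption)."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def column_is_valid(values, low_value, high_value, strict_boundaries=True):
    """Check only the non-missing values against the range."""
    present = [v for v in values if not is_missing(v)]
    if strict_boundaries:
        return all(low_value < v < high_value for v in present)
    return all(low_value <= v <= high_value for v in present)


fees = [12.0, None, float('nan'), 499.99]

# Missing entries are skipped, so the column passes
valid = column_is_valid(fees, 0.0, 500.0, strict_boundaries=False)

# A single out-of-range, non-missing value makes the column fail
invalid = column_is_valid(fees + [750.0], 0.0, 500.0, strict_boundaries=False)
```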

What if I want one bound to be strict but not the other?

This constraint can only be used when both the upper and lower bounds are strict, or both are not strict. If they are different, use the ScalarInequality constraint twice: once for the lower bound and once for the upper bound.
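
A possible way to express mixed boundaries is sketched below, following the same dictionary format as the example on this page. It assumes that the ScalarInequality constraint takes column_name, relation, and value parameters; check the ScalarInequality page to confirm the exact parameter names before relying on this. The bounds (0.0 inclusive, 500.0 exclusive) are illustrative.

```python
# Lower bound is not strict: amenities_fee >= 0.0
# (parameter names for ScalarInequality are an assumption; verify on its page)
lower_bound = {
    'constraint_class': 'ScalarInequality',
    'table_name': 'guests', # for multi table synthesizers
    'constraint_parameters': {
        'column_name': 'amenities_fee',
        'relation': '>=',
        'value': 0.0
    }
}

# Upper bound is strict: amenities_fee < 500.0
upper_bound = {
    'constraint_class': 'ScalarInequality',
    'table_name': 'guests', # for multi table synthesizers
    'constraint_parameters': {
        'column_name': 'amenities_fee',
        'relation': '<',
        'value': 500.0
    }
}

mixed_constraints = [lower_bound, upper_bound]

# With an existing synthesizer, both constraints would be added together:
# my_synthesizer.add_constraints(constraints=mixed_constraints)
```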