Synthetic Data Vault

Copyright (c) 2023, DataCebo, Inc.


Data Preparation


Last updated 10 days ago

Sequential data consists of ordered records, such as a time series. The entire table may contain records for a single entity (such as a user or patient). Alternatively, your table may contain multiple independent sequences belonging to different entities.
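To make the multi-sequence layout concrete, here is a minimal sketch using pandas. The column names follow the vital-signs example on this page; the values are made up for illustration. Grouping by the column that identifies the entity shows how many independent sequences the table holds.

```python
import pandas as pd

# Illustrative values only; column names mirror the vital-signs example.
data = pd.DataFrame({
    "Patient ID": ["ID_001", "ID_001", "ID_001", "ID_002", "ID_002"],
    "Time": ["01/02/2023", "01/03/2023", "01/04/2023", "01/02/2023", "01/05/2023"],
    "Systolic BP": [121.0, 118.5, 119.0, 134.0, 131.5],
})

# Parse the ordering column so rows can be sorted within each sequence.
data["Time"] = pd.to_datetime(data["Time"], format="%m/%d/%Y")
data = data.sort_values(["Patient ID", "Time"]).reset_index(drop=True)

# Each distinct "Patient ID" value is one independent sequence.
sequence_lengths = data.groupby("Patient ID").size()
print(sequence_lengths)
```

A table that holds records for a single entity would produce only one group here.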

Before you begin creating synthetic data, it's important to have your data ready in the right format:

  1. Data, a dictionary that maps every table name to a pandas DataFrame object containing the actual data

This example shows sequential data related to vital signs. The table contains multiple sequences, each corresponding to a different patient. Within each sequence, health measurements change over time.

  2. Metadata, an object that describes your table. It includes the data types in each column, keys and other identifiers.

The table's metadata:
{
    "METADATA_SPEC_VERSION": "SINGLE_TABLE_V1",
    "sequence_key": "Patient ID",
    "sequence_index": "Time",
    "columns": {
        "Patient ID": { "sdtype": "id", "regex_format": "ID_[0-9]{3}" },
        "Address": { "sdtype": "address", "pii": true },
        "Smoker": { "sdtype": "boolean" },
        "Time": { "sdtype": "datetime", "datetime_format": "%m/%d/%Y" },
        "Heart Rate": { "sdtype": "categorical" },
        "Systolic BP": { "sdtype": "numerical" }
    }
}

Learn More

  • Loading Data: Get started with a demo dataset or load your own data.
  • Creating Metadata: Create an object to describe the different columns in your data. Save it for future use.
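As a quick sanity check on the metadata shown on this page, the sketch below parses it with Python's standard `json` module and confirms that `sequence_key` and `sequence_index` both name declared columns. The helper `check_sequence_metadata` is hypothetical and is not part of SDV; it only illustrates how the fields relate, and it does not replace whatever validation the library performs. The block uses the JSON literal `true` for `pii` so that the text parses as valid JSON.

```python
import json

# The metadata from this page, written as valid JSON.
metadata_json = """
{
    "METADATA_SPEC_VERSION": "SINGLE_TABLE_V1",
    "sequence_key": "Patient ID",
    "sequence_index": "Time",
    "columns": {
        "Patient ID": { "sdtype": "id", "regex_format": "ID_[0-9]{3}" },
        "Address": { "sdtype": "address", "pii": true },
        "Smoker": { "sdtype": "boolean" },
        "Time": { "sdtype": "datetime", "datetime_format": "%m/%d/%Y" },
        "Heart Rate": { "sdtype": "categorical" },
        "Systolic BP": { "sdtype": "numerical" }
    }
}
"""


def check_sequence_metadata(metadata):
    """Return a list of problems found (hypothetical helper, not part of SDV)."""
    problems = []
    columns = metadata.get("columns", {})
    for field in ("sequence_key", "sequence_index"):
        name = metadata.get(field)
        if name is None:
            problems.append(f"missing {field}")
        elif name not in columns:
            problems.append(f"{field} {name!r} is not a declared column")
    return problems


metadata = json.loads(metadata_json)
problems = check_sequence_metadata(metadata)
print(problems or "sequence key and index both refer to declared columns")
```

Here the sequence key identifies which patient a row belongs to, and the sequence index gives the order of rows within that patient's sequence.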