Sequential Metadata JSON

This guide describes the sequential metadata JSON spec.

Example metadata JSON
{
    "METADATA_SPEC_VERSION": "SINGLE_TABLE_V1",
    "sequence_key": "Patient ID",
    "sequence_index": "Time",
    "columns": {
        "Patient ID": { "sdtype": "id", "regex_format": "ID_[0-9]{3}" },
        "Address": { "sdtype": "physical_address", "pii": true },
        "Smoker": { "sdtype": "boolean" },
        "Time": { "sdtype": "datetime", "datetime_format": "%m/%d/%Y" },
        "Heart Rate": { "sdtype": "numerical" },
        "Systolic BP": { "sdtype": "numerical" }
    }
}

Create your metadata programmatically. Use the Python API to automatically detect the metadata based on your data.

Overview

The JSON for a sequential table is the same as the single table JSON with some added elements.

See Single Table JSON for more details.
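As a rough illustration of that relationship, the sketch below starts from a single table metadata dictionary and adds the two sequence-specific elements. It uses only Python's standard json module. The helper name add_sequence_info is hypothetical and is not part of the SDV API, and the column subset is taken from the example above.

```python
import json

# Single table metadata, using a subset of the columns from the example above.
single_table_metadata = {
    "METADATA_SPEC_VERSION": "SINGLE_TABLE_V1",
    "columns": {
        "Patient ID": {"sdtype": "id", "regex_format": "ID_[0-9]{3}"},
        "Time": {"sdtype": "datetime", "datetime_format": "%m/%d/%Y"},
        "Systolic BP": {"sdtype": "numerical"},
    },
}


def add_sequence_info(metadata, sequence_key=None, sequence_index=None):
    """Hypothetical helper: return a copy of the metadata with sequence elements added."""
    result = json.loads(json.dumps(metadata))  # deep copy via a JSON round trip
    for name, column in (("sequence_key", sequence_key),
                         ("sequence_index", sequence_index)):
        if column is None:
            continue  # both elements are optional
        if column not in result["columns"]:
            raise ValueError(f"{name} refers to unknown column {column!r}")
        result[name] = column
    return result


sequential_metadata = add_sequence_info(
    single_table_metadata, sequence_key="Patient ID", sequence_index="Time"
)
print(json.dumps(sequential_metadata, indent=4))
```

The only difference between the two dictionaries is the presence of "sequence_key" and "sequence_index" at the top level. The "columns" section is unchanged.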

Sequence Specific Concepts

You can optionally include additional metadata related to your sequences.

"sequence_key": The name of the column that serves as the sequence key, if you have multi-sequence data

The sequence key is a column that identifies which rows belong to which sequence. This is usually an ID column, but it may also be a PII sdtype (such as "phone_number").

This is important for tables that contain multiple sequences. In our example, the sequence key is 'Patient ID' because this column is used to break up the sequences.

If you don't supply a sequence key, the SDV assumes that your table only contains a single sequence. Note: The SDV sequential models do not fully support single sequence data.
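To make the role of the sequence key concrete, here is a minimal conceptual sketch that splits rows into sequences by a key column. It is not how the SDV processes data internally. The function name split_into_sequences and the row values are made up for illustration.

```python
from collections import defaultdict

# Illustrative rows; the values are made up for this sketch.
rows = [
    {"Patient ID": "ID_001", "Time": "01/01/2023", "Systolic BP": 120},
    {"Patient ID": "ID_002", "Time": "01/01/2023", "Systolic BP": 135},
    {"Patient ID": "ID_001", "Time": "01/02/2023", "Systolic BP": 118},
    {"Patient ID": "ID_002", "Time": "01/03/2023", "Systolic BP": 140},
]


def split_into_sequences(rows, sequence_key=None):
    """Group rows by the sequence key; with no key, all rows form one sequence."""
    sequences = defaultdict(list)
    for row in rows:
        group = row[sequence_key] if sequence_key is not None else None
        sequences[group].append(row)
    return dict(sequences)


by_patient = split_into_sequences(rows, sequence_key="Patient ID")
single_sequence = split_into_sequences(rows)

print({key: len(sequence) for key, sequence in by_patient.items()})
print(len(single_sequence[None]))
```

With "Patient ID" as the key, the four rows fall into two sequences of two rows each. Without a key, the same rows are treated as one sequence of four rows.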

"sequence_index": The name of the column that serves as the sequence index, if you have sequential data

The sequence index determines the spacing between the rows in a sequence. Use this if you have an explicit index such as a timestamp. If you don't supply a sequence index, the SDV assumes that the rows are equally spaced, in an unknown unit.
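The idea of spacing can be sketched with Python's standard datetime module. The example below parses timestamps using the "datetime_format" of the "Time" column from the example metadata and computes the gap between consecutive rows. The timestamps and the function name spacing_in_days are assumptions made for illustration, not part of the SDV.

```python
from datetime import datetime

# The "datetime_format" declared for the "Time" column in the example metadata.
DATETIME_FORMAT = "%m/%d/%Y"

# Made-up timestamps for the rows of a single sequence.
times = ["01/01/2023", "01/02/2023", "01/05/2023"]


def spacing_in_days(values, datetime_format):
    """Return the number of days between each pair of consecutive rows."""
    parsed = [datetime.strptime(value, datetime_format) for value in values]
    return [(later - earlier).days for earlier, later in zip(parsed, parsed[1:])]


explicit_gaps = spacing_in_days(times, DATETIME_FORMAT)

# Without a sequence index, every gap is treated as one step of an unknown unit.
assumed_gaps = [1] * (len(times) - 1)

print(explicit_gaps)  # [1, 3]
print(assumed_gaps)   # [1, 1]
```

With the index, the uneven gaps of one day and three days are preserved. Without it, the same three rows would be treated as evenly spaced.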

Copyright (c) 2023, DataCebo, Inc.