Sequential Metadata JSON

This guide describes the sequential metadata JSON spec.

An example of sequential data. There are multiple sequences (one for each Patient ID). Within each sequence is an ordered set of rows.
Metadata JSON for this example:
{
    "METADATA_SPEC_VERSION": "SINGLE_TABLE_V1",
    "sequence_key": "Patient ID",
    "sequence_index": "Time",
    "columns": {
        "Patient ID": { "sdtype": "id", "regex_format": "ID_[0-9]{3}" },
        "Address": { "sdtype": "physical_address", "pii": true },
        "Smoker": { "sdtype": "boolean" },
        "Time": { "sdtype": "datetime", "datetime_format": "%m/%d/%Y" },
        "Heart Rate": { "sdtype": "numerical" },
        "Systolic BP": { "sdtype": "numerical" }
    }
}
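
As a minimal sketch of how the two sequence fields relate to the column definitions, the snippet below loads a trimmed copy of the metadata and checks that "sequence_key" and "sequence_index", when present, name columns that are actually defined. The helper `check_sequence_fields` is hypothetical and is not part of the SDV; the assumption that both fields must refer to defined columns is inferred from the descriptions in this guide.

```python
import json

# Trimmed copy of the example metadata, embedded so the sketch is self-contained.
METADATA_JSON = """
{
    "METADATA_SPEC_VERSION": "SINGLE_TABLE_V1",
    "sequence_key": "Patient ID",
    "sequence_index": "Time",
    "columns": {
        "Patient ID": { "sdtype": "id", "regex_format": "ID_[0-9]{3}" },
        "Time": { "sdtype": "datetime", "datetime_format": "%m/%d/%Y" },
        "Systolic BP": { "sdtype": "numerical" }
    }
}
"""


def check_sequence_fields(metadata):
    """Hypothetical helper: list problems in the sequence-related fields."""
    problems = []
    columns = metadata.get("columns", {})
    for field in ("sequence_key", "sequence_index"):
        column_name = metadata.get(field)  # both fields are optional
        if column_name is not None and column_name not in columns:
            problems.append(f"{field} refers to unknown column: {column_name}")
    return problems


metadata = json.loads(METADATA_JSON)
problems = check_sequence_fields(metadata)
print(problems or "sequence fields look consistent")
```

A check like this can catch a misspelled column name before the metadata is handed to any other tool.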

Overview

Sequence Specific Concepts

You can optionally include additional metadata related to your sequences.

"sequence_key": The name of the column that serves as the sequence key, if you have multi-sequence data

The sequence key is a column that identifies which row(s) belong to which sequence. This is usually an ID column, but it may also be a PII sdtype (such as "phone_number").

This is important for tables that contain multiple sequences. In our example, the sequence key is 'Patient ID' because this column is used to break up the sequences.

If you don't supply a sequence key, the SDV assumes that your table only contains a single sequence. Note: The SDV sequential models do not fully support single sequence data.
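
To make the role of the sequence key concrete, here is a small sketch that splits rows into sequences by their key value. The rows are made-up illustration data, and `split_into_sequences` is a hypothetical helper rather than an SDV function; it only mirrors the behavior described above, where a missing sequence key means the whole table is treated as one sequence.

```python
from collections import defaultdict

# Hypothetical rows, shaped like the example table.
rows = [
    {"Patient ID": "ID_001", "Time": "01/02/2020", "Systolic BP": 120},
    {"Patient ID": "ID_001", "Time": "01/03/2020", "Systolic BP": 118},
    {"Patient ID": "ID_002", "Time": "01/02/2020", "Systolic BP": 135},
]


def split_into_sequences(rows, sequence_key=None):
    """Group rows by the sequence key, preserving row order within each group.

    Without a sequence key, all rows form a single sequence (stored under None).
    """
    if sequence_key is None:
        return {None: list(rows)}
    sequences = defaultdict(list)
    for row in rows:
        sequences[row[sequence_key]].append(row)
    return dict(sequences)


sequences = split_into_sequences(rows, sequence_key="Patient ID")
for key, sequence in sequences.items():
    print(key, len(sequence))
```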

"sequence_index": The name of the column that serves as the sequence index, if you have sequential data

The sequence index determines the spacing between the rows in a sequence. Use this if you have an explicit index such as a timestamp. If you don't supply a sequence index, the SDV assumes the rows are equally spaced, in an unknown unit.
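
The idea of spacing can be illustrated with a short sketch that parses sequence index values using the "datetime_format" from the example metadata and measures the gap between consecutive rows. The values are made up, and `spacing_in_days` is a hypothetical helper; this shows what "spacing" means for a datetime index and is not a description of how the SDV computes it internally.

```python
from datetime import datetime

# Format string taken from the "Time" column in the example metadata.
DATETIME_FORMAT = "%m/%d/%Y"

# Hypothetical sequence index values for a single sequence, already in order.
time_values = ["01/01/2020", "01/02/2020", "01/05/2020"]


def spacing_in_days(values, datetime_format):
    """Return the gap, in whole days, between each pair of consecutive values."""
    parsed = [datetime.strptime(value, datetime_format) for value in values]
    return [(later - earlier).days for earlier, later in zip(parsed, parsed[1:])]


gaps = spacing_in_days(time_values, DATETIME_FORMAT)
print(gaps)  # [1, 3]: the rows are not equally spaced
```

Here the uneven gaps are information that would be lost without an explicit sequence index.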
