Sequential Metadata

Use this guide to write a description for a single data table that represents sequential data, for example, a timeseries. In sequential data, rows have a specific order. Your data table may contain multiple, independent sequences belonging to different entities. See the diagram below for an illustration of sequential data.

Your data description is called metadata. SDMetrics expects metadata as a Python dictionary object.

This is the metadata dictionary for the illustrated sequential table:

{
    "sequence_key": "Patient ID",
    "sequence_index": "Time",
    "columns": {
        "Patient ID": {
            "sdtype": "id",
            "regex_format": "ID_[0-9]{3}"
        },
        "Address": {
            "sdtype": "address",
            "pii": True
        },
        "Smoker": {
            "sdtype": "boolean"
        },
        "Time": {
            "sdtype": "datetime",
            "datetime_format": "%m/%d/%Y"
        },
        "Heart Rate": {
            "sdtype": "numerical"
        },
        "Systolic BP": {
            "sdtype": "numerical"
        }
    }
}
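To make the column properties more concrete, the sketch below checks a few sample values against the "regex_format" and "datetime_format" entries from the dictionary above. The sample values and the two helper functions are hypothetical. The checks use only Python's standard library and are not meant to show how SDMetrics itself processes metadata.

```python
import re
from datetime import datetime

# Column entries copied from the metadata dictionary above
patient_id_meta = {"sdtype": "id", "regex_format": "ID_[0-9]{3}"}
time_meta = {"sdtype": "datetime", "datetime_format": "%m/%d/%Y"}

def matches_regex(value, column_meta):
    """Return True if the whole value matches the column's regex_format."""
    return re.fullmatch(column_meta["regex_format"], value) is not None

def parse_datetime(value, column_meta):
    """Parse a string using the column's datetime_format."""
    return datetime.strptime(value, column_meta["datetime_format"])

# Hypothetical sample values, for illustration only
print(matches_regex("ID_001", patient_id_meta))   # True
print(matches_regex("ID_1", patient_id_meta))     # False
print(parse_datetime("01/15/2023", time_meta))    # 2023-01-15 00:00:00
```

Here "%m/%d/%Y" is a standard Python strftime pattern: two-digit month, two-digit day, four-digit year.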

Metadata Specification

The metadata is a dictionary that can have multiple keys:

  • "primary_key": the column name used to identify a row in your table

  • "sequence_key": the name of a column that identifies each unique sequence in your data

  • "sequence_index": the column name used to order the rows in the table

  • (required) "columns": a dictionary description of each column

{
    "sequence_key": "Patient ID",
    "sequence_index": "Time",
    "columns": { <column information> }
}
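As a rough illustration of what "sequence_key" and "sequence_index" describe, the sketch below splits a handful of rows into one sequence per entity and orders each sequence by its index column. The rows and the function split_into_sequences are hypothetical. The sketch also assumes that the sequence index is a datetime column with a "datetime_format".

```python
from collections import defaultdict
from datetime import datetime

metadata = {
    "sequence_key": "Patient ID",
    "sequence_index": "Time",
    "columns": {
        "Patient ID": {"sdtype": "id"},
        "Time": {"sdtype": "datetime", "datetime_format": "%m/%d/%Y"},
        "Systolic BP": {"sdtype": "numerical"},
    },
}

# Hypothetical rows, deliberately out of order
rows = [
    {"Patient ID": "ID_002", "Time": "01/02/2023", "Systolic BP": 118},
    {"Patient ID": "ID_001", "Time": "01/03/2023", "Systolic BP": 131},
    {"Patient ID": "ID_001", "Time": "01/01/2023", "Systolic BP": 127},
]

def split_into_sequences(rows, metadata):
    """Group rows by the sequence key, then sort each group by the sequence index."""
    key = metadata["sequence_key"]
    index = metadata["sequence_index"]
    # Assumption: the sequence index is a datetime column with a datetime_format
    fmt = metadata["columns"][index]["datetime_format"]
    sequences = defaultdict(list)
    for row in rows:
        sequences[row[key]].append(row)
    for sequence in sequences.values():
        sequence.sort(key=lambda r: datetime.strptime(r[index], fmt))
    return dict(sequences)

sequences = split_into_sequences(rows, metadata)
for patient, sequence in sequences.items():
    print(patient, [row["Time"] for row in sequence])
```

Each patient ends up with an independent, time-ordered list of rows, which is the structure the two keys are meant to convey.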

Column Information

Inside "columns", you will describe each column. You'll start with the name of the column. Then you'll specify the type of data and any other information about it.
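Before using a hand-written dictionary, a quick sanity check can catch simple mistakes. The sketch below is a hypothetical helper, not part of SDMetrics. It assumes three rules drawn from this guide: "columns" is required, every column entry names an "sdtype", and any "primary_key", "sequence_key" or "sequence_index" refers to a described column.

```python
def find_metadata_problems(metadata):
    """Return a list of problems found; an empty list means none were found."""
    columns = metadata.get("columns")
    if not isinstance(columns, dict):
        return ['"columns" is required and must be a dictionary']

    problems = []
    # Every column entry should state its type of data
    for name, info in columns.items():
        if not isinstance(info, dict) or "sdtype" not in info:
            problems.append(f'column "{name}" has no "sdtype"')

    # Assumption: top-level keys should point at columns that are described
    for key in ("primary_key", "sequence_key", "sequence_index"):
        if key in metadata and metadata[key] not in columns:
            problems.append(f'"{key}" refers to unknown column "{metadata[key]}"')
    return problems

good = {
    "sequence_key": "Patient ID",
    "sequence_index": "Time",
    "columns": {
        "Patient ID": {"sdtype": "id"},
        "Time": {"sdtype": "datetime", "datetime_format": "%m/%d/%Y"},
    },
}
bad = {"sequence_key": "Patient ID", "columns": {"Time": {}}}

print(find_metadata_problems(good))  # []
print(find_metadata_problems(bad))
```

The second call reports two problems: the "Time" column lacks an "sdtype", and the sequence key names a column that is not described.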

There are specific data types to choose from.

Boolean columns represent True or False values.

"active": { 
    "sdtype": "boolean"
}

Properties: None

Saving & Loading Metadata

After creating your dictionary, you can save it as a JSON file. For example, my_metadata_file.json.

import json

with open('my_metadata_file.json', 'w') as f:
    json.dump(my_metadata_dict, f)

In the future, you can load the Python dictionary by reading from the file.

import json 

with open('my_metadata_file.json') as f:
    my_metadata_dict = json.load(f)

# use my_metadata_dict in the SDMetrics library
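To confirm that nothing is lost when saving and loading, you can compare the loaded dictionary with the original. The sketch below uses a shortened version of the example metadata and writes to a temporary directory, which is a choice made for this illustration only. Note that JSON stores Python's True as true and converts it back on load.

```python
import json
import os
import tempfile

my_metadata_dict = {
    "sequence_key": "Patient ID",
    "sequence_index": "Time",
    "columns": {
        "Patient ID": {"sdtype": "id", "regex_format": "ID_[0-9]{3}"},
        "Address": {"sdtype": "address", "pii": True},
        "Time": {"sdtype": "datetime", "datetime_format": "%m/%d/%Y"},
    },
}

# Write to a temporary directory so the example leaves no files behind
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "my_metadata_file.json")
    with open(path, "w") as f:
        json.dump(my_metadata_dict, f, indent=4)  # indent only aids readability
    with open(path) as f:
        raw_text = f.read()
    loaded_metadata = json.loads(raw_text)

print('"pii": true' in raw_text)             # True
print(loaded_metadata == my_metadata_dict)   # True
```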