Use this guide to write a description for multi-table data. You have multi-table data if your data is spread across multiple tables, each with rows and columns. Usually the tables are connected to each other through primary and foreign key references.
Your data description is called metadata. SDMetrics expects metadata as a Python dictionary object.
This is the metadata dictionary for the illustrated table
The metadata object includes a dictionary named "tables".
{
    "tables": {
        <tables information>
    },
}
Tables
The "tables" dictionary contains the information about each individual table of your application. Its keys are the table names and the values are dictionaries that describe each single table. This includes:
"primary_key": the column name used to identify a row in your table
(required) "columns": a dictionary description of each column
Inside "columns", you will describe each column. You'll start with the name of the column. Then you'll specify the type of data and any other information about it.
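As a rough illustration of the structure described above, here is a minimal sketch of a metadata dictionary with two tables. The table and column names are hypothetical, and the "sdtype" key with the values "id" and "numerical" is an assumption about how the data type is spelled; check the data type reference for the exact keys your version expects.

```python
# Hypothetical two-table layout; every table and column name is illustrative.
# The "sdtype" key and its values are assumptions, not confirmed by this guide.
metadata = {
    "tables": {
        "users": {
            "primary_key": "user_id",
            "columns": {
                "user_id": {"sdtype": "id"},
                "age": {"sdtype": "numerical"},
            },
        },
        "sessions": {
            "primary_key": "session_id",
            "columns": {
                "session_id": {"sdtype": "id"},
                "user_id": {"sdtype": "id"},
                "duration": {"sdtype": "numerical"},
            },
        },
    },
}

# Basic sanity check: each table describes its columns, and the primary key
# (when given) names one of those columns.
for table_name, table in metadata["tables"].items():
    assert "columns" in table, f"{table_name} is missing 'columns'"
    if "primary_key" in table:
        assert table["primary_key"] in table["columns"]
```

The loop at the end is only a convenience check on the shape of the dictionary; it is not part of the metadata itself.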
There are specific data types to choose from, described below.
computer_representation: A string that represents how you'll ultimately store the data. This determines the minimum and maximum values allowed.
Available options are: 'Float', 'Int8', 'Int16', 'Int32', 'Int64', 'UInt8', 'UInt16', 'UInt32', 'UInt64'
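To make the link between the representation and the allowed range concrete, here is a small sketch that computes the bounds for the integer options. It assumes the names follow the usual fixed-width convention (the number is the bit width, and a "U" prefix means unsigned). The helper `integer_bounds` is hypothetical and not part of SDMetrics; 'Float' is left out of the calculation.

```python
def integer_bounds(representation):
    """Return (min, max) for an integer representation, or None for 'Float'.

    Assumes standard fixed-width integer semantics; hypothetical helper.
    """
    if representation == "Float":
        return None  # floating-point ranges are not covered by this sketch
    signed = not representation.startswith("UInt")
    prefix = "Int" if signed else "UInt"
    bits = int(representation[len(prefix):])
    if signed:
        return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    return 0, 2 ** bits - 1


for name in ["Int8", "UInt8", "Int16"]:
    print(name, integer_bounds(name))
# Int8 (-128, 127)
# UInt8 (0, 255)
# Int16 (-32768, 32767)
```

For example, a column that can hold the value 200 would not fit in 'Int8' but would fit in 'UInt8' or 'Int16'.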
ID columns represent identifiers that do not have any special mathematical or semantic meaning.