What's included?
The diagnostic report captures three properties: Data Validity, Data Structure and Relationship Validity. This guide covers the technical details of each property.
Data Validity
Does each column in the data contain valid data?

Methodology
This property applies metrics based on the column types.
numerical, datetime
Continuous values in the synthetic data must fall within the min/max range of the real data.
categorical, boolean
Discrete values in the synthetic data must adhere to the same categories as the real data.
This yields a separate score for every column. The final Data Validity score is the average of all the column scores.
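The methodology above can be sketched in plain Python. This is a minimal illustration and not the library's implementation: the function names, the dict-of-lists table layout and the column_types mapping are assumptions made for the example, and missing values are not handled.

```python
from statistics import mean


def boundary_adherence(real, synthetic):
    """Fraction of synthetic values that fall inside the real [min, max] range."""
    low, high = min(real), max(real)
    return mean(1.0 if low <= value <= high else 0.0 for value in synthetic)


def category_adherence(real, synthetic):
    """Fraction of synthetic values that are categories seen in the real data."""
    allowed = set(real)
    return mean(1.0 if value in allowed else 0.0 for value in synthetic)


def data_validity(real_table, synthetic_table, column_types):
    """Return (per-column scores, average of the column scores).

    Tables are dicts mapping column name -> list of values (hypothetical layout).
    """
    scores = {}
    for column, kind in column_types.items():
        if kind in ("numerical", "datetime"):
            metric = boundary_adherence
        elif kind in ("categorical", "boolean"):
            metric = category_adherence
        else:
            continue  # other column types are out of scope for this sketch
        scores[column] = metric(real_table[column], synthetic_table[column])
    return scores, mean(list(scores.values()))


real = {"age": [20, 35, 50], "plan": ["free", "pro", "free"]}
synthetic = {"age": [25, 60, 40, 30], "plan": ["free", "pro", "pro", "free"]}
types = {"age": "numerical", "plan": "categorical"}

column_scores, overall = data_validity(real, synthetic, types)
print(column_scores, overall)
```

In this example one of the four synthetic ages (60) falls outside the real range of 20 to 50, so the age column scores 0.75, the plan column scores 1.0, and the overall score is their average.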
Data Structure
Does each table have the same overall structure as the real data? The structure includes the column names.

Methodology
This property applies the TableStructure metric to each table of the dataset. The metric checks that the synthetic data has the same set of column names as the real data.
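One plausible way to turn this check into a score is the overlap between the two sets of column names. The sketch below uses intersection over union; whether the TableStructure metric uses exactly this formula is an assumption, and column_name_overlap is a hypothetical helper name.

```python
def column_name_overlap(real_columns, synthetic_columns):
    """Shared column names divided by all column names seen in either table."""
    real, synthetic = set(real_columns), set(synthetic_columns)
    union = real | synthetic
    if not union:
        return 1.0  # two empty tables trivially have the same structure
    return len(real & synthetic) / len(union)


same = column_name_overlap(["id", "age", "plan"], ["id", "age", "plan"])
missing_one = column_name_overlap(["id", "age", "plan"], ["id", "age"])
print(same, missing_one)
```

A score of 1.0 means the column names match exactly. Any missing or extra column in the synthetic table lowers the score.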
Relationship Validity
Does the synthetic data contain valid relationships between different tables?

Methodology
Every relationship in your dataset is defined by a primary/foreign key connection. This property applies two metrics to each relationship to determine its validity:
ReferentialIntegrity: Does each foreign key refer to an existing primary key? If a foreign key refers to a non-existent primary key, it is known as an orphaned child, which is invalid in most databases.
CardinalityBoundaryAdherence: Does each primary key have the correct number of children? The correct number is based on the min/max bounds that are present in the real data.
The final Relationship Validity score is the average of all the sub-scores.
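The two checks can be sketched for a single parent/child relationship as follows. This is an illustration under stated assumptions, not the library's code: keys are passed as plain lists, the function names are hypothetical, and a parent with no children counts as having 0 children.

```python
from collections import Counter
from statistics import mean


def referential_integrity(parent_keys, foreign_keys):
    """Fraction of foreign keys that refer to an existing primary key."""
    valid = set(parent_keys)
    return mean(1.0 if fk in valid else 0.0 for fk in foreign_keys)


def children_per_parent(parent_keys, foreign_keys):
    """Number of child rows for each primary key (0 if it has none)."""
    counts = Counter(foreign_keys)
    return [counts[pk] for pk in parent_keys]


def cardinality_boundary_adherence(real_parent, real_fk, syn_parent, syn_fk):
    """Fraction of synthetic parents whose child count is within the real min/max."""
    real_counts = children_per_parent(real_parent, real_fk)
    low, high = min(real_counts), max(real_counts)
    syn_counts = children_per_parent(syn_parent, syn_fk)
    return mean(1.0 if low <= count <= high else 0.0 for count in syn_counts)


# Real data: every parent has 1 or 2 children.
real_parent, real_fk = [1, 2, 3], [1, 1, 2, 3, 3]
# Synthetic data: key 99 is an orphaned child, and parent 10 has 3 children.
syn_parent, syn_fk = [10, 11, 12, 13], [10, 10, 10, 11, 12, 12, 13, 99]

integrity = referential_integrity(syn_parent, syn_fk)
cardinality = cardinality_boundary_adherence(real_parent, real_fk, syn_parent, syn_fk)
relationship_validity = mean([integrity, cardinality])
print(integrity, cardinality, relationship_validity)
```

Here seven of the eight foreign keys are valid (0.875) and three of the four synthetic parents stay within the real bounds of 1 to 2 children (0.75). With several relationships, the same averaging would extend over all of their sub-scores.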