BoundaryAdherence
This metric measures whether a synthetic column respects the minimum and maximum values of the real column. It returns the proportion of synthetic rows that adhere to the real boundaries, as a score between 0.0 and 1.0.
Numerical: This metric is meant for numerical data
Datetime: This metric converts datetime values into numerical values
If the real data contains missing values, the metric treats missing values in the synthetic data as valid. Otherwise, missing values in the synthetic data are marked as out-of-bounds.
(best) 1.0: All values in the synthetic data respect the min/max boundaries of the real data
(worst) 0.0: No value in the synthetic data falls between the min and max values of the real data
The graph below shows an example of some fictional real and synthetic data (black and green, respectively) with BoundaryAdherence=0.912.
This metric computes the min and max values of the real column. Then, it computes the frequency of synthetic values that are in the [min, max] range.
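The computation described above can be sketched in a few lines of pandas. This is a minimal illustration, not the library's actual implementation: the function name `boundary_adherence_sketch` is hypothetical, the missing-value handling follows the rule stated earlier on this page, and the conversion of datetime values to numbers is left out.

```python
import pandas as pd


def boundary_adherence_sketch(real: pd.Series, synthetic: pd.Series) -> float:
    """Hypothetical helper: fraction of synthetic values inside the real [min, max]."""
    # min() and max() skip missing values by default.
    low, high = real.min(), real.max()

    # Inclusive range check; missing synthetic values evaluate to False here.
    in_bounds = synthetic.between(low, high)

    # Assumption taken from the text above: missing synthetic values count as
    # valid only when the real column also contains missing values.
    if real.isna().any():
        in_bounds = in_bounds | synthetic.isna()

    return float(in_bounds.mean())


real = pd.Series([1.0, 2.0, 5.0, 10.0])
synthetic = pd.Series([0.5, 1.0, 3.0, 10.0, 12.0])
score = boundary_adherence_sketch(real, synthetic)
print(score)
```

In this made-up example the real range is [1.0, 10.0], so 0.5 and 12.0 are out-of-bounds and the remaining three of five synthetic values are counted as adhering.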
Recommended Usage: The Diagnostic Report applies this metric to applicable columns.
To manually apply this metric, access the single_column module and use the compute method.
Parameters
(required) real_data: A pandas.Series object with the column of real data
(required) synthetic_data: A pandas.Series object with the column of synthetic data