Defining your metric

Use this guide to define your metric. It's important to think through the abstractions and functionality of your metric before adding it.
When you're ready to add your metric, please file an issue with the relevant details. We recommend waiting for feedback before you begin implementing your metric.


All metrics in this library are model-agnostic. Anyone who wants to use your metric should already have:
  1. A real dataset
  2. A synthetic dataset, which could be created using any model

Base Metric

The base version of your metric operates on the smallest possible unit of real and synthetic data, such as a single column, a pair of columns or a single table. The base metric is a class with a compute method. The method takes in the minimal unit of real data, the corresponding synthetic data and any other keyword args you want to add. It returns a score, represented as a floating point value.
    from sdmetrics.column_pairs import YourMetricName

    YourMetricName.compute(
        real_data[['column_1', 'column_2']],
        synthetic_data[['column_1', 'column_2']],
        kwarg1=...
    )
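
As a rough illustration of this shape, the sketch below defines a hypothetical column-pair metric. The class name `ExampleCorrelationMetric`, the `min_rows` keyword and the scoring formula are assumptions made for this example, not part of the SDMetrics API. To keep the sketch dependency-free, each column pair is passed as two plain Python lists, whereas the usage above passes two-column DataFrames.

```python
import math


def _pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # assumption: treat a constant column as uncorrelated
    return cov / math.sqrt(var_x * var_y)


class ExampleCorrelationMetric:
    """Hypothetical base metric: compares one pair of numerical columns."""

    @staticmethod
    def compute(real_data, synthetic_data, min_rows=2):
        """Return a float score in [0, 1], where 1 is the best.

        real_data and synthetic_data are each a (column_a, column_b) pair.
        min_rows is an illustrative keyword argument.
        """
        for col_a, col_b in (real_data, synthetic_data):
            if len(col_a) != len(col_b) or len(col_a) < min_rows:
                raise ValueError(
                    'Each pair needs two equal-length columns '
                    'with at least min_rows rows.'
                )
        real_corr = _pearson(*real_data)
        synthetic_corr = _pearson(*synthetic_data)
        # Correlations lie in [-1, 1], so the gap is at most 2.
        return 1.0 - abs(real_corr - synthetic_corr) / 2.0


real_pair = ([1, 2, 3, 4], [2, 4, 6, 8])
synthetic_pair = ([1, 2, 3, 4], [8, 6, 4, 2])
score = ExampleCorrelationMetric.compute(real_pair, synthetic_pair)
```

Here the real columns are perfectly correlated and the synthetic columns are perfectly anti-correlated, so the score lands at the worst end of the range. Passing the same pair as both real and synthetic data yields the best score.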

(Optional) Iterative Application

In many cases, you may want to iterate through the entire dataset to apply the base metric to different columns, pairs of columns, tables, etc. You can write a convenience method called apply_to_table that performs this iteration.
This method takes in the full real data, the full synthetic data, keyword args and metadata. Use the metadata to determine where to apply the base metric. The method returns a dictionary of results, keyed by the base unit.
    {
        ('column_1', 'column_2'): 0.1234,
        ('column_1', 'column_3'): 0.5678,
        ('column_2', 'column_3'): 0.9012,
    }
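
One possible shape for this iteration is sketched below. The class `ExamplePairMetric`, its toy scoring rule, the `scale` keyword, the dict-of-lists table format and the metadata layout (a `columns` dict with an `sdtype` per column) are all assumptions chosen to keep the example self-contained. Adapt them to the metadata format your metric actually receives.

```python
from itertools import combinations


class ExamplePairMetric:
    """Hypothetical column-pair metric, used only to illustrate the iteration."""

    @staticmethod
    def compute(real_pair, synthetic_pair, scale=1.0):
        """Toy base metric: compare the mean row-wise gap of each pair."""
        def mean_gap(pair):
            col_a, col_b = pair
            return sum(a - b for a, b in zip(col_a, col_b)) / len(col_a)

        difference = abs(mean_gap(real_pair) - mean_gap(synthetic_pair))
        return 1.0 / (1.0 + difference / scale)  # maps [0, inf) onto (0, 1]

    @classmethod
    def apply_to_table(cls, real_table, synthetic_table, metadata, **kwargs):
        """Apply compute to every pair of numerical columns in one table."""
        numerical = [
            name for name, info in metadata['columns'].items()
            if info['sdtype'] == 'numerical'  # assumed metadata layout
        ]
        results = {}
        for col_a, col_b in combinations(numerical, 2):
            results[(col_a, col_b)] = cls.compute(
                (real_table[col_a], real_table[col_b]),
                (synthetic_table[col_a], synthetic_table[col_b]),
                **kwargs,
            )
        return results


metadata = {'columns': {
    'column_1': {'sdtype': 'numerical'},
    'column_2': {'sdtype': 'numerical'},
    'column_3': {'sdtype': 'numerical'},
    'user_id': {'sdtype': 'id'},
}}
real_table = {
    'column_1': [1, 2, 3], 'column_2': [2, 3, 4],
    'column_3': [0, 0, 0], 'user_id': ['a', 'b', 'c'],
}
synthetic_table = {
    'column_1': [1, 2, 3], 'column_2': [2, 3, 4],
    'column_3': [1, 1, 1], 'user_id': ['x', 'y', 'z'],
}
results = ExamplePairMetric.apply_to_table(real_table, synthetic_table, metadata)
```

The `user_id` column is skipped because its assumed `sdtype` is not numerical, so `results` contains one entry for each of the three numerical column pairs.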
You can write multiple levels of iteration. For example, it may be possible to run your metric on multi-table datasets too. In this case, the breakdown is further keyed by table name.
    {
        'users': {
            ('column_1', 'column_2'): 0.1234,
            ('column_1', 'column_3'): 0.5678
        },
        'sessions': ...
    }
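
A second level of iteration might look like the sketch below, which wraps any table-level function and keys its results by table name. The helper name `apply_to_multi_table`, the `tables` metadata layout and the toy `column_presence` function are hypothetical and exist only to show the nesting.

```python
def apply_to_multi_table(real_tables, synthetic_tables, metadata,
                         apply_to_table, **kwargs):
    """Hypothetical helper: run a table-level function on every table."""
    results = {}
    for table_name, table_metadata in metadata['tables'].items():
        results[table_name] = apply_to_table(
            real_tables[table_name],
            synthetic_tables[table_name],
            table_metadata,
            **kwargs,
        )
    return results


def column_presence(real_table, synthetic_table, table_metadata):
    """Toy stand-in for apply_to_table: 1.0 if both tables contain the column."""
    return {
        column: float(column in real_table and column in synthetic_table)
        for column in table_metadata['columns']
    }


metadata = {'tables': {
    'users': {'columns': {'age': {}, 'country': {}}},
    'sessions': {'columns': {'duration': {}}},
}}
real_tables = {
    'users': {'age': [30], 'country': ['US']},
    'sessions': {'duration': [12]},
}
synthetic_tables = {
    'users': {'age': [28]},
    'sessions': {'duration': [10]},
}
results = apply_to_multi_table(
    real_tables, synthetic_tables, metadata, column_presence
)
```

Passing the table-level function as an argument keeps the multi-table loop independent of any particular metric.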

Metric Description

Every metric includes a detailed description in these docs. It can be helpful to think through this before implementation.
Data Compatibility
Can it run on numerical, categorical, boolean, datetime or ID columns? What about a column with missing values?
We recommend that the final score range from 0 (worst) to 1 (best). Consider which aspect of synthetic data your metric evaluates (e.g. privacy or quality). You may need to invert the raw value so that 1 is the best score.
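
For example, a metric whose raw output is a distance (0 is best, larger is worse) could be converted along these lines. The function name `distance_to_score` and the `max_distance` parameter are hypothetical. This is one possible normalization, not a required one.

```python
def distance_to_score(distance, max_distance):
    """Map a distance in [0, max_distance] to a score in [0, 1], where 1 is best."""
    if max_distance <= 0:
        raise ValueError('max_distance must be positive.')
    # Clamp first so out-of-range inputs cannot push the score outside [0, 1].
    clamped = min(max(distance, 0.0), max_distance)
    # Invert: the smallest distance becomes the best score.
    return 1.0 - clamped / max_distance


best = distance_to_score(0.0, max_distance=2.0)
worst = distance_to_score(2.0, max_distance=2.0)
middle = distance_to_score(0.5, max_distance=2.0)
```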
How does it work?
Provide a description of how your metric works. Include any relevant mathematical formulas and citations.
Provide the API for the base usage. Describe any extra parameters, their possible values and defaults.
Provide answers to any questions a user may have. For example: If there are similar metrics that already exist, when should yours be used? When should the parameters be changed?

Other Considerations

  • Metrics vs. parameters. If your metric is extremely similar to another, consider combining them and introducing a parameter instead.
  • External dependencies. If your metric introduces new dependencies, consider whether they are necessary. New dependencies make it harder to maintain the overall SDMetrics package and may leave the software vulnerable if the external library is not being regularly updated or used.
  • Determinism. If your metric is not deterministic, explore why this is the case. If the score varies widely between successive runs, it may be hard to interpret.
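
One way to explore the determinism point is to expose the source of randomness as a parameter and measure how much the score moves across runs. The sketch below assumes a hypothetical metric that subsamples rows. The names `subsampled_mean_score`, `sample_size` and `seed` are illustrative only.

```python
import random
import statistics


def subsampled_mean_score(real_column, synthetic_column,
                          sample_size=50, seed=None):
    """Hypothetical metric that compares column means on a random subsample."""
    rng = random.Random(seed)  # a fixed seed makes the result reproducible
    real_sample = rng.sample(
        real_column, min(sample_size, len(real_column)))
    synthetic_sample = rng.sample(
        synthetic_column, min(sample_size, len(synthetic_column)))
    gap = abs(statistics.mean(real_sample) - statistics.mean(synthetic_sample))
    return 1.0 / (1.0 + gap)  # maps [0, inf) onto (0, 1]


real_column = list(range(1000))
synthetic_column = [value + 5 for value in range(1000)]

# The same seed gives the same score.
first = subsampled_mean_score(real_column, synthetic_column, seed=0)
second = subsampled_mean_score(real_column, synthetic_column, seed=0)

# Different seeds show how much the score varies between runs.
scores = [
    subsampled_mean_score(real_column, synthetic_column, seed=s)
    for s in range(20)
]
spread = statistics.pstdev(scores)
```

If `spread` is large relative to the 0 to 1 score range, options include increasing the sample size or documenting a recommended seed.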