SDGym

Interpreting Results


Benchmark results are available for every synthesizer and dataset pair. The returned results are a pandas DataFrame object.

Synthesizer                Dataset   Dataset_Size_MB   Model_Time   Peak_Memory_MB   Model_Size_MB    Sample_Time    Evaluate_Time   Diagnostic_Score  Quality_Score   NewRowSynthesis
GaussianCopulaSynthesizer  alarm     34.5              123.56       300101           0.981            2012.1         1001.2          1.00000           0.9991991       0.998191
GaussianCopulaSynthesizer  census    130.2             23546.12     201011           1.232            2012.2         101012.1        1.00000           0.689101        1.0
CTGANSynthesizer           alarm     34.5              NaN          99999999         NaN              NaN            NaN             1.00000           NaN             NaN
CTGANSynthesizer           census    130.2             9919331      9929188110       12.10            123.31         NaN             1.00000           NaN             NaN
IdentitySynthesizer        alarm     34.5              0.00001      10               0.010            2012.2         1000            1.00000           1.0             0.0
IdentitySynthesizer        census    130.2             2            2012.2           0.031            1003           0.321           1.00000           1.0             0.0
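
As a point of reference, here is a minimal sketch of a run that would produce a table like the one above, assuming SDGym is installed and the default hosted demo datasets are reachable; the exact argument names may vary between SDGym versions.

```python
import sdgym

# Run a small benchmark; the synthesizer and dataset names match the example above.
results = sdgym.benchmark_single_table(
    synthesizers=['GaussianCopulaSynthesizer', 'CTGANSynthesizer'],
    sdv_datasets=['alarm', 'census'],
)

# `results` is a pandas DataFrame with one row per (synthesizer, dataset) pair.
print(results[['Synthesizer', 'Dataset', 'Quality_Score']])
```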

Returned Results

The results provide a summary of the benchmarking setup, the performance during execution, and the overall evaluation. Browse through the sections below to learn more about what each result means.

These results summarize the setup of your benchmarking run.

  • Synthesizer: The name of the synthesizer used to model and create the synthetic data

  • Dataset: The name of the dataset that the synthesizer learned to create

  • Dataset_Size_MB: The overall size of the dataset when loaded into Python, in MB

These results track the execution of the benchmarking script.

  • Model_Time: The time it took for the synthesizer to learn from the real data and train a model, in seconds

  • Peak_Memory_MB: The maximum memory used during model training, in MB

  • Model_Size_MB: An estimate of the final size of the trained model, in MB

  • Sample_Time: The time it took to generate synthetic data using the trained model, in seconds
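
As a quick illustration, these execution columns can be aggregated directly with pandas; the sketch below assumes `results` is the DataFrame returned by the benchmark run.

```python
# Average training and sampling cost per synthesizer across datasets (illustrative).
execution_cols = ['Model_Time', 'Peak_Memory_MB', 'Model_Size_MB', 'Sample_Time']
summary = results.groupby('Synthesizer')[execution_cols].mean()
print(summary.sort_values('Model_Time'))
```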

These results summarize the evaluation of the synthetic data against the real data.

  • Evaluate_Time: The time it took for any additional evaluation of the synthetic data, in seconds

  • Diagnostic_Score: An overall score that summarizes the basic usability of the synthetic data, where 0 is the worst and 1 is the best.

  • Quality_Score: An overall estimate of the synthetic data quality, where 0 is the worst and 1 is the best.

  • <other results>: Any other metrics that you apply will appear as additional results. Refer to the SDMetrics library for more details about what each metric means.
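
To compare synthesizers on these evaluation columns, a short pandas sketch (again assuming `results` is the returned DataFrame):

```python
# Rank synthesizers by quality on each dataset; a higher Quality_Score is better.
ranked = results.sort_values(['Dataset', 'Quality_Score'], ascending=[True, False])
print(ranked[['Dataset', 'Synthesizer', 'Diagnostic_Score', 'Quality_Score']])
```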

Errors

If the synthesizer crashed at any point in the process, you will see a NaN value from that point onwards. For example, if your synthesizer ran out of memory during the training phase, you'll see NaN values for the model size, sample time, evaluation time and other metrics.
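
Because crashed runs surface as NaN values, they can be isolated with a standard pandas filter; this sketch assumes `results` is the benchmark DataFrame and treats a missing Quality_Score as the sign that the run did not complete.

```python
# Rows with a missing Quality_Score did not make it through the full pipeline.
crashed = results[results['Quality_Score'].isna()]
print(crashed[['Synthesizer', 'Dataset', 'Model_Time', 'Peak_Memory_MB']])
```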

If you set the detailed_results_folder option, that folder should contain more information about the exact error message.
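
For completeness, here is a sketch of a benchmark call that writes those detailed logs to disk; the folder path is illustrative and the argument follows the detailed_results_folder setting mentioned above.

```python
results = sdgym.benchmark_single_table(
    synthesizers=['CTGANSynthesizer'],
    sdv_datasets=['census'],
    detailed_results_folder='sdgym_detailed_results/',  # error details are written here
)
```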

FAQs

How is the quality score computed?

We compute the quality score by measuring:

  • Whether the individual column shapes in the synthetic data match the real data, and

  • Whether the correlations between pairs of columns are the same between the real and synthetic data

A score of 1 indicates a perfect match, or high quality. A score of 0 indicates that the data is as different as can be. For more information, see the SDMetrics Quality Report.
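
For readers who want to reproduce this score outside of the benchmark, here is a sketch using the SDMetrics Quality Report; it assumes `real_data` and `synthetic_data` are pandas DataFrames and `metadata` is the corresponding single-table metadata dictionary.

```python
from sdmetrics.reports.single_table import QualityReport

report = QualityReport()
report.generate(real_data, synthetic_data, metadata)

print(report.get_score())  # overall quality score between 0 and 1

# Per-property breakdowns behind the overall score.
print(report.get_details(property_name='Column Shapes'))
print(report.get_details(property_name='Column Pair Trends'))
```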
