Interpreting Results

The benchmark returns one row of results for every synthesizer and dataset pair. The returned results are a pandas DataFrame object.
Synthesizer                Dataset  Dataset_Size_MB  Model_Time  Peak_Memory_MB  Model_Size_MB  Sample_Time  Evaluate_Time  Quality_Score  NewRowSynthesis
FASTMLPreset               alarm    34.5             45.45       100201          0.340          2012.2       1001.2         0.71882        0.99901
FASTMLPreset               census   130.2            200.691     100231          0.450          2012.2       1012.2         0.88191        1.0
GaussianCopulaSynthesizer  alarm    34.5             123.56      300101          0.981          2012.1       1001.2         0.9991991      0.998191
GaussianCopulaSynthesizer  census   130.2            23546.12    201011          1.232          2012.2       101012.1       0.689101       1.0
CTGANSynthesizer           alarm    34.5             NaN         99999999        NaN            NaN          NaN            NaN            NaN
CTGANSynthesizer           census   130.2            9919331     9929188110      12.10          123.31       NaN            NaN            NaN
IdentitySynthesizer        alarm    34.5             0.00001     10              0.010          2012.2       1000           1.0            0.0
IdentitySynthesizer        census   130.2            2           2012.2          0.031          1003         0.321          1.0            0.0

Returned Results

The results summarize the benchmarking setup, the performance during execution and the overall evaluation. Each group of results is described below.
These results summarize the setup of your benchmarking run.
  • Synthesizer: The name of the synthesizer used to model and create the synthetic data
  • Dataset: The name of the real dataset that the synthesizer learned from
  • Dataset_Size_MB: The overall size of the dataset when loaded into Python, in MB
These results track the execution of the benchmarking script.
  • Model_Time: The time it took for the synthesizer to learn from the real data and train a model, in seconds
  • Peak_Memory_MB: The maximum memory that the model training took, in MB
  • Model_Size_MB: An estimate of the final size of the trained model, in MB
  • Sample_Time: The time it took to generate synthetic data using the trained model, in seconds
These results summarize the evaluation of the synthetic data against the real data.
  • Evaluate_Time: The time it took for any additional evaluation of the synthetic data, in seconds
  • Quality_Score: An overall estimate of the synthetic data quality, where 0 is the worst and 1 is the best.
  • <other results>: Any other metrics that you apply will appear as additional results. Refer to the SDMetrics library for more details about what the metric means.
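Because the results are a regular pandas DataFrame, you can summarize them with standard pandas operations. The snippet below is a minimal sketch that compares synthesizers by their average quality score. The `results` table is built by hand from a few of the example values above as a stand-in for real benchmark output, and the summary column names `Mean_Quality` and `Total_Model_Time` are illustrative choices, not part of the benchmark output.

```python
import pandas as pd

# Hand-built stand-in for the benchmark output, using example values from this page.
results = pd.DataFrame({
    'Synthesizer': [
        'FASTMLPreset', 'FASTMLPreset',
        'GaussianCopulaSynthesizer', 'GaussianCopulaSynthesizer',
    ],
    'Dataset': ['alarm', 'census', 'alarm', 'census'],
    'Model_Time': [45.45, 200.691, 123.56, 23546.12],
    'Quality_Score': [0.71882, 0.88191, 0.9991991, 0.689101],
})

# One row per synthesizer: average quality across datasets and total training time.
summary = (
    results.groupby('Synthesizer')
    .agg(
        Mean_Quality=('Quality_Score', 'mean'),
        Total_Model_Time=('Model_Time', 'sum'),
    )
    .sort_values('Mean_Quality', ascending=False)
)
print(summary)
```

Averaging across datasets hides per-dataset differences, so it can help to inspect the individual rows as well.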


If the synthesizer crashed at any point in the process, you will see a NaN value from that point onwards. For example, if your synthesizer ran out of memory during the training phase, you'll see NaN values for the model size, sample time, evaluation time and other metrics.
If you provided a detailed_results_folder when running the benchmark, that folder should contain more information about the exact error message.
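Since a crash shows up as NaN from the failing step onwards, the first NaN column in a row indicates roughly where the run stopped. The following is a minimal sketch of that check, assuming the column names shown on this page. The `results` table is hand-built from the example values, and the helper `first_failed_stage` and the `Failed_At` column are hypothetical names introduced for illustration.

```python
import numpy as np
import pandas as pd

# Hand-built stand-in for the benchmark output, using example values from this page.
results = pd.DataFrame({
    'Synthesizer': ['GaussianCopulaSynthesizer', 'CTGANSynthesizer', 'CTGANSynthesizer'],
    'Dataset': ['alarm', 'alarm', 'census'],
    'Model_Time': [123.56, np.nan, 9919331.0],
    'Sample_Time': [2012.1, np.nan, 123.31],
    'Evaluate_Time': [1001.2, np.nan, np.nan],
    'Quality_Score': [0.9991991, np.nan, np.nan],
})

# Result columns in the order the benchmark steps run (assumed from this page).
STAGES = ['Model_Time', 'Sample_Time', 'Evaluate_Time', 'Quality_Score']


def first_failed_stage(row):
    """Return the first stage column that is NaN, or None if every stage has a value."""
    for column in STAGES:
        if pd.isna(row[column]):
            return column
    return None


# Hypothetical helper column that records where each run stopped.
results['Failed_At'] = results.apply(first_failed_stage, axis=1)

# Keep only the runs that did not complete.
failed_runs = results[results['Failed_At'].notna()]
print(failed_runs[['Synthesizer', 'Dataset', 'Failed_At']])
```

This only tells you which step is missing a value. The reason for the failure still has to come from the error details.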


How is the quality score computed?
We compute the quality score by measuring:
  • Whether the individual column shapes in the synthetic data match the real data, and
  • Whether the correlations between pairs of columns are the same between the real and synthetic data
A score of 1 indicates a perfect match, or high quality. A score of 0 indicates that the data is as different as can be. For more information, see the SDMetrics Quality Report.
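To make the two ingredients concrete, here is a simplified sketch that scores numeric columns only. It is not the SDMetrics implementation: it assumes column shapes are compared with one minus the Kolmogorov-Smirnov statistic, pairwise trends with one minus half the absolute difference in Pearson correlation, and that the overall score is the plain average of the two. The function names are hypothetical, and the actual report also handles other column types.

```python
from itertools import combinations

import numpy as np
import pandas as pd


def ks_complement(real, synthetic):
    """Column shape score: 1 minus the two-sample Kolmogorov-Smirnov statistic."""
    real = np.sort(np.asarray(real, dtype=float))
    synthetic = np.sort(np.asarray(synthetic, dtype=float))
    grid = np.concatenate([real, synthetic])
    # Empirical CDF of each sample, evaluated at every observed value.
    cdf_real = np.searchsorted(real, grid, side='right') / len(real)
    cdf_synth = np.searchsorted(synthetic, grid, side='right') / len(synthetic)
    return 1.0 - float(np.max(np.abs(cdf_real - cdf_synth)))


def correlation_similarity(real, synthetic, col_a, col_b):
    """Pair score: 1 minus half the absolute difference in Pearson correlation."""
    real_corr = real[col_a].corr(real[col_b])
    synth_corr = synthetic[col_a].corr(synthetic[col_b])
    return 1.0 - abs(real_corr - synth_corr) / 2.0


def simple_quality_score(real, synthetic):
    """Average of the mean column shape score and the mean column pair score."""
    columns = list(real.columns)
    shapes = [ks_complement(real[c], synthetic[c]) for c in columns]
    pairs = [
        correlation_similarity(real, synthetic, a, b)
        for a, b in combinations(columns, 2)
    ]
    return (float(np.mean(shapes)) + float(np.mean(pairs))) / 2.0


# Made-up data: income depends on age in the "real" table.
rng = np.random.default_rng(0)
real = pd.DataFrame({'age': rng.normal(40, 10, 500)})
real['income'] = real['age'] * 1000 + rng.normal(0, 5000, 500)

# Shuffling one column keeps every column shape but breaks the correlation.
shuffled = real.copy()
shuffled['income'] = rng.permutation(real['income'].to_numpy())

print(simple_quality_score(real, real))
print(simple_quality_score(real, shuffled))
```

In this sketch, the shuffled table keeps a perfect column shape score but loses points on the column pair score, so its overall score drops below that of an exact copy.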
© Copyright 2023, DataCebo, Inc.