Running a Benchmark

Benchmark single table synthesizers using a variety of different datasets.

import sdgym

results = sdgym.benchmark_single_table()

See Interpreting Results for a detailed description of the benchmarking results.

Optional Parameters

Every step of the benchmarking process is customizable. Use the optional parameters to control the setup, execution and evaluation.

Setup

Use these parameters to control which synthesizers and datasets to include in the benchmark.

synthesizers: Control which SDV synthesizers to use by supplying a list of strings with the synthesizer names

(default) ['GaussianCopulaSynthesizer', 'CTGANSynthesizer']
Options include 'GaussianCopulaSynthesizer', 'CTGANSynthesizer', 'TVAESynthesizer' and 'CopulaGANSynthesizer'. See SDV Synthesizers for more details.

sdgym.benchmark_single_table(synthesizers=['GaussianCopulaSynthesizer', 'TVAESynthesizer'])

custom_synthesizers: Supply your own custom synthesizers and variants using a list of classes.

(default) None: Do not run the benchmark on any custom synthesizers
To create your own class, see CustomSynthesizers. You can also create a variant of an SDV Synthesizer.

sdv_datasets: Control which of the SDV demo datasets to use by supplying their names as a list of strings.

(default) ['adult', 'alarm', 'census', 'child', 'expedia_hotel_logs', 'insurance', 'intrusion', 'news', 'covtype']
See Datasets for more options

additional_datasets_folder: You can also supply the name of a folder containing your own datasets, either as a local filepath or an AWS S3 bucket path.

(default) None: Do not run the benchmark for any additional datasets.
<string>: The path to your folder that contains additional datasets. Make sure your datasets are in the correct format and that you have the proper authentications to access the folder. See Custom Datasets for more details.
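As a rough sketch, the setup parameters can be collected and checked before starting a run. The helper `build_setup_kwargs` and the set `KNOWN_SYNTHESIZERS` are hypothetical names introduced for this illustration and are not part of SDGym. The check only covers the four synthesizer names listed above, so relax it if your SDGym version accepts others.

```python
# Synthesizer names listed as options in this section (SDGym may accept others).
KNOWN_SYNTHESIZERS = {
    'GaussianCopulaSynthesizer',
    'CTGANSynthesizer',
    'TVAESynthesizer',
    'CopulaGANSynthesizer',
}


def build_setup_kwargs(synthesizers, sdv_datasets, additional_datasets_folder=None):
    """Hypothetical helper: check the names and collect the setup parameters."""
    unknown = sorted(set(synthesizers) - KNOWN_SYNTHESIZERS)
    if unknown:
        raise ValueError(f'Unknown synthesizer name(s): {unknown}')
    kwargs = {
        'synthesizers': list(synthesizers),
        'sdv_datasets': list(sdv_datasets),
    }
    # Only pass the folder when one was given, so the default (None) applies otherwise.
    if additional_datasets_folder is not None:
        kwargs['additional_datasets_folder'] = additional_datasets_folder
    return kwargs


setup_kwargs = build_setup_kwargs(
    synthesizers=['GaussianCopulaSynthesizer', 'TVAESynthesizer'],
    sdv_datasets=['adult', 'census'],
)

# With SDGym installed, the benchmark call would then be:
# import sdgym
# results = sdgym.benchmark_single_table(**setup_kwargs)
```

Checking the names up front means a typo is caught before any synthesizer starts training.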

Execution

Use these parameters to control the speed and flow of the benchmark.

limit_dataset_size: Set this boolean to limit the size of every dataset. This will yield faster results but may affect the overall quality.

(default) False: Use the full datasets for benchmarking.
True: Limit the dataset size before benchmarking. For every dataset selected, use only 100 rows (randomly sampled) and the first 10 columns.
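To make the effect of limit_dataset_size=True concrete, here is a minimal sketch of an equivalent reduction on a toy table, using only the Python standard library. It is not SDGym's implementation. The function `limit_dataset` and its arguments are hypothetical names used for illustration.

```python
import random


def limit_dataset(columns, rows, max_rows=100, max_columns=10, seed=None):
    """Keep the first `max_columns` columns and a random sample of up to `max_rows` rows."""
    rng = random.Random(seed)
    kept_columns = columns[:max_columns]
    # Sample without replacement; small tables are kept whole (in shuffled order).
    sampled_rows = rng.sample(rows, min(max_rows, len(rows)))
    return kept_columns, [row[:max_columns] for row in sampled_rows]


# Toy table: 500 rows, 15 columns.
columns = [f'col_{i}' for i in range(15)]
rows = [[r * 100 + c for c in range(15)] for r in range(500)]

small_columns, small_rows = limit_dataset(columns, rows, seed=0)
print(len(small_rows), len(small_columns))  # prints: 100 10
```

Because the rows are sampled and the trailing columns are dropped, scores computed on the limited data may differ from those on the full dataset.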

timeout: The maximum number of seconds to give to each synthesizer to train and sample a dataset

(default) None: Do not set a maximum. Allow the synthesizer to take as long as it needs.
<integer>: Allow each synthesizer to run for up to this many seconds on each dataset. If the synthesizer exceeds the limit, the benchmark will report a TimeoutError.

run_on_ec2: Control whether to run the benchmark on your local machine or on an Amazon EC2 instance

(default) False: Run the benchmark on your local machine
True: Launch an EC2 instance for running the benchmark. If you use this option, make sure the output_filepath parameter is an S3 filepath. For more information, see the docs for AWS Integration.

output_filepath: Supply the name of a file if you'd like to save your results.

(default) None: Do not save the results to a file. Note that the results will still be returned by the method in your Python script.
<string>: Save the final results at this location. The results are saved as a CSV file, so please make sure the filename ends with '.csv'. You may also supply an Amazon S3 filepath. For more information, see the docs for AWS Integration.
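The two constraints described above (a '.csv' filename, and an S3 output path when run_on_ec2 is True) can be checked before launching a run. The sketch below is a hypothetical pre-flight check, not part of SDGym. It assumes S3 paths use the usual 's3://' prefix, and the bucket name 'my-bucket' is a placeholder.

```python
def check_output_settings(output_filepath=None, run_on_ec2=False):
    """Hypothetical pre-flight check; returns a list of problems (empty if none)."""
    problems = []
    if output_filepath is not None and not output_filepath.endswith('.csv'):
        problems.append("output_filepath should end with '.csv'")
    if run_on_ec2:
        # Assumption: S3 filepaths are written with the 's3://' scheme.
        if output_filepath is None or not output_filepath.startswith('s3://'):
            problems.append('run_on_ec2=True needs an S3 output_filepath')
    return problems


print(check_output_settings('benchmarking_output.csv'))
print(check_output_settings('benchmarking_output.csv', run_on_ec2=True))
print(check_output_settings('s3://my-bucket/benchmarking_output.csv', run_on_ec2=True))
```

Returning a list of problems rather than raising on the first one lets you report every misconfiguration at once.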

detailed_results_folder: Supply the name of the folder if you'd like to save detailed results during the benchmarking computation.

(default) None: Do not save the detailed results anywhere. Keep them in memory and use them to compute the final results.
<string>: Store the detailed results (as multiple files) within the provided folder. The final results will be computed from these files, including evaluation breakdowns and any Error messages. You may also supply an Amazon S3 filepath. For more information, see the docs for AWS Integration.

If your script crashes during execution, you can view the detailed results for any successfully completed runs.
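After a crash, a quick way to see what was saved is to list the files in the folder. The sketch below works for a local folder only and makes no assumption about the names or formats of the files SDGym writes. The helper `list_detailed_results` is a hypothetical name, and 'results/' is the folder used in the example further down this page.

```python
from pathlib import Path


def list_detailed_results(folder):
    """Return the files found under `folder` (recursively), sorted by path."""
    root = Path(folder)
    if not root.is_dir():
        return []
    return sorted(p for p in root.rglob('*') if p.is_file())


# Print each saved file with its size so completed runs are easy to spot.
for path in list_detailed_results('results/'):
    print(path, path.stat().st_size, 'bytes')
```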

show_progress: Show the incremental progress of running the script

(default) False: Do not show the progress. Nothing will be printed on the screen.
True: Print a progress bar to indicate the completion of the benchmarking.

Evaluation

Use the evaluation parameters to control what to measure when benchmarking.

The SDGym benchmark will always measure performance (time and memory). Use additional parameters to evaluate other aspects of the synthetic data after it's created.

compute_diagnostic_score: Set this boolean to generate an overall diagnostic score for every synthesizer and dataset. This may increase the benchmarking time.

(default) True: Compute an overall diagnostic score. See the SDMetrics Diagnostic Report for more details.
False: Do not compute a diagnostic score.

compute_quality_score: Set this boolean to generate an overall quality score for every synthesizer and dataset. This may increase the benchmarking time.

(default) True: Compute an overall quality score. See the SDMetrics Quality Report for more details.
False: Do not compute a quality score.

sdmetrics: Provide a list of strings to compute additional metrics. The metric names must correspond to metrics in the SDMetrics library.

To pass in optional parameters, specify a tuple with the metric name followed by a dictionary of parameters and values.

(default) [('NewRowSynthesis', {'synthetic_sample_size': 1_000 })]: Apply the NewRowSynthesis metric with a parameter to control the sample size.
See the SDMetrics library for more metric options
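The sdmetrics list can mix plain metric names with (name, parameters) tuples. As a small sketch of that format, the hypothetical helper below (not part of SDGym) converts every entry into a uniform (name, parameters) pair, which is handy for logging or checking a configuration before a long run.

```python
def normalize_sdmetrics(sdmetrics):
    """Turn each entry into a (metric_name, parameters) pair."""
    normalized = []
    for entry in sdmetrics or []:  # None is treated as "no additional metrics"
        if isinstance(entry, str):
            # A bare string means the metric runs with its default parameters.
            normalized.append((entry, {}))
        else:
            name, params = entry
            normalized.append((name, dict(params)))
    return normalized


sdmetrics = [
    ('NewRowSynthesis', {'synthetic_sample_size': 1_000}),
    'RangeCoverage',
]
for name, params in normalize_sdmetrics(sdmetrics):
    print(name, params)
```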

Examples

Running a quick trial for performance testing only:

import sdgym

results = sdgym.benchmark_single_table(
    limit_dataset_size=True,
    timeout=600,
    compute_diagnostic_score=False,
    compute_quality_score=False,
    sdmetrics=None
)

Running a detailed benchmark with custom evaluation metrics:

import sdgym

results = sdgym.benchmark_single_table(
    output_filepath='benchmarking_output.csv',
    detailed_results_folder='results/',
    show_progress=True,
    compute_diagnostic_score=False,
    compute_quality_score=False,
    sdmetrics=[
        ('NewRowSynthesis', {'synthetic_sample_size': 1_000}),
        'MissingValueSimilarity',
        'RangeCoverage',
        'BoundaryAdherence',
        ('CorrelationSimilarity', {'coefficient': 'Spearman'})
    ]
)
