SDV Synthesizers
The SDV library offers a variety of synthesizers that you can use for creating synthetic data and benchmarking it. Pass the string names into the synthesizers parameter.
import sdgym

sdgym.benchmark_single_table(
    synthesizers=['GaussianCopulaSynthesizer', 'FastMLPreset']
)
The table below contains a full list of SDV Synthesizers.
GaussianCopulaSynthesizer: This synthesizer uses classical statistical methods to model the data.
CTGANSynthesizer: This synthesizer uses a GAN to model the data.
TVAESynthesizer: This synthesizer uses a variational autoencoder to model the data.
[Experimental!] CopulaGANSynthesizer: This synthesizer combines classical statistical methods and GANs to model the data.
Create an SDV variant
Many of the SDV synthesizers can be tuned by setting different parameters. You can test these parameters by creating a variant of the synthesizer.
create_sdv_synthesizer_variant
Use this method to create a variant of an SDV synthesizer.
from sdgym import create_sdv_synthesizer_variant

GammaCopulaSynthesizer = create_sdv_synthesizer_variant(
    synthesizer_class='GaussianCopulaSynthesizer',
    synthesizer_parameters={'default_distribution': 'gamma'},
    display_name='GammaCopulaSynthesizer'
)
Parameters
(required) synthesizer_class: A string with the name of the synthesizer. This must be one of the predefined synthesizers: 'GaussianCopulaSynthesizer', 'CTGANSynthesizer', 'TVAESynthesizer', 'CopulaGANSynthesizer'.
(required) synthesizer_parameters: A dictionary mapping the name of each parameter to the value that you'd like to set for it. The parameters and values may be different for each synthesizer. For more information, see the SDV API.
(required) display_name: A string that identifies this variant. The display name will appear in the benchmarking results.
Returns: A synthesizer class that you can use directly in the benchmarking script.
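The three required parameters can be prepared and checked before any variant is created. The following is a minimal sketch, not part of SDGym: make_variant_spec, distribution_sweep, and create_variants are hypothetical helpers, and 'beta' is assumed to be a distribution name that GaussianCopulaSynthesizer accepts for default_distribution (only 'gamma' appears in this document). The sdgym import is deferred so that the spec-building part runs without the library installed.

```python
# Names listed in this document as valid values for synthesizer_class.
PREDEFINED_SYNTHESIZERS = (
    'GaussianCopulaSynthesizer',
    'CTGANSynthesizer',
    'TVAESynthesizer',
    'CopulaGANSynthesizer',
)


def make_variant_spec(synthesizer_class, synthesizer_parameters, display_name):
    """Hypothetical helper: validate and bundle the three required arguments."""
    if synthesizer_class not in PREDEFINED_SYNTHESIZERS:
        raise ValueError(f'Unknown synthesizer_class: {synthesizer_class!r}')
    if not isinstance(synthesizer_parameters, dict):
        raise TypeError('synthesizer_parameters must be a dictionary')
    if not isinstance(display_name, str) or not display_name:
        raise ValueError('display_name must be a non-empty string')
    return {
        'synthesizer_class': synthesizer_class,
        'synthesizer_parameters': dict(synthesizer_parameters),
        'display_name': display_name,
    }


def distribution_sweep(distributions):
    """Build one GaussianCopulaSynthesizer spec per default_distribution value."""
    return [
        make_variant_spec(
            'GaussianCopulaSynthesizer',
            {'default_distribution': name},
            f'{name.capitalize()}CopulaSynthesizer',
        )
        for name in distributions
    ]


def create_variants(specs):
    """Turn specs into synthesizer classes. Requires sdgym to be installed."""
    from sdgym import create_sdv_synthesizer_variant  # imported lazily
    return [create_sdv_synthesizer_variant(**spec) for spec in specs]


# 'beta' is an assumed distribution name; check the SDV API before relying on it.
specs = distribution_sweep(['gamma', 'beta'])
print([spec['display_name'] for spec in specs])
```

Calling create_variants(specs) would then return one synthesizer class per spec. Validating the names first means a typo in synthesizer_class surfaces before a benchmark run starts.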
Using your synthesizer variant
To use your synthesizer variant for benchmarking, provide the class using the custom_synthesizers parameter. For example:
import sdgym

sdgym.benchmark_single_table(
    custom_synthesizers=[GammaCopulaSynthesizer]
)
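A variant is often most informative when benchmarked next to the synthesizer it was derived from. The sketch below assumes that benchmark_single_table accepts the synthesizers and custom_synthesizers parameters in the same call; this document only shows each one on its own. comparison_kwargs, run_comparison, and PlaceholderVariant are hypothetical names used for illustration.

```python
def comparison_kwargs(variants, baselines=('GaussianCopulaSynthesizer',)):
    """Hypothetical helper: keyword arguments for one side-by-side benchmark run."""
    return {
        # Predefined synthesizers are passed by string name.
        'synthesizers': list(baselines),
        # Variants are passed as the classes returned by
        # create_sdv_synthesizer_variant.
        'custom_synthesizers': list(variants),
    }


def run_comparison(variants, baselines=('GaussianCopulaSynthesizer',)):
    """Run the benchmark. Requires sdgym to be installed."""
    import sdgym  # imported lazily so the helper above works without it
    return sdgym.benchmark_single_table(**comparison_kwargs(variants, baselines))


class PlaceholderVariant:
    """Stand-in for a class returned by create_sdv_synthesizer_variant."""


print(comparison_kwargs([PlaceholderVariant]))
```

With a real variant, run_comparison([GammaCopulaSynthesizer]) would benchmark it together with the unmodified GaussianCopulaSynthesizer, provided the assumption about combining the two parameters holds.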
Results
Results from your synthesizer variant will be labeled by the provided display_name.
Synthesizer                     Dataset  Dataset_Size_MB  Model_Time  Peak_Memory_KB  Model_Size_MB  Sample_Time  Evaluate_Time  Quality_Score  NewRowSynthesis
Variant:GammaCopulaSynthesizer  alarm    34.5             45.45       100201          0.340          2012.2       1001.2         0.71882        0.99901
See Interpreting Results for more details.
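To pull the rows for one variant out of a larger set of results, the Synthesizer label can be matched against the display name. This is a sketch under stated assumptions: rows_for_variant is a hypothetical helper, the results are assumed to be available as a list of dictionaries (for a pandas DataFrame, to_dict('records') produces that shape), and the label is assumed to follow the 'Variant:<display_name>' pattern seen in the sample row above. The second row is a placeholder without metrics, included only so the filter has something to exclude.

```python
def rows_for_variant(rows, display_name):
    """Hypothetical helper: keep rows whose Synthesizer label names the variant."""
    matches = []
    for row in rows:
        label = str(row.get('Synthesizer', ''))
        # 'Variant:GammaCopulaSynthesizer' -> 'GammaCopulaSynthesizer'
        if label.split(':')[-1] == display_name:
            matches.append(row)
    return matches


rows = [
    # Values copied from the sample results row in this document.
    {
        'Synthesizer': 'Variant:GammaCopulaSynthesizer',
        'Dataset': 'alarm',
        'NewRowSynthesis': 0.99901,
    },
    # Placeholder row, no metrics: only here to show that it is filtered out.
    {'Synthesizer': 'CTGANSynthesizer', 'Dataset': 'alarm'},
]

variant_rows = rows_for_variant(rows, 'GammaCopulaSynthesizer')
print(len(variant_rows), variant_rows[0]['Dataset'])
```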