* Performance Estimates
How well will SDV synthesizers be able to model your full data schema? Use this feature to get some estimates with only your metadata.
Simulate the performance of different multi-table synthesizers using your metadata.
This function uses the DayZSynthesizer to create random data. It then runs the random data through the different multi-table synthesizers and evaluation reports to estimate the performance of each.
Parameters:
(required) metadata: A Metadata object
(required) synthesizers: A list of strings representing the multi-table synthesizers that you want to test. Options are: 'HMASynthesizer', 'HSASynthesizer' or 'IndependentSynthesizer'
(required) output_folder: A destination folder where the random data, results, and other artifacts will be saved
default_num_rows: An integer with the number of rows to create by default for all tables
  (default) 1000: Create 1000 rows for every table
num_rows_per_table: A dictionary that maps each table name to the number of rows to create for only that table. Values here override the default number of rows set in the previous parameter.
  (default) None: Do not override the default number of rows for any individual table
timeout: The maximum number of seconds to give each synthesizer to train and sample the dataset
  (default) None: Do not set a maximum. Allow the synthesizer to take as long as it needs.
  <integer>: Allow a synthesizer to run for up to this many seconds for each dataset. If a synthesizer exceeds the limit, the output will include a TimeoutError.
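To make the relationship between default_num_rows and num_rows_per_table concrete, here is a minimal sketch of the override rule described in the parameters. The helper resolve_row_counts and the table names are hypothetical and written only for illustration. They are not part of SDV, and the feature may resolve row counts differently internally.

```python
def resolve_row_counts(table_names, default_num_rows=1000, num_rows_per_table=None):
    """Return the number of rows to create for each table.

    Hypothetical helper (not an SDV function): it only illustrates the
    documented rule that per-table values override the default.
    """
    overrides = num_rows_per_table or {}
    unknown = set(overrides) - set(table_names)
    if unknown:
        raise ValueError(f"Unknown table names: {sorted(unknown)}")
    return {name: overrides.get(name, default_num_rows) for name in table_names}


# Illustrative table names; replace them with the tables in your metadata.
row_counts = resolve_row_counts(
    table_names=['users', 'sessions', 'transactions'],
    default_num_rows=1000,
    num_rows_per_table={'transactions': 50000},
)
print(row_counts)
```

In this sketch, only the 'transactions' table departs from the default. Every table not named in the dictionary keeps the default of 1000 rows.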
Output: A pandas DataFrame with detailed performance results from each synthesizer
Your results include detailed timings for training, sampling, and evaluations.
Your output folder contains the final results in results.csv, the random DayZ data, and the diagnostic reports for each synthesizer.
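Because the final results are also saved to results.csv, you can reload them later without rerunning the simulation. The sketch below uses Python's built-in csv module. The helper name load_results and the folder name are assumptions made for illustration, and no column names are assumed because they are not listed here.

```python
import csv
from pathlib import Path


def load_results(output_folder):
    """Read results.csv from the output folder into a list of dicts.

    Hypothetical helper (not an SDV function). Values are returned as
    strings, exactly as they are stored in the file.
    """
    results_path = Path(output_folder) / 'results.csv'
    with results_path.open(newline='') as f:
        return list(csv.DictReader(f))


# Illustrative folder name; use the output_folder you passed in.
folder = Path('performance_estimates')
if (folder / 'results.csv').exists():
    for row in load_results(folder):
        print(row)
```

If pandas is installed, calling pandas.read_csv on the same path should give you a DataFrame instead of a list of dictionaries.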
*SDV Enterprise Feature. This feature is only available for licensed, enterprise users. For more information, visit our page to .