Sample Realistic Data
Create realistic synthetic data that follows the same format and mathematical properties as the real data.
sample
Use this function to create synthetic data that follows the same format and mathematical properties as the real data.
Parameters
(required) num_rows: An integer >0 that specifies the number of rows to synthesize.
batch_size: An integer >0, describing the number of rows to sample at a time. If you are sampling a large number of rows, setting a smaller batch size allows you to see and save incremental progress. Defaults to the same value as num_rows.
max_tries_per_batch: An integer >0, describing the number of sampling attempts to make per batch. If you have included constraints, it may take multiple batches to create valid data. Defaults to 100.
output_file_path: A string describing a CSV filepath for writing the synthetic data. Set this to None to skip writing to a file. Defaults to None.
Returns: A pandas DataFrame object with synthetic data. The synthetic data mimics the real data.
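A call that sets all four parameters might look like the sketch below. It assumes `synthesizer` is an already-fitted synthesizer object that exposes the `sample` function described above. The helper name `sample_in_batches` and the 1,000-row batch cap are illustrative choices, not part of the library.

```python
def sample_in_batches(synthesizer, num_rows, output_file_path=None):
    """Sample num_rows rows in batches of at most 1,000 rows.

    Hypothetical helper: `synthesizer` is assumed to be fitted already
    and to expose the sample() function documented above.
    """
    return synthesizer.sample(
        num_rows=num_rows,
        # A batch smaller than num_rows lets you see and save incremental progress.
        batch_size=min(num_rows, 1_000),
        # The documented default; raise it if constraints reject many rows.
        max_tries_per_batch=100,
        # None skips writing a CSV file.
        output_file_path=output_file_path,
    )
```

Passing a string such as `'synthetic_data.csv'` (an example filename) as `output_file_path` would additionally write the rows to that CSV file.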
reset_sampling
Use this function to reset any randomization in sampling. After calling this, your synthesizer will generate the same data as it did before. For example, if you call reset_sampling before each of two identical sample calls, the results synthetic_data1 and synthetic_data2 are the same.
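The reset behavior can be sketched as follows, assuming `synthesizer` is an already-fitted synthesizer that exposes `sample` and `reset_sampling` as described on this page. The helper name `sample_twice_with_reset` is illustrative and not part of the library.

```python
def sample_twice_with_reset(synthesizer, num_rows=10):
    """Hypothetical helper showing that reset_sampling makes sampling repeatable."""
    # Reset, then sample: this is the reference output.
    synthesizer.reset_sampling()
    synthetic_data1 = synthesizer.sample(num_rows=num_rows)

    # Resetting again rewinds the randomization, so this call repeats the first.
    synthesizer.reset_sampling()
    synthetic_data2 = synthesizer.sample(num_rows=num_rows)

    return synthetic_data1, synthetic_data2
```

Since `sample` returns pandas DataFrames, `synthetic_data1.equals(synthetic_data2)` is one way to confirm that the two results match.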
Parameters: None
Returns: None. Resets the synthesizer.
If you have saved your synthesizer, it will reset sampling automatically for you. The next time you load and sample, you will receive the same synthetic data.