
Sample Realistic Data

Create realistic synthetic data that follows the same format and mathematical properties as the real data.

sample

Use this function to create synthetic data that mimics the real data:
synthetic_data = synthesizer.sample(
    num_sequences=100,
    sequence_length=None
)
Parameters
  • (required) num_sequences: An integer >0 describing the number of sequences to sample
  • sequence_length: An integer >0 describing the length of each sequence. If you provide None, the synthesizer will determine the lengths algorithmically, and the length may be different for each sequence. Defaults to None.
Returns A pandas DataFrame object containing synthetic data that mimics the real data.
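
One way to check what sample returned is to count the rows in each sequence. The sketch below is a minimal, library-independent illustration. sequence_lengths is a hypothetical helper (not part of the synthesizer's API), sequence_id is an assumed name for the sequence key column, and the rows are hand-written stand-ins rather than real synthesizer output. The helper works on plain row dictionaries, which you could obtain from a DataFrame with synthetic_data.to_dict('records').

```python
from collections import Counter


def sequence_lengths(rows, sequence_key):
    """Return {sequence key value: number of rows} for a list of row dicts."""
    return dict(Counter(row[sequence_key] for row in rows))


# Hand-written stand-in rows (assumption: the sequence key column is "sequence_id")
rows = [
    {"sequence_id": "A", "value": 1.0},
    {"sequence_id": "A", "value": 1.5},
    {"sequence_id": "A", "value": 2.0},
    {"sequence_id": "B", "value": 0.3},
    {"sequence_id": "B", "value": 0.4},
]

lengths = sequence_lengths(rows, "sequence_id")

# Number of distinct sequences; compare this with num_sequences
num_sequences = len(lengths)

# True only when every sequence has the same number of rows
all_same_length = len(set(lengths.values())) == 1
```

If you pass an integer sequence_length, every count in lengths should equal that integer. With sequence_length=None, the counts may differ from one sequence to the next.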

reset_sampling

Use this function to reset any randomization in sampling. After calling it, your synthesizer will generate the same data as it did before. For example, in the code below, synthetic_data1 and synthetic_data2 are the same.
synthesizer.reset_sampling()
synthetic_data1 = synthesizer.sample(num_sequences=1000)
synthesizer.reset_sampling()
synthetic_data2 = synthesizer.sample(num_sequences=1000)
Parameters None
Returns None. Resets the synthesizer.
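
To make the reset behavior concrete, here is a toy sketch built on Python's standard random module. ToySynthesizer is a hypothetical stand-in written for illustration only. It is not the library's implementation, and its seed argument and internal state are assumptions. It only models the behavior described above: sampling consumes randomness, and resetting returns the generator to its starting point.

```python
import random


class ToySynthesizer:
    """Hypothetical stand-in that mimics only the reset behavior."""

    def __init__(self, seed=0):
        self._seed = seed
        self._rng = random.Random(seed)

    def sample(self, num_values):
        # Each call advances the internal random state
        return [self._rng.random() for _ in range(num_values)]

    def reset_sampling(self):
        # Restore the random state to its starting point
        self._rng = random.Random(self._seed)


toy = ToySynthesizer()
first = toy.sample(5)
second = toy.sample(5)   # continues from the advanced state
toy.reset_sampling()
third = toy.sample(5)    # replays the values from the first call
```

In this toy model, first and third are identical while second differs, which mirrors how synthetic_data1 and synthetic_data2 match in the example above.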
Copyright (c) 2023, DataCebo, Inc.