❖ Targeted Sampling

The Targeted Sampling bundle lets you create synthetic data tailored to your specific use case. You can use targeted sampling to:

Create minimum viable synthetic test sets for software testing and QA. Define the exact values that you need across multiple tables, and let SDV handle the rest.

De-bias and rebalance your datasets. Adjust the proportions of values to generate in the synthetic data to create a more balanced, fairer representation of the data you'd like to use for ML development.

Generate hypothetical scenarios. Create edge cases and situations that don't exist in the real data but are theoretically possible to observe.
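To make the rebalancing use case concrete, the sketch below works out how many synthetic rows to request for each value of a column so that the values end up evenly represented. It uses only the Python standard library; the helper `balanced_row_counts` and the sample `classification` values are hypothetical illustrations and are not part of SDV.

```python
from collections import Counter


def balanced_row_counts(observed_values, total_rows):
    """Split total_rows as evenly as possible across the distinct values.

    Hypothetical helper for illustration only; it is not an SDV function.
    """
    categories = sorted(set(observed_values))
    if not categories:
        raise ValueError('observed_values must contain at least one value')
    base, remainder = divmod(total_rows, len(categories))
    # The first `remainder` categories get one extra row so that the
    # counts always add up to total_rows.
    return {
        category: base + (1 if index < remainder else 0)
        for index, category in enumerate(categories)
    }


# Hypothetical, imbalanced column taken from the real data.
real_classification = ['CITY'] * 80 + ['RESORT'] * 15 + ['AIRPORT'] * 5

original_counts = Counter(real_classification)
targets = balanced_row_counts(real_classification, total_rows=100)

print(original_counts)
print(targets)
```

Each entry in `targets` could then become one condition, with the count passed as `num_rows` and the value placed in `column_values`.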

Included Features

The Targeted Sampling bundle currently supports multi-table conditional sampling for the HSA and Independent Synthesizers.

from sdv.sampling import Condition, MultiTableCondition

# Step 1: Create Single-Table Conditions
resort_hotels = Condition(
    num_rows=10,
    table_name='hotels',
    column_values={'classification': 'RESORT'})

suite_guests_with_rewards = Condition(
    table_name='guests',
    column_values={'room_type': 'SUITE', 'has_rewards': True})

# Step 2: Compose Multi-Table Conditions    
suites_in_resorts = MultiTableCondition(
    conditions=[resort_hotels, suite_guests_with_rewards])

# Step 3: Sample Synthetic Data
# Pass the multi-table condition composed in Step 2
synthetic_data = synthesizer.sample_from_conditions([suites_in_resorts])
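After sampling, it can be useful to confirm that the rows in a table actually satisfy the requested column values. The sketch below is one way to do that in plain Python. This page does not state the return type of `sample_from_conditions`, so the example assumes each sampled table has already been converted to a list of dictionaries (for instance with pandas `DataFrame.to_dict('records')`). The helper `rows_violating` and the sample rows are hypothetical and are not part of SDV.

```python
def rows_violating(rows, column_values):
    """Return the rows that fail to match at least one column/value pair.

    Hypothetical helper for illustration only; it is not an SDV function.
    """
    return [
        row for row in rows
        if any(row.get(column) != expected
               for column, expected in column_values.items())
    ]


# Hypothetical sampled rows for the 'guests' table, as plain dictionaries.
sampled_guests = [
    {'guest_id': 1, 'room_type': 'SUITE', 'has_rewards': True},
    {'guest_id': 2, 'room_type': 'SUITE', 'has_rewards': True},
    {'guest_id': 3, 'room_type': 'BASIC', 'has_rewards': True},
]

# Same column values as the suite_guests_with_rewards condition.
expected_values = {'room_type': 'SUITE', 'has_rewards': True}

violations = rows_violating(sampled_guests, expected_values)
print(violations)
```

An empty list means every row matches the condition; any rows returned are the ones to investigate.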

Installation

Use your SDV Enterprise credentials to install SDV Enterprise and all bundles that you have access to.

% pip install sdv-installer --upgrade
% sdv-installer install --upgrade
Username: <email>
License Key: ********************************

Installing SDV Enterprise:
sdv-enterprise (version 0.30.0) - Installed!

Installing Bundles:
bundle-cag - Installed!
bundle-xsynthesizers - Installed!

Success! All packages have been installed. You are ready to use SDV Enterprise.
