Evaluation

As the final step of your synthetic data project, you can evaluate the synthetic data against the real data. Using the SDV, you can diagnose problems in the synthetic data, measure its quality, and visualize it alongside the real data. The snippet below shows each of these three steps.

from sdv.evaluation.single_table import run_diagnostic, evaluate_quality
from sdv.evaluation.single_table import get_column_plot

# 1. perform basic validity checks
diagnostic = run_diagnostic(real_data, synthetic_data, metadata)

# 2. measure the statistical similarity
quality_report = evaluate_quality(real_data, synthetic_data, metadata)

# 3. plot the data
fig = get_column_plot(
    real_data=real_data,
    synthetic_data=synthetic_data,
    metadata=metadata,
    column_name='amenities_fee'
)

fig.show()
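To give a feel for what a basic validity check measures, here is a minimal pure-Python sketch of the idea behind boundary adherence: the fraction of synthetic values that fall inside the range seen in the real data. SDMetrics provides a metric named BoundaryAdherence, but the function below is an illustrative stand-in written for this page, not the library's implementation. The function name, the choice to skip `None` values, and the sample numbers are all assumptions.

```python
def boundary_adherence(real_values, synthetic_values):
    """Return the fraction of synthetic values within the real [min, max] range.

    Hypothetical helper for illustration; missing values (None) are skipped.
    """
    real = [v for v in real_values if v is not None]
    synthetic = [v for v in synthetic_values if v is not None]
    if not real or not synthetic:
        raise ValueError("both columns need at least one non-missing value")

    low, high = min(real), max(real)
    in_range = sum(1 for v in synthetic if low <= v <= high)
    return in_range / len(synthetic)


# Made-up values for illustration only.
real_fees = [10.0, 12.5, 20.0, 35.0]
synthetic_fees = [11.0, 19.0, 36.0, 9.0]

score = boundary_adherence(real_fees, synthetic_fees)
print(score)
```

A score of 1.0 would mean every synthetic value respects the real data's boundaries. Lower scores point to out-of-range values worth investigating.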

Need more evaluation options?

See the SDMetrics library.

This library includes many more metrics (some experimental) that you can apply based on your goals. All you need is your real data, synthetic data and metadata to get started.
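As an example of the kind of metric involved, SDMetrics includes KSComplement, which scores a numerical column as 1 minus the two-sample Kolmogorov-Smirnov statistic. The sketch below reimplements that idea in plain Python so the calculation is visible. It is an illustration under simplifying assumptions (no missing values, a hypothetical function name, made-up inputs), not the library's code.

```python
from bisect import bisect_right


def ks_complement(real_values, synthetic_values):
    """Return 1 minus the largest gap between the two empirical CDFs.

    Hypothetical helper for illustration; assumes no missing values.
    """
    real = sorted(real_values)
    synthetic = sorted(synthetic_values)
    if not real or not synthetic:
        raise ValueError("both columns need at least one value")

    max_gap = 0.0
    # The empirical CDFs only change at observed values, so it is enough
    # to compare them at every distinct value from either column.
    for x in set(real) | set(synthetic):
        cdf_real = bisect_right(real, x) / len(real)
        cdf_synthetic = bisect_right(synthetic, x) / len(synthetic)
        max_gap = max(max_gap, abs(cdf_real - cdf_synthetic))

    return 1.0 - max_gap


# Made-up values for illustration only.
similarity = ks_complement([1, 2, 3, 4], [1, 2, 3, 10])
print(similarity)
```

Scores close to 1.0 indicate that the real and synthetic columns have similar distributions. A score of 0.0 means the two value ranges do not overlap at all.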

Copyright (c) 2023, DataCebo, Inc.