Loading Data

Load your data into Python to use it for SDV modeling. SDV supports many data formats for import and export.

Don't have any data yet? The SDV library contains many different demo datasets that you can use to get started. To learn more, see the SDV Demo Data page.

Local Data

If your data is already available as local files (on your own machine), load them into SDV using the functions below.

Load multiple CSV files into Python.
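As a rough sketch of what loading a folder of CSV files involves, the following uses only pandas and the standard library rather than SDV's own loading function. The helper name `load_csv_folder` is hypothetical and introduced here purely for illustration; it assumes every `.csv` file in the folder is one table.

```python
from pathlib import Path

import pandas as pd


def load_csv_folder(folder):
    """Read every .csv file in `folder` into a dict of DataFrames.

    Hypothetical helper, not part of SDV: each table is keyed by its
    file name without the .csv extension.
    """
    tables = {}
    for csv_path in sorted(Path(folder).glob('*.csv')):
        tables[csv_path.stem] = pd.read_csv(csv_path)
    return tables
```

Calling `load_csv_folder('path/to/folder')` would return a dictionary that maps each file's base name to a `pandas.DataFrame`.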

Load an entire Excel spreadsheet into Python.

Connect to a database

If your data is stored in a database, you can create a connection to import some of it for use with SDV. Later, you can use the same functionality to export synthetic data.

Connect to a BigQuery database and import a subset of data.

Coming soon! More database connectors are on the way.

Do you have data in other formats?

The SDV uses the pandas library for data manipulation and synthesis. If your data is in any other format, load it as a pandas.DataFrame object to use it in the SDV. For multi-table data, format your data as a dictionary that maps each table name to its own DataFrame object.

multi_table_data = {
    'table_name_1': <pandas.DataFrame>,
    'table_name_2': <pandas.DataFrame>,
    ...
}
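To make the dictionary format concrete, here is a minimal sketch with two small tables. The table names, column names, and values are made up for illustration and do not come from any SDV dataset.

```python
import pandas as pd

# Hypothetical example tables; replace these with your own data.
users = pd.DataFrame({'user_id': [1, 2], 'name': ['Ana', 'Ben']})
sessions = pd.DataFrame({'session_id': [10, 11, 12], 'user_id': [1, 1, 2]})

# Map each table name to its DataFrame.
multi_table_data = {
    'users': users,
    'sessions': sessions,
}

# Sanity check: every value in the dictionary should be a pandas.DataFrame.
all_frames = all(
    isinstance(table, pd.DataFrame) for table in multi_table_data.values()
)
```

A quick check like `all_frames` can catch a table that was accidentally left as a file path or a list before the data is passed on.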

Pandas offers many methods for loading different types of data, such as a SQL table or a JSON string. For example, to read two JSON files:

import pandas as pd

data_table_1 = pd.read_json('file://localhost/path/to/table_1.json')
data_table_2 = pd.read_json('file://localhost/path/to/table_2.json')
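The SQL case can be sketched in a similar way. The example below assumes a SQLite database, using Python's built-in `sqlite3` module with an in-memory database so that it runs without any setup. The `users` table and its contents are hypothetical.

```python
import sqlite3

import pandas as pd

# Create an in-memory SQLite database with a hypothetical 'users' table.
connection = sqlite3.connect(':memory:')
connection.execute('CREATE TABLE users (user_id INTEGER, name TEXT)')
connection.executemany(
    'INSERT INTO users VALUES (?, ?)',
    [(1, 'Ana'), (2, 'Ben')],
)
connection.commit()

# Read the result of a SQL query into a pandas.DataFrame.
users = pd.read_sql_query('SELECT * FROM users', connection)
connection.close()
```

With a real database, you would replace the in-memory connection with a connection to your own database file or server; the resulting DataFrame can be used like any other table.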

For more options, see the pandas reference.


Copyright (c) 2023, DataCebo, Inc.