Loading Data

Load your data into Python before using it for SDV modeling. SDV supports many data formats for import and export.

Don't have any data yet? The SDV library contains many different demo datasets that you can use to get started. To learn more, see the SDV Demo Data page.

Local Data

If your data is already available as local files (on your own machine), load them into SDV using the functions below.

Connect to a database (AI Connectors)

SDV Enterprise Bundle. This feature is available as part of the AI Connectors Bundle, an optional add-on to SDV Enterprise. For more information, please visit the AI Connectors Bundle page.

If your data is stored in a database, use our AI Connectors feature to import some of that data directly into SDV. Later, you can use the same connector to export synthetic data into a new database.

Do you have data in other formats?

The SDV uses the pandas library for data manipulation and synthesis. If your data is in any other format, load it as a pandas.DataFrame object to use it in the SDV. For multi-table data, format your data as a dictionary that maps each table name to its own DataFrame object.

multi_table_data = {
    'table_name_1': <pandas.DataFrame>,
    'table_name_2': <pandas.DataFrame>,
    ...
}
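
As a concrete illustration of the dictionary format above, here is a minimal sketch that builds two small tables by hand. The table names (users, sessions) and their columns are hypothetical and chosen only for this example; your own data would take their place.

```python
import pandas as pd

# Hypothetical example tables; names and columns are illustrative only.
users = pd.DataFrame({
    'user_id': [1, 2, 3],
    'country': ['US', 'FR', 'JP'],
})
sessions = pd.DataFrame({
    'session_id': [10, 11, 12, 13],
    'user_id': [1, 1, 2, 3],
    'duration_minutes': [5.0, 12.5, 3.2, 8.1],
})

# Map each table name to its own DataFrame object.
multi_table_data = {
    'users': users,
    'sessions': sessions,
}

# Quick sanity check: list each table with its (rows, columns) shape.
for table_name, table in multi_table_data.items():
    print(table_name, table.shape)
```

Keeping every table in one dictionary lets you check names and shapes in a single loop before modeling.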

pandas offers many methods for loading different types of data, such as a SQL table or a JSON string.

import pandas as pd

data_table_1 = pd.read_json('file://localhost/path/to/table_1.json')
data_table_2 = pd.read_json('file://localhost/path/to/table_2.json')
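
The SQL table and JSON string cases can be sketched in the same way. The example below is self-contained and makes several assumptions for illustration: an in-memory SQLite database stands in for a real SQL source, and the table names (guests, hotels), columns, and values are hypothetical. For a real database you would pass your own connection object instead.

```python
import io
import sqlite3

import pandas as pd

# In-memory SQLite database standing in for a real SQL source (illustrative).
connection = sqlite3.connect(':memory:')
connection.execute('CREATE TABLE guests (guest_id INTEGER, has_rewards INTEGER)')
connection.executemany(
    'INSERT INTO guests VALUES (?, ?)',
    [(1, 1), (2, 0), (3, 1)],
)
connection.commit()

# Load the SQL table into a DataFrame, then release the connection.
guests = pd.read_sql_query('SELECT * FROM guests', connection)
connection.close()

# Load a JSON string; wrapping it in StringIO lets pandas read it like a file.
json_string = '[{"hotel_id": "H1", "city": "Boston"}, {"hotel_id": "H2", "city": "Austin"}]'
hotels = pd.read_json(io.StringIO(json_string), orient='records')

# Combine both tables into the dictionary format used for multi-table data.
multi_table_data = {'guests': guests, 'hotels': hotels}
```

Whatever the source, each table ends up as a pandas.DataFrame, so tables loaded by different methods can share one dictionary.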

For more options, see the pandas reference.
