❖ MSSQL
If your data is available in a Microsoft SQL database, you can directly connect to it in order to extract the data and metadata. Later, you can use the same connections to write the data back to a database.
This functionality is in Beta! Beta functionality may have bugs and may change in the future. Help us out by testing this functionality and letting us know if you encounter any issues.
To use this feature, make sure you have installed the bundle with the optional db-mssql dependency. For more information, see the SDV Enterprise Installation guide.
Microsoft SQL also requires the ODBC Driver to be installed on your machine. For more information, see the instructions from Microsoft for Windows, macOS, and Linux.
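As a quick pre-flight check, the sketch below lists the ODBC drivers that Python can see and keeps the ones that look like SQL Server drivers. It assumes the third-party `pyodbc` package, which this page does not mention, so the connector may rely on a different binding. The helper names `find_sql_server_drivers` and `installed_odbc_drivers` are made up for illustration.

```python
def find_sql_server_drivers(driver_names):
    """Return the driver names that look like SQL Server ODBC drivers."""
    return [name for name in driver_names if "sql server" in name.lower()]


def installed_odbc_drivers():
    """List ODBC driver names via pyodbc, or return None if pyodbc is missing."""
    try:
        import pyodbc  # assumption: third-party package, not named in this guide
    except ImportError:
        return None
    return list(pyodbc.drivers())


if __name__ == "__main__":
    drivers = installed_odbc_drivers()
    if drivers is None:
        print("pyodbc is not installed; cannot list ODBC drivers.")
    else:
        matches = find_sql_server_drivers(drivers)
        print(matches or "No SQL Server ODBC driver found.")
```

An empty result suggests the driver still needs to be installed by following the Microsoft instructions for your operating system.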
Use this connector to create a connection to your database.
Parameters (None)
Output A connector object that you can use to import data and metadata
Import your real data from a database to use with SDV.
Use this function to authenticate into the project and database you'd like to import from.
Parameters
(required) auth: A dictionary with your authentication credentials (for details, see Authentication)
schema: A string with the name of your schema. Please add this if your database tables are in a named schema.
  (default) You can omit this parameter if you have not set up a schema name. In this case, the connector will use whatever schema name your database assigns as a default. For MSSQL, this is 'dbo', the default database owner. For more info, see the Microsoft docs.
Output (None)
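The optional schema parameter can be handled as in the sketch below, which passes it along only when a schema is named. The method name `set_import_config` is an assumption based on my understanding of the SDV Enterprise API, since the function name does not appear in this text. `configure_import` and `RecordingConnector` are hypothetical helpers, and the auth dictionary uses placeholder keys because the real keys are not listed here.

```python
def configure_import(connector, auth, schema=None):
    """Pass credentials to the connector, adding `schema` only when one is named.

    Assumption: the connector exposes `set_import_config(auth=..., schema=...)`.
    """
    kwargs = {"auth": auth}
    if schema is not None:
        # Omitting schema falls back to the database default ('dbo' for MSSQL).
        kwargs["schema"] = schema
    connector.set_import_config(**kwargs)
    return kwargs


class RecordingConnector:
    """Hypothetical stand-in that records the arguments it receives."""

    def __init__(self):
        self.calls = []

    def set_import_config(self, **kwargs):
        self.calls.append(kwargs)


# Placeholder credentials: these are NOT the real key names.
placeholder_auth = {"placeholder_key": "placeholder_value"}

stand_in = RecordingConnector()
configure_import(stand_in, placeholder_auth)                  # default schema
configure_import(stand_in, placeholder_auth, schema="sales")  # named schema
```

With a real connector object in place of `RecordingConnector`, the same helper would forward the credentials unchanged.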
Use this function to create metadata based on the connection to your database.
Parameters
table_names: A list of strings representing the table names that you want to create metadata for
  (default) None: Create metadata for all the tables in the database
infer_sdtypes: A boolean describing whether to infer the sdtypes of each column
  (default) True: Infer the sdtypes based on the data
  False: Do not infer the sdtypes. All columns will be marked as unknown, ready for you to manually update
infer_keys: A string describing whether to infer the primary and/or foreign keys. Options are:
  (default) 'primary_and_foreign': Infer the primary keys in each table, and the foreign keys in other tables that refer to them
  'primary_only': Infer only the primary keys of each table
  None: Do not infer any keys
Output A Metadata object representing your metadata
The detected metadata is not guaranteed to be accurate or complete. Be sure to inspect the metadata carefully and correct anything that is wrong or missing. For more information, see the Metadata API.
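The metadata options above can be checked before they reach the connector, as in this minimal sketch. The accepted values come from the parameter list, while the helper name `build_metadata_options` is hypothetical. The commented call to `create_metadata` is an assumed method name that this text does not confirm.

```python
VALID_INFER_KEYS = ("primary_and_foreign", "primary_only", None)


def build_metadata_options(table_names=None, infer_sdtypes=True,
                           infer_keys="primary_and_foreign"):
    """Validate the documented options and return them as keyword arguments."""
    if table_names is not None:
        table_names = list(table_names)
        if not all(isinstance(name, str) for name in table_names):
            raise TypeError("table_names must be a list of strings or None")
    if not isinstance(infer_sdtypes, bool):
        raise TypeError("infer_sdtypes must be True or False")
    if infer_keys not in VALID_INFER_KEYS:
        raise ValueError(f"infer_keys must be one of {VALID_INFER_KEYS!r}")
    return {
        "table_names": table_names,      # None means every table
        "infer_sdtypes": infer_sdtypes,  # False marks all columns as unknown
        "infer_keys": infer_keys,        # None skips key inference
    }


options = build_metadata_options(table_names=["users", "orders"],
                                 infer_keys="primary_only")
# metadata = connector.create_metadata(**options)  # assumed method name
```

Catching a misspelled option locally gives a clearer error than a failure partway through a database scan.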
Use this function to import a random subset of the data described by your metadata. The size of the subset is determined automatically.
Parameters
(required) metadata: A Metadata object that describes the data you want to import
fixed_seed: A boolean that controls the determinism of the random subset
  (default) False: Different random data will be imported every time you call the function
  True: The same data will be imported every time you import
verbose: A boolean describing whether to print out details about the progress of the import
  (default) True: Print out the table names and number of rows being imported
  False: Do not print any details
Output A dictionary that maps each table name of your database (string) to the data, represented as a pandas DataFrame.
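Because the output is a plain dictionary keyed by table name, it is easy to inspect once the import finishes. The sketch below counts rows per table with `len`, which returns the row count for a pandas DataFrame. The helper name `summarize_tables` is hypothetical, and lists of rows stand in for DataFrames so the example runs without a database.

```python
def summarize_tables(data):
    """Map each table name to its number of rows."""
    return {name: len(table) for name, table in data.items()}


# Stand-in for the imported dictionary; real values would be DataFrames.
example_data = {
    "users": [{"user_id": 1}, {"user_id": 2}],
    "orders": [{"order_id": 10, "user_id": 1}],
}

row_counts = summarize_tables(example_data)
print(row_counts)
```

A summary like this is a convenient check that every table you expected was imported and that none came back empty.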
Use this function to import a subset of your data, optimized specifically for a given table. You can also control the size.
Parameters
(required) metadata: A Metadata object that describes the data you want to import
(required) main_table_name: A string containing the name of the most important table of your database. This table will generally represent the entity that is most critical to your application or business. It must be one of the tables listed in your metadata object.
num_rows: The number of rows to sample from the main table. The size of every other table is automatically determined by its connection to the main table.
fixed_seed: A boolean that controls the determinism of the random subset
  (default) False: Different random data will be imported every time you call the function
  True: The same data will be imported every time you import
verbose: A boolean describing whether to print out details about the progress of the import
  (default) True: Print out the table names and number of rows being imported
  False: Do not print any details
Output A dictionary that maps each table name of your database (string) to the data, represented as a pandas DataFrame.
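To build intuition for how the size of other tables can follow from the main table, here is a toy illustration using plain Python lists. It is not the connector's algorithm, which this page does not describe. It only shows the general idea of sampling a main table and keeping the rows that refer to it. The table names, column names, and the function `sample_with_children` are all hypothetical.

```python
import random


def sample_with_children(parents, children, parent_key, foreign_key,
                         num_rows, seed=None):
    """Sample `num_rows` parent rows and keep only the child rows that refer to them."""
    rng = random.Random(seed)  # a fixed seed makes the sample repeatable
    count = min(num_rows, len(parents))
    sampled_parents = rng.sample(parents, count)
    kept_ids = {row[parent_key] for row in sampled_parents}
    sampled_children = [row for row in children if row[foreign_key] in kept_ids]
    return sampled_parents, sampled_children


# Ten users, each with exactly three orders.
users = [{"user_id": i} for i in range(10)]
orders = [{"order_id": i, "user_id": i % 10} for i in range(30)]

sampled_users, sampled_orders = sample_with_children(
    users, orders, "user_id", "user_id", num_rows=3, seed=0
)
print(len(sampled_users), len(sampled_orders))
```

In this toy data the child table shrinks in proportion to the main table, and every kept order still points at a kept user, so referential integrity is preserved.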
After importing the data and metadata, you are now ready to create an SDV synthesizer.
Export synthetic data into a new database.
We recommend using the same connector as your import. This connector object already knows about the specifics of your database schema. It will ensure that the exported data schema has the same format.
Use this function to specify which project and database you'd like to export data to. Also provide your authentication credentials.
Parameters
(required) auth: A dictionary with your authentication credentials (for details, see Authentication)
schema: A string with the name of your schema. Please add this if your database tables are in a named schema.
  (default) You can omit this parameter if you have not set up a schema name. In this case, the connector will use whatever schema name your database assigns as a default. For MSSQL, this is 'dbo', the default database owner. For more info, see the Microsoft docs.
Output (None)
Use this function to export your synthetic data into a database.
Parameters
(required) synthetic_data: A dictionary that maps each table name to the synthetic data, represented as a pandas DataFrame
mode: The mode of writing to use during the export
  (default) 'write': Write a new database from scratch. If the database or data already exists, the function raises an error.
  'append': Append rows to existing tables in the database
  'overwrite': Remove any existing tables in the database and replace them with this data
verbose: A boolean describing whether to print out details about the export
  (default) True: Print the details
  False: Do not print anything
Output (None) Your data will be written to the database and ready for use by your downstream application!
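The three export modes differ only in how they treat data that already exists. The sketch below models that difference for a single table held in memory. It is an illustration of the documented semantics, not the connector's implementation. The function name `export_table` is hypothetical, and treating a missing table as empty in 'append' mode is an assumption, since this page does not say what happens in that case.

```python
def export_table(existing_rows, new_rows, mode="write"):
    """Return the table contents after exporting `new_rows` with the given mode.

    `existing_rows` is None when the table does not exist yet.
    """
    if mode not in ("write", "append", "overwrite"):
        raise ValueError(f"Unknown mode: {mode!r}")
    if mode == "write":
        if existing_rows is not None:
            raise RuntimeError("Table already exists; 'write' expects a fresh target")
        return list(new_rows)
    if mode == "append":
        # Assumption: a missing table is treated as empty.
        return list(existing_rows or []) + list(new_rows)
    return list(new_rows)  # 'overwrite' discards what was there before


print(export_table(None, ["a", "b"], mode="write"))
print(export_table(["x"], ["a", "b"], mode="append"))
print(export_table(["x"], ["a", "b"], mode="overwrite"))
```

The default 'write' mode is the safest choice for a first export because it refuses to touch data that is already present.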
Provide an auth dictionary with your credentials. See below for the required keys and values.
❖ SDV Enterprise bundle. This feature is available for purchase as an SDV Enterprise bundle. For more information, visit our page to .