Creating Metadata
If you don't already have a metadata object, we recommend auto-detecting it based on your data.
Use the detect_from_dataframes function to automatically detect metadata from data that you've loaded as pandas.DataFrame objects.
Parameters:
(required) data
: Your data, represented as a dictionary. The keys are your table names and values are the pandas.DataFrame objects containing your data.
infer_sdtypes
: A boolean describing whether to infer the sdtypes of each column
(default) True
: Infer the sdtypes of each column based on the data.
False
: Do not infer the sdtypes. All columns will be marked as unknown, ready for you to manually update.
infer_keys
: A string describing whether to infer the primary and/or foreign keys.
(default) 'primary_and_foreign'
: Infer the primary keys in each table, and the foreign keys in other tables that refer to them
'primary_only'
: Infer the primary keys in each table. You can manually add the foreign key relationships later.
None
: Do not infer any primary or foreign keys. You can manually add these later.
foreign_key_inference_algorithm
: The algorithm to use when inferring the foreign key connections to primary keys
(default) 'column_name_match'
: Match up foreign and primary key columns that have the same names
(default, SDV Enterprise) 'data_match'
: Match up foreign and primary key columns based on the data that they contain
Output: A Metadata object that describes the data
from sdv.metadata import Metadata

metadata = Metadata.detect_from_dataframes(
    data={
        'hotels': hotels_dataframe,
        'guests': guests_dataframe
    })
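To make the 'column_name_match' option more concrete, here is a minimal sketch of the general idea of matching key columns by name. It is a simplified illustration in plain Python, not SDV's actual implementation; the helper match_keys_by_name and the column names are hypothetical.

```python
# Simplified illustration of name-based key matching; NOT SDV's implementation.
def match_keys_by_name(columns_by_table, primary_keys):
    """Return (parent_table, child_table, column) tuples for same-named columns."""
    relationships = []
    for parent_table, primary_key in primary_keys.items():
        for child_table, columns in columns_by_table.items():
            # A column in another table that shares the primary key's name
            # is treated as a candidate foreign key.
            if child_table != parent_table and primary_key in columns:
                relationships.append((parent_table, child_table, primary_key))
    return relationships


# Hypothetical column names for the two example tables.
columns_by_table = {
    'hotels': ['hotel_id', 'city', 'rating'],
    'guests': ['guest_email', 'hotel_id', 'checkin_date'],
}
primary_keys = {'hotels': 'hotel_id', 'guests': 'guest_email'}

relationships = match_keys_by_name(columns_by_table, primary_keys)
print(relationships)  # [('hotels', 'guests', 'hotel_id')]
```

Name matching of this kind only works when the foreign key column uses exactly the same name as the primary key it refers to, which is one reason to inspect the detected relationships.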
The detected metadata is not guaranteed to be accurate or complete. Be sure to carefully inspect the metadata and update any information that is incorrect or missing.
For more information about inspecting and updating your metadata, see the Metadata API reference.
metadata.update_column(
    column_name='age',
    sdtype='numerical',
    table_name='users'
)
metadata.validate()
You can save the metadata object as a JSON file and load it again for future use.
Use the save_to_json method to save the metadata object to a new JSON file that will be compatible with SDV 1.0 and beyond. We recommend writing the metadata to a new file every time you update it.
Parameters
(required) filepath
: The location of the file that will be created with the JSON metadata
mode
: A string describing the mode to use when creating the JSON file
(default) 'write'
: Write the metadata to the file, raising an error if the file already exists
'overwrite'
: Write the metadata to the file, replacing the contents if the file already exists
Output (None)
metadata.save_to_json(filepath='my_metadata_v1.json')
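The difference between the 'write' and 'overwrite' modes can be sketched with the Python standard library. This is an illustration of the described behavior only: save_json is a hypothetical helper, the payload is a placeholder rather than real SDV metadata, and the exact error that SDV raises may differ from the FileExistsError used here.

```python
import json
import tempfile
from pathlib import Path


def save_json(payload, filepath, mode='write'):
    """Hypothetical helper mirroring the 'write' / 'overwrite' modes."""
    if mode not in ('write', 'overwrite'):
        raise ValueError(f"Unknown mode: {mode!r}")
    # 'x' fails if the file already exists; 'w' replaces its contents.
    open_flag = 'x' if mode == 'write' else 'w'
    with open(filepath, open_flag, encoding='utf-8') as handle:
        json.dump(payload, handle, indent=4)


workdir = Path(tempfile.mkdtemp())
target = workdir / 'my_metadata_v1.json'

save_json({'tables': {}}, target)  # first write succeeds
try:
    save_json({'tables': {}}, target)  # default mode refuses to replace the file
    second_write_failed = False
except FileExistsError:
    second_write_failed = True

# Explicitly opting in to 'overwrite' replaces the existing contents.
save_json({'tables': {'hotels': {}}}, target, mode='overwrite')
reloaded = json.loads(target.read_text(encoding='utf-8'))
```

Defaulting to the stricter mode means an existing file is never lost by accident; replacing it has to be requested explicitly.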
Use the load_from_json method to load a saved JSON file as a Metadata object.
Parameters
(required) filepath
: The name of the file containing the JSON metadata
Output: A Metadata object.
metadata = Metadata.load_from_json(filepath='my_metadata_v1.json')
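One way to follow the recommendation of writing a new file after each update is to pick the next unused versioned filename before saving. The sketch below uses only the standard library; next_versioned_path and the '_v<N>' naming scheme are assumptions made for illustration, not part of SDV.

```python
import tempfile
from pathlib import Path


def next_versioned_path(directory, stem='my_metadata', suffix='.json'):
    """Return the first '<stem>_v<N><suffix>' path that does not exist yet."""
    version = 1
    while True:
        candidate = Path(directory) / f'{stem}_v{version}{suffix}'
        if not candidate.exists():
            return candidate
        version += 1


# Simulate a directory that already contains the first saved version.
workdir = Path(tempfile.mkdtemp())
(workdir / 'my_metadata_v1.json').write_text('{}', encoding='utf-8')

new_path = next_versioned_path(workdir)
print(new_path.name)  # my_metadata_v2.json
```

The resulting path could then be passed as the filepath when saving, for example metadata.save_to_json(filepath=str(new_path)).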