Creating Metadata

This guide will walk you through creating the metadata using the Python API.

Auto Detect Metadata

Once you have loaded your data into Python, you can auto-detect the metadata from your actual data.

detect_from_dataframe

Use this function to automatically detect metadata from your data that you've loaded as a pandas.DataFrame object.

Parameters:

  • (required) data: Your pandas DataFrame object that contains the data

  • table_name: A string describing the name of your table. SDV will use the table name when referring to your table in the metadata, as well as in any warnings or descriptive error messages.

    • (default) By default, we'll name your data table 'table'

Output: A Metadata object that describes the data

from sdv.metadata import Metadata

# my_dataframe is the pandas.DataFrame you have already loaded
metadata = Metadata.detect_from_dataframe(
    data=my_dataframe,
    table_name='hotel_guests')

Updating Metadata

The detected metadata is not guaranteed to be accurate or complete. Be sure to carefully inspect the metadata and update it so it accurately represents your data.
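One way to inspect the detected metadata is to group columns by sdtype so that misdetected columns stand out. The sketch below is illustrative and not part of SDV: columns_by_sdtype is a hypothetical helper, the column values are made up, and the dictionary layout (a 'tables' key that maps each table name to its 'columns') is an assumption about what metadata.to_dict() returns, so check it against your own output.

```python
from collections import defaultdict


def columns_by_sdtype(metadata_dict, table_name):
    """Group a table's column names by their sdtype (hypothetical helper)."""
    columns = metadata_dict['tables'][table_name]['columns']
    grouped = defaultdict(list)
    for column_name, properties in columns.items():
        grouped[properties.get('sdtype', 'unknown')].append(column_name)
    return dict(grouped)


# Hand-written example that mimics the assumed layout of metadata.to_dict().
# The room_rate column and all sdtypes here are made up for illustration.
example = {
    'tables': {
        'hotel_guests': {
            'columns': {
                'start_date': {'sdtype': 'categorical'},
                'user_cell': {'sdtype': 'categorical'},
                'room_rate': {'sdtype': 'numerical'},
            }
        }
    }
}

summary = columns_by_sdtype(example, 'hotel_guests')
print(summary)
```

In this made-up example, a date column and a phone number column both grouped under 'categorical' would be a signal that the metadata needs updating.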

For more information about inspecting and updating your metadata, see the Metadata API reference.

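# Mark start_date as a datetime column and state its format
metadata.update_column(
    column_name='start_date',
    sdtype='datetime',
    datetime_format='%Y-%m-%d')

# Mark user_cell as a phone number that contains sensitive (PII) values
metadata.update_column(
    column_name='user_cell',
    sdtype='phone_number',
    pii=True)

# Check that the updated metadata is still valid
metadata.validate()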
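
Before setting a datetime_format, it can help to confirm that the format string really parses the values in the column. The sketch below uses only Python's standard library; matches_format is a hypothetical helper and the sample values are made up.

```python
from datetime import datetime


def matches_format(values, datetime_format):
    """Return True if every value parses with the given strptime format."""
    for value in values:
        try:
            datetime.strptime(value, datetime_format)
        except ValueError:
            return False
    return True


sample_dates = ['2020-12-27', '2021-01-03']  # made-up values

# The ISO-style format parses both values; the US-style format does not
ok_iso = matches_format(sample_dates, '%Y-%m-%d')
ok_us = matches_format(sample_dates, '%m/%d/%Y')
print(ok_iso, ok_us)
```

With a real table, you could pass a handful of values from the column as strings, for example a small sample of my_dataframe['start_date'].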

Saving, Loading & Sharing Metadata

You can save the metadata object as a JSON file and load it again for future use.

save_to_json

Use this to save the metadata object to a new JSON file that will be compatible with SDV 1.0 and beyond. We recommend you write the metadata to a new file every time you update it.

Parameters:

  • (required) filepath: The location of the file that will be created with the JSON metadata

Output: None

metadata.save_to_json(filepath='my_metadata_v1.json')
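
Since writing to a new file after every update is recommended, one option is to pick the next unused version number automatically. This is a minimal sketch that assumes the my_metadata_v<N>.json naming used above; next_versioned_path is a hypothetical helper, not an SDV function.

```python
from pathlib import Path
import tempfile


def next_versioned_path(directory, stem='my_metadata'):
    """Return the first '<stem>_v<N>.json' path in directory that does not exist yet."""
    version = 1
    while True:
        candidate = Path(directory) / f'{stem}_v{version}.json'
        if not candidate.exists():
            return candidate
        version += 1


# Demonstration in a temporary directory
with tempfile.TemporaryDirectory() as tmp:
    first = next_versioned_path(tmp)
    first.write_text('{}')  # stand-in for a saved metadata file
    second = next_versioned_path(tmp)
    first_name, second_name = first.name, second.name

# With a real Metadata object you could then call:
# metadata.save_to_json(filepath=str(next_versioned_path('.')))
```

The helper only chooses a file name; the actual writing is still done by save_to_json.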

load_from_json

Use this method to load your file as a Metadata object.

Parameters:

  • (required) filepath: The name of the file containing the JSON metadata

Output: A Metadata object.
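
A call to this method might look like the sketch below. It assumes load_from_json is called on the Metadata class with the same filepath keyword as save_to_json; load_metadata is a hypothetical wrapper that adds an explicit existence check before handing the path to SDV.

```python
from pathlib import Path


def load_metadata(filepath):
    """Load a Metadata object from a JSON file, failing early if it is missing."""
    path = Path(filepath)
    if not path.is_file():
        raise FileNotFoundError(f'No metadata file found at {path}')
    # Imported here so the existence check works even without SDV installed
    from sdv.metadata import Metadata
    return Metadata.load_from_json(filepath=str(path))


# Example usage (assumes the file was written earlier with save_to_json):
# metadata = load_metadata('my_metadata_v1.json')
```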

Copyright (c) 2023, DataCebo, Inc.