❖ ForeignToForeignKey

❖ SDV Enterprise Bundle. This feature is available as part of the CAG Bundle, an optional add-on to SDV Enterprise. For more information, please visit the CAG Bundle page.

Use the ForeignToForeignKey constraint when you have foreign keys in multiple tables but no primary key to attach them to. This may happen if your tables come from different domains and are linked together by the same concept.
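To make the situation concrete, here is a minimal sketch in plain Python (it makes no SDV calls). The row values and the helper name `is_unique` are invented for illustration; the table and column names follow the code sample on this page.

```python
# Toy rows for two tables; the values are invented for illustration.
products = [
    {'Product ID': 'P1', 'Warehouse ID': 'W-01'},
    {'Product ID': 'P2', 'Warehouse ID': 'W-01'},
    {'Product ID': 'P3', 'Warehouse ID': 'W-02'},
]
shipments = [
    {'Shipment ID': 'S1', 'Warehouse ID': 'W-02'},
    {'Shipment ID': 'S2', 'Warehouse ID': 'W-02'},
    {'Shipment ID': 'S3', 'Warehouse ID': 'W-03'},
]


def is_unique(rows, column):
    """Return True if every value in the column appears exactly once."""
    values = [row[column] for row in rows]
    return len(values) == len(set(values))


# Warehouse ID repeats in both tables, so neither column can act as a primary key.
products_unique = is_unique(products, 'Warehouse ID')
shipments_unique = is_unique(shipments, 'Warehouse ID')

# Both columns still encode the same concept: some warehouses appear in both tables.
shared = {row['Warehouse ID'] for row in products} & {row['Warehouse ID'] for row in shipments}
print(products_unique, shipments_unique, sorted(shared))
```

Because there is no Warehouses table to hold a primary key, the two columns can only be linked to each other, which is the case this constraint is meant for.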

Constraint API

Create a ForeignToForeignKey constraint.

Parameters:

(required) columns: A list of dictionaries representing all the foreign key columns that encode the same concept. Each dictionary should have:
- A 'table_name' key mapping to the string name of the table, and
- A 'foreign_key' key mapping to the string name of the column. (If you have a composite key, provide a tuple of multiple strings.)
foreign_key_generation: A string that describes whether the synthetic data for the foreign keys should contain brand-new values or reuse the ones that exist in your database.
- (default) 'new': Create new values in the synthetic data. These new values will be consistent everywhere in the database. In our example above, the synthetic data would have brand-new Warehouse IDs representing new, synthetic warehouses. These warehouses would be consistent between the Products and Suppliers tables.
- 'reuse': Reuse the values from the real data. In our example above, the synthetic data would have the same set of Warehouse IDs as the real data, within both the Products and Suppliers tables. These would represent the same warehouses.
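The difference between the two options can be sketched with a toy example in plain Python. This is only an illustration of the described behavior, not how SDV generates keys: the helper names `toy_generate_keys` and `key_set`, the `NEW-` prefix, and the row values are all hypothetical, and a real synthesizer also generates new rows rather than relabeling existing ones.

```python
def toy_generate_keys(real_tables, column, foreign_key_generation='new'):
    """Hypothetical helper: copy the tables and rewrite the shared key column."""
    if foreign_key_generation not in ('new', 'reuse'):
        raise ValueError("foreign_key_generation must be 'new' or 'reuse'")

    # Collect every distinct key across all tables (sorted so the result is deterministic).
    real_keys = sorted({row[column] for rows in real_tables.values() for row in rows})

    if foreign_key_generation == 'reuse':
        # Keep the key values that already exist in the real data.
        mapping = {key: key for key in real_keys}
    else:
        # Invent new values, using ONE mapping for all tables so they stay consistent.
        mapping = {key: f'NEW-{index:02d}' for index, key in enumerate(real_keys, start=1)}

    return {
        name: [{**row, column: mapping[row[column]]} for row in rows]
        for name, rows in real_tables.items()
    }


def key_set(tables, column):
    """Return the set of key values found in any table."""
    return {row[column] for rows in tables.values() for row in rows}


real_tables = {
    'Products': [
        {'Product ID': 'P1', 'Warehouse ID': 'W-01'},
        {'Product ID': 'P2', 'Warehouse ID': 'W-02'},
    ],
    'Shipments': [
        {'Shipment ID': 'S1', 'Warehouse ID': 'W-02'},
        {'Shipment ID': 'S2', 'Warehouse ID': 'W-03'},
    ],
}

new_tables = toy_generate_keys(real_tables, 'Warehouse ID', 'new')
reuse_tables = toy_generate_keys(real_tables, 'Warehouse ID', 'reuse')

print(sorted(key_set(new_tables, 'Warehouse ID')))
print(sorted(key_set(reuse_tables, 'Warehouse ID')))
```

With 'new', none of the original warehouse IDs survive, but a warehouse referenced by both tables still carries one shared value. With 'reuse', the set of IDs is the same as in the real data.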

from sdv.cag import ForeignToForeignKey

my_constraint = ForeignToForeignKey(
    columns=[{
        'table_name': 'Products',
        'foreign_key': 'Warehouse ID'
    },{
        'table_name': 'Shipments',
        'foreign_key': 'Warehouse ID'
    }],
    foreign_key_generation='new'
)

Make sure that all the tables and columns you provide are listed in your Metadata.
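One way to catch a mismatch early is to compare the columns list against a dictionary form of your metadata before creating the constraint. The sketch below is a hypothetical helper, not part of SDV. It assumes a dictionary with a top-level 'tables' key and a 'columns' dictionary per table, which is the shape we expect from exporting the metadata to a dictionary; please verify that against your SDV version. The metadata contents here are invented.

```python
def find_missing_columns(columns, metadata_dict):
    """Hypothetical helper: list (table_name, column_name) pairs absent from the metadata."""
    missing = []
    tables = metadata_dict.get('tables', {})
    for entry in columns:
        table_name = entry['table_name']
        foreign_key = entry['foreign_key']
        # A composite key is given as a tuple of column names; treat a single name the same way.
        key_columns = foreign_key if isinstance(foreign_key, tuple) else (foreign_key,)
        known_columns = tables.get(table_name, {}).get('columns', {})
        for column_name in key_columns:
            if column_name not in known_columns:
                missing.append((table_name, column_name))
    return missing


# Assumed metadata shape (illustrative values only).
metadata_dict = {
    'tables': {
        'Products': {
            'columns': {
                'Product ID': {'sdtype': 'id'},
                'Warehouse ID': {'sdtype': 'id'},
            }
        },
        'Shipments': {
            'columns': {
                'Shipment ID': {'sdtype': 'id'},
            }
        },
    }
}

columns = [
    {'table_name': 'Products', 'foreign_key': 'Warehouse ID'},
    {'table_name': 'Shipments', 'foreign_key': 'Warehouse ID'},
]

missing = find_missing_columns(columns, metadata_dict)
print(missing)
```

In this made-up metadata the Shipments table does not list Warehouse ID, so the helper reports that pair. An empty list means every table and column in the constraint was found.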

Usage

Apply the constraint to any SDV synthesizer. Then fit and sample as usual.

from sdv.multi_table import HSASynthesizer

synthesizer = HSASynthesizer(metadata)
synthesizer.add_constraints([my_constraint])

synthesizer.fit(data)
synthetic_data = synthesizer.sample()

For more information about using predefined constraints, please see the Constraint-Augmented Generation tutorial.

