# CardinalityShapeSimilarity

If you have multiple connected tables, this metric measures how closely the cardinality of each parent table matches between the real and synthetic datasets. Cardinality is defined as the number of child rows for each parent row.

## Data Compatibility

* **Primary and Foreign Keys**: This metric is meant to be used on primary and foreign keys. Primary key IDs must be unique while foreign key IDs can repeat.

## Score

**(best) 1.0**: The cardinality values are the same in the real and synthetic data

**(worst) 0.0**: The cardinality values are as different as can be

The example below shows a distribution of cardinality values for real and synthetic data (black and green, respectively). The CardinalityShapeSimilarity score is 0.85, indicating that the cardinalities are mostly similar with some key differences.

![This graph shows the distribution of the cardinality for the real and synthetic data. In the real data, a vast majority of rows have a cardinality of 1. In the synthetic data, the cardinality is more evenly distributed in the \[0,3\] range.](/files/aPuXogf8qODRZmo8uBMT)

## How does it work?

In a multi table setup, there is a parent table and a child table. The parent contains a primary key that uniquely identifies every row, while the child contains a foreign key that refers to a parent row. The foreign keys may repeat, as multiple children can reference the same parent.

![The parent table contains primary keys while the child table has foreign keys that refer to them. Each parent row has a different number of children based on the references. For example, User\_00 has 1 child row, User\_01 has 2, User\_02 has 0 and so on.](/files/EJ7B8f4vIvMLijv2YlNq)

This metric computes the cardinality \[1] of each parent row. That is, it counts the number of children that each parent row has, so that each parent row is associated with an integer ≥ 0.
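The counting step can be sketched in plain Python. This is an illustration only, not the library's implementation: the `cardinalities` helper and the sample IDs are hypothetical, and plain lists stand in for the primary key and foreign key columns.

```python
from collections import Counter


def cardinalities(primary_keys, foreign_keys):
    """Return the number of child rows for each parent, in parent order."""
    child_counts = Counter(foreign_keys)
    # A Counter returns 0 for missing keys, so parents without children get 0.
    return [child_counts[pk] for pk in primary_keys]


# Hypothetical keys: User_00 has 1 child, User_01 has 2, User_02 has none.
parent_ids = ["User_00", "User_01", "User_02"]
child_refs = ["User_00", "User_01", "User_01"]

result = cardinalities(parent_ids, child_refs)
print(result)  # [1, 2, 0]
```

Note that parents with no children are kept with a count of 0 rather than dropped, so they still contribute to the distribution.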

This yields a numerical distribution for both the real and synthetic data. The CardinalityShapeSimilarity metric computes and returns the [KSComplement](/sdmetrics/data-metrics/quality/kscomplement.md) score of these distributions.
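As a rough sketch of the comparison step, the code below assumes that the KSComplement score is one minus the two-sample Kolmogorov-Smirnov statistic (the largest gap between the two empirical cumulative distributions). The `ks_complement` helper and the sample cardinalities are hypothetical and are not part of SDMetrics.

```python
from bisect import bisect_right


def ks_complement(real_values, synthetic_values):
    """Return 1 minus the largest gap between the two empirical CDFs."""
    real_sorted = sorted(real_values)
    synth_sorted = sorted(synthetic_values)
    max_gap = 0.0
    # The empirical CDFs only change at observed values, so check each one.
    for x in set(real_sorted) | set(synth_sorted):
        real_cdf = bisect_right(real_sorted, x) / len(real_sorted)
        synth_cdf = bisect_right(synth_sorted, x) / len(synth_sorted)
        max_gap = max(max_gap, abs(real_cdf - synth_cdf))
    return 1.0 - max_gap


# Hypothetical cardinalities: one integer per parent row.
real_cardinalities = [1, 1, 1, 2]
synthetic_cardinalities = [0, 1, 2, 3]

score = ks_complement(real_cardinalities, synthetic_cardinalities)
print(score)  # 0.75
```

Under this assumption, identical distributions score 1.0 and distributions with no overlap score 0.0, which matches the score range described above.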

## Usage

Access this metric from the `multi_table` module and use the `compute_breakdown` method.

```python
from sdmetrics.multi_table import CardinalityShapeSimilarity

# Both dictionaries must use the same table names as keys
CardinalityShapeSimilarity.compute_breakdown(
    real_data={
      'users': real_user_table,
      'sessions': real_sessions_table,
      'transactions': real_transactions_table
    },
    synthetic_data={
      'users': synthetic_user_table,
      'sessions': synthetic_sessions_table,
      'transactions': synthetic_transactions_table
    },
    metadata=multi_table_metadata_dict
)
```

```
{
    ('users', 'sessions'): 0.78891,
    ('sessions', 'transactions'): 0.588211
}
```

**Parameters**

* (required) `real_data`: A dictionary mapping table names to pandas.DataFrame objects that contain the real data
* (required) `synthetic_data`: A dictionary mapping the same table names to pandas.DataFrame objects that contain the synthetic data
* (required) `metadata`: A metadata dictionary describing the relationships between the tables (see [Multi Table Metadata](/sdmetrics/getting-started/metadata/multi-table-metadata.md))

**Returns** A dictionary that maps each relationship to its `CardinalityShapeSimilarity` score.
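One way to work with the returned dictionary is sketched below. It assumes the breakdown has the shape shown in the example output above, with `(parent, child)` tuples as keys and plain float scores as values; the variable names are illustrative.

```python
# Breakdown copied from the example output above.
breakdown = {
    ('users', 'sessions'): 0.78891,
    ('sessions', 'transactions'): 0.588211,
}

# Print one line per parent-child relationship.
for (parent, child), score in breakdown.items():
    print(f'{parent} -> {child}: {score:.3f}')

# Find the relationship whose cardinalities differ the most.
worst_relationship = min(breakdown, key=breakdown.get)

# Summarize all relationships with a simple average.
average_score = sum(breakdown.values()) / len(breakdown)
```

The lowest-scoring relationship is usually the first place to look when the synthetic data does not preserve the number of children per parent.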

## References

\[1] <https://en.wikipedia.org/wiki/Cardinality_(data_modeling)>

