# KeyUniqueness

This metric measures whether the keys in a particular dataset are unique. We expect that certain types of keys, such as primary keys, are always unique in order to be valid.

## Data Compatibility

* **Primary Key**: This metric validates that the primary key values are unique. *There may be multiple columns in the primary key, as in the case of a composite key.*

## Score

**(best) 1.0**: All of the key values in the synthetic data are unique

**(worst) 0.0**: None of the key values in the synthetic data are unique

## How does it work?

This metric counts how many key values in the synthetic data, *s*, are duplicates, meaning that another value in the same data is exactly the same. Call the set of duplicate values *Ds*. The score is the proportion of values that are *not* duplicates.

$$
score = 1 - \frac{|D_s|}{|s|}
$$

If the primary key is composite, meaning that multiple columns together make up the primary key, then the metric looks at the overall combinations of column values when determining duplicates.
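
The calculation above can be sketched in plain pandas. This is a minimal illustration under stated assumptions, not the library's implementation: the helper name `key_uniqueness_sketch` and the example tables are hypothetical, and the sketch assumes that every row sharing its key with another row counts as a duplicate, following the definition given above. The library may count repeated values differently.

```python
import pandas as pd


def key_uniqueness_sketch(synthetic_keys: pd.DataFrame) -> float:
    """Return 1 - |D_s| / |s| for the key column(s) in `synthetic_keys`.

    Hypothetical helper for illustration only. Assumption: every row whose
    key appears more than once is counted as a duplicate (keep=False).
    """
    if len(synthetic_keys) == 0:
        raise ValueError('synthetic_keys must contain at least one row')
    # duplicated() compares whole rows, so a composite key is handled by
    # passing all of its columns together.
    is_duplicate = synthetic_keys.duplicated(keep=False)
    return float(1 - is_duplicate.sum() / len(synthetic_keys))


# Single-column key: the value 3 appears twice, so 2 of 4 rows are duplicates.
single = pd.DataFrame({'user_id': [1, 2, 3, 3]})

# Composite key: only the full combination (11, 1) repeats.
composite = pd.DataFrame({
    'order_id': [10, 10, 11, 11],
    'line_no': [1, 2, 1, 1],
})

single_score = key_uniqueness_sketch(single)        # 0.5
composite_score = key_uniqueness_sketch(composite)  # 0.5
```

In the composite example, `order_id` repeats on its own and so does `line_no`, but only the repeated combination of the two columns lowers the score.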

## Usage

{% hint style="success" %}
**Recommended Usage:** The [Diagnostic Report](/sdmetrics/data-metrics/diagnostic/diagnostic-report.md) applies this metric to applicable keys (primary and alternate keys).
{% endhint %}

To manually run this metric, access the `single_column` module and use the `compute` method.

```python
from sdmetrics.single_column import KeyUniqueness

KeyUniqueness.compute(
    real_data=real_table[['primary_key_name']],
    synthetic_data=synthetic_table[['primary_key_name']]
)
```

**Parameters**

* (required) `real_data`: A pandas.DataFrame object with the column of real data. *For a composite key, provide multiple columns in the pandas.DataFrame object.*
* (required) `synthetic_data`: A pandas.DataFrame object with the column of synthetic data. *For a composite key, provide multiple columns in the pandas.DataFrame object.*

## FAQ

<details>

<summary>Should the score always be 1?</summary>

If you are running this metric on a primary key, then the score should always be 1. Primary keys are expected to be unique.

If you are running this metric on a foreign key, then the score may not be 1, as foreign keys are allowed to repeat. For foreign keys, we recommend using the [ReferentialIntegrity](/sdmetrics/data-metrics/diagnostic/referentialintegrity.md) metric instead.

</details>

<details>

<summary>Does this metric use the real data?</summary>

This metric checks to see if the real data also has unique values and alerts you if this is not the case. However, the final score is only based on the synthetic data.
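
If you would like to inspect the real data yourself before running the metric, a check along the following lines may help. This is only a sketch in plain pandas: the helper `find_duplicate_keys` and the example table are hypothetical and are not part of SDMetrics.

```python
import pandas as pd


def find_duplicate_keys(table: pd.DataFrame, key_columns: list) -> pd.DataFrame:
    """Return every row of `table` whose key is shared with another row.

    Hypothetical helper for illustration only; not an SDMetrics function.
    """
    # keep=False flags all rows in a repeated group, not just the later ones.
    mask = table.duplicated(subset=key_columns, keep=False)
    return table.loc[mask]


# Example real table in which the key 'b' appears twice.
real_table = pd.DataFrame({
    'primary_key_name': ['a', 'b', 'b', 'c'],
    'amount': [10, 20, 30, 40],
})

repeated = find_duplicate_keys(real_table, ['primary_key_name'])
```

An empty result means the key column (or combination of columns) is unique in the real data.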

</details>

