# Visualization Utilities

Use the utilities below to visualize the comparison between real and synthetic data. You can access these from the `sdmetrics.visualization` module.

{% hint style="success" %}
**Tip!** All visualizations are interactive. If you're using an IPython notebook, you can zoom, pan, toggle legends, and take screenshots.
{% endhint %}

### Compare a synthetic column & real column (1D)

**get\_column\_plot**

Use this utility to visualize a real column against the same synthetic column. You can plot any column of type `boolean`, `categorical`, `datetime` or `numerical`.

* (required) `real_data`: A [pandas.DataFrame](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html) containing the table of your real data. *To skip plotting the real data, input `None`.*
* (required) `synthetic_data`: A [pandas.DataFrame](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html) containing the synthetic data. *To skip plotting the synthetic data, input `None`.*
* (required) `column_name`: The name of the column you want to plot.
* `plot_type`: The type of plot to create
  * (default) `None`: Determine the type of plot to create based on the data.
  * `'distplot'`: Plot the data as a smooth, continuous distribution. Use this for continuous columns.
  * `'bar'`: Plot the data as discrete bars. Use this for discrete columns.

Returns: A [plotly.Figure](https://plotly.com/python-api-reference/generated/plotly.graph_objects.Figure.html) object

```python
from sdmetrics.visualization import get_column_plot

fig = get_column_plot(
    real_data=real_table,
    synthetic_data=synthetic_table,
    column_name='high_perc',
    plot_type='distplot'
)

fig.show()
```

<figure><img src="https://2284413265-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FrNLha4DaPNwVJ930KhmB%2Fuploads%2FSqv1M4pqnxbfGdlj1aWp%2FScreen%20Shot%202022-07-20%20at%206.46.46%20PM.png?alt=media&#x26;token=09ebc093-9926-43c5-a402-3af691b952e6" alt=""><figcaption></figcaption></figure>

### Compare a pair of synthetic columns & real columns (2D)

**get\_column\_pair\_plot**

Use this utility to visualize the trends between a pair of columns for real and synthetic data. You can plot any 2 columns of type `boolean`, `categorical`, `datetime` or `numerical`. The columns do not have to be the same type.

* (required) `real_data`: A [pandas.DataFrame](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html) containing the table of your real data. *To skip plotting the real data, input `None`.*
* (required) `synthetic_data`: A [pandas.DataFrame](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html) containing the synthetic data. *To skip plotting the synthetic data, input `None`.*
* (required) `column_names`: A list containing the names of the 2 columns you want to plot.
* `plot_type`: The type of plot to create
  * (default) `None`: Determine the type of plot to create based on the data.
  * `'scatter'`: Plot each data point in 2D space as a scatter plot. Use this to compare a pair of continuous columns.
  * `'box'`: Plot the data as one or more box plots. Use this to compare a continuous column with a discrete column.
  * `'violin'`: Create a violin plot to show the distribution of a continuous column broken down by a discrete column. This is an alternative to using `'box'`.
  * `'heatmap'`: Plot a side-by-side heatmap of the data's categories. Use this to compare a pair of discrete columns.

Returns: A [plotly.Figure](https://plotly.com/python-api-reference/generated/plotly.graph_objects.Figure.html) object

```python
from sdmetrics.visualization import get_column_pair_plot

fig = get_column_pair_plot(
    real_data=real_table,
    synthetic_data=synthetic_table,
    column_names=['mba_perc', 'degree_perc'],
    plot_type='scatter'
)

fig.show()
```

Various types of plots are possible, depending on the types of data you provide:

<figure><img src="https://2284413265-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FrNLha4DaPNwVJ930KhmB%2Fuploads%2FuK0YRiAE7BnIUFISWNFu%2FQuality%20Report_%20Plotting%20Column%20Pairs.png?alt=media&#x26;token=1f96139c-c08a-4cf0-b6f3-39fb1477b061" alt=""><figcaption></figcaption></figure>
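The guidance in the `plot_type` list above can be summarized as a small lookup. The sketch below is illustrative only: `suggest_pair_plot_type` is a hypothetical helper, not part of SDMetrics. It assumes that `numerical` and `datetime` columns count as continuous, while `boolean` and `categorical` columns count as discrete. The library's own default (`plot_type=None`) may choose differently.

```python
# Hypothetical helper (not part of SDMetrics): suggests a plot_type for a
# pair of columns, following the guidance in the plot_type list.

# Assumption: numerical and datetime columns are continuous;
# boolean and categorical columns are discrete.
CONTINUOUS_TYPES = {'numerical', 'datetime'}
DISCRETE_TYPES = {'boolean', 'categorical'}


def suggest_pair_plot_type(first_type, second_type):
    """Return 'scatter', 'box' or 'heatmap' for two column types."""
    for column_type in (first_type, second_type):
        if column_type not in CONTINUOUS_TYPES | DISCRETE_TYPES:
            raise ValueError(f'Unsupported column type: {column_type!r}')

    continuous_count = sum(
        column_type in CONTINUOUS_TYPES
        for column_type in (first_type, second_type)
    )
    if continuous_count == 2:
        return 'scatter'   # continuous vs. continuous
    if continuous_count == 1:
        return 'box'       # continuous vs. discrete ('violin' also works)
    return 'heatmap'       # discrete vs. discrete


print(suggest_pair_plot_type('numerical', 'numerical'))
print(suggest_pair_plot_type('numerical', 'categorical'))
print(suggest_pair_plot_type('boolean', 'categorical'))
```

You could pass the returned string as the `plot_type` argument, or swap `'box'` for `'violin'` when you prefer to see the full shape of each distribution.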

### Visualize the cardinality of a relationship

**get\_cardinality\_plot**

Use this utility to visualize the cardinality of a parent-child relationship. The cardinality is the number of children that each parent row has. Your cardinality may be fixed (e.g. every parent has exactly 2 children) or variable (e.g. every parent has 1-3 children).

* (required) `real_data`: A dictionary mapping the name of each table to a [pandas.DataFrame](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html) containing the real data for that table. *To skip plotting the real data, input `None`.*
* (required) `synthetic_data`: A dictionary mapping the name of each table to a [pandas.DataFrame](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html) containing the synthetic data for that table. *To skip plotting the synthetic data, input `None`.*
* (required) `parent_table_name`: The string name of the parent table in the relationship
* (required) `child_table_name`: The string name of the child table in the relationship
* (required) `parent_primary_key`: The string name of the parent table's primary key
* (required) `child_foreign_key`: The string name of the column in the child table that refers to the parent's primary key
* `plot_type`: The type of plot to create
  * (default) `None`: Determine the type of plot to create based on the data.
  * `'distplot'`: Plot the data as a smooth, continuous distribution
  * `'bar'`: Plot the data as discrete bars

Returns: A [plotly.Figure](https://plotly.com/python-api-reference/generated/plotly.graph_objects.Figure.html) object

```python
from sdmetrics.visualization import get_cardinality_plot

fig = get_cardinality_plot(
    real_data=real_tables,
    synthetic_data=synthetic_tables,
    parent_table_name='users',
    child_table_name='sessions',
    parent_primary_key='user_id',
    child_foreign_key='user_id',
    plot_type='bar'
)

fig.show()
```

<figure><img src="https://2284413265-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FrNLha4DaPNwVJ930KhmB%2Fuploads%2F6MFXl102NjOJWOacUmsz%2FVisualization%20Cardinality.png?alt=media&#x26;token=1eb63947-7036-4c7f-87c0-690017f2cc42" alt=""><figcaption></figcaption></figure>
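To make the definition of cardinality concrete, the sketch below counts the children of each parent row with pandas. The toy `users` and `sessions` tables and their values are made up for illustration, and this is not necessarily how SDMetrics computes cardinality internally. The example also shows the dictionary format expected for `real_data` and `synthetic_data`.

```python
import pandas as pd

# Toy multi-table data (made-up values), in the dictionary format
# used by get_cardinality_plot: table name -> pandas.DataFrame.
real_tables = {
    'users': pd.DataFrame({'user_id': [1, 2, 3]}),
    'sessions': pd.DataFrame({
        'session_id': [10, 11, 12, 13],
        'user_id': [1, 1, 1, 3],
    }),
}

# Count how many child rows refer to each parent key.
child_counts = real_tables['sessions']['user_id'].value_counts()

# Align the counts to the parent table so that parents without
# any children get a cardinality of 0.
cardinality = (
    real_tables['users']['user_id']
    .map(child_counts)
    .fillna(0)
    .astype(int)
)

print(cardinality.tolist())  # [3, 0, 1]
```

Here user 1 has 3 sessions, user 2 has none, and user 3 has 1, so the cardinality is variable. A cardinality plot compares the distribution of these counts between the real and synthetic tables.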
