# Public SDV Datasets

The SDGym library includes a variety of public demo datasets that you can use for benchmarking. These come from the overall SDV ecosystem.

These datasets are stored in a publicly readable S3 bucket created by DataCebo. For more information, see the [Dataset Format](https://docs.sdv.dev/sdgym/customization/datasets/dataset-format) guide.

## Using the demo datasets in SDGym

SDGym is configured to use the demo datasets by default.

### Exploring Datasets

A `DatasetExplorer` created with default settings reads the SDV demo datasets. For more information, see the [Explore Datasets](https://docs.sdv.dev/sdgym/customization/datasets/explore-datasets) guide.

```python
from sdgym import DatasetExplorer

explorer = DatasetExplorer()
summary = explorer.summarize_datasets(modality='single_table')
```

### Benchmarking

The benchmark functions run on the recommended demo datasets by default. To benchmark a different set, pass the dataset names to the `sdv_datasets` parameter. For more information, see the guide for [Running a Benchmark (AWS)](https://docs.sdv.dev/sdgym/benchmarking/run/aws).

```python
import sdgym

results = sdgym.benchmark_single_table_aws(
    sdv_datasets=['adult', 'alarm', 'census', 'child', 'expedia_hotel_logs'],
    output_destination='s3://my_results_bucket/'
)
```
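Before starting a long benchmark run, it can help to check the arguments locally. The sketch below is one possible way to do that with the standard library only. The `build_benchmark_kwargs` helper is hypothetical and not part of SDGym, and the check that `output_destination` must be an `s3://` URI is an assumption based on the example above.

```python
from urllib.parse import urlparse


def build_benchmark_kwargs(dataset_names, output_destination):
    """Hypothetical helper (not part of SDGym): validate benchmark arguments."""
    # Drop duplicate dataset names while keeping their original order.
    unique_names = list(dict.fromkeys(dataset_names))
    if not unique_names:
        raise ValueError('Provide at least one dataset name.')

    # Assumption: the AWS benchmark expects an s3:// URI that names a bucket.
    parsed = urlparse(output_destination)
    if parsed.scheme != 's3' or not parsed.netloc:
        raise ValueError(f'Expected an s3:// URI, got {output_destination!r}')

    return {
        'sdv_datasets': unique_names,
        'output_destination': output_destination,
    }


kwargs = build_benchmark_kwargs(
    ['adult', 'alarm', 'adult', 'census'],
    's3://my_results_bucket/',
)
# With SDGym installed and AWS access configured, the call would be:
# results = sdgym.benchmark_single_table_aws(**kwargs)
```

The helper only checks the shape of the arguments. It does not confirm that the dataset names exist in the demo bucket or that the destination bucket is writable.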
