# CSV

### CSVHandler <a href="#csvhandler" id="csvhandler"></a>

Use this object to create a handler for reading and writing local CSV files.

```python
from sdv.io.local import CSVHandler

connector = CSVHandler()
```

**Parameters** (None)

**Output** A CSVHandler object you can use to read and write CSV files

### read

Use this function to read multiple CSV files from your local machine

```python
data = connector.read(
    folder_name='project/data/',
    file_names=['users.csv', 'transactions.csv', 'sessions.csv'],
    read_csv_parameters={
        'parse_dates': False,
        'encoding':'latin-1'
    }
)
```

**Parameters**

* (required) `folder_name`: A string name of the folder that contains your CSV files
* `file_names`: A list of strings with the exact file names to read
  * (default) `None`: Read all the CSV files that are in the specified folder
  * `<list>`: Read only the CSV files named in the list
* `keep_leading_zeros`: A boolean that describes whether any values with leading zeros should be kept, or whether they can be safely removed and converted to ints/floats.
  * (default) `True`: Keep any leading zeros. This is especially helpful when the data is not truly numerical, for example `"02446"` is a postal code in Boston. It is not valid to read this as a number `2,446`.
  * `False`: Do not keep any leading zeros. Select this for faster read times if you know your data contains numbers.
* `read_csv_parameters`: A dictionary with additional parameters to use when reading the CSVs. The keys are any of the parameter names of the [pandas.read\_csv](https://pandas.pydata.org/docs/reference/api/pandas.read_csv.html) function and the values are your inputs.
  * (default) `{ 'parse_dates': False, 'low_memory': False, 'on_bad_lines': 'warn'}`: Do not infer any datetime formats, do not use low-memory mode, and warn (rather than error) when a line cannot be parsed. (Use all the other defaults of the `read_csv` function.)

**Output** A dictionary that contains all the CSV data found in the folder. The key is the name of the file (without the `.csv` suffix) and the value is a [pandas DataFrame](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html) containing the data.
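The `keep_leading_zeros` tradeoff can be illustrated with plain pandas (a sketch of the underlying behavior, not SDV's actual implementation; the column names here are made up):

```python
import io

import pandas as pd

csv_text = "user_id,postal_code\n1,02446\n2,10001\n"

# Default numeric inference silently drops the leading zero: "02446" -> 2446
as_numbers = pd.read_csv(io.StringIO(csv_text))

# Forcing a string dtype preserves it, mirroring keep_leading_zeros=True
as_strings = pd.read_csv(io.StringIO(csv_text), dtype={'postal_code': str})

assert as_numbers['postal_code'][0] == 2446
assert as_strings['postal_code'][0] == '02446'
```

The string form is the safe default for identifier-like columns such as postal codes, where `2446` and `02446` are not the same value.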

### write

Use this function to write synthetic data as multiple CSV files

```python
connector.write(
    synthetic_data,
    folder_name='project/synthetic_data',
    to_csv_parameters={
        'encoding': 'latin-1',
        'index': False
    },
    file_name_suffix='_v1',
    mode='x'
)
```

**Parameters**

* (required) `synthetic_data`: Your data, represented as a dictionary. The key is the name of each table and the value is a [pandas DataFrame](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html) containing the data.
* (required) `folder_name`: A string name of the folder where you would like to write the synthetic data
* `to_csv_parameters`: A dictionary with additional parameters to use when writing the CSVs. The keys are any of the parameter names of the [pandas.to\_csv](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.to_csv.html) function and the values are your inputs.
  * (default) `{ 'index': False }`: Do not write the index column to the CSV. (Use all the other defaults of the `to_csv` function.)
* `file_name_suffix`: The suffix to add to each filename. Use this to add specific version numbers or other info.
  * (default) `None`: Do not add a suffix. The file name will be the same as the table name with a `.csv` extension
  * `<string>`: Append the suffix after the table name. E.g. a suffix `'_synth1'` will write a file as `table_synth1.csv`
* `mode`: A string signaling which mode of writing to use
  * (default) `'x'`: Write to new files, raising an error if a file with the same name already exists
  * `'w'`: Write to new files, overwriting any existing files with the same name
  * `'a'`: Append the new CSV rows to any existing files

**Output** (None) The data will be written as CSV files
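The three write modes follow standard file-open semantics. A minimal sketch with plain pandas (not SDV's implementation; the file name is made up):

```python
import os
import tempfile

import pandas as pd

df = pd.DataFrame({'a': [1, 2]})

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, 'table_demo.csv')

    df.to_csv(path, index=False, mode='x')  # 'x': errors if the file already exists
    df.to_csv(path, index=False, mode='w')  # 'w': overwrites the file written above
    df.to_csv(path, index=False, mode='a', header=False)  # 'a': appends rows, no header row

    total_rows = len(pd.read_csv(path))  # 2 overwritten rows + 2 appended rows
```

The default `'x'` is the safest choice: it guarantees an existing synthetic data file is never silently overwritten.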
