Synthetic Data Vault
GitHubSlackDataCebo
  • Welcome to the SDV!
  • Tutorials
  • Explore SDV
    • SDV Community
    • SDV Enterprise
      • ⭐Compare Features
    • SDV Bundles
      • ❖ AI Connectors
      • ❖ CAG
      • ❖ Differential Privacy
      • ❖ XSynthesizers
  • Single Table Data
    • Data Preparation
      • Loading Data
      • Creating Metadata
    • Modeling
      • Synthesizers
        • GaussianCopulaSynthesizer
        • CTGANSynthesizer
        • TVAESynthesizer
        • ❖ XGCSynthesizer
        • ❖ SegmentSynthesizer
        • * DayZSynthesizer
        • ❖ DPGCSynthesizer
        • ❖ DPGCFlexSynthesizer
        • CopulaGANSynthesizer
      • Customizations
        • Constraints
        • Preprocessing
    • Sampling
      • Sample Realistic Data
      • Conditional Sampling
    • Evaluation
      • Diagnostic
      • Data Quality
      • Visualization
  • Multi Table Data
    • Data Preparation
      • Loading Data
        • Demo Data
        • CSV
        • Excel
        • ❖ AlloyDB
        • ❖ BigQuery
        • ❖ MSSQL
        • ❖ Oracle
        • ❖ Spanner
      • Cleaning Your Data
      • Creating Metadata
    • Modeling
      • Synthesizers
        • * DayZSynthesizer
        • * IndependentSynthesizer
        • HMASynthesizer
        • * HSASynthesizer
      • Customizations
        • Constraints
        • Preprocessing
      • * Performance Estimates
    • Sampling
    • Evaluation
      • Diagnostic
      • Data Quality
      • Visualization
  • Sequential Data
    • Data Preparation
      • Loading Data
      • Cleaning Your Data
      • Creating Metadata
    • Modeling
      • PARSynthesizer
      • Customizations
    • Sampling
      • Sample Realistic Data
      • Conditional Sampling
    • Evaluation
  • Concepts
    • Metadata
      • Sdtypes
      • Metadata API
      • Metadata JSON
    • Constraints
      • Predefined Constraints
        • Positive
        • Negative
        • ScalarInequality
        • ScalarRange
        • FixedIncrements
        • FixedCombinations
        • ❖ FixedNullCombinations
        • ❖ MixedScales
        • OneHotEncoding
        • Inequality
        • Range
        • * ChainedInequality
      • Custom Logic
        • Example: IfTrueThenZero
      • ❖ Constraint Augmented Generation (CAG)
        • ❖ CarryOverColumns
        • ❖ CompositeKey
        • ❖ ForeignToForeignKey
        • ❖ ForeignToPrimaryKeySubset
        • ❖ PrimaryToPrimaryKey
        • ❖ PrimaryToPrimaryKeySubset
        • ❖ SelfReferentialHierarchy
        • ❖ ReferenceTable
        • ❖ UniqueBridgeTable
  • Support
    • Troubleshooting
      • Help with Installation
      • Help with SDV
    • Versioning & Backwards Compatibility Policy
Powered by GitBook

Copyright (c) 2023, DataCebo, Inc.

On this page
  • Installation
  • ExcelHandler
  • read
  • write
  1. Multi Table Data
  2. Data Preparation
  3. Loading Data

Excel

This functionality is in Beta! Beta functionality may have bugs and may change in the future. Help us out by testing this functionality and letting us know if you encounter any issues.

Installation

To work with Excel files in SDV, please install SDV with the optional dependencies using:

pip install 'sdv[excel]'

ExcelHandler

Use this object to create a handler for reading and writing an local Excel spreadsheet

from sdv.io.local import ExcelHandler

connector = ExcelHandler()

Parameters (None)

Output An ExcelHandler object you can use to read and write an Excel spreadsheet

read

Use this function to read an Excel spreadsheet from your local machine

data = connector.read(
    filepath='project/data.xlsx',
    sheet_names=['guests', 'hotels']
)

Parameters

  • (required) filepath: A string name of the Excel spreadsheet, which must end in .xlsx

  • sheet_names : A list of sheet names to read from the overall spreadsheet. Each sheet is a tab in your spreadsheet file, and represents a different table.

    • (default) None: Read all the sheets of the overall spreadsheet

    • <list>: Read only the sheets listed from the overall spreadsheet

write

Use this function to write synthetic data into an Excel spreadsheet

connector.write(
  synthetic_data,
  filepath='project/synthetic_data.xlsx',
  mode='x'
)

Parameters

  • (required) filepath: A string name of the file where you would like to write the synthetic data

  • sheet_name_suffix: The suffix to add to each sheet name. Use this if to add specific version numbers or other info.

    • (default) None: Do not add a suffix. Each sheet will be named with table name

    • <string>: Append the suffix after each table name. Eg. a suffix '_synth1' will write a sheet as table_synth1

  • mode: A string signaling which mode of writing to use

    • (default) 'w': Write sheets to a new Excel file, clearing any existing file that may exist

    • 'a': Write the data as new sheets within an existing Excel file

Output (None) The data will be written as an Excel spreadsheet

PreviousCSVNext❖ AlloyDB

Last updated 9 months ago

Output A dictionary that contains all the data found the Excel file. The key is the name of the sheet and the value is a containing the data.

(required) synthetic_data: You data, represented as a dictionary. The key is the name of each table and the value is a containing the data.

pandas DataFrame
pandas DataFrame