RDT: Reversible Data Transforms
How much effort do you spend cleaning and processing your data?
Cleaning and formatting raw data is a foundational element of RDT. But you can use the library to do much more.
Normalize your data using statistical processes. This is especially useful for data science and machine learning projects.
Protect sensitive data while preserving the overall data format. Using RDT, you can remove or anonymize Personally Identifiable Information (PII), replacing it with random, fake values that look like the original ones.
Licensed users can extract deeper concepts that are embedded inside the data. This is particularly useful for complex data types that have a rich, real-world meaning.
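To make the anonymization idea above concrete, here is a minimal sketch of replacing sensitive values with random ones that keep the same layout. It is written in plain Python and is not RDT's API: the `fake_like` helper and the sample phone numbers are hypothetical, chosen only for illustration.

```python
import random
import string


def fake_like(value, rng):
    """Return a random string with the same character layout as `value`.

    Hypothetical helper for illustration only; not part of RDT.
    """
    out = []
    for ch in value:
        if ch.isdigit():
            out.append(rng.choice(string.digits))
        elif ch.isalpha():
            letters = string.ascii_uppercase if ch.isupper() else string.ascii_lowercase
            out.append(rng.choice(letters))
        else:
            out.append(ch)  # keep separators such as '-' or '@' in place
    return "".join(out)


rng = random.Random(0)  # seeded only so the sketch is repeatable
phones = ["555-867-5309", "555-123-4567"]  # made-up sample values
fake_phones = [fake_like(p, rng) for p in phones]
```

Each fake value has the same length and separators as its source, so downstream code that expects the original format should keep working, while the digits themselves no longer come from the original data.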
We first created RDT with the goal of generating synthetic data. The RDT library transforms raw data into a form suitable for machine learning, and then reverse transforms machine-generated data to match the original format. Synthetic data remains a top use case for RDT today.
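The transform and reverse-transform cycle described above can be sketched with a small, self-contained example. This is an assumption-laden illustration in plain Python, not RDT's actual interface: the `StandardizeTransformer` class, its method names, and the sample `ages` column are hypothetical.

```python
import statistics


class StandardizeTransformer:
    """Hypothetical reversible transformer (illustration only, not RDT's API)."""

    def fit(self, values):
        # Learn the statistics needed to transform and later undo the transform.
        self.mean = statistics.fmean(values)
        self.std = statistics.pstdev(values) or 1.0  # avoid dividing by zero
        return self

    def transform(self, values):
        # Raw values -> zero-mean, unit-variance values for a model.
        return [(v - self.mean) / self.std for v in values]

    def reverse_transform(self, values):
        # Model-space values -> values on the original scale.
        return [v * self.std + self.mean for v in values]


ages = [23.0, 35.0, 41.0, 58.0]  # made-up sample column
transformer = StandardizeTransformer().fit(ages)
normalized = transformer.transform(ages)
restored = transformer.reverse_transform(normalized)
```

The key design point is that `fit` stores everything needed to invert the transform, so values produced in the normalized space (for example, by a generative model) can be mapped back to the original scale. In this round trip, `restored` matches `ages` up to floating-point error.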
We open sourced the RDT library because the transformers are useful beyond the synthetic data space. You can use RDT to:
- Preprocess your data for data science and analytics projects
- Sanitize datasets before publishing them broadly for research
- Translate machine output to human-readable data