UnixTimestampEncoder transforms data that represents dates and times into numerical values using Unix time (aka Epoch time). The transformed value is the number of nanoseconds that have passed since Jan 1, 1970 00:00:00.000000 UTC.
from rdt.transformers.datetime import UnixTimestampEncoder
transformer = UnixTimestampEncoder()
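To make the "nanoseconds since the epoch" definition concrete, here is a minimal sketch that computes the same kind of value using only the Python standard library. The helper name `to_unix_nanoseconds` is hypothetical and is not part of RDT; the sketch illustrates the definition and does not show how the transformer is implemented internally.

```python
from datetime import datetime, timezone

# The Unix epoch: Jan 1, 1970 00:00:00 UTC.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_unix_nanoseconds(dt):
    """Hypothetical helper: nanoseconds elapsed between the epoch and `dt`."""
    delta = dt - EPOCH
    # Use integer arithmetic throughout to avoid floating-point rounding.
    whole_seconds = delta.days * 86_400 + delta.seconds
    return whole_seconds * 1_000_000_000 + delta.microseconds * 1_000


one_second_after_epoch = to_unix_nanoseconds(
    datetime(1970, 1, 1, 0, 0, 1, tzinfo=timezone.utc)
)
start_of_2000 = to_unix_nanoseconds(datetime(2000, 1, 1, tzinfo=timezone.utc))
```

Values of this size (around 10^18 for present-day dates) are what you should expect to see in the transformed column.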
missing_value_replacement: Add this argument to replace missing values during the transform phase
model_missing_values: Use the
missing_value_generation: Add this argument to determine how to recreate missing values during the reverse transform phase
enforce_min_max_values: Add this argument to allow the transformer to learn the min and max allowed values from the data.
datetime_format: Add this argument to tell the transformer how to read your datetime column if it's in a specific format that isn't easy to identify.
from rdt.transformers.datetime import UnixTimestampEncoder
transformer = UnixTimestampEncoder(missing_value_replacement='mean',
datetime_format='%b %d, %Y %I:%M:%S %p')
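The format string in the example above uses the standard `strftime`/`strptime` directives. As a quick check of what such a string matches, the sketch below parses a made-up sample value with Python's built-in `datetime.strptime`. The sample string is an assumption chosen for illustration, and the month and AM/PM names assume an English (or default C) locale.

```python
from datetime import datetime

# Same format string as in the transformer example above.
datetime_format = '%b %d, %Y %I:%M:%S %p'

# Hypothetical sample value: abbreviated month name, 12-hour clock, AM/PM marker.
sample = 'Feb 14, 2023 09:15:30 PM'

parsed = datetime.strptime(sample, datetime_format)
# %I with %p converts the 12-hour clock reading to a 24-hour value.
hour_24 = parsed.hour
```

Testing a format string against one or two of your own values this way can be a quick sanity check before passing it to the transformer.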
The transformer should be able to automatically detect the most common datetime formats. If you are not sure whether your format can be detected, we recommend trying it without the format string first. If you see an error, supply the format.
Particular confusion might arise if your datetime values have uncommon formats. For example:
- You do not have leading 0's in your months or dates, such as
- You are using something other than hyphens, dashes or colons to separate out the date & time components. Such as
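As an illustration of the two cases listed above, the sketch below writes explicit format strings for values without leading zeros and for values with unusual separators. Both sample strings are hypothetical and are not taken from the RDT documentation; the parsing uses Python's built-in `datetime.strptime`, whose numeric directives accept unpadded input.

```python
from datetime import datetime

# Hypothetical value with no leading zeros in the month or day.
no_leading_zeros = datetime.strptime('3/7/2021', '%m/%d/%Y')

# Hypothetical value using underscores and periods as separators.
odd_separators = datetime.strptime(
    '2021_03_07 14.05.09', '%Y_%m_%d %H.%M.%S'
)
```

Literal characters in the format string (here `/`, `_` and `.`) must match the separators in your data exactly.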
When setting the model_missing_values parameter, consider whether the "missingness" of the data is something important. For example, maybe the user opted out of supplying the info on purpose, or maybe a missing value is highly correlated with another column in your dataset. If "missingness" is something you want to account for, you should model missing values.
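Conceptually, modeling missing values means keeping a separate record of where the gaps were, in addition to filling them in. The sketch below illustrates that idea in plain Python, assuming a mean replacement and a 0/1 indicator list. The variable names are hypothetical, and this is not a description of RDT's actual output columns or implementation.

```python
# Hypothetical numeric timestamps, with None marking a missing entry.
values = [10.0, None, 30.0]

# Replace missing entries with the mean of the observed ones.
observed = [v for v in values if v is not None]
mean_value = sum(observed) / len(observed)
filled = [mean_value if v is None else v for v in values]

# Keep the "missingness" as its own signal: 1.0 where the value was missing.
is_missing = [1.0 if v is None else 0.0 for v in values]
```

With the indicator kept alongside the filled values, a downstream model can learn patterns in where values are missing; without it, that information is lost once the gaps are filled.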