-
-
Notifications
You must be signed in to change notification settings - Fork 67
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[RFC] Refactor timestamp codecs #333
Comments
How naive of me: as always, time-related machinery is like a rabbit hole 😅 AmbiguityGiven
Current LocalDateTime-based encoder "output"
Suggested Instant-based encoder "output"
Both these outputs denote the same point in time. INT96 format itselfThe INT96 format is deprecated (see here) and even lacks any documentation (see discussion here). With all that said, wouldn't it be better to soft-deprecate default timestamp codecs in parquet4s and encourage users to choose INT64-based ones? Also, we could switch to an |
I guess there must be a small bug that's causing it. Definitely, it is undesirable because produces different data.
Yes, it is deprecated, but yet (!) it is still a default format in such priminent tools as Spark, Impala and many others. |
For now,
LocalDateTime
is a "base" type for INT96 timestamp encoding/decoding: conversion for aTimestamp
andInstant
goes viaLocalDateTime
. I see a little mismatch here:LocalDateTime
requires an additional piece of data to become UTC adjusted - timezone.Instant
is already UTC adjusted (andTimestamp
can be directly converted toInstant
).So, it may be a little confusing that you need to specify a timezone to encode/decode an
Instant
or aTimestamp
.Meanwhile, we could use an
Instant
as a "base" type as follows:In such a case, we specify timezone only for a
LocalDateTime
and theencodeInstant
method itself looks a bit simpler than theencodeLocalDateTime
.The text was updated successfully, but these errors were encountered: