What is the timestamp format of _hoodie_commit_time column in Apache Hudi? - apache-hudi

I'm exploring apache-hudi framework and following the quick guide. I'm trying out incremental query functionality, where we use the column _hoodie_commit_time for determining the incremental pull. I was wondering what is the timestamp format & time-zone for this column. Could anyone help me out here? Here is an example value of _hoodie_commit_time column: 20210730005516.

_hoodie_commit_time is the machines current timestamp at which the commit action performed. This time is decided on the spark driver.
Monotonically increasing timestamp is of YYYYMMDDhhmmss and timezone is machine's timezone.
Some additional concepts can be found here: https://hudi.apache.org/docs/concepts

Related

How to update a date with a time zone in postgresql?

I want to update a date with a timezone (+2 hours) but it ends up as UTC (0 hours)
Date type is 'timestamp-with-timezone'
Query...
update table set date = '2022-05-25 13:28+02:00'
will end up as this in the database.
2022-05-25 11:28:00+00
What's wrong here?
tl;dr
Nothing wrong. Postgres stores values of TIMESTAMP WITH TIME ZONE in UTC, always an offset from UTC of zero. Any submitted offset or zone is used to adjust to UTC.
Details
Date type is 'timestamp-with-timezone'
No such type in standard SQL, nor in Postgres.
I’ll assume you meant TIMESTAMP WITH TIME ZONE.
it ends up as UTC (0 hours)
Read the fine manual. You are seeing documented behavior.
Postgres always stores values in a column of type TIMESTAMP WITH TIME ZONE in UTC, that is, with an offset of zero hours-minutes-seconds.
Any time zone or offset provided with an input is used to adjust into UTC. That provided zone or offset is then discarded.
So the name of the type TIMESTAMP WITH TIME ZONE is a misnomer. First, the authors of the SQL were thinking in terms of offset, not real time zones. Second, any submitted time zone is not stored. A submitted zone is used to adjust and then discarded.
If you need to track the original offset or zone, add an extra column. You’ll have to add code to store the offset amount or the time zone name.
update table set date = '2022-05-25 13:28+02:00' will end up as this in the database. 2022-05-25 11:28:00+00 What's wrong here?
Nothing is wrong. That is a feature, not a bug. Both of those strings represent the very same simultaneous moment.
FYI, database engines vary widely in their behavior handling date-time types and behaviors.
Some do as Postgres does regarding TIMESTAMP WITH TIME ZONE, adjusting to UTC and then discarding any provided time zone or offset. Some others may not.
The SQL standard barely touches on the topic of date-time handling. It declares a few types, and does that poorly with incomplete coverage of all cases. And the standard neglects to define behavior.
So, be very careful when it comes to date-time handling in your database work. Read very carefully the documentation for your particular database engine. Do not make assumptions. Run experiments to validate your understanding. And know that writing portable SQL code for date-time may not be feasible.

How to Store timezone with timestamp in xapi?

I am currently exploring Learning locker which is a LRS and store XAPI statements .I see the timestamp in XAPI should follow ISO 8601 format .I see it can be represented as "2015-01-01T01:00Z" but how can I store the timezone information too like "2007-04-05T12:30−02:00".XAPI documentation suggests setting the timeozone but there is no clear way how we can do it .
Any clue on this will be helpful .
Generally the Z indicates a timezone, specifically the UTC timezone. It is preferred that all statement timestamps are stored using UTC and then if they need to be handled relative to a local timezone then that conversion happens at the point of need rather than via the stored data. If you must store it with the specific timezone then the format you showed with the -02:00 should work. NOTE: In the proposed (nearly accepted) version 2.x of the xAPI specification all timestamps will have to be recorded in UTC, see:
A Timestamp shall be formatted to UTC
(https://gitlab.com/IEEE-SA/xapi/9274.1.1/xapi-base-standard-documentation/-/blob/master/9274.1.1%20xAPI%20Base%20Standard%20for%20Content.md#527-additional-requirements-for-data-types)
So it would likely be better to go ahead and start following that pattern.
If the timezone is relevant for other purposes it would be better to track the timezone itself separately from the timestamp (likely as a result or context extension).

hsqldb TIMESTAMP field not accepting zone as America/Los Angeles

I have a TIMESTAMP field in an hsqldb table that I want to set to "2015-02-11 16:02:01.488 America/Los_Angeles", but the insert fails even if I set the column to TIMESTAMP WITH TIMEZONE, the reason being hsqldb seems to support '2008-08-08 20:08:08-8:00' format but not spelled out like America/Los_Angeles. Is there way to make the insert accept America/Los_Angeles type zones ?
Sorry, but hsqldb doesn't support working with IANA/Olson time zones directly. You are correct that TIMESTAMP WITH TIMEZONE only supports a time zone offset. You can review the hsqldb docs for confirmation.
Many databases do not support named time zone. Oracle and Postgres support them, but most others do not.
Consider also that while a named time zone can usually determine the offset, there are still cases of ambiguity around the fall-back daylight saving time transition. In other words, if you had "2015-11-01 01:30:00 America/Los_Angeles", you could not deterministically tell whether it was Pacific Daylight Time (UTC-07:00) or Pacific Standard Time (UTC-08:00). This is why usually just the offset is stored.
The converse is also true though. If you only store "-08:00" then you can't deterministically know that it came from "America/Los_Angeles".
Here's a general guideline that will help:
If the local time is unimportant, then just store a TIMESTAMP based on UTC.
If the local time is important, but the value will never be modified, then store a TIMESTAMP WITH TIMEZONE, using the local time and it's associated time zone offset.
If the local time is important AND the value can be modified, then store a TIMESTAMP WITH TIMEZONE in one column, and the time zone name (ie. "America/Los_Angeles") in a second VARCHAR column, or elsewhere in your database. During an edit operation, use the time zone name to calculate the offset of the new value. It might be the same, or it may be different.
See also DateTime vs DateTimeOffset, which presents a similar argument for .Net and/or SQL Server.

How to convert a unix timestamp (INT) to monetdb timestamp ('YYYY-MM-DD HH:MM:SS') local time format

Q1: I want to convert a unix timestamp (INT) to monetdb timestamp ('YYYY-MM-DD HH:MM:SS') format
but it is giving me the GMT time not my actual time.
When I do
select (epoch(cast(current_timestamp as timestamp))-epoch(timestamp '2013-04-25 11:49:00'))
where 2013-04-25 11:49:00 is my systems current time it gives the same difference
I tried using
set time zone interval '05:30' HOUR TO MINUTE;
but it did not change the result
How can I solve this problem??
Example Problem:
I wanted to convert unix timestamp 1366869289 which should be around "2013-04-25 11:25:00" but monetdb gives "2013-04-25 05:55:00"
Knowing nothing about MonetDB, but a lot about timezones, I decided to look in their documentation to see what kind of datatypes are supported and how conversions are handled.
I found this page on Temporal data types. Based on that, I can conclude that a timestamp in MonetDB is always intended to reference UTC/GMT time - which is consistent with other systems.
In order to get a value that is for a particular time zone, they offer the following example:
SET TIME ZONE INTERVAL '1' HOUR TO MINUTE
I assume this means to set the database to offset all times by 1 hour, effectively placing the values all in UTC+01:00, such as is the offset for British Summer Time.
The page also goes on to point out the problems that can arise with using just and offset to adjust time values (see TimeZone != Offset in the TimeZone tag wiki). It also offers a list of various named time zones. But it does not show how to set a time zone to one of the named values. Also, their list appears to be proprietary, and incomplete. While at first glance they appear to have similarities to the IANA/Olson time zone database - the identifiers they specify are not valid TZDB names.
There are some other functions listed on this page, without much explanation. One that looks promising for your needs is LOCALTIMESTAMP. Perhaps this will take the local time zone into account, which appears to be what you were looking for.
I could not find any additional details specific to MonetDB date/time/timezone handling. The documentation appears to be fairly incomplete. You might want to reach out to their mailing list.

SQL Reporting Services Daylight saving time query (pt 2)

I posted a question a couple of days ago (SQL Reporting Services Daylight saving time query) which was I received an answer for (thanks very much) but did not elaborate on the whole problem I am experiencing. Not only did I require the returned date time format to account for day light saving but I also need the search parameter #StartDate to allow for DST.
Currently if I key in a scheduled start time of 31/03/2010 11:00 and because the SQL DB has already taken the hours difference into consideration I get no results back. If I key in 31/03/2010 10:00 then the correct details are returned. Is there away using T-SQL or the like to get the search parameter to pass the adjusted time to the DB?
SQL Server 2008 supports the data type "TIMESTAMP WITH TIMEZONE". Might be worth a try.
You can download a 180-day trial version.
In any case, the timezone for an appointment application is what I'd call critical business information. I'd store it in the database, by one of these methods:
using the right data type
building a user-defined data type
adding a column for time zone info