Does the SQL type TIME WITH TIME ZONE make sense?

While mapping some database columns to Java classes I stumbled onto this obscure SQL-92 Standard type (implemented by PostgreSQL, H2, and HyperSQL, as far as I know). I have never used it, but I want to understand how to map it cleanly to a Java type if I ever come across it.
Here are the variants I can see:
Case A: The TIME type, such as 15:20:01. It's a "local time". The time zone is evident to the application so the database doesn't record it.
Case B: The TIME with offset, as in 15:20:01+04:00. It represents a "world time". This time can be converted trivially to UTC, or to any other world clock.
Case C: A TIME with a time zone, such as 15:20:01 EDT. Since the rules to interpret a time strongly depend on the specific date I can't really make any sense of it without the date; but then, if I add the date, it becomes a TIMESTAMP, and that's something totally different.
So, did the SQL Standard get it wrong? Or maybe "TIME with time zone" should be always interpreted as "time with offset" (case B)?

For many reasons, which you described well, a time of day with a variable time zone but without a date does not identify a point in time, so interpreting it as one is effectively undefined. There are use cases, though: when you're establishing a policy in an international context, this is a helpful data type. Every day at 15:20:01+04:00 the cats need to take a nap. The intention isn't to evaluate the value in isolation but in the context of adding it to a baseline date. Standards are all about supporting theoretical possibilities even if they're not super common.

Case C, a TIME with a time zone, such as 15:20:01 EDT, can be meaningful for things like store opening hours. Imagine you have a nationwide chain of stores. You want to store each store's standard opening hours in the database. The opening and closing time is a local time with an associated time zone. It isn't a time with a UTC offset (your case B), since it is defined in each store's local time zone, and hence daylight savings–or more rarely a change in the time zone definition–will change the UTC offset without actually changing the value of the opening time column. This store opens at 9am year round, but because its time zone has daylight savings, that is a different UTC offset at different times of year. But we aren't storing a date, because the standard opening/closing times are date-independent. (Maybe we'd have effective-from/effective-to dates, or similar, to track changes to standard opening hours over time.)
It isn't exactly case A, because imagine you have a table of stores, with opening_time and closing_time columns – if they are in different timezones, then case A would make those columns be a mix of data from different time zones, without being explicit about that. Now, given the poor support for case C in most databases, that's probably what happens – you'll probably store the time zone as an additional column. But Case C isn't useless in principle, unlike what many people think.
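The opening-hours case can be sketched outside the database. This is a minimal illustration in Python, assuming Python 3.9+ with the standard zoneinfo module and IANA zone data installed; the store, its 09:00 opening time, and the helper name opening_instant_utc are hypothetical. It suggests why the stored value (local time plus zone name) stays fixed while the UTC offset only becomes known once a date is supplied.

```python
from datetime import date, datetime, time, timezone
from zoneinfo import ZoneInfo

# Hypothetical store record: a local opening time plus a named zone (case C).
OPENING_TIME = time(9, 0)
STORE_ZONE = ZoneInfo("America/New_York")

def opening_instant_utc(day: date) -> datetime:
    """Combine the date-independent opening time with a concrete date."""
    local = datetime.combine(day, OPENING_TIME, tzinfo=STORE_ZONE)
    return local.astimezone(timezone.utc)

# Same stored opening time, two different dates.
winter = opening_instant_utc(date(2024, 1, 15))
summer = opening_instant_utc(date(2024, 7, 15))
print(winter.time(), summer.time())
```

The two printed UTC times differ by one hour even though the stored opening time never changed, which is the distinction between case B and case C.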

Related

How to update a date with a time zone in postgresql?

I want to update a date with a timezone (+2 hours) but it ends up as UTC (0 hours)
Date type is 'timestamp-with-timezone'
Query...
update table set date = '2022-05-25 13:28+02:00'
will end up as this in the database.
2022-05-25 11:28:00+00
What's wrong here?
tl;dr
Nothing wrong. Postgres stores values of TIMESTAMP WITH TIME ZONE in UTC, always an offset from UTC of zero. Any submitted offset or zone is used to adjust to UTC.
Details
"Date type is 'timestamp-with-timezone'"
No such type in standard SQL, nor in Postgres.
I’ll assume you meant TIMESTAMP WITH TIME ZONE.
"it ends up as UTC (0 hours)"
Read the fine manual. You are seeing documented behavior.
Postgres always stores values in a column of type TIMESTAMP WITH TIME ZONE in UTC, that is, with an offset of zero hours-minutes-seconds.
Any time zone or offset provided with an input is used to adjust into UTC. That provided zone or offset is then discarded.
So the name of the type TIMESTAMP WITH TIME ZONE is a misnomer. First, the authors of the SQL standard were thinking in terms of offsets, not real time zones. Second, any submitted time zone is not stored: a submitted zone is used to adjust to UTC and is then discarded.
If you need to track the original offset or zone, add an extra column. You’ll have to add code to store the offset amount or the time zone name.
"update table set date = '2022-05-25 13:28+02:00' will end up as this in the database: 2022-05-25 11:28:00+00. What's wrong here?"
Nothing is wrong. That is a feature, not a bug. Both of those strings represent the very same moment.
FYI, database engines vary widely in how they handle date-time types.
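The equivalence of the two strings can be checked without a database. The following is a small Python sketch, not Postgres itself; it only mimics the adjust-to-UTC step described above, and the variable names are illustrative.

```python
from datetime import datetime, timezone

# The value submitted in the UPDATE, with its +02:00 offset.
submitted = datetime.fromisoformat("2022-05-25 13:28+02:00")

# Adjust to UTC, as the answer describes for TIMESTAMP WITH TIME ZONE.
stored = submitted.astimezone(timezone.utc)

print(stored.isoformat(sep=" "))
print(submitted == stored)  # aware datetimes compare by instant, not by text
```

The comparison is true because aware date-time values are compared as moments on the timeline, regardless of the offset used to write them down.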
Some do as Postgres does regarding TIMESTAMP WITH TIME ZONE, adjusting to UTC and then discarding any provided time zone or offset. Some others may not.
The SQL standard barely touches on the topic of date-time handling. It declares a few types, and does that poorly with incomplete coverage of all cases. And the standard neglects to define behavior.
So, be very careful when it comes to date-time handling in your database work. Read very carefully the documentation for your particular database engine. Do not make assumptions. Run experiments to validate your understanding. And know that writing portable SQL code for date-time may not be feasible.

Postgresql Performance: What Is the Best Way to Use pg_timezone_names?

We use only timestamps without time zone for a global application. However, some things have to be in local times for user convenience. In order for that to work, we have to deal with the conversion from local to UTC, including handling daylight savings. We don't need precision below that of minute.
pg_timezone_names contains everything we need, including the unambiguous long string for time zone name (e.g., 'US/Eastern'), the interval utc_offset, and the boolean is_dst. (I am assuming the latter two values change as dst boundaries are crossed.)
I am trying to figure out the best performance model, assuming we ultimately have millions of users. Here are the options being considered:
1. TZ name string ('US/Eastern') in the table for the location. Every time a time transformation (from local to UTC or back) is needed, we directly call pg_timezone_names for the utc_offset of that time zone. (This is assuming that view is well-indexed.) Index on the string in the location table, of course.
2. Local table time_zones replicating pg_timezone_names, but adding id and boolean in_use columns (and dropping the abbreviation). Include tz_id in the location table as a foreign key instead of the string.
With the local table, a procedure fires at one minute past every hour over the 26 hours or so during which time zones can change. It checks the list of in_use time zones that have just passed 2 AM Sunday (based on the locally stored offset) and calls pg_timezone_names for the updated utc_offset and is_dst values. A trigger on the local table checks whenever a zone goes into use and makes sure it has the correct values.
The question is whether it is faster to evaluate the indexed string in the location table and then pull the offset from pg_timezone_names every time it is needed, or use a local time_zones table to pull the offset with the FK. I'm thinking the second will be much faster, because it avoids the initial string handling, but it really depends on the speed of the view pg_timezone_names.
After researching this more and discussing with a colleague, I've realized a flaw in the second option above. That option would indeed be quite a bit faster, but it only works if one wishes to pull the current utc_offset for a time zone. If one needs the offset for a timestamp that is not current, or for a range of timestamps, the built-in Postgres time zone data has to be used, so that each timestamp can be converted with AT TIME ZONE, which applies the appropriate daylight saving rule for that particular timestamp.
It's slower, but I don't think it can be improved, unless one is only interested in the current timestamp conversion, which is extremely unlikely.
So I am back to the first option, and indexing the time zone string in the local table is no longer necessary, as it would never be searched or sorted on.
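The flaw in the cached-offset approach can be sketched in Python. This is an illustration under assumptions, not the Postgres implementation: it uses the standard zoneinfo module (Python 3.9+, IANA data installed), the zone name America/New_York as the canonical equivalent of 'US/Eastern', and a hypothetical CACHED_OFFSET standing in for the utc_offset column of a local time_zones table captured in winter.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo

ZONE = ZoneInfo("America/New_York")

def local_to_utc(naive_local: datetime) -> datetime:
    """Per-timestamp conversion, analogous to AT TIME ZONE with a zone name."""
    return naive_local.replace(tzinfo=ZONE).astimezone(timezone.utc)

# Hypothetical cached offset, as option 2 would store it during winter.
CACHED_OFFSET = timedelta(hours=-5)

def local_to_utc_cached(naive_local: datetime) -> datetime:
    """Applies one fixed offset regardless of the timestamp's date."""
    return (naive_local - CACHED_OFFSET).replace(tzinfo=timezone.utc)

january = datetime(2024, 1, 10, 12, 0)
july = datetime(2024, 7, 1, 12, 0)
print(local_to_utc(january), local_to_utc_cached(january))  # both agree
print(local_to_utc(july))         # uses the summer rule for that date
print(local_to_utc_cached(july))  # off by one hour
```

The cached offset is only right for timestamps that fall under the same daylight saving rule as the moment the offset was captured, which matches the conclusion above.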

Time zone conversion between UTC and local time + possibly daylight saving

I'm struggling to deal with time zones and daylight saving when querying SAP HANA. The date-time stamp is stored as NVARCHAR in the format YYYYMMDDHHMISS, e.g. 20210304132500, in UTC. Local time is therefore 14:25:00 (GMT+01:00), but my query returns 13:25:00 (UTC). How do I edit my results to match local time? Sample query below if that helps.
SELECT DATE_TIME,LOCATION,PART_NUMB
FROM "PUBLIC"."internal.sap.datamodel::ACTIVITY"
WHERE SUBSTRING(DATE_TIME,9,2) IN ('08','11')
The desired result is local date_time in any format.
HANA comes with timezone conversion functions (UTCTOLOCAL) that can perform the necessary calculations.
These functions require that the data/time input is in either SQL date/time format or that it can be converted to that. They also require that the timezone data has been set up and maintained in the HANA DB. This is the actual information about which timezone has which offsets and daylight saving begin and end times.
For your example, it may make sense to expose DATE_TIME as a type-converted field DATE_TIME_UTC that is already a SQL date-time:
to_seconddate (DATE_TIME, 'YYYYMMDDHHMISS') as DATE_TIME_UTC
With this conversion done, you can convert the timezone like this:
UTCTOLOCAL (DATE_TIME_UTC, 'Berlin', 'platform') as LOCAL_DATE_TIME
Note that the target time zone name may be something like "GMT+1", but this is really just a name and not a calculation instruction. If "GMT+1" is not found in the list of timezone conversions, HANA won't just add an hour; it won't perform the calculation at all.
With this data type and timezone conversion done, you could have a WHERE clause like this:
WHERE
HOUR(LOCAL_DATE_TIME) IN (8, 11)
This order of transformations (data type -> time zone -> hour component) is of course rather expensive. It may be worthwhile to check whether the resulting query performance is satisfactory on realistic data volume.
Also important to note is that time zone conversion only works on complete date-time information, not just the time. That is to say, if the date is unknown, it cannot be determined which offset rule between two time zones applies. So, simply separating the hours and date components won't help in this case.
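The same order of transformations (data type, then time zone, then hour component) can be sketched in Python for the value from the question. This is not HANA code; it is an illustration that assumes Python 3.9+ with zoneinfo and uses the IANA name Europe/Berlin as a stand-in for the target zone.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

raw = "20210304132500"  # the NVARCHAR value from the question, in UTC

# Step 1: string -> typed UTC date-time (the TO_SECONDDATE step).
utc_value = datetime.strptime(raw, "%Y%m%d%H%M%S").replace(tzinfo=timezone.utc)

# Step 2: UTC -> local time in a named zone (the UTCTOLOCAL step).
local_value = utc_value.astimezone(ZoneInfo("Europe/Berlin"))

# Step 3: filter on the local hour component, as in the WHERE clause.
matches = local_value.hour in (8, 11)
print(local_value.isoformat(), matches)
```

Note that step 2 needs the full date: the same 13:25 UTC on a summer date would map to a different local hour.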
Finally, I've written quite a bit about handling date, time, and time zones in HANA, you may want to have a look at that:
The time is now, isn’t it?
Trouble with time?
You got the time?

SQL equals does not work for timestamps?

My table has a column 'timestamp' where the timestamps are formatted 2015-06-22 18:59:59
However, using DBVisualizer Free 9.2.8 and Vertica, when I try to pull up rows by timestamp with a
SELECT * FROM table WHERE timestamp = '2015-06-22 18:59:59';
(directly copy-pasting the stamp), nothing comes up. Why is this happening and is there a way around it?
FYI, saying "the timestamps are formatted 2015-06-22 18:59:59" is incorrect if you are indeed using a TIMESTAMP type. Such types have their own internal representation of a date-time value, almost always a count since epoch. In your case with Vertica, 8 bytes are used for such storage. The formatting of the date-time value happens when a string representation is generated. Never confuse the string representation with the date-time value. Conflating the two may well be related to your problem/confusion.
A few different thoughts about possible problems…
String Literals
Are you sure Vertica takes strings as timestamp literals? That format you used is common SQL format. But given that Vertica seems to be a specialized database, I would double-check that.
If strings are not allowed, you may need to call some kind of function to transform the string into a date-time value.
Fractional Second
As the comment by Martin Smith points out, the doc for Timestamp-related data types in Vertica 7.1 says those types can have a fractional second to a resolution of microseconds. That means up to 6 decimal places of a fraction.
So if you are searching for "2015-06-22 18:59:59" but the stored value is "2015-06-22 18:59:59.012345", no match on the query.
Half-Open
The fractional seconds issue described above is often the cause of problems people have when handling a span of time. If you naïvely try to pinpoint the ending time, you are likely to have problems. Seeing the "59:59" in your example string makes me think this applies to you.
The better approach to spans of time is "Half-Open" (or Half-Closed, whatever) where the beginning is inclusive while the ending is exclusive. Common notation for this is [). In comparison logic this means: value >= start AND value < stop. Notice the lack of EQUALS SIGN in the stop comparison. In English we would say "look for an hour's worth of invoices starting at 2:00 PM and going up to, but not including, 3:00 PM".
Half-Open for a week means Monday-Monday, for a month the first of one month to the first of the next month, and for a year the January 1 of one year to January 1 of the following year.
Half-Open means not using BETWEEN in SQL. SQL's BETWEEN has often been criticized because it is inclusive at both ends. Instead do something like the following to look for an hour's worth of invoices. Notice the Z on the end of the string literal, which means "UTC time zone" ("Z" for "Zulu"). (But verify, as my SQL syntax may need fixing.)
SELECT *
FROM some_table_
WHERE invoice_received_ >= '2015-06-22 18:00:00Z'
AND invoice_received_ < '2015-06-22 19:00:00Z'
;
This query will catch any value such as '2015-06-22 18:59:59.654321', which seems to be eluding you.
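The difference between the equality test and the Half-Open test can be sketched in Python, independent of Vertica. The stored value below is hypothetical; it simply carries a fractional second like the one suspected above.

```python
from datetime import datetime

# Hypothetical stored value with microseconds, and the copy-pasted literal.
stored = datetime(2015, 6, 22, 18, 59, 59, 654321)
literal = datetime(2015, 6, 22, 18, 59, 59)

# What "WHERE timestamp = '2015-06-22 18:59:59'" effectively tests.
equality_match = stored == literal

# Half-Open: beginning inclusive, ending exclusive.
start = datetime(2015, 6, 22, 18, 0, 0)
stop = datetime(2015, 6, 22, 19, 0, 0)
half_open_match = start <= stored < stop

print(equality_match, half_open_match)
```

The equality test misses the row because of the fraction, while the Half-Open range finds it without needing to know the stored precision.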
Reserved Word
I hope you have not really named your table 'table' and your column 'timestamp'. Such use of keywords and reserved words can cause explicit errors or more subtle weird problems.
Tip: The easy way to avoid any of the over a thousand reserved words in various databases is to append a trailing underscore. The SQL standard explicitly promises to never use a trailing underscore in its reserved words. So use "timestamp_" rather than "timestamp". Another example: "invoice_" table and "received_" column. I recommend doing that as a habit on everything you name in SQL: columns, tables, constraints, indexes, and so on.
Time Zone
You are using TIMESTAMP, which is short for TIMESTAMP WITHOUT TIME ZONE. Or so I presume; the Vertica doc is vague but that is the common usage as seen in the Postgres doc, and may even be standard SQL.
Anyways, TIMESTAMP WITHOUT TIME ZONE is usually the wrong type for most business purposes. The WITH TIME ZONE type is misnamed and often misunderstood as a consequence: it means "with respect for time zone", where inputs that include an offset from UTC or other time zone information are adjusted to UTC during INSERT/UPDATE operations. The WITHOUT type simply ignores any such offset or time zone information.
The WITHOUT type should only be used for the concept of a date-time generally without being tied to any one locality. For example, saying "Christmas this year starts at beginning of December 25, 2015". That means in any time zone rather than a specific time zone. Obviously Christmas starts earlier in Paris, for example, than in Montréal.
If you are timestamping legal documents such as invoices, or booking appointments with people across time zones, or scheduling shipments in various localities, you should be using WITH time zone type.
So back to your possible problem: Test how Vertica or your client app or your database driver is handling your input string. It may be adjusting time zones as part of the parsing of the string using your client machine’s current default time zone. When sent to the database, that value will not match the stored value if during storage no adjustment to UTC was made.
Tip: Generally, best practice is to do all your storage and business logic in UTC, adjusting to local time zones only where expected by the user.

Oracle Date field - Time issues

We have two databases, in two separate locations. One of the databases resides in a different time zone from our users.
The problem is that when the database located in the other time zone is updated with a Date value, the database automatically subtracts 1:00 hour from the Date it was passed.
The issue is that, when passing a NULL date (12:00:00), the DAY value is changed to a previous day.
The updates are done via stored procedures, and the front end is a VB.NET smartclient.
How would you handle this the proper way? I basically don't even want to store the TIME at all, but I can't seem to figure out how to do that.
Not clear on what datetime you want in the database, or what the application is passing.
Assume the user's PC is telling him it is Tuesday, 12:30am, and the clock on the Db server is saying Monday, 11:30pm.
If you insert a value for the 'current date' (eg TRUNC(SYSDATE)) then, as far as the database is concerned, it is still Monday.
If you insert a value for the 'current time' (e.g. SYSDATE), it is also still Monday.
If you insert a value for the session's current time and time zone (e.g. CURRENT_TIMESTAMP) and ask the database to store it, it will store 11:30pm.
If you ask the database to store the datetime '2009-12-31 14:00:00', then that is what it will store. If you ask it to store the datetime/timezone '2009-12-31 14:00:00 +08:00', then you are in the advanced manual. You can ask the database to store timestamps with time zone data. Also consider daylight saving.
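The day change described in the question can be reproduced with plain arithmetic. This is a Python sketch of the symptom only, not of Oracle or VB.NET behavior; the date and the one-hour shift are taken from the examples above, and the variable names are illustrative.

```python
from datetime import date, datetime, timedelta

# A "date only" value sent as a date-time at midnight (hypothetical input).
sent = datetime(2009, 12, 31, 0, 0, 0)

# A one-hour adjustment between client and server zones, as in the question.
shifted = sent - timedelta(hours=1)
print(shifted.date())  # the calendar day moved back

# Passing a plain calendar date leaves no time component to shift.
date_only = date(2009, 12, 31)
print(date_only)
```

Any value that carries midnight as its time is one small zone adjustment away from landing on the previous day, which is why a date-only value or a fixed UTC convention avoids the problem.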
I would investigate using the TRUNC function in your stored proc method that updates the table. If the data type in the method (that updates the table) is not a DATE type then use the to_date function in conjunction with the TRUNC function.
This is outside the scope of the question you are asking, but I would recommend that in ALL cases where users are accessing a database from different time zones, the server and database clocks be set to UTC. It is probably too late for that, but setting the database server to UTC eliminates the problems caused by daylight saving time and different time zones.
In my opinion, Date/Time data can and should always be stored in UTC. This data can be converted to local time at the point where it is presented to the user. Oracle actually makes this easy with the TIMESTAMP WITH TIME ZONE data type. It allows you to access the data either as UTC (SYS_EXTRACT_UTC) or as local time (local to the database server).
It is never the same day in all places in the world, so dates cannot be considered without time.
Of course another of my opinions is that Daylight Savings time should be eliminated. But that is another topic.