Pretty simple, I need to disable time zone conversion for specific columns. I will handle any TZ conversion manually, but I need Rails 3 to forego conversion in both writing and reading, and any AREL functions. But, I don't want to disable the conversion for non-specified attributes.
Ok, I know how to disable it for reading:
self.skip_time_zone_conversion_for_attributes = [:test_timestamp]
But this only works for reading. When writing the attribute, it still converts to UTC (yes, I tested this in 3.2.8).
As you note, skip_time_zone_conversion_for_attributes only works for reading, which makes the whole feature pretty useless.
There are two possible solutions:
1.- Accept that times will be written in UTC, and convert them when reading:
def starts_at # override reader method
  attributes['starts_at'].in_time_zone(whatever_timezone)
end
cons: the overridden method is bypassed when using MyModel.pluck(:starts_at).
2.- Store time values as strings, taking care of writing the values in the right format, and reading them in the desired timezone.
def starts_at
  DateTime.strptime(attributes['starts_at'], whatever_format).in_time_zone(whatever_timezone)
end
cons: one loses the ability to query the database by using date operators (less than, greater than).
We use only timestamps without time zone for a global application. However, some things have to be in local times for user convenience. In order for that to work, we have to deal with the conversion from local to UTC, including handling daylight savings. We don't need precision below that of minute.
pg_timezone_names contains everything we need, including the unambiguous long string for time zone name (e.g., 'US/Eastern'), the interval utc_offset, and the boolean is_dst. (I am assuming the latter two values change as dst boundaries are crossed.)
I am trying to figure out the best performance model, assuming we ultimately have millions of users. Here are the options being considered:
TZ name string ('US/Eastern') in the table for the location. Every time a time transformation (from local to UTC or back) is needed, we directly call pg_timezone_names for the utc_offset of that time zone. (This is assuming that view is well-indexed.) Index on the string in the location table, of course.
Local table time_zones replicating pg_timezone_names, but adding id and boolean in_use columns (and dropping the abbreviation.) Include tz_id in the location table as a foreign key instead of the string.
In the case of a local table, use a procedure that fires at one minute past every hour, around the clock, over the 26 hours or so during which time zones can change. It checks the list of in_use time zones that have just passed 2 AM Sunday (based on the locally stored offset) and calls pg_timezone_names for the updated offset and is_dst values. A trigger on the local table also checks whenever a zone goes into use and makes sure it has the correct values.
The question is whether it is faster to evaluate the indexed string in the location table and then pull the offset from pg_timezone_names every time it is needed, or use a local time_zones table to pull the offset with the FK. I'm thinking the second will be much faster, because it avoids the initial string handling, but it really depends on the speed of the view pg_timezone_names.
After researching this more and discussing it with a colleague, I've realized a flaw in the second option above. That option would indeed be quite a bit faster, but it only works if one wishes to pull the current utc_offset for a time zone. If one needs the offset for a timestamp that is not current, or for a range of timestamps, the built-in Postgres time zone data has to be used, so that each timestamp can be converted with AT TIME ZONE, which applies the appropriate daylight saving conversion for that particular timestamp.
It's slower, but I don't think it can be improved, unless one is only interested in the current timestamp conversion, which is extremely unlikely.
So I am back to the first option, and indexing the time zone string in the local table is no longer necessary, as it would never be searched or sorted on.
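The flaw described above (a single stored offset per zone is only right for "now") can be shown outside the database. This is a minimal sketch in Python, not the Postgres mechanism itself. It assumes the standard zoneinfo module (Python 3.9+) and an IANA time zone database on the host; the function names, the zone 'America/New_York', and the sample dates are illustrative choices, not from the original post.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo


def utc_offset_at(local_naive, zone_name):
    """Return the UTC offset in effect in zone_name at a local wall-clock time."""
    return local_naive.replace(tzinfo=ZoneInfo(zone_name)).utcoffset()


def local_to_utc(local_naive, zone_name):
    """Convert a local wall-clock time to UTC using the rules valid at that instant."""
    aware = local_naive.replace(tzinfo=ZoneInfo(zone_name))
    return aware.astimezone(timezone.utc)


# Same zone, same time of day, different dates: the offset differs,
# so one cached utc_offset value cannot serve both timestamps.
winter_offset = utc_offset_at(datetime(2015, 1, 15, 12, 0), "America/New_York")
summer_offset = utc_offset_at(datetime(2015, 7, 15, 12, 0), "America/New_York")
offsets_differ = winter_offset != summer_offset
```

Storing only the zone name string and resolving the offset per timestamp, as in the first option, is what keeps historical and future conversions correct.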
While mapping some database columns to Java classes, I stumbled onto this obscure SQL-92 Standard type, TIME WITH TIME ZONE (implemented by PostgreSQL, H2, and HyperSQL afaik). I haven't ever used it, but I wanted to understand how to map it cleanly to a Java type if I ever come across it.
Here are the variants I can see:
Case A: The TIME type, such as 15:20:01. It's a "local time". The time zone is evident to the application so the database doesn't record it.
Case B: The TIME with offset, as in 15:20:01+04:00. It represents a "world time". This time can be converted trivially to UTC, or to any other world clock.
Case C: A TIME with a time zone, such as 15:20:01 EDT. Since the rules to interpret a time strongly depend on the specific date I can't really make any sense of it without the date; but then, if I add the date, it becomes a TIMESTAMP, and that's something totally different.
So, did the SQL Standard get it wrong? Or maybe "TIME with time zone" should be always interpreted as "time with offset" (case B)?
For the reasons you described well, a time of day with a variable time zone but without a date is effectively undefined as a point in time. There are use cases, though: when you're establishing a policy within an international context, this would be a helpful data type. Every day at 15:20:01+04:00 the cats need to take a nap. The intention isn't to evaluate the value in isolation, but in the context of adding it to a baseline date. Standards are all about supporting theoretical possibilities even if they're not super common.
Case C, a TIME with a time zone, such as 15:20:01 EDT, can be meaningful for things like store opening hours. Imagine you have a nationwide chain of stores. You want to store each store's standard opening hours in the database. The opening and closing time is a local time with an associated time zone. It isn't a time with a UTC offset (your case B), since it is defined in each store's local time zone, and hence daylight savings–or more rarely a change in the time zone definition–will change the UTC offset without actually changing the value of the opening time column. This store opens at 9am year round, but because its time zone has daylight savings, that is a different UTC offset at different times of year. But we aren't storing a date, because the standard opening/closing times are date-independent. (Maybe we'd have effective-from/effective-to dates, or similar, to track changes to standard opening hours over time.)
It isn't exactly case A, because imagine you have a table of stores, with opening_time and closing_time columns – if they are in different timezones, then case A would make those columns be a mix of data from different time zones, without being explicit about that. Now, given the poor support for case C in most databases, that's probably what happens – you'll probably store the time zone as an additional column. But Case C isn't useless in principle, unlike what many people think.
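To make the opening-hours argument concrete, here is a rough sketch of "case C" stored as a local time plus a zone name in an additional field, which is the workaround mentioned above. It is written in Python and assumes the standard zoneinfo module (Python 3.9+) with an IANA time zone database available. The class name ZonedLocalTime, the zone 'America/Chicago', and the dates are hypothetical examples.

```python
from dataclasses import dataclass
from datetime import date, datetime, time, timezone
from zoneinfo import ZoneInfo


@dataclass(frozen=True)
class ZonedLocalTime:
    """A date-independent local time tied to a named time zone (case C)."""
    local_time: time
    zone_name: str

    def on(self, day):
        """Resolve to a UTC instant once a concrete date is supplied."""
        local = datetime.combine(day, self.local_time, tzinfo=ZoneInfo(self.zone_name))
        return local.astimezone(timezone.utc)


# The stored value never changes: the store opens at 9am local time all year.
opening = ZonedLocalTime(time(9, 0), "America/Chicago")

# Only when a date is added does a UTC offset (and a UTC instant) appear.
winter_opening_utc = opening.on(date(2015, 1, 15))
summer_opening_utc = opening.on(date(2015, 7, 15))
```

The two resolved instants fall at different UTC hours even though the stored opening time is identical, which is why a fixed offset (case B) would not model this column.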
I have a string in the following format:
14:41:21 Dec 15, 2015 PST
I want to convert that to my server's local time, but I think I'm creating an extra step that can be avoided:
Dim testdate As Date
DateTime.TryParseExact(dateinput, "HH:mm:ss MMM dd, yyyy PST", CultureInfo.InvariantCulture, DateTimeStyles.None, testdate)
testdate = TimeZoneInfo.ConvertTimeToUtc(testdate, TimeZoneInfo.FindSystemTimeZoneById("Pacific Standard Time"))
testdate = testdate.ToLocalTime()
I've played around with this but always off by a couple hours either way, and the above is what I've found to work but just wanted to know if there was a better way. Also note it could be deployed on multiple servers, so I don't want to specify the timezone to convert it to explicitly, reason for localtime.
A few things:
If you're going to include fixed text in a format string, put it in single-tick quotes so it can't get misinterpreted as a formatting token. ('PST')
In the general case, time zone abbreviations should only be used for display purposes. They should not be parsed as input, as they could be ambiguous. For example, there are 5 different interpretations of CST. It might be US Central Standard Time, but it could also be China Standard Time, or one of the others. See the list on Wikipedia.
If you have a limited number of time zone abbreviations you want to support, then you could extract it from the string and use a dictionary, select/case statements, or conditional logic to map them. Just be certain you know the entire set of abbreviations you want to support and exactly which time zones you want them to map to. Also be sure to account for daylight time abbreviations, such as PDT.
Note that some older standards, such as RFC 2822 §4.3 indeed hardcode a few abbreviations, so you may choose to support those if you are parsing that particular format. (Yours is similar, but not quite a match.)
Your code is mostly ok, but you should probably check the result of TryParseExact. Otherwise you might as well use ParseExact which will throw an exception on failure instead of just returning false.
You could use ConvertTime with TimeZoneInfo.Local as the destination zone if you wanted to do the conversion in a single step. The code would be slightly smaller, though would have no technical differences.
Are you sure you really want to do this? Relying on the system's local time zone usually should not be done in server-based applications. That's something more appropriate for desktop and mobile. In general, server-side code should not rely on the system time zone to be anything in particular. Avoid "local time" APIs, including DateTime.Now, TimeZoneInfo.Local, ToLocalTime, and ToUniversalTime (when it assumes the input is local time). It is better to supply the applicable time zone in your business logic or application configuration.
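The dictionary approach mentioned in the points above might look roughly like this. It is a sketch in Python rather than VB.NET, and the mapping, the function name, and the decision to reject unknown abbreviations are all assumptions for illustration. Only PST and PDT are mapped here, to their fixed offsets of UTC-8 and UTC-7; anything else is refused rather than guessed.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical, deliberately small mapping: list only the abbreviations
# you have decided to support, and exactly what each one means to you.
ABBREVIATION_OFFSETS = {
    "PST": timezone(timedelta(hours=-8), "PST"),
    "PDT": timezone(timedelta(hours=-7), "PDT"),
}


def parse_with_abbreviation(text):
    """Parse e.g. '14:41:21 Dec 15, 2015 PST' and return the instant in UTC."""
    # Split the trailing abbreviation off instead of parsing it as a format token.
    body, _, abbreviation = text.rpartition(" ")
    if abbreviation not in ABBREVIATION_OFFSETS:
        raise ValueError(f"unsupported time zone abbreviation: {abbreviation!r}")
    # %b expects English month names here, which assumes the default C locale.
    naive = datetime.strptime(body, "%H:%M:%S %b %d, %Y")
    return naive.replace(tzinfo=ABBREVIATION_OFFSETS[abbreviation]).astimezone(timezone.utc)


parsed_utc = parse_with_abbreviation("14:41:21 Dec 15, 2015 PST")
```

Returning UTC and leaving any conversion to a display zone for the caller keeps the server's own time zone setting out of the picture.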
My table has a category 'timestamp' where the timestamps are formatted 2015-06-22 18:59:59
However, using DBVisualizer Free 9.2.8 and Vertica, when I try to pull up rows by timestamp with a
SELECT * FROM table WHERE timestamp = '2015-06-22 18:59:59';
(directly copy-pasting the stamp), nothing comes up. Why is this happening and is there a way around it?
FYI, saying "the timestamps are formatted 2015-06-22 18:59:59" is incorrect if you are indeed using a TIMESTAMP type. Such types have their own internal representation of a date-time value, almost always a count since epoch. In your case with Vertica, 8 bytes are used for such storage. The formatting of the date-time value happens when a string representation is generated. Never confuse the string representation with the date-time value. Conflating the two may well be related to your problem/confusion.
A few different thoughts about possible problems…
String Literals
Are you sure Vertica takes strings as timestamp literals? That format you used is common SQL format. But given that Vertica seems to be a specialized database, I would double-check that.
If strings are not allowed, you may need to call some kind of function to transform the string into a date-time value.
Fractional Second
As the comment by Martin Smith points out, the doc for Timestamp-related data types in Vertica 7.1 says those types can have a fractional second to resolution of microseconds. That means up to 6 decimal places of a fraction.
So if you are searching for "2015-06-22 18:59:59" but the stored value is "2015-06-22 18:59:59.012345", no match on the query.
Half-Open
The fractional seconds issue described above is often the cause of problems people have when handling a span of time. If you naïvely try to pinpoint the ending time, you are likely to have problems. Seeing the "59:59" in your example string makes me think this applies to you.
The better approach to spans of time is "Half-Open" (or Half-Closed, whatever) where the beginning is inclusive while the ending is exclusive. Common notation for this is [). In comparison logic this means: value >= start AND value < stop. Notice the lack of EQUALS SIGN in the stop comparison. In English we would say "look for an hour's worth of invoices starting at 2:00 PM and going up to, but not including, 3:00 PM".
Half-Open for a week means Monday-Monday, for a month the first of one month to the first of the next month, and for a year the January 1 of one year to January 1 of the following year.
Half-Open means not using BETWEEN in SQL. SQL's BETWEEN has often been criticized. Instead do something like the following to look for an hour's worth of invoices. Notice the Z on the end of the string literal, which means "UTC time zone" ("Z" for "Zulu"). (But verify, as my SQL syntax may need fixing.)
SELECT *
FROM some_table_
WHERE invoice_received_ >= '2015-06-22 18:00:00Z'
AND invoice_received_ < '2015-06-22 19:00:00Z'
;
This query will catch any values such as '2015-06-22 18:59:59.654321', which seem to be eluding you.
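The same comparison logic can be checked outside SQL. A small sketch in Python, where the helper name in_half_open and the sample values are made up for illustration: an equality test misses a value carrying a fractional second, while the half-open test finds it.

```python
from datetime import datetime


def in_half_open(value, start, stop):
    """True when start <= value < stop (start inclusive, stop exclusive)."""
    return start <= value < stop


stored = datetime(2015, 6, 22, 18, 59, 59, 654321)  # has microseconds
searched = datetime(2015, 6, 22, 18, 59, 59)        # what was typed in the query

hour_start = datetime(2015, 6, 22, 18, 0, 0)
hour_stop = datetime(2015, 6, 22, 19, 0, 0)

exact_match = stored == searched                           # misses the row
range_match = in_half_open(stored, hour_start, hour_stop)  # finds the row
```

Because the stop value is exclusive, consecutive hours never overlap and never leave a gap, whatever the fractional-second resolution of the stored values.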
Reserved Word
I hope you have not really named your table 'table' and your column 'timestamp'. Such use of keywords and reserved words can cause explicit errors or more subtle weird problems.
Tip: The easy way to avoid any of the over a thousand reserved words in various databases is to append a trailing underscore. The SQL standard explicitly promises never to use a trailing underscore in its reserved words. So use "timestamp_" rather than "timestamp". Another example: "invoice_" table and "received_" column. I recommend doing that as a habit on everything you name in SQL: columns, tables, constraints, indexes, and so on.
Time Zone
You are using the TIMESTAMP which is short for TIMESTAMP WITHOUT TIME ZONE. Or so I presume; the Vertica doc is vague but that is the common usage as seen in the Postgres doc, and may even be standard SQL.
Anyways, TIMESTAMP WITHOUT TIME ZONE is usually the wrong type for most business purposes. The WITH TIME ZONE type is misnamed and often misunderstood as a consequence: it means "with respect for time zone", where data inputs that include an offset from UTC or other time zone information are adjusted to UTC during INSERT/UPDATE operations. The WITHOUT type simply ignores any such offset or time zone information.
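As a rough model of the two behaviors just described (not Vertica's or Postgres's actual implementation), the following Python sketch takes the same input string with an offset and "stores" it both ways. The function names are hypothetical, and treating offset-less input as UTC in the WITH case is an assumption of this sketch; a real database would typically apply the session time zone instead.

```python
from datetime import datetime, timezone


def store_with_time_zone(text):
    """Model of WITH TIME ZONE: honor the input's offset and adjust to UTC."""
    parsed = datetime.fromisoformat(text)
    if parsed.tzinfo is None:
        # Assumption for this sketch only: offset-less input is taken as UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(timezone.utc)


def store_without_time_zone(text):
    """Model of WITHOUT TIME ZONE: drop any offset and keep the wall-clock fields."""
    return datetime.fromisoformat(text).replace(tzinfo=None)


text = "2015-06-22 18:59:59+02:00"
with_tz = store_with_time_zone(text)        # adjusted to 16:59:59 UTC
without_tz = store_without_time_zone(text)  # stays 18:59:59, offset discarded
```

The two stored values differ by the two-hour offset, which is one way a literal in a query can fail to match a stored value.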
The WITHOUT type should only be used for the concept of a date-time generally without being tied to any one locality. For example, saying "Christmas this year starts at beginning of December 25, 2015". That means in any time zone rather than a specific time zone. Obviously Christmas starts earlier in Paris, for example, than in Montréal.
If you are timestamping legal documents such as invoices, or booking appointments with people across time zones, or scheduling shipments in various localities, you should be using WITH time zone type.
So back to your possible problem: Test how Vertica or your client app or your database driver is handling your input string. It may be adjusting time zones as part of the parsing of the string using your client machine’s current default time zone. When sent to the database, that value will not match the stored value if during storage no adjustment to UTC was made.
Tip: Generally best practice is to do all your storage and business logic in UTC, adjusting to local time zones only where expected by user.
I've worked with various ORMs and database abstractions designed to make it easy to work with multiple databases, both relational and not. The more comprehensive solutions will usually give you access to some date functions that boil down to actual SQL (or whatever, in the case of non-SQL dbs). On the other hand, many of these abstractions don't provide direct access to SQL functions and you lose the ability to deal with dates directly. Instead, you're expected to use the upper-level language (PHP, Python, whatever) to do your date-wrangling, and finally only insert, select, what-have-you the formatted date.
So my question is this: if the SQL server never gets to do anything with the date itself, am I better off just using an int and putting epoch timestamps in it, or is there additional value to the database server "knowing" it's a date?
If you are using dates, store them as dates.
Not only does this make it easier to translate between the database and application, but it also helps when you need to do anything based on the dates (and you will; otherwise why have dates stored at all?).
That is, when you need to sort or query using the dates, you will not need to go through special effort to re-convert to dates.
Other than what @Oded said, even if you never use any date-related functions, there are still some issues:
At the moment, you cannot store epoch timestamp in milliseconds into an INT field (overflows).
A timestamp in seconds (without milliseconds) will overflow INT on Tue Jan 19 2038 at 03:14:08 GMT, as it will then be greater than 2147483647.
BUT, Integer takes 4 bytes and Datetime takes 8 bytes. You save 4 bytes per value if you can live within the two limitations above.
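Both limits are easy to verify. A quick sketch in Python, assuming a signed 32-bit INT column; the helper fits_int32 and the sample timestamp are illustrative only.

```python
from datetime import datetime, timezone

INT32_MAX = 2**31 - 1  # 2147483647, the largest signed 32-bit value
INT32_MIN = -2**31


def fits_int32(value):
    """True when value can be stored in a signed 32-bit integer column."""
    return INT32_MIN <= value <= INT32_MAX


# The last second an epoch-seconds INT can hold; one second later it overflows.
last_representable = datetime.fromtimestamp(INT32_MAX, tz=timezone.utc)

# A present-day style timestamp fits in seconds but not in milliseconds.
sample_seconds = int(datetime(2015, 6, 22, 18, 59, 59, tzinfo=timezone.utc).timestamp())
sample_millis = sample_seconds * 1000

seconds_fit = fits_int32(sample_seconds)
millis_fit = fits_int32(sample_millis)
```

Here last_representable comes out as 2038-01-19 03:14:07 UTC, one second before the overflow moment quoted above.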