Converting string with US date and time format to UK format - sql

I have an application that stores date and time in a string field in an SQL Server 2008 table.
The application stores the date and time according to the regional settings of the PC that is running and we can’t change this behavior.
The problem is that some PCs have to be in UK date format with 12h time (eg. 22/10/2011 1:22:35 pm) some with UK date format with 24h time (eg. 22/10/2011 13:22:25) and some have to be US date format (eg. 10/22/2011 1:22:35 pm) and (eg. 10/22/2011 13:22:25).
Is there any automatic way to change the string every time it changing/added to the table to UK 24h format so it will be always the same format in the database?
Can it be done using some trigger on update or insert? Is there any built-in function that already does that?
Even a script to run it from time to time may be do the job...
I’m thinking to break apart the string to day, month , year, hour, minute, second , AM/PM and then put the day and month part in dd/mm order and somehow change the hour part to 24h if PM, get rid of the “am” and “pm” and then put the modified date/time back to the table.
For example the table has
id datestring value Location
1 15/10/2011 11:55:01 pm BLAHBLAH UK
2 15/10/2011 13:12:20 BLAKBLAK GR
3 10/15/2011 6:00:01 pm SOMESTUFF US
4 10/15/2011 20:16:43 SOMEOTHERSTUFF US
and we want it to be
id datestring value Location
1 15/10/2011 23:55:01 BLAHBLAH UK
2 15/10/2011 13:12:20 BLAKBLAK GR
3 15/10/2011 18:00:01 SOMESTUFF US
4 15/10/2011 20:16:43 SOMEOTHERSTUFF US
We can display the date parts (day,month,year) correctly using the datepart function but with the time part we have problems because it changes too many ways.
Edited to explain some more
mr. p.campbell thanks for the edit .. i didn't know how to beautify it :)
and mr. Matthew, thank you for your quick reply..
We can tell if it is UK date or US date because we have another field i didn't mention with the text "US", "UK", "GR", "IT" according to where the PLC machine is located.
I'm sorry i didn't explain it to well. My english are not so good.
There are two different and independent applications. And they don't have direct relation with the sql server.
The application that only writes data to the database ..lets call it "the writer" for short.. and a different application that reads the data .. lets call it "the reader".
"The writer" is an internal application of a PLC machine that stores values every 1 min to the database that's why we can't change its behavior. It uses the string data type to store the date and the time at the same field according to the regional settings of the pc that a daemon application runs and does the communication between the pc and the PLC machine.
Now "the reader" expects the date and time to be in the format "dd/mm/yyyy 23:23:01" or "yyyy/mm/dd 23:23:01" and the only thing it does for now is doing some calculations with the data in the value field between given dates. eg. from 10/09/2011 10:00:00 to 15/09/2011 14:00:00.
we just need to do something like this ...
select * from table1 where datestring between "10/09/2011 10:00:00" and "15/09/2011 14:00:00"
I could post some of the code but it will be very long post.

At first, I agreed with Matthew, but then I realized that, given the information presented, this actually was possible (well, sorta).
However, some caveats;
You are doing nobody any favors by storing and maintaining the database this way. Your best bet is to change the application to have it give an actual Datetime value, not this mangled string.
This data CANNOT be meaningfully sorted by date or time (not without performing expensive string manipulation).
You appear to be storing all times as local times, but do not appear to be storing a TimeZone or related information. Without this information, you will NOT be able to (completely) correctly translate times 'globally'. For instance, which is later - 4PM in London, or 11AM in New York (for, say, an international conference call)? The answer is that you don't know: it depends on the time of year.
You are storing local times, period. This only works so long as local time is correct. What happens when somebody sets their clock to 1900? You should be storing time based off of the SERVER'S clock.
Your stored timestamp is based on a formatted string. If the user changes how their time is displayed, your data correctness (potentially) goes out the window. For instance, what if somebody removes the am/pm symbols, thinking "I'll look out the window - if the sun is out, it's 'am'"?
Please keep all of that in mind.
As to how to do this....
I'm not going to actually write out the SQL statement for this. Mostly because storing the information this way is pretty terrible. But also because it's going to take a lot of work I'd rather not do. I really recommend stressing to whomever has the keys at your place to get that application changed.
So instead, I'm going to give you a really big clue - and this will only work for so long as your timestamp format remains the same; You should be able to tell what format the date and time are in based on the presence and absence of 'am' and 'pm' in the string (if you don't have both, you're flat-out toast). As Matthew has pointed out, the formatting is also likely different for the date, as well as the time - you will need to translate both. However, this will immediately give you problems due to comparative timestamps (please see point 4, above); any attempt to run scheduling or auditing with this data id pretty much doomed to failure ("When did that happen?" "Well, it's in the UK date format, so..." "But that makes it 1AM here, and he was dead then!").
Most beneficial answer: Change how the information is stored in the database
EDIT:
And then it hits me (especially in light of the new edits) - there are potentially other possibilities that could actually make this work....
First, change your database to actually store some sort of 'globalized' timestamp, based off of the server's clock.
This will of course break your existing application code - it would get a data-type mismatch error. To fix that, rename the table, then create a view, named the same as the original table, that will return the string formatted as indicated in the 'source' column. You'll need to create instead-of triggers for the view, to translate the formatted string to an actual datetime value. The best part is, the application code should never notice the difference. You seem to have indicated that you have sufficient control over the database to allow this to happen; this should allow you to 'fix' the data transparently.
This of course works best if the incoming datetime values are absolute (not local). Hopefully, the values are actually supposed to be 'insert time' - these could likely be safely ignored, in favor of using a special register (like NOW or CURRENT DATE or whatever).
Can't believe this didn't hit me earlier...

You stated that you cannot change the application behavior, thus this is not possible.
Your problem is that your database doesn't know the culture / timezone settings of the client and your client doesn't report it.
You will need to report this data or think of clever ways to infer this information before you can act on it.
EDIT: For example, without knowledge of the client's details how could you tell the difference between the strings:
10/1/2011 12:00:00 (October First, noon, US)
10/1/2011 12:00:00 (January Tenth, noon, UK)
?

Related

Is there a more elegant way to convert this time to my server's local time?

I have a string in the following format:
14:41:21 Dec 15, 2015 PST
I want to convert that to my server's local time, but I think I'm creating an extra step that can be avoided:
Dim testdate As Date
DateTime.TryParseExact(dateinput, "HH:mm:ss MMM dd, yyyy PST", CultureInfo.InvariantCulture, DateTimeStyles.None, testdate)
testdate = TimeZoneInfo.ConvertTimeToUtc(testdate, TimeZoneInfo.FindSystemTimeZoneById("Pacific Standard Time"))
testdate = testdate.ToLocalTime()
I've played around with this but always off by a couple hours either way, and the above is what I've found to work but just wanted to know if there was a better way. Also note it could be deployed on multiple servers, so I don't want to specify the timezone to convert it to explicitly, reason for localtime.
A few things:
If you're going to include fixed text in a format string, put it in single-tick quotes so it can't get misinterpreted as a formatting token. ('PST')
In the general case, time zone abbreviations should only be used for display purposes. They should not be parsed as input, as they could be ambiguous. For example, there are 5 different interpretations of CST. It might be US Central Standard Time, but it could also be China Standard Time, or one of the others. See the list on Wikipedia.
If you have a limited number of time zone abbreviations you want to support, then you could extract it from the string and use a dictionary, select/case statements, or conditional logic to map them. Just be certain you know the entire set of abbreviations you want to support and exactly which time zones you want them to map to. Also be sure to account for daylight time abbreviations, such as PDT.
Note that some older standards, such as RFC 2822 §4.3 indeed hardcode a few abbreviations, so you may choose to support those if you are parsing that particular format. (Yours is similar, but not quite a match.)
Your code is mostly ok, but you should probably check the result of TryParseExact. Otherwise you might as well use ParseExact which will throw an exception on failure instead of just returning false.
You could use ConvertTime with TimeZoneInfo.Local as the destination zone if you wanted to do the conversion in a single step. The code would be slightly smaller, though would have no technical differences.
Are you sure you really want to do this? Relying on the system's local time zone should usually should not be done in server-based applications. That's something more appropriate for desktop and mobile. In general, server-side code should not rely on the system time zone to be anything in particular. Avoid "local time" APIs, including DateTime.Now, TimeZoneInfo.Local, ToLocalTime, and ToUniversalTime (when it assumes the input is local time). It is better to supply the applicable time zone in your business logic or application configuration.

SQL equals does not work for timestamps?

My table has a category 'timestamp' where the timestamps are formatted 2015-06-22 18:59:59
However, using DBVisualizer Free 9.2.8 and Vertica, when I try to pull up rows by timestamp with a
SELECT * FROM table WHERE timestamp = '2015-06-22 18:59:59';
(directly copy-pasting the stamp), nothing comes up. Why is this happening and is there a way around it?
FYI, saying "the timestamps are formatted 2015-06-22 18:59:59" is incorrect if you are indeed using a TIMESTAMP type. Such types have their own internal representation of a date-time value, almost always a count since epoch. In your case with Vertica, 8 bytes are used for such storage. The formatting of the date-time value happens when a string representation is generated. Never confuse the string representation with the date-time value. Conflating the two may well be related to your problem/confusion.
A few different thoughts about possible problems…
String Literals
Are you sure Vertica takes strings as timestamp literals? That format you used is common SQL format. But given that Vertica seems to be a specialized database, I would double-check that.
If strings are not allowed, you may need to call some kind of function to transform the string into a date-time values.
Fractional Second
As the comment by Martin Smith points out, the doc for Timestamp-related data types in Vertica 7.1 says those types can have a fractional second to resolution of microseconds. That means up to 6 decimal places of a fraction.
So if you are searching for "2015-06-22 18:59:59" but the stored value is "2015-06-22 18:59:59.012345", no match on the query.
Half-Open
The fractional seconds issue described above is often the cause of problems people have when handling a span of time. If you naïvely try to pinpoint the ending time, you are likely to have problems. Seeing the "59:59" in your example string makes me think this applies to you.
The better approach to spans of time is "Half-Open" (or Half-Closed, whatever) where the beginning is inclusive while the ending is exclusive. Common notation for this is [). In comparison logic this means: value >= start AND value < stop. Notice the lack of EQUALS SIGN in the stop comparison. In English we would say "look for an hour's worth of invoices starting at 2:00 PM and going up to, but not including, 3:00 PM".
Half-Open for a week means Monday-Monday, for a month the first of one month to the first of the next month, and for a year the January 1 of one year to January 1 of the following year.
Half-Open means not using BETWEEN in SQL. SQL's BETWEEN has often be criticized. Instead do something like the following to look for an hour's worth of invoices. Notice the Z on the end of string literal which means "UTC time zone" ("Z" for "Zulu"). (But verify, as my SQL syntax may need fixing.)
SELECT *
FROM some_table_
WHERE invoice_received_ >= '2015-06-22 18:00:00Z'
AND invoice_received_ < '2015-06-22 19:00:00Z'
;
This query will catch any values such as '2015-06-22 18:59:59.654321" which seems to be eluding you.
Reserved Word
I hope you have not really named your table 'table' and your column 'timestamp'. Such use of keywords and reserved words can cause explicit errors or more subtle weird problems.
Tip: The easy way to avoid any of the over a thousand reserved words in various databases is to append a trailing underscore. The SQL standard explicitly promises to never using a trailing underscore in its reserved words. So use "timestamp_" rather than "timestamp". Another example: "invoice_" table and "received_" column. I recommend doing that as a habit on everything your name in SQL: columns, tables, constraints, indexes, and so on.
Time Zone
You are using the TIMESTAMP which is short for TIMESTAMP WITHOUT TIME ZONE. Or so I presume; the Vertica doc is vague but that is the common usage as seen in the Postgres doc, and may even be standard SQL.
Anyways, TIMESTAMP WITHOUT TIME ZONE is usually the wrong type for most business purposes. The WITH time zone is misnamed and often misunderstood as a consequence: It means "with respect for time zone" where data inputs that include an offset or other time zone information from UTC are adjusted to UTC during the INSERT/UPDATE operations. The WITHOUT type simply ignores any such offset or time zone information.
The WITHOUT type should only be used for the concept of a date-time generally without being tied to any one locality. For example, saying "Christmas this year starts at beginning of December 25, 2015". That means in any time zone rather than a specific time zone. Obviously Christmas starts earlier in Paris, for example, than in Montréal.
If you are timestamping legal documents such as invoices, or booking appointments with people across time zones, or scheduling shipments in various localities, you should be using WITH time zone type.
So back to your possible problem: Test how Vertica or your client app or your database driver is handling your input string. It may be adjusting time zones as part of the parsing of the string using your client machine’s current default time zone. When sent to the database, that value will not match the stored value if during storage no adjustment to UTC was made.
Tip: Generally best practice is to do all your storage and business logic in UTC, adjusting to local time zones only where expected by user.

When to use VARCHAR and DATE/DATETIME

We had this programming discussion on Freenode and this question came up when I was trying to use a VARCHAR(255) to store a Date Variable in this format: D/MM/YYYY. So the question is why is it so bad to use a VARCHAR to store a date. Here are the advantages:
Its faster to code. Previously I used DATE, but date formatting was a real pain.
Its more power hungry to use string than Date? Who cares, we live in the Ghz era.
Its not ethically correct (lolwut?) This is what the other user told me...
So what would you prefer to use to store a date? SQL VARCHAR or SQL DATE?
Why not put screws in with a hammer?
Because it isn't the right tool for the job.
Some of the disadvantages of the VARCHAR version:
You can't easily add / subtract days to the VARCHAR version.
It is harder to extract just month / year.
There is nothing stopping you putting non-date data in the VARCHAR column in the database.
The VARCHAR version is culture specific.
You can't easily sort the dates.
It is difficult to change the format if you want to later.
It is unconventional, which will make it harder for other developers to understand.
In many environments, using VARCHAR will use more storage space. This may not matter for small amounts of data, but in commercial environments with millions of rows of data this might well make a big difference.
Of course, in your hobby projects you can do what you want. In a professional environment I'd insist on using the right tool for the job.
When you'll have database with more than 2-3 million rows you'll know why it's better to use DATETIME than VARCHAR :)
Simple answer is that with databases - processing power isn't a problem anymore. Just the database size is because of HDD's seek time.
Basically with modern harddisks you can read about 100 records / second if they're read in random order (usually the case) so you must do everything you can to minimize DB size, because:
The HDD's heads won't have to "travel" this much
You'll fit more data in RAM
In the end it's always HDD's seek times that will kill you. Eg. some simple GROUP BY query with many rows could take a couple of hours when done on disk compared to couple of seconds when done in RAM => because of seek times.
For VARCHAR's you can't do any searches. If you hate the way how SQL deals with dates so much, just use unix timestamp in 32 bit integer field. You'll have (basically) all advantages of using SQL DATE field, you'll just have to manipulate and format dates using your choosen programming language, not SQL functions.
Two reasons:
Sorting results by the dates
Not sensitive to date formatting changes
So let's take for instance a set of records that looks like this:
5/12/1999 | Frank N Stein
1/22/2005 | Drake U. La
10/4/1962 | Goul Friend
If we were to store the data your way, but sorted on the dates in assending order SQL will respond with the resultset that looks like this:
1/22/2005 | Drake U. La
10/4/1962 | Goul Friend
5/12/1999 | Frank N. Stein
Where if we stored the dates as a DATETIME, SQL will respond correctly ordering them like this:
10/4/1962 | Goul Friend
5/12/1999 | Frank N. Stein
1/22/2005 | Drake U. La
Additionally, if somewhere down the road you needed to display dates in a different format, for example like YYYY-MM-DD, then you would need to transform all your data or deal with mixed content. When it's stored as a SQL DATE, you are forced to make the transform in code, and very likely have one spot to change the format to display all dates--for free.
Between DATE/DATETIME and VARCHAR for dates I would go with DATE/DATETIME everytime. But there is a overlooked third option. Storing it as a INTEGER unsigned!
I decided to go with INTEGER unsigned in my last project, and I am really satisfied with making that choice instead of storing it as a DATE/DATETIME. Because I was passing along dates between client and server it made the ideal type for me to use. Instead of having to store it as DATE and having to convert back every time I select, I just select it and use it however I want it. If you want to select the date as a "human-readable" date you can use the FROM_UNIXTIME() function.
Also a integer takes up 4 bytes while DATETIME takes up 8 bytes. Saving 50% storage.
The sorting problem that Berin proposes is also solved using integer as storage for dates.
I'd vote for using the date/datetime types, just for the sake of simplicity/consistency.
If you do store it as a character string, store it in ISO 8601 format:
http://www.iso.org/iso/date_and_time_format
http://xml.coverpages.org/ISO-FDIS-8601.pdf
http://www.cl.cam.ac.uk/~mgk25/iso-time.html
Among other things, ISO 8601 date/time string (A) collate properly, (B) are human readable, (C) are locale-indepedent, and (D) are readily convertable to other formats. To crib from the ISO blurb, ISO 8601 strings offer
representations for the following:
Date
Time of the day
Coordinated universal time (UTC)
Local time with offset to UTC
Date and time
Time intervals
Recurring time intervals
Representations can be in one of two formats: a basic format
that has a minimal number of characters and an extended format
that adds characters to enhance human readability. For example,
the third of January 2003 can be represented as either 20030103
or 2003-01-03.
[and]
offer the following advantages over many of the locally used
representations:
Easily readable and writeable by systems
Easily comparable and sortable
Language independent
Larger units are written in front of smaller units
For most representations the notation is short and of constant length
One last thing: If all you need to do is store a date, then storing it in the ISO 8601 short form YYYYMMDD in a char(8) column takes no more storage than a datetime value (and you don't need to worry about the 3 millisecond gap between the last tick of the one day and the first tick of the next. But that's a matter for another discussion. If you break it up into 3 columns — YYYY char(4), MM char(2), DD char(2) you'll use up the same amount of storage, and get more options for indexing. Even better, store the fields as a short for yyyy (4 bytes), and a tinyint for each of MM and DD — now you're down to 6 bytes for the date. The drawback, of course, to decomposing the date components into their constituent parts is that conversion to proper date/time data types is complicated.

Calculate Daylight Savings Time (DST) on SQL/Database level

My location in Sydney, Australia. The dates that I explain will be in UK or Australia date format.
Observe the following:
2010-04-15 04:30:00.000 => 15/04/2010 14:30:00 EST (UK date format - Add 10 hours)
2010-11-05 01:00:00.000 => 05/11/2010 12:00:00 EST (UK date format - Add 11 hours)
Both these times are retrieved from the database in UTC format and then calculated on the Web level whether +10 or +11 hour is applicable.
In Australia, Daylight Savings Time (DST) transition dates vary year by year. The transition dates are usually Early April and Late October.
So how accurate would the Web calculation be? If this year the transition date is a few days later (say 03/04/2010), but the Web calculation bases on a fixed date (say 01/04/2010), wouldn't that mean that the days in between will be off by 1 hour when displayed (due to the fixed calculation nature to a specific day of the month)?
I believe the transition dates is not pre-determined and is actually announced to the public. Is that assumption true?
If not (that means the DST dates are pre-determined), would I be able to do the calculation outside the Web level (on the SQL/Database level)?
The database is SQL Server 2005 and I'm using Report Definition Language (RDL) to display the fields in UTC time. If SQL/database level is not the best way, how do I work out +10 or +11 and format the time accordingly to show the right time?
Thank you.
The database is a bad choice for it: it has less information than c# or .net to work it out. .net uses the registry which is kept upto date periodically by patches. SQL Server would have to have a table with date ranges and offsets.
The transitions are fixed because of scheduling (flights, trains ,whatever). IIRC it only changed once at short notice recently in Australia for some Olympic games and it caused chaos around the world. In 2007 the US changed but this was known in advance.
By fixed, it's the "last Sunday" type fixed even if the date varies.
I would leave it in the web code: the DB does not know where your caller is for example, the web site can work it out.
The problem is whoever wrote this app does not quite understand UTC, its value, and how to use it. The database is the correct location for the dat, but the system is not using UTC as intended.
If you use UTC, then all your date arithmetic should use UTC. In the database. It is currently using a saved UTC and then converting at some (doesn't matter if seciond tier or third) other layer; some other library. Half UTC, and Half something else. Have you considered historic dates, as in what is the DATEDIFF() between 15 Feb 2010 and today ?
This eliminates the concern re DST in Australia or Greenland.; and concern re what date/time the changeover actually happens. Everyone is using Greenwich Mean Time for that particular day.
Do all you date arithmetic in the db, in UTC. And display the result (only) in the local time zone, which as you have it, is the web layer, based on the user.
Many systems have dropped that last step altogether, and display in UTC only, regardless of the user's time zone.
The database can handle DST for you. Use its time zone conversion functions to go from whatever zone you stored the dates in to whatever zone you want to get for the user.
MySQL has CONVERT_TZ(), I don't know what other RDBMS's have.

SQL Reporting Services Daylight saving time query (pt 2)

I posted a question a couple of days ago (SQL Reporting Services Daylight saving time query) which was I received an answer for (thanks very much) but did not elaborate on the whole problem I am experiencing. Not only did I require the returned date time format to account for day light saving but I also need the search parameter #StartDate to allow for DST.
Currently if I key in a scheduled start time of 31/03/2010 11:00 and because the SQL DB has already taken the hours difference into consideration I get no results back. If I key in 31/03/2010 10:00 then the correct details are returned. Is there away using T-SQL or the like to get the search parameter to pass the adjusted time to the DB?
SQL Server 2008 supports the data type "TIMESTAMP WITH TIMEZONE". Might be worth a try.
You can download a 180-day trial version.
In any case, the timezone for an appointment application is what I'd call critical business information. I'd store it in the database, by one of these methods:
using the right data type
building a user-defined data type
adding a column for time zone info