Handling daylight saving across multiple time zones - MS SQL Server - sql

I've been tasked with handling import jobs into SQL Server based on various time zones. Files arrive on a Windows Server from multiple regions, for example Brazil, Singapore, Australia, various parts of the U.S., and Europe.
Each file will be imported into SQL tables by multiple stored procedures. Each stored procedure needs to be executed at a scheduled time according to the time zone of the file's origin.
Working from a set time is proving tricky because each region adjusts for daylight saving at different times of the year. Say, for example, the UK moves its clocks forward for daylight saving; Brazil may not move its clocks for another 3 weeks (don't quote me on that, I've used those times only for example purposes).
My question is: how can I schedule jobs to run on the same server based on multiple time zones?
I can see this may be possible if I were to create a timezone lookup table in SQL which shows the relationship between each time zone at each stage of the year but this seems quite cumbersome and will also take a considerable amount of time to populate the table.
Windows scheduler seems to use the date/time settings of the local server and although it does adjust for daylight saving, this will only be appropriate for one region. Has anyone had to handle this in SQL Server before? Or can anyone recommend a scheduling tool external to SQL Server that can initiate tasks based on different time zones?
Any help or advice would be greatly appreciated.

You won't be able to transparently and easily configure a single instance of SQL Server to run several sets of tasks in different time zones, by definition (the instance is single, so all sets of tasks will be in the same time zone).
You can, however, write your own script in any language you like (for example, a CLR .NET extension for MSSQL or just plain Transact-SQL) that does the following:
1. Iterate over the list of regions you want the task to run for.
2. Convert the region's scheduled time to server time and schedule the action to be executed (via msdb.dbo.sp_add_schedule or sp_update_schedule, for example).
3. Repeat for the next period.
This task should of course run ahead of the earliest region, for example at the start of the day at UTC+12, so that it executes before any region's scheduled time on that date (the calendar day begins at the far-ahead offsets of UTC+12 and beyond).
Implementing it this way would be pretty clear and reliable regardless of daylight saving changes, time zone updates and so on. Just make sure to keep the configuration of your partners' time zones up to date.
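The conversion step in the answer above can be sketched as follows. This is a minimal illustration in Python rather than Transact-SQL, assuming Python 3.9+ with the standard zoneinfo module and an available IANA time zone database. The region names, the 06:00 run time and the REGION_SCHEDULE table are hypothetical; the resulting UTC instants would still need to be handed to whatever scheduler you use.

```python
from datetime import date, datetime, time, timezone
from zoneinfo import ZoneInfo

# Hypothetical configuration: region -> (IANA time zone, local run time).
REGION_SCHEDULE = {
    "UK": ("Europe/London", time(6, 0)),
    "Brazil": ("America/Sao_Paulo", time(6, 0)),
    "Singapore": ("Asia/Singapore", time(6, 0)),
    "Australia": ("Australia/Sydney", time(6, 0)),
    "US East": ("America/New_York", time(6, 0)),
}


def utc_run_times(day: date) -> dict:
    """Return, per region, the UTC instant of that region's local run time on `day`."""
    result = {}
    for region, (zone_name, local_time) in REGION_SCHEDULE.items():
        # Attach the region's zone to the wall-clock time, then convert to UTC.
        # The zone database supplies the correct offset for that date, DST included.
        local_dt = datetime.combine(day, local_time, tzinfo=ZoneInfo(zone_name))
        result[region] = local_dt.astimezone(timezone.utc)
    return result


if __name__ == "__main__":
    runs = utc_run_times(date(2024, 3, 20))
    for region, run_at in sorted(runs.items(), key=lambda kv: kv[1]):
        print(f"{region:10s} {run_at:%Y-%m-%d %H:%M} UTC")
```

Recomputing the table once per day means a daylight saving change in any one region is picked up automatically; calling .astimezone() with no argument on each result gives the server's local time if the scheduler expects that instead of UTC.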

Related

Who is responsible for modifying timestamp fields in the app: backend or db?

For example: we have 2 fields in a certain entity: created_at and updated_at. We can update those fields manually on the backend after create or update operations, or create a trigger on the DB side that will fill/update these fields for us automatically.
There are some cases to consider:
Usually, after a create or update the backend returns the JSON of the object. It would be nice for that response to include the timestamp fields, but if a trigger sets them, the backend has to run another select just to read the new timestamps before returning them to the client.
Sometimes backend engineers forget to update these fields manually, leading to null records.
I'm not a DBA myself, but what do you think of the cost of triggers, especially at high RPS? Should I not worry about the performance cost of triggers for such simple updates in high-load systems?
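To make the trade-off concrete, here is a minimal sketch of the trigger option. It uses SQLite through Python's standard sqlite3 module purely so the example is self-contained; the question does not say which database is in use, and the item table, the item_touch trigger and the helper functions are all hypothetical names.

```python
import sqlite3

# SQLite's 'now' is UTC; %f yields seconds with millisecond precision.
UTC_NOW = "strftime('%Y-%m-%dT%H:%M:%fZ', 'now')"

conn = sqlite3.connect(":memory:")
conn.executescript(f"""
CREATE TABLE item (
    id         INTEGER PRIMARY KEY,
    name       TEXT NOT NULL,
    created_at TEXT NOT NULL DEFAULT ({UTC_NOW}),
    updated_at TEXT NOT NULL DEFAULT ({UTC_NOW})
);
-- The database, not the backend, maintains updated_at.
CREATE TRIGGER item_touch AFTER UPDATE OF name ON item
BEGIN
    UPDATE item SET updated_at = {UTC_NOW} WHERE id = NEW.id;
END;
""")


def fetch_item(item_id):
    """The extra SELECT the backend needs in order to see DB-assigned timestamps."""
    row = conn.execute(
        "SELECT id, name, created_at, updated_at FROM item WHERE id = ?",
        (item_id,),
    ).fetchone()
    return dict(zip(("id", "name", "created_at", "updated_at"), row))


def create_item(name):
    cur = conn.execute("INSERT INTO item (name) VALUES (?)", (name,))
    return fetch_item(cur.lastrowid)


def rename_item(item_id, name):
    conn.execute("UPDATE item SET name = ? WHERE id = ?", (name, item_id))
    return fetch_item(item_id)
```

With this layout a forgotten timestamp cannot produce a null, at the price of the read-back in fetch_item. Databases that can return the written row from the write statement itself may avoid that second round trip.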
Who is responsible for modifying timestamp fields in the app: backend or db?
It depends; it can be either.
There is no one "right" answer for which times to use or exclude. Depending on your system and which actors perform time-based actions (users, devices, servers, triggers), any (or all) of the times below might make sense to incorporate:
time A – when a user performs an action
  - this is most likely local device time (whatever the phone or computer thinks is the current time)
  - but anything is possible: a client could get a time from who-knows-where and report that to you
  - could be when a user did something (tapped a button) and not when the message was sent to the backend
  - could be 10-20 seconds (or more) after a user did something (tapped a button), assigned by the device when it sends out batched data
time B – when the backend gets involved
  - this is server time, and could be when the server receives the data, or after the server has received and processed the data and is about to hand it off to the next player (database, another server, etc.)
  - note: this is probably different from "time A" due to transit time between user and backend
  - also, there's no guarantee that different servers in the mix all agree on the time; they can and should, but their agreement should not be relied upon as truth
time C – when a value is stored in the database
  - this is different from server time (B)
  - a server might receive inbound data at B, then do some processing which takes time, then finally submit an insert to the database (which then assigns time C)
Another highly relevant consideration in capturing time is the accuracy (or rather, the likely inaccuracy) of client-reported time. For example, a mobile device can claim to have sent a message at time X, when in fact the clock is just set incorrectly and the actual time is minutes, hours, even days away from reported time X (in the future or in the past). I've seen this kind of thing occur where data arrives in a system claiming to be from months ago, but we can prove from other telemetry that it did in fact arrive recently (today or yesterday). Never trust device-reported times. This applies to any device (mobile, tablet, laptop, desktop); all of them often have internal clocks that are not accurate.
Remote servers and your database are probably closer to real, though they can be wrong in various ways. However, even if wrong, when the database auto-assigns datetimes to two different rows, you can trust that one of them really did arrive after or before the other – the time might be inaccurate relative to actual time, but they're accurate relative to each other.
All of this becomes further complicated if you intend to piece together history by using timestamps from multiple origins (A, B and C). It's tempting to do, and sometimes it works out fine, but it can easily be nonsense data. For example, it might seem safe to piece together history using a user time A, then a server time B, and database time C. Surely they're all in order – A happened first, then B, then C; so clearly all of the times should be ascending in value. But these are often out of order. So if you need to piece together history for something important, it's a good idea to look for secondary confirmations of order of events, and don't rely on timestamps.
Also on the subject of timestamps: store everything in UTC (database values, server times, and client/device times where possible). Time zones are the worst.
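As a small illustration of both points (normalize to UTC, and don't infer order from client clocks), here is a hedged Python sketch. The event tuples and the idea of a database-assigned sequence number are assumptions made for the example, not something taken from the question.

```python
from datetime import datetime, timezone


def to_utc(reported: str) -> datetime:
    """Parse an ISO 8601 timestamp that carries an offset and normalize it to UTC."""
    parsed = datetime.fromisoformat(reported)
    if parsed.tzinfo is None:
        # Without an offset there is no safe way to know which instant was meant.
        raise ValueError("timestamp has no offset; cannot convert to UTC safely")
    return parsed.astimezone(timezone.utc)


# Hypothetical events: (database-assigned sequence number, client-reported time A).
events = [
    (1, "2024-03-10T09:30:00+11:00"),
    (2, "2024-03-09T17:29:00-05:00"),  # stored second, but its clock claims an earlier instant
]

normalized = [(seq, to_utc(ts)) for seq, ts in events]

# Order in which the database actually stored the rows.
by_sequence = [seq for seq, _ in sorted(normalized)]
# Order implied by the client clocks, which disagrees here.
by_client_time = [seq for seq, _ in sorted(normalized, key=lambda e: e[1])]
```

Here by_sequence is [1, 2] while by_client_time is [2, 1]: the second row arrived later but claims an earlier time, which is exactly the situation where a secondary ordering signal is needed.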

Sending Notification to different time zones

I have a server in the USA and clients in different parts of the world: Australia, South America, the USA, Canada, and Europe.
I need to send a notification one hour before each event takes place.
In SQL Server I have a table of events whose times are stored in UTC (2015-12-27 20:00:00.0000000), and another table holding the time zone that belongs to each event ("Australia/Sydney").
How could I calculate in a query when to send the notifications? Or would I have to do it in a server-side language?
Could anyone help me with a possible solution?
Thanks
You've asked very broadly, so I can only answer with generalities. If you need a more specific answer, please edit your question to be more specific.
A few things to keep in mind:
Time zone conversions are best done in the application layer. Most server-side application platforms have time zone conversion functions, either natively or via libraries, or both.
If you must convert at the database layer (such as when using SSRS or SSAS, or complex stored procs, etc.) and you are using SQL Server, then there are two approaches to consider:
SQL Server 2016 CTP 3.1 adds native support for time zone conversions via the AT TIME ZONE clause. However, it works with Windows time zone identifiers, such as "AUS Eastern Standard Time", rather than IANA/Olson identifiers, such as the "Australia/Sydney" you specified.
You might use third-party support for time zones, such as my SQL Server Time Zone Support project, which does indeed support IANA/Olson time zone identifiers. There are other similar projects out there as well.
Regardless of whether you convert at the DB layer or at the application layer, the time zone of your server should be considered irrelevant. Always get the current time in UTC rather than local time. Always convert between UTC and a specific time zone. Never rely on the server's local time zone setting to be anything in particular. On many servers, the time zone is intentionally set to UTC, but you should not depend on that.
Nothing in your question indicates how you plan on doing scheduling or notifications, but that is actually the harder part. Specifically, scheduling events into the future should not be based on UTC, but rather on the event's specific time zone. More about this here.
You might consider finding a library for your application layer that will handle most of this for you, such as Quartz (Java) or Quartz.Net (.NET). There are probably similar solutions for other platforms.
You should read the large quantity of material already available on this subject here on Stack Overflow, including the timezone tag wiki and Daylight saving time and time zone best practices.
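As a rough application-layer sketch of the points above, the following Python (3.9+, standard zoneinfo module) works entirely in UTC for the "is it time to notify?" decision and uses the event's IANA zone only for display and for deriving UTC from a future local time. The function names and the one-hour constant are illustrative assumptions, not part of any library mentioned in the answer.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo

NOTIFY_BEFORE = timedelta(hours=1)


def notification_time(event_utc: datetime) -> datetime:
    """UTC instant at which the notification should go out."""
    return event_utc - NOTIFY_BEFORE


def is_due(event_utc: datetime, now_utc: datetime) -> bool:
    """True once the notification time has been reached and the event has not started."""
    return notification_time(event_utc) <= now_utc < event_utc


def local_display(event_utc: datetime, zone_name: str) -> datetime:
    """The event time as the recipient would read it on a local clock."""
    return event_utc.astimezone(ZoneInfo(zone_name))


def future_event_to_utc(local_wall_time: datetime, zone_name: str) -> datetime:
    """For future events, treat local time plus zone as the source of truth and derive UTC."""
    return local_wall_time.replace(tzinfo=ZoneInfo(zone_name)).astimezone(timezone.utc)
```

For the event in the question, 2015-12-27 20:00 UTC is 07:00 on 28 December in Australia/Sydney, and the notification is due at 19:00 UTC. Re-deriving UTC from the stored local time shortly before sending protects against time zone rule changes published after the event was created.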

SQL Server Remote Update

We have a process in which several site servers send data to a central server (through a Linked Server). A new site has seen the job duration more than double in three weeks, and a couple of the other sites often fail due to run time overlap.
It is a two step process:
Insert new records
Update changed records
The insert only takes a few seconds, but the update takes anywhere from 5 to 20 minutes, depending on the site. I am able to change the query that drives the update and get it down to only a couple seconds, but still when put into an UPDATE statement it takes several minutes.
I am leaning towards moving the jobs to a single job on the central server, so it is a pull operation which, based on the testing I have done, should be much faster. However, my question is: What is considered "best practice" in this situation? I am going to have to change quite a bit to get this working properly, so I might as well do it right.

How to know the TimeZone StandardName or DayLightName from TimeZoneOffset in Sql Server

I am using SQL Server 2008 R2. Is there a way to identify the time zone standard name or daylight name from a time zone offset?
For example, I have "2013-09-26 03:00:00.0000000 -04:00" and need "Eastern Daylight Time" from it.
How can I accomplish this in SQL Server?
Any suggestions will be appreciated.
If you're talking about getting the local system's time zone, I've investigated that heavily in the past, and it isn't possible without resorting to either "unsafe" SQL CLR calls, or unsupported xp_regread calls to read this out of the registry. The general recommendation is to do that in application code, and not in the database.
However, if what you are saying is that you have an offset of -04:00 in your input value and you want to translate that to a time zone name, I'm afraid that isn't possible at all, either in SQL or in your application code.
The reason is that there are many time zones that share the same offset. For example, see this Wikipedia page that shows all of the zones that use -04:00.
Even if you limit the scope to just the United States, it still won't work in certain cases due to daylight saving time. For example, consider the time stamp of 2013-11-03T01:00:00-05:00. Is this Eastern Standard Time? Or Central Daylight Time? There's no way to tell, because at this moment it could be either one.
In the USA (unlike Europe), each time zone transitions at 2AM in its own local time. So it's like a wave that moves from the east to the west. While the US time zones are normally one hour spaced apart, during the spring-forward transition they can be two hours apart, and during the fall-back transition they can have the same exact local time.
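The ambiguity described above can be checked directly. This is a small Python sketch (3.9+, standard zoneinfo module, IANA zone names) that renders the same instant in two US zones; the describe helper is a hypothetical name used only for the example.

```python
from datetime import datetime
from zoneinfo import ZoneInfo


def describe(instant: datetime, zone_name: str):
    """Render one instant in the given zone as (ISO 8601 text, zone abbreviation)."""
    local = instant.astimezone(ZoneInfo(zone_name))
    return local.isoformat(), local.tzname()


# The timestamp from the answer: 01:00 at offset -05:00 on the US fall-back date.
instant = datetime.fromisoformat("2013-11-03T01:00:00-05:00")

new_york = describe(instant, "America/New_York")
chicago = describe(instant, "America/Chicago")

print(new_york)  # ('2013-11-03T01:00:00-05:00', 'EST')
print(chicago)   # ('2013-11-03T01:00:00-05:00', 'CDT')
```

Both zones produce the identical local time and offset, yet one is on standard time and the other on daylight time, so the offset alone cannot identify the zone.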

Load balancing weighted reports?

I work for a fleet tracking company and this question is specifically about how I plan to do reports. Let me explain our environment. We have 1x Database, 1x Load Distributing process, and 3x Report Processing servers (let's assume these are equal in every way). When a customer requests a report, all the parameters of that report go in the database. I'm currently working on a load distributing app that will take pending reports from the database and delegate them to the 3 report processing servers that build and email the reports. When a server finishes a report (or an error arises), it notifies the load distributing app. Reports can come in all sizes, from 1 day's worth of GPS data for 1 vehicle to 3 months of GPS data for hundreds of vehicles.
I can think of a few ways to do the load balancing but I'm not quite happy with them. I could have each server only do 5 reports at most, but 1 server might get 5 small reports while another gets 5 large reports. I could do a "Round Robin" approach and just hand out the reports sequentially across the servers, but this still doesn't protect against overloading any of the servers.
The best idea I think I have right now is to keep a count of how much GPS data is needed by each report (an easy task to do) and, as I assign reports to each server, keep a running total for each server. When a server finishes a report (and notifies the load balancer), subtract that report's amount of GPS data from the running total for that server. This way, I could assign the next report to the server with the smallest amount of GPS data to work with. I could also set a max so that a server cannot get overworked (the problem that is causing us to refactor our whole reports process to begin with). If there are more reports when all servers hit their max, it can just queue them up and attempt them later when the servers finish a few of their reports.
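The running-total idea can be sketched roughly as below. This is an illustrative Python model only: the Balancer class, its method names and the unit of "size" are assumptions, and it makes one extra choice not stated above, namely that an idle server always accepts the next report so that a report larger than the cap cannot wait forever.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Balancer:
    servers: list
    max_load: int
    load: dict = field(init=False)                 # server -> running total of GPS data
    assigned: dict = field(default_factory=dict)   # report_id -> (server, size)
    pending: deque = field(default_factory=deque)  # FIFO queue of (report_id, size)

    def __post_init__(self):
        self.load = {server: 0 for server in self.servers}

    def submit(self, report_id, size):
        """Queue a report and try to place it immediately."""
        self.pending.append((report_id, size))
        self._dispatch()

    def finished(self, report_id):
        """Called when a server reports completion (or failure) of a report."""
        server, size = self.assigned.pop(report_id)
        self.load[server] -= size
        self._dispatch()

    def _dispatch(self):
        while self.pending:
            report_id, size = self.pending[0]
            # Pick the server with the smallest running total.
            server = min(self.servers, key=lambda s: self.load[s])
            # Respect the cap, except that an idle server always accepts work.
            if self.load[server] > 0 and self.load[server] + size > self.max_load:
                break
            self.pending.popleft()
            self.load[server] += size
            self.assigned[report_id] = (server, size)
```

Because the queue is strictly first-in, first-out, one large report at the head can hold back smaller ones behind it; whether to let small reports overtake is a policy decision this sketch does not make.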
I'm not convinced it's the best approach for finishing reports as quickly as possible. These are just the best I have come up with so far.
How can I optimize my approach to load balancing reports of different sizes across multiple servers?
Assuming that you have only one major table which you select data from, then I would configure one server to do all the big reports first and leave the other two to do smallest to largest. Otherwise big reports might never get done.
For the smaller reports, in the absence of anything better, try to have each server run 'similar' reports in succession, meaning those that cluster around similar values in the main index used. For example, if a server has just completed a report for June 2011, the next best report to run is one for the same period, rather than jumping to November 2012. This depends on the actual table, but I am presuming you have lots of date-ordered data comprising the bulk of the selection. All you are really trying to do is group reports that are likely to reuse cached indexes and data, as this should give the best throughput.
I have a similar scheduling problem: any queries directed at major tables go to one server (slow queue) and anything else goes to another (fast queue), with some exceptions for special cases.