Can I convert between timezones in SQL Server? - sql

Right now I'm storing a number of records in SQL Server with a DATETIME column that stores the current timestamp using GETUTCDATE(). This ensures that we're always storing the exact date without having to worry about questions like "well is this 2:00 my time or 2:00 your time?" Using UTC ensures that we know exactly when it happened regardless of timezone.
However, I have a query that essentially groups these records by date. A simplified version of this query looks something like this:
SELECT [created], SUM([amount]) AS [amount]
FROM (
    SELECT [amount], LEFT(CONVERT(VARCHAR, [created], 120), 10) AS [created]
    FROM (
        SELECT [amount], DATEADD(HOUR, -5, [created]) AS [created]
        FROM [sales]
        WHERE [organization] = 1
    ) AS s
) AS s
GROUP BY [created]
ORDER BY [created] ASC
Obviously this query is far from ideal--the whole reason I'm here is to ask how to improve it. First of all, it does (for the most part) accomplish the goal of what I'm looking for here--it has things grouped by dates and the other values aggregated accordingly. But what it doesn't accomplish is handling Daylight Savings Time correctly.
I live in Madison, WI and we're on Central Time, so between March and November we're UTC-5, otherwise we're UTC-6. That's why you see the -5 in the code there as a quick hack to get it working.
The problem is that if I run this query, and there are records that fall on both sides of the daylight savings time changeover, it could potentially group things incorrectly. So for instance, if the table looks something like this:
+----+--------+---------------------+
| id | amount | created |
+----+--------+---------------------+
| 1 | 100.00 | 2010-04-02 06:00:00 |
| 2 | 50.00 | 2010-04-02 04:30:00 |
| 3 | 75.00 | 2010-04-02 03:00:00 |
| 4 | 150.00 | 2010-03-02 07:00:00 |
| 5 | 25.00 | 2010-03-02 05:30:00 |
| 6 | 50.00 | 2010-03-02 04:00:00 |
+----+--------+---------------------+
My query will return this:
+------------+--------+
| created | amount |
+------------+--------+
| 2010-03-01 | 50.00 |
| 2010-03-02 | 175.00 |
| 2010-04-01 | 125.00 |
| 2010-04-02 | 100.00 |
+------------+--------+
However, ideally it SHOULD return this:
+------------+--------+
| created | amount |
+------------+--------+
| 2010-03-01 | 75.00 |
| 2010-03-02 | 150.00 |
| 2010-04-01 | 125.00 |
| 2010-04-02 | 100.00 |
+------------+--------+
The trouble is that if I subtract a fixed 5 hours, then April is correct but March is not, and if I instead subtract a fixed 6 hours, then March is correct but April is not. What I really need to do is convert to the appropriate time zone in a way that is aware of Daylight Saving Time and can adjust accordingly. Can I do this with a SQL query? How do I write this query?

None of the current date/time functions are DST aware.
Using an auxiliary calendar table may be your best bet:
http://web.archive.org/web/20070611150639/http://sqlserver2000.databases.aspfaq.com/why-should-i-consider-using-an-auxiliary-calendar-table.html
You can store the UTC offset for each date in that table and reference it in your SELECT statement.
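To make the offset-table idea concrete, here is a minimal sketch in Python rather than T-SQL. The `OFFSETS` list is a hypothetical stand-in for the auxiliary table, hard-coded for US Central Time in 2010 only (an assumption: DST from 14 March to 7 November, switching at 2:00 local time). The names `to_local` and `totals_by_local_date` are made up for illustration. The sample rows are the ones from the question.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from decimal import Decimal

# Hypothetical stand-in for the auxiliary offset table. Each row is
# (utc_start inclusive, utc_end exclusive, offset from UTC in hours).
# Assumption: US Central Time, 2010 only.
OFFSETS = [
    (datetime(2010, 1, 1, 0), datetime(2010, 3, 14, 8), -6),   # CST
    (datetime(2010, 3, 14, 8), datetime(2010, 11, 7, 7), -5),  # CDT
    (datetime(2010, 11, 7, 7), datetime(2011, 1, 1, 0), -6),   # CST
]

# (amount, created in UTC), copied from the question's sample table.
SALES = [
    (Decimal("100.00"), datetime(2010, 4, 2, 6, 0)),
    (Decimal("50.00"), datetime(2010, 4, 2, 4, 30)),
    (Decimal("75.00"), datetime(2010, 4, 2, 3, 0)),
    (Decimal("150.00"), datetime(2010, 3, 2, 7, 0)),
    (Decimal("25.00"), datetime(2010, 3, 2, 5, 30)),
    (Decimal("50.00"), datetime(2010, 3, 2, 4, 0)),
]


def to_local(utc_dt):
    """Shift a UTC datetime by the offset of the table row that covers it."""
    for start, end, hours in OFFSETS:
        if start <= utc_dt < end:
            return utc_dt + timedelta(hours=hours)
    raise ValueError(f"no offset row covers {utc_dt}")


def totals_by_local_date(sales):
    """Sum amounts per local calendar date, in date order."""
    totals = defaultdict(Decimal)
    for amount, created in sales:
        totals[to_local(created).date()] += amount
    return dict(sorted(totals.items()))


for day, total in totals_by_local_date(SALES).items():
    print(day, total)
```

In SQL the same lookup would be a join from `sales` to the offset table on the `created` range, followed by the existing `GROUP BY` on the shifted date.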

If you were able to store your data in a datetimeoffset column instead of datetime, this might help.
http://msdn.microsoft.com/en-us/library/bb630289.aspx
This data type and the corresponding functions are a new feature of SQL Server 2008.

Related

Pandas sum values between two dates in the most efficient way?

I have a dataset which shows production reported every week and another reporting production every hour across several subproductions. I would now like to compare the sum of all the hourly subproduction with the value reported every week in the most efficient way. How could I achieve this? I would like to avoid a for loop at all costs, as my dataset is really large.
So my dataset looks like this:
Weekly reported data:
Datetime_text | Total_Production_A
--------------------------|--------------------
2014-12-08 00:00:00.000 | 8277000
2014-12-15 00:00:00.000 | 8055000
2014-12-22 00:00:00.000 | 7774000
Hourly data:
Datetime_text | A_Prod_1 | A_Prod_2 | A_Prod_3 | ...... | A_Prod_N |
--------------------------|-----------|-----------|-----------|-----------|-----------|
2014-12-06 23:00:00.000 | 454 | 9 | 54 | 104 | 4 |
2014-12-07 00:00:00.000 | 0 | NaV | 0 | 23 | 3 |
2014-12-07 01:00:00.000 | 54 | 0 | 4 | NaV | 20 |
and so on. I would like to create a new table where the difference between the weekly reported data and the hourly reported data is calculated for every date in the weekly data. So something like this:
Datetime_text | Diff_Production_A
--------------------------|------------------
2014-12-08 00:00:00.000 | 10
2014-12-15 00:00:00.000 | -100
2014-12-22 00:00:00.000 | 1350
where Diff_Production_A = Total_Production_A - sum(A_Prod_1, A_Prod_2, A_Prod_3, ..., A_Prod_N; over all datetimes of a week). How can I best achieve this?
Any help in this regard would be greatly appreciated :D
Best
fidu13
Store datetime as pd.Timestamp, then you can do all kinds of manipulation on the dates.
For your problem, the key is to group the hourly data by week (with weeks anchored on Mondays), then merge it with the weekly data and calculate the differences:
import pandas as pd

# Parse the text timestamps into real datetimes.
weekly["Datetime"] = pd.to_datetime(weekly["Datetime_text"])
hourly["Datetime"] = pd.to_datetime(hourly["Datetime_text"])
# Row-wise total across all subproduction columns.
hourly["HourlyTotal"] = hourly.loc[:, "A_Prod_1":"A_Prod_N"].sum(axis=1)
result = (
    hourly.groupby(pd.Grouper(key="Datetime", freq="W-MON"))["HourlyTotal"]
    .sum()
    .to_frame()
    .merge(
        weekly[["Datetime", "Total_Production_A"]],
        how="outer",
        left_index=True,
        right_on="Datetime",
    )
    .assign(Diff=lambda x: x["Total_Production_A"] - x["HourlyTotal"])
)
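On a large dataset you would stay with the vectorised version, but when checking it on a small sample it can help to have the arithmetic spelled out. The following is only a plain-Python reference sketch under two assumptions that are not stated in the question: each weekly figure covers the seven days before its timestamp (start inclusive, end exclusive), and `NaV` marks a missing reading that counts as 0. The function names are hypothetical, and the rows are the ones printed in the question.

```python
from datetime import datetime, timedelta

FMT = "%Y-%m-%d %H:%M:%S.%f"

# Rows as printed in the question.
WEEKLY = [
    ("2014-12-08 00:00:00.000", 8277000),
    ("2014-12-15 00:00:00.000", 8055000),
    ("2014-12-22 00:00:00.000", 7774000),
]
HOURLY = [
    ("2014-12-06 23:00:00.000", ["454", "9", "54", "104", "4"]),
    ("2014-12-07 00:00:00.000", ["0", "NaV", "0", "23", "3"]),
    ("2014-12-07 01:00:00.000", ["54", "0", "4", "NaV", "20"]),
]


def parse_value(text):
    """Assumption: 'NaV' (or a blank) is a missing reading and counts as 0."""
    text = text.strip()
    return 0 if text in ("", "NaV") else int(text)


def weekly_differences(weekly, hourly):
    """Return (weekly timestamp, weekly total minus summed hourly values)."""
    hourly_totals = [
        (datetime.strptime(ts, FMT), sum(parse_value(v) for v in values))
        for ts, values in hourly
    ]
    result = []
    for ts, total in weekly:
        end = datetime.strptime(ts, FMT)
        start = end - timedelta(days=7)  # assumed week: [end - 7 days, end)
        hourly_sum = sum(s for t, s in hourly_totals if start <= t < end)
        result.append((ts, total - hourly_sum))
    return result


for ts, diff in weekly_differences(WEEKLY, HOURLY):
    print(ts, diff)
```

If the small-sample numbers disagree with the pandas result, the usual suspects are where rows on the week boundary end up and how the missing values were read in.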

MS Access: show count per hour (even if there are no records in the time slot)

I'm stuck, i think it should be simple but can't get it to work. I have a table 'tbTimeTable' with all the hours of the day.
tbTimeTable (only show the first 5 records, but it will end at 23:00 (24 records in total)
| ID | TimeStart | TimeStop |
|1 | 0:00 | 1:00 |
|2 | 1:00 | 2:00 |
|3 | 2:00 | 3:00 |
|4 | 3:00 | 4:00 |
|5 | 4:00 | 5:00 |
I have a totals query qryPartCountTotalsPerHour with the part count per hour.
| DateIn | PartCount | PeriodIn | PeriodOut |
|19-5-2021 | 221 |0:00 | 1:00 |
|19-5-2021 | 203 |1:00 | 2:00 |
|19-5-2021 | 201 |2:00 | 3:00 |
|19-5-2021 | 215 |6:00 | 7:00 |
|19-5-2021 | 174 |7:00 | 8:00 |
What I want, is to show the part count result for all the hours of the day and if there are no records in that hour then show 0 in the part count. So every Date in the DateIn field should show at least 24 records.
I tried this:
SELECT qryPartCountTotalsPerHour.DateIn, qryPartCountTotalsPerHour.PartCount, qryPartCountTotalsPerHour.PeriodOut, qryPartCountTotalsPerHour.PeriodOut
FROM tbTimeTable LEFT JOIN qryPartCountTotalsPerHour ON tbTimeTable.TimeStart = qryPartCountTotalsPerHour.PeriodIn
ORDER BY qryPartCountTotalsPerHour.DateIn;
I also tried to convert PeriodIn and TimeStart to just an hour with the Hour() function, but nothing works. I'm making a mistake somewhere but can't find it.
Edit: tried to clarify that the DateIn contains more than one date.
You are close. You need to select the time columns from the first table, so that every time slot appears even when the totals query has no matching row:
SELECT t.TimeStart, t.TimeStop, NZ(q.PartCount, 0) AS PartCount
FROM tbTimeTable AS t LEFT JOIN
     qryPartCountTotalsPerHour AS q
     ON t.TimeStart = q.PeriodIn
ORDER BY t.TimeStart;
This works for your sample data. It might get more complicated if more days are involved.
So I found the answer. This topic helped me to get there, specifically the answer from Ken Sheridan and his public database "Payments". That database should give you enough information to solve this or a similar problem. Here is a short summary of what I did.
So I created a query with the time table and all the log dates (no relationship between them) called qryCalendar. This resulted in the complete time table for each logging date.
Then I created a new query to show the final result:
SELECT qryCalendar.Date, qryCalendar.TimeStop, qryCalendar.TimeStart, Nz([qryPartCountTotalsPerHour].[PartCount],0) AS tPartCount
FROM qryCalendar LEFT JOIN qryPartCountTotalsPerHour ON (qryCalendar.TimeStop = qryPartCountTotalsPerHour.PeriodOut) AND (qryCalendar.Date = qryPartCountTotalsPerHour.DateIn)
ORDER BY qryCalendar.Date, qryCalendar.TimeStart;
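The same two steps, a cross join to build the calendar and then a left join that defaults to 0, can be sketched outside Access to show why every date ends up with 24 rows. This is only an illustration in Python. The function name `hourly_counts` and the second date are hypothetical, and the counts are the sample values from the question keyed by start hour.

```python
from itertools import product

# Sample totals from the question, keyed by (date, start hour).
COUNTS = {
    ("19-5-2021", 0): 221,
    ("19-5-2021", 1): 203,
    ("19-5-2021", 2): 201,
    ("19-5-2021", 6): 215,
    ("19-5-2021", 7): 174,
}


def hourly_counts(dates, counts):
    """Return one (date, start, stop, count) row per date and hour slot."""
    rows = []
    # Cross join: every date paired with every hour (the qryCalendar step).
    for day, hour in product(dates, range(24)):
        start = f"{hour}:00"
        stop = f"{(hour + 1) % 24}:00"
        # Left join with a default of 0 (the Nz(..., 0) step).
        rows.append((day, start, stop, counts.get((day, hour), 0)))
    return rows


# "20-5-2021" is a made-up second date with no logged parts at all.
for row in hourly_counts(["19-5-2021", "20-5-2021"], COUNTS)[:8]:
    print(row)
```

The point is that the row set comes entirely from the calendar side, so a date or hour with no parts still shows up, just with a count of 0.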

ORACLE SQL query to get rows of intervals of 30 minutes based on two hours

I need a query that returns one row per 30-minute interval between two times, start_hour and end_hour.
I have a table with the columns start_hour and end_hour.
Assuming that I have this:
| start_hour | end_hour |
| 09:00AM | 08:00PM |
I need a query that gave a result like this.
| intervals |
| 09:00AM |
| 09:30AM |
| 10:00AM |
| 10:30AM |
| 11:00AM |
| 11:30AM |
| 12:00PM |
| 12:30PM |
...
...
...
| 07:30PM |
| 08:00PM |
The rows need to finish at the end_hour value I have in the table, as shown in the example.
Can someone help me with how to do this? I tried rounding the start_hour, but I didn't get any result.
This is a bit clunky and will take a bit of editing based on your specific needs, but it's a very slightly modified bit of code I used a few years back that should work as a solid starting point for you:
select to_char(time_slot,'HH:MIPM')
from (select trunc(to_date('05/23/2019','MM/DD/YYYY'))+(rownum-1)*(30/24/60) time_slot
from dual
connect by level <= (24*2))
where to_char(time_slot,'HH24:MI') between
--start_hour
'09:00'
and
--end hour
'20:00';
OUTPUT
09:00AM
09:30AM
10:00AM
10:30AM
11:00AM
11:30AM
12:00PM
12:30PM
01:00PM
01:30PM
02:00PM
02:30PM
03:00PM
03:30PM
04:00PM
04:30PM
05:00PM
05:30PM
06:00PM
06:30PM
07:00PM
07:30PM
08:00PM
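For comparison, the same slot generation can be written as a plain loop, which makes the stepping logic easier to see than in the CONNECT BY version. This is a hedged sketch in Python, not Oracle. The function name `half_hour_slots` is hypothetical, and it assumes both values are on the same day and formatted like `09:00AM`.

```python
from datetime import datetime, timedelta

FMT = "%I:%M%p"  # 12-hour clock with AM/PM, e.g. 09:00AM


def half_hour_slots(start_hour, end_hour, step_minutes=30):
    """List every slot from start_hour to end_hour inclusive."""
    current = datetime.strptime(start_hour, FMT)
    end = datetime.strptime(end_hour, FMT)
    slots = []
    while current <= end:
        slots.append(current.strftime(FMT))
        current += timedelta(minutes=step_minutes)
    return slots


print(half_hour_slots("09:00AM", "08:00PM"))
```

Parsing to a real time value before comparing matters here: comparing the `09:00AM` style strings directly would sort `08:00PM` before `09:00AM`.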

I think I need a loop in an MS Access Query

I have a table of login and logout times for users, table looks something like below:
| ID | User | WorkDate | Start | Finish |
| 1 | Bill | 07/12/2017 | 09:00:00 | 17:00:00 |
| 2 | John | 07/12/2017 | 09:00:00 | 12:00:00 |
| 3 | John | 07/12/2017 | 12:30:00 | 17:00:00 |
| 4 | Mary | 07/12/2017 | 09:00:00 | 10:00:00 |
| 5 | Mary | 07/12/2017 | 10:10:00 | 12:00:00 |
| 6 | Mary | 07/12/2017 | 12:10:00 | 17:00:00 |
I'm running a query to find out the length of the breaks that each user took by running a date diff between the Min of Finish, and Max of Start, then doing some other sums/queries to find out their break length.
This works where i have a maximum of two rows per User per WorkDate, so rows 1,2,3 give me workable data.
Rows 4,5,6 do not.
So long story short, how can I calculate the break times from the above data with a query in MS Access? I'm assuming I'm going to need some looping statement, but I have no idea where to begin.
Here is a solution that comes to mind first.
First query to get the min/max start and end times.
Second query to calculate the total time worked for each day by using your Min(start time) and max(end time) query.
Third query to calculate the total time worked for each shift (time difference between start and end times) and then do a daily sum.
Fourth query to calculate the difference between the total time from the second query and the total time from the third query. The difference gives you the amount of break time they took.
If you need additional help, I can provide some screenshots of example queries.
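The arithmetic behind those four queries is: break time = (latest finish - earliest start) - (sum of the individual shift lengths), per user and work date. Here is a small sketch of that calculation in Python using the rows from the question. The function name `break_time` is hypothetical, and it assumes shifts for one user on one date never overlap.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# (User, WorkDate, Start, Finish), copied from the question.
SHIFTS = [
    ("Bill", "07/12/2017", "09:00:00", "17:00:00"),
    ("John", "07/12/2017", "09:00:00", "12:00:00"),
    ("John", "07/12/2017", "12:30:00", "17:00:00"),
    ("Mary", "07/12/2017", "09:00:00", "10:00:00"),
    ("Mary", "07/12/2017", "10:10:00", "12:00:00"),
    ("Mary", "07/12/2017", "12:10:00", "17:00:00"),
]


def break_time(shifts):
    """Return {(user, work_date): total break} as timedeltas."""
    spans = {}                        # earliest start and latest finish
    worked = defaultdict(timedelta)   # summed shift lengths
    for user, work_date, start, finish in shifts:
        s = datetime.strptime(start, "%H:%M:%S")
        f = datetime.strptime(finish, "%H:%M:%S")
        key = (user, work_date)
        first, last = spans.get(key, (s, f))
        spans[key] = (min(first, s), max(last, f))
        worked[key] += f - s
    # Whatever part of the overall span was not worked is break time.
    return {key: (last - first) - worked[key]
            for key, (first, last) in spans.items()}


for key, gap in break_time(SHIFTS).items():
    print(key, gap)
```

Because it only uses a min, a max, and a sum per group, the number of rows per user per day does not matter, which is why no loop is needed in the Access version either.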

Only Some Dates From SQL SELECT Being Set To "0" or "1969-12-31" -- UNIX_TIMESTAMP

So I have been doing pretty well on my project (Link to previous StackOverflow question), and have managed to learn quite a bit, but there is this one problem that has been really dogging me for days and I just can't seem to solve it.
It has to do with using the UNIX_TIMESTAMP call to convert dates in my SQL database to UNIX time-format, but for some reason only one set of dates in my table is giving me issues!
==============
So these are the values I am getting -
#abridged here, see the results from the SELECT statement below to see the rest
#of the fields outputted
| firstVst | nextVst | DOB |
| 1206936000 | 1396238400 | 0 |
| 1313726400 | 1313726400 | 278395200 |
| 1318910400 | 1413604800 | 0 |
| 1319083200 | 1413777600 | 0 |
when I use this SELECT statement -
SELECT SQL_CALC_FOUND_ROWS *,UNIX_TIMESTAMP(firstVst) AS firstVst,
UNIX_TIMESTAMP(nextVst) AS nextVst, UNIX_TIMESTAMP(DOB) AS DOB FROM people
ORDER BY "ref DESC";
So my big question is: why in the heck are 3 out of 4 of my DOBs being set to date of 0 (IE 12/31/1969 on my PC)? Why is this not happening in my other fields?
I can see the data quite well using a more simple SELECT statement and the DOB field looks fine...?
#formatting broken to change some variable names etc.
select * FROM people;
| ref | lastName | firstName | DOB | rN | lN | firstVst | disp | repName | nextVst |
| 10001 | BlankA | NameA | 1968-04-15 | 1000000 | 4600000 | 2008-03-31 | Positive | Patrick Smith | 2014-03-31 |
| 10002 | BlankB | NameB | 1978-10-28 | 1000001 | 4600001 | 2011-08-19 | Positive | Patrick Smith | 2011-08-19 |
| 10003 | BlankC | NameC | 1941-06-08 | 1000002 | 4600002 | 2011-10-18 | Positive | Patrick Smith | 2014-10-18 |
| 10004 | BlankD | NameD | 1952-08-01 | 1000003 | 4600003 | 2011-10-20 | Positive | Patrick Smith | 2014-10-20 |
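It's because those DOBs are from before 1 January 1970 (UTC), which is where the Unix epoch starts, so their timestamps would be negative. MySQL's UNIX_TIMESTAMP() returns 0 for dates outside its supported range, and a timestamp of 0 displays as 12/31/1969 in a timezone west of UTC.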
From Wikipedia:
Unix time, or POSIX time, is a system for describing instants in time, defined as the number of seconds that have elapsed since 00:00:00 Coordinated Universal Time (UTC), Thursday, 1 January 1970, not counting leap seconds.
A bit more elaboration: Basically what you're trying to do isn't possible. Depending on what it's for, there may be a different way you can do this, but using UNIX timestamps probably isn't the best idea for dates like that.
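To see why only the older DOBs are affected, it helps to compute the seconds since the epoch by hand, allowing the result to go negative. The sketch below does that in Python with plain date arithmetic. The function names are hypothetical, and it assumes midnight UTC for each date, so the values differ from the question's output by the server's UTC offset.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def signed_epoch_seconds(date_text):
    """Seconds from the epoch to midnight UTC on a YYYY-MM-DD date."""
    moment = datetime.strptime(date_text, "%Y-%m-%d")
    moment = moment.replace(tzinfo=timezone.utc)
    return int((moment - EPOCH).total_seconds())


def from_epoch_seconds(seconds):
    """Inverse of signed_epoch_seconds; works for negative values too."""
    return (EPOCH + timedelta(seconds=seconds)).date().isoformat()


# The four DOBs from the question.
for dob in ("1968-04-15", "1978-10-28", "1941-06-08", "1952-08-01"):
    seconds = signed_epoch_seconds(dob)
    print(dob, seconds, from_epoch_seconds(seconds))
```

Three of the four values come out negative, which matches the three rows that showed 0. If the dates only need to be displayed or compared, keeping them as DATE values and formatting them on the client avoids the problem altogether.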