SQL - Normalising timestamps to business hours - sql

My initial answer to this problem has been to script it: instead of using SQL, I've dipped into Python and normalised the dates there. I am curious whether anyone can come up with a solution using SQL, though.
If a date occurs outside of business hours, I want to normalise it to the next working day. I'll keep this really simple and say that business hours are 9am to 6pm, Monday to Friday. Anything outside of those hours is outside of business hours.
The dates should be changed so that 2pm on Saturday becomes 9am on Monday morning (the first legitimate time in the business week), 7pm on a Wednesday becomes 9am on Thursday morning, and so on. Let's ignore holidays.
Sample data:
mysql> select mydate from mytable ORDER by mydate;
+---------------------+
| mydate |
+---------------------+
| 2009-09-13 17:03:09 |
| 2009-09-14 09:45:49 |
| 2009-09-15 09:57:28 |
| 2009-09-16 21:55:01 |
+---------------------+
4 rows in set (0.00 sec)
The first date is a Sunday so it should be normalised to 2009-09-14 09:00:00
The second date is fine, it's at 9am on a Monday.
The third date is fine, it's at 9am on a Tuesday.
The fourth date is at 9pm (outside of our 9am to 6pm business hours) on a Wednesday and should be transformed to 9am Thursday morning.
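For reference, the rule described above could be sketched in Python roughly as follows. The question mentions a Python script but does not show it, so this is an assumed version, not the asker's code. The name `normalise` and the two constants are made up for illustration, and the sketch assumes that 18:00:00 exactly already counts as outside business hours and that a weekday timestamp before 9am moves to 9am the same day.

```python
from datetime import datetime, time, timedelta

OPEN_HOUR = 9    # assumed: business day starts at 09:00
CLOSE_HOUR = 18  # assumed: 18:00:00 and later is outside business hours


def normalise(ts: datetime) -> datetime:
    """Return ts unchanged if it is inside business hours,
    otherwise the next 09:00 on a weekday (holidays ignored)."""
    is_weekday = ts.weekday() < 5  # Monday = 0 ... Sunday = 6
    if is_weekday and OPEN_HOUR <= ts.hour < CLOSE_HOUR:
        return ts

    day = ts.date()
    # Early on a weekday we stay on the same day; otherwise start from tomorrow.
    if not (is_weekday and ts.hour < OPEN_HOUR):
        day += timedelta(days=1)
    # Skip Saturday and Sunday.
    while day.weekday() >= 5:
        day += timedelta(days=1)
    return datetime.combine(day, time(OPEN_HOUR))


if __name__ == "__main__":
    for sample in ("2009-09-13 17:03:09", "2009-09-14 09:45:49",
                   "2009-09-15 09:57:28", "2009-09-16 21:55:01"):
        ts = datetime.strptime(sample, "%Y-%m-%d %H:%M:%S")
        print(ts, "->", normalise(ts))
```

The loop over days keeps the weekend rule in one place, at the cost of a few extra iterations.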

I think you're better off with your Python solution ... but I like challenges :)
select mydate
     , case dayadjust
         -- BUG
         -- when 0 then mydate
         -- BUG
         when 0 then case
                       when hour(mydate)<9
                       then date_add(from_days(to_days(mydate)),
                                     INTERVAL 9 HOUR)
                       else mydate
                     end
         -- BUG SQUASHED
         else date_add(from_days(to_days(mydate) + dayadjust),
                       INTERVAL 9 HOUR)
       end as mynewdate
  from (
        select mydate
             , case
                 when addday>=moreday then addday
                 else moreday
               end as dayadjust
          from (
                select mydate
                     , weekday(mydate) as w
                     , hour(mydate) as h
                     , case weekday(mydate)
                         when 6 then 1
                         when 5 then 2
                         when 4 then
                           case
                             when hour(mydate) >= 18 then 3
                             else 0
                           end
                         else 0
                       end as addday
                     , case when hour(mydate)>=18 then 1 else 0 end as moreday
                  from mytable
                 order by mydate
               ) alias1
       ) alias2
Tested on MySQL
$ mysql tmp < phil.sql
mydate mynewdate
2009-09-12 17:03:09 2009-09-14 09:00:00
2009-09-12 21:03:09 2009-09-14 09:00:00
2009-09-13 17:03:09 2009-09-14 09:00:00
2009-09-14 09:45:49 2009-09-14 09:45:49
2009-09-15 09:57:28 2009-09-15 09:57:28
2009-09-16 21:55:01 2009-09-17 09:00:00
2009-09-17 11:03:09 2009-09-17 11:03:09
2009-09-17 22:03:09 2009-09-18 09:00:00
2009-09-18 12:03:09 2009-09-18 12:03:09
2009-09-18 19:03:09 2009-09-21 09:00:00
2009-09-19 06:03:09 2009-09-21 09:00:00
2009-09-19 16:03:09 2009-09-21 09:00:00
2009-09-19 19:03:09 2009-09-21 09:00:00

Not sure why you want to do this, but if it needs to always be true of all data in your database, you need a trigger. I would set up a table that specifies the business hours and use it to determine the next valid business day and time. (I might even consider a table that tells you exactly what the next business day and hour is for each possibility. It's not as if this changes a lot: it might need updating once a year if you change holidays for the next year, or if you change the overall business hours. By precalculating, you can probably save processing time.) I would also continue to use your script, because it's better to fix data before it gets entered, but you need the trigger to ensure that data from any source (and sooner or later there will be changes from sources other than your application) meets the data integrity rules.
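To make the precalculated-table idea a little more concrete, here is a rough Python sketch of such a lookup, keyed by weekday and hour. The names `build_lookup`, `LOOKUP` and `normalise_with_lookup` are hypothetical, the 9am to 6pm hours are taken from the question, and holidays are left out; in the database the same rows would live in an ordinary table that the trigger joins against.

```python
from datetime import datetime, time, timedelta


def build_lookup(open_hour=9, close_hour=18):
    """Map (weekday, hour) -> number of days to add before snapping to
    open_hour. None means the timestamp is already inside business hours."""
    table = {}
    for weekday in range(7):          # Monday = 0 ... Sunday = 6
        for hour in range(24):
            if weekday < 5 and open_hour <= hour < close_hour:
                table[(weekday, hour)] = None
            elif weekday < 5 and hour < open_hour:
                table[(weekday, hour)] = 0      # same day, at opening time
            else:
                # Count forward to the next weekday.
                days, nxt = 1, (weekday + 1) % 7
                while nxt >= 5:
                    days, nxt = days + 1, (nxt + 1) % 7
                table[(weekday, hour)] = days
    return table


LOOKUP = build_lookup()   # 7 * 24 = 168 precomputed entries


def normalise_with_lookup(ts: datetime) -> datetime:
    """Apply the precomputed rule to a single timestamp."""
    days = LOOKUP[(ts.weekday(), ts.hour)]
    if days is None:
        return ts
    return datetime.combine(ts.date() + timedelta(days=days), time(9))
```

If the business hours change, only the table has to be rebuilt; the lookup function stays the same.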

I don't think you can do it in one query, but you can try this:
-- Note: MySQL's DAYOFWEEK() returns 1 = Sunday ... 7 = Saturday
-- Mon-Thu, after 18:00
-- Set date = next day, 9:00
UPDATE
    myTable
SET
    mydate = DATE_ADD(DATE_ADD(DATE(mydate), INTERVAL 1 DAY), INTERVAL 9 HOUR)
WHERE
    HOUR(mydate) >= 18
    AND DAYOFWEEK(mydate) IN (2,3,4,5);
-- Mon-Fri, before 9:00
-- Set date = the same day, 9:00
UPDATE
    myTable
SET
    mydate = DATE_ADD(DATE(mydate), INTERVAL 9 HOUR)
WHERE
    HOUR(mydate) < 9
    AND DAYOFWEEK(mydate) IN (2,3,4,5,6);
-- Fri after 18:00, Sat, Sun
-- Set date = Monday, 9:00 (Fri + 3 days, Sat + 2 days, Sun + 1 day)
UPDATE
    myTable
SET
    mydate = DATE_ADD(DATE_ADD(DATE(mydate),
        INTERVAL CASE DAYOFWEEK(mydate) WHEN 6 THEN 3 WHEN 7 THEN 2 ELSE 1 END DAY),
        INTERVAL 9 HOUR)
WHERE
    (HOUR(mydate) >= 18
    AND DAYOFWEEK(mydate) = 6)
    OR DAYOFWEEK(mydate) IN (7,1);

Related

how to return a specific set of data from multiple columns in a database in sql

I am new to SQL and this is my first ever question. I am working with a sample database that I want to extract specific information from to display as a dashboard. The issue is that I can do this partially, but I cannot seem to figure it out properly.
SELECT
    s_date as date,
    p_time as time,
    process_id as process,
    sc_gun as scannumb,
    sum(line_qty) as linetotal,
    sum(area_qty) as areatotal
FROM dbfile6
WHERE
    process_id in ('0010','0020','0030')
    and sc_gun in ('10','20','30','40','50')
    -- parentheses so the two date/time branches are OR-ed as one unit
    and ((s_date = curdate() - interval 1 day and p_time between '22:00:00' and '23:59:59')
      or (s_date = curdate() and p_time between '00:00:00' and '06:00:00'))
GROUP BY p_time, s_date, process_id, sc_gun
ORDER BY s_date, process_id
What do I want to display?
I can partially do this: I can make it work against yesterday's date (s_date) on a recurring basis, but I want this to happen only Monday to Friday, skipping the weekend, so that on a Monday it looks at Friday's data from the database.
I want to show the time as a range, a night range. The range is 20:00:00 - 06:00:00. The range is tricky as it crosses over to the next day. This could work for Monday to Thursday but not Friday, as there is no working weekend, so what would I do here? In addition, I can sum up the values successfully and display them as averages for each process, but once I add the time in, it displays each result individually.
The table below is what it looks like in the database, however as mentioned earlier, the desired result is for each process to have the line_qty and area_qty summed up by time range and a day and night cycle.
s_date      p_time    process_id  sc_gun  line_qty  area_qty
04/05/2022  04:49:52  0010        10      2         12
03/05/2022  11:50:00  0010        10      5         14
03/05/2022  19:50:00  0010        10      7         16
03/05/2022  13:50:00  0020        20      4         6
03/05/2022  19:50:00  0010        10      7         16
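The "yesterday, but skip the weekend" part of the requirement can be pinned down with a small Python sketch. This only illustrates the date rule, it is not a rewrite of the query; the function name `previous_working_day` is made up, and public holidays are ignored.

```python
from datetime import date, timedelta


def previous_working_day(today: date) -> date:
    """Step back one day at a time until we land on Monday-Friday."""
    day = today - timedelta(days=1)
    while day.weekday() >= 5:   # 5 = Saturday, 6 = Sunday
        day -= timedelta(days=1)
    return day


if __name__ == "__main__":
    # On a Monday this gives the preceding Friday.
    print(previous_working_day(date(2022, 5, 9)))
```

The same rule could then be expressed in SQL with a CASE on the weekday of the current date.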

How to bring future days to past date and then revert to same old days using postgresql?

I have a db with 6 tables. Each table has a list of date and datetime columns as shown below
Table 1 Table 2 .... Table 6
Date_of_birth Exam_date exam_datetime Result_date Result_datetime
2190-01-13 2192-01-13 2192-01-13 09:00:00 2194-04-13 2194-04-13 07:12:00
2184-05-21 2186-05-21 2186-05-21 07:00:00 2188-02-03 2188-02-03 09:32:00
2181-06-17 2183-06-17 2183-06-17 05:00:00 2185-07-23 2185-07-23 12:40:00
What I would like to do is shift all these future days back to the past date (definitely has to be less than the current date) but retain the same chronological order. Meaning, we can see that the person was born first, then he took the exam, and finally, he got his results.
In addition, I should be able to revert the changes and get back the future dates again.
I expect my output to be something like below
Stage 1 - shift back to old days (it can be any day but it has to be in the past and retain chronological order)
Table 1 Table 2 .... Table 6
Date_of_birth Exam_date exam_datetime Result_date Result_datetime
1990-01-13 1992-01-13 1992-01-13 09:00:00 1994-04-13 1994-04-13 07:12:00
1984-05-21 1986-05-21 1986-05-21 07:00:00 1988-02-03 1988-02-03 09:32:00
1981-06-17 1983-06-17 1983-06-17 05:00:00 1985-07-23 1985-07-23 12:40:00
Stage 2 - Shift forward to future days as how it was earlier
Table 1 Table 2 .... Table 6
Date_of_birth Exam_date exam_datetime Result_date Result_datetime
2190-01-13 2192-01-13 2192-01-13 09:00:00 2194-04-13 2194-04-13 07:12:00
2184-05-21 2186-05-21 2186-05-21 07:00:00 2188-02-03 2188-02-03 09:32:00
2181-06-17 2183-06-17 2183-06-17 05:00:00 2185-07-23 2185-07-23 12:40:00
Subtract two centuries:
update table1
set date_of_birth = date_of_birth - interval '200 year';
You can do something similar for all the other dates.
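The same fixed-offset idea, sketched in Python to show why it keeps the chronological order and can be undone. The helper `shift_years` is hypothetical. One caveat under this approach: a 29 February that lands in a non-leap target year has to be clamped, so such a date would not survive the round trip exactly.

```python
from datetime import date


def shift_years(d: date, years: int) -> date:
    """Move d by a whole number of years (negative = into the past)."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        # 29 February in a non-leap target year: clamp to 28 February.
        return d.replace(year=d.year + years, day=28)


if __name__ == "__main__":
    # First sample row from the question: birth, exam, result.
    row = [date(2190, 1, 13), date(2192, 1, 13), date(2194, 4, 13)]
    past = [shift_years(d, -200) for d in row]
    restored = [shift_years(d, 200) for d in past]
    print(past)             # same order, two centuries earlier
    print(restored == row)  # the shift is reversible for these dates
```

Because every column is moved by the same offset, the relative order of the dates in a row does not change.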

Extract weekend days from date

I have a date field, and from that field I am trying to extract only the weekends, i.e. in my case Saturday and Sunday.
So how can I extract weekends from a date?
If the dates below fall on a weekend, the result should look like this:
Date day working hours
01/01/2019
02/01/2019
03/01/2019
04/01/2019
05/01/2019 weekend 24
06/01/2019 weekend 87
07/01/2019
08/01/2019
09/01/2019
10/01/2019
Data link: https://www.dropbox.com/s/xaps82qyyo6i0fa/ar.xlsx?dl=0
You can use the WeekDay function. This function accepts a date value/field and returns the day of the week. The returned value is in dual format: day name and day number.
So you can create additional field that checks if the day number is >= 5 (day numbers are starting from 0 so Saturday = 5 and Sunday = 6)
RawData:
LOAD
AttendanceDay,
if(WeekDay(AttendanceDay) >= 5, 1, 0) as isWeekend,
Employee_ID,
WorkingHours
FROM
[..\Downloads\ar.xlsx]
(ooxml, embedded labels, table is Attendances_20191119_0838)
;
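For comparison, Python's `date.weekday()` uses the same numbering (Monday = 0, so Saturday = 5 and Sunday = 6), and the same check can be sketched outside Qlik. The function name `is_weekend` is made up, and the dd/mm/yyyy format is assumed from the sample dates in the question.

```python
from datetime import datetime


def is_weekend(day_text: str) -> bool:
    """True if a dd/mm/yyyy date string falls on Saturday or Sunday."""
    day = datetime.strptime(day_text, "%d/%m/%Y").date()
    return day.weekday() >= 5


if __name__ == "__main__":
    for text in ("04/01/2019", "05/01/2019", "06/01/2019", "07/01/2019"):
        print(text, "weekend" if is_weekend(text) else "")
```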
Resulting table after the reload:

How to retrieve the min and max times of a timestamp column based on a time interval of 30 mins?

I am trying to pull a desired output that looks like this
Driver_ID| Interval_Start_Time | Interval_End_Time | Clocked_In_Time | Clocked_Out_Time | Score
232 | 2019-04-02 00:00:00.000 | 2019-04-02 00:30:00.000 | 2019-04-02 00:10:00.000 | 2019-04-02 00:29:00.000 | 0.55
My goal is to pull each ID per 30-minute interval, together with the earliest (min) clocked-in time and the latest (max) clocked-out time within that same 30-minute interval.
The query I have currently is
WITH TIME AS
(SELECT DISTINCT CASE
WHEN extract(MINUTE
FROM offer_time_utc)<30 THEN date_trunc('hour', offer_time_utc)
ELSE date_add('minute',30, date_trunc('hour', offer_time_utc))
END AS interval_start_time,
CASE
WHEN extract(MINUTE
FROM offer_time_utc)<30 THEN date_add('minute',30, date_trunc('hour', offer_time_utc))
ELSE date_add('hour',1, date_trunc('hour', offer_time_utc))
END AS interval_end_time
FROM integrated_delivery.trip_offer_fact offer
WHERE offer.business_day = date '2019-04-01' )
SELECT DISTINCT offer.Driver_ID,
offer.region_uuid,
interval_start_time,
interval_end_time,
min(sched.clocked_in_time_utc) AS clocked_in_time,
max(sched.clocked_out_time_utc) AS clocked_out_time,
cast(scores.acceptance_rate AS decimal(5,3)) AS acceptance_rate
FROM integrated_delivery.trip_offer_fact offer
JOIN TIME ON offer.offer_time_utc BETWEEN time.interval_start_time AND time.interval_end_time
JOIN integrated_delivery.courier_actual_hours_fact sched ON offer.Driver_ID = sched.Driver_ID
JOIN integrated_product.driver_score_v2 scores ON offer.Driver_ID = scores.courier_id
AND offer.region_uuid = scores.region_id
AND offer.region_uuid = sched.region_uuid
AND offer.business_day = date '2019-04-01'
AND sched.business_day = date '2019-04-01'
AND scores.extract_dt = 20190331
AND offer.region_uuid IN('930c534f-a6b6-4bc1-b26e-de5de8930cf9')
GROUP BY 1,2,3,4,7
But it does not seem to give me the correct min and max clocked in and clocked out time in that correct interval as below,
driver_uuid region_uuid interval_start_time interval_end_time clocked_in_time clocked_out_time score
232 bbv 2019-04-01 14:30:00.000 2019-04-01 15:00:00.000 2019-04-01 14:43:13.140 2019-04-01 22:30:46.043 0.173
When I add in these 2 lines,
JOIN TIME ON sched.clocked_in_time_utc BETWEEN time.interval_start_time AND time.interval_end_time
JOIN TIME ON sched.clocked_out_time_utc BETWEEN time.interval_start_time AND time.interval_end_time
it gives me an error, so I don't think that is correct.
How can I correctly pull the min and max clock-in and clock-out times for the correct interval? That is, I only want the earliest clocked-in time and the latest clocked-out time that fall within each half-hour interval's start and end time.
I appreciate anybody looking! Thanks
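To pin down what "per half-hour interval" means here, the bucketing that the TIME CTE performs, plus a min/max per bucket, can be sketched in Python. This is only an illustration of the intended result, not a fix for the query. The function names and the row layout are hypothetical, and the sketch assumes each schedule row is assigned to the interval that contains its clock-in time.

```python
from datetime import datetime, timedelta


def half_hour_interval(ts: datetime):
    """Return (start, end) of the half-hour interval containing ts."""
    start = ts.replace(minute=0 if ts.minute < 30 else 30,
                       second=0, microsecond=0)
    return start, start + timedelta(minutes=30)


def clock_range_per_interval(rows):
    """rows: iterable of (driver_id, clocked_in, clocked_out).
    Returns {(driver_id, interval_start): (earliest in, latest out)}."""
    result = {}
    for driver_id, clocked_in, clocked_out in rows:
        key = (driver_id, half_hour_interval(clocked_in)[0])
        first_in, last_out = result.get(key, (clocked_in, clocked_out))
        result[key] = (min(first_in, clocked_in), max(last_out, clocked_out))
    return result
```

With the desired-output row from the question (driver 232, clocked in at 00:10 and out at 00:29 on 2019-04-02), the key is the interval starting at 00:00 and the value is that same pair of times.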

GROUP BY several hours

I have a table where our product records its activity log. The product starts working at 23:00 every day and usually works one or two hours. This means that once a batch started at 23:00, it finishes about 1:00am next day.
Now, I need to take statistics on how many posts are registered per batch, but I cannot figure out a script that would let me achieve this. So far I have the following SQL code:
SELECT COUNT(*), DATEPART(DAY,registrationtime),DATEPART(HOUR,registrationtime)
FROM RegistrationMessageLogEntry
WHERE registrationtime > '2014-09-01 20:00'
GROUP BY DATEPART(DAY, registrationtime), DATEPART(HOUR,registrationtime)
ORDER BY DATEPART(DAY, registrationtime), DATEPART(HOUR,registrationtime)
which results in the following
count day hour
....
1189 9 23
8611 10 0
2754 10 23
6462 11 0
1885 11 23
I.e. I want the number for 9th 23:00 grouped with the number for 10th 00:00, 10th 23:00 with 11th 00:00 and so on. How could I do it?
You can do it very easily. Use DATEADD to add an hour to the original registrationtime. That moves all of a batch's registrationtimes onto the same day, so you can simply group by the day part.
You could also do it in a more complicated way using CASE WHEN, but that's overkill given this easy solution.
I had to do something similar a few days ago. I had fixed timespans for work shifts to group by, where one of them could start on one day at 10pm and end the next morning at 6am.
What I did was:
Define a "shift date", which was simply the day the shift started (with a zero time part) for every entry in the table. I did this by checking whether the timestamp of the entry was between 0am and 6am. In that case I took only the date part of DATEADD(dd, -1, entryDate), which returned the previous day for all entries between 0am and 6am.
I also added an ID for the shift: 0 for the first one (6am to 2pm), 1 for the second one (2pm to 10pm) and 2 for the last one (10pm to 6am).
I was then able to group over the shift date and shift IDs.
Example:
Consider the following source entries:
Timestamp SomeData
=============================
2014-09-01 06:01:00 5
2014-09-01 14:01:00 6
2014-09-02 02:00:00 7
Step one extended the table as follows:
Timestamp SomeData ShiftDay
====================================================
2014-09-01 06:01:00 5 2014-09-01 00:00:00
2014-09-01 14:01:00 6 2014-09-01 00:00:00
2014-09-02 02:00:00 7 2014-09-01 00:00:00
Step two extended the table as follows:
Timestamp SomeData ShiftDay ShiftID
==============================================================
2014-09-01 06:01:00 5 2014-09-01 00:00:00 0
2014-09-01 14:01:00 6 2014-09-01 00:00:00 1
2014-09-02 02:00:00 7 2014-09-01 00:00:00 2
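The two steps above could be sketched in Python like this. The function name `shift_key` is made up, and the shift boundaries (6am, 2pm, 10pm) are the ones given in the answer; the sample timestamps are the three rows from the example.

```python
from datetime import datetime, timedelta


def shift_key(ts: datetime):
    """Return (shift_day, shift_id) for a timestamp.
    Shift 0: 06:00-14:00, shift 1: 14:00-22:00, shift 2: 22:00-06:00."""
    if ts.hour < 6:
        # After midnight the night shift still belongs to the previous day.
        return ts.date() - timedelta(days=1), 2
    if ts.hour < 14:
        return ts.date(), 0
    if ts.hour < 22:
        return ts.date(), 1
    return ts.date(), 2


if __name__ == "__main__":
    for text in ("2014-09-01 06:01:00", "2014-09-01 14:01:00",
                 "2014-09-02 02:00:00"):
        ts = datetime.strptime(text, "%Y-%m-%d %H:%M:%S")
        print(ts, shift_key(ts))
```

Grouping on the returned pair then keeps a night shift together even though it spans two calendar days.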
If you add one hour to registrationtime, you will be able to group by the date part:
GROUP BY
CAST(DATEADD(HOUR, 1, registrationtime) AS date)
If the starting hour must be reflected accurately in the output (as 9, 23, 10, 23 rather than as 10, 0, 11, 0), you could obtain it as MIN(registrationtime) in the SELECT clause:
SELECT
count = COUNT(*),
day = DATEPART(DAY, MIN(registrationtime)),
hour = DATEPART(HOUR, MIN(registrationtime))
Finally, in case you are not aware, you can reference columns by their aliases in ORDER BY:
ORDER BY
day,
hour
just so that you do not have to repeat the expressions.
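The add-one-hour trick is easy to check outside the database. Below is a minimal Python sketch with made-up sample timestamps (the function names are hypothetical too): shifting by one hour puts the 23:00 rows and the following 00:00 rows on the same calendar date, and the earliest original timestamp per group gives the real start of the batch.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def batch_date(ts: datetime):
    """Date used as the grouping key: the timestamp moved forward one hour."""
    return (ts + timedelta(hours=1)).date()


def summarise(timestamps):
    """Return {batch_date: (count, earliest original timestamp)}."""
    groups = defaultdict(list)
    for ts in timestamps:
        groups[batch_date(ts)].append(ts)
    return {key: (len(values), min(values)) for key, values in groups.items()}


if __name__ == "__main__":
    sample = [                       # invented rows, two batches
        datetime(2014, 9, 9, 23, 10), datetime(2014, 9, 10, 0, 30),
        datetime(2014, 9, 10, 23, 5), datetime(2014, 9, 11, 0, 45),
    ]
    for key, (count, start) in sorted(summarise(sample).items()):
        print(key, count, start)
```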
The query below will give you what you are expecting.
;WITH CTE AS
(
SELECT COUNT(*) Count, DATEPART(DAY,registrationtime) Day,DATEPART(HOUR,registrationtime) Hour,
RANK() over (partition by DATEPART(HOUR,registrationtime) order by DATEPART(DAY,registrationtime),DATEPART(HOUR,registrationtime)) Batch_ID
FROM RegistrationMessageLogEntry
WHERE registrationtime > '2014-09-01 20:00'
GROUP BY DATEPART(DAY, registrationtime), DATEPART(HOUR,registrationtime)
)
SELECT SUM(COUNT) Count,Batch_ID
FROM CTE
GROUP BY Batch_ID
ORDER BY Batch_ID
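You can write CASE expressions as below, so that the 23:00 rows get the same day and hour as the 00:00 rows that follow them:
CASE WHEN DATEPART(HOUR,registrationtime) = 23
THEN DATEPART(DAY,DATEADD(DAY,1,registrationtime)) -- next day, safe across month ends
ELSE DATEPART(DAY,registrationtime)
END,
CASE WHEN DATEPART(HOUR,registrationtime) = 23
THEN 0
ELSE DATEPART(HOUR,registrationtime)
END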