Elapsed time between two DateTime values - sql

Say you have a room with an indefinite number of light bulbs that turn on and off at random. Each time a bulb is turned on and then off, a record is entered in a table with TurnedOn and TurnedOff values.
What should the query look like if I am interested in how long (HH.mm.ss) it was possible to see in the room between two DateTime values?
e.g.
LightBulbId  TurnedOn             TurnedOff
1            2022-10-01 06:00:00  2022-10-01 11:00:00
2            2022-10-01 07:00:00  2022-10-01 10:00:00
3            2022-10-01 08:00:00  2022-10-01 09:00:00
4            2022-10-01 12:00:00  2022-10-01 13:00:00
5            2022-10-01 14:00:00  2022-10-01 15:00:00
So for the example above, in the time period between 2022-10-01 06:00:00 and 2022-10-01 15:00:00, 9 hours passed and it was possible to see in the room for 7 of them.
A bulb can be on for more than 24 hours.
One-hour increments are used in the example for simplicity.
If at least one light bulb is on, you can see in the room.
If a light is turned on, you can see in the room starting from that moment, and if a light is turned off, starting from that moment you cannot :-)
Another example with the same logic:
Say you have a machine that more than one person can work on at the same time. A StartTime and an EndTime are added to the table each time a person starts and then stops working on the machine. I am interested in the machine's total working time for a given time period.

select sign(on_off) as on_off
      ,sum(hour_diff) as hours
from
(
    select *
          ,datediff(second, time, lead(time) over(order by time))/3600.0 as hour_diff
          ,sum(case when status = 'TurnedOn' then 1 else -1 end) over(order by time) as on_off
    from t
    unpivot (time for status in(TurnedOn, TurnedOff)) up
) t
group by sign(on_off)
on_off  hours
0       2.000000
1       7.000000
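The idea behind the query above (turn every row into a +1 "on" event and a -1 "off" event, keep a running count, and add up the gaps during which the count is positive) can be sketched outside the database as well. The Python below is only an illustration of that logic, not the answer's code; the names `ts`, `SAMPLE` and `lit_seconds` are made up for this sketch, and clipping to the reporting window is an addition that the SQL above does not do.

```python
from datetime import datetime


def ts(text):
    """Parse a 'YYYY-MM-DD HH:MM:SS' string (hypothetical helper)."""
    return datetime.strptime(text, "%Y-%m-%d %H:%M:%S")


# The sample rows from the question as (TurnedOn, TurnedOff) pairs.
SAMPLE = [
    (ts("2022-10-01 06:00:00"), ts("2022-10-01 11:00:00")),
    (ts("2022-10-01 07:00:00"), ts("2022-10-01 10:00:00")),
    (ts("2022-10-01 08:00:00"), ts("2022-10-01 09:00:00")),
    (ts("2022-10-01 12:00:00"), ts("2022-10-01 13:00:00")),
    (ts("2022-10-01 14:00:00"), ts("2022-10-01 15:00:00")),
]


def lit_seconds(intervals, window_start, window_end):
    """Seconds inside the window during which at least one bulb is on."""
    events = []
    for on, off in intervals:
        # Clip each interval to the window; skip intervals entirely outside it.
        on, off = max(on, window_start), min(off, window_end)
        if on < off:
            events.append((on, 1))    # a bulb turns on
            events.append((off, -1))  # a bulb turns off
    events.sort()

    lit = 0.0
    active = 0        # running count of bulbs that are currently on
    previous = None   # time of the previous event
    for moment, delta in events:
        if active > 0:
            lit += (moment - previous).total_seconds()
        active += delta
        previous = moment
    return lit


hours = lit_seconds(SAMPLE, ts("2022-10-01 06:00:00"), ts("2022-10-01 15:00:00")) / 3600
print(hours)  # 7.0
```

The dark time for the window is then simply the window length minus the lit time (9 hours minus 7 hours in the example).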

To expand on the correct answer above, this gives the results as hours, minutes and seconds, formatted like 07h00m00s:
select
    sign(on_off) as on_off,
    RIGHT('0'+CONVERT(VARCHAR(2),SUM(secs_diff)/3600 ),2)+'h'
    +RIGHT('0'+CONVERT(VARCHAR(2),SUM(secs_diff)/60 %60 ),2)+'m'
    +RIGHT('0'+CONVERT(VARCHAR(2),SUM(secs_diff)%60),2)+'s' as duration
from
(
    select *
          ,datediff(second, time, lead(time) over(order by time)) as secs_diff
          ,sum(case when status = 'TurnedOn' then 1 else -1 end) over(order by time) as on_off
    from #Lights
    unpivot (time for status in(TurnedOn, TurnedOff)) up
) t
group by sign(on_off)
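The arithmetic behind that formatting (integer-divide by 3600 for hours, take the remainder, then split it into minutes and seconds) is sketched below in Python. `format_duration` is a hypothetical name used only here. Unlike the VARCHAR(2) conversion in the query, this sketch does not limit the hours to two digits, which may matter because the question says a bulb can be on for more than 24 hours and totals could reach 100 hours or more.

```python
def format_duration(total_seconds):
    """Format a number of seconds as e.g. '07h00m00s'."""
    hours, remainder = divmod(int(total_seconds), 3600)
    minutes, seconds = divmod(remainder, 60)
    # Pad to at least two digits; larger hour values are printed in full.
    return f"{hours:02d}h{minutes:02d}m{seconds:02d}s"


print(format_duration(7 * 3600))  # 07h00m00s
print(format_duration(3661))      # 01h01m01s
```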

Related

SUM of production counts for "overnight work shift" in MS SQL (2019)

I need some help regarding sum of production count for overnight shifts.
The table just contains a timestamp (generated automatically by SQL Server during INSERT), the number of OK pieces and the number of NOT OK pieces produced at that timestamp.
CREATE TABLE [machine1](
[timestamp] [datetime] NOT NULL,
[OK] [int] NOT NULL,
[NOK] [int] NOT NULL
)
ALTER TABLE [machine1] ADD DEFAULT (getdate()) FOR [timestamp]
The table holds values like these (just an example; there are hundreds of rows each day and the timestamps do not fall at fixed intervals such as every hour or every 30 minutes):
timestamp                OK   NOK
2022-08-01 05:30:00.000  15   1
2022-08-01 06:30:00.000  18   3
...                      ...  ...
2022-08-01 21:30:00.000  10   12
2022-08-01 22:30:00.000  0    3
...                      ...  ...
2022-08-01 23:59:00.000  1    2
2022-08-02 00:01:00.000  7    0
...                      ...  ...
2022-08-02 05:30:00.000  12   4
2022-08-02 06:30:00.000  9    3
The production works in shifts like so:
morning shift: 6:00 -> 14:00
afternoon shift: 14:00 -> 22:00
night shift: 22:00 -> 6:00 the next day
I have managed to get sums for the morning and afternoon shifts without issues but I can't figure out how to do the sum for the night shift (I have these SELECTs for each shift stored as a VIEW for easy access).
For the morning shift:
SELECT CAST(timestamp AS date) AS Morning,
SUM(OK) AS SUM_OK,
SUM(NOK) AS SUM_NOK
FROM [machine1]
WHERE DATEPART(hh,timestamp) >= 6 AND DATEPART(hh,timestamp) < 14
GROUP BY CAST(timestamp AS date)
ORDER BY Morning ASC
For the afternoon shift:
SELECT CAST(timestamp AS date) AS Afternoon,
SUM(OK) AS SUM_OK,
SUM(NOK) AS SUM_NOK
FROM [machine1]
WHERE DATEPART(hh,timestamp) >= 14 AND DATEPART(hh,timestamp) < 22
GROUP BY CAST(timestamp AS date)
ORDER BY Afternoon ASC
Since we identify the date of each shift by its start, my idea is that the result for the night shift SUM would be:
Night       SUM_OK  SUM_NOK
2022-08-01  xxx     xxx      for interval 2022-08-01 22:00:00.000 -> 2022-08-02 05:59:59.999
2022-08-02  xxx     xxx      for interval 2022-08-02 22:00:00.000 -> 2022-08-03 05:59:59.999
2022-08-03  xxx     xxx      for interval 2022-08-03 22:00:00.000 -> 2022-08-04 05:59:59.999
2022-08-04  xxx     xxx      for interval 2022-08-04 22:00:00.000 -> 2022-08-05 05:59:59.999
...         ...     ...
After a few days of trial and error I have probably managed to find the needed solution. Using a subquery, I shift all the times in the range 00:00:00 -> 05:59:59 to the previous day and then use that result with the same approach as for the morning and afternoon shifts (because now all the production data from the night shift falls on the same date, between 22:00:00 and 23:59:59).
In case anyone needs it in the future:
SELECT
    CAST(nightShift.shiftedTime AS date) AS Night,
    SUM(nightShift.OK) AS SUM_OK,
    SUM(nightShift.NOK) AS SUM_NOK
FROM
    (SELECT
        -- Move 00:00-05:59 rows into 22:00-23:59 of the previous day;
        -- 22:00-23:59 rows already carry the start date of their shift.
        CASE WHEN (DATEPART(hh, [timestamp]) < 6 AND DATEPART(hh, [timestamp]) >= 4) THEN DATEADD(HOUR, -6, [timestamp])
             WHEN (DATEPART(hh, [timestamp]) < 4 AND DATEPART(hh, [timestamp]) >= 2) THEN DATEADD(HOUR, -4, [timestamp])
             WHEN (DATEPART(hh, [timestamp]) < 2 AND DATEPART(hh, [timestamp]) >= 0) THEN DATEADD(HOUR, -2, [timestamp])
             ELSE [timestamp]
        END AS shiftedTime,
        [OK],
        [NOK]
    FROM [machine1]
    WHERE DATEPART(hh, [timestamp]) < 6 OR DATEPART(hh, [timestamp]) >= 22) nightShift
WHERE DATEPART(hh, nightShift.shiftedTime) >= 22
GROUP BY CAST(nightShift.shiftedTime AS date)
ORDER BY Night ASC
PS: If there is anything wrong with this approach, please feel free to correct me, as I'm just a newbie in SQL. So far this seems to do exactly what I needed.
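One way to sanity-check the night-shift grouping is to reproduce the bucketing outside SQL. The Python sketch below is an assumption-laden illustration rather than the poster's method: it keeps only rows from 22:00 to 05:59 and subtracts a flat 6 hours before taking the date, which puts every row of one night shift on the shift's start date. `night_shift_totals` is a hypothetical name, and the rows are the visible sample rows from the question.

```python
from collections import defaultdict
from datetime import date, datetime, timedelta


def night_shift_totals(rows):
    """Sum (OK, NOK) per night shift, keyed by the date the shift started."""
    totals = defaultdict(lambda: [0, 0])
    for stamp, ok, nok in rows:
        if stamp.hour >= 22 or stamp.hour < 6:
            # 22:00-23:59 stays on the same date, 00:00-05:59 moves to the previous one.
            shift_date = (stamp - timedelta(hours=6)).date()
            totals[shift_date][0] += ok
            totals[shift_date][1] += nok
    return {day: tuple(counts) for day, counts in totals.items()}


rows = [
    (datetime(2022, 8, 1, 5, 30), 15, 1),
    (datetime(2022, 8, 1, 6, 30), 18, 3),
    (datetime(2022, 8, 1, 21, 30), 10, 12),
    (datetime(2022, 8, 1, 22, 30), 0, 3),
    (datetime(2022, 8, 1, 23, 59), 1, 2),
    (datetime(2022, 8, 2, 0, 1), 7, 0),
    (datetime(2022, 8, 2, 5, 30), 12, 4),
    (datetime(2022, 8, 2, 6, 30), 9, 3),
]
result = night_shift_totals(rows)
print(result[date(2022, 8, 1)])  # (20, 9)
```

The 05:30 row on 2022-08-01 ends up under 2022-07-31, because it belongs to the shift that started the evening before.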

How can I create a new Count column in a SQL table where Count = 1 if the Hours column is >= 6, else Count = 0

I aim to first achieve this:
id  employee  Datelog     TimeIn    TimeOut   Hours     Count
5   Two       2022-08-10  09:00:00  16:00:00  07:00:00  1
4   Two       2022-08-09  09:00:00  16:00:00  07:00:00  1
3   Two       2022-08-08  09:00:00  16:00:00  07:00:00  1
2   One       2022-08-05  09:00:00  16:00:00  07:00:00  1
1   Two       2022-08-04  09:00:00  10:00:00  01:00:00  0
My main objective is then to give a bonus of 2k to employees whose TotalCount per month is >= 3.
employee  Month   TotalCount  Bonus
Two       August  3           2000
One       August  1           0
Here's an answer using Postgres. It's pretty much generic, other than extracting the month from datelog, which might have slightly different syntax elsewhere. The month is part of the GROUP BY so that each month is counted separately.
select employee
      ,date_part('month', datelog) as month
      ,count(*)
      ,case when count(*) >= 3 then 2000 else 0 end as bonus
from t
where hours >= time '06:00:00'
group by employee, date_part('month', datelog)
employee  month  count  bonus
Two       8      3      2000
One       8      1      0
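The counting rule can also be checked with a small Python sketch. This is an illustration under assumptions, not the answer's code: `monthly_bonus` and its parameters are hypothetical, hours are represented as `timedelta` values, and the grouping key includes the year so that the same month in different years is not merged. Unlike the query's WHERE clause, it also lists employees with zero qualifying days.

```python
from collections import Counter
from datetime import date, timedelta


def monthly_bonus(rows, min_hours=6, min_days=3, bonus=2000):
    """Map (employee, year, month) to (qualifying_days, bonus)."""
    counts = Counter()
    for employee, datelog, hours in rows:
        key = (employee, datelog.year, datelog.month)
        # Count = 1 for a day with at least min_hours worked, else 0.
        counts[key] += 1 if hours >= timedelta(hours=min_hours) else 0
    return {key: (n, bonus if n >= min_days else 0) for key, n in counts.items()}


rows = [
    ("Two", date(2022, 8, 10), timedelta(hours=7)),
    ("Two", date(2022, 8, 9), timedelta(hours=7)),
    ("Two", date(2022, 8, 8), timedelta(hours=7)),
    ("One", date(2022, 8, 5), timedelta(hours=7)),
    ("Two", date(2022, 8, 4), timedelta(hours=1)),
]
print(monthly_bonus(rows))
# {('Two', 2022, 8): (3, 2000), ('One', 2022, 8): (1, 0)}
```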

SQL: Calculate duration based on dates and parameters (Change Log)

I have a dataset that is like a ticketing system change log. I am trying to calculate something like an SLA time across the records: how long did a specific ticket sit with Group 2 before it was either resolved or moved to another group to resolve?
The data looks like so:
ID  field             value             start                end
1   assignment_group  Group 1           2022-03-21 08:00:00  2022-03-21 08:05:00
1   incident_state    Work in Progress  2022-03-21 08:05:00  2022-03-21 08:30:00
1   assignment_group  Group 2           2022-03-21 08:35:00  2022-03-21 08:50:00
1   assigned_to       User 1            2022-03-21 08:50:00  2022-03-21 08:51:00
1   incident_state    Work in Progress  2022-03-21 09:00:00  2022-03-21 09:30:00
1   incident_state    Resolved          2022-03-21 09:30:00  2022-03-21 09:31:00
2   assignment_group  Group 2           2022-01-21 11:30:00  2022-01-21 11:35:00
2   assigned_to       User 1            2022-01-21 11:35:00  2022-01-21 11:37:00
2   incident_state    Work in Progress  2022-01-21 11:40:00  2022-01-21 11:55:00
2   assignment_group  Group 3           2022-01-21 11:58:00  2022-01-21 12:00:00
2   assigned_to       User 2            2022-01-21 12:05:00  2022-01-21 12:06:00
2   incident_state    Resolved          2022-01-21 12:10:00  2022-01-21 12:07:00
The issue I am having is calculating the duration from the time the ticket was assigned to a specific group until the ticket was either resolved by that group or moved to another group to resolve. For example, I am only interested in Group 2. For ticket 1, the period from when the ticket was sitting with Group 2 until it was resolved by Group 2 is 2022-03-21 08:35:00 to 2022-03-21 09:31:00, so the duration is 1 hour and 1 minute. But for ticket 2, the ticket sat with Group 2 from 2022-01-21 11:30:00 until it was transferred to another group to resolve at 2022-01-21 11:58:00.
My code looks like this at the moment. I join two tables to pull in the ticket information and then the ticket state changes (every time an action is taken on that ticket), which leaves me with the table above. I guess I need to use a lead function, but I can't figure out how to get the correct end time for the correct record (when incident_state = Resolved OR assignment_group = another group):
WITH incidents as
(
SELECT number, sys_id AS SYS_ID_INCIDENT
FROM tables.ServiceIncidents
),
changes as
(
SELECT id, start, field, field_value, value, `end`
FROM tables.IncidentInstances
),
incident_changes as
(
SELECT *, TIMESTAMP_DIFF(changes.`end`,changes.start, MINUTE) as Duration, row_number() over (partition by number order by start) as RN
FROM incidents
LEFT JOIN changes
ON (incidents.SYS_ID_INCIDENT = changes.id)
),
IAMtickets as
(
SELECT i.number, i.SYS_ID_INCIDENT, start, field, value, `end`, Duration, RN
FROM incident_changes i
INNER JOIN
(SELECT DISTINCT number FROM incident_changes WHERE value ='Group 2') r
ON i.number = r.number
),
cte as
(
SELECT *, lead(value) over (partition by IAMtickets.number order by start)
FROM IAMtickets
)
SELECT * FROM CTE
I want the output to be something like this:
ID  Assigned to  Duration         Outcome
1   User 1       1 hour 1 minute  Resolved
2   User 1       28 minutes       Transferred
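The rule described in the question (start the clock at the row that assigns the ticket to the group, stop at the end of the first later "Resolved" row or at the start of the first later row that assigns another group) can be prototyped in Python before translating it to window functions. This is only a sketch under those assumptions: `group_duration`, `minutes_between` and `ts` are hypothetical names, only the first stint with the group is considered, and tickets that are still open are left out. For the sample rows as listed, the ticket 1 interval 08:35 to 09:31 comes out as 56 minutes and ticket 2 as 28 minutes.

```python
from datetime import datetime


def ts(text):
    """Parse a 'YYYY-MM-DD HH:MM:SS' string (hypothetical helper)."""
    return datetime.strptime(text, "%Y-%m-%d %H:%M:%S")


def minutes_between(begin, finish):
    return int((finish - begin).total_seconds() // 60)


def group_duration(rows, group):
    """rows: (ticket_id, field, value, start, end). Returns {id: (assignee, minutes, outcome)}."""
    by_ticket = {}
    for row in sorted(rows, key=lambda r: (r[0], r[3])):
        by_ticket.setdefault(row[0], []).append(row)

    results = {}
    for ticket_id, changes in by_ticket.items():
        began = None      # when the ticket reached the group
        assignee = None   # first assigned_to value seen after that
        for _, field, value, start, end in changes:
            if began is None:
                if field == "assignment_group" and value == group:
                    began = start
                continue
            if field == "assigned_to" and assignee is None:
                assignee = value
            elif field == "incident_state" and value == "Resolved":
                results[ticket_id] = (assignee, minutes_between(began, end), "Resolved")
                break
            elif field == "assignment_group" and value != group:
                results[ticket_id] = (assignee, minutes_between(began, start), "Transferred")
                break
    return results


ROWS = [
    (1, "assignment_group", "Group 1", ts("2022-03-21 08:00:00"), ts("2022-03-21 08:05:00")),
    (1, "incident_state", "Work in Progress", ts("2022-03-21 08:05:00"), ts("2022-03-21 08:30:00")),
    (1, "assignment_group", "Group 2", ts("2022-03-21 08:35:00"), ts("2022-03-21 08:50:00")),
    (1, "assigned_to", "User 1", ts("2022-03-21 08:50:00"), ts("2022-03-21 08:51:00")),
    (1, "incident_state", "Work in Progress", ts("2022-03-21 09:00:00"), ts("2022-03-21 09:30:00")),
    (1, "incident_state", "Resolved", ts("2022-03-21 09:30:00"), ts("2022-03-21 09:31:00")),
    (2, "assignment_group", "Group 2", ts("2022-01-21 11:30:00"), ts("2022-01-21 11:35:00")),
    (2, "assigned_to", "User 1", ts("2022-01-21 11:35:00"), ts("2022-01-21 11:37:00")),
    (2, "incident_state", "Work in Progress", ts("2022-01-21 11:40:00"), ts("2022-01-21 11:55:00")),
    (2, "assignment_group", "Group 3", ts("2022-01-21 11:58:00"), ts("2022-01-21 12:00:00")),
    (2, "assigned_to", "User 2", ts("2022-01-21 12:05:00"), ts("2022-01-21 12:06:00")),
    (2, "incident_state", "Resolved", ts("2022-01-21 12:10:00"), ts("2022-01-21 12:07:00")),
]
print(group_duration(ROWS, "Group 2"))
# {1: ('User 1', 56, 'Resolved'), 2: ('User 1', 28, 'Transferred')}
```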

(BigQuery) How to calculate the number of hours an event lasts across multiple dates

So my data looks like this:
DATE TEMPERATURE
2012-01-13 23:15:00 UTC 0
2012-01-14 01:35:00 UTC 5
2012-01-14 02:15:00 UTC 6
2012-01-14 03:15:00 UTC 8
2012-01-14 04:15:00 UTC 0
2012-01-14 04:55:00 UTC 0
2012-01-14 05:15:00 UTC -2
2012-01-14 05:35:00 UTC 0
I am trying to calculate the amount of time a zip code temperature will drop to 0 or below on any given day. On the 13th, it only happens for a very short amount of time so we don't really care. I want to know how to calculate the number of minutes this happens on the 14th, since it looks like a significantly (and consistently) cold day.
I want the query to add two more columns.
The first added column would be the time difference between consecutive rows on a given date. So row 3 - row 2 = 40 mins and row 4 - row 3 = 60 mins.
The second column would be a running total, for the whole day, of the minutes during which the temperature was 0 or below. Here rows 2-4 would be ignored. For rows 5-8, the total time that the temperature was 0 or below would be about 90 mins.
It should end up looking like this:
DATE                     TEMPERATURE  MINUTES_DIFFERENCE  TOTAL_MINUTES
2012-01-13 23:15:00 UTC  0            0                   0
2012-01-14 01:35:00 UTC  5            140                 0
2012-01-14 02:15:00 UTC  6            40                  0
2012-01-14 03:15:00 UTC  8            60                  0
2012-01-14 04:15:00 UTC  0            60                  60
2012-01-14 04:55:00 UTC  0            40                  100
2012-01-14 05:15:00 UTC  -2           20                  120
2012-01-14 05:35:00 UTC  0            20                  140
Use the query below:
select *,
  sum(minutes_difference) over(order by date) total_minutes
from (
  select *,
    ifnull(timestamp_diff(timestamp(date), lag(timestamp(date)) over(order by date), minute), 0) as minutes_difference
  from your_table
)
if applied to sample data in your question - output is
Update, to answer the updated question:
select * except(new_grp, grp),
  sum(if(temperature > 0, 0, minutes_difference)) over(partition by grp order by date) total_minutes
from (
  select *, countif(new_grp) over(order by date) as grp
  from (
    select *,
      ifnull(timestamp_diff(timestamp(date), lag(timestamp(date)) over(order by date), minute), 0) as minutes_difference,
      ifnull(((temperature <= 0) and (lag(temperature) over(order by date) > 0)) or
        ((temperature > 0) and (lag(temperature) over(order by date) <= 0)), true) as new_grp
    from your_table
  )
)
with output
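The grouping logic of the updated query (start a new group whenever the temperature crosses between "above 0" and "0 or below", then keep a running sum of the minute differences inside cold groups only) can be traced by hand with a short Python sketch. `cold_minutes` is a hypothetical name and the code is an illustration of the idea, not a translation that BigQuery would accept. Note that the gap from 04:15 to 04:55 is 40 minutes, so for the sample readings the running totals come out as 60, 100, 120 and 140.

```python
from datetime import datetime


def cold_minutes(readings):
    """readings: (timestamp, temperature) pairs sorted by timestamp.

    Returns a list of (minutes_difference, total_minutes) per reading.
    """
    out = []
    previous_time = None
    previous_cold = None
    running = 0
    for moment, temperature in readings:
        if previous_time is None:
            diff = 0
        else:
            diff = int((moment - previous_time).total_seconds() // 60)
        cold = temperature <= 0
        if cold != previous_cold:
            running = 0  # a new warm or cold streak starts at this row
        if cold:
            running += diff
        out.append((diff, running))
        previous_time, previous_cold = moment, cold
    return out


readings = [
    (datetime(2012, 1, 13, 23, 15), 0),
    (datetime(2012, 1, 14, 1, 35), 5),
    (datetime(2012, 1, 14, 2, 15), 6),
    (datetime(2012, 1, 14, 3, 15), 8),
    (datetime(2012, 1, 14, 4, 15), 0),
    (datetime(2012, 1, 14, 4, 55), 0),
    (datetime(2012, 1, 14, 5, 15), -2),
    (datetime(2012, 1, 14, 5, 35), 0),
]
for row in cold_minutes(readings):
    print(row)
```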

Count consecutive recurring values

I am struggling to find any info on this on the internet after a couple of hours of searching, trial, error and failure. We have the following table structure:
Name  EventDateTime        Mark
Dave  2021-03-24 09:00:00  Present
Dave  2021-03-24 14:00:00  Absent
Dave  2021-03-25 09:00:00  Absent
Dave  2021-03-26 09:00:00  Absent
Dave  2021-03-27 09:00:00  Present
Dave  2021-03-27 14:00:00  Absent
Dave  2021-03-28 09:00:00  Absent
Dave  2021-03-29 10:00:00  Absent
Dave  2021-03-30 13:00:00  Absent
Jane  2021-03-30 13:00:00  Absent
Basically, these are attendance registers for people at events. We need to pull a report to see who we have not had contact from for more than x consecutive days. "Consecutive" refers to the days on which they have events in the data, not consecutive calendar days. Also, if there is a Present on one of the days where they were also absent, the count needs to start again from the next day they were absent.
The first issue I've got is getting distinct dates where there are only absences, then the 2nd is getting the number of consecutive days of absences - I've done the 2nd in MySQL with variables but struggled to migrate this over to PostgreSQL where the reporting is done from.
An example of the output I'd want is:
Name  EventDateTime        Mark     ConsecCount
Dave  2021-03-24 09:00:00  Present  0
Dave  2021-03-24 14:00:00  Absent   0
Dave  2021-03-25 09:00:00  Absent   1
Dave  2021-03-26 09:00:00  Absent   2
Dave  2021-03-27 09:00:00  Present  0
Dave  2021-03-27 14:00:00  Absent   0
Dave  2021-03-28 09:00:00  Absent   1
Dave  2021-03-29 10:00:00  Absent   2
Dave  2021-03-30 13:00:00  Absent   3
Jane  2021-03-30 13:00:00  Absent   0
This table is currently at 639931 records and they have been generated since 1st October and will continue to grow at this rate.
Any help or advice on where to start would be great.
This can be achieved using window functions as follows:
WITH with_row_numbers AS (
    SELECT
        *,
        ROW_NUMBER() OVER (PARTITION BY Name ORDER BY EventDateTime) AS this_row_number,
        (CASE WHEN Mark = 'Present' THEN ROW_NUMBER() OVER (PARTITION BY Name ORDER BY EventDateTime) ELSE 0 END) AS row_number_if_present
    FROM events
)
SELECT
    Name,
    EventDateTime,
    Mark,
    GREATEST(0, this_row_number - MAX(row_number_if_present) OVER (PARTITION BY Name ORDER BY EventDateTime) - 1) AS ConsecCount
FROM with_row_numbers
Original answer with LATERAL join:
WITH with_row_numbers AS (
    SELECT *, ROW_NUMBER() OVER (PARTITION BY Name ORDER BY EventDateTime)
    FROM events e
)
SELECT
    t1.Name,
    t1.EventDateTime,
    t1.Mark,
    GREATEST(0, t1.ROW_NUMBER - COALESCE(sub.prev_present_row_number, 0) - 1) AS ConsecCount
FROM with_row_numbers AS t1
CROSS JOIN LATERAL (
    SELECT MAX(row_number) AS prev_present_row_number
    FROM with_row_numbers t2
    WHERE t2.Name = t1.Name
    AND t2.EventDateTime <= t1.EventDateTime
    AND t2.Mark = 'Present'
) sub
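Both queries rely on the same arithmetic: number each person's rows in time order, remember the row number of the most recent Present (0 if there is none yet), and take max(0, row number - last Present row number - 1). The Python sketch below walks through that calculation on the sample data; `consecutive_absences` is a hypothetical name, and timestamps are kept as ISO-formatted strings on the assumption that they sort chronologically in that form.

```python
def consecutive_absences(events):
    """events: (name, event_datetime, mark) tuples in any order.

    Returns (name, event_datetime, mark, consec_count) sorted by name, then time.
    """
    result = []
    last_name = None
    row_number = 0
    last_present = 0
    for name, moment, mark in sorted(events, key=lambda e: (e[0], e[1])):
        if name != last_name:
            # New person: restart the numbering, like PARTITION BY Name.
            row_number = 0
            last_present = 0
            last_name = name
        row_number += 1
        if mark == "Present":
            last_present = row_number
        result.append((name, moment, mark, max(0, row_number - last_present - 1)))
    return result


events = [
    ("Dave", "2021-03-24 09:00:00", "Present"),
    ("Dave", "2021-03-24 14:00:00", "Absent"),
    ("Dave", "2021-03-25 09:00:00", "Absent"),
    ("Dave", "2021-03-26 09:00:00", "Absent"),
    ("Dave", "2021-03-27 09:00:00", "Present"),
    ("Dave", "2021-03-27 14:00:00", "Absent"),
    ("Dave", "2021-03-28 09:00:00", "Absent"),
    ("Dave", "2021-03-29 10:00:00", "Absent"),
    ("Dave", "2021-03-30 13:00:00", "Absent"),
    ("Jane", "2021-03-30 13:00:00", "Absent"),
]
print([row[3] for row in consecutive_absences(events)])
# [0, 0, 1, 2, 0, 0, 1, 2, 3, 0]
```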