Getting date difference between consecutive rows in the same group - sql

I have a database with the following data:
Group ID Time
1 1 16:00:00
1 2 16:02:00
1 3 16:03:00
2 4 16:09:00
2 5 16:10:00
2 6 16:14:00
I am trying to find the difference in times between the consecutive rows within each group. Using LAG() and DATEDIFF() (ie. https://stackoverflow.com/a/43055820), right now I have the following result set:
Group ID Difference
1 1 NULL
1 2 00:02:00
1 3 00:01:00
2 4 00:06:00
2 5 00:01:00
2 6 00:04:00
However I need the difference to reset when a new group is reached, as in below. Can anyone advise?
Group ID Difference
1 1 NULL
1 2 00:02:00
1 3 00:01:00
2 4 NULL
2 5 00:01:00
2 6 00:04:00

The code would look something like:
select t.*,
datediff(second, lag(time) over (partition by group order by id), time)
from t;
This returns the difference as a number of seconds, but you seem to know how to convert that to a time representation. You also seem to know that group is not acceptable as a column name, because it is a SQL keyword.
Based on the question, you have put group in the order by clause of the lag(), not the partition by.

Related

count number of records by month over the last five years where record date > select month

I need to show the number of valid inspectors we have by month over the last five years. Inspectors are considered valid when the expiration date on their certification has not yet passed, recorded as the month end date. The below SQL code is text of the query to count valid inspectors for January 2017:
SELECT Count(*) AS RecordCount
FROM dbo_Insp_Type
WHERE (dbo_Insp_Type.CERT_EXP_DTE)>=#2/1/2017#);
Rather than designing 60 queries, one for each month, and compiling the results in a final table (or, err, query) are there other methods I can use that call for less manual input?
From this sample:
Id
CERT_EXP_DTE
1
2022-01-15
2
2022-01-23
3
2022-02-01
4
2022-02-03
5
2022-05-01
6
2022-06-06
7
2022-06-07
8
2022-07-21
9
2022-02-20
10
2021-11-05
11
2021-12-01
12
2021-12-24
this single query:
SELECT
Format([CERT_EXP_DTE],"yyyy/mm") AS YearMonth,
Count(*) AS AllInspectors,
Sum(Abs([CERT_EXP_DTE] >= DateSerial(Year([CERT_EXP_DTE]), Month([CERT_EXP_DTE]), 2))) AS ValidInspectors
FROM
dbo_Insp_Type
GROUP BY
Format([CERT_EXP_DTE],"yyyy/mm");
will return:
YearMonth
AllInspectors
ValidInspectors
2021-11
1
1
2021-12
2
1
2022-01
2
2
2022-02
3
2
2022-05
1
0
2022-06
2
2
2022-07
1
1
ID
Cert_Iss_Dte
Cert_Exp_Dte
1
1/15/2020
1/15/2022
2
1/23/2020
1/23/2022
3
2/1/2020
2/1/2022
4
2/3/2020
2/3/2022
5
5/1/2020
5/1/2022
6
6/6/2020
6/6/2022
7
6/7/2020
6/7/2022
8
7/21/2020
7/21/2022
9
2/20/2020
2/20/2022
10
11/5/2021
11/5/2023
11
12/1/2021
12/1/2023
12
12/24/2021
12/24/2023
A UNION query could calculate a record for each of 50 months but since you want 60, UNION is out.
Or a query with 60 calculated fields using IIf() and Count() referencing a textbox on form for start date:
SELECT Count(IIf(CERT_EXP_DTE>=Forms!formname!tbxDate,1,Null)) AS Dt1,
Count(IIf(CERT_EXP_DTE>=DateAdd("m",1,Forms!formname!tbxDate),1,Null) AS Dt2,
...
FROM dbo_Insp_Type
Using the above data, following is output for Feb and Mar 2022. I did a test with Cert_Iss_Dte included in criteria and it did not make a difference for this sample data.
Dt1
Dt2
10
8
Or a report with 60 textboxes and each calls a DCount() expression with criteria same as used in query.
Or a VBA procedure that writes data to a 'temp' table.

Calculate overlap time in seconds for groups in SQL

I have a bunch of timestamps grouped by ID and type in the sample data shown below.
I would like to find overlapped time between start_time and end_time columns in seconds for each group of ID and between each lead and follower combinations. I would like to show the overlap time only for the first record of each group which will always be the "lead" type.
For example, for the ID 1, the follower's start and end times in row 3 overlap with the lead's in row 1 for 193 seconds (from 09:00:00 to 09:03:13). the follower's times in row 3 also overlap with the lead's in row 2 for 133 seconds (09:01:00 to 2020-05-07 09:03:13). That's a total of 326 seconds (193+133)
I used the partition clause to rank rows by ID and type and order them by start_time as a start.
How do I get the overlap column?
row# ID type start_time end_time rank. overlap
1 1 lead 2020-05-07 09:00:00 2020-05-07 09:03:34 1 326
2 1 lead 2020-05-07 09:01:00 2020-05-07 09:03:13 2
3 1 follower 2020-05-07 08:59:00 2020-05-07 09:03:13 1
4 2 lead 2020-05-07 11:23:00 2020-05-07 11:33:00 1 540
4 2 follower 2020-05-07 11:27:00 2020-05-07 11:32:00 1
5 3 lead 2020-05-07 14:45:00 2020-05-07 15:00:00 1 305
6 3 follower 2020-05-07 14:44:00 2020-05-07 14:44:45 1
7 3 follower 2020-05-07 14:50:00 2020-05-07 14:55:05 2
In your example, the times completely cover the total duration. If this is always true, you can use the following logic:
select id,
(sum(datediff(second, start_time, end_time) -
datediff(second, min(start_time), max(end_time)
) as overlap
from t
group by id;
To add this as an additional column, then either use window functions or join in the result from the above query.
If the overall time has gaps, then the problem is quite a bit more complicated. I would suggest that you ask a new question and set up a db fiddle for the problem.
Tried this a couple of way and got it to work.
I first joined 2 tables with individual records for each type, 'lead' and 'follower' and created a case statement to calculate max start time for each lead and follower start time combination and min end time for each lead and follower end time combination. Stored this in a temp table.
CASE
WHEN lead_table.start_time > follower_table.start_time THEN lead_table.start_time
WHEN lead_table.start_time < follower_table.start_time THEN patient_table.start_time_local
ELSE 0
END as overlap_start_time,
CASE
WHEN follower_table.end_time < lead_table.end_time THEN follower_table.end_time
WHEN follower_table.end_time > lead_table.end_time THEN lead_table.end_time
ELSE 0
END as overlap_end_time
Then created an outer query to lookup the temp table just created to find the difference between start time and end time for each lead and follower combination in seconds
select temp_table.id,
temp_table.overlap_start_time,
temp_table.overlap_end_time,
DATEDIFF_BIG(second,
temp_table.overlap_start_time,
temp_table.overlap_end_time) as overlap_time FROM temp_table

Creating a new calculated column in SQL

Is there a way to find the solution so that I need for 2 days, there are 2 UD's because there are June 24 2 times and for the rest there are single days.
I am showing the expected output here:
Primary key UD Date
-------------------------------------------
1 123 2015-06-24 00:00:00.000
6 456 2015-06-24 00:00:00.000
2 123 2015-06-25 00:00:00.000
3 658 2015-06-26 00:00:00.000
4 598 2015-06-27 00:00:00.000
5 156 2015-06-28 00:00:00.000
No of times Number of days
-----------------------------
4 1
2 2
The logic is 4 users are there who used the application on 1 day and there are 2 userd who used the application on 2 days
You can use two levels of aggregation:
select cnt, count(*)
from (select date, count(*) as cnt
from t
group by date
) d
group by cnt
order by cnt desc;

Insert multiple rows from result of Average by date and id

I have a table with 1 result per day like this :
id | item_id | date | amount
-------------------------------------
1 1 2019-01-01 1
2 1 2019-01-02 2
3 1 2019-01-03 3
4 1 2019-01-04 4
5 1 2019-01-05 5
6 2 2019-01-01 1
7 2 2019-01-01 2
8 2 2019-01-01 3
9 2 2019-01-01 4
10 2 2019-01-01 5
11 3 2019-01-01 1
12 3 2019-01-01 2
13 3 2019-01-01 3
14 3 2019-01-01 4
15 3 2019-01-01 5
First I was trying to average the column amount for each day.
SELECT
x.item_id AS id,avg(x.amount) AS result
FROM
(SELECT
il.item_id, il.amount,
ROW_NUMBER() OVER (PARTITION BY il.item_id ORDER BY il.date DESC) rn
FROM
item_prices il) x
WHERE
x.rn BETWEEN 1 AND 50
GROUP BY
x.item_id
The result is going to be the following if calculated on 2019-01-05
item_id | average
1 3
2 3
3 3
or, if calculated 2019-01-04
item_id | average
1 2.5
2 2.5
3 2.5
My goal is to run the Average query , every day that would update the average automatically and insert it in 5th column "average" :
id | item_id | date | amount | average
5 1 2019-01-05 5 3
10 2 2019-01-05 5 3
15 3 2019-01-05 5 3
Issue is that every example i can find with Insert the Select they only update one row and they are over another table there is also the most recent date issue...
Can someone point me in the right direction?
Perhaps you want to see running average every day. Storing the value as a separate column is bound to cause problems especially when the rows are updated/deleted, the column also needs to be updated and hence will require complex triggers.
Simply create a View and run whenever you want to check the average directly from that View.
CREATE OR REPLACE VIEW v_item_prices AS
SELECT t.*,avg(t.amount) OVER ( PARTITION BY item_id order by date)
AS average FROM item_prices t
order by item_id,date
DEMO

SQL - Datediff between rows with Rank Applied

I am trying to work out how to to apply a datediff between rows where a rank is applied to the USER ID;
Example of how the data below;
UserID Order Number ScanDateStart ScanDateEnd Minute Difference Rank | Minute Difference Rank vs Rank+1
User1 10-24 10:20:00 10:40:00 20 1 | 5
User1 10-25 10:45:00 10:50:00 5 2 | 33
User1 10-26 11:12:00 11:45:00 33 3 | NULL
User2 10-10 00:09:00 00:09:20 20 1 | 4
User2 10-11 00:09:24 00:09:25 1 2 | 15
User2 10-12 00:09:40 00:10:12 32 3 | 3
User2 10-13 00:10:15 00:10:35 20 4 | NULL
What i'm looking for is how to code the final column of this table.
The rank is applied to UserID ordered by ScanDateStart.
Basically, i want to know the time between the ScanDateEnd of Rank 1, to ScanDateStart of Rank2, and so on, but for each user.... (calculating time between order processing etc)
Appreciate the help
This can be achieved by performing a LEFT JOIN to the same table on the UserID column and the Rank column, plus 1.
The following (simplified) pseudo-code should illustrate how to achieve this:
SELECT R.UserID,
R.Rank,
R1.Diff
FROM Rank R
LEFT JOIN Rank R1 ON R1.UserID = R.UserID AND R1.Rank = R.Rank + 1
Effectively, you are showing the UserID and Rank from the current row, but the Difference from the row of the same UserID with the Rank + 1.