SQL Server: query to get the data between two values from the same column and calculate the time difference

I have a requirement to get the number of hours between two values, say 20 and 25 or above (these will be user input values, not fixed). Below is the table with sample data.
Consider the table: on 01-09-2016 08:40 Value_ID is 25, and it drops back to 20 on 02-09-2016 13:20. I need the number of hours between these two points, i.e. 28 hours and 40 minutes. Similarly, on 04-09-2016 19:20 it reached 26.3 (which is above 25) and on 06-09-2016 16:20 it reached 19.3 (below 20), so the number of hours is 45. I tried creating a function, but it's not working.
CODE TO CREATE TABLE:
CREATE TABLE [dbo].[NumOfHrs](
[ID] [float] NULL,
[Date] [datetime] NULL,
[Value_ID] [float] NULL
) ON [PRIMARY]
CODE to insert data :
INSERT INTO [dbo].[NumOfHrs]
([ID]
,[Date]
,[Value_ID])
VALUES
(112233,'8-31-2016 08:20:00',19.2),
(112233,'9-01-2016 08:30:00',24),
(112233,'9-01-2016 08:40:00',25),
(112233,'9-01-2016 09:20:00',26),
(112233,'9-02-2016 10:20:00',27),
(112233,'9-02-2016 10:20:00',24),
(112233,'9-02-2016 10:20:00',23),
(112233,'9-02-2016 11:20:00',22),
(112233,'9-02-2016 12:20:00',21),
(112233,'9-02-2016 13:20:00',20),
(112233,'9-03-2016 13:20:00',19.8),
(112233,'9-04-2016 13:20:00',21),
(112233,'9-04-2016 14:20:00',24),
(112233,'9-04-2016 16:20:00',24.6),
(112233,'9-04-2016 19:20:00',26.3),
(112233,'9-04-2016 23:20:00',27),
(112233,'9-05-2016 00:20:00',22),
(112233,'9-06-2016 16:20:00',19.3),
(112233,'9-07-2016 00:20:00',22),
(112233,'9-08-2016 00:20:00',21),
(112233,'9-09-2016 00:20:00',23),
(445566,'9-10-2016 00:20:00',24),
(445566,'9-11-2016 00:20:00',25),
(445566,'9-12-2016 00:20:00',26),
(445566,'9-13-2016 00:20:00',24),
(445566,'9-14-2016 00:20:00',23),
(445566,'9-15-2016 00:20:00',24),
(445566,'9-16-2016 00:20:00',21),
(445566,'9-17-2016 00:20:00',20),
(445566,'9-18-2016 00:20:00',18.5),
(445566,'9-19-2016 00:20:00',17)

Well, I couldn't think of anything simpler. Here's my try to solve the problem:
;with NumOfHrs_rn as (
select id, [Date], Value_ID,
row_number() over (partition by id order by [date]) AS rn
from [dbo].[NumOfHrs]
), NumOfHrs_lag as (
select t1.id, t1.[date],
t2.Value_ID as prev_value,
t1.Value_ID as curr_value
from NumOfHrs_rn as t1
-- get previous value (lag)
join NumOfHrs_rn as t2 on t1.id = t2.id and t1.rn = t2.rn + 1
), NumOfHrs_flag as (
select id, [Date], prev_value, curr_value,
case
when curr_value >= 25 and prev_value < 25 then 'start'
when curr_value <= 20 and prev_value > 20 then 'stop'
else 'ignore'
end as flag
from NumOfHrs_lag
), NumOfHrs_grp as (
select id, [Date], curr_value, flag,
row_number() over (partition by id order by [Date]) -
case flag
when 'start' then 0
when 'stop' then 1
end as grp
from NumOfHrs_flag
where flag in ('start', 'stop')
)
select min([Date]) AS 'start', max([Date]) as 'stop'
from NumOfHrs_grp
group by id, grp
order by min([Date])
Output:
start stop
------------------------------------------------
2016-09-01 08:40:00.000 2016-09-02 13:20:00.000
2016-09-04 19:20:00.000 2016-09-06 16:20:00.000
2016-09-11 00:20:00.000 2016-09-17 00:20:00.000
You can manipulate the above query in order to get the time difference expressed in hours/minutes/seconds format.
Demo here
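As a minimal sketch of that last step, the start/stop pairs from the output above can be turned into hours and minutes with DATEDIFF. The table variable @Intervals and its column names are hypothetical, used only to keep the example self-contained; the same two expressions could instead be added to the final SELECT of the query above.

```sql
-- Hypothetical table variable holding the start/stop pairs from the output above
DECLARE @Intervals TABLE (StartDate datetime NOT NULL, StopDate datetime NOT NULL);

INSERT INTO @Intervals (StartDate, StopDate)
VALUES ('2016-09-01T08:40:00', '2016-09-02T13:20:00'),
       ('2016-09-04T19:20:00', '2016-09-06T16:20:00'),
       ('2016-09-11T00:20:00', '2016-09-17T00:20:00');

-- Take the difference in minutes once, then split it into whole hours and leftover minutes
SELECT StartDate,
       StopDate,
       DATEDIFF(MINUTE, StartDate, StopDate) / 60 AS [Hours],
       DATEDIFF(MINUTE, StartDate, StopDate) % 60 AS [Minutes]
FROM @Intervals
ORDER BY StartDate;
```

For the three intervals shown, this gives 28 hours 40 minutes, 45 hours 0 minutes, and 144 hours 0 minutes.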

Related

Collapse multiple rows into a single row based upon a break condition

I have a simple-sounding requirement that has had me stumped for a day or so now, so it's time to seek help from the experts.
My requirement is to simply roll-up multiple rows into a single row based upon a break condition - when any of these columns change Employee ID, Allowance Plan, Allowance Amount or To Date, then the row is to be kept, if that makes sense.
An example source data set is shown below:
and the target data after collapsing the rows should look like this:
As you can see, I don't need any running totals calculated; I just need to collapse the rows into a single record per from date/to date combination.
So far I have tried the following SQL using a GROUP BY and MIN function
select [Employee ID], [Allowance Plan],
min([From Date]), max([To Date]), [Allowance Amount]
from [dbo].[#AllowInfo]
group by [Employee ID], [Allowance Plan], [Allowance Amount]
but that just gives me a single row and does not take into account the break condition.
What do I need to do so that the records are rolled up (correct me if that is not the right terminology) correctly, taking the break condition into account?
Any help is appreciated.
Thank you.
Note that your test data does not really exercise the algo that well - e.g. you only have one employee, one plan. Also, as you described it, you would end up with 4 rows as there is a change of todate between 7->8, 8->9, 9->10 and 10->11.
But I can see what you are trying to do, so this should at least get you on the right track, and returns the expected 3 rows. I have taken the end of a group to be where either employee/plan/amount has changed, or where todate is not null (or where we reach the end of the data)
CREATE TABLE #data
(
RowID INT,
EmployeeID INT,
AllowancePlan VARCHAR(30),
FromDate DATE,
ToDate DATE,
AllowanceAmount DECIMAL(12,2)
);
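-- the literals below are dd/mm/yyyy, so make that explicit for the conversion to DATE
SET DATEFORMAT dmy;
INSERT INTO #data(RowID, EmployeeID, AllowancePlan, FromDate, ToDate, AllowanceAmount)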
VALUES
(1,200690,'CarAllowance','30/03/2017', NULL, 1000.0),
(2,200690,'CarAllowance','01/08/2017', NULL, 1000.0),
(6,200690,'CarAllowance','23/04/2018', NULL, 1000.0),
(7,200690,'CarAllowance','30/03/2018', NULL, 1000.0),
(8,200690,'CarAllowance','21/06/2018', '01/04/2019', 1000.0),
(9,200690,'CarAllowance','04/11/2021', NULL, 1000.0),
(10,200690,'CarAllowance','30/03/2017', '13/05/2022', 1000.0),
(11,200690,'CarAllowance','14/05/2022', NULL, 850.0);
-- find where the break points are
WITH chg AS
(
SELECT *,
CASE WHEN LAG(EmployeeID, 1, -1) OVER(ORDER BY RowID) != EmployeeID
OR LAG(AllowancePlan, 1, 'X') OVER(ORDER BY RowID) != AllowancePlan
OR LAG(AllowanceAmount, 1, -1) OVER(ORDER BY RowID) != AllowanceAmount
OR LAG(ToDate, 1) OVER(ORDER BY RowID) IS NOT NULL
THEN 1 ELSE 0 END AS NewGroup
FROM #data
),
-- count the number of break points as we go to group the related rows
grp AS
(
SELECT chg.*,
ISNULL(
SUM(NewGroup)
OVER (ORDER BY RowID
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW),
0) AS grpNum
FROM chg
)
SELECT MIN(grp.RowID) AS RowID,
MAX(grp.EmployeeID) AS EmployeeID,
MAX(grp.AllowancePlan) AS AllowancePlan,
MIN(grp.FromDate) AS FromDate,
MAX(grp.ToDate) AS ToDate,
MAX(grp.AllowanceAmount) AS AllowanceAmount
FROM grp
GROUP BY grpNum
One way is to give every row with a NULL ToDate the next non-NULL ToDate, and then group on that:
select min(t.RowID) as RowID,
t.EmployeeID,
min(t.AllowancePlan) as AllowancePlan,
min(t.FromDate) as FromDate,
max(t.ToDate) as ToDate,
min(t.AllowanceAmount) as AllowanceAmount
from ( select t.RowID,
t.EmployeeID,
t.FromDate,
t.AllowancePlan,
t.AllowanceAmount,
case when t.ToDate is null then ( select top 1 t2.ToDate
from test t2
where t2.EmployeeID = t.EmployeeID
and t2.ToDate is not null
and t2.FromDate > t.FromDate -- t2.RowID > t.RowID
order by t2.RowID, t2.FromDate
)
else t.ToDate
end as todate
from test t
) t
group by t.EmployeeID, t.ToDate
order by t.EmployeeID, min(t.RowID)
See and test yourself in this DBFiddle
The result is:
RowID  EmployeeID  AllowancePlan  FromDate    ToDate      AllowanceAmount
-------------------------------------------------------------------------
1      200690      CarAllowance   2017-03-30  2019-04-01  1000
9      200690      CarAllowance   2021-11-04  2022-05-13  1000
11     200690      CarAllowance   2022-05-14  (null)      850

Select each distinct value over time without losing NULLs in between

Let's assume a table has many columns and a Temporal table is logging its history. There is one field I need to know when it changes.
Number  VersionStartDate
---------------------------
991281  2021-11-12 08:27:11
991281  2021-11-12 08:20:11
NULL    2021-11-12 07:20:11
NULL    2021-11-12 06:20:11
771281  2021-11-11 08:26:11
NULL    2021-11-11 08:25:11
661281  2021-11-10 08:24:11
NULL    2021-11-10 08:22:11
661281  2021-11-10 08:21:11
551281  2021-11-09 08:20:11
I need to get each value, and the moment it changed. I also need to know if it's been set NULL so this query is not giving what I need.
SELECT
Number,
MIN(VersionStartDate) [Date]
FROM _TABLE_
GROUP BY
Number
ORDER BY
[Date] DESC
The result should be
Number  VersionStartDate
---------------------------
991281  2021-11-12 08:20:11
NULL    2021-11-12 06:20:11
771281  2021-11-11 08:26:11
NULL    2021-11-11 08:25:11
661281  2021-11-10 08:24:11
NULL    2021-11-10 08:22:11
661281  2021-11-10 08:21:11
551281  2021-11-09 08:20:11
Quite similar to JMabee's which appeared after I started working on it, but perhaps a bit simpler:
CREATE TABLE #d (Number INT, VersionStartDate DATETIME);
INSERT INTO #d(Number, VersionStartDate)
VALUES
(991281 ,'2021-11-12T08:27:11'),
(991281 ,'2021-11-12T08:20:11'),
(NULL ,'2021-11-12T07:20:11'),
(NULL ,'2021-11-12T06:20:11'),
(771281 ,'2021-11-11T08:26:11'),
(NULL ,'2021-11-11T08:25:11'),
(661281 ,'2021-11-10T08:24:11'),
(NULL ,'2021-11-10T08:22:11'),
(661281 ,'2021-11-10T08:21:11'),
(551281 ,'2021-11-09T08:20:11');
WITH cte AS
(
SELECT Number,
VersionStartDate,
LAG(Number, 1) OVER (ORDER BY VersionStartDate) AS PrevNumber
FROM #d
)
SELECT cte.Number,
cte.VersionStartDate
FROM cte
WHERE ISNULL(cte.Number, -1) <> ISNULL(cte.PrevNumber, -1)
ORDER BY cte.VersionStartDate DESC;
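One caveat with the ISNULL(..., -1) comparison: it assumes -1 never occurs as a real Number, and it would skip the very first row if that row's Number were NULL. A possible variant, sketched here with the same sample data in a hypothetical table variable @d, compares the two values with INTERSECT (which treats two NULLs as equal) and always keeps the first row:

```sql
-- Hypothetical table variable with the sample rows from the question
DECLARE @d TABLE (Number int NULL, VersionStartDate datetime NOT NULL);

INSERT INTO @d (Number, VersionStartDate)
VALUES (991281, '2021-11-12T08:27:11'),
       (991281, '2021-11-12T08:20:11'),
       (NULL,   '2021-11-12T07:20:11'),
       (NULL,   '2021-11-12T06:20:11'),
       (771281, '2021-11-11T08:26:11'),
       (NULL,   '2021-11-11T08:25:11'),
       (661281, '2021-11-10T08:24:11'),
       (NULL,   '2021-11-10T08:22:11'),
       (661281, '2021-11-10T08:21:11'),
       (551281, '2021-11-09T08:20:11');

WITH cte AS
(
    SELECT Number,
           VersionStartDate,
           LAG(Number) OVER (ORDER BY VersionStartDate) AS PrevNumber,
           ROW_NUMBER() OVER (ORDER BY VersionStartDate) AS rn
    FROM @d
)
SELECT cte.Number,
       cte.VersionStartDate
FROM cte
WHERE cte.rn = 1  -- the oldest row is always a change
   -- INTERSECT is empty only when the two values differ, with NULL = NULL counted as equal
   OR NOT EXISTS (SELECT cte.Number INTERSECT SELECT cte.PrevNumber)
ORDER BY cte.VersionStartDate DESC;
```

On the sample data this should return the same eight rows as the expected result in the question, without reserving any value as a stand-in for NULL.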
Well, this is one way to do it. I am sure there is a more elegant way to write it, but it gets you started:
CREATE TABLE #T(Number int, VersionStartDate datetime)
INSERT INTO #T VALUES
(991281,'2021-11-12 08:27:11'),
(991281,'2021-11-12 08:20:11'),
(NULL,'2021-11-12 07:20:11'),
(NULL,'2021-11-12 06:20:11'),
(771281,'2021-11-11 08:26:11'),
(NULL,'2021-11-11 08:25:11'),
(661281,'2021-11-10 08:24:11'),
(NULL,'2021-11-10 08:22:11'),
(661281,'2021-11-10 08:21:11'),
(551281,'2021-11-09 08:20:11')
SELECT Number, MIN(VersionStartDate) VersionStartDate
FROM
(
SELECT *, SUM(CASE WHEN ISNULL(Number,-1) <> ISNULL(LG,-1) THEN 1 ELSE 0 END) OVER(ORDER BY VersionStartDate desc) GRP
FROM
(
SELECT *, LAG(Number,1,-1) OVER(ORDER BY VersionStartDate desc) LG
FROM #T
) X
) Y
GROUP BY GRP,Number
ORDER BY VersionStartDate desc
Add VersionStartDate in your group by.
SELECT
Number,
MIN(VersionStartDate)
FROM _TABLE_
GROUP BY
Number, VersionStartDate
ORDER BY
VersionStartDate DESC

fetch two distinct rows with discontinued dates

I want to fetch the two rows where the dates are discontinuous in a data sample. Normally the end date of one row should equal the start date of the next row; where it does not, I want to print both rows in full.
I tried LEAD but it did not work:
select t1.*
from (select t.*, lead(cast(startdate as date)) over (order by currenykey,cast(enddate as date)) as next_start_date
from table t
) t1
where enddate <> next_start_date
start date end date
1 11/6/17 0:00.00 11/13/17 0:00.00
2 11/13/17 0:00.00 12/26/17 0:00.00
3 12/26/17 0:00.00 1/8/18 0:00.00
4 10/22/18 0:11.13 2/25/19 0:16.35
5 2/25/19 0:16.35 3/4/19 0:09.57
6 3/4/19 0:09.57 3/11/19 0:12.30
7 3/11/19 0:12.30 3/18/19 0:10.21
8 3/18/19 0:10.21 3/25/19 0:09.20
9 3/25/19 0:09.20 4/1/19 0:10.19
I want to print entire rows 3 and 4.
If you're on SQL Server 2012 or later you could use the LAG and LEAD functions:
LAG (Transact-SQL)
LEAD (Transact-SQL)
For example...
declare @StackOverflow table (
[ID] int not null,
[StartDate] datetime not null,
[EndDate] datetime not null
);
insert @StackOverflow values
(1, '11/6/17 0:00.00', '11/13/17 0:00.00'),
(2, '11/13/17 0:00.00', '12/26/17 0:00.00'),
(3, '12/26/17 0:00.00', '1/8/18 0:00.00'),
(4, '10/22/18 0:11.13', '2/25/19 0:16.35'),
(5, '2/25/19 0:16.35', '3/4/19 0:09.57'),
(6, '3/4/19 0:09.57', '3/11/19 0:12.30'),
(7, '3/11/19 0:12.30', '3/18/19 0:10.21'),
(8, '3/18/19 0:10.21', '3/25/19 0:09.20'),
(9, '3/25/19 0:09.20', '4/1/19 0:10.19');
select [ID], [StartDate], [EndDate]
from (
select [ID],
[StartDate],
[EndDate],
[Previous] = cast(lag([EndDate]) over (order by [ID]) as date),
[Next] = cast(lead([StartDate]) over (order by [ID]) as date)
from @StackOverflow SO
) SO
where Previous != cast([StartDate] as date)
or Next != cast([EndDate] as date);
Which yields:
ID StartDate EndDate
3 26/12/2017 00:00:00 08/01/2018 00:00:00
4 22/10/2018 00:11:00 25/02/2019 00:16:00
Your query is on the right path, with two caveats:
You want to convert to dates for the comparison.
You need to compare both lead() and lag().
So:
select t.*
from (select t.*,
lead(startdate) over (order by startdate) as next_startdate,
lag(enddate) over (order by startdate) as prev_enddate
from t
) t
where convert(date, enddate) <> convert(date, next_startdate) or
convert(date, startdate) <> convert(date, prev_enddate) ;
That said, I think you are safer with not exists subqueries:
select *
from t
where (not exists (select 1
from t t2
where convert(date, t.startdate) = convert(date, t2.enddate)
) or
not exists (select 1
from t t2
where convert(date, t.enddate) = convert(date, t2.startdate)
)
) and
t.startdate <> (select min(t2.startdate) from t t2) and
t.startdate <> (select max(t2.startdate) from t t2) ;
Here is a db<>fiddle.
To understand why, consider what happens if the start date of line 3 changes. Here is an example where the two do not produce the same results.

SQL failing to add value from previous row into the next

I am trying to add the value of the previous row to the current row into the column cumulative
Select
Ddate as Date, etype, Reference, linkacc as ContraAcc,
Description,
sum(case when amount > 0 then amount else 0 end) as Debits,
sum(case when amount < 0 then amount else 0 end) as Credits,
sum(amount) as Cumulative
from
dbo.vw_LT
where
accnumber ='8400000'
and [DDate] between '2016-04-01 00:00:00' and '2016-04-30 00:00:00'
and [DataSource] = 'PAS11CEDCRE17'
group by
Ddate, etype, Reference, linkacc, Description, Amount
Output (what I am getting):
Date Reference ContraAcc Description Debits Credits Cumulative
--------------------------------------------------------------------------
2016-04-01 CC007 8000000 D/CC007 0 -39.19 -39.19
2016-04-01 CC007 8000000 D/CC007 1117.09 0 1117.09
2016-04-01 CC009 8000000 CC009 2600 0 2600
The Cumulative column should look like this (what I need):
Date Reference ContraAcc Description Debits Credits Cumulative
--------------------------------------------------------------------------
2016-04-01 CC007 8000000 D/CC007 0 -39.19 -39.19
2016-04-01 CC007 8000000 D/CC007 1117.09 0 1077.9
2016-04-01 CC009 8000000 CC009 2600 0 3677.9
Before we delve into the solution, let me tell you that if you are using SQL Server 2012 or later, there are the LAG and LEAD functions, which can help you solve this.
I am not giving you an exact query to solve your problem (as we don't know what the primary key for that table is), but you can get the idea from the example below:
DECLARE @t TABLE
(
accountNumber VARCHAR(50)
,dt DATETIME
,TransactedAmt BIGINT
)
INSERT INTO @t VALUES ('0001','7/20/2016',1000)
INSERT INTO @t VALUES ('0001','7/21/2016',-1000)
INSERT INTO @t VALUES ('0001','7/22/2016',2000)
INSERT INTO @t VALUES ('0002','7/20/2016',500)
INSERT INTO @t VALUES ('0002','7/21/2016',-500)
INSERT INTO @t VALUES ('0002','7/22/2016',2000)
;WITH CTE AS
(
SELECT ROW_NUMBER() OVER(Partition by accountNumber order by dt) as RN, *
FROM @t
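),CTE1 AS
(
-- anchor: the first row of each account starts the running balance
SELECT *,TransactedAmt As TotalBalance
FROM CTE WHERE rn = 1
UNION ALL
-- recursive step: add the current amount to the previous row's running balance
SELECT T1.*,T1.TransactedAmt + T0.TotalBalance as TotalBalance
FROM CTE T1
JOIN CTE1 T0
ON T1.accountNumber = T0.accountNumber
AND T1.RN = T0.RN+1
)
select * from CTE1 order by AccountNumber, RN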
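On SQL Server 2012 or later, a running total can also be written with a windowed SUM, which avoids the self-join entirely. The following is a minimal sketch using the same sample rows; the table variable @t and its columns mirror the example above, and the date literals are written as yyyymmdd so they do not depend on the session's date format.

```sql
DECLARE @t TABLE
(
    accountNumber VARCHAR(50),
    dt            DATETIME,
    TransactedAmt BIGINT
);

INSERT INTO @t (accountNumber, dt, TransactedAmt)
VALUES ('0001', '20160720',  1000),
       ('0001', '20160721', -1000),
       ('0001', '20160722',  2000),
       ('0002', '20160720',   500),
       ('0002', '20160721',  -500),
       ('0002', '20160722',  2000);

-- Running balance per account: sum of every amount up to and including the current row
SELECT accountNumber,
       dt,
       TransactedAmt,
       SUM(TransactedAmt) OVER (PARTITION BY accountNumber
                                ORDER BY dt
                                ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS TotalBalance
FROM @t
ORDER BY accountNumber, dt;
```

For account 0001 the TotalBalance column comes out as 1000, 0, 2000, and for account 0002 as 500, 0, 2000. Applied to the original question, the same window expression would need an ORDER BY column that puts the ledger rows in a unique order, which is why the primary key matters.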

How to find the difference between dates within the same column using SQL?

I am trying to solve the following challenge:
1) If a patient visits the ER within 48 hours, I want to flag that as 1.
2) If the same patient visits the ER again after 48 hours, I want to flag that as 2.
3) Each subsequent visit must be flagged as 3, 4, 5 etcetera after the first 48 hours.
Here is what my table looks like:
PATIENT_ID ADMIT_DATE LOCATION
---------- ---------- --------
33 1/10/2014 ER
33 1/11/2014 ER
33 1/15/2014 ER
33 1/17/2014 ER
45 2/20/2014 OBS
45 2/21/2014 OBS
45 2/25/2014 OBS
45 2/30/2014 OBS
45 2/32/2014 OBS
And here is what the desired result should look like:
PATIENT_ID ADMIT_DATE LOCATION FLAG
---------- ---------- -------- ----
33 1/10/2014 ER 1
33 1/15/2014 ER 2
33 1/17/2014 ER 3
45 2/20/2014 OBS 1
45 2/25/2014 OBS 2
45 2/30/2014 OBS 3
45 2/32/2014 OBS 4
I have started something like this but could not complete it:
SELECT PATIENT_ID, ADMIT_DATE, LOCATION,
CASE WHEN MIN(ADMIT_DATE)-MAX(ADMIT_DATE)<48 THEN 1 ELSE 0 AS FLAG
FROM MYTABLE
GROUP BY PATIENT_ID, ADMIT_DATE, LOCATION
Can someone please help?
You can achieve this easily using the LAG, DATEDIFF and ROW_NUMBER functions. The LAG function gets the previous ADMIT_DATE value. Then you can calculate the difference in hours using the DATEDIFF function. Finally, using ROW_NUMBER you can simply rank your results.
This is a full working example:
SET NOCOUNT ON
GO
DECLARE @DataSource TABLE
(
[PATIENT_ID] TINYINT
,[ADMIT_DATE] DATE
,[LOCATION] VARCHAR(3)
)
INSERT INTO @DataSource ([PATIENT_ID], [ADMIT_DATE], [LOCATION])
VALUES (33, '1-10-2014', 'ER')
,(33, '1-11-2014', 'ER')
,(33, '1-15-2014', 'ER')
,(33, '1-17-2014', 'ER')
,(45, '2-15-2014', 'OBS')
,(45, '2-16-2014', 'OBS')
,(45, '2-20-2014', 'OBS')
,(45, '2-25-2014', 'OBS')
,(45, '2-27-2014', 'OBS')
;WITH DataSource ([PATIENT_ID], [ADMIT_DATE], [LOCATION], [DIFF_IN_HOURS]) AS
(
SELECT [PATIENT_ID]
,[ADMIT_DATE]
,[LOCATION]
,DATEDIFF(
HOUR
,LAG([ADMIT_DATE], 1, NULL) OVER (PARTITION BY [PATIENT_ID], [LOCATION] ORDER BY [ADMIT_DATE] ASC)
,[ADMIT_DATE]
)
FROM @DataSource
)
SELECT [PATIENT_ID]
,[ADMIT_DATE]
,[LOCATION]
,ROW_NUMBER() OVER (PARTITION BY [PATIENT_ID], [LOCATION] ORDER BY [ADMIT_DATE] ASC) AS [FLAG]
FROM DataSource
WHERE [DIFF_IN_HOURS] >= 48
OR [DIFF_IN_HOURS] IS NULL -- these are first records
SET NOCOUNT OFF
GO
Note that I have fixed your sample data, as it contained invalid dates.
This is an alternative solution without the LAG function:
-- run this in the same batch as the DECLARE/INSERT above, since @DataSource is a table variable
;WITH TempDataSource ([PATIENT_ID], [ADMIT_DATE], [LOCATION], [Rank]) AS
(
SELECT [PATIENT_ID]
,[ADMIT_DATE]
,[LOCATION]
,ROW_NUMBER() OVER (PARTITION BY [PATIENT_ID], [LOCATION] ORDER BY [ADMIT_DATE] ASC)
FROM @DataSource
),
DataSource ([PATIENT_ID], [ADMIT_DATE], [LOCATION], [DIFF_IN_HOURS]) AS
(
SELECT DS1.[PATIENT_ID]
,DS1.[ADMIT_DATE]
,DS1.[LOCATION]
,DATEDIFF(HOUR, DS2.[ADMIT_DATE], DS1.[ADMIT_DATE])
FROM TempDataSource DS1
LEFT JOIN TempDataSource DS2
ON DS1.[Rank] - 1 = DS2.[Rank]
AND DS1.[PATIENT_ID] = DS2.[PATIENT_ID]
AND DS1.[LOCATION] = DS2.[LOCATION]
)
SELECT [PATIENT_ID]
,[ADMIT_DATE]
,[LOCATION]
,ROW_NUMBER() OVER (PARTITION BY [PATIENT_ID], [LOCATION] ORDER BY [ADMIT_DATE] ASC) AS [FLAG]
FROM DataSource
WHERE [DIFF_IN_HOURS] >= 48
OR [DIFF_IN_HOURS] IS NULL -- these are first records
SELECT Patient_id,Admit_date, Location,
CASE WHEN DATEDIFF (HH , min(admit_date) , max(admit_date)) < 48 THEN count(flag)+1 ELSE 0 End As Flag
FROM tbl_Patient
GROUP BY PATIENT_ID, ADMIT_DATE, LOCATION
You can use DATEDIFF(), available in SQL Server, like this:
SELECT DATEDIFF(hour,startDate,endDate) AS 'Duration'
You can visit http://msdn.microsoft.com/en-IN/library/ms189794.aspx
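One thing to keep in mind when using it for a 48-hour rule: DATEDIFF counts the number of datepart boundaries crossed between the two values, not the elapsed time, so DATEDIFF(hour, ...) can report 1 for two timestamps that are only minutes apart. A small sketch with made-up timestamps (the variable names are only for illustration):

```sql
-- Two timestamps two minutes apart, on either side of midnight
DECLARE @startDate datetime = '2014-01-10T23:59:00',
        @endDate   datetime = '2014-01-11T00:01:00';

SELECT DATEDIFF(HOUR,   @startDate, @endDate) AS HourBoundaries,  -- 1: one hour boundary crossed
       DATEDIFF(MINUTE, @startDate, @endDate) AS ElapsedMinutes;  -- 2: two minute boundaries crossed
```

If the exact window matters, comparing DATEDIFF(MINUTE, ...) against 48 * 60 is one way to tighten the check. With DATE-only values such as ADMIT_DATE in the question, the hour difference is always a multiple of 24, so the distinction does not arise there.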