Find Datediff in the same column - sql

I have a BorderCrossingData table. I would like to get the passport numbers that have at least one interval between consecutive BorderCrossingDateTime values that is longer than 4 months.
BorderCrossingID PassportNumber BorderCrossingDateTime
1 ER-2222 2019-01-07 22:11:12.000
2 ER-2222 2019-01-07 23:11:12.000
3 KL-5233 2018-10-03 17:10:39.000
130 FF-4444 2019-01-08 11:11:11.000
5 ER-1111 NULL
6 KL-5686 NULL
7 ER-1111 NULL
8 KL-5235 NULL
9 QW-5656 NULL
10 DF-5685 NULL
11 KL-4558 NULL
--------
113 LL-8989 2019-01-15 16:24:26.333
114 ZZ-0005 2019-01-17 16:18:12.273
115 LL-0223 2019-01-17 16:19:12.000
116 ER-2222 2019-01-03 08:24:29.000
117 ER-2222 2019-02-01 08:25:03.873
118 ER-2222 2019-03-13 08:25:17.000
119 ER-2222 2019-04-10 08:25:32.000
120 ER-2222 2019-09-30 08:25:47.000
I have already selected the border crossings that have a BorderCrossingDateTime and put them in order.
SELECT DISTINCT PassportNumber, BorderCrossingDateTime FROM Passports
WHERE DATEDIFF(Compare 2 upcoming DateTimes)
EXCEPT
SELECT PassportNumber, BorderCrossingDateTime FROM Passports
WHERE BorderCrossingDateTime IS NULL
ORDER BY BorderCrossingDateTime
The result should be like this:
PassportName
ER-2222
TO-0140
NN-4444
TP-0140
TT-0140
WU-5645

The code below works in SQL Server. It uses the LEAD and LAG functions to find the next and previous border crossing dates, and compares each crossing with the next one for the same passport. Note that DATEDIFF(mm, ...) counts month boundaries crossed rather than full months elapsed.
CREATE TABLE #borderCrossing (BorderCrossingID INT, PassportNumber VARCHAR(10), BorderCrossingDateTime DATETIME)
INSERT INTO #borderCrossing
VALUES (1, 'ER-2222', '2019-01-07 22:11:12.000'), (2, 'ER-2222', '2019-03-07 22:11:12.000'), (3, 'ER-2222', '2019-08-07 22:11:12.000');
SELECT DISTINCT PassportNumber
FROM (
    SELECT BorderCrossingId, BorderCrossingDateTime AS CurrentBorderCrossingDateTime, passportNumber
        , lag(BorderCrossingDateTime, 1, NULL) OVER (
            PARTITION BY passportNumber ORDER BY BorderCrossingDateTime
        ) AS prevBorderCrossingDateTime
        , lead(BorderCrossingDateTime, 1, NULL) OVER (
            PARTITION BY passportNumber ORDER BY BorderCrossingDateTime
        ) AS nextBorderCrossingDateTime
    FROM #borderCrossing
) AS t
-- Compare the current crossing with the next one; comparing previous with next would span two intervals.
WHERE DATEDIFF(mm, CurrentBorderCrossingDateTime, nextBorderCrossingDateTime) > 4
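Because DATEDIFF with the month datepart counts month boundaries, an interval such as 2019-01-01 to 2019-05-31 yields 4 and is not reported by a "> 4" test, even though it is longer than four calendar months. If "longer than 4 months" is meant in calendar terms (an assumption about the requirement), a sketch that shifts the current date with DATEADD instead could look like this. It reuses the #borderCrossing temp table from the answer above and skips NULL dates:

```sql
-- Sketch only: flags passports where the next crossing falls more than
-- four calendar months after the current one.
SELECT DISTINCT PassportNumber
FROM (
    SELECT PassportNumber,
           BorderCrossingDateTime,
           LEAD(BorderCrossingDateTime) OVER (
               PARTITION BY PassportNumber
               ORDER BY BorderCrossingDateTime
           ) AS NextBorderCrossingDateTime
    FROM #borderCrossing
    WHERE BorderCrossingDateTime IS NOT NULL
) AS t
WHERE NextBorderCrossingDateTime > DATEADD(month, 4, BorderCrossingDateTime);
```

Rows with no following crossing have a NULL NextBorderCrossingDateTime, so the comparison filters them out without an extra condition.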

Related

SQL code to add missing StartDate between 2 dates in SQL Server

I have the below table ordered by clientID, contractID and effectiveDate. One client has multiple contractIDs, each with its respective effectiveDate.
The desired output is as below, where the new FYStartDate column should add the missing FYStartDate values between the dates of consecutive contractIDs of a clientID (in this scenario, the fiscal year starts on 1 June of every year).
I would appreciate it if you could share the required SQL code.
I'm attaching the SQL code to generate the first table
CREATE TABLE [client] (
[clientid] [int] NULL,
[contractid] [int] NULL,
[effectivedate] [date] NULL
) ON [PRIMARY]
GO
insert into [client] values
('228','2','6/1/2003'),('228','136','6/1/2004'),('228','242','6/1/2008'),
('228','337','12/1/2012'),('228','584','6/1/2017'),('14216','319','5/1/2013'),
('14216','355','6/1/2013'),('14216','739','6/1/2020'),('14216','10','3/1/2021'),
('14216','1009','6/1/2021')
This is a bit convoluted without more structured data, as @MatBailie suggests. To accomplish what you ask, each record needs to know when the contracts before and after it come into effect. You may need to adjust the ordering, because I wasn't sure how the results should be ordered: by clientid, contractid, dates, etc.?
UPDATED: see comments. Changed some CTEs, JOINS and ORDER BY for better partitioning by clientid.
CREATE TABLE [client] (
[clientid] [int] NULL,
[contractid] [int] NULL,
[effectivedate] [date] NULL
) ON [PRIMARY]
;
insert into [client] values
('228','2','6/1/2003'),('228','136','6/1/2004'),('228','242','6/1/2008'),
('228','337','12/1/2012'),('228','584','6/1/2017'),('14216','319','5/1/2013'),
('14216','355','6/1/2013'),('14216','739','6/1/2020'),('14216','10','3/1/2021'),
('14216','1009','6/1/2021')
;
--Need a sequence of numbers to create a sequence of fiscal years.
WITH x AS (
SELECT * FROM (VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9)) as x(a)
), y as (
SELECT ROW_NUMBER() OVER(ORDER BY tens.a, ones.a) as row_num
FROM x as ones, x as tens
), fiscYears as (
SELECT
fyStart = DATEFROMPARTS(2000 + y.row_num -1, 6, 1)
, fyEnd = DATEFROMPARTS(2000 + y.row_num, 5, 31)
FROM y
--Need to order the client records by effective date.
--From updated question... looks like we are reporting by clientid.
), clientOrd as (
SELECT c2.*, ROW_NUMBER() OVER(PARTITION BY c2.clientid ORDER BY c2.effectivedate) as row_num
FROM client c2
--For each contract, get the previous and next contracts by effective date.
), clientWNext as (
SELECT c.*
, cNext.effectivedate as nextEffectiveDate
, cPrev.effectivedate as prevEffectiveDate
FROM clientOrd as c
LEFT JOIN clientOrd as cNext
ON cNext.clientid = c.clientid
AND cNext.row_num = c.row_num + 1
LEFT JOIN clientOrd as cPrev
ON cPrev.clientid = c.clientid
AND cPrev.row_num = c.row_num - 1
)
SELECT
c.clientid
, cwn.contractid
, CASE WHEN cwn.effectiveDate >= fy.fyStart AND cwn.effectiveDate <= fy.fyEnd
THEN cwn.effectivedate
ELSE null
END as effectivedate
, fy.fyStart
FROM fiscYears as fy
--To get a full FY range for each client, we join to a distinct list of clients.
JOIN (
SELECT DISTINCT clientid FROM client
) as c
ON 1=1
--Need to join the list of contracts.
INNER JOIN clientWNext as cwn
ON cwn.clientid = c.clientid
--This is the main join criteria where the effective date is within the fy year start/end.
AND ((
cwn.effectivedate >= fy.fyStart
AND cwn.effectivedate <= fy.fyEnd
)
--This is the "alternate" join criteria where the previous contract is still in effect
--but there is no new contract to supersede the previous.
OR (
cwn.prevEffectiveDate < fy.fyStart
AND cwn.effectiveDate < fy.fyStart
AND (cwn.nextEffectiveDate > fy.fyEnd OR cwn.nextEffectiveDate IS NULL)
))
--Limiting fiscal year date range.
WHERE fy.fyStart >= '1/1/2003'
AND fy.fyStart < '1/1/2024'
ORDER BY c.clientid, fy.fyStart, cwn.effectivedate
clientid contractid effectivedate fyStart
228 2 2003-06-01 2003-06-01
228 136 2004-06-01 2004-06-01
228 136 null 2005-06-01
228 136 null 2006-06-01
228 136 null 2007-06-01
228 242 2008-06-01 2008-06-01
228 242 null 2009-06-01
228 242 null 2010-06-01
228 242 null 2011-06-01
228 337 2012-12-01 2012-06-01
228 337 null 2013-06-01
228 337 null 2014-06-01
228 337 null 2015-06-01
228 337 null 2016-06-01
228 584 2017-06-01 2017-06-01
228 584 null 2018-06-01
228 584 null 2019-06-01
228 584 null 2020-06-01
228 584 null 2021-06-01
228 584 null 2022-06-01
228 584 null 2023-06-01
14216 319 2013-05-01 2012-06-01
14216 355 2013-06-01 2013-06-01
14216 355 null 2014-06-01
14216 355 null 2015-06-01
14216 355 null 2016-06-01
14216 355 null 2017-06-01
14216 355 null 2018-06-01
14216 355 null 2019-06-01
14216 739 2020-06-01 2020-06-01
14216 10 2021-03-01 2020-06-01
14216 1009 2021-06-01 2021-06-01
14216 1009 null 2022-06-01
14216 1009 null 2023-06-01
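The mapping from an effective date to its fiscal-year start (fiscal year beginning 1 June, as stated in the question) can also be computed directly, without joining to a generated list of fiscal years. The following is a minimal T-SQL sketch of that mapping only, not a replacement for the full query; the inline VALUES list and the alias d are illustrative:

```sql
-- Sketch: dates before June belong to the fiscal year that started
-- on 1 June of the previous calendar year.
SELECT d.effectivedate,
       DATEFROMPARTS(
           YEAR(d.effectivedate)
             - CASE WHEN MONTH(d.effectivedate) < 6 THEN 1 ELSE 0 END,
           6, 1
       ) AS fyStart
FROM (VALUES
        (CAST('2012-12-01' AS date)),
        (CAST('2013-05-01' AS date)),
        (CAST('2013-06-01' AS date))
     ) AS d(effectivedate);
```

For these three dates the expression gives 2012-06-01, 2012-06-01 and 2013-06-01, which matches the fyStart values shown for the same effective dates in the result above.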

T-SQL get values for specific group

I have a table EmployeeContract similar to this:
ContractId EmployeeId ValidFrom ValidTo Salary
12 5 2018-02-01 2019-06-31 x
25 8 2015-01-01 2099-12-31 x
50 5 2019-07-01 2021-05-31 x
52 6 2011-08-01 2021-12-31 x
72 8 2010-08-01 2014-12-31 x
52 6 2011-08-01 2021-12-31 x
The table holds the contract history for each employee in the company. I need to get the date when each employee started work and the last date of their latest contract. Sometimes the records contain duplicates.
For example, based on the data above:
EmployeeId ValidFrom ValidTo
5 2018-02-01 2021-05-31
8 2010-08-01 2099-12-31
6 2011-08-01 2021-12-31
Based on this article: https://www.techcoil.com/blog/sql-statement-for-selecting-the-latest-record-in-each-group/
I prepared a query like this:
select minv.*, maxv.maxvalidto from
(select distinct con.[EmployeeId], mvt.maxvalidto
from [EmployeeContract] con
join (select [EmployeeId], max(validto) as maxvalidto
FROM [EmployeeContract]
group by [EmployeeId]) mvt
on con.[EmployeeId] = mvt.[EmployeeId] and mvt.maxvalidto = con.validto) maxv
join
(select distinct con.[EmployeeId], mvf.minvalidfrom
from [EmployeeContract] con
join (select [EmployeeId], min(validfrom) as minvalidfrom
FROM [EmployeeContract]
group by [EmployeeId]) mvf
on con.[EmployeeId] = mvf.[EmployeeId] and mvf.minvalidfrom = con.validfrom) minv
on minv.[EmployeeId] = maxv.[EmployeeId]
order by 1
But I'm not satisfied with it. I think it's hard to read, and it is probably poorly optimized. How can I do it better?
I think you want group by:
select employeeid, min(validfrom), max(validto)
from employeecontract
group by employeeid

How to duplicate rows in SQL based on the difference between date columns and divide an aggregated column across the duplicated rows?

I have a table with some records about fuel consumption. The important columns in the table are CONSUME_DATE_FROM and CONSUM_DATE_TO.
I want to calculate the average fuel consumption per car on a monthly basis, but in some rows the two dates are not in the same month. For example, some rows span three months, and the total GAS_PER_LITER for the whole period is aggregated in a single row.
So I need to find the records where CONSUME_DATE_FROM and CONSUM_DATE_TO span more than one month, duplicate them (in the current table or in a second table) once per month covered, and divide the total GAS_PER_LITER evenly between the related rows.
I have this table with the following data:
ID VehicleId CONSUME_DATE_FROM CONSUM_DATE_TO GAS_PER_LITER
1 100 2018-10-25 2018-12-01 600
2 101 2018-07-19 2018-07-24 100
3 102 2018-12-31 2019-01-01 400
4 103 2018-03-29 2018-05-29 200
5 104 2018-02-05 2018-02-09 50
The expected output table should be as below
ID VehicleId CONSUME_DATE_FROM CONSUM_DATE_TO GAS_PER_LITER
1 100 2018-10-25 2018-12-01 200
1 100 2018-10-25 2018-12-01 200
1 100 2018-10-25 2018-12-01 200
2 101 2018-07-19 2018-07-24 100
3 102 2018-12-31 2019-01-01 200
3 102 2018-12-31 2019-01-01 200
4 103 2018-03-29 2018-05-29 66.66
4 103 2018-03-29 2018-05-29 66.66
4 103 2018-03-29 2018-05-29 66.66
5 104 2018-02-05 2018-02-09 50
Or as below
ID VehicleId CONSUME_DATE_FROM CONSUM_DATE_TO GAS_PER_LITER DATE_RELOAD_GAS
1 100 2018-10-25 2018-12-01 200 2018-10-01
1 100 2018-10-25 2018-12-01 200 2018-11-01
1 100 2018-10-25 2018-12-01 200 2018-12-01
2 101 2018-07-19 2018-07-24 100 2018-07-01
3 102 2018-12-31 2019-01-01 200 2018-12-01
3 102 2018-12-31 2019-01-01 200 2019-01-01
4 103 2018-03-29 2018-05-29 66.66 2018-03-01
4 103 2018-03-29 2018-05-29 66.66 2018-04-01
4 103 2018-03-29 2018-05-29 66.66 2018-05-01
5 104 2018-02-05 2018-02-09 50 2018-02-01
Can someone please help me out with this query?
I'm using an Oracle database.
Your business rule treats the difference between CONSUME_DATE_FROM and CONSUM_DATE_TO as absolute months. So you expect the difference between 2018-10-25 and 2018-12-01 to be three months whereas the difference in days actually equates to about 1.1 months. So we can't use simple date arithmetic to get your desired output, we need to do some additional massaging of the dates.
The query below implements your desired logic by deriving the first day of the month for CONSUME_DATE_FROM and the last day of the month for CONSUM_DATE_TO, then using ceil() to round the difference up to the nearest whole number of months.
This is calculated in a subquery which is used in the main query with the old connect by level trick to multiply a record by level number of times:
with cte as (
select f.*
, ceil(months_between(last_day(CONSUM_DATE_TO)
, trunc(CONSUME_DATE_FROM,'mm'))) as diff
from fuel_consumption f
)
select cte.id
, cte.VehicleId
, cte.CONSUME_DATE_FROM
, cte.CONSUM_DATE_TO
, cte.GAS_PER_LITER/cte.diff as GAS_PER_LITER
, add_months(trunc(cte.CONSUME_DATE_FROM, 'mm'), level-1) as DATE_RELOAD_GAS
from cte
connect by level <= cte.diff
and prior cte.id = cte.id
and prior sys_guid() is not null
;
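To see how the month count is derived, the diff expression can be evaluated on its own for the first sample row (2018-10-25 to 2018-12-01). This is a standalone Oracle sketch using literal dates rather than the fuel_consumption table:

```sql
-- last_day(2018-12-01) is 2018-12-31 and trunc(2018-10-25, 'mm') is 2018-10-01,
-- so months_between is a little under 3 and ceil() rounds it up to 3.
select ceil(months_between(last_day(date '2018-12-01'),
                           trunc(date '2018-10-25', 'mm'))) as diff
from dual;
```

With diff = 3, the 600 litres of that row are split into three rows of 200 each, as in the expected output.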
"what about if add a additional column "DATE_RELOAD_GAS" that display difference date for similar rows"
From your posted sample it seems like DATE_RELOAD_GAS is the first day of the month for each month bounded by CONSUME_DATE_FROM and CONSUM_DATE_TO. I have amended my solution to implement this rule.
By using the connect by level structure and treating to_char(c.CONSUME_DATE_FROM + level - 1,'yyyymm') as the month, I was able to solve it as below:
select ID, VehicleId, myMonth, CONSUME_DATE_FROM, CONSUM_DATE_TO,
trunc(GAS_PER_LITER/max(rn) over (partition by ID order by ID),2) as GAS_PER_LITER,
'01.'||substr(myMonth,5,2)||'.'||substr(myMonth,1,4) as DATE_RELOAD_GAS
from
(
with consumption( ID, VehicleId, CONSUME_DATE_FROM, CONSUM_DATE_TO, GAS_PER_LITER ) as
(
select 1,100,date'2018-10-25',date'2018-12-01',600 from dual union all
select 2,101,date'2018-07-19',date'2018-07-24',100 from dual union all
select 3,102,date'2018-12-31',date'2019-01-01',400 from dual union all
select 4,103,date'2018-03-29',date'2018-05-29',200 from dual union all
select 5,104,date'2018-02-05',date'2018-02-09', 50 from dual
)
select ID, to_char(c.CONSUME_DATE_FROM + level - 1,'yyyymm') myMonth,
VehicleId, c.CONSUME_DATE_FROM, c.CONSUM_DATE_TO, GAS_PER_LITER,
row_number() over (partition by ID order by ID) as rn
from dual join consumption c
on c.ID >= 2
group by ID, to_char(c.CONSUME_DATE_FROM + level - 1,'yyyymm'), VehicleId,
c.CONSUME_DATE_FROM, c.CONSUM_DATE_TO, c.GAS_PER_LITER
connect by level <= c.CONSUM_DATE_TO - c.CONSUME_DATE_FROM + 1
union all
select ID, to_char(c.CONSUME_DATE_FROM + level - 1,'yyyymm') myMonth,
VehicleId, c.CONSUME_DATE_FROM, c.CONSUM_DATE_TO, GAS_PER_LITER,
row_number() over (partition by ID order by ID) as rn
from dual join consumption c
on c.ID = 1
group by ID, to_char(c.CONSUME_DATE_FROM + level - 1,'yyyymm'), VehicleId,
c.CONSUME_DATE_FROM, c.CONSUM_DATE_TO, c.GAS_PER_LITER
connect by level <= c.CONSUM_DATE_TO - c.CONSUME_DATE_FROM + 1
) q
group by ID, VehicleId, myMonth, CONSUME_DATE_FROM, CONSUM_DATE_TO, GAS_PER_LITER, rn
order by ID, myMonth;
I ran into an interesting issue: if I write the join condition in the subquery as c.ID >= 1, the query hangs for a very long time, so I split it into two parts with union all, using c.ID >= 2 and c.ID = 1.

How to remove overlapping rows based on date and keep most recent in sql?

I need to find all episode_ids for every patient. However, when an overlapping episode arises within 90 days of the previous, then I just want to keep the most recent episode.
For example, patient_num 3242 below has 3 episodes: the second episode overlaps with the first episode within 90 days, and the third episode overlaps with the second episode within 90 days, in this situation I need to just keep the 3rd episode.
CREATE TABLE table1 (episode_id nvarchar(max), patient_num nvarchar(max), admit_date date, discharge_date date)
INSERT INTO table1 (episode_id, patient_num , admit_date , discharge_date ) VALUES
('1','5743','1/1/2016','1/5/2016'),
('2','5743','4/26/2016','4/29/2016'),
('3','5743','5/26/2016','5/28/2016'),
('4','5743','9/21/2016','9/28/2016'),
('5','8859','4/27/2016','5/5/2016'),
('6','3242','4/28/2016','4/29/2016'),
('7','3242','11/21/2016','11/23/2016'),
('8','3242','11/24/2016','11/29/2016'),
('9','3242','12/12/2016','12/29/2016')
Initial Table (table1)
episode_id patient_num admit_date discharge_date
1 5743 2016-01-01 2016-01-05
2 5743 2016-04-26 2016-04-29
3 5743 2016-05-26 2016-05-28
4 5743 2016-09-21 2016-09-28
5 8859 2016-04-27 2016-05-05
6 3242 2016-04-28 2016-04-29
7 3242 2016-11-21 2016-11-23
8 3242 2016-11-24 2016-11-29
9 3242 2016-12-12 2016-12-29
expected result
episode_id patient_num admit_date discharge_date
1 5743 2016-01-01 2016-01-05
3 5743 2016-05-26 2016-05-28
4 5743 2016-09-21 2016-09-28
5 8859 2016-04-27 2016-05-05
6 3242 2016-04-28 2016-04-29
9 3242 2016-12-12 2016-12-29
My attempt:
SELECT *
FROM table1 AS a
WHERE EXISTS
(
SELECT *
FROM table1 AS b
WHERE a.episode_id != b.episode_id
AND a.patient_num= b.patient_num
AND a.admit_date BETWEEN b.discharge_date AND DATEADD(DAY, 90, b.discharge_date ))
The error in my script is that for patient_num 3242 I am getting both episode_id 8 and 9, where I only want episode 9. I assume the reason is that I am comparing each row individually instead of as a group, but I am having trouble grouping. Additionally, this script does not show episodes where there is no overlap, such as episode_id 1, 4, 5 and 6. Any advice on this approach?
I removed the cursor solution here since it has low performance.
A solution without using a cursor is:
WITH ExcludedIds AS (
SELECT DISTINCT T2.episode_id
FROM table1 AS T
INNER JOIN table1 AS T2 ON T.episode_id != T2.episode_id
AND T.patient_num = T2.patient_num
AND T2.discharge_date BETWEEN DATEADD(DAY, -90, T.admit_date ) AND T.discharge_date)
SELECT T.episode_id, T.patient_num, T.admit_date, T.discharge_date
FROM table1 AS T
WHERE T.episode_id NOT IN (SELECT ExcludedIds.episode_id FROM ExcludedIds)
Though this solution is a bit hard to understand.
I think not exists does what you want:
SELECT a.*
FROM table1 a
WHERE NOT EXISTS (SELECT 1
                  FROM table1 b
                  WHERE a.episode_id <> b.episode_id AND
                        a.patient_num = b.patient_num AND
                        b.admit_date < a.discharge_date AND
                        b.discharge_date >= DATEADD(DAY, -90, a.discharge_date)
                 );
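One way to read the requirement is: drop an episode whenever a later episode for the same patient is admitted within 90 days of its discharge, so that only the last episode of each chain survives. That reading is an assumption, and the sketch below is a variant written for it rather than any answerer's query. It uses table1 and its columns exactly as defined in the question:

```sql
-- Assumed rule: an episode is superseded when a later episode for the same
-- patient is admitted no more than 90 days after this episode's discharge.
SELECT a.episode_id, a.patient_num, a.admit_date, a.discharge_date
FROM table1 AS a
WHERE NOT EXISTS (
    SELECT 1
    FROM table1 AS b
    WHERE b.patient_num = a.patient_num
      AND b.episode_id <> a.episode_id
      AND b.admit_date > a.admit_date
      AND b.admit_date <= DATEADD(DAY, 90, a.discharge_date)
);
```

Tracing this by hand against the sample rows leaves episodes 1, 3, 4, 5, 6 and 9, which are the episodes listed in the expected result. Two episodes with the same admit_date for one patient would both be kept, so a tie-breaker would be needed if that can occur.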

List the last two records for each id

Good afternoon!
I'm having trouble listing the last two records for each idmicro.
Ex:
idhist idmicro idother room unit Dtmov
100 1102 0 8 coa 2009-10-23 10:40:00.000
101 1102 0 1 coa 2009-10-28 10:40:00.000
102 1102 0 2 dib 2008-10-24 10:40:00.000
103 1201 0 6 diraf 2008-10-23 10:40:00.000
104 1201 0 7 diraf 2009-10-21 10:40:00.000
105 1201 0 4 dimel 2008-10-22 10:40:00.000
Would look like this:
ex:
result
idhist idmicro idother room unit Dtmov
101 1102 0 1 coa 2009-10-28 10:40:00.000
102 1102 0 2 dib 2008-10-24 10:40:00.000
103 1201 0 6 diraf 2008-10-22 10:40:00.000
104 1201 0 7 diraf 2009-10-21 10:40:00.000
I'm just starting to delve into SQL and am having trouble finding a solution.
Sorry.
Thank you.
EDIT: I am using SQL Server, and I have not written a query yet.
Yes, it is based on the date and time.
You can do the same thing with a nested SELECT statement.
SELECT *
FROM (
    -- DESC so that ind = 1 is the latest row for each idmicro
    SELECT row_number() OVER (
            PARTITION BY idmicro ORDER BY idhist DESC
        ) AS ind
        ,*
    FROM data
) AS initialResultSet
WHERE initialResultSet.ind < 3
Here is a sample SQLFiddle with how this query works.
WITH etc
AS (
    SELECT *
        ,row_number() OVER (
            PARTITION BY idmicro ORDER BY idhist
        ) AS r
        -- no ORDER BY here, so c is the total row count per idmicro
        ,count(*) OVER (
            PARTITION BY idmicro
        ) AS c
    FROM data
)
SELECT *
FROM etc
WHERE r > c - 2
Use row_number and over partition
SELECT *
FROM (
SELECT *, row_number() OVER (PARTITION BY idmicro ORDER BY idhist desc) AS rownum
FROM data
) AS initialResultSet
WHERE initialResultSet.rownum<=2
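The queries above rank rows by idhist. Since the asker says the "last" records are determined by date and time, a variant that orders by Dtmov may be closer to what is wanted. This is a sketch under that assumption, reusing the placeholder table name data from the answers:

```sql
-- Sketch: rank rows per idmicro from newest to oldest Dtmov and keep two.
SELECT idhist, idmicro, idother, room, unit, Dtmov
FROM (
    SELECT d.*,
           ROW_NUMBER() OVER (
               PARTITION BY idmicro
               ORDER BY Dtmov DESC
           ) AS rn
    FROM data AS d
) AS ranked
WHERE rn <= 2;
```

If two rows of one idmicro can share the same Dtmov, adding idhist DESC as a second ORDER BY key makes the choice deterministic.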