SQL - Select date ranges without overlapping

I have the following table (Oracle database):
ID  valid_from  valid_to
1   01.01.22    28.02.22
1   01.03.22    30.06.22
1   01.07.22    31.12.22
1   01.01.23    null
2   01.01.22    31.03.22
2   01.04.22    null
How can I best extract all distinct, non-overlapping date ranges across both IDs? The final result set should look like this:
valid_from  valid_to
01.01.22    28.02.22
01.03.22    31.03.22
01.04.22    30.06.22
01.07.22    31.12.22
01.01.23    null
Null stands for max_date (PL / SQL Oracle Max Date).
Moreover, I should only select the values valid for the current year (let's assume we are already in 2022).
Thanks for your help in advance!

You can use the following SELECT statement:
with
-- main table
t1 AS (SELECT w, q1, q2, to_date(q1,'dd.mm.yy') q1d, to_date(q2,'dd.mm.yy') q2d FROM www)
-- custom year in YYYY format
, t0 AS (SELECT '2022' y FROM dual)
-- pair every FROM date with each TO date that is not earlier
, t2 AS (SELECT t1.q1, t1.q1d, s2.q2, s2.q2d
FROM t1
LEFT JOIN t1 s2 on t1.q1d <= s2.q2d)
-- number the candidate TO dates of each FROM date, nearest first
-- (Oracle sorts NULLs last in ascending order by default)
, t3 AS (SELECT t2.*,
row_number() OVER (PARTITION BY t2.q1d ORDER BY t2.q2d ) r
FROM t2 )
-- join custom year value and keep the nearest TO date per FROM date
SELECT q1, q2 FROM t3
JOIN t0 on 1=1
WHERE r = 1
-- from the custom year onward
AND t0.y <= to_char(q1d, 'yyyy')
ORDER BY q1d;
In my example table the dates are stored as varchar2 in dd.mm.yy format. If your columns already have the date datatype, you don't need to apply to_date() to those 2 fields.
Used table sample:
create table www (w integer, q1 varchar2(30), q2 varchar2(30));
insert into www values (1, '01.01.22', '28.02.22');
insert into www values (1, '01.03.22', '30.06.22');
insert into www values (1, '01.07.22', '31.12.22');
insert into www values (1, '01.01.23', '');
insert into www values (2, '01.01.22', '31.03.22');
insert into www values (2, '01.04.22', '');
If your table has more rows with a null valid_to whose valid_from does not fall into any existing range, for example:
insert into www values (1, '01.01.24', '');
then the previous solution returns additional rows with a null value at the end.
In that case you can use this more complex solution:
...
-- join custom year value and select desired rows based on that value
, t4 as (SELECT q1, q2, q1d FROM t3
JOIN t0 on 1=1
WHERE r = 1 AND
-- for the custom year
t0.y <= to_char(q1d, 'yyyy')
ORDER BY q1d)
-- filter non-nullable rows
, t5 as ( SELECT q1, q2 FROM t4 WHERE Q2 IS NOT NULL )
-- max date from rows where Q2 field has null value
, t6 as ( SELECT to_char(MAX(Q1D),'dd.mm.yy') q1, q2
FROM t4
WHERE Q2 IS NULL
GROUP BY q2)
-- append rows with max date
SELECT * FROM t5
UNION ALL
SELECT * FROM t6;
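The splitting idea can also be sketched outside SQL. The following is a minimal Python sketch, not the answer's method: it assumes the ranges are contiguous (no gaps), that at least one range is open-ended (valid_to is None), and that dates are already parsed. The function name split_ranges is hypothetical.

```python
from datetime import date, timedelta

ONE_DAY = timedelta(days=1)

def split_ranges(ranges):
    """Split (valid_from, valid_to) pairs into non-overlapping segments.

    Every valid_from, and the day after every non-null valid_to,
    starts a new segment. Assumes contiguous, open-ended coverage.
    """
    boundaries = set()
    for valid_from, valid_to in ranges:
        boundaries.add(valid_from)
        if valid_to is not None:
            boundaries.add(valid_to + ONE_DAY)
    ordered = sorted(boundaries)
    segments = []
    for i, start in enumerate(ordered):
        if i + 1 < len(ordered):
            # A segment ends the day before the next boundary.
            segments.append((start, ordered[i + 1] - ONE_DAY))
        else:
            # The last segment stays open-ended.
            segments.append((start, None))
    return segments

# Sample rows from the question (both IDs together).
rows = [
    (date(2022, 1, 1), date(2022, 2, 28)),
    (date(2022, 3, 1), date(2022, 6, 30)),
    (date(2022, 7, 1), date(2022, 12, 31)),
    (date(2023, 1, 1), None),
    (date(2022, 1, 1), date(2022, 3, 31)),
    (date(2022, 4, 1), None),
]
segments = split_ranges(rows)
for valid_from, valid_to in segments:
    print(valid_from, valid_to)
```

For the sample rows this yields the five segments listed in the question, with the last one open-ended.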

Related

Query populating dates

query that generates records to hold a future calculated value.
Hi, I am trying to write a query against the tables below to populate a collection. When the dates match I want the t2 values; when there is no match I want the dates to be populated with null values (they will be filled in later with a calculated value). The number of records for an unmatched date should equal the number of records on the most recent date that did match. So in the example there should be 3 records for each day after 7/1, and just 2 for each day after 7/5. I am trying to do this in one query, but I am not sure it is possible. Any help on creating this and getting it into a collection would be much appreciated.
create table t1 as
WITH DATA AS
(SELECT to_date('07/01/2019', 'MM/DD/YYYY') date1,
to_date('07/10/2019', 'MM/DD/YYYY') date2
FROM dual
)
SELECT date1+LEVEL-1 the_date,
TO_CHAR(date1+LEVEL-1, 'DY','NLS_DATE_LANGUAGE=AMERICAN') day
FROM DATA
WHERE TO_CHAR(date1+LEVEL-1, 'DY','NLS_DATE_LANGUAGE=AMERICAN')
NOT IN ('SAT', 'SUN')
CONNECT BY LEVEL <= date2-date1+1;
create table t2
(cdate date,
camount number);
insert into t2 values
('01-JUL-2019', 10);
insert into t2 values
('01-JUL-2019', 20);
insert into t2 values
('01-JUL-2019', 30);
insert into t2 values
('05-JUL-19', 50);
insert into t2 values
('05-JUL-19', 20);
expected results:
01-JUL-19 10
01-JUL-19 20
01-JUL-19 30
02-JUL-19 null
02-JUL-19 null
02-JUL-19 null
03-JUL-19 null
03-JUL-19 null
03-JUL-19 null
04-JUL-19 null
04-JUL-19 null
04-JUL-19 null
05-JUL-19 50
05-JUL-19 20
08-JUL-19 null
08-JUL-19 null
09-JUL-19 null
09-JUL-19 null
10-JUL-19 null
10-JUL-19 null

One approach to this kind of problem is to build the result set incrementally in a few steps:
1. Count the matches that each THE_DATE in T1 has in T2.
2. Apply the rule you outlined in the question to each THE_DATE that has zero matches: carry forward (across dates in ascending order) the number of matches of the last THE_DATE that did have matches.
3. Generate the extra rows in T1 for each THE_DATE that has zero matches (e.g. if it is supposed to have three null records, duplicate the row up to this number).
4. Outer join to T2 to get the CAMOUNT where it is available.
Here's an example (the three named subqueries correspond to steps 1, 2 and 3 above):
WITH DATE_MATCH_COUNT AS (
SELECT T1.THE_DATE,
COUNT(T2.CDATE) AS MATCH_COUNT,
ROW_NUMBER() OVER (PARTITION BY NULL ORDER BY T1.THE_DATE ASC) AS ROWKEY
FROM T1
LEFT OUTER JOIN T2
ON T1.THE_DATE = T2.CDATE
GROUP BY T1.THE_DATE),
ADJUSTED_MATCH_COUNT AS (
SELECT THE_DATE,
MATCH_COUNT AS ACTUAL_MATCH_COUNT,
GREATEST(MATCH_COUNT,
(SELECT MAX(MATCH_COUNT) KEEP ( DENSE_RANK LAST ORDER BY ROWKEY ASC )
FROM DATE_MATCH_COUNT SCALAR_MATCH_COUNT
WHERE SCALAR_MATCH_COUNT.ROWKEY <= DATE_MATCH_COUNT.ROWKEY AND
SCALAR_MATCH_COUNT.MATCH_COUNT > 0)) AS FORCED_MATCH_COUNT
FROM DATE_MATCH_COUNT),
GENERATED_MATCH_ROW AS (
SELECT THE_DATE, FORCED_MATCH_COUNT, MATCH_KEY
FROM ADJUSTED_MATCH_COUNT CROSS APPLY (SELECT LEVEL AS MATCH_KEY
FROM DUAL CONNECT BY LEVEL <= DECODE(ACTUAL_MATCH_COUNT,0,FORCED_MATCH_COUNT,1)))
SELECT THE_DATE, CAMOUNT
FROM GENERATED_MATCH_ROW
LEFT OUTER JOIN T2
ON GENERATED_MATCH_ROW.THE_DATE = T2.CDATE
ORDER BY THE_DATE, CAMOUNT ASC;
Result:
THE_DATE CAMOUNT
____________ __________
01-JUL-19 10
01-JUL-19 20
01-JUL-19 30
02-JUL-19
02-JUL-19
02-JUL-19
03-JUL-19
03-JUL-19
03-JUL-19
04-JUL-19
04-JUL-19
04-JUL-19
05-JUL-19 20
05-JUL-19 50
08-JUL-19
08-JUL-19
09-JUL-19
09-JUL-19
10-JUL-19
10-JUL-19
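The carry-forward rule from the steps above can be illustrated with a short Python sketch. It is not a translation of the query: the function name expand_dates is hypothetical, dates are plain ISO strings, and a date with no earlier match produces no rows at all, which is an assumption of this sketch.

```python
def expand_dates(dates, amounts_by_date):
    """Return (date, amount) rows; unmatched dates get as many
    null rows as the most recent matched date had amounts."""
    rows = []
    last_count = 0  # match count carried forward from the last matched date
    for d in dates:
        amounts = amounts_by_date.get(d, [])
        if amounts:
            last_count = len(amounts)
            rows.extend((d, a) for a in sorted(amounts))
        else:
            rows.extend((d, None) for _ in range(last_count))
    return rows

# Weekdays from 2019-07-01 to 2019-07-10, as in table t1.
weekdays = ["2019-07-01", "2019-07-02", "2019-07-03", "2019-07-04",
            "2019-07-05", "2019-07-08", "2019-07-09", "2019-07-10"]
# Contents of table t2, grouped by date.
t2 = {"2019-07-01": [10, 20, 30], "2019-07-05": [50, 20]}

rows = expand_dates(weekdays, t2)
for the_date, camount in rows:
    print(the_date, camount)
```

With the sample data this produces 20 rows: three per day from 1 July to 4 July and two per day from 5 July onward.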

Determining consecutive and independent PTO days

Based on feedback, I am restructuring my question.
I am working with SQL on a Presto database.
My objective is to report on employees that take consecutive days of PTO or Sick Time since the beginning of 2018. My desired output would have the individual islands of time taken by employee with the start and end dates, along the lines of:
The main table I am using is d_employee_time_off
There are only two time_off_type_name: PTO and Sick Leave.
The ds is a datestamp and I use the latest ds (usually the current date)
I have access to a date table named d_date
I can join the tables on d_employee_time_off.time_off_date = d_date.full_date
I hope that I have structured this question in a fashion that is understandable.

I believe the need here is to join the day-off data to a calendar table.
In the example solution below I generate the calendar on the fly, but you already have your own date table for this. In my example I have used the string 'Monday' and moved backward from it (alternatively, you could use 'Friday' and move forward). I'm not keen on language-dependent solutions, but as I'm not a Presto user I wasn't able to test anything on Presto. So the example below uses some of your own logic, but in SQL Server syntax, which I trust you can translate to Presto:
Query:
;WITH
Digits AS (
SELECT 0 AS digit UNION ALL
SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL
SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL
SELECT 9
)
, cal AS (
SELECT
ca.number
, dateadd(day,ca.number,'20180101') as cal_date
, datename(weekday,dateadd(day,ca.number,'20180101')) weekday
FROM Digits [1s]
CROSS JOIN Digits [10s]
CROSS JOIN Digits [100s] /* add more like this as needed */
cross apply (
SELECT
[1s].digit
+ [10s].digit * 10
+ [100s].digit * 100 /* add more like this as needed */
AS number
) ca
)
, time_off AS (
select
*
from cal
inner join mytable t on (cal.cal_date = t.time_off_date and cal.weekday <> 'Monday')
or (cal.cal_date between dateadd(day,-2,t.time_off_date)
and t.time_off_date and datename(weekday,t.time_off_date) = 'Monday')
)
, starting_points AS (
SELECT
employee_id,
cal_date,
dense_rank() OVER(partition by employee_id
ORDER BY
time_off_date
) AS rownum
FROM
time_off A
WHERE
NOT EXISTS (
SELECT
*
FROM
time_off B
WHERE
B.employee_id = A.employee_id
AND B.cal_date = DATEADD(day, -1, A.cal_date)
)
)
, ending_points AS (
SELECT
employee_id,
cal_date,
dense_rank() OVER(partition by employee_id
ORDER BY
time_off_date
) AS rownum
FROM
time_off A
WHERE
NOT EXISTS (
SELECT
*
FROM
time_off B
WHERE
B.employee_id = A.employee_id
AND B.cal_date = DATEADD(day, 1, A.cal_date)
)
)
SELECT
S.employee_id,
S.cal_date AS start_range,
E.cal_date AS end_range
FROM
starting_points S
JOIN
ending_points E
ON E.employee_id = S.employee_id
AND E.rownum = S.rownum
order by employee_id
, start_range
Result:
    employee_id  start_range  end_range
1   200035       02.01.2018   02.01.2018
2   200035       20.04.2018   27.04.2018
3   200037       27.01.2018   29.01.2018
4   200037       31.03.2018   02.04.2018
see: http://rextester.com/MISZ50793
CREATE TABLE mytable(
ID INT NOT NULL
,employee_id INTEGER NOT NULL
,type VARCHAR(3) NOT NULL
,time_off_date DATE NOT NULL
,time_off_in_days INT NOT NULL
);
INSERT INTO mytable(id,employee_id,type,time_off_date,time_off_in_days) VALUES (1,200035,'PTO','2018-01-02',1);
INSERT INTO mytable(id,employee_id,type,time_off_date,time_off_in_days) VALUES (2,200035,'PTO','2018-04-20',1);
INSERT INTO mytable(id,employee_id,type,time_off_date,time_off_in_days) VALUES (3,200035,'PTO','2018-04-23',1);
INSERT INTO mytable(id,employee_id,type,time_off_date,time_off_in_days) VALUES (4,200035,'PTO','2018-04-24',1);
INSERT INTO mytable(id,employee_id,type,time_off_date,time_off_in_days) VALUES (5,200035,'PTO','2018-04-25',1);
INSERT INTO mytable(id,employee_id,type,time_off_date,time_off_in_days) VALUES (6,200035,'PTO','2018-04-26',1);
INSERT INTO mytable(id,employee_id,type,time_off_date,time_off_in_days) VALUES (7,200035,'PTO','2018-04-27',1);
INSERT INTO mytable(id,employee_id,type,time_off_date,time_off_in_days) VALUES (8,200037,'PTO','2018-01-29',1);
INSERT INTO mytable(id,employee_id,type,time_off_date,time_off_in_days) VALUES (9,200037,'PTO','2018-04-02',1);
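The island logic of the query can be sketched in Python as well. This is a minimal sketch under the same assumption the answer makes (a Monday off also claims the preceding Saturday and Sunday); the function names expand_days and islands are hypothetical.

```python
from datetime import date, timedelta

def expand_days(days_off):
    """Add the preceding Saturday and Sunday to every Monday off."""
    expanded = set()
    for d in days_off:
        expanded.add(d)
        if d.weekday() == 0:  # Monday
            expanded.add(d - timedelta(days=1))
            expanded.add(d - timedelta(days=2))
    return sorted(expanded)

def islands(days_off):
    """Group the expanded days into (start, end) runs of consecutive dates."""
    runs = []
    for d in expand_days(days_off):
        if runs and d - runs[-1][1] == timedelta(days=1):
            runs[-1][1] = d  # extends the current island
        else:
            runs.append([d, d])  # starts a new island
    return [(start, end) for start, end in runs]

# Days off per employee, taken from the sample INSERT statements.
time_off = {
    200035: [date(2018, 1, 2), date(2018, 4, 20), date(2018, 4, 23),
             date(2018, 4, 24), date(2018, 4, 25), date(2018, 4, 26),
             date(2018, 4, 27)],
    200037: [date(2018, 1, 29), date(2018, 4, 2)],
}
result = {emp: islands(days) for emp, days in time_off.items()}
for emp, ranges in result.items():
    for start, end in ranges:
        print(emp, start, end)
```

For the sample data the sketch returns the same four ranges as the result table above.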

How to write SQL script to update conditional data and constraints?

I am new to SQL and have not found a way to do this.
I have 2 tables, A and B:
A              B
ID (char)      Year (char)
Zone (char)    Code (char)
ZCode (char)
At first, table B is completely empty. Example data for table A:
A
01 A 2013/AA
02 A 2018/KK
03 A null
04 B
05 B 2016/HH
I want to copy data from table A into table B, taking for each Zone only the ZCode with the latest year and splitting the ZCode at the "/". This is the result I want:
B
2018 KK
2016 HH
Looking forward to having someone give me the solution to do this.

A very simple solution if your data is consistent. It only works if the value always has a complete four-digit year on the left (e.g. 2018) and exactly 2 characters on the right. This hard-codes the column lengths, but I can't see a reason why you couldn't use it.
Using Max will select the latest year per code:
Insert into tableB (Year,Code)
select Max(Left(Columnname,4)) year,
Right (columnname,2) Code from TableA
where Right (columnname,2) is not null and Right (columnname,2)<> ''
group by Right (columnname,2)

Try This
IF OBJECT_ID('dbo.TableA')IS NOT NULL
DROP TABLE TableA
IF OBJECT_ID('dbo.TableB')IS NOT NULL
DROP TABLE TableB
CREATE TABLE TableA (Id INT,Zone VARCHAR(2) ,ZCode VARCHAR(20))
CREATE TABLE TableB ([Year] INT,Code VARCHAR(20))
GO
INSERT INTO TableA(Id,Zone,ZCode)
SELECT 01,'A','2013/AA' UNION ALL
SELECT 02,'B','2016/HH' UNION ALL
SELECT 03,'A','2018/KK'
GO
INSERT INTO TableB
SELECT [Year]
,[Code]
FROM
(
SELECT SUBSTRING(ZCode,0,CHARINDEX('/',ZCode)) As [Year]
,SUBSTRING(ZCode,CHARINDEX('/',ZCode)+1,LEN(ZCode)) AS Code
FROM TableA
)dt
SELECT * FROM TableB ORDER BY [Year] DESC
Result
Year Code
------------
2018 KK
2016 HH
2013 AA

In order to UPDATE tableB, you would need to JOIN this result with tableB on the Year / Code columns:
WITH CTE AS
(
SELECT
left(a.zcode, 4) year,
substring(a.zcode, charindex('/', a.zcode)+1, len(a.zcode)) code
FROM tableA a
INNER JOIN (
select Zone, max(left(zcode, 4)) year
FROM tableA
GROUP BY Zone
)b ON a.Zone = b.zone and b.year = left(a.zcode, 4)
)
SELECT * FROM CTE

The code snippet below gives your desired output; depending on your requirement you can then either do an INSERT into tableB or an UPDATE:
DECLARE @A TABLE(ID CHAR(10), ZONE CHAR(10), ZCODE CHAR(20))
INSERT INTO @A VALUES
('01', 'A', '2013/AA'),
('02', 'B', '2016/HH'),
('03', 'A', '2018/KK')
SELECT Year,Code FROM(
SELECT Year,Code,ROW_NUMBER() OVER (PARTITION BY ZONE ORDER BY Year DESC) rn FROM
(SELECT cast(concat('<x>', REPLACE(ZCODE, '/', '</x><x>'), '</x>') as xml).value('/x[1]','varchar(100)') AS Year,
cast(concat('<x>', REPLACE(ZCODE, '/', '</x><x>'), '</x>') as xml).value('/x[2]','varchar(100)') AS Code,*
FROM @A WHERE ZCODE IS NOT NULL) t1) t2
WHERE rn = 1;

You can use this query to insert data into table B when it is completely empty:
INSERT INTO B ([YEAR], [Code])
select
Substring(ZCode,0,charindex('/',ZCode)) BYEAR,
Substring(ZCode,charindex('/',ZCode)+1,LEN(ZCode)-charindex('/',ZCode)) BCode
from A
Otherwise, you can adapt this query to update the records of table B based on BCode.
Query edited for the NOT NULL and GROUP BY conditions:
select MAX(v.BYEAR), v.BCode from
(select
Substring(ZCode,0,charindex('/',ZCode)) BYEAR
,Substring(ZCode,charindex('/',ZCode)+1,LEN(ZCode)-charindex('/',ZCode)) BCode
from A ) v
Where v.BCODE IS NOT NULL
Group by v.BCODE
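The rule from the question (per Zone, keep only the ZCode with the latest year, split at the "/") can be written down as a small Python sketch to check an SQL result against. The function name latest_code_per_zone is hypothetical, and skipping null or empty ZCode values is an assumption based on the sample data.

```python
def latest_code_per_zone(rows):
    """rows: (id, zone, zcode) tuples; returns (year, code) pairs,
    one per zone, for the ZCode with the highest year."""
    best = {}
    for _id, zone, zcode in rows:
        if not zcode or "/" not in zcode:
            continue  # skip null or empty ZCode values
        year_text, code = zcode.split("/", 1)
        year = int(year_text)
        if zone not in best or year > best[zone][0]:
            best[zone] = (year, code)
    return sorted(best.values(), reverse=True)

# Example data of table A from the question.
table_a = [
    ("01", "A", "2013/AA"),
    ("02", "A", "2018/KK"),
    ("03", "A", None),
    ("04", "B", ""),
    ("05", "B", "2016/HH"),
]
table_b = latest_code_per_zone(table_a)
print(table_b)  # [(2018, 'KK'), (2016, 'HH')]
```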

Insert data into empty cells in ascending order

I have a Postgres table with following structure:
CREATE TABLE tb1 (
id integer,
name text,
date date,
time time without time zone
);
CREATE TABLE tb2 (
id integer,
name text,
date date
);
I need to generate a third table tb3 with a column time_now that increases in steps of 10 minutes. If no tb1.time matches a tb3.time_now slot, the slot is filled with the next tb2.name. If tb1.time equals (or is close to) time_now, the tb1 row is inserted into tb3 instead.
Example:
tb1
1, xxxx, 2014-10-01, 08:20:00
2, yyyy, 2014-10-01, 08:40:00
tb2
1, zzzz, 2014-10-01
2, vvvv, 2014-10-01
3, eeee, 2014-10-01
3rd table should look like:
1, 08:00:00,zzzz -----> from tb2
2, 08:10:00,vvvv -----> from tb2
3, 08:20:00,xxxx -----> from tb1
4, 08:30:00,eeee -----> from tb2
5, 08:40:00,yyyy -----> from tb1
How to achieve this?

SELECT t.id, t.time::text, COALESCE(t.name, t2.name) AS name
FROM (
SELECT g.id, g.time, t1.name
, CASE WHEN t1.name IS NULL THEN
row_number() OVER (PARTITION BY t1.name ORDER BY g.id)
END AS rn
FROM (
SELECT g AS id, '08:00'::time + '10 min'::interval * (g-1) AS time
FROM generate_series (1,6) g
) g
LEFT JOIN tb1 t1 USING (time)
) t
LEFT JOIN tb2 t2 ON t2.id = t.rn
ORDER BY t.id;
First build a table of desired times with generate_series(). From 8:00 to 8:50 in my example.
Join to tb1 on time. Attach ascending numbers to empty slots with row_number() (rn).
Join tb2 to the remaining empty slots in ascending order.
Use COALESCE to pick names from tb1 and tb2.
Be wary about off-by-1 errors.
Aside: I would use none of those column names. id and name are not descriptive - never use those, be more specific. date and time are basic type names.
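The slot-filling steps of the query can be mirrored in a few lines of Python. This is only a sketch of the same idea with exact time matches; the function name fill_slots and the fixed 10-minute step are assumptions for illustration.

```python
from datetime import datetime, timedelta

def fill_slots(first_slot, slot_count, timed_names, filler_names):
    """Build (id, time, name) rows: slots that match a timed name use it,
    the remaining slots take filler names in their given order."""
    fillers = iter(filler_names)
    start = datetime.strptime(first_slot, "%H:%M:%S")
    rows = []
    for i in range(slot_count):
        slot = (start + timedelta(minutes=10 * i)).strftime("%H:%M:%S")
        name = timed_names.get(slot)
        if name is None:
            # Empty slot: take the next unused filler, or None when exhausted.
            name = next(fillers, None)
        rows.append((i + 1, slot, name))
    return rows

tb1 = {"08:20:00": "xxxx", "08:40:00": "yyyy"}  # time -> name
tb2 = ["zzzz", "vvvv", "eeee"]                  # names ordered by id

tb3 = fill_slots("08:00:00", 5, tb1, tb2)
for row in tb3:
    print(row)
```

With five slots this reproduces the table from the question; a sixth slot (08:50:00) would get no name because tb2 is exhausted.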

SQL to Return missing Row

I have a scenario where I need to find missing records in a table using SQL only, without cursors, views or stored procedures.
For a particular CustID initial Start_Date will be 19000101 and End_date will be any random date.
Then for next Record for the same CustID will have its Start_Date as End_Date (of previous Record) + 1.
Its End_Date again will be any random date.
And so on….
For Last record of same CustID its end Date will be 99991231.
Following population of data will explain it better.
CustID Start_Date End_Date
1 19000101 20121231
1 20130101 20130831
1 20130901 20140321
1 20140321 99991231
Basically I am trying to populate data like in SCD2 scenario.
Now I want to find missing record (or CustID).
In the example below, the record for CustID = 4 with Start_Date = 20120606 and End_Date = 20140101 is missing:
CustID Start_Date End_Date
4 19000101 20120605
4 20140102 99991231
Code for Creating Table
CREATE TABLE TestTable
(
CustID int,
Start_Date int,
End_Date int
)
INSERT INTO TestTable values (1,19000101,20121231)
INSERT INTO TestTable values (1,20130101,20130831)
INSERT INTO TestTable values (1,20130901,20140321)
INSERT INTO TestTable values (1,20140321,99991231)
INSERT INTO TestTable values (2,19000101,99991213)
INSERT INTO TestTable values (3,19000101,20140202)
INSERT INTO TestTable values (3,20140203,99991231)
INSERT INTO TestTable values (4,19000101,20120605)
--INSERT INTO TestTable values (4,20120606,20140101) --Missing Value
INSERT INTO TestTable values (4,20140102,99991231)
Now the SQL should return CustID = 4, as it has a missing value.

My idea is based on this logic. Let's treat 19000101 as 1 and 99991231 as 10. For every ID, if you subtract Start_Date from End_Date for each row and add the differences up, the total must equal 9 (10 - 1). You can do the same here:
SELECT ID, SUM(END_DATE - START_DATE) as total from TABLE group by ID having SUM(END_DATE - START_DATE) < (MAX_END_DATE - MIN_START_DATE)
You will want to use your database's function for the number of days between two dates in the SUM part.
Let's take the following example:
1 1900 2003
1 2003 9999
2 1900 2222
2 2222 9977
3 1900 9999
The query will be evaluated as follows:
1: (2003 - 1900) + (9999 - 2003) = 8099
2: (2222 - 1900) + (9977 - 2222) = 8077
3: (9999 - 1900) = 8099
The HAVING clause eliminates 1 and 3 (their totals equal 9999 - 1900 = 8099), leaving only 2, which is what you want.

If you just need the CustID then this will do
SELECT t1.CustID
FROM TestTable t1
LEFT JOIN TestTable t2
ON DATEADD(D, 1, t1.Start_Date) = t2.Start_Date
WHERE t2.CustID IS NULL
GROUP BY t1.CustID

You need the rows for which one of the following conditions is met:
Not a final row (99991231) and no matching next row
Not a start row (19000101) and no matching previous row
You can left join to the same table to find previous and next rows and filter the results where you don't find a row by checking the column values for null:
SELECT t1.CustID, t1.Start_Date, t1.End_Date
FROM TestTable t1
LEFT JOIN TestTable tPrevious on tPrevious.CustID = t1.CustID
and tPrevious.End_Date = t1.Start_Date - 1
LEFT JOIN TestTable tNext on tNext.CustID = t1.CustID
and tNext.Start_Date = t1.End_Date + 1
WHERE (t1.End_Date <> 99991231 and tNext.CustID is null) -- no following
or (t1.Start_Date <> 19000101 and tPrevious.CustID is null) -- no previous
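Because the sample table stores dates as yyyymmdd integers, adding 1 to an End_Date does not always give the next calendar day (20121231 + 1 is 20121232, not 20130101). The Python sketch below parses the integers into real dates before comparing neighbouring rows. It is one possible reading of the requirement, not a translation of the queries above: it only reports gaps between consecutive rows of a customer, and the function names are hypothetical.

```python
from datetime import datetime

def parse_yyyymmdd(value):
    """Turn an integer such as 20120605 into a date."""
    return datetime.strptime(str(value), "%Y%m%d").date()

def customers_with_gaps(rows):
    """rows: (cust_id, start, end) with yyyymmdd integers.
    Returns the ids whose consecutive ranges leave uncovered days."""
    by_customer = {}
    for cust_id, start, end in rows:
        by_customer.setdefault(cust_id, []).append(
            (parse_yyyymmdd(start), parse_yyyymmdd(end)))
    missing = []
    for cust_id, ranges in sorted(by_customer.items()):
        ranges.sort()
        for (_, prev_end), (next_start, _) in zip(ranges, ranges[1:]):
            # More than one day between end and next start means a gap.
            if (next_start - prev_end).days > 1:
                missing.append(cust_id)
                break
    return missing

# Rows from the CREATE TABLE / INSERT script in the question.
test_table = [
    (1, 19000101, 20121231), (1, 20130101, 20130831),
    (1, 20130901, 20140321), (1, 20140321, 99991231),
    (2, 19000101, 99991213),
    (3, 19000101, 20140202), (3, 20140203, 99991231),
    (4, 19000101, 20120605), (4, 20140102, 99991231),
]
gaps = customers_with_gaps(test_table)
print(gaps)  # [4]
```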
