How do I do this? Time interval - SQL

There is a table containing these rows:
(1, 'b', '2010-01-01 00:00:00', '2020-01-01 00:00:00'),
(1, 'z', '2010-02-01 00:00:00', '2015-01-01 00:00:00'),
How can I produce this:
(1, 'b', '2010-01-01 00:00:00', '2010-01-31 23:59:59'),
(1, 'z', '2010-02-01 00:00:00', '2015-01-01 00:00:00'),
(1, 'b', '2015-01-01 00:00:01', '2020-01-01 00:00:00');

You can do it this way. I didn't add the part where you take a second away from the end date or add a second to the start date, as I didn't see the logic there:
with cte as
(
select 1 as a, 'b' as b, cast('2010-01-01 00:00:00' as date) as start_, cast('2020-01-01 00:00:00' as date) as end_
union select 1, 'z', '2010-02-01 00:00:00', '2015-01-01 00:00:00'
),
cte2 as
(
select start_ as date_ from cte union select end_ from cte
),
cte3 as
(
select a, b, date_ from cte2 a inner join cte b on date_ between start_ and end_
),
final as
(
select a.a, a.b, a.date_ as startdate,
case when a.b = lead(a.b)over(order by a.date_) then lead(a.date_)over(order by a.date_) end as enddate
from cte3 a
)
select * from final where enddate is not null order by startdate
Output:
a b startdate enddate
1 b 2010-01-01 2010-02-01
1 z 2010-02-01 2015-01-01
1 b 2015-01-01 2020-01-01
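For the one-second shifts mentioned at the top of the answer, a rough post-processing step could look like the sketch below. This is my own addition, not part of the answer above: it assumes SQL Server-style DATEADD, assumes the casts in the first CTE keep full datetimes rather than plain dates, and it only nudges the carved-out fragments (pieces that do not match an original source row), which is what produces the 23:59:59 / 00:00:01 boundaries in the question. It is written as one more CTE appended after final (hence the leading comma), replacing the last SELECT of the query above:
, flagged as
(
select f.*,
       -- 1 = this piece is exactly one of the source rows, so leave its boundaries alone
       case when exists (select 1 from cte c
                         where c.start_ = f.startdate and c.end_ = f.enddate)
            then 1 else 0 end as is_original
from final f
where f.enddate is not null
)
select a, b,
       case when is_original = 0
             and lag(enddate) over (order by startdate) = startdate
            then dateadd(second, 1, startdate)   -- fragment starts where the previous piece ends
            else startdate end as startdate,
       case when is_original = 0
             and lead(startdate) over (order by startdate) = enddate
            then dateadd(second, -1, enddate)    -- fragment ends where the next piece starts
            else enddate end as enddate
from flagged
order by startdate;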

Postgres grouping by range

I have data looking like this
What I am trying to achieve is data for a histogram that counts values into specific ranges: for category A the value range 1-100, for category B the range 0-125, and for category C only rows where the value = 5. The problem I have is that the data is in multiple rows, and I need to filter first on C and then count the values into ranges to display the histogram.
I want to get counts, let's say per 10 seconds, looking like this.
Code to generate data:
CREATE TEMP TABLE sample (
ts timestamp
, category varchar(2)
, val int
);
insert into sample values
(to_timestamp('01.01.2018 08:00:01', 'dd-mm-yyyy hh24:mi:ss'), 'A', 12),
(to_timestamp('01.01.2018 08:00:02', 'dd-mm-yyyy hh24:mi:ss'), 'A', 44),
(to_timestamp('01.01.2018 08:00:03', 'dd-mm-yyyy hh24:mi:ss'), 'C', 1),
(to_timestamp('01.01.2018 08:00:04', 'dd-mm-yyyy hh24:mi:ss'), 'B', 24),
(to_timestamp('01.01.2018 08:00:05', 'dd-mm-yyyy hh24:mi:ss'), 'B', 111),
(to_timestamp('01.01.2018 08:00:06', 'dd-mm-yyyy hh24:mi:ss'), 'C', 5),
(to_timestamp('01.01.2018 08:00:07', 'dd-mm-yyyy hh24:mi:ss'), 'A', 145),
(to_timestamp('01.01.2018 08:00:01', 'dd-mm-yyyy hh24:mi:ss'), 'B', 16),
(to_timestamp('01.01.2018 08:00:01', 'dd-mm-yyyy hh24:mi:ss'), 'C', 47),
(to_timestamp('01.01.2018 08:00:02', 'dd-mm-yyyy hh24:mi:ss'), 'C', 5),
(to_timestamp('01.01.2018 08:00:02', 'dd-mm-yyyy hh24:mi:ss'), 'B', 34),
(to_timestamp('01.01.2018 08:00:03', 'dd-mm-yyyy hh24:mi:ss'), 'B', 111),
(to_timestamp('01.01.2018 08:00:03', 'dd-mm-yyyy hh24:mi:ss'), 'C', 5),
(to_timestamp('01.01.2018 08:00:01', 'dd-mm-yyyy hh24:mi:ss'), 'A', 19),
(to_timestamp('01.01.2018 08:00:01', 'dd-mm-yyyy hh24:mi:ss'), 'B', 46),
(to_timestamp('01.01.2018 08:00:01', 'dd-mm-yyyy hh24:mi:ss'), 'C', 57)
I thought I could pivot the data like so:
select
ts,
category,
case when category = 'A' then val end as "A",
case when category = 'B' then val end as "B",
case when category = 'C' then val end as "C"
from sample
order by ts
but then I have the problem of the NULLs that the pivot produces.
Here it is:
with periods(pts) as
(
select *
from generate_series
(
timestamp '2018-01-01 08:00:00',
timestamp '2018-01-01 08:01:00',
interval '10 seconds'
) ts
)
select pts period_start,
pts + interval '10 seconds' period_end,
lat.cat_a,
lat.cat_b,
lat.cat_c
from periods
cross join lateral
(
select count(1) filter (where category = 'A' and val between 0 and 100) as cat_a,
count(1) filter (where category = 'B' and val between 0 and 125) as cat_b,
count(1) filter (where category = 'C' and val = 5) as cat_c
from sample
where ts >= pts and ts < pts + interval '10 seconds'
) lat;
period_start          period_end            cat_a  cat_b  cat_c
2018-01-01 08:00:00   2018-01-01 08:00:10   2      2      1
2018-01-01 08:00:10   2018-01-01 08:00:20   0      0      0
2018-01-01 08:00:20   2018-01-01 08:00:30   0      0      0
2018-01-01 08:00:30   2018-01-01 08:00:40   0      0      0
2018-01-01 08:00:40   2018-01-01 08:00:50   0      0      0
2018-01-01 08:00:50   2018-01-01 08:01:00   0      0      0
2018-01-01 08:01:00   2018-01-01 08:01:10   0      0      0
One-row version is simple:
select min(ts) period_start,
max(ts) period_end,
count(1) filter (where category = 'A' and val between 0 and 100) as cat_a,
count(1) filter (where category = 'B' and val between 0 and 125) as cat_b,
count(1) filter (where category = 'C' and val = 5) as cat_c
from sample;
Added after the clarification comments
select * from (<the first version of the query here>) t where cat_c > 0;
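Spelled out, that wrapper is just the first query dropped into the placeholder, with the filter applied to the derived cat_c column:
with periods(pts) as
(
select *
from generate_series
(
timestamp '2018-01-01 08:00:00',
timestamp '2018-01-01 08:01:00',
interval '10 seconds'
) ts
)
select *
from (
select pts period_start,
pts + interval '10 seconds' period_end,
lat.cat_a,
lat.cat_b,
lat.cat_c
from periods
cross join lateral
(
select count(1) filter (where category = 'A' and val between 0 and 100) as cat_a,
count(1) filter (where category = 'B' and val between 0 and 125) as cat_b,
count(1) filter (where category = 'C' and val = 5) as cat_c
from sample
where ts >= pts and ts < pts + interval '10 seconds'
) lat
) t
where cat_c > 0;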

find at least 2 consecutive items based on date ranges

There are a lot of solutions to similar questions, but they are based on only one date column.
I would like to know if there is a better way to solve this. I am attaching my solution, but I find it a bit complicated; if you know a better approach, please post it.
Here is a table of orders with start and end dates for 2 items.
I would like to return runs of at least 2 consecutive rows based on date and item.
ITEM , START , END
1. A, 01.01.2020, 31.01.2020
2. A, 01.02.2020, 31.03.2020
3. B, 01.02.2020, 30.04.2020
4. A, 01.05.2020, 30.06.2020
5. B, 01.06.2020, 31.07.2020
6. B, 01.09.2020, 30.09.2020
7. A, 01.08.2020, 31.10.2020
8. B, 01.10.2020, 31.10.2020
9. B, 01.11.2020, 31.12.2020
The output should be rows 1 and 2 for item A, and rows 6, 8 and 9 for item B.
Here is my approach:
with pool as (
select ITEM, START_DATE, END_DATE,
nvl(lag(end_date,1) over (partition by item order by end_date),START_DATE-1) prev_End_Date
from orders )
, pool2 as (
select item ,
START_DATE, END_DATE,
sum(case when PREV_END_DATE+1 = START_DATE then 0 else 1 end ) over (partition by item order by START_DATE) grp
from pool )
select item,start_date,end_date from (
select
ITEM,
START_DATE,
END_DATE,
grp,
count(grp) over (partition by item,grp ) cnt
from pool2)
where cnt>=2
;
Hmmm . . . use lag() and lead() to see the next/previous values and check if they match:
select o.*
from (select o.*,
lag(end_date) over (partition by item order by start_date) as prev_end,
lead(start_date) over (partition by item order by start_date) as next_start
from orders o
) o
where start_date = prev_end + interval '1' day or
end_date = next_start - interval '1' day;
-- create table and insert rows for test
Create table order_overlap (id number, item varchar2(1), start_date date , end_date date );
insert into order_overlap(id,start_date, end_date, item) values( 1,to_date('01.01.2020', 'dd.mm.yyyy'), to_date( '31.01.2020', 'dd.mm.yyyy'), 'A');
insert into order_overlap(id,start_date, end_date, item) values( 2, to_date('01.02.2020', 'dd.mm.yyyy'), to_date( '31.03.2020', 'dd.mm.yyyy'), 'A');
insert into order_overlap(id,start_date, end_date, item) values( 3, to_date('01.02.2020', 'dd.mm.yyyy'), to_date( '30.04.2020', 'dd.mm.yyyy'), 'B');
insert into order_overlap(id,start_date, end_date, item) values( 4, to_date('01.05.2020', 'dd.mm.yyyy'), to_date( '30.06.2020', 'dd.mm.yyyy'), 'A');
insert into order_overlap(id,start_date, end_date, item) values( 5, to_date('01.06.2020', 'dd.mm.yyyy'), to_date( '31.07.2020', 'dd.mm.yyyy'), 'B');
insert into order_overlap(id,start_date, end_date, item) values( 6, to_date('01.09.2020', 'dd.mm.yyyy'), to_date( '30.09.2020', 'dd.mm.yyyy'), 'B');
insert into order_overlap(id,start_date, end_date, item) values( 7, to_date('01.08.2020', 'dd.mm.yyyy'), to_date( '31.10.2020', 'dd.mm.yyyy'), 'A');
insert into order_overlap(id,start_date, end_date, item) values( 8, to_date('01.10.2020', 'dd.mm.yyyy'), to_date( '31.10.2020', 'dd.mm.yyyy'), 'B');
insert into order_overlap(id,start_date, end_date, item) values( 9, to_date('01.11.2020', 'dd.mm.yyyy'), to_date( '31.12.2020', 'dd.mm.yyyy'), 'B');
-- I did something a little bit different, but maybe you'll like it.
-- I joined consecutive rows into one - so if you have
--   A 01.01.2020 - 31.01.2020
--   A 01.02.2020 - 28.02.2020
-- you get one record:
--   A 01.01.2020 - 28.02.2020
select item, min(start_date) start_date , max(end_date) end_date, count(*)
from (
select item, start_date, end_date,
case when lead(start_date) over(partition by item order by start_date) = end_date + 1
OR lag(end_date) over(partition by item order by end_date) + 1 = start_date
then 0
else rownum
end continuity
from order_overlap )
group by item, continuity
order by item, start_date;
You can simply use MATCH_RECOGNIZE to perform a row-by-row comparison and to only return the groups of rows which match the pattern:
SELECT *
FROM table_name
MATCH_RECOGNIZE (
PARTITION BY item
ORDER BY start_date, end_date
ALL ROWS PER MATCH
PATTERN ( FIRST_ROW NEXT_ROWS+ )
DEFINE
NEXT_ROWS AS (
NEXT_ROWS.START_DATE = PREV( END_DATE ) + INTERVAL '1' DAY
)
)
So, for your sample data:
CREATE TABLE table_name ( ITEM, START_DATE, END_DATE ) AS
SELECT 'A', DATE '2020-01-01', DATE '2020-01-31' FROM DUAL UNION ALL
SELECT 'A', DATE '2020-02-01', DATE '2020-03-31' FROM DUAL UNION ALL
SELECT 'B', DATE '2020-02-01', DATE '2020-04-30' FROM DUAL UNION ALL
SELECT 'A', DATE '2020-05-01', DATE '2020-06-30' FROM DUAL UNION ALL
SELECT 'B', DATE '2020-06-01', DATE '2020-07-31' FROM DUAL UNION ALL
SELECT 'B', DATE '2020-09-01', DATE '2020-09-30' FROM DUAL UNION ALL
SELECT 'A', DATE '2020-08-01', DATE '2020-10-31' FROM DUAL UNION ALL
SELECT 'B', DATE '2020-10-01', DATE '2020-10-31' FROM DUAL UNION ALL
SELECT 'B', DATE '2020-11-01', DATE '2020-12-31' FROM DUAL;
This outputs:
ITEM | START_DATE | END_DATE
:--- | :--------- | :---------
A | 2020-01-01 | 2020-01-31
A | 2020-02-01 | 2020-03-31
B | 2020-09-01 | 2020-09-30
B | 2020-10-01 | 2020-10-31
B | 2020-11-01 | 2020-12-31
db<>fiddle here
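If you would rather collapse each run of consecutive orders into a single row, the way the previous answer's merge query does, a ONE ROW PER MATCH variant along the following lines should work. This is a sketch I am adding for illustration, not part of the original answer, and the measure names are my own:
SELECT *
FROM table_name
MATCH_RECOGNIZE (
PARTITION BY item
ORDER BY start_date, end_date
MEASURES
FIRST( start_date ) AS run_start,  -- first start date of the consecutive run
LAST( end_date )    AS run_end,    -- last end date of the consecutive run
COUNT(*)            AS num_orders  -- how many orders the run contains
ONE ROW PER MATCH
PATTERN ( FIRST_ROW NEXT_ROWS+ )
DEFINE
NEXT_ROWS AS (
NEXT_ROWS.START_DATE = PREV( END_DATE ) + INTERVAL '1' DAY
)
)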

need to get a subsequent record with a specific value

I have the following table:
Dt Status
05.23.2019 10:00:00 A
05.23.2019 11:00:00 B
05.23.2019 12:00:00 B
05.23.2019 13:00:00 D
05.23.2019 14:00:00 A
05.23.2019 15:00:00 B
05.23.2019 16:00:00 C
05.23.2019 17:00:00 D
05.23.2019 18:00:00 A
For each status A, I need to get the next status D. The result should look like this:
Status1 Status2 Dt1 Dt2
A D 05.23.2019 10:00:00 05.23.2019 13:00:00
A D 05.23.2019 14:00:00 05.23.2019 17:00:00
A null 05.23.2019 18:00:00 null
I have my own solution based on CROSS/OUTER APPLY; for performance reasons, I need a solution without CROSS/OUTER APPLY.
We can try using ROW_NUMBER here along with some pivot logic:
WITH cte AS (
SELECT *,
ROW_NUMBER() OVER (PARTITION BY Status ORDER BY Dt) rn
FROM yourTable
WHERE Status IN ('A', 'D')
)
SELECT
MAX(CASE WHEN Status = 'A' THEN Status END) AS Status1,
MAX(CASE WHEN Status = 'D' THEN Status END) AS Status2,
MAX(CASE WHEN Status = 'A' THEN Dt END) AS Dt1,
MAX(CASE WHEN Status = 'D' THEN Dt END) AS Dt2
FROM cte
GROUP BY rn
ORDER BY rn;
Demo
The idea here is to generate a row number sequence along your entire table, for each separate Status value (A or D). Then, aggregate by that row number sequence to bring the A and D records together.
As result columns Status1 and Status2 always seem to be "A" and "D" respectively, I omitted them in my result.
CREATE TABLE #Data
(
[Dt] SMALLDATETIME,
[Status] CHAR(1)
);
INSERT INTO #Data ([Dt], [Status]) VALUES
('2019-05-23 10:00:00', 'A'),
('2019-05-23 11:00:00', 'B'),
('2019-05-23 12:00:00', 'B'),
('2019-05-23 13:00:00', 'D'),
('2019-05-23 14:00:00', 'A'),
('2019-05-23 15:00:00', 'B'),
('2019-05-23 16:00:00', 'C'),
('2019-05-23 17:00:00', 'D'),
('2019-05-23 18:00:00', 'A'),
('2019-05-23 19:00:00', 'D'),
('2019-05-23 20:00:00', 'D'),
('2019-05-23 21:00:00', 'A'),
('2019-05-23 22:00:00', 'A'),
('2019-05-23 23:00:00', 'A');
SELECT
D.[Dt] AS [Dt1],
[LastDBeforeNextA].[Dt] AS [Dt2]
FROM
#Data AS D
OUTER APPLY (SELECT TOP (1) [Dt]
FROM #Data
WHERE [Status] = 'A' AND [Dt] > D.[Dt]
ORDER BY [Dt]) AS [NextA]
OUTER APPLY (SELECT TOP (1) [Dt]
FROM #Data
WHERE [Status] = 'D' AND [Dt] < [NextA].[Dt] AND [Dt] > D.[Dt]
ORDER BY [Dt] DESC) AS [LastDBeforeNextA]
WHERE
D.[Status] = 'A' AND
([NextA].[Dt] > [LastDBeforeNextA].[Dt] OR ([LastDBeforeNextA].[Dt] IS NULL AND [NextA].[Dt] IS NULL))
It initially gets all records from the table where status is 'A' (using expression D.[Status] = 'A' in the WHERE-clause).
For each record found, it joins the date of the next record with status A (table expression with alias NextA) and the date of the last record with status D that comes right before the next A-record but after the current A-record (table expression with alias LastDBeforeNextA).
Results are valid when a D-record is found (expression [NextA].[Dt] > [LastDBeforeNextA].[Dt] in the WHERE-clause) or when there is no D-record yet (expression [LastDBeforeNextA].[Dt] IS NULL in the WHERE-clause). In the latter case, you need to get the latest A-record, however (expression [NextA].[Dt] IS NULL in the WHERE-clause), since there can be multiple A-records after the last D-record.

Date-wise hourly (24-hour) customer count

I have a data set with customer id, customer join time, and leave time. I want to count customers on an hourly basis for each date.
Here is the sample data set.
My expected output is shown below, after the code.
Here is the code snippet I tried: I first created the 24 hour spans, then joined and used an aggregate function to get the expected result. It works for the current date, but I need it for any date, i.e. dynamically.
select logdate as date,timespan,count(customer_id)
from
(
SELECT userid,cast(joinTime as date) as logdate,customer_id
,starttime,endtime,timespan
FROM login_out_logs AS logTable
left join
(select '00:00:00 - 01:00:00' timespan,DATEadd(hh,0,cast(dateadd(dd,-1,getdate()))) starttime,dateadd(hh,1,cast(dateadd(dd,-1,getdate()))) endtime
union
select '01:00:00 - 02:00:00', dateadd(hh,1,cast(dateadd(dd,-1,getdate()))),dateadd(hh,2,cast(dateadd(dd,-1,getdate())))
union
select '02:00:00 - 03:00:00', dateadd(hh,2,cast(dateadd(dd,-1,getdate()))),dateadd(hh,3,cast(dateadd(dd,-1,getdate())))
union
select '03:00:00 - 04:00:00', dateadd(hh,3,cast(dateadd(dd,-1,getdate()))),dateadd(hh,4,cast(dateadd(dd,-1,getdate())))
union
select '04:00:00 - 05:00:00', dateadd(hh,4,cast(dateadd(dd,-1,getdate()))),dateadd(hh,5,cast(dateadd(dd,-1,getdate())))
union
select '05:00:00 - 06:00:00',dateadd(hh,5,cast(dateadd(dd,-1,getdate()))),dateadd(hh,6,cast(dateadd(dd,-1,getdate())))
union
select '06:00:00 - 07:00:00',dateadd(hh,6,cast(dateadd(dd,-1,getdate()))),dateadd(hh,7,cast(dateadd(dd,-1,getdate())))
union
select '07:00:00 - 08:00:00',dateadd(hh,7,cast(dateadd(dd,-1,getdate()))),dateadd(hh,8,cast(dateadd(dd,-1,getdate())))
union
select '08:00:00 - 09:00:00',dateadd(hh,8,cast(dateadd(dd,-1,getdate()))),dateadd(hh,9,cast(dateadd(dd,-1,getdate())))
union
select '09:00:00 - 10:00:00',dateadd(hh,9,cast(dateadd(dd,-1,getdate()))),dateadd(hh,10,cast(dateadd(dd,-1,getdate())))
union
select '10:00:00 - 11:00:00',dateadd(hh,10,cast(dateadd(dd,-1,getdate()))),dateadd(hh,11,cast(dateadd(dd,-1,getdate())))
union
select '11:00:00 - 12:00:00',dateadd(hh,11,cast(dateadd(dd,-1,getdate()))),dateadd(hh,12,cast(dateadd(dd,-1,getdate())))
union
select '12:00:00 - 13:00:00',dateadd(hh,12,cast(dateadd(dd,-1,getdate()))),dateadd(hh,13,cast(dateadd(dd,-1,getdate())))
union
select '13:00:00 - 14:00:00',dateadd(hh,13,cast(dateadd(dd,-1,getdate()))),dateadd(hh,14,cast(dateadd(dd,-1,getdate())))
union
select '14:00:00 - 15:00:00',dateadd(hh,14,cast(dateadd(dd,-1,getdate()))),dateadd(hh,15,cast(dateadd(dd,-1,getdate())))
union
select '15:00:00 - 16:00:00',dateadd(hh,15,cast(dateadd(dd,-1,getdate()))),dateadd(hh,16,cast(dateadd(dd,-1,getdate())))
union
select '16:00:00 - 17:00:00',dateadd(hh,16,cast(dateadd(dd,-1,getdate()))),dateadd(hh,17,cast(dateadd(dd,-1,getdate())))
union
select '17:00:00 - 18:00:00',dateadd(hh,17,cast(dateadd(dd,-1,getdate()))),dateadd(hh,18,cast(dateadd(dd,-1,getdate())))
union
select '18:00:00 - 19:00:00',dateadd(hh,18,cast(dateadd(dd,-1,getdate()))),dateadd(hh,19,cast(dateadd(dd,-1,getdate())))
union
select '19:00:00 - 20:00:00',dateadd(hh,19,cast(dateadd(dd,-1,getdate()))),dateadd(hh,20,cast(dateadd(dd,-1,getdate())))
union
select '20:00:00 - 21:00:00',dateadd(hh,20,cast(dateadd(dd,-1,getdate()))),dateadd(hh,21,cast(dateadd(dd,-1,getdate())))
union
select '21:00:00 - 22:00:00',dateadd(hh,21,cast(dateadd(dd,-1,getdate()))),dateadd(hh,22,cast(dateadd(dd,-1,getdate())))
union
select '22:00:00 - 23:00:00',dateadd(hh,22,cast(dateadd(dd,-1,getdate()))),dateadd(hh,23,cast(dateadd(dd,-1,getdate())))
union
select '24:00:00 - 00:00:00',dateadd(hh,23,cast(dateadd(dd,-1,getdate()))),dateadd(hh,23,dateadd(mi,59,cast(dateadd(dd,-1,getdate())))))a
on starttime between jointime and leaveTime
or endtime between jointime and leaveTime
or jointime>=starttime and jointime<endtime
) as T
group by logdate,timespan
Date Hour customer_count
2018-01-01 8-9 1
2018-01-01 9-10 1
2018-01-01 10-11 1
2018-01-01 11-12 1
2018-01-01 12-13 1
2018-01-01 13-14 1
2018-01-01 14-15 1
2018-01-01 15-16 1
2018-01-01 16-17 1
2018-01-01 17-18 1
2018-01-01 18-19 1
2018-01-01 19-20 1
2018-01-01 20-21 2
2018-01-01 21-22 3
2018-01-01 22-23 2
2018-01-01 23-00 1
Here is an approach - maybe this already solves your problem. I designed it in order to work with any day-difference between join and leave. However, I can't tell anything about the performance on larger sets since I tested with your example only and the evaluation of all relevant hours might take a bit longer if it comes to bigger data sets.
Anyway, I used a recursive CTE here in order to evaluate all the hours between join and leave, and later on I group by date and hour:
DECLARE #Cust TABLE(
customer_id INT,
joinTime DATETIME,
leaveTime DATETIME
)
INSERT INTO #Cust VALUES
(536, '2018-01-01 08:05:00', '2018-01-01 18:31:00'),
(344, '2018-01-01 19:37:00', '2018-01-01 20:16:00'),
(344, '2018-01-01 19:49:00', '2018-01-01 20:00:00'),
(899, '2018-01-01 20:49:00', '2018-01-01 21:14:00'),
(2336, '2018-01-01 21:02:00', '2018-01-01 21:03:00'),
(335, '2018-01-01 21:03:00', '2018-01-01 23:43:00'),
(2336, '2018-01-01 21:03:00', '2018-01-02 00:06:00'),
(899, '2018-01-01 21:18:00', '2018-01-01 22:24:00'),
(345, '2018-01-01 21:21:00', '2018-01-01 21:39:00'),
(345, '2018-01-01 21:53:00', '2018-01-02 00:13:00');
;WITH cte AS(
SELECT c.customer_id,
c.joinTime,
c.leaveTime,
c.joinTime x
FROM #Cust c
UNION ALL
SELECT c.customer_id,
c.joinTime,
c.leaveTime,
DATEADD(HOUR, 1, x) x
FROM cte c
WHERE DATEADD(HOUR, 1, x) <= CASE WHEN DATEPART(MINUTE, x) < DATEPART(MINUTE, c.leaveTime) THEN c.leaveTime ELSE DATEADD(HOUR, 1, c.leaveTime) END
)
SELECT CONVERT(DATE, x) AS cDate, DATEPART(HOUR, x) AS cHour, COUNT(*) AS cCount
FROM cte
GROUP BY CONVERT(DATE, x), DATEPART(HOUR, x)
ORDER BY 1,2
OPTION (MAXRECURSION 0)
Try this:
;WITH hourlist(starthour) AS (
SELECT 0 -- Seed Row
UNION ALL
SELECT starthour + 1 -- Recursion
FROM hourlist
where starthour+1<=23
)
SELECT
day
,convert(nvarchar,starthour)+'-'+convert(nvarchar,case when starthour+1=24 then 0 else starthour+1 end) hourtitle
,count(distinct customer_id) 'customer count'
FROM
hourlist h -- list of all hours
cross join
(
select distinct dateadd(day,datediff(day,0, joinTime),0) from #login_out_logs
union
select distinct dateadd(day,datediff(day,0,leaveTime),0) from #login_out_logs
)q10(day) -- list of all days of jointime and leavetime
inner join #login_out_logs l on -- a log row counts for a given day/hour if it starts before the hour's end and ends at or after the hour's start
l.joinTime <dateadd(hour,starthour+1,q10.day)
and
l.leaveTime>=dateadd(hour,starthour ,q10.day)
group by day,starthour
order by day,starthour
Note: this will only work for jointimes and leavetimes that differ 0 or 1 days, not 2 or more.
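If stays spanning two or more days also have to be supported, one option (my addition, untested) is to swap the q10 derived table for a recursive day list that covers every calendar day between the earliest joinTime and the latest leaveTime, for example:
;WITH bounds AS (
    SELECT dateadd(day, datediff(day, 0, min(joinTime)), 0)  AS firstday,  -- earliest join date at midnight
           dateadd(day, datediff(day, 0, max(leaveTime)), 0) AS lastday    -- latest leave date at midnight
    FROM #login_out_logs
),
daylist(day, lastday) AS (
    SELECT firstday, lastday FROM bounds          -- seed row: first calendar day
    UNION ALL
    SELECT dateadd(day, 1, day), lastday          -- recursion: add one day at a time
    FROM daylist
    WHERE day < lastday
)
SELECT day FROM daylist
OPTION (MAXRECURSION 0);
Using daylist in place of q10 in the query above would then count stays on the intermediate days as well.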

splitting overlapping dates in SQL

I'm on SQL Server 2008 R2.
I'm trying to create a report and chart for a manufacturing resource's activity over a given period (typically 30-90 days).
Jobs are created for the length of the run (e.g. 4 days). If the weekend is not worked and such a job starts on a Friday, the resource's activity needs to show 1 day running, 2 days down, 3 days running, without the production scheduler having to make it two jobs. I have the jobs' schedules in one table and the downtimes in another (so think of DT like some sort of calendar table). Unusually, the end time is supplied with the downtime already factored in.
So I need the query to create 3 datetime ranges for this job: Fri running; Sat, Sun down; Mon, Tues, Wed running. Note: a single job can have multiple downtime events.
I've been going round in circles on this for a while. I'm sure there's an elegant way to do it; I just can't find it. I've found several similar posts, but can't apply any of them to my case (or at least can't get them to work).
Below are some sample data and the expected results. I hope the explanation and example data are clear.
-- Create tables to work with / Source and Destination
CREATE TABLE #Jobs
(
ResourceID int
,JobNo VARCHAR(10)
,startdate SMALLDATETIME
,enddate SMALLDATETIME
)
CREATE TABLE #Downtime
(
ResourceID INT
,Reason VARCHAR(10)
,startdate SMALLDATETIME
,enddate SMALLDATETIME
)
CREATE TABLE #Results
(
ResourceID INT
,Activity VARCHAR(10)
,startdate SMALLDATETIME
,enddate SMALLDATETIME
,ActivityType varchar(1)
)
-- Job Schedule
INSERT INTO [#Jobs]
(
[ResourceID],
[JobNo],
startdate
,enddate
)
SELECT 1, 'J1', '2014-04-01 08:00' ,'2014-04-01 17:00'
UNION ALL
SELECT 1, 'J2', '2014-04-01 17:00' , '2014-04-01 23:00'
UNION ALL
SELECT 2, 'J3', '2014-04-01 08:00' ,'2014-04-01 23:00'
UNION ALL
SELECT 3, 'J4', '2014-04-01 08:00' ,'2014-04-01 09:00'
SELECT * FROM #jobs
-- Downtime Schedule
INSERT INTO [#Downtime]
(
[ResourceID],
Reason,
startdate
,enddate
)
SELECT 1, 'DOWN', '2014-04-01 10:00' ,'2014-04-01 11:00'
UNION ALL
SELECT 1, 'DOWN', '2014-04-01 21:00' , '2014-04-01 22:00'
UNION ALL
SELECT 2, 'DOWN', '2014-04-01 10:00' ,'2014-04-01 11:00'
UNION ALL
SELECT 2, 'DOWN', '2014-04-01 21:00' , '2014-04-01 22:00'
UNION ALL
SELECT 3, 'DOWN', '2014-04-01 10:00' ,'2014-04-01 11:00'
UNION ALL
SELECT 3, 'DOWN', '2014-04-01 21:00' , '2014-04-01 22:00'
SELECT * FROM #Downtime
-- Expected Results
INSERT INTO [#Results]
(
Activity,
[ResourceID],
startdate
,enddate
,[ActivityType]
)
SELECT 'J1', 1, '2014-04-01 08:00' ,'2014-04-01 10:00', 'P'
UNION ALL
SELECT 'DOWN', 1, '2014-04-01 10:00' , '2014-04-01 11:00', 'D'
UNION ALL
SELECT 'J1', 1, '2014-04-01 11:00' ,'2014-04-01 17:00', 'P'
UNION ALL
SELECT 'J2', 1, '2014-04-01 17:00' , '2014-04-01 21:00', 'P'
UNION ALL
SELECT 'DOWN', 1, '2014-04-01 21:00' , '2014-04-01 22:00', 'D'
UNION ALL
SELECT 'J2', 1, '2014-04-01 22:00' ,'2014-04-01 23:00', 'P'
UNION ALL
SELECT 'J3', 2, '2014-04-01 08:00' ,'2014-04-01 10:00', 'P'
UNION ALL
SELECT 'DOWN', 2, '2014-04-01 10:00' , '2014-04-01 11:00', 'D'
UNION ALL
SELECT 'J3', 2, '2014-04-01 11:00' ,'2014-04-01 21:00', 'P'
UNION ALL
SELECT 'DOWN', 2, '2014-04-01 21:00' , '2014-04-01 22:00', 'D'
UNION ALL
SELECT 'J3', 2, '2014-04-01 22:00' ,'2014-04-01 23:00', 'P'
UNION ALL
SELECT 'J4', 3, '2014-04-01 08:00' ,'2014-04-01 09:00', 'P'
UNION ALL
SELECT 'DOWN', 3, '2014-04-01 10:00' , '2014-04-01 11:00', 'D'
UNION ALL
SELECT 'DOWN', 3, '2014-04-01 21:00' , '2014-04-01 22:00', 'D'
SELECT * FROM #Results
ORDER BY [ResourceID], Startdate
DELETE FROM #Results
|--------------------------J1------------------------------------| running
|----D1-----| |-------D2-------| down
|--J1--|----D1-----|-------J1------|-------D2-------|-----J1-----| result
|-----------------------------J1-----------| running
|----D1-------| down
|-----------------J1-----------------------| |----D1-------| result
Can someone point me in the right direction?
This is the closest I've got. It works great when there is an overlap, but it fails on J4, where the job ends before the downtime:
WITH cte
AS ( SELECT
ROW_NUMBER() OVER ( ORDER BY ResourceID, dt ) AS Rno
,x.ResourceID
,x.Activity
,Dt
,xdt.ActivityType
FROM
(
SELECT
ResourceID
,JobNo AS Activity
,startdate
,enddate
,'P' AS ActivityType
FROM #Jobs
UNION ALL
SELECT
ResourceID
,Reason AS Activity
,startdate
,enddate
,'D' AS ActivityType
FROM #Downtime
) AS x
CROSS APPLY
(
VALUES ( x.startdate, x.ActivityType),
( x.enddate, x.ActivityType) ) AS xdt
( Dt, ActivityType )
)
SELECT
x.ResourceID
,CASE WHEN x.Activity > x1.Activity THEN x.Activity
ELSE x1.Activity
END AS Activity
,x.dt AS StartDate
,x1.Dt AS EndDate
,CASE WHEN x.ActivityType > x1.ActivityType THEN x.ActivityType
ELSE x1.ActivityType
END AS activitytype
FROM
cte AS x
LEFT OUTER JOIN cte AS x1 ON x.ResourceID = x1.ResourceID
AND x.Rno = x1.Rno - 1
WHERE
x1.Dt IS NOT NULL
AND x1.Dt <> x.Dt;
Thanks
Mark
You were actually pretty close - rather than doing everything in the initial CTE, you want to join back to the original data later. Essentially, you're performing a variant of the answer supplied here.
The following query should get you what you need:
WITH AllDates AS (SELECT a.*, ROW_NUMBER() OVER(PARTITION BY resourceId ORDER BY rangeDate) AS rn
FROM (SELECT resourceId, startDate
FROM Jobs
UNION ALL
SELECT resourceId, endDate
FROM Jobs
UNION ALL
SELECT resourceId, startDate
FROM Downtime
UNION ALL
SELECT resourceId, endDate
FROM DownTime) a(resourceId, rangeDate)),
Range AS (SELECT startRange.resourceId,
startRange.rangeDate AS startDate, endRange.rangeDate AS endDate
FROM AllDates startRange
JOIN AllDates endRange
ON endRange.resourceId = startRange.resourceId
AND endRange.rn = startRange.rn + 1
AND endRange.rangeDate <> startRange.rangeDate)
SELECT Range.resourceId, Range.startDate, Range.endDate,
COALESCE(Downtime.reason, Jobs.jobNo) as activity
FROM Range
LEFT JOIN Jobs
ON Jobs.resourceId = Range.resourceId
AND Jobs.startDate <= Range.startDate
AND Jobs.endDate >= Range.endDate
LEFT JOIN Downtime
ON Downtime.resourceId = Range.resourceId
AND Downtime.startDate <= Range.startDate
AND Downtime.endDate >= Range.endDate
WHERE Jobs.jobNo IS NOT NULL
OR Downtime.reason IS NOT NULL
(And working fiddle. This should actually be ANSI-standard SQL)
...which yields the expected:
RESOURCEID STARTDATE ENDDATE ACTIVITY
----------------------------------------------------------------------------
1 2014-04-01 08:00:00 2014-04-01 10:00:00 J1
1 2014-04-01 10:00:00 2014-04-01 11:00:00 DOWN
1 2014-04-01 11:00:00 2014-04-01 17:00:00 J1
1 2014-04-01 17:00:00 2014-04-01 21:00:00 J2
1 2014-04-01 21:00:00 2014-04-01 22:00:00 DOWN
1 2014-04-01 22:00:00 2014-04-01 23:00:00 J2
2 2014-04-01 08:00:00 2014-04-01 10:00:00 J3
2 2014-04-01 10:00:00 2014-04-01 11:00:00 DOWN
2 2014-04-01 11:00:00 2014-04-01 21:00:00 J3
2 2014-04-01 21:00:00 2014-04-01 22:00:00 DOWN
2 2014-04-01 22:00:00 2014-04-01 23:00:00 J3
3 2014-04-01 08:00:00 2014-04-01 09:00:00 J4
3 2014-04-01 10:00:00 2014-04-01 11:00:00 DOWN
3 2014-04-01 21:00:00 2014-04-01 22:00:00 DOWN
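If the ActivityType flag from the expected #Results table is also needed, the final SELECT of the query above can derive it from which side of the LEFT JOIN matched. A small sketch of that (my addition, reusing the AllDates and Range CTEs unchanged):
SELECT Range.resourceId, Range.startDate, Range.endDate,
       COALESCE(Downtime.reason, Jobs.jobNo) AS activity,
       CASE WHEN Downtime.reason IS NOT NULL THEN 'D' ELSE 'P' END AS activityType   -- downtime wins, same priority as the COALESCE
FROM Range
LEFT JOIN Jobs
       ON Jobs.resourceId = Range.resourceId
      AND Jobs.startDate <= Range.startDate
      AND Jobs.endDate >= Range.endDate
LEFT JOIN Downtime
       ON Downtime.resourceId = Range.resourceId
      AND Downtime.startDate <= Range.startDate
      AND Downtime.endDate >= Range.endDate
WHERE Jobs.jobNo IS NOT NULL
   OR Downtime.reason IS NOT NULL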