Postgres grouping by range - sql

I have data looking like this
What I am trying to achieve is data for a histogram that counts values into specific ranges: for category A the value range 1-100 and for category B the value range 0-125, where the value for category C = 5. The problem is that the data is spread across multiple rows, so I need to filter on C first and then count the values into ranges to display the histogram.
To get counts, let's say per 10 seconds, looking like this
Code to generate data:
CREATE TEMP TABLE sample (
ts timestamp
,category varchar(2)
, val int);
insert into sample values
(to_timestamp('01.01.2018 08:00:01', 'dd-mm-yyyy hh24:mi:ss'), 'A', 12),
(to_timestamp('01.01.2018 08:00:02', 'dd-mm-yyyy hh24:mi:ss'), 'A', 44),
(to_timestamp('01.01.2018 08:00:03', 'dd-mm-yyyy hh24:mi:ss'), 'C', 1),
(to_timestamp('01.01.2018 08:00:04', 'dd-mm-yyyy hh24:mi:ss'), 'B', 24),
(to_timestamp('01.01.2018 08:00:05', 'dd-mm-yyyy hh24:mi:ss'), 'B', 111),
(to_timestamp('01.01.2018 08:00:06', 'dd-mm-yyyy hh24:mi:ss'), 'C', 5),
(to_timestamp('01.01.2018 08:00:07', 'dd-mm-yyyy hh24:mi:ss'), 'A', 145),
(to_timestamp('01.01.2018 08:00:01', 'dd-mm-yyyy hh24:mi:ss'), 'B', 16),
(to_timestamp('01.01.2018 08:00:01', 'dd-mm-yyyy hh24:mi:ss'), 'C', 47),
(to_timestamp('01.01.2018 08:00:02', 'dd-mm-yyyy hh24:mi:ss'), 'C', 5),
(to_timestamp('01.01.2018 08:00:02', 'dd-mm-yyyy hh24:mi:ss'), 'B', 34),
(to_timestamp('01.01.2018 08:00:03', 'dd-mm-yyyy hh24:mi:ss'), 'B', 111),
(to_timestamp('01.01.2018 08:00:03', 'dd-mm-yyyy hh24:mi:ss'), 'C', 5),
(to_timestamp('01.01.2018 08:00:01', 'dd-mm-yyyy hh24:mi:ss'), 'A', 19),
(to_timestamp('01.01.2018 08:00:01', 'dd-mm-yyyy hh24:mi:ss'), 'B', 46),
(to_timestamp('01.01.2018 08:00:01', 'dd-mm-yyyy hh24:mi:ss'), 'C', 57);
I thought if I pivot data like so
select
ts,
category,
case when category = 'A' then val end as "A",
case when category = 'B' then val end as "B",
case when category = 'C' then val end as "C"
from sample
order by ts
then I run into a problem with the NULLs in the pivoted columns.

Here it is:
with periods(pts) as
(
select *
from generate_series
(
timestamp '2018-01-01 08:00:00',
timestamp '2018-01-01 08:01:00',
interval '10 seconds'
) ts
)
select pts period_start,
pts + interval '10 seconds' period_end,
lat.cat_a,
lat.cat_b,
lat.cat_c
from periods
cross join lateral
(
select count(1) filter (where category = 'A' and val between 0 and 100) as cat_a,
count(1) filter (where category = 'B' and val between 0 and 125) as cat_b,
count(1) filter (where category = 'C' and val = 5) as cat_c
from sample
where ts >= pts and ts < pts + interval '10 seconds'
) lat;
| period_start | period_end | cat_a | cat_b | cat_c |
|--|--|--|--|--|
| 2018-01-01 08:00:00 | 2018-01-01 08:00:10 | 2 | 2 | 1 |
| 2018-01-01 08:00:10 | 2018-01-01 08:00:20 | 0 | 0 | 0 |
| 2018-01-01 08:00:20 | 2018-01-01 08:00:30 | 0 | 0 | 0 |
| 2018-01-01 08:00:30 | 2018-01-01 08:00:40 | 0 | 0 | 0 |
| 2018-01-01 08:00:40 | 2018-01-01 08:00:50 | 0 | 0 | 0 |
| 2018-01-01 08:00:50 | 2018-01-01 08:01:00 | 0 | 0 | 0 |
| 2018-01-01 08:01:00 | 2018-01-01 08:01:10 | 0 | 0 | 0 |
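The bucketing logic of the query above can also be traced outside the database. The following is only a sketch in Python, not part of the original answer: the function name `count_per_period` is made up for illustration, and it uses just the first seven rows of the question's sample data, which is an assumption about what the output table above was computed from.

```python
from datetime import datetime, timedelta

# First seven rows of the question's sample data: (ts, category, val).
ROWS = [
    (datetime(2018, 1, 1, 8, 0, 1), "A", 12),
    (datetime(2018, 1, 1, 8, 0, 2), "A", 44),
    (datetime(2018, 1, 1, 8, 0, 3), "C", 1),
    (datetime(2018, 1, 1, 8, 0, 4), "B", 24),
    (datetime(2018, 1, 1, 8, 0, 5), "B", 111),
    (datetime(2018, 1, 1, 8, 0, 6), "C", 5),
    (datetime(2018, 1, 1, 8, 0, 7), "A", 145),
]


def count_per_period(rows, start, stop, step):
    """Hypothetical helper: one result row per period, like the lateral join."""
    result = []
    period_start = start
    while period_start <= stop:  # generate_series includes the stop value
        period_end = period_start + step
        # Half-open window: ts >= period_start and ts < period_end.
        in_period = [(c, v) for ts, c, v in rows if period_start <= ts < period_end]
        cat_a = sum(1 for c, v in in_period if c == "A" and 0 <= v <= 100)
        cat_b = sum(1 for c, v in in_period if c == "B" and 0 <= v <= 125)
        cat_c = sum(1 for c, v in in_period if c == "C" and v == 5)
        result.append((period_start, period_end, cat_a, cat_b, cat_c))
        period_start = period_end
    return result


periods = count_per_period(
    ROWS,
    datetime(2018, 1, 1, 8, 0, 0),
    datetime(2018, 1, 1, 8, 1, 0),
    timedelta(seconds=10),
)
for row in periods:
    print(*row)
```

Periods without any matching rows still appear with zero counts, which is the reason the SQL version generates the periods first and joins the data to them.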
One-row version is simple:
select min(ts) period_start,
max(ts) period_end,
count(1) filter (where category = 'A' and val between 0 and 100) as cat_a,
count(1) filter (where category = 'B' and val between 0 and 125) as cat_b,
count(1) filter (where category = 'C' and val = 5) as cat_c
from sample;
Added after the clarification comments
select * from (<the first version of the query here>) t where cat_c > 0;

Related

finding the time periods an issuer has not called the service from a given time period

I have a table that contains the time periods when an issuer calls the service. This table can have overlapping and non-overlapping time periods:
with mht_issuer_revoked_call (issuerid, startdate, enddate) as (values
(4, to_date('25-11-2022', 'dd-mm-yyyy'), to_date('25-11-2022 12:00:00', 'dd-mm-yyyy hh24:mi:ss')),
(4, to_date('25-11-2022 12:00:00', 'dd-mm-yyyy hh24:mi:ss'), to_date('26-11-2022', 'dd-mm-yyyy')),
(40, to_date('25-11-2022', 'dd-mm-yyyy'), to_date('25-11-2022 06:00:00', 'dd-mm-yyyy hh24:mi:ss')),
(40, to_date('25-11-2022 06:00:00', 'dd-mm-yyyy hh24:mi:ss'), to_date('25-11-2022 12:00:00', 'dd-mm-yyyy hh24:mi:ss')),
(40, to_date('25-11-2022 11:30:00', 'dd-mm-yyyy hh24:mi:ss'), to_date('25-11-2022 18:00:00', 'dd-mm-yyyy hh24:mi:ss')),
(40, to_date('25-11-2022 18:30:00', 'dd-mm-yyyy hh24:mi:ss'), to_date('25-11-2022 19:30:00', 'dd-mm-yyyy hh24:mi:ss')),
(50, to_date('25-11-2022', 'dd-mm-yyyy'), to_date('25-11-2022 12:00:00', 'dd-mm-yyyy hh24:mi:ss')),
(50, to_date('25-11-2022 11:00:00', 'dd-mm-yyyy hh24:mi:ss'), to_date('26-11-2022 01:30:00', 'dd-mm-yyyy hh24:mi:ss')),
(40, to_date('25-11-2022 19:31:00', 'dd-mm-yyyy hh24:mi:ss'), to_date('26-11-2022', 'dd-mm-yyyy')),
(50, to_date('25-11-2022 23:10:00', 'dd-mm-yyyy hh24:mi:ss'), to_date('25-11-2022 23:30:00', 'dd-mm-yyyy hh24:mi:ss')),
(50, to_date('25-11-2022 23:30:00', 'dd-mm-yyyy hh24:mi:ss'), to_date('25-11-2022 23:45:00', 'dd-mm-yyyy hh24:mi:ss')),
(50, to_date('25-11-2022 23:50:00', 'dd-mm-yyyy hh24:mi:ss'), to_date('25-11-2022 23:55:00', 'dd-mm-yyyy hh24:mi:ss'))
)
I managed to merge the time periods so that the new periods don't overlap each other. My output is as follows:
with issuer_calls_merged (issuerid, start_date_time, end_date_time) as (values
(4 ,'11/25/2022' , '11/26/2022'),
(40 ,'11/25/2022', '11/25/2022 6:00:00 PM'),
(40 ,'11/25/2022 6:30:00 PM', '11/25/2022 7:30:00 PM'),
(40 ,'11/25/2022 7:31:00 PM', '11/26/2022' ),
(50 ,'11/25/2022', '11/26/2022 1:30:00 AM')
)
I am trying to write a procedure that takes FromDate and EndDate as input parameters and, for each issuer, calculates how many minutes within that window are not covered by any call.
For example, I will pass these parameters:
FromDate := '11/20/2022'
EndDate := '11/28/2022'
Then, according to the time periods in the issuer_calls table, for issuerid 40 I expect this output:
| issuerid | start_date_time (uncovered) | end_date_time (uncovered) | uncovered_time_minutes |
|--|--|--|--|
| 40 | 11/20/2022 | 11/25/2022 | 7200 |
| 40 | 11/25/2022 6:00:00 PM | 11/25/2022 6:30:00 PM | 30 |
| 40 | 11/25/2022 7:30:00 PM | 11/25/2022 7:31:00 PM | 1 |
| 40 | 11/26/2022 | 11/28/2022 | 2880 |
I tried to do the job with the procedure below:
create or replace procedure GAP(out_res out sys_refcursor,
in_FromDate mht_issuer_revoked_call.startdate%type,
in_EndDate mht_issuer_revoked_call.enddate%type
) AS
BEGIN
-- I tried to compare the given time period (FromDate-EndDate) with the previously merged time periods, calculate the gaps, and then union them with the previous gap
open out_res for
select ut.issuerid,
ut.startdate,
ut.enddate,
ut.initialgap as gap
from
(
with minStartDate as
(
select r.issuerid,
min(r.startdate) as min_StartDate
from mht_issuer_revoked_call r
group by r.issuerid
)
select m.issuerid,
in_FromDate as StartDate,
case
when m.min_StartDate >= in_EndDate then in_EndDate
else m.min_StartDate
end as EndDate,
case
when m.min_StartDate >= in_EndDate then (in_EndDate - in_FromDate + 1)*24*60
else (min_StartDate - in_FromDate + 1)*24*60
end as initialgap
from minStartDate m
union all
-- the part below merges the time periods and calculates the gaps between them
SELECT issuerid,
end_date_time,
next_row_start,
(next_row_start - end_date_time)*24*60 as gap
from
(
SELECT issuerid,
start_date_time,
end_date_time,
case
when lead(start_date_time) over(partition by issuerid order by start_date_time) is null then end_date_time
else lead(start_date_time) over(partition by issuerid order by start_date_time)
end as next_row_start
FROM (
SELECT issuerid,
LAG( dt ) OVER ( PARTITION BY issuerid ORDER BY dt ) AS start_date_time,
dt AS end_date_time,
start_end
FROM (
SELECT issuerid,
dt,
CASE SUM( value ) OVER ( PARTITION BY issuerid ORDER BY dt ASC, value DESC, ROWNUM ) * value
WHEN 1 THEN 'start'
WHEN 0 THEN 'end'
END AS start_end
FROM mht_issuer_revoked_call
UNPIVOT ( dt FOR value IN ( startdate AS 1, enddate AS -1 ) )
)
WHERE start_end IS NOT NULL
)
WHERE start_end = 'end'
)
where (next_row_start - end_date_time) > 0
group by issuerid,next_row_start,end_date_time
) ut
order by ut.issuerid, ut.StartDate;
END gap;
But in the end I couldn't achieve the result explained above.
You can get your result using just SQL and process it later. In this answer your FromDate And EndDate (P_FROM, P_UNTILL) are set to those from your question. You can define them as parameters or bind variables so you could change them. Comments are in the code.
WITH
tbl AS -- sample data
(
Select 4 "ID", To_Date('25.11.2022 00:00:00', 'dd.mm.yyyy hh24:mi:ss') "START_DATE", To_Date('25.11.2022 12:00:00', 'dd.mm.yyyy hh24:mi:ss') "END_DATE" From Dual Union All
Select 4 "ID", To_Date('25.11.2022 12:00:00', 'dd.mm.yyyy hh24:mi:ss') "START_DATE", To_Date('26.11.2022 00:00:00', 'dd.mm.yyyy hh24:mi:ss') "END_DATE" From Dual Union All
Select 40 "ID", To_Date('25.11.2022 00:00:00', 'dd.mm.yyyy hh24:mi:ss') "START_DATE", To_Date('25.11.2022 06:00:00', 'dd.mm.yyyy hh24:mi:ss') "END_DATE" From Dual Union All
Select 40 "ID", To_Date('25.11.2022 06:00:00', 'dd.mm.yyyy hh24:mi:ss') "START_DATE", To_Date('25.11.2022 12:00:00', 'dd.mm.yyyy hh24:mi:ss') "END_DATE" From Dual Union All
Select 40 "ID", To_Date('25.11.2022 11:30:00', 'dd.mm.yyyy hh24:mi:ss') "START_DATE", To_Date('25.11.2022 18:00:00', 'dd.mm.yyyy hh24:mi:ss') "END_DATE" From Dual Union All
Select 40 "ID", To_Date('25.11.2022 18:30:00', 'dd.mm.yyyy hh24:mi:ss') "START_DATE", To_Date('25.11.2022 19:30:00', 'dd.mm.yyyy hh24:mi:ss') "END_DATE" From Dual Union All
Select 40 "ID", To_Date('25.11.2022 19:31:00', 'dd.mm.yyyy hh24:mi:ss') "START_DATE", To_Date('26.11.2022 00:00:00', 'dd.mm.yyyy hh24:mi:ss') "END_DATE" From Dual Union All
Select 50 "ID", To_Date('25.11.2022 00:00:00', 'dd.mm.yyyy hh24:mi:ss') "START_DATE", To_Date('25.11.2022 12:00:00', 'dd.mm.yyyy hh24:mi:ss') "END_DATE" From Dual Union All
Select 50 "ID", To_Date('25.11.2022 11:00:00', 'dd.mm.yyyy hh24:mi:ss') "START_DATE", To_Date('26.11.2022 01:30:00', 'dd.mm.yyyy hh24:mi:ss') "END_DATE" From Dual Union All
Select 50 "ID", To_Date('25.11.2022 23:10:00', 'dd.mm.yyyy hh24:mi:ss') "START_DATE", To_Date('25.11.2022 23:30:00', 'dd.mm.yyyy hh24:mi:ss') "END_DATE" From Dual Union All
Select 50 "ID", To_Date('25.11.2022 23:30:00', 'dd.mm.yyyy hh24:mi:ss') "START_DATE", To_Date('25.11.2022 23:45:00', 'dd.mm.yyyy hh24:mi:ss') "END_DATE" From Dual Union All
Select 50 "ID", To_Date('25.11.2022 23:50:00', 'dd.mm.yyyy hh24:mi:ss') "START_DATE", To_Date('25.11.2022 23:55:00', 'dd.mm.yyyy hh24:mi:ss') "END_DATE" From Dual
),
day_tbl AS -- create CTE to prepare your data
( Select ID,
ROW_NUMBER() OVER(Partition By ID Order By START_DATE) "RN", -- ordering ID events
START_DATE "START_DATE", To_Char(START_DATE, 'hh24:mi:ss') "START_TIME", -- just showing the time part of START_DATE
END_DATE "END_DATE", To_Char(END_DATE, 'hh24:mi:ss') "END_TIME", -- just showing the time part of END_DATE
--
To_Date('20.11.2022', 'dd.mm.yyyy') "P_FROM", -- column with P_FROM - you could define it as bind variable
To_Date('28.11.2022', 'dd.mm.yyyy') "P_UNTILL", -- column with P_UNTILL - you could define it as bind variable
( END_DATE - START_DATE ) * 24 * 60 "MINS" -- first calculation used for first and last row
From
(Select *
From ( -- for each ID create starting and ending row and union them with your data
Select ID "ID", START_DATE "START_DATE", END_DATE "END_DATE" From tbl Union ALL
Select ID, To_Date('20.11.2022', 'dd.mm.yyyy'), Min(START_DATE) From tbl GROUP BY ID Union All -- row with P_FROM as START_DATE - you could define it as bind variable
Select ID, Max(END_DATE), To_Date('28.11.2022', 'dd.mm.yyyy') From tbl GROUP BY ID -- row with P_UNTILL as END_DATE - you could define it as bind variable
)
Order By ID, START_DATE
)
)
SELECT
* -- you can select just the columns you need (not all of them like here)
FROM
( Select
ID, RN, START_DATE, START_TIME, END_DATE, END_TIME, P_FROM, P_UNTILL,
CASE WHEN RN = 1 Or RN = Max(RN) OVER(Partition By ID) THEN MINS -- first and last row already calculated
-- else --> second calculation for rows that are not first nor last
ELSE Round(( START_DATE - FIRST_VALUE(END_DATE) OVER(Partition By ID, TRUNC(START_DATE) Order By START_DATE Rows Between 1 Preceding And Current Row) ) * 24 * 60, 0)
END "MINS"
From
day_tbl
)
WHERE
MINS > 0 -- if you want just ID=40 here you can filter it
--
/* R e s u l t :
ID RN START_DATE START_TIME END_DATE END_TIME P_FROM P_UNTILL MINS
---------- ---------- ---------- ---------- --------- -------- --------- --------- ----------
4 1 20-NOV-22 00:00:00 25-NOV-22 00:00:00 20-NOV-22 28-NOV-22 7200
4 4 26-NOV-22 00:00:00 28-NOV-22 00:00:00 20-NOV-22 28-NOV-22 2880
40 1 20-NOV-22 00:00:00 25-NOV-22 00:00:00 20-NOV-22 28-NOV-22 7200
40 5 25-NOV-22 18:30:00 25-NOV-22 19:30:00 20-NOV-22 28-NOV-22 30
40 6 25-NOV-22 19:31:00 26-NOV-22 00:00:00 20-NOV-22 28-NOV-22 1
40 7 26-NOV-22 00:00:00 28-NOV-22 00:00:00 20-NOV-22 28-NOV-22 2880
50 1 20-NOV-22 00:00:00 25-NOV-22 00:00:00 20-NOV-22 28-NOV-22 7200
50 6 25-NOV-22 23:50:00 25-NOV-22 23:55:00 20-NOV-22 28-NOV-22 5
50 7 26-NOV-22 01:30:00 28-NOV-22 00:00:00 20-NOV-22 28-NOV-22 2790
*/
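As a cross-check of the expected numbers in the question, the gap calculation can be written as a single pass over the calls sorted by start time. This is a sketch in Python under stated assumptions, not the procedure from the question or the query from this answer: the function name `uncovered_gaps` is hypothetical, and it is fed issuer 40's raw rows with the window 20.11.2022 to 28.11.2022.

```python
from datetime import datetime


def uncovered_gaps(calls, window_start, window_end):
    """Hypothetical helper: return (gap_start, gap_end, minutes) inside the window."""
    gaps = []
    cursor = window_start  # everything before the cursor is already handled
    for start, end in sorted(calls):
        if end <= cursor:
            continue  # call is fully covered by earlier calls
        if start >= window_end:
            break  # remaining calls start after the window
        if start > cursor:
            gaps.append((cursor, start))  # nothing covers cursor..start
        cursor = max(cursor, end)
    if cursor < window_end:
        gaps.append((cursor, window_end))  # trailing gap up to the window end
    return [(s, e, (e - s).total_seconds() / 60) for s, e in gaps]


# Raw (unmerged) calls of issuer 40 from the question.
ISSUER_40 = [
    (datetime(2022, 11, 25, 0, 0), datetime(2022, 11, 25, 6, 0)),
    (datetime(2022, 11, 25, 6, 0), datetime(2022, 11, 25, 12, 0)),
    (datetime(2022, 11, 25, 11, 30), datetime(2022, 11, 25, 18, 0)),
    (datetime(2022, 11, 25, 18, 30), datetime(2022, 11, 25, 19, 30)),
    (datetime(2022, 11, 25, 19, 31), datetime(2022, 11, 26, 0, 0)),
]

gaps_40 = uncovered_gaps(ISSUER_40, datetime(2022, 11, 20), datetime(2022, 11, 28))
for gap_start, gap_end, minutes in gaps_40:
    print(gap_start, gap_end, minutes)
```

Because the cursor only ever moves forward to the latest end seen so far, overlapping calls do not need to be merged beforehand.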
I think I finally managed to finish the job.
mht_issuer_revoked_call: whenever an issuer calls the service, the issuerid, startdate and enddate are stored in this table.
MHT_ISSUER_MERGED_CALLS: the issuers' requests are merged and stored in this table.
MHT_ISSUER_UNIONED_CALLS: the merged calls and the input parameters are unioned and stored in this table.
create or replace procedure GAP(out_res out sys_refcursor,
in_FromDate mht_issuer_revoked_call.startdate%type,
in_EndDate mht_issuer_revoked_call.enddate%type
) as
BEGIN
delete from MHT_ISSUER_MERGED_CALLS;
commit;
insert into MHT_ISSUER_MERGED_CALLS (
select issuerid,
start_date_time as start_date,
end_date_time as end_date
FROM (
SELECT issuerid,
LAG(dt) OVER(PARTITION BY issuerid ORDER BY dt) AS start_date_time,
dt AS end_date_time,
start_end
FROM (
SELECT issuerid,
dt,
CASE SUM(value) OVER(PARTITION BY issuerid ORDER BY dt ASC, value DESC, ROWNUM) * value
WHEN 1 THEN 'start'
WHEN 0 THEN 'end'
END AS start_end
FROM mht_issuer_revoked_call UNPIVOT(dt FOR value IN(startdate AS 1, enddate AS - 1))
)
WHERE start_end IS NOT NULL
)
WHERE start_end = 'end');
commit;
delete from MHT_ISSUER_UNIONED_CALLS;
commit;
insert into MHT_ISSUER_UNIONED_CALLS
(
Select ISSUERID,
ROW_NUMBER() OVER(Partition By ISSUERID Order By START_DATE) "RN",
START_DATE "START_DATE",
To_Char(START_DATE, 'hh24:mi:ss') "START_TIME",
END_DATE "END_DATE",
To_Char(END_DATE, 'hh24:mi:ss') "END_TIME",
in_FromDate "P_FROM",
in_EndDate "P_UNTILL",
(END_DATE - START_DATE) * 24 * 60 "MINS"
From (
Select *
From (
Select issuerid "ISSUERID",
STARTDATE "START_DATE",
ENDDATE "END_DATE"
From MHT_ISSUER_MERGED_CALLS
Union ALL
Select issuerid,
case
when in_FromDate >= Min(STARTDATE) then Min(STARTDATE)
else in_FromDate
end,
case
when in_EndDate <= Min(STARTDATE) then in_EndDate
else Min(STARTDATE)
end
From MHT_ISSUER_MERGED_CALLS
GROUP BY issuerid
Union All
Select issuerid,
case
when Max(ENDDATE) <= in_FromDate then in_FromDate
else Max(ENDDATE)
end,
case
when in_EndDate <= Max(ENDDATE) then Max(ENDDATE)
else in_EndDate
end
From MHT_ISSUER_MERGED_CALLS
GROUP BY issuerid)
Order By ISSUERID, START_DATE ASC, END_DATE ASC
)
);
COMMIT;
open out_res for
select ISSUERID,
START_DATE AS START_DATE,
SELECTED_END_DATE AS END_DATE,
MINS AS GAP_MINUTES
from
(
Select ISSUERID,
RN,
START_DATE,
trunc(start_date),
START_TIME,
END_DATE,
END_TIME,
P_FROM,
P_UNTILL,
FIRST_VALUE(END_DATE) OVER(Partition By ISSUERID, TRUNC(START_DATE) Order By START_DATE Rows Between 1 Preceding And Current Row) as Selected_End_Date,
CASE
WHEN RN = 1 Or RN = Max(RN) OVER(Partition By ISSUERID) THEN MINS
ELSE
Round((START_DATE - FIRST_VALUE(END_DATE)OVER(Partition By ISSUERID, TRUNC(START_DATE) Order By START_DATE
Rows Between 1 Preceding And Current Row)) * 24 * 60,0)
END "MINS"
From MHT_ISSUER_UNIONED_CALLS
)
where mins > 0
and START_DATE >= in_FromDate and END_DATE <= in_EndDate;
END gap;

Oracle SQL - How to return common date periods and "divide" when there are gaps between periods

I'm trying to return common date periods (per id) from below data, but I cannot find a way to handle case when date periods have a gap between common periods. Can anyone help?
|id|code_id|code|date_from|date_to|
|--|--|--|--|--|
|10|100| 1000 |02/02/2022 |03/02/2022 23:57:00|
|10|100| 1000 |07/02/2022 01:00:00 |08/02/2022 |
|10|100| 2000 |02/02/2022 |02/02/2022 23:00:00|
|10|100| 2000 |07/02/2022 03:00:00 |08/02/2022 |
|10|200| 2000 |02/02/2022 02:14:00 |04/02/2022 21:37:00|
|20|100| 1000 |01/02/2022 05:00:00 |03/02/2022 |
|30|100| 2000 |02/02/2022 |02/02/2022 23:00:00|
|30|200| 2000 |02/02/2022 02:14:00 |04/02/2022 |
|40|100| 2000 |07/02/2022 03:00:00 |08/02/2022 23:10:00|
|50|200| 2000 |04/02/2022 |04/02/2022 21:37:00|
|50|200| 3000 |04/02/2022 02:12:00 |05/02/2022 23:31:00|
Below simple query works fine, but only for ids which have one common period (with no gaps).
I would expect for id = 10 to return two rows (as there is a gap between dates) for periods which are:
I) 02/02/2022 00:00:00 <-> 04/02/2022 21:37:00
II) 07/02/2022 01:00:00 <-> 08/02/2022 00:00:00
SELECT id
,MIN(date_from) date_from
,MAX(date_to) date_to
FROM my_gtt
GROUP BY id
ORDER BY id
Current results (but id = 10 is incorrect):
|id|date_from|date_to|
|--|--|--|
|10| 02/02/2022 |08/02/2022 |
|20| 01/02/2022 05:00:00 |03/02/2022 |
|30| 02/02/2022 |04/02/2022 |
|40| 07/02/2022 03:00:00 |08/02/2022 23:10:00|
|50| 04/02/2022 |05/02/2022 23:31:00|
Data and table creation:
CREATE GLOBAL TEMPORARY TABLE my_gtt
(
id NUMBER(10),
code_id NUMBER(10),
code NUMBER(10),
date_from DATE,
date_to DATE
)
ON COMMIT PRESERVE ROWS;
INSERT INTO my_gtt VALUES (10, 100, 1000, TO_DATE('02-02-2022', 'dd-mm-yyyy'), TO_DATE('03-02-2022 23:57:00', 'dd-mm-yyyy hh24:mi:ss'));
INSERT INTO my_gtt VALUES (10, 100, 1000, TO_DATE('07-02-2022 01:00:00', 'dd-mm-yyyy hh24:mi:ss'), TO_DATE('08-02-2022', 'dd-mm-yyyy'));
INSERT INTO my_gtt VALUES (10, 100, 2000, TO_DATE('02-02-2022', 'dd-mm-yyyy'), TO_DATE('02-02-2022 23:00:00', 'dd-mm-yyyy hh24:mi:ss'));
INSERT INTO my_gtt VALUES (10, 100, 2000, TO_DATE('07-02-2022 03:00:00', 'dd-mm-yyyy hh24:mi:ss'), TO_DATE('08-02-2022', 'dd-mm-yyyy'));
INSERT INTO my_gtt VALUES (10, 200, 2000, TO_DATE('02-02-2022 02:14:00', 'dd-mm-yyyy hh24:mi:ss'), TO_DATE('04-02-2022 21:37:00', 'dd-mm-yyyy hh24:mi:ss'));
INSERT INTO my_gtt VALUES (20, 100, 1000, TO_DATE('01-02-2022 05:00:00', 'dd-mm-yyyy hh24:mi:ss'), TO_DATE('03-02-2022', 'dd-mm-yyyy'));
INSERT INTO my_gtt VALUES (30, 100, 2000, TO_DATE('02-02-2022', 'dd-mm-yyyy'), TO_DATE('02-02-2022 23:00:00', 'dd-mm-yyyy hh24:mi:ss'));
INSERT INTO my_gtt VALUES (30, 200, 2000, TO_DATE('02-02-2022 02:14:00', 'dd-mm-yyyy hh24:mi:ss'), TO_DATE('04-02-2022', 'dd-mm-yyyy'));
INSERT INTO my_gtt VALUES (40, 100, 2000, TO_DATE('07-02-2022 03:00:00', 'dd-mm-yyyy hh24:mi:ss'), TO_DATE('08-02-2022 23:10:00', 'dd-mm-yyyy hh24:mi:ss'));
INSERT INTO my_gtt VALUES (50, 200, 2000, TO_DATE('04-02-2022', 'dd-mm-yyyy'), TO_DATE('04-02-2022 21:37:00', 'dd-mm-yyyy hh24:mi:ss'));
INSERT INTO my_gtt VALUES (50, 200, 3000, TO_DATE('04-02-2022 02:12:00', 'dd-mm-yyyy hh24:mi:ss'), TO_DATE('05-02-2022 23:31:00', 'dd-mm-yyyy hh24:mi:ss'));
From Oracle 12, MATCH_RECOGNIZE is the simplest solution:
SELECT *
FROM my_gtt
MATCH_RECOGNIZE (
PARTITION BY id
ORDER BY date_from, date_to
MEASURES
MIN(date_from) AS start_date,
MAX(date_to) AS end_date
PATTERN (overlap* last_row)
DEFINE
overlap AS MAX(date_to) >= NEXT(date_from)
);
However, if you are on an earlier version you can find the output using:
SELECT id,
MIN(dt) AS date_from,
MAX(dt) AS date_to
FROM (
SELECT id,
dt,
SUM(value) OVER (PARTITION BY id ORDER BY dt, ROWNUM) AS match_no
FROM (
SELECT id,
dt,
type * SUM(type) OVER (PARTITION BY id ORDER BY dt, ROWNUM) AS value
FROM my_gtt
UNPIVOT (dt FOR type IN (date_from AS 1, date_to AS -1))
)
WHERE value IN (1,0)
)
GROUP BY id, match_no
Which, for the sample data:
INSERT INTO my_gtt VALUES (10, 100, 1000, TO_DATE('02-02-2022', 'dd-mm-yyyy'), TO_DATE('03-02-2022 23:57:00', 'dd-mm-yyyy hh24:mi:ss'));
INSERT INTO my_gtt VALUES (10, 100, 1000, TO_DATE('07-02-2022 01:00:00', 'dd-mm-yyyy hh24:mi:ss'), TO_DATE('08-02-2022', 'dd-mm-yyyy'));
INSERT INTO my_gtt VALUES (10, 100, 2000, TO_DATE('02-02-2022', 'dd-mm-yyyy'), TO_DATE('02-02-2022 23:00:00', 'dd-mm-yyyy hh24:mi:ss'));
INSERT INTO my_gtt VALUES (10, 100, 2000, TO_DATE('07-02-2022 03:00:00', 'dd-mm-yyyy hh24:mi:ss'), TO_DATE('08-02-2022', 'dd-mm-yyyy'));
INSERT INTO my_gtt VALUES (10, 200, 2000, TO_DATE('02-02-2022 02:14:00', 'dd-mm-yyyy hh24:mi:ss'), TO_DATE('04-02-2022 21:37:00', 'dd-mm-yyyy hh24:mi:ss'));
INSERT INTO my_gtt VALUES (20, 100, 1000, TO_DATE('01-02-2022 05:00:00', 'dd-mm-yyyy hh24:mi:ss'), TO_DATE('03-02-2022', 'dd-mm-yyyy'));
INSERT INTO my_gtt VALUES (30, 100, 2000, TO_DATE('02-02-2022', 'dd-mm-yyyy'), TO_DATE('02-02-2022 23:00:00', 'dd-mm-yyyy hh24:mi:ss'));
INSERT INTO my_gtt VALUES (30, 200, 2000, TO_DATE('02-02-2022 02:14:00', 'dd-mm-yyyy hh24:mi:ss'), TO_DATE('04-02-2022', 'dd-mm-yyyy'));
INSERT INTO my_gtt VALUES (40, 100, 2000, TO_DATE('07-02-2022 03:00:00', 'dd-mm-yyyy hh24:mi:ss'), TO_DATE('08-02-2022 23:10:00', 'dd-mm-yyyy hh24:mi:ss'));
INSERT INTO my_gtt VALUES (50, 200, 2000, TO_DATE('04-02-2022', 'dd-mm-yyyy'), TO_DATE('04-02-2022 21:37:00', 'dd-mm-yyyy hh24:mi:ss'));
INSERT INTO my_gtt VALUES (50, 200, 3000, TO_DATE('04-02-2022 02:12:00', 'dd-mm-yyyy hh24:mi:ss'), TO_DATE('05-02-2022 23:31:00', 'dd-mm-yyyy hh24:mi:ss'));
INSERT INTO my_gtt VALUES (60, 200, 3000, DATE '2022-01-01', DATE '2022-01-10');
INSERT INTO my_gtt VALUES (60, 200, 3000, DATE '2022-01-02', DATE '2022-01-04');
INSERT INTO my_gtt VALUES (60, 200, 3000, DATE '2022-01-06', DATE '2022-01-11');
INSERT INTO my_gtt VALUES (60, 200, 3000, DATE '2022-01-13', DATE '2022-01-16');
INSERT INTO my_gtt VALUES (60, 200, 3000, DATE '2022-01-14', DATE '2022-01-15');
Both output:
| ID | START_DATE | END_DATE |
|--|--|--|
| 10 | 2022-02-02 00:00:00 | 2022-02-04 21:37:00 |
| 10 | 2022-02-07 01:00:00 | 2022-02-08 00:00:00 |
| 20 | 2022-02-01 05:00:00 | 2022-02-03 00:00:00 |
| 30 | 2022-02-02 00:00:00 | 2022-02-04 00:00:00 |
| 40 | 2022-02-07 03:00:00 | 2022-02-08 23:10:00 |
| 50 | 2022-02-04 00:00:00 | 2022-02-05 23:31:00 |
| 60 | 2022-01-01 00:00:00 | 2022-01-11 00:00:00 |
| 60 | 2022-01-13 00:00:00 | 2022-01-16 00:00:00 |
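The merging rule both queries implement (extend the current period while the next row starts no later than the largest end seen so far, otherwise start a new period) can be sketched procedurally. The Python below is only an illustration with a hypothetical function name `merge_periods`, run on the rows for ids 10 and 60 from the sample data; it is not a translation of either query.

```python
from collections import defaultdict
from datetime import datetime


def merge_periods(rows):
    """Hypothetical helper: merge overlapping (id, start, end) rows per id."""
    by_id = defaultdict(list)
    for id_, start, end in rows:
        by_id[id_].append((start, end))

    merged = []
    for id_ in sorted(by_id):
        current_start = current_end = None
        for start, end in sorted(by_id[id_]):
            if current_start is None:
                current_start, current_end = start, end
            elif start <= current_end:
                # Overlaps or touches the running period: extend it.
                current_end = max(current_end, end)
            else:
                # Gap found: close the running period and open a new one.
                merged.append((id_, current_start, current_end))
                current_start, current_end = start, end
        merged.append((id_, current_start, current_end))
    return merged


ROWS = [
    (10, datetime(2022, 2, 2), datetime(2022, 2, 3, 23, 57)),
    (10, datetime(2022, 2, 7, 1), datetime(2022, 2, 8)),
    (10, datetime(2022, 2, 2), datetime(2022, 2, 2, 23)),
    (10, datetime(2022, 2, 7, 3), datetime(2022, 2, 8)),
    (10, datetime(2022, 2, 2, 2, 14), datetime(2022, 2, 4, 21, 37)),
    (60, datetime(2022, 1, 1), datetime(2022, 1, 10)),
    (60, datetime(2022, 1, 2), datetime(2022, 1, 4)),
    (60, datetime(2022, 1, 6), datetime(2022, 1, 11)),
    (60, datetime(2022, 1, 13), datetime(2022, 1, 16)),
    (60, datetime(2022, 1, 14), datetime(2022, 1, 15)),
]

merged_rows = merge_periods(ROWS)
for row in merged_rows:
    print(*row)
```

Tracking the running maximum end, rather than the previous row's end, matters for id 60: the row ending on 2022-01-04 would otherwise hide the overlap between the first and third rows.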
SQL pattern matching can help:
select * from my_gtt match_recognize (
partition by id
order by date_from, date_to
measures
min ( date_from ) start_date,
max ( date_to ) end_date
pattern ( overlap* gap )
define
overlap as next ( date_from ) <= max ( date_to )
);
ID START_DATE END_DATE
---------- -------------------- --------------------
10 02-FEB-2022 00:00:00 04-FEB-2022 21:37:00
10 07-FEB-2022 01:00:00 08-FEB-2022 00:00:00
20 01-FEB-2022 05:00:00 03-FEB-2022 00:00:00
30 02-FEB-2022 00:00:00 04-FEB-2022 00:00:00
40 07-FEB-2022 03:00:00 08-FEB-2022 23:10:00
50 04-FEB-2022 00:00:00 05-FEB-2022 23:31:00
I discuss how this works in more detail in pattern matching use cases

How do I do this ? Time interval

There is a table:
(1, 'b', '2010-01-01 00:00:00', '2020-01-01 00:00:00'),
(1, 'z', '2010-02-01 00:00:00', '2015-01-01 00:00:00'),
How can I turn it into this:
(1, 'b', '2010-01-01 00:00:00', '2010-01-31 23:59:59'),
(1, 'z', '2010-02-01 00:00:00', '2015-01-01 00:00:00'),
(1, 'b', '2015-01-01 00:00:01', '2020-01-01 00:00:00');
You can do it this way:
I didn't add the part where you take a second away from the end date or add a second to the start date, as I didn't see the logic there.
with cte as
(
select 1 as a, 'b' as b, cast('2010-01-01 00:00:00'as date) as start_, cast('2020-01-01 00:00:00'as date) as end_
union select 1, 'z', '2010-02-01 00:00:00', '2015-01-01 00:00:00'
),
cte2 as
(
select start_ as date_ from cte union select end_ from cte
),
cte3 as
(
select a, b, date_ from cte2 a inner join cte b on date_ between start_ and end_
),
final as
(
select a.a, a.b, a.date_ as startdate,
case when a.b = lead(a.b)over(order by a.date_) then lead(a.date_)over(order by a.date_) end as enddate
from cte3 a
)
select * from final where enddate is not null order by startdate
Output:
a b startdate enddate
1 b 2010-01-01 2010-02-01
1 z 2010-02-01 2015-01-01
1 b 2015-01-01 2020-01-01
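The one-second adjustment that the query leaves out can be sketched separately. The Python below assumes, and this is only a reading of the two-row example in the question, that the inner period lies inside the outer one and takes precedence over it; the function name `split_around` is hypothetical.

```python
from datetime import datetime, timedelta

ONE_SECOND = timedelta(seconds=1)


def split_around(outer, inner):
    """Hypothetical helper: cut `outer` into the parts not covered by `inner`."""
    key, label, start, end = outer
    _, _, inner_start, inner_end = inner
    pieces = []
    if start < inner_start:
        # Part of the outer period before the inner one, ending a second earlier.
        pieces.append((key, label, start, inner_start - ONE_SECOND))
    pieces.append(inner)
    if inner_end < end:
        # Part of the outer period after the inner one, starting a second later.
        pieces.append((key, label, inner_end + ONE_SECOND, end))
    return pieces


outer_row = (1, "b", datetime(2010, 1, 1), datetime(2020, 1, 1))
inner_row = (1, "z", datetime(2010, 2, 1), datetime(2015, 1, 1))

pieces = split_around(outer_row, inner_row)
for piece in pieces:
    print(*piece)
```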

Count of IDs at a specific status at various set time intervals

I'm trying to create a histogram to identify situations where a large number of IDs have the status "CASTING" simultaneously.
You will notice I have a rank column present. I have attempted this problem in a number of ways; the most recent was to partition over the IDs, round the event_time column down to the nearest 15-minute interval, and then count the number of IDs with the "CASTING" status per interval. Unfortunately, this doesn't get me where I need to be, as it simply sums the status changes in each interval rather than giving a running total that includes the IDs that were already at 'CASTING' before that interval.
I lack direction on how to generate a list of the 15-minute intervals between 14:00 and 17:30 (if this is even possible?!). This is not strictly necessary, as I can resort to hardcoding them and then perhaps performing some kind of join to the list.
I think I've been staring at this problem for so long that I can't see the obvious solution. Is anyone able to provide a high-level outline of a method I can use to achieve the desired results below?
Here is a test dataset:
CREATE TABLE Table1
(`id` varchar(1), `loc` int, `status` varchar(7), `date` datetime, `event_time` datetime, `rnk` int)
;
INSERT INTO Table1
(`id`, `loc`, `status`, `date`, `event_time`, `rnk`)
VALUES
('A', 1, 'READY', '2019-08-04 00:00:00', '2019-08-04 15:39:00', 1),
('A', 1, 'CASTING', '2019-08-04 00:00:00', '2019-08-04 14:09:00', 1),
('A', 1, 'QUEUED', '2019-08-04 00:00:00', '2019-08-04 12:59:00', 1),
('B', 1, 'READY', '2019-08-04 00:00:00', '2019-08-04 23:59:00', 1),
('B', 1, 'CASTING', '2019-08-04 00:00:00', '2019-08-04 13:52:00', 1),
('B', 1, 'QUEUED', '2019-08-04 00:00:00', '2019-08-04 13:44:00', 1),
('C', 1, 'READY', '2019-08-04 00:00:00', '2019-08-04 17:59:00', 1),
('C', 1, 'CASTING', '2019-08-04 00:00:00', '2019-08-04 14:59:00', 1),
('C', 1, 'QUEUED', '2019-08-04 00:00:00', '2019-08-04 11:59:00', 1),
('D', 1, 'READY', '2019-08-04 00:00:00', '2019-08-04 13:59:00', 1),
('D', 1, 'CASTING', '2019-08-04 00:00:00', '2019-08-04 12:59:00', 1),
('D', 1, 'QUEUED', '2019-08-04 00:00:00', '2019-08-04 11:59:00', 1),
('E', 1, 'READY', '2019-08-04 00:00:00', '2019-08-04 21:51:00', 1),
('E', 1, 'CASTING', '2019-08-04 00:00:00', '2019-08-04 18:59:00', 1),
('E', 1, 'QUEUED', '2019-08-04 00:00:00', '2019-08-04 11:59:00', 1)
;
My output should look something like this:
Date Count_at_casting
08/04/2019 14:00 1
08/04/2019 14:15 2
08/04/2019 14:30 2
08/04/2019 14:45 2
08/04/2019 15:00 3
08/04/2019 15:15 3
08/04/2019 15:30 3
08/04/2019 15:45 2
08/04/2019 16:00 2
08/04/2019 16:15 2
08/04/2019 16:30 2
08/04/2019 16:45 2
08/04/2019 17:00 2
08/04/2019 17:15 2
08/04/2019 17:30 2
Id loc status date START END
A 1 CASTING 08/04/2019 00:00 08/04/2019 14:09 08/04/2019 15:39
B 1 CASTING 08/04/2019 00:00 08/04/2019 13:52 08/04/2019 23:59
C 1 CASTING 08/04/2019 00:00 08/04/2019 14:59 08/04/2019 17:59
D 1 CASTING 08/04/2019 00:00 08/04/2019 12:59 08/04/2019 13:59
E 1 CASTING 08/04/2019 00:00 08/04/2019 18:59 08/04/2019 21:51
Your output does not match your input; I assume that is just a sample. The following is for Oracle. [from dual] in Oracle just selects one row, and the first WITH clause simulates the input. Oracle's to_char and to_date convert between dates and strings. Similar functions are available in SQL Server, along with DATEPART-style functions.
[
with table1 as (
select 'A' id, 1 loc, 'READY' status, to_date('2019-08-04 00:00:00','YYYY-MM-DD HH24:MI:SS') datetime, to_date('2019-08-04 15:39:00', 'YYYY-MM-DD HH24:MI:SS') event_time from dual union all
select 'A', 1, 'CASTING', to_date('2019-08-04 00:00:00','YYYY-MM-DD HH24:MI:SS'), to_date('2019-08-04 14:09:00', 'YYYY-MM-DD HH24:MI:SS') from dual union all
select 'A', 1, 'QUEUED', to_date('2019-08-04 00:00:00','YYYY-MM-DD HH24:MI:SS'), to_date('2019-08-04 12:59:00', 'YYYY-MM-DD HH24:MI:SS') from dual union all
select 'B', 1, 'READY', to_date('2019-08-04 00:00:00','YYYY-MM-DD HH24:MI:SS'), to_date('2019-08-04 23:59:00', 'YYYY-MM-DD HH24:MI:SS') from dual union all
select 'B', 1, 'CASTING', to_date('2019-08-04 00:00:00','YYYY-MM-DD HH24:MI:SS'), to_date('2019-08-04 13:52:00', 'YYYY-MM-DD HH24:MI:SS') from dual union all
select 'B', 1, 'QUEUED', to_date('2019-08-04 00:00:00','YYYY-MM-DD HH24:MI:SS'), to_date('2019-08-04 13:44:00', 'YYYY-MM-DD HH24:MI:SS') from dual union all
select 'C', 1, 'READY', to_date('2019-08-04 00:00:00','YYYY-MM-DD HH24:MI:SS'), to_date('2019-08-04 17:59:00', 'YYYY-MM-DD HH24:MI:SS') from dual union all
select 'C', 1, 'CASTING', to_date('2019-08-04 00:00:00','YYYY-MM-DD HH24:MI:SS'), to_date('2019-08-04 14:59:00', 'YYYY-MM-DD HH24:MI:SS') from dual union all
select 'C', 1, 'QUEUED', to_date('2019-08-04 00:00:00','YYYY-MM-DD HH24:MI:SS'), to_date('2019-08-04 11:59:00', 'YYYY-MM-DD HH24:MI:SS') from dual union all
select 'D', 1, 'READY', to_date('2019-08-04 00:00:00','YYYY-MM-DD HH24:MI:SS'), to_date('2019-08-04 13:59:00', 'YYYY-MM-DD HH24:MI:SS') from dual union all
select 'D', 1, 'CASTING', to_date('2019-08-04 00:00:00','YYYY-MM-DD HH24:MI:SS'), to_date('2019-08-04 12:59:00', 'YYYY-MM-DD HH24:MI:SS') from dual union all
select 'D', 1, 'QUEUED', to_date('2019-08-04 00:00:00','YYYY-MM-DD HH24:MI:SS'), to_date('2019-08-04 11:59:00', 'YYYY-MM-DD HH24:MI:SS') from dual union all
select 'E', 1, 'READY', to_date('2019-08-04 00:00:00','YYYY-MM-DD HH24:MI:SS'), to_date('2019-08-04 21:51:00', 'YYYY-MM-DD HH24:MI:SS') from dual union all
select 'E', 1, 'CASTING', to_date('2019-08-04 00:00:00','YYYY-MM-DD HH24:MI:SS'), to_date('2019-08-04 18:59:00', 'YYYY-MM-DD HH24:MI:SS') from dual union all
select 'E', 1, 'QUEUED', to_date('2019-08-04 00:00:00','YYYY-MM-DD HH24:MI:SS'), to_date('2019-08-04 11:59:00', 'YYYY-MM-DD HH24:MI:SS') from dual
),
table1_with_rounded_min as
(
select y.*,
to_date(to_char(event_time,'YYYYMMDDHH24') || ltrim(to_char(event_round_15)),'YYYYMMDDHH24MI') event_time_round15
from
(
select x.*,
trunc(x.event_min / 15)*15 event_round_15
from
(
select table1.*, to_char(event_time,'MI') event_min
from table1
) x
) y
)
select to_char(event_time_round15, 'MM/DD/YY HH24:MI') date1,
sum(case when status = 'CASTING' then 1 else 0 end) Count_at_casting
from table1_with_rounded_min
group by event_time_round15
order by 1
]
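The query above counts the CASTING events that fall into each 15-minute bucket. For the running count the question asks for, one possible approach is to treat each ID as being at CASTING from its CASTING event until its READY event and to count the active IDs at every tick. The Python sketch below assumes exactly that interpretation (taken from the START/END table in the question); the function name `casting_counts` is hypothetical.

```python
from datetime import datetime, timedelta

# CASTING start and READY time per ID, from the START/END table in the question.
CASTING = {
    "A": (datetime(2019, 8, 4, 14, 9), datetime(2019, 8, 4, 15, 39)),
    "B": (datetime(2019, 8, 4, 13, 52), datetime(2019, 8, 4, 23, 59)),
    "C": (datetime(2019, 8, 4, 14, 59), datetime(2019, 8, 4, 17, 59)),
    "D": (datetime(2019, 8, 4, 12, 59), datetime(2019, 8, 4, 13, 59)),
    "E": (datetime(2019, 8, 4, 18, 59), datetime(2019, 8, 4, 21, 51)),
}


def casting_counts(intervals, first_tick, last_tick, step):
    """Hypothetical helper: number of IDs at CASTING at each tick."""
    counts = []
    tick = first_tick
    while tick <= last_tick:
        # An ID is active at a tick if it started on or before it and has not ended.
        active = sum(1 for start, end in intervals.values() if start <= tick < end)
        counts.append((tick, active))
        tick += step
    return counts


ticks = casting_counts(
    CASTING,
    datetime(2019, 8, 4, 14, 0),
    datetime(2019, 8, 4, 17, 30),
    timedelta(minutes=15),
)
for tick, active in ticks:
    print(tick.strftime("%m/%d/%Y %H:%M"), active)
```

In SQL the same idea would be a generated list of ticks joined to the CASTING intervals on start <= tick and tick < end, followed by a count per tick.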

Dense_rank query in sql(4 different columns) in

I have a table as follows:
Sn no. t_time Value rate
ABC 17-MAY-18 08:00:00 100.00 3
ABC 17-MAY-18 22:00:00 200.00 1
ABC 16-MAY-18 08:00:00 100.00 1
XYZ 14-MAY-18 01:00:00 700.00 1
XYZ 15-MAY-18 10:00:00 500.00 2
XYZ 15-MAY-18 13:00:00 100.00 2
And I want to generate the output as follows:
Sn no. New_value
ABC 150
XYZ 450
It is grouped by the Sn no. The New_value is computed by taking, for each date, the row with the latest time, multiplying its value by its rate, and then averaging those products.
For example, the new_value for ABC is
the average of (100*1) and (200*1).
It's a large dataset. How do I write a query for the above in the most efficient way? Please help.
You can use an analytic function (row_number()) to achieve the result:
WITH cte_table(Snno, t_time, Value, rate) AS (
SELECT 'ABC', to_date('2018-05-17 08:00:00', 'YYYY-MM-DD HH24:MI:SS'), 100.00, 3 FROM DUAL UNION ALL
SELECT 'ABC', to_date('2018-05-17 22:00:00', 'YYYY-MM-DD HH24:MI:SS'), 200.00, 1 FROM DUAL UNION ALL
SELECT 'ABC', to_date('2018-05-16 08:00:00', 'YYYY-MM-DD HH24:MI:SS'), 100.00, 1 FROM DUAL UNION ALL
SELECT 'XYZ', to_date('2018-05-14 01:00:00', 'YYYY-MM-DD HH24:MI:SS'), 700.00, 1 FROM DUAL UNION ALL
SELECT 'XYZ', to_date('2018-05-15 10:00:00', 'YYYY-MM-DD HH24:MI:SS'), 500.00, 2 FROM DUAL UNION ALL
SELECT 'XYZ', to_date('2018-05-15 13:00:00', 'YYYY-MM-DD HH24:MI:SS'), 100.00, 2 FROM DUAL),
--------------------------------
-- End of data preparation
--------------------------------
rn_table AS (
-- partition by serial number as well, so that two serials sharing a date do not compete
SELECT t.*, row_number() OVER (PARTITION BY snno, TRUNC(t_time) ORDER BY t_time DESC) AS rn
FROM cte_table t)
SELECT snno,
AVG(VALUE * rate) new_value
FROM rn_table
WHERE rn = 1
GROUP BY snno;
Output:
SNNO NEW_VALUE
---- ----------
ABC 150
XYZ 450
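The rule from the question (per serial number and calendar date keep only the latest row, multiply its value by its rate, then average per serial number) can be verified by hand with a short Python sketch. The function name `new_values` is hypothetical, and the rows are the six sample rows from the question.

```python
from datetime import datetime

# (sn, t_time, value, rate) rows from the question.
ROWS = [
    ("ABC", datetime(2018, 5, 17, 8), 100.0, 3),
    ("ABC", datetime(2018, 5, 17, 22), 200.0, 1),
    ("ABC", datetime(2018, 5, 16, 8), 100.0, 1),
    ("XYZ", datetime(2018, 5, 14, 1), 700.0, 1),
    ("XYZ", datetime(2018, 5, 15, 10), 500.0, 2),
    ("XYZ", datetime(2018, 5, 15, 13), 100.0, 2),
]


def new_values(rows):
    """Hypothetical helper: average of value * rate over the latest row per day."""
    latest = {}  # (sn, date) -> (t_time, value * rate) of the latest row seen
    for sn, t_time, value, rate in rows:
        key = (sn, t_time.date())
        if key not in latest or t_time > latest[key][0]:
            latest[key] = (t_time, value * rate)

    products = {}  # sn -> list of daily products
    for (sn, _day), (_t_time, product) in latest.items():
        products.setdefault(sn, []).append(product)
    return {sn: sum(p) / len(p) for sn, p in products.items()}


result = new_values(ROWS)
print(result)
```

The dictionary key (sn, date) plays the role of the PARTITION BY columns in the analytic-function query.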
Use the ROW_NUMBER (or RANK/DENSE_RANK if it is more appropriate) analytic function in a sub-query and then aggregate in the outer query:
SQL Fiddle
Oracle 11g R2 Schema Setup:
CREATE TABLE table_name ( Snno, t_time, Value, rate ) AS
SELECT 'ABC', TIMESTAMP '2018-05-17 08:00:00', 100.00, 3 FROM DUAL UNION ALL
SELECT 'ABC', TIMESTAMP '2018-05-17 22:00:00', 200.00, 1 FROM DUAL UNION ALL
SELECT 'ABC', TIMESTAMP '2018-05-16 08:00:00', 100.00, 1 FROM DUAL UNION ALL
SELECT 'XYZ', TIMESTAMP '2018-05-14 01:00:00', 700.00, 1 FROM DUAL UNION ALL
SELECT 'XYZ', TIMESTAMP '2018-05-15 10:00:00', 500.00, 2 FROM DUAL UNION ALL
SELECT 'XYZ', TIMESTAMP '2018-05-15 13:00:00', 100.00, 2 FROM DUAL;
Query 1:
SELECT snno,
AVG( value * rate ) As new_value
FROM (
SELECT t.*,
ROW_NUMBER() OVER (
PARTITION BY snno, value
ORDER BY t_time DESC
) AS rn
FROM table_name t
)
WHERE rn = 1
GROUP BY snno
Results:
| SNNO | NEW_VALUE |
|------|-------------------|
| ABC | 250 |
| XYZ | 633.3333333333334 |