Oracle help: dynamically average the data from days T-5 to T-1

select distinct trunc(a,'dd') day,
avg(g) over ( order by trunc(a,'dd') RANGE between 5 preceding and current row ) a1
from(
select to_date(concat(concat(a,' '),b),'yyyymmdd hh24mi') a, A1SPREAD21.c, A1SPREAD21.d,
A1SPREAD21.e, A1SPREAD21.f,A1SPREAD21.g, A1SPREAD21.h, A1SPREAD21.i, A1SPREAD21.j
from A1SPREAD21) t1
order by 1
The SQL code is listed above. However, I want to calculate a 5-day average dynamically: on day T, I want to use the data from the interval T-5 to T-1. Can someone help?
Table:
20100419 1034 IF1005 IF1006 3361.60 3388.60 -27 4695 527 316 24
20100419 1035 IF1005 IF1006 3365 3392.20 -27.20 4713 530 402 23
20100419 1036 IF1005 IF1006 3366 3392.80 -26.80 4722 527 408 16
20100419 1037 IF1005 IF1006 3367 3394 -27 4682 533 454 35
20100419 1038 IF1005 IF1006 3366.40 3395 -28.60 4741 529 301 28
20100419 1039 IF1005 IF1006 3366.40 3395 -28.60 4770 530 179 17
Edit:
The data is on Dropbox, in xlsx format:
https://www.dropbox.com/s/67y8mm0gims96us/a1spread21.xlsx
select avg(g)
from A1SPREAD21
where a between 20110101 and 20110110
The result is -27.00, so for 20110111 the SQL should give -27.00.
What I want is, for every trading day (a day that appears in the table, not every calendar day), the average over the previous T-5 to T-1 days.

I assume the clause should be this one:
avg(g) over (order by a RANGE between NUMTODSINTERVAL(5, 'day') PRECEDING
AND NUMTODSINTERVAL(1, 'day') PRECEDING)

select distinct trunc(a,'dd') day,
avg(g) over ( order by trunc(a,'dd') RANGE between 6 preceding and 1 preceding ) a1
from(
select to_date(concat(concat(a,' '),b),'yyyymmdd hh24mi') a, A1SPREAD21.c, A1SPREAD21.d,
A1SPREAD21.e, A1SPREAD21.f,A1SPREAD21.g, A1SPREAD21.h, A1SPREAD21.i, A1SPREAD21.j
from A1SPREAD21) t1
order by 1
I tested it with the following data:
SELECT DISTINCT TRUNC ( a, 'dd' ) DAY,
AVG ( g ) OVER ( ORDER BY TRUNC ( A, 'dd' ) RANGE BETWEEN 6 PRECEDING AND
1 preceding ) a1, g
FROM
(
SELECT sysdate - 10 a, 1 as g from dual union all
SELECT SYSDATE - 5 , 3 FROM dual UNION ALL
SELECT SYSDATE - 3 , 45 FROM dual UNION ALL
SELECT SYSDATE - 6 , 56 FROM dual UNION ALL
SELECT SYSDATE - 7 , 23 FROM dual UNION ALL
SELECT sysdate - 8 , 67 from dual union all
SELECT sysdate - 2 , 7 from dual union all
SELECT SYSDATE - 1 , 8 FROM dual UNION ALL
SELECT sysdate - 4 , 541 from dual
)
t1
ORDER BY 1
Output
| DAY | A1 | G |
|---------------------------------|-----------------|-----|
| December, 29 2013 00:00:00+0000 | (null) | 1 |
| December, 31 2013 00:00:00+0000 | 1 | 67 |
| January, 01 2014 00:00:00+0000 | 34 | 23 |
| January, 02 2014 00:00:00+0000 | 30.333333333333 | 56 |
| January, 03 2014 00:00:00+0000 | 36.75 | 3 |
| January, 04 2014 00:00:00+0000 | 30 | 541 |
| January, 05 2014 00:00:00+0000 | 138 | 45 |
| January, 06 2014 00:00:00+0000 | 122.5 | 7 |
| January, 07 2014 00:00:00+0000 | 112.5 | 8 |

try this one (testtable is, of course, only for testing):
with testtable as (
select date '2013-01-08' a,1 g from dual union all
select date '2013-01-02' a,2 g from dual union all
select date '2013-01-05' a,3 g from dual union all
select date '2013-01-07' a,4 g from dual
)
select distinct trunc(a,'dd') day,
avg(g) over(order by a rows between (select count(*)
from testtable tin
where trunc(a,'dd') between trunc(t.a,'dd') - 5 and trunc(t.a,'dd') - 1) preceding and current row) a1
from testtable t
order by 1
I used sum() instead of avg() to check the result, because an average is harder to verify quickly, but it does work.
I am not sure about the trunc(t.a,'dd') - 1 part of your requirement; maybe you want to remove the - 1?
See the Oracle documentation on windowing functions to learn more.
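Because the question asks for trading days that appear in the table rather than calendar days, one possible alternative is to collapse the data to one row per day first and then use a ROWS frame, which counts rows (days present in the table) instead of calendar distance. The following is an untested sketch, assuming that column a holds the date as yyyymmdd and that g is numeric; the names daily, trade_day, day_sum and day_cnt are introduced here for illustration only.

```sql
-- Sketch only: average g over the 5 previous trading days present in the table.
-- Assumes A1SPREAD21.a holds the date as yyyymmdd and g is numeric.
WITH daily AS (
  -- One row per trading day; keep sum and count so that the 5-day
  -- average weights every original row equally.
  SELECT TO_DATE(TO_CHAR(a), 'yyyymmdd') AS trade_day,
         SUM(g)                          AS day_sum,
         COUNT(g)                        AS day_cnt
  FROM   A1SPREAD21
  GROUP  BY TO_DATE(TO_CHAR(a), 'yyyymmdd')
)
SELECT trade_day,
       -- ROWS counts rows of "daily", i.e. trading days, not calendar days.
       SUM(day_sum) OVER (ORDER BY trade_day
                          ROWS BETWEEN 5 PRECEDING AND 1 PRECEDING)
       / NULLIF(SUM(day_cnt) OVER (ORDER BY trade_day
                                   ROWS BETWEEN 5 PRECEDING AND 1 PRECEDING), 0) AS a1
FROM   daily
ORDER  BY trade_day;
```

With this frame the first trading day gets NULL, and the second to fifth days are averaged over however many earlier trading days exist.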

Related

Pivot two columns and keep the values same in sql

I have created a query to get different time types and hours
SELECT calc_time.hours measure,
calc_time.payroll_time_type elements,
calc_time.person_id,
calc_time.start_time
FROM hwm_tm_rep_work_hours_sum_v calc_time,
per_all_people_f papf
WHERE grp_type_id = 200
AND payroll_time_type IN ( 'Afternoon shift',
'TL',
'Evening shift',
'Regular Pay ',
'OT' )
AND (To_date(To_char(calc_time.start_time, 'YYYY-MM-DD') , 'YYYY-MM-DD') BETWEEN To_date(To_char(:From_Date, 'YYYY-MM-DD'), 'YYYY-MM-DD')
AND To_date( To_char(:To_Date, 'YYYY-MM-DD'), 'YYYY-MM-DD' ))
AND papf.person_id = calc_time.person_id
I get output like this:
Start_time person_id elements measure
01-Jan-2021 198 Regular Pay 10
01-Jan-2021 198 OT 2
01-jAN-2021 198 Afternoon shift 2
16-JAN-2021 198 Regular Pay 10
17-JAN-2021 198 OT 3
20-JAN-2021 198 EVENING SHIFT 8
08-JAN-2021 11 Regular Pay 8
09-JAN-2021 11 OT 1
08-JAN-2021 11 tl 2
10-JAN-2021 12 Evening shift 9
11-JAN-2021 12 Evening shift 9
I want this output to be displayed as follows, restricted to the two dates that I pass as parameters (for example, from date 01-JAN-2021 and to date 31-JAN-2021):
person_id Regular_pay OT OTHER_MEASURE OTHER_CODE
198 20 5 2 Afternoon shift
198 20 5 8 EVENING SHIFT
11 8 1 2 TL
12 18 Evening shift
So the sums of Regular Pay and OT go in separate columns, and all other types go in other_measure and other_code.
How can I tweak the main query to achieve this?
You can use:
SELECT *
FROM (
SELECT c.person_id,
SUM(CASE c.payroll_time_type WHEN 'Regular Pay' THEN SUM(c.hours) END)
OVER (PARTITION BY c.person_id) AS regular_pay,
SUM(CASE c.payroll_time_type WHEN 'OT' THEN SUM(c.hours) END)
OVER (PARTITION BY c.person_id) AS OT,
SUM(c.hours) AS other_measure,
c.payroll_time_type AS Other_code
FROM hwm_tm_rep_work_hours_sum_v c
INNER JOIN per_all_people_f p
ON (p.person_id = c.person_id)
WHERE grp_type_id = 200
AND payroll_time_type IN (
'Afternoon shift',
'TL',
'Evening shift',
'Regular Pay',
'OT'
)
AND c.start_time >= TRUNC(:from_date)
AND c.start_time < TRUNC(:to_date) + INTERVAL '1' DAY
GROUP BY
c.person_id,
c.payroll_time_type
)
WHERE other_code NOT IN ('Regular Pay', 'OT');
Which, for the sample data:
CREATE TABLE hwm_tm_rep_work_hours_sum_v (start_time, person_id, payroll_time_type, hours) AS
SELECT DATE '2021-01-01', 198, 'Regular Pay', 10 FROM DUAL UNION ALL
SELECT DATE '2021-01-01', 198, 'OT', 2 FROM DUAL UNION ALL
SELECT DATE '2021-01-01', 198, 'Afternoon shift', 2 FROM DUAL UNION ALL
SELECT DATE '2021-01-16', 198, 'Regular Pay', 10 FROM DUAL UNION ALL
SELECT DATE '2021-01-17', 198, 'OT', 3 FROM DUAL UNION ALL
SELECT DATE '2021-01-20', 198, 'Evening shift', 8 FROM DUAL UNION ALL
SELECT DATE '2021-01-08', 11, 'Regular Pay', 8 FROM DUAL UNION ALL
SELECT DATE '2021-01-09', 11, 'OT', 1 FROM DUAL UNION ALL
SELECT DATE '2021-01-08', 11, 'TL', 2 FROM DUAL UNION ALL
SELECT DATE '2021-01-10', 12, 'Evening shift', 9 FROM DUAL UNION ALL
SELECT DATE '2021-01-11', 12, 'Evening shift', 9 FROM DUAL;
CREATE TABLE per_all_people_f (person_id, grp_type_id) AS
SELECT 198, 200 FROM DUAL UNION ALL
SELECT 11, 200 FROM DUAL UNION ALL
SELECT 12, 200 FROM DUAL;
Outputs:
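PERSON_ID | REGULAR_PAY | OT | OTHER_MEASURE | OTHER_CODE
--------: | ----------: | ---: | ------------: | :--------------
11 | 8 | 1 | 2 | TL
12 | null | null | 18 | Evening shift
198 | 20 | 5 | 2 | Afternoon shift
198 | 20 | 5 | 8 | Evening shift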
You could try something like this. Unfortunately, your question does not make clear which columns/values are available in which table.
SELECT
calc_time.person_id,
-- sum the hours (not start_time) and correlate on the grouped person_id
(select sum(t.hours) FROM hwm_tm_rep_work_hours_sum_v t where t.person_id = calc_time.person_id and t.payroll_time_type = 'Regular Pay') as Regular_Pay,
...
FROM hwm_tm_rep_work_hours_sum_v calc_time,
per_all_people_f papf
WHERE grp_type_id = 200
AND payroll_time_type IN ( 'Afternoon shift',
'TL',
'Evening shift',
'Regular Pay ',
'OT' )
AND (
To_date(To_char(calc_time.start_time, 'YYYY-MM-DD') , 'YYYY-MM-DD') BETWEEN To_date(To_char(:From_Date, 'YYYY-MM-DD'), 'YYYY-MM-DD')
AND To_date( To_char(:To_Date, 'YYYY-MM-DD'), 'YYYY-MM-DD' ) )
and papf.person_id = calc_time.person_id
-- use a group by
GROUP BY
calc_time.person_id
You may use aggregation and then apply the MODEL clause to calculate the required columns. Below is the code with comments, assuming you can handle the date filter yourself.
select *
from t
PERSON_ID | ELEMENTS | MEASURE
--------: | :-------------- | ------:
198 | Regular Pay | 1
198 | Regular Pay | 2
198 | Afternoon shift | 3
198 | Afternoon shift | 4
198 | OT | 5
198 | OT | 6
198 | EVENING SHIFT | 7
198 | EVENING SHIFT | 8
11 | Regular Pay | 11
11 | Regular Pay | 12
11 | TL | 13
11 | TL | 14
11 | EVENING SHIFT | 15
11 | EVENING SHIFT | 16
12 | TL | 21
12 | TL | 22
12 | EVENING SHIFT | 23
12 | EVENING SHIFT | 24
select
person_id,
ot,
regular_pay,
elements as other_code,
mes as other_measure
from (
/*First you need to aggregate all the measures by person_id and code*/
select
person_id,
elements,
sum(measure) as mes
from t
/*Date filter goes here*/
group by
person_id,
elements
)
model
/*RETURN UPDATED ROWS
will do the trick,
because we'll update only "other"
measures, so OT and Regular Pay will not go
to the output*/
return updated rows
/*Where to break the calculation*/
partition by (person_id)
/*To be able to reference by code*/
dimension by (elements)
measures (
mes,
0 as ot,
0 as regular_pay
)
rules upsert (
ot[
elements not in ('OT', 'Regular Pay')
] = sum(mes)['OT'],
regular_pay[
elements not in ('OT', 'Regular Pay')
] = sum(mes)['Regular Pay']
)
PERSON_ID | OT | REGULAR_PAY | OTHER_CODE | OTHER_MEASURE
--------: | ---: | ----------: | :-------------- | ------------:
198 | 11 | 3 | EVENING SHIFT | 15
198 | 11 | 3 | Afternoon shift | 7
11 | null | 23 | TL | 27
11 | null | 23 | EVENING SHIFT | 31
12 | null | null | TL | 43
12 | null | null | EVENING SHIFT | 47

Oracle: Calculate the count() based on the past 6 month interval for each rows

I have the following data (the data is available from 2017 - Present)
SELECT * FROM TABLE1 WHERE DATE > TO_DATE('01/01/2019','MM/DD/YYYY')
Emp_ID Date Vehicle_ID Working_Hours
1005 01/01/2019 X500 7
1005 01/02/2019 X500 6
1005 01/03/2019 X700 7
1005 01/04/2019 X500 5
1005 01/05/2019 X700 7
1005 01/06/2019 X500 7
1006 01/01/2019 X500 7
1006 01/02/2019 X500 6
1006 01/03/2019 X700 7
1006 01/04/2019 X500 5
1006 01/05/2019 X700 7
1006 01/06/2019 X500 7
I need to calculate two columns.
LAST_6M_UNIQ_Vehicle_Count ==> Count of Unique Vehicle ID in the last(past) 6 months for that employee
LAST_6M_Vehicle_Count ==> Count of all vehicle ID for that employee in the Past 6 months
Note: Past 6 month from the date column
Expected output:
Emp_ID Date Vehicle_ID Working_Hours LAST_6M_UNIQ_Vehicle_Count LAST_6M_Vehicle_Count
1005 01/01/2019 X500 7 6 66
1005 01/02/2019 X500 6 7 62
1005 01/03/2019 X700 7 6 63
1005 01/04/2019 X500 5 7 67
1005 01/05/2019 X700 7 7 66
1005 01/06/2019 X500 7 7 67
. . . .
. . . .
. . . .
1005 03/20/2019 X600 6 12 75
1006 01/01/2019 X500 7 11 74
1006 01/02/2019 X500 6 10 66
1006 01/03/2019 X700 7 11 72
1006 01/04/2019 X500 5 13 67
1006 01/05/2019 X700 7 12 64
1006 01/06/2019 X500 7 12 63
For example, in the first row, the value for LAST_6M_UNIQ_Vehicle_Count is 6 because for the employee id 1005, the unique count of vehicle id between ((01/01/2019) - 6 month) and 01/01/2019 has 6 different vehicle id in them.
I tried OVER and PARTITION BY, but the 6-month interval is missing:
SELECT t.*, COUNT(DISTINCT t.VEHICLE_ID) OVER (PARTITION BY t.EMP_ID ORDER BY t.DATE)
AS LAST_6M_UNIQ_Vehicle_Count
FROM TABLE1 t
I am not able to calculate the values based on 6 month interval for each rows.
Your help is much appreciated.
Oracle doesn't like COUNT( DISTINCT ... ) OVER ( ... ) when used in a windowed analytic function with a range and will raise an ORA-30487: ORDER BY not allowed here exception (otherwise, that would be the solution). It will work without the DISTINCT keyword but not with it.
Instead, you can use a correlated sub-query:
SELECT t.*,
( SELECT COUNT( DISTINCT vehicle_id )
FROM table_name c
WHERE c.emp_id = t.emp_id
AND c."DATE" <= t."DATE"
AND ADD_MONTHS( t."DATE", -6 ) <= c."DATE"
) AS last_6m_uniq_vehicle_count,
COUNT(t.vehicle_id) OVER (
PARTITION BY t.emp_id
ORDER BY t."DATE"
RANGE BETWEEN INTERVAL '6' MONTH PRECEDING
AND CURRENT ROW
) AS last_6m_vehicle_count
FROM table_name t
Which for the sample data:
CREATE TABLE table_name ( vehicle_id, emp_id, "DATE" ) AS
SELECT 1, 1, DATE '2020-08-31' FROM DUAL UNION ALL
SELECT 2, 1, DATE '2020-07-31' FROM DUAL UNION ALL
SELECT 1, 1, DATE '2020-06-30' FROM DUAL UNION ALL
SELECT 2, 1, DATE '2020-05-31' FROM DUAL UNION ALL
SELECT 2, 1, DATE '2020-04-30' FROM DUAL UNION ALL
SELECT 2, 1, DATE '2020-03-31' FROM DUAL UNION ALL
SELECT 2, 1, DATE '2020-02-29' FROM DUAL UNION ALL
SELECT 2, 1, DATE '2020-01-31' FROM DUAL UNION ALL
SELECT 3, 1, DATE '2020-01-31' FROM DUAL;
Outputs:
VEHICLE_ID | EMP_ID | DATE | LAST_6M_UNIQ_VEHICLE_COUNT | LAST_6M_VEHICLE_COUNT
---------: | -----: | :-------- | -------------------------: | --------------------:
2 | 1 | 31-JAN-20 | 2 | 2
3 | 1 | 31-JAN-20 | 2 | 2
2 | 1 | 29-FEB-20 | 2 | 3
2 | 1 | 31-MAR-20 | 2 | 4
2 | 1 | 30-APR-20 | 2 | 5
2 | 1 | 31-MAY-20 | 2 | 6
1 | 1 | 30-JUN-20 | 3 | 7
2 | 1 | 31-JUL-20 | 3 | 8
1 | 1 | 31-AUG-20 | 2 | 7
You can do this with window functions, and a range frame specification.
Computing the distinct count is a bit tricky: Oracle does not support it directly, but we can proceed in two steps. First perform a window count within employee/vehicle partitions, and then take into account only the first occurrence of each vehicle in the employee partition.
So:
select vehicle_id, emp_id, "DATE",
sum(case when flag = 1 then 1 else 0 end) over(
partition by emp_id
order by "DATE"
range between interval '6' month preceding and current row
) as last_6m_uniq_vehicle_count,
count(*) over (
partition by emp_id
order by "DATE"
range between interval '6' month preceding and current row
) as last_6m_vehicle_count
from (
select t.*,
count(*) over (
partition by emp_id , vehicle_id
order by "DATE"
range between interval '6' month preceding and current row
) as flag
from table_name t
) t
order by "DATE", vehicle_id
As MTO points out, count(distinct) cannot be used as a window function to solve this.
For that reason, I would go for a lateral join:
select t.*, l.*
from t cross join lateral
(select count(*) as last_6m_vehicle_count, count(distinct t2.vehicle_id) as last_6m_uniq_vehicle_count
from t t2
where t2.emp_id = t.emp_id and
t2.dte <= t.dte and
t2.dte > add_months(t.dte, -6)
) l;

ORACLE SQL - How to find the number of reliefs each teacher has, each day, 2 months before the teacher resigned?

I need some help in finding the number of reliefs each teacher has, every single day, 2 months before the teacher resigns.
Join_dt - teacher's join date,
Resign_dt - teacher's resign date,
Relief_ID - Relief teacher's ID,
Start_dt - Relief's start date,
End_dt - Relief's end date,
note that there may be overlapping dates between 2 or more different reliefs and so I need to find the number of distinct reliefs each teacher has for each date.
This is what I am given:
Teacher_ID Join_dt Resign_dt Relief_ID Start_dt End_dt
12 2006-08-30 2019-08-01 20 2017-02-07 2019-07-04
12 2006-08-30 2019-08-01 20 2016-11-10 2019-01-30
12 2006-08-30 2019-08-01 103 2016-08-20 2019-07-29
12 2006-08-30 2019-08-01 17 2016-01-30 2017-12-30
23 2017-10-01 2018-11-12 44 2018-10-19 2018-11-11
23 2017-10-01 2018-11-12 29 2018-04-01 2018-12-02
23 2017-10-01 2018-11-12 06 2017-11-25 2018-05-02
05 2015-02-11 2019-10-02 38 2019-01-17 2019-07-21
05 2015-02-11 2019-10-02 11 2018-11-02 2019-02-05
05 2015-02-11 2019-10-02 15 2018-09-30 2018-10-03
Expected result:
Teacher_ID Dates No_of_reliefs
12 2019-07-31 0
12 2019-07-30 0
12 2019-07-29 1
12 2019-07-28 1
12 2019-07-27 1
... ...
12 2019-07-04 2
... ...
12 2016-05-30 2
12 2016-05-29 2
12 2016-05-28 2
12 2016-05-27 2
12 2016-05-26 1
23 2018-10-31 2
... ...
For date 2019-07-29, No_of_reliefs = 1 because of Relief_ID 103.
For date 2019-07-04, No_of_reliefs = 2 because of Relief_ID 20 & 103.
Dates are supposed to start from 1 month before the teacher resigned. For Teacher_ID 23, since she resigned on 2018-11-12, dates shall start from 2018-10-31.
I have tried using connect by but the execution time is really long since it involves a large amount of data.
Any other methods will be greatly appreciated!!
Thank you kind souls!!!
You can use a
connect by level <= last_day(add_months(Resign_dt,-1)) - add_months(Resign_dt,-2) clause.
I assume you mean that the dates start 2 months before the resignation and end on the last day of the previous month.
with t1(Teacher_ID,Resign_dt,Relief_ID,start_dt,end_dt) as
(
select 12,date'2019-08-01',20 ,date'2017-02-07',date'2019-07-04' from dual union all
select 12,date'2019-08-01',20 ,date'2016-11-10',date'2019-01-30' from dual union all
select 12,date'2019-08-01',103,date'2016-08-20',date'2019-07-29' from dual
......
), t2 as
(
select distinct last_day(add_months(Resign_dt,-1)) - level + 1 as Resign_dt, Teacher_ID
from t1
connect by level <= last_day(add_months(Resign_dt,-1)) - add_months(Resign_dt,-2)
and prior Teacher_ID = Teacher_ID and prior sys_guid() is not null
)
select Teacher_ID, to_char(Resign_dt,'yyyy-mm-dd') as Dates,
(select count(distinct Relief_ID)
from t1
where t2.Resign_dt between start_dt and end_dt
and t2.Teacher_ID = Teacher_ID
)
from t2
order by Teacher_ID, Resign_dt desc;
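To see which dates this CONNECT BY clause generates, it can be run on its own for a single resignation date. The sketch below hard-codes 2019-08-01 (Teacher_ID 12 in the question) purely for illustration; it should list 60 consecutive days, from 2019-07-31 back to 2019-06-02.

```sql
-- Illustration only: the date list produced for one resignation date.
-- LAST_DAY(ADD_MONTHS(resign_dt, -1)) is the last day of the previous month;
-- ADD_MONTHS(resign_dt, -2) is the same day of month two months earlier.
SELECT LAST_DAY(ADD_MONTHS(resign_dt, -1)) - LEVEL + 1 AS dt
FROM   (SELECT DATE '2019-08-01' AS resign_dt FROM dual)
CONNECT BY LEVEL <= LAST_DAY(ADD_MONTHS(resign_dt, -1)) - ADD_MONTHS(resign_dt, -2)
ORDER  BY dt DESC;
```

Note that the day exactly two months before the resignation (2019-06-01 here) is not included, because the difference of the two dates is used as the row count.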
select d.dt
, tr.Teacher_ID
--, tr.Join_dt
--, tr.Resign_dt
, count(tr.Relief_ID)
--, tr.Start_dt
--, tr.End_dt
from tr
right outer join (
SELECT dt
FROM (
SELECT DATE '2006-01-01' + ROWNUM - 1 dt
FROM DUAL CONNECT BY ROWNUM < 5000
) q
WHERE EXTRACT(YEAR FROM dt) < EXTRACT(YEAR FROM sysdate) + 2
--order by 1
) d on d.dt between tr.Join_dt and tr.End_dt
and d.dt between tr.Start_dt and tr.Resign_dt
group by d.dt
, tr.Teacher_ID
order by d.dt desc

Sum values in column from previous rows, that fall within a certain time period, and share a unique identifier

I have a table with the columns: DATE, INDIVIDUAL_ID, CONDITION_FLAG. Each row represents an individual at a particular date. Each individual can appear at more than one date instance. The CONDITION_FLAG is either a 0, for condition not met, or a 1 for condition met.
I would like to count the number of 1s in the CONDITION_FLAG column, for that row's INDIVIDUAL_ID, that occurred between the date of a row and 60 days before that date (inclusive).
Please see the following example:
SELECT DATE
, INDIVIDUAL_ID
, CONDITION_FLAG
FROM ORIGINAL_TABLE
;
-- ORIGINAL TABLE:
---------------------------------------------
DATE INDIVIDUAL_ID CONDITION_FLAG
---------------------------------------------
15/07/19 01 0
12/07/19 01 1
01/07/19 01 1
30/06/19 01 1
15/07/19 02 1
11/07/19 02 0
29/06/19 02 1
14/07/19 03 0
02/07/19 03 1
30/06/19 03 0
28/06/19 01 0
---------------------------------------------
What do I have to add to the query above to create the PREV_CONDITION_COUNT column as shown below?
-- DESIRED TABLE:
---------------------------------------------------------------------
DATE INDIVIDUAL_ID CONDITION_FLAG PREV_CONDITION_COUNT
---------------------------------------------------------------------
15/07/19 01 0 3
12/07/19 01 1 3
01/07/19 01 1 2
30/06/19 01 1 1
15/07/19 02 1 2
11/07/19 02 0 1
29/06/19 02 1 1
14/07/19 03 0 1
02/07/19 03 1 1
30/06/19 03 0 0
28/06/19 01 0 0
---------------------------------------------------------------------
All help is appreciated. Thank you!
You can use a windowed analytic function to do it in a single table scan (without any joins or needing a correlated sub-query):
Oracle Setup:
CREATE TABLE ORIGINAL_TABLE (DATE1, INDIVIDUAL_ID, CONDITION_FLAG ) AS
SELECT DATE '2019-07-15', '01', 0 FROM DUAL UNION ALL
SELECT DATE '2019-07-12', '01', 1 FROM DUAL UNION ALL
SELECT DATE '2019-07-01', '01', 1 FROM DUAL UNION ALL
SELECT DATE '2019-06-30', '01', 1 FROM DUAL UNION ALL
SELECT DATE '2019-07-15', '02', 1 FROM DUAL UNION ALL
SELECT DATE '2019-07-11', '02', 0 FROM DUAL UNION ALL
SELECT DATE '2019-06-29', '02', 1 FROM DUAL UNION ALL
SELECT DATE '2019-07-14', '03', 0 FROM DUAL UNION ALL
SELECT DATE '2019-07-02', '03', 1 FROM DUAL UNION ALL
SELECT DATE '2019-06-30', '03', 0 FROM DUAL UNION ALL
SELECT DATE '2019-06-28', '01', 0 FROM DUAL
Query:
SELECT t.*,
SUM( CONDITION_FLAG ) OVER (
PARTITION BY INDIVIDUAL_ID
ORDER BY DATE1
RANGE BETWEEN 60 PRECEDING AND 0 PRECEDING
) PREV_CONDITION_COUNT
FROM ORIGINAL_TABLE t
ORDER BY INDIVIDUAL_ID, DATE1 DESC
Output:
DATE1 | INDIVIDUAL_ID | CONDITION_FLAG | PREV_CONDITION_COUNT
:-------- | :------------ | -------------: | -------------------:
15-JUL-19 | 01 | 0 | 3
12-JUL-19 | 01 | 1 | 3
01-JUL-19 | 01 | 1 | 2
30-JUN-19 | 01 | 1 | 1
28-JUN-19 | 01 | 0 | 0
15-JUL-19 | 02 | 1 | 2
11-JUL-19 | 02 | 0 | 1
29-JUN-19 | 02 | 1 | 1
14-JUL-19 | 03 | 0 | 1
02-JUL-19 | 03 | 1 | 1
30-JUN-19 | 03 | 0 | 0
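For a DATE ordering column, a numeric RANGE offset is measured in days, so the same frame can also be written with an explicit day interval, which some readers may find easier to follow. This is a sketch against the same ORIGINAL_TABLE setup as above and should give the same counts.

```sql
-- Same 60-day look-back, written with an explicit interval.
SELECT t.*,
       SUM(condition_flag) OVER (
         PARTITION BY individual_id
         ORDER BY date1
         RANGE BETWEEN INTERVAL '60' DAY PRECEDING AND CURRENT ROW
       ) AS prev_condition_count
FROM   original_table t
ORDER  BY individual_id, date1 DESC;
```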
db<>fiddle here
You can use the following self join for it:
SELECT
OT1.DATE1,
OT1.INDIVIDUAL_ID,
OT1.CONDITION_FLAG,
SUM(CASE
WHEN OT2.DATE1 BETWEEN OT1.DATE1 - 60 AND OT1.DATE1 THEN OT2.CONDITION_FLAG
END) AS P
FROM
ORIGINAL_TABLE OT1
JOIN ORIGINAL_TABLE OT2 ON ( OT1.INDIVIDUAL_ID = OT2.INDIVIDUAL_ID )
GROUP BY
OT1.DATE1,
OT1.INDIVIDUAL_ID,
OT1.CONDITION_FLAG;
db<>fiddle demo
-- In the demo, You can ignore ID column as it was just used to give you output in order defined in the question
Cheers!!
Another option: a correlated subquery:
SQL> with test (cdate, individual_id, conditional_flag) as
(select date '2019-07-15', '01', 0 from dual union all
select date '2019-07-12', '01', 1 from dual union all
select date '2019-07-01', '01', 1 from dual union all
select date '2019-06-30', '01', 1 from dual union all
--
select date '2019-07-15', '02', 1 from dual union all
select date '2019-07-11', '02', 0 from dual union all
select date '2019-06-29', '02', 1 from dual union all
--
select date '2019-07-14', '03', 0 from dual union all
select date '2019-07-02', '03', 1 from dual union all
select date '2019-06-30', '03', 0 from dual union all
--
select date '2019-06-28', '01', 0 from dual
)
select t.cdate,
t.individual_id,
t.conditional_flag,
--
(select count(*)
from test t1
where t1.individual_id = t.individual_id
and t1.conditional_flag = 1
and t1.cdate between t.cdate - 60 and t.cdate
) prev_condition_count
from test t
order by t.individual_id,
t.cdate desc;
CDATE IN CONDITIONAL_FLAG PREV_CONDITION_COUNT
---------- -- ---------------- --------------------
15/07/2019 01 0 3
12/07/2019 01 1 3
01/07/2019 01 1 2
30/06/2019 01 1 1
28/06/2019 01 0 0
15/07/2019 02 1 2
11/07/2019 02 0 1
29/06/2019 02 1 1
14/07/2019 03 0 1
02/07/2019 03 1 1
30/06/2019 03 0 0
11 rows selected.
SQL>

Oracle Event Count Query

My SAMPLE table has the following five columns:
sample_id (PK) (NUMBER)
sampled_on (DATE)
received_on (DATE)
completed_on (DATE)
authorized_on (DATE)
I would like a query with one row per hour (constrained by a given date range) and five columns:
The hour YYYY-MM-DD HH24
Number of samples sampled during that hour
Number of samples received during that hour
Number of samples completed during that hour
Number of samples authorized during that hour
Please provide a query or at least a point in the right direction.
Reopened with bounty:
+300 reputation for the first person to incorporate Rob van Wijk's answer (single access to sample) into a view where I can efficiently query by date range (start_date/end_date or start_date/num_days).
Try:
CREATE OR REPLACE VIEW my_view AS
WITH date_bookends AS (
-- truncate the earliest date to the hour so that date_by_hour lines up with TRUNC(..., 'hh24') below
SELECT TRUNC(LEAST(MIN(t.sampled_on), MIN(t.received_on), MIN(t.completed_on), MIN(t.authorized_on)), 'hh24') min_date,
GREATEST(MAX(t.sampled_on), MAX(t.received_on), MAX(t.completed_on), MAX(t.authorized_on)) max_date
FROM SAMPLE t),
all_hours AS (
SELECT t.min_date + numtodsinterval(LEVEL - 1,'hour') date_by_hour
FROM date_bookends t
CONNECT BY LEVEL <= ( t.max_date - t.min_date + 1) * 24)
SELECT h.date_by_hour,
COUNT(CASE WHEN h.date_by_hour = TRUNC(s.sampled_on,'hh24') THEN 1 END) sampled#,
COUNT(CASE WHEN h.date_by_hour = TRUNC(s.received_on,'hh24') THEN 1 END) received#,
COUNT(CASE WHEN h.date_by_hour = TRUNC(s.completed_on,'hh24') THEN 1 END) completed#,
COUNT(CASE WHEN h.date_by_hour = TRUNC(s.authorized_on,'hh24') THEN 1 END) authorized#
FROM all_hours h
CROSS JOIN sample s
GROUP BY h.date_by_hour
Without using Subquery Factoring:
CREATE OR REPLACE VIEW my_view AS
SELECT h.date_by_hour,
COUNT(CASE WHEN h.date_by_hour = TRUNC(s.sampled_on,'hh24') THEN 1 END) sampled#,
COUNT(CASE WHEN h.date_by_hour = TRUNC(s.received_on,'hh24') THEN 1 END) received#,
COUNT(CASE WHEN h.date_by_hour = TRUNC(s.completed_on,'hh24') THEN 1 END) completed#,
COUNT(CASE WHEN h.date_by_hour = TRUNC(s.authorized_on,'hh24') THEN 1 END) authorized#
FROM (SELECT t.min_date + numtodsinterval(LEVEL - 1,'hour') date_by_hour
-- truncate the earliest date to the hour so that date_by_hour lines up with TRUNC(..., 'hh24') above
FROM (SELECT TRUNC(LEAST(MIN(t.sampled_on), MIN(t.received_on), MIN(t.completed_on), MIN(t.authorized_on)), 'hh24') min_date,
GREATEST(MAX(t.sampled_on), MAX(t.received_on), MAX(t.completed_on), MAX(t.authorized_on)) max_date
FROM SAMPLE t) t
CONNECT BY LEVEL <= ( t.max_date - t.min_date + 1) * 24) h
CROSS JOIN sample s
GROUP BY h.date_by_hour
The query accesses the SAMPLE table twice: the first time to get the earliest and latest dates that frame the construction of the date_by_hour values.
This may not be the prettiest or most optimal solution, but it seems to work. Explanation: first convert all the dates to YYYY-MM-DD HH24 format, next gather number sampled/received/completed/authorized by date+HH24, finally join together.
with sample_hour as
(select sample_id,
to_char(sampled_on, 'YYYY-MM-DD HH24') sampled_on,
to_char(received_on, 'YYYY-MM-DD HH24') received_on,
to_char(completed_on, 'YYYY-MM-DD HH24') completed_on,
to_char(authorized_on, 'YYYY-MM-DD HH24') authorized_on
from sample),
s as
(select sampled_on thedate, count(*) num_sampled
from sample_hour
group by sampled_on),
r as
(select received_on thedate, count(*) num_received
from sample_hour
group by received_on),
c as
(select completed_on thedate, count(*) num_completed
from sample_hour
group by completed_on),
a as
(select authorized_on thedate, count(*) num_authorized
from sample_hour
group by authorized_on)
select s.thedate, num_sampled, num_received, num_completed, num_authorized
from s
left join r on s.thedate = r.thedate
left join c on s.thedate = c.thedate
left join a on s.thedate = a.thedate
;
This assumes a table "sample" created something like this:
create table sample
(sample_id number not null primary key,
sampled_on date,
received_on date,
completed_on date,
authorized_on date);
Here is an example. First create the table and insert some random data.
SQL> create table sample
( sample_id number primary key
, sampled_on date
, received_on date
, completed_on date
, authorized_on date
)
/
Table created.
SQL> insert into sample
select level
, trunc(sysdate) + dbms_random.value(0,2)
, trunc(sysdate) + dbms_random.value(0,2)
, trunc(sysdate) + dbms_random.value(0,2)
, trunc(sysdate) + dbms_random.value(0,2)
from dual
connect by level <= 1000
/
1000 rows created.
Then introduce the variables for your given date range and fill them.
SQL> var DATE_RANGE_START varchar2(10)
SQL> var DATE_RANGE_END varchar2(10)
SQL> exec :DATE_RANGE_START := '2009-10-23'
PL/SQL procedure successfully completed.
SQL> exec :DATE_RANGE_END := '2009-10-24'
PL/SQL procedure successfully completed.
First you'll have to generate all hours in your given date range. This makes sure that in case you have an hour where no dates are present, you'll still have a record with 4 zeros. The implementation is in the all_hours query. The rest of the query (with only one table access to your sample table!) can then be quite simple like this.
SQL> with all_hours as
( select to_date(:DATE_RANGE_START,'yyyy-mm-dd') + numtodsinterval(level-1,'hour') hour
from dual
connect by level <=
( to_date(:DATE_RANGE_END,'yyyy-mm-dd')
- to_date(:DATE_RANGE_START,'yyyy-mm-dd')
+ 1
) * 24
)
select h.hour
, count(case when h.hour = trunc(s.sampled_on,'hh24') then 1 end) sampled#
, count(case when h.hour = trunc(s.received_on,'hh24') then 1 end) received#
, count(case when h.hour = trunc(s.completed_on,'hh24') then 1 end) completed#
, count(case when h.hour = trunc(s.authorized_on,'hh24') then 1 end) authorized#
from all_hours h
cross join sample s
group by h.hour
/
HOUR SAMPLED# RECEIVED# COMPLETED# AUTHORIZED#
------------------- ---------- ---------- ---------- -----------
23-10-2009 00:00:00 18 25 20 20
23-10-2009 01:00:00 26 24 16 13
23-10-2009 02:00:00 16 26 17 15
23-10-2009 03:00:00 19 18 27 13
23-10-2009 04:00:00 28 20 18 23
23-10-2009 05:00:00 17 13 19 21
23-10-2009 06:00:00 18 23 16 15
23-10-2009 07:00:00 19 24 14 22
23-10-2009 08:00:00 21 19 23 22
23-10-2009 09:00:00 25 20 23 24
23-10-2009 10:00:00 16 21 25 18
23-10-2009 11:00:00 21 29 21 18
23-10-2009 12:00:00 33 28 24 20
23-10-2009 13:00:00 24 19 15 15
23-10-2009 14:00:00 20 27 16 25
23-10-2009 15:00:00 15 25 27 13
23-10-2009 16:00:00 19 14 27 18
23-10-2009 17:00:00 22 22 15 27
23-10-2009 18:00:00 20 19 29 23
23-10-2009 19:00:00 20 18 17 23
23-10-2009 20:00:00 11 18 20 27
23-10-2009 21:00:00 13 25 24 19
23-10-2009 22:00:00 22 13 22 29
23-10-2009 23:00:00 20 20 19 24
24-10-2009 00:00:00 18 17 18 29
24-10-2009 01:00:00 23 30 26 21
24-10-2009 02:00:00 28 19 28 25
24-10-2009 03:00:00 21 21 11 23
24-10-2009 04:00:00 23 20 21 17
24-10-2009 05:00:00 24 16 23 23
24-10-2009 06:00:00 23 26 22 30
24-10-2009 07:00:00 25 26 18 12
24-10-2009 08:00:00 24 20 23 17
24-10-2009 09:00:00 18 26 15 19
24-10-2009 10:00:00 20 19 25 18
24-10-2009 11:00:00 19 27 17 20
24-10-2009 12:00:00 23 16 18 20
24-10-2009 13:00:00 15 15 22 19
24-10-2009 14:00:00 23 23 16 29
24-10-2009 15:00:00 18 31 32 28
24-10-2009 16:00:00 22 15 18 13
24-10-2009 17:00:00 25 17 20 26
24-10-2009 18:00:00 19 20 21 16
24-10-2009 19:00:00 22 13 28 29
24-10-2009 20:00:00 23 17 23 14
24-10-2009 21:00:00 18 18 21 22
24-10-2009 22:00:00 22 20 18 21
24-10-2009 23:00:00 21 18 22 22
48 rows selected.
Hope this helps.
Regards,
Rob.
I'd run 4 queries like this (one for each date column):
SELECT <date to hour>, count(*) FROM sample GROUP BY <date to hour>
And then put the data together in the application. If you really want a single query, you can join the individual queries on hour.
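As a concrete instance of that template, one of the four queries might look like the sketch below for sampled_on; the other three would differ only in the column name. The bind variables :start_date and :end_date are hypothetical and assumed to be bound as DATE values by the application; hour_start and num_sampled are illustrative names.

```sql
-- Sketch: samples taken per hour within a date range (sampled_on shown here).
-- :start_date and :end_date are hypothetical DATE bind variables.
SELECT TRUNC(sampled_on, 'HH24') AS hour_start,
       COUNT(*)                  AS num_sampled
FROM   sample
WHERE  sampled_on >= :start_date
AND    sampled_on <  :end_date
GROUP  BY TRUNC(sampled_on, 'HH24')
ORDER  BY hour_start;
```

Hours in which nothing was sampled do not appear in this result, so the application (or an outer join on a generated list of hours) has to fill those gaps with zeros.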
Try this...
WITH src_data AS
( SELECT sample_id
, TRUNC( sampled_on, 'HH24' ) sampled_on
, TRUNC( received_on, 'HH24' ) received_on
, TRUNC( completed_on, 'HH24' ) completed_on
, TRUNC( authorized_on, 'HH24' ) authorized_on
FROM sample
)
, src_hours AS
( SELECT sampled_on the_date
FROM src_data
WHERE sampled_on IS NOT NULL
UNION
SELECT received_on the_date
FROM src_data
WHERE received_on IS NOT NULL
UNION
SELECT completed_on the_date
FROM src_data
WHERE completed_on IS NOT NULL
UNION
SELECT authorized_on the_date
FROM src_data
WHERE authorized_on IS NOT NULL
)
SELECT h.the_date
, ( SELECT COUNT(*)
FROM src_data s
WHERE s.sampled_on = h.the_date ) num_sampled_on
, ( SELECT COUNT(*)
FROM src_data r
WHERE r.received_on = h.the_date ) num_received_on
, ( SELECT COUNT(*)
FROM src_data c
WHERE c.completed_on = h.the_date ) num_completed_on
, ( SELECT COUNT(*)
FROM src_data a
WHERE a.authorized_on = h.the_date ) num_authorized_on
FROM src_hours h
Maybe something like creating this view:
create view hours as
select hour, max(cnt_sample) cnt_sample, max(cnt_received) cnt_received, max(cnt_completed) cnt_completed, max(cnt_authorized) cnt_authorized
from (
select to_char(sampled_on , 'yyyymmddhh24') hour,
count(sample_id) over (partition by to_char(sampled_on ,'yyyymmddhh24')) cnt_sample,
0 cnt_received,
0 cnt_completed,
0 cnt_authorized from sample union all
select to_char(received_on , 'yyyymmddhh24') hour,
0 cnt_sample,
count(sample_id) over (partition by to_char(received_on ,'yyyymmddhh24')) cnt_received,
0 cnt_completed,
0 cnt_authorized from sample union all
select to_char(completed_on , 'yyyymmddhh24') hour,
0 cnt_sample,
0 cnt_received,
count(sample_id) over (partition by to_char(completed_on ,'yyyymmddhh24')) cnt_completed,
0 cnt_authorized from sample union all
select to_char(authorized_on, 'yyyymmddhh24') hour,
0 cnt_sample,
0 cnt_received,
0 cnt_completed,
count(sample_id) over (partition by to_char(authorized_on ,'yyyymmddhh24')) cnt_authorized from sample
)
group by hour
;
and then selecting from the view:
select * from hours where hour >= '2001010102' and hour <= '2001010105'
order by hour;
I now propose:
create view hours_ as
with four as (
select 1 as n from dual union all
select 2 as n from dual union all
select 3 as n from dual union all
select 4 as n from dual )
select
case when four.n = 1 then trunc(sampled_on , 'hh24')
when four.n = 2 then trunc(received_on , 'hh24')
when four.n = 3 then trunc(completed_on , 'hh24')
when four.n = 4 then trunc(authorized_on, 'hh24')
end hour_,
sum ( case when four.n = 1 then 1
else 0
end ) sample_,
sum ( case when four.n = 2 then 1
else 0
end ) receive_,
sum ( case when four.n = 3 then 1
else 0
end ) complete_,
sum ( case when four.n = 4 then 1
else 0
end ) authorize_
from
four cross join sample
group by
case when four.n = 1 then trunc(sampled_on , 'hh24')
when four.n = 2 then trunc(received_on , 'hh24')
when four.n = 3 then trunc(completed_on , 'hh24')
when four.n = 4 then trunc(authorized_on, 'hh24')
end ;
To check that the sample table behind the view is indeed accessed only once:
explain plan for select * from hours_
where hour_ between sysdate -1 and sysdate;
select * from table (dbms_xplan.display);
Which results in:
--------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 61 | 16 (7)| 00:00:01 |
| 1 | VIEW | HOURS_ | 1 | 61 | 16 (7)| 00:00:01 |
| 2 | HASH GROUP BY | | 1 | 39 | 16 (7)| 00:00:01 |
|* 3 | FILTER | | | | | |
| 4 | NESTED LOOPS | | 1 | 39 | 15 (0)| 00:00:01 |
| 5 | VIEW | | 4 | 12 | 8 (0)| 00:00:01 |
| 6 | UNION-ALL | | | | | |
| 7 | FAST DUAL | | 1 | | 2 (0)| 00:00:01 |
| 8 | FAST DUAL | | 1 | | 2 (0)| 00:00:01 |
| 9 | FAST DUAL | | 1 | | 2 (0)| 00:00:01 |
| 10 | FAST DUAL | | 1 | | 2 (0)| 00:00:01 |
|* 11 | TABLE ACCESS FULL| SAMPLE | 1 | 36 | 2 (0)| 00:00:01 |
--------------------------------------------------------------------------------
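To see the cross-join idea from the hours_ view produce actual counts, here is a small sketch on Python's built-in sqlite3 module. It is not Oracle: a VALUES list replaces the four SELECT ... FROM dual rows, and strftime('%Y-%m-%d %H', ...) replaces trunc(..., 'hh24'). The three rows are made up for illustration. Unlike the view above, the sketch also filters out NULL dates, which is an addition on my part.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE sample (
        sample_id     INTEGER PRIMARY KEY,
        sampled_on    TEXT,
        received_on   TEXT,
        completed_on  TEXT,
        authorized_on TEXT
    )
    """
)
# Made-up rows; NULL means that step has not happened yet.
conn.executemany(
    "INSERT INTO sample VALUES (?, ?, ?, ?, ?)",
    [
        (1, "2009-11-04 01:10:00", "2009-11-04 01:50:00",
            "2009-11-04 02:05:00", "2009-11-04 03:00:00"),
        (2, "2009-11-04 01:40:00", "2009-11-04 02:15:00",
            "2009-11-04 02:45:00", None),
        (3, "2009-11-04 02:20:00", "2009-11-04 02:30:00", None, None),
    ],
)

query = """
    WITH four(n) AS (VALUES (1), (2), (3), (4))
    SELECT hour_,
           SUM(CASE WHEN n = 1 THEN 1 ELSE 0 END) AS sample_,
           SUM(CASE WHEN n = 2 THEN 1 ELSE 0 END) AS receive_,
           SUM(CASE WHEN n = 3 THEN 1 ELSE 0 END) AS complete_,
           SUM(CASE WHEN n = 4 THEN 1 ELSE 0 END) AS authorize_
    FROM (
        -- Each sample row is repeated four times, once per date column.
        SELECT four.n AS n,
               strftime('%Y-%m-%d %H',
                        CASE four.n
                            WHEN 1 THEN sampled_on
                            WHEN 2 THEN received_on
                            WHEN 3 THEN completed_on
                            WHEN 4 THEN authorized_on
                        END) AS hour_
        FROM four CROSS JOIN sample
    )
    WHERE hour_ IS NOT NULL      -- skip steps that have no date yet
    GROUP BY hour_
    ORDER BY hour_
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The sample table appears only once in the FROM clause; the four-row helper multiplies each row so that one GROUP BY can count all four date columns.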
Here's what I'm thinking, but I'm not sure it's optimal enough for a view.
select
the_date,
sum(decode(the_type,'S',the_count,0)) samples,
sum(decode(the_type,'R',the_count,0)) receipts,
sum(decode(the_type,'C',the_count,0)) completions,
sum(decode(the_type,'A',the_count,0)) authorizations
from(
select
trunc(sampled_on,'HH24') the_date,
'S' the_type,
count(1) the_count
FROM sample
group by trunc(sampled_on,'HH24')
union all
select
trunc(received_on,'HH24'),
'R',
count(1)
FROM sample
group by trunc(received_on,'HH24')
union all
select
trunc(completed_on,'HH24'),
'C',
count(1)
FROM sample
group by trunc(completed_on,'HH24')
union all
select
trunc(authorized_on,'HH24'),
'A',
count(1)
FROM sample
group by trunc(authorized_on,'HH24')
)
group by the_date
Then, once the query is wrapped in a view (called magic_view here), you can filter it with normal date constructs:
select * from magic_view where the_date > sysdate-1;
EDIT
Okay, so I created a sample table and did some metrics:
create table sample (
sample_id number primary key,
sampled_on date,
received_on date,
completed_on date,
authorized_on date
);
insert into sample (
select
level,
trunc(sysdate) + dbms_random.value(0,2),
trunc(sysdate) + dbms_random.value(0,2),
trunc(sysdate) + dbms_random.value(0,2),
trunc(sysdate) + dbms_random.value(0,2)
from dual
connect by level <= 1000
);
The explain plan is:
---------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)|
---------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 4000 | 97K| 25 (20)|
| 1 | HASH GROUP BY | | 4000 | 97K| 25 (20)|
| 2 | VIEW | | 4000 | 97K| 24 (17)|
| 3 | UNION-ALL | | | | |
| 4 | HASH GROUP BY | | 1000 | 9000 | 6 (17)|
| 5 | TABLE ACCESS FULL| SAMPLE | 1000 | 9000 | 5 (0)|
| 6 | HASH GROUP BY | | 1000 | 9000 | 6 (17)|
| 7 | TABLE ACCESS FULL| SAMPLE | 1000 | 9000 | 5 (0)|
| 8 | HASH GROUP BY | | 1000 | 9000 | 6 (17)|
| 9 | TABLE ACCESS FULL| SAMPLE | 1000 | 9000 | 5 (0)|
| 10 | HASH GROUP BY | | 1000 | 9000 | 6 (17)|
| 11 | TABLE ACCESS FULL| SAMPLE | 1000 | 9000 | 5 (0)|
---------------------------------------------------------------------
On my machine, a query against this view for the past 24 hours completes in 23 ms. Not bad, but it's only 1,000 rows. Before you discount the 4 separate queries, you'll need to do a performance analysis of the individual solutions.
Similar to René Nyffenegger's idea: filter by each type of date field, and then amalgamate the counts.
Note that this cannot be done in one plain SELECT over the table, because the rows have to be grouped and ordered by each date field separately, which requires splitting the query into sub-queries.
I have coded a date range of '2009-11-04' to '2009-11-04 23:59:59' for this example:
SELECT
DateHour,
SUM(sampled) total_sampled,
SUM(received) total_received,
SUM(completed) total_completed,
SUM(authorized) total_authorized
FROM
(SELECT
TO_CHAR(sampled_on, 'YYYY-MM-DD HH24') DateHour,
1 sampled,
0 received,
0 completed,
0 authorized
FROM
SAMPLE
WHERE
sampled_on >= TO_DATE('2009-11-04', 'YYYY-MM-DD')
AND sampled_on <= TO_DATE('2009-11-04 23:59:59', 'YYYY-MM-DD HH24:MI:SS')
UNION ALL
SELECT
TO_CHAR(received_on, 'YYYY-MM-DD HH24') DateHour,
0 sampled,
1 received,
0 completed,
0 authorized
FROM
SAMPLE
WHERE
received_on >= TO_DATE('2009-11-04', 'YYYY-MM-DD')
AND received_on <= TO_DATE('2009-11-04 23:59:59', 'YYYY-MM-DD HH24:MI:SS')
UNION ALL
SELECT
TO_CHAR(completed_on, 'YYYY-MM-DD HH24') DateHour,
0 sampled,
0 received,
1 completed,
0 authorized
FROM
SAMPLE
WHERE
completed_on >= TO_DATE('2009-11-04', 'YYYY-MM-DD')
AND completed_on <= TO_DATE('2009-11-04 23:59:59', 'YYYY-MM-DD HH24:MI:SS')
UNION ALL
SELECT
TO_CHAR(authorized_on, 'YYYY-MM-DD HH24') DateHour,
0 sampled,
0 received,
0 completed,
1 authorized
FROM
SAMPLE
WHERE
authorized_on >= TO_DATE('2009-11-04', 'YYYY-MM-DD')
AND authorized_on <= TO_DATE('2009-11-04 23:59:59', 'YYYY-MM-DD HH24:MI:SS'))
GROUP BY
DateHour
ORDER BY
DateHour