Pivot two columns and keep the values the same in SQL

I have created a query to get different time types and hours
SELECT calc_time.hours measure,
calc_time.payroll_time_type elements,
calc_time.person_id,
calc_time.start_time
FROM hwm_tm_rep_work_hours_sum_v calc_time,
per_all_people_f papf
WHERE grp_type_id = 200
AND payroll_time_type IN ( 'Afternoon shift',
'TL',
'Evening shift',
'Regular Pay ',
'OT' )
AND (To_date(To_char(calc_time.start_time, 'YYYY-MM-DD') , 'YYYY-MM-DD') BETWEEN To_date(To_char(:From_Date, 'YYYY-MM-DD'), 'YYYY-MM-DD')
AND To_date( To_char(:To_Date, 'YYYY-MM-DD'), 'YYYY-MM-DD' ))
AND papf.person_id = calc_time.person_id
I get output like this:
Start_time person_id elements measure
01-Jan-2021 198 Regular Pay 10
01-Jan-2021 198 OT 2
01-jAN-2021 198 Afternoon shift 2
16-JAN-2021 198 Regular Pay 10
17-JAN-2021 198 OT 3
20-JAN-2021 198 EVENING SHIFT 8
08-JAN-2021 11 Regular Pay 8
09-JAN-2021 11 OT 1
08-JAN-2021 11 tl 2
10-JAN-2021 12 Evening shift 9
11-JAN-2021 12 Evening shift 9
I want this output to be displayed as follows, restricted to the range between two dates that I pass as parameters (for example, from date 01-JAN-2021 and to date 31-JAN-2021):
person_id Regular_pay OT OTHER_MEASURE OTHER_CODE
198 20 5 2 Afternoon shift
198 20 5 8 EVENING SHIFT
11 8 1 2 TL
12 18 Evening shift
So the sums of Regular Pay and OT go in separate columns, and all other time types go in other_measure and other_code.
How can I tweak the main query to achieve this?

You can use:
SELECT *
FROM (
SELECT c.person_id,
SUM(CASE c.payroll_time_type WHEN 'Regular Pay' THEN SUM(c.hours) END)
OVER (PARTITION BY c.person_id) AS regular_pay,
SUM(CASE c.payroll_time_type WHEN 'OT' THEN SUM(c.hours) END)
OVER (PARTITION BY c.person_id) AS OT,
SUM(c.hours) AS other_measure,
c.payroll_time_type AS Other_code
FROM hwm_tm_rep_work_hours_sum_v c
INNER JOIN per_all_people_f p
ON (p.person_id = c.person_id)
WHERE grp_type_id = 200
AND payroll_time_type IN (
'Afternoon shift',
'TL',
'Evening shift',
'Regular Pay',
'OT'
)
AND c.start_time >= TRUNC(:from_date)
AND c.start_time < TRUNC(:to_date) + INTERVAL '1' DAY
GROUP BY
c.person_id,
c.payroll_time_type
)
WHERE other_code NOT IN ('Regular Pay', 'OT');
Which, for the sample data:
CREATE TABLE hwm_tm_rep_work_hours_sum_v (start_time, person_id, payroll_time_type, hours) AS
SELECT DATE '2021-01-01', 198, 'Regular Pay', 10 FROM DUAL UNION ALL
SELECT DATE '2021-01-01', 198, 'OT', 2 FROM DUAL UNION ALL
SELECT DATE '2021-01-01', 198, 'Afternoon shift', 2 FROM DUAL UNION ALL
SELECT DATE '2021-01-16', 198, 'Regular Pay', 10 FROM DUAL UNION ALL
SELECT DATE '2021-01-17', 198, 'OT', 3 FROM DUAL UNION ALL
SELECT DATE '2021-01-20', 198, 'Evening shift', 8 FROM DUAL UNION ALL
SELECT DATE '2021-01-08', 11, 'Regular Pay', 8 FROM DUAL UNION ALL
SELECT DATE '2021-01-09', 11, 'OT', 1 FROM DUAL UNION ALL
SELECT DATE '2021-01-08', 11, 'TL', 2 FROM DUAL UNION ALL
SELECT DATE '2021-01-10', 12, 'Evening shift', 9 FROM DUAL UNION ALL
SELECT DATE '2021-01-11', 12, 'Evening shift', 9 FROM DUAL;
CREATE TABLE per_all_people_f (person_id, grp_type_id) AS
SELECT 198, 200 FROM DUAL UNION ALL
SELECT 11, 200 FROM DUAL UNION ALL
SELECT 12, 200 FROM DUAL;
Outputs:
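PERSON_ID | REGULAR_PAY | OT | OTHER_MEASURE | OTHER_CODE
--------: | ----------: | ---: | ------------: | :--------------
11 | 8 | 1 | 2 | TL
12 | null | null | 18 | Evening shift
198 | 20 | 5 | 2 | Afternoon shift
198 | 20 | 5 | 8 | Evening shift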
db<>fiddle here
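For readers who want to try the idea without an Oracle instance, here is a minimal sketch using Python's built-in sqlite3 module. It is not the query above: instead of nesting an aggregate inside an analytic function, it computes the per-person Regular Pay and OT totals in one CTE and joins them to the per-code sums, which should give the same shape of result. The table name work_hours, the ISO date strings, and the SQLite date() arithmetic are assumptions made for the sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE work_hours ("
    "start_time TEXT, person_id INTEGER, payroll_time_type TEXT, hours INTEGER)"
)
conn.executemany(
    "INSERT INTO work_hours VALUES (?, ?, ?, ?)",
    [
        ("2021-01-01", 198, "Regular Pay", 10),
        ("2021-01-01", 198, "OT", 2),
        ("2021-01-01", 198, "Afternoon shift", 2),
        ("2021-01-16", 198, "Regular Pay", 10),
        ("2021-01-17", 198, "OT", 3),
        ("2021-01-20", 198, "Evening shift", 8),
        ("2021-01-08", 11, "Regular Pay", 8),
        ("2021-01-09", 11, "OT", 1),
        ("2021-01-08", 11, "TL", 2),
        ("2021-01-10", 12, "Evening shift", 9),
        ("2021-01-11", 12, "Evening shift", 9),
    ],
)

query = """
WITH filtered AS (
    -- half-open date range: from_date inclusive, day after to_date exclusive
    SELECT * FROM work_hours
    WHERE start_time >= :from_date
      AND start_time < date(:to_date, '+1 day')
),
per_person AS (
    -- one row per person with the two pivoted totals
    SELECT person_id,
           SUM(CASE WHEN payroll_time_type = 'Regular Pay' THEN hours END) AS regular_pay,
           SUM(CASE WHEN payroll_time_type = 'OT' THEN hours END) AS ot
    FROM filtered
    GROUP BY person_id
),
per_code AS (
    -- one row per person and "other" time type
    SELECT person_id,
           payroll_time_type AS other_code,
           SUM(hours) AS other_measure
    FROM filtered
    WHERE payroll_time_type NOT IN ('Regular Pay', 'OT')
    GROUP BY person_id, payroll_time_type
)
SELECT c.person_id, p.regular_pay, p.ot, c.other_measure, c.other_code
FROM per_code c
JOIN per_person p ON p.person_id = c.person_id
ORDER BY c.person_id, c.other_code
"""

result = conn.execute(
    query, {"from_date": "2021-01-01", "to_date": "2021-01-31"}
).fetchall()
for row in result:
    print(row)
```

Person 12 has no Regular Pay or OT rows, so both pivoted columns come back as NULL (None in Python), matching the null cells in the output above.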

You could try something like this. Unfortunately, your question does not make clear which table holds which columns and values.
SELECT
calc_time.person_id,
(select sum(ct.hours) FROM hwm_tm_rep_work_hours_sum_v ct where ct.person_id = calc_time.person_id and ct.payroll_time_type = 'Regular Pay') as Regular_Pay,
...
FROM hwm_tm_rep_work_hours_sum_v calc_time,
per_all_people_f papf
WHERE grp_type_id = 200
AND payroll_time_type IN ( 'Afternoon shift',
'TL',
'Evening shift',
'Regular Pay ',
'OT' )
AND (
To_date(To_char(calc_time.start_time, 'YYYY-MM-DD') , 'YYYY-MM-DD') BETWEEN To_date(To_char(:From_Date, 'YYYY-MM-DD'), 'YYYY-MM-DD')
AND To_date( To_char(:To_Date, 'YYYY-MM-DD'), 'YYYY-MM-DD' ) )
and papf.person_id = calc_time.person_id
-- use a group by
GROUP BY
calc_time.person_id

You may use aggregation and then apply the MODEL clause to calculate the required columns. Below is the code with comments, assuming you can handle the date filter yourself.
select *
from t
PERSON_ID | ELEMENTS | MEASURE
--------: | :-------------- | ------:
198 | Regular Pay | 1
198 | Regular Pay | 2
198 | Afternoon shift | 3
198 | Afternoon shift | 4
198 | OT | 5
198 | OT | 6
198 | EVENING SHIFT | 7
198 | EVENING SHIFT | 8
11 | Regular Pay | 11
11 | Regular Pay | 12
11 | TL | 13
11 | TL | 14
11 | EVENING SHIFT | 15
11 | EVENING SHIFT | 16
12 | TL | 21
12 | TL | 22
12 | EVENING SHIFT | 23
12 | EVENING SHIFT | 24
select
person_id,
ot,
regular_pay,
elements as other_code,
mes as other_measure
from (
/*First you need to aggregate all the measures by person_id and code*/
select
person_id,
elements,
sum(measure) as mes
from t
/*Date filter goes here*/
group by
person_id,
elements
)
model
/*RETURN UPDATED ROWS
will do the trick,
because we'll update only "other"
measures, so OT and Regular pay will not go
to the output*/
return updated rows
/*Where to break the calculation*/
partition by (person_id)
/*To be able to reference by code*/
dimension by (elements)
measures (
mes,
0 as ot,
0 as regular_pay
)
rules upsert (
ot[
elements not in ('OT', 'Regular Pay')
] = sum(mes)['OT'],
regular_pay[
elements not in ('OT', 'Regular Pay')
] = sum(mes)['Regular Pay']
)
PERSON_ID | OT | REGULAR_PAY | OTHER_CODE | OTHER_MEASURE
--------: | ---: | ----------: | :-------------- | ------------:
198 | 11 | 3 | EVENING SHIFT | 15
198 | 11 | 3 | Afternoon shift | 7
11 | null | 23 | TL | 27
11 | null | 23 | EVENING SHIFT | 31
12 | null | null | TL | 43
12 | null | null | EVENING SHIFT | 47
db<>fiddle here

Related

Return True if Date in 1 table is within 24 hours of Date from table 2

I basically want to return a new column, 'First_24?', as shown in the desired output below, that is TRUE if and only if the Folder_ID has a 'FOLDER_DATE_CREATED' within the first 24 hours following the 'TEAM_CREATE_DATE'. For some reason, my attempt below yields TRUE for everything.
SELECT Folder_ID,
TEAM_ID,
FOLDER_DATE_CREATED,
NAME,
CASE WHEN t.TEAM_CREATE_DATE <= d.FOLDER_DATE_CREATED + interval '24 hours'
THEN TRUE
ELSE FALSE END AS First_24?
FROM DATA d
JOIN TEAM t on t.id = d.id
Current data Model
Folder_ID TEAM_ID FOLDER_DATE_CREATED NAME
11 100 1/21/2021 Sample 1
12 101 1/24/2021 Sample 2
13 102 4/21/2021 Sample 3
14 103 3/11/2021 Sample 4
15 104 5/31/2021 Sample 5
16 104 4/12/2021 Sample 6
TEAM_ID Team_Create_Date
100 1/21/2021
101 1/24/2021
102 2/20/2020
103 3/21/2020
104 4/12/2021
104 4/12/2021
Desired Output
Folder_ID TEAM_ID FOLDER_DATE_CREATED NAME First_24?
11 100 1/21/2021 Sample 1 TRUE
12 101 1/24/2021 Sample 2 TRUE
13 102 4/21/2021 Sample 3 FALSE
14 103 3/11/2021 Sample 4 FALSE
15 104 5/31/2021 Sample 5 FALSE
16 104 4/12/2021 Sample 6 TRUE
You can use a subquery with EXISTS:
SELECT d.*,
(EXISTS (SELECT 1
FROM TEAM t
WHERE t.TEAM_ID = d.TEAM_ID AND
d.FOLDER_DATE_CREATED >= t.TEAM_CREATE_DATE AND
d.FOLDER_DATE_CREATED < t.TEAM_CREATE_DATE + interval '24 hours'
)
) AS "First_24?"
FROM DATA d;
You need to check whether FOLDER_DATE_CREATED is less than TEAM_CREATE_DATE plus 24 hours. You added the 24 hours to the wrong side:
with f(Folder_ID, TEAM_ID, FOLDER_DATE_CREATED, NAME) as (
select *
from (values
( 11, 100, date '2021-01-21', 'Sample 1' ),
( 12, 101, date '2021-01-24', 'Sample 2' ),
( 13, 102, date '2021-04-21', 'Sample 3' ),
( 14, 103, date '2021-03-11', 'Sample 4' ),
( 15, 104, date '2021-05-31', 'Sample 5' ),
( 16, 104, date '2021-04-12', 'Sample 6' )
) as t
)
, t(TEAM_ID, Team_Create_Date) as (
select *
from (values
( 100, date '2021-01-21' ),
( 101, date '2021-01-24' ),
( 102, date '2020-02-20' ),
( 103, date '2020-03-21' ),
( 104, date '2021-04-12' )
) as a
)
select f.*
, cast(
f.FOLDER_DATE_CREATED
between t.Team_Create_Date
and t.Team_Create_Date + interval '24 hours'
as varchar(5)) as "First_24?"
from f
join t
using(team_id)
folder_id | team_id | folder_date_created | name | First_24?
--------: | ------: | :------------------ | :------- | :--------
11 | 100 | 2021-01-21 | Sample 1 | true
12 | 101 | 2021-01-24 | Sample 2 | true
13 | 102 | 2021-04-21 | Sample 3 | false
14 | 103 | 2021-03-11 | Sample 4 | false
15 | 104 | 2021-05-31 | Sample 5 | false
16 | 104 | 2021-04-12 | Sample 6 | true
db<>fiddle here
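To see why the original predicate returned TRUE for every row, it can help to evaluate both conditions on plain timestamps. The following is only a sketch in Python; the function names original_check and within_first_24h are made up for illustration, and the half-open upper bound follows the EXISTS answer rather than the inclusive BETWEEN.

```python
from datetime import datetime, timedelta

DAY = timedelta(hours=24)


def original_check(team_created: datetime, folder_created: datetime) -> bool:
    # The asker's predicate: the 24 hours are added to the folder date,
    # so any folder created after its team passes the test.
    return team_created <= folder_created + DAY


def within_first_24h(team_created: datetime, folder_created: datetime) -> bool:
    # Intended predicate: folder created in [team_created, team_created + 24h).
    return team_created <= folder_created < team_created + DAY


# Team 102 and folder 13 from the sample data: created over a year apart.
team_102 = datetime(2020, 2, 20)
folder_13 = datetime(2021, 4, 21)

# Team 100 and folder 11: created on the same day.
team_100 = datetime(2021, 1, 21)
folder_11 = datetime(2021, 1, 21)

print(original_check(team_102, folder_13))    # passes although it should not
print(within_first_24h(team_102, folder_13))
print(within_first_24h(team_100, folder_11))
```

The first call shows the problem: a folder created long after its team still satisfies the original condition, because only a lower bound on the folder date is being tested.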

Oracle: Calculate the count() based on the past 6 month interval for each rows

I have the following data (the data is available from 2017 - Present)
SELECT * FROM TABLE1 WHERE DATE > TO_DATE('01/01/2019','MM/DD/YYYY')
Emp_ID Date Vehicle_ID Working_Hours
1005 01/01/2019 X500 7
1005 01/02/2019 X500 6
1005 01/03/2019 X700 7
1005 01/04/2019 X500 5
1005 01/05/2019 X700 7
1005 01/06/2019 X500 7
1006 01/01/2019 X500 7
1006 01/02/2019 X500 6
1006 01/03/2019 X700 7
1006 01/04/2019 X500 5
1006 01/05/2019 X700 7
1006 01/06/2019 X500 7
I need to calculate two columns.
LAST_6M_UNIQ_Vehicle_Count ==> Count of Unique Vehicle ID in the last(past) 6 months for that employee
LAST_6M_Vehicle_Count ==> Count of all vehicle ID for that employee in the Past 6 months
Note: Past 6 month from the date column
Expected output:
Emp_ID Date Vehicle_ID Working_Hours LAST_6M_UNIQ_Vehicle_Count LAST_6M_Vehicle_Count
1005 01/01/2019 X500 7 6 66
1005 01/02/2019 X500 6 7 62
1005 01/03/2019 X700 7 6 63
1005 01/04/2019 X500 5 7 67
1005 01/05/2019 X700 7 7 66
1005 01/06/2019 X500 7 7 67
. . . .
. . . .
. . . .
1005 03/20/2019 X600 6 12 75
1006 01/01/2019 X500 7 11 74
1006 01/02/2019 X500 6 10 66
1006 01/03/2019 X700 7 11 72
1006 01/04/2019 X500 5 13 67
1006 01/05/2019 X700 7 12 64
1006 01/06/2019 X500 7 12 63
For example, in the first row, the value for LAST_6M_UNIQ_Vehicle_Count is 6 because for the employee id 1005, the unique count of vehicle id between ((01/01/2019) - 6 month) and 01/01/2019 has 6 different vehicle id in them.
I tried Over and Partition by but the 6 month interval is missing
SELECT t.*, COUNT(DISTINCT t.VEHICLE_ID) OVER (PARTITION BY t.EMP_ID ORDER BY t.DATE)
AS LAST_6M_UNIQ_Vehicle_Count
FROM TABLE1 t
I am not able to calculate the values based on 6 month interval for each rows.
Your help is much appreciated.
Oracle doesn't like COUNT( DISTINCT ... ) OVER ( ... ) when used in a windowed analytic function with a range and will raise an ORA-30487: ORDER BY not allowed here exception (otherwise, that would be the solution). It will work without the DISTINCT keyword but not with it.
Instead, you can use a correlated sub-query:
SELECT t.*,
( SELECT COUNT( DISTINCT vehicle_id )
FROM table_name c
WHERE c.emp_id = t.emp_id
AND c."DATE" <= t."DATE"
AND ADD_MONTHS( t."DATE", -6 ) <= c."DATE"
) AS last_6m_uniq_vehicle_count,
COUNT(t.vehicle_id) OVER (
PARTITION BY t.emp_id
ORDER BY t."DATE"
RANGE BETWEEN INTERVAL '6' MONTH PRECEDING
AND CURRENT ROW
) AS last_6m_vehicle_count
FROM table_name t
Which for the sample data:
CREATE TABLE table_name ( vehicle_id, emp_id, "DATE" ) AS
SELECT 1, 1, DATE '2020-08-31' FROM DUAL UNION ALL
SELECT 2, 1, DATE '2020-07-31' FROM DUAL UNION ALL
SELECT 1, 1, DATE '2020-06-30' FROM DUAL UNION ALL
SELECT 2, 1, DATE '2020-05-31' FROM DUAL UNION ALL
SELECT 2, 1, DATE '2020-04-30' FROM DUAL UNION ALL
SELECT 2, 1, DATE '2020-03-31' FROM DUAL UNION ALL
SELECT 2, 1, DATE '2020-02-29' FROM DUAL UNION ALL
SELECT 2, 1, DATE '2020-01-31' FROM DUAL UNION ALL
SELECT 3, 1, DATE '2020-01-31' FROM DUAL;
Outputs:
VEHICLE_ID | EMP_ID | DATE | LAST_6M_UNIQ_VEHICLE_COUNT | LAST_6M_VEHICLE_COUNT
---------: | -----: | :-------- | -------------------------: | --------------------:
2 | 1 | 31-JAN-20 | 2 | 2
3 | 1 | 31-JAN-20 | 2 | 2
2 | 1 | 29-FEB-20 | 2 | 3
2 | 1 | 31-MAR-20 | 2 | 4
2 | 1 | 30-APR-20 | 2 | 5
2 | 1 | 31-MAY-20 | 2 | 6
1 | 1 | 30-JUN-20 | 3 | 7
2 | 1 | 31-JUL-20 | 3 | 8
1 | 1 | 31-AUG-20 | 2 | 7
db<>fiddle here
You can do this with window functions, and a range frame specification.
Computing the distinct count is a bit tricky: Oracle does not support it directly, but we can proceed in two steps. First perform a window count within employee/vehicle partitions, and then take into account only the first occurrence of each vehicle in the employee partition.
So:
select vehicle_id, emp_id, "DATE",
sum(case when flag = 1 then 1 else 0 end) over(
partition by emp_id
order by "DATE"
range between interval '6' month preceding and current row
) as last_6m_uniq_vehicle_count,
count(*) over (
partition by emp_id
order by "DATE"
range between interval '6' month preceding and current row
) as last_6m_vehicle_count
from (
select t.*,
count(*) over (
partition by emp_id , vehicle_id
order by "DATE"
range between interval '6' month preceding and current row
) as flag
from table_name t
) t
order by "DATE", vehicle_id
As MTO points out, count(distinct) cannot be used as a window function to solve this.
For that reason, I would go for a lateral join:
select t.*, l.*
from t cross join lateral
(select count(*) as last_6m_vehicle_count, count(distinct t2.vehicle_id) as last_6m_uniq_vehicle_count
from t t2
where t2.emp_id = t.emp_id and
t2.dte <= t.dte and
t2.dte > add_months(t.dte, -6)
) l;
Here is a db<>fiddle.
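The correlated-subquery idea can be tried outside Oracle as well. Below is a small sketch with Python's sqlite3 module; the table name trips, the column work_date, and the sample rows are invented for illustration, and SQLite's date(x, '-6 months') stands in for ADD_MONTHS. The two functions handle month ends differently, so the sample deliberately uses mid-month dates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE trips (emp_id INTEGER, work_date TEXT, vehicle_id TEXT)")
conn.executemany(
    "INSERT INTO trips VALUES (?, ?, ?)",
    [
        (1, "2020-01-15", "A"),
        (1, "2020-03-15", "B"),
        (1, "2020-06-15", "A"),
        (1, "2020-08-15", "C"),
        (1, "2020-10-15", "C"),
        (2, "2020-03-15", "A"),
    ],
)

query = """
SELECT t.emp_id,
       t.work_date,
       t.vehicle_id,
       -- distinct vehicles for the same employee in the preceding 6 months
       (SELECT COUNT(DISTINCT c.vehicle_id)
          FROM trips c
         WHERE c.emp_id = t.emp_id
           AND c.work_date <= t.work_date
           AND c.work_date > date(t.work_date, '-6 months')) AS last_6m_uniq_vehicle_count,
       -- all rows for the same employee in the preceding 6 months
       (SELECT COUNT(*)
          FROM trips c
         WHERE c.emp_id = t.emp_id
           AND c.work_date <= t.work_date
           AND c.work_date > date(t.work_date, '-6 months')) AS last_6m_vehicle_count
FROM trips t
ORDER BY t.emp_id, t.work_date
"""

result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

Each subquery is re-evaluated per outer row, which is simple to reason about but may be slow on large tables without an index on (emp_id, work_date).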

With Oracle SQL how can I find 3 days where total sum >= 150

I have a report that needs to list activity where total is >= 150 over 3 consecutive days.
Let's say I've created a temp table foo, to summarize daily totals.
| ID | Day | Total |
| -- | ---------- | ----- |
| 01 | 2020-01-01 | 10 |
| 01 | 2020-01-02 | 50 |
| 01 | 2020-01-03 | 50 |
| 01 | 2020-01-04 | 50 |
| 01 | 2020-01-05 | 20 |
| 02 | 2020-01-01 | 10 |
| 02 | 2020-01-02 | 10 |
| 02 | 2020-01-03 | 10 |
| 02 | 2020-01-04 | 10 |
| 02 | 2020-01-05 | 10 |
How would I write SQL to return ID 01, but not 02?
Example Result:
| ID |
| -- |
| 01 |
I suspect that you want window functions here:
select distinct id
from (
select
t.*,
sum(total) over(partition by id order by day rows between 2 preceding and current row) sum_total,
count(*) over(partition by id order by day rows between 2 preceding and current row) cnt
from mytable t
) t
where cnt = 3 and sum_total >= 150
This gives you the ids that have a total greater than the given threshold over 3 consecutive days - which is how I understood your question.
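A quick way to check this against the sample table is to run essentially the same query in SQLite from Python (window functions need SQLite 3.25 or later). This is a sketch under the assumption that there is exactly one row per id and day with no gaps, because ROWS BETWEEN 2 PRECEDING counts rows, not calendar days; the table is named foo as in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE foo (id TEXT, day TEXT, total INTEGER)")
conn.executemany(
    "INSERT INTO foo VALUES (?, ?, ?)",
    [
        ("01", "2020-01-01", 10),
        ("01", "2020-01-02", 50),
        ("01", "2020-01-03", 50),
        ("01", "2020-01-04", 50),
        ("01", "2020-01-05", 20),
        ("02", "2020-01-01", 10),
        ("02", "2020-01-02", 10),
        ("02", "2020-01-03", 10),
        ("02", "2020-01-04", 10),
        ("02", "2020-01-05", 10),
    ],
)

query = """
SELECT DISTINCT id
FROM (
    SELECT id,
           -- moving total over the current row and the two rows before it
           SUM(total) OVER (PARTITION BY id ORDER BY day
                            ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) AS sum_total,
           -- number of rows actually in the frame (fewer than 3 at the start)
           COUNT(*) OVER (PARTITION BY id ORDER BY day
                          ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) AS cnt
    FROM foo
)
WHERE cnt = 3 AND sum_total >= 150
"""

ids = [row[0] for row in conn.execute(query)]
print(ids)
```

For ID 01 the three-day windows sum to 110, 150 and 120, so the window ending on 2020-01-04 qualifies; every window for ID 02 sums to 30.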
If you just want to output the rows that have 3 consecutive days with a sum >= 150, you can use an analytic function to determine the moving total across each 3 day period per id, and then find the aggregate max value of the moving total per id, returning the id where it's >= 150.
E.g.:
WITH your_table AS (SELECT 1 ID, to_date('01/01/2020', 'dd/mm/yyyy') dy, 10 total FROM dual UNION ALL
SELECT 1 ID, to_date('02/01/2020', 'dd/mm/yyyy') dy, 50 total FROM dual UNION ALL
SELECT 1 ID, to_date('03/01/2020', 'dd/mm/yyyy') dy, 50 total FROM dual UNION ALL
SELECT 1 ID, to_date('04/01/2020', 'dd/mm/yyyy') dy, 50 total FROM dual UNION ALL
SELECT 1 ID, to_date('05/01/2020', 'dd/mm/yyyy') dy, 20 total FROM dual UNION ALL
SELECT 2 ID, to_date('01/01/2020', 'dd/mm/yyyy') dy, 10 total FROM dual UNION ALL
SELECT 2 ID, to_date('02/01/2020', 'dd/mm/yyyy') dy, 10 total FROM dual UNION ALL
SELECT 2 ID, to_date('03/01/2020', 'dd/mm/yyyy') dy, 10 total FROM dual UNION ALL
SELECT 2 ID, to_date('04/01/2020', 'dd/mm/yyyy') dy, 10 total FROM dual UNION ALL
SELECT 2 ID, to_date('05/01/2020', 'dd/mm/yyyy') dy, 10 total FROM dual),
moving_sums AS (SELECT ID,
dy,
total,
SUM(total) OVER (PARTITION BY ID ORDER BY dy RANGE BETWEEN 2 PRECEDING AND CURRENT ROW) moving_sum
FROM your_table)
SELECT ID
FROM moving_sums
GROUP BY ID
HAVING MAX(moving_sum) >= 150;
ID
----------
1
You can use a HAVING clause with GROUP BY ID to list the desired ID values. Note that this sums every day for each ID and does not check that the days are consecutive:
SELECT ID
FROM foo
GROUP BY ID
HAVING COUNT( distinct day )>=3 AND SUM( NVL(Total,0) ) >= 150
Demo
Use this if you want to specify the dates:
WITH foo( ID, Day, Total ) AS
(
SELECT '01', date'2020-01-01' , 10 FROM dual
UNION ALL SELECT '01', date'2020-01-02' , 50 FROM dual
UNION ALL SELECT '01', date'2020-01-03' , 50 FROM dual
UNION ALL SELECT '01', date'2020-01-04' , 50 FROM dual
UNION ALL SELECT '01', date'2020-01-05' , 20 FROM dual
UNION ALL SELECT '02', date'2020-01-01' , 10 FROM dual
UNION ALL SELECT '02', date'2020-01-02' , 10 FROM dual
UNION ALL SELECT '02', date'2020-01-03' , 10 FROM dual
UNION ALL SELECT '02', date'2020-01-04' , 10 FROM dual
UNION ALL SELECT '02', date'2020-01-05' , 10 FROM dual
)SELECT
ID
FROM foo
WHERE day BETWEEN TO_DATE('2020-01-01', 'YYYY-MM-DD' ) AND TO_DATE('2020-01-04', 'YYYY-MM-DD' )
GROUP BY ID HAVING SUM(Total) >= 150;
RESULT:
ID|
--|
01|
Maybe you can try something like this:
SELECT
*
FROM foo
WHERE day BETWEEN DATE '2020-01-01' AND DATE '2020-01-04'
AND total >= 150

Date difference when group by value change

I have a table like this
|---------------------|------------------|---------------------|------------------|
| FILE | UNIT | DATE | ID SEQUENCE |
|---------------------|------------------|---------------------|------------------|
| 10 | 34 | 01/02/2000 | 10 |
|---------------------|------------------|---------------------|------------------|
| 10 | 34 | 01/05/2000 | 11 |
|---------------------|------------------|---------------------|------------------|
| 10 | 40 | 01/05/2000 | 12 |
|---------------------|------------------|---------------------|------------------|
| 10 | 40 | 01/02/2000 | 13 |
|---------------------|------------------|---------------------|------------------|
| 10 | 40 | 01/02/2000 | 14 |
|---------------------|------------------|---------------------|------------------|
| 10 | 40 | 01/15/2000 | 15 |
|---------------------|------------------|---------------------|------------------|
| 10 | 34 | 01/16/2000 | 16 |
|---------------------|------------------|---------------------|------------------|
| 10 | 70 | 01/17/2000 | 17 |
|---------------------|------------------|---------------------|------------------|
| 10 | 70 | 01/28/2000 | 18 |
|---------------------|------------------|---------------------|------------------|
I need to build a view like this (to get the number of days a file stays in each unit):
|---------------------|------------------|---------------------|
| FILE | UNIT | DAYS IN UNITY |
|---------------------|------------------|---------------------|
| 10 | 34 | 3 |
|---------------------|------------------|---------------------|
| 10 | 40 | 10 |
|---------------------|------------------|---------------------|
| 10 | 34 | 1 |
|---------------------|------------------|---------------------|
| 10 | 70 | 11 |
|---------------------|------------------|---------------------|
Any advice
Thanks in advance
This is a form of gaps-and-islands. For this purpose, I am thinking difference of row numbers:
select file, unit,
(lead(min(date), 1, max(date)) over (partition by file order by min(id_sequence)) -
min(date)
) as days_in_unity
from (select t.*,
row_number() over (partition by file order by id_sequence) as seqnum,
row_number() over (partition by file, unit order by id_sequence) as seqnum_2
from t
) t
group by file, unit, (seqnum - seqnum_2)
Why this works is a little tricky to explain. If you look at the results of the subquery, you will see that the difference of the two row numbers is constant across adjacent rows with the same unit value, so it identifies each island.
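The subquery results are easy to inspect with a small sketch. The one below uses Python's sqlite3 module (SQLite 3.25 or later for ROW_NUMBER) and the sample rows from the question; the column name file_id is an assumption chosen to avoid the reserved word, and the date column is left out because only the grouping step is being shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (file_id INTEGER, unit INTEGER, id_sequence INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        (10, 34, 10), (10, 34, 11),
        (10, 40, 12), (10, 40, 13), (10, 40, 14), (10, 40, 15),
        (10, 34, 16),
        (10, 70, 17), (10, 70, 18),
    ],
)

query = """
SELECT id_sequence,
       unit,
       ROW_NUMBER() OVER (PARTITION BY file_id ORDER BY id_sequence) AS seqnum,
       ROW_NUMBER() OVER (PARTITION BY file_id, unit ORDER BY id_sequence) AS seqnum_2
FROM t
ORDER BY id_sequence
"""

# Keep (id_sequence, unit, seqnum - seqnum_2) for every row.
rows = [(seq, unit, a - b) for seq, unit, a, b in conn.execute(query)]
for row in rows:
    print(row)

# Each distinct (unit, difference) pair is one island, in order of appearance.
islands = list(dict.fromkeys((unit, diff) for _, unit, diff in rows))
print(islands)
```

Unit 34 appears twice with different differences (0 and 4), which is exactly what lets GROUP BY keep the two stays in unit 34 apart.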
If you are using a newer version of Oracle (12c and above), it may be an idea to use MATCH_RECOGNIZE (pattern matching). In the following query, we define 2 patterns:
{1} multiple days, with a start date and an end date. This pattern may contain "iffy" dates eg the 01/02/2000 in between 01/05/2000 and 01/15/2000 for unit 40.
{2} single day: this occurs when the "start" and "end" are on one and the same day. In the MEASURES clause we pick up all the columns we need, and COALESCE them in the SELECT clause (MATCH_RECOGNIZE documentation here).
Table
create table fileandunit( FILE_, UNIT, DATE_, ID_SEQUENCE )
as
select 10, 34, to_date( '02-JAN-2000', 'DD-MON-YYYY'), 10 from dual union all
select 10, 34, to_date( '05-JAN-2000', 'DD-MON-YYYY'), 11 from dual union all
select 10, 40, to_date( '05-JAN-2000', 'DD-MON-YYYY'), 12 from dual union all
select 10, 40, to_date( '02-JAN-2000', 'DD-MON-YYYY'), 13 from dual union all
select 10, 40, to_date( '02-JAN-2000', 'DD-MON-YYYY'), 14 from dual union all
select 10, 40, to_date( '15-JAN-2000', 'DD-MON-YYYY'), 15 from dual union all
select 10, 34, to_date( '16-JAN-2000', 'DD-MON-YYYY'), 16 from dual union all
select 10, 70, to_date( '17-JAN-2000', 'DD-MON-YYYY'), 17 from dual union all
select 10, 70, to_date( '28-JAN-2000', 'DD-MON-YYYY'), 18 from dual ;
Data and patterns
select * from fileandunit order by id_sequence ;
FILE_ UNIT DATE_ ID_SEQUENCE
10 34 02-JAN-00 10 -- start
10 34 05-JAN-00 11 -- end
10 40 05-JAN-00 12 -- start
10 40 02-JAN-00 13 -- iffy
10 40 02-JAN-00 14 -- iffy
10 40 15-JAN-00 15 -- end
10 34 16-JAN-00 16 -- single day
10 70 17-JAN-00 17 -- start
10 70 28-JAN-00 18 -- end
Query
select
coalesce( RP.m_file, RP.s_file ) file_
, coalesce( RP.m_unit, RP.s_unit ) unit_
, coalesce( ( RP.m_end - RP.m_start ), 1 ) days_
from fileandunit
match_recognize(
partition by file_ order by id_sequence
measures
enddt.file_ as m_file -- m_: for multiple days
, enddt.unit as m_unit
, startdt.date_ as m_start
, enddt.date_ as m_end
, singledt.file_ as s_file -- s_: single day
, singledt.unit as s_unit
, singledt.date_ as s_date
one row per match
pattern ( ( startdt iffydt* enddt ) | singledt ) -- multiple days (or) single day
define
startdt as ( prev( date_ ) <= date_ or prev( date_ ) is null )
and ( prev( unit ) <> unit or prev( unit ) is null )
--
, enddt as ( next( date_ ) >= date_ or next( date_ ) is null )
and ( next( unit ) <> unit or next( unit ) is null )
--
, iffydt as ( prev( date_ ) >= date_ ) -- detect incorrect dates inside a multiple day block
and ( prev( unit ) = unit )
--
, singledt as ( prev( date_ ) = date_ - 1 and next( date_ ) = date_ + 1 )
and ( prev( unit ) <> unit and next( unit ) <> unit )
) RP ;
Result
FILE_ UNIT_ DAYS_
10 34 3
10 40 10
10 34 1
10 70 11
Tested with Oracle 18c. DBfiddle here.

Oracle Event Count Query

My SAMPLE table has the following five columns:
sample_id (PK) (NUMBER)
sampled_on (DATE)
received_on (DATE)
completed_on (DATE)
authorized_on (DATE)
I would like a query with one row per hour (constrained by a given date range) and five columns:
The hour YYYY-MM-DD HH24
Number of samples sampled during that hour
Number of samples received during that hour
Number of samples completed during that hour
Number of samples authorized during that hour
Please provide a query or at least a point in the right direction.
Reopened with bounty:
+300 reputation for the first person to incorporate Rob van Wijk's answer (single access to sample) into a view where I can efficiently query by date range (start_date/end_date or start_date/num_days).
Try:
CREATE OR REPLACE VIEW my_view AS
WITH date_bookends AS (
SELECT TRUNC(LEAST(MIN(t.sampled_on), MIN(t.received_on), MIN(t.completed_on), MIN(t.authorized_on)), 'hh24') min_date,
GREATEST(MAX(t.sampled_on), MAX(t.received_on), MAX(t.completed_on), MAX(t.authorized_on)) max_date
FROM SAMPLE t),
all_hours AS (
SELECT t.min_date + numtodsinterval(LEVEL - 1,'hour') date_by_hour
FROM date_bookends t
CONNECT BY LEVEL <= ( t.max_date - t.min_date + 1) * 24)
SELECT h.date_by_hour,
COUNT(CASE WHEN h.date_by_hour = TRUNC(s.sampled_on,'hh24') THEN 1 END) sampled#,
COUNT(CASE WHEN h.date_by_hour = TRUNC(s.received_on,'hh24') THEN 1 END) received#,
COUNT(CASE WHEN h.date_by_hour = TRUNC(s.completed_on,'hh24') THEN 1 END) completed#,
COUNT(CASE WHEN h.date_by_hour = TRUNC(s.authorized_on,'hh24') THEN 1 END) authorized#
FROM all_hours h
CROSS JOIN sample s
GROUP BY h.date_by_hour
Without using Subquery Factoring:
CREATE OR REPLACE VIEW my_view AS
SELECT h.date_by_hour,
COUNT(CASE WHEN h.date_by_hour = TRUNC(s.sampled_on,'hh24') THEN 1 END) sampled#,
COUNT(CASE WHEN h.date_by_hour = TRUNC(s.received_on,'hh24') THEN 1 END) received#,
COUNT(CASE WHEN h.date_by_hour = TRUNC(s.completed_on,'hh24') THEN 1 END) completed#,
COUNT(CASE WHEN h.date_by_hour = TRUNC(s.authorized_on,'hh24') THEN 1 END) authorized#
FROM (SELECT t.min_date + numtodsinterval(LEVEL - 1,'hour') date_by_hour
FROM (SELECT TRUNC(LEAST(MIN(t.sampled_on), MIN(t.received_on), MIN(t.completed_on), MIN(t.authorized_on)), 'hh24') min_date,
GREATEST(MAX(t.sampled_on), MAX(t.received_on), MAX(t.completed_on), MAX(t.authorized_on)) max_date
FROM SAMPLE t) t
CONNECT BY LEVEL <= ( t.max_date - t.min_date + 1) * 24) h
CROSS JOIN sample s
GROUP BY h.date_by_hour
The query accesses the SAMPLE table twice: the first time to get the earliest and latest dates, which frame the construction of the date_by_hour values.
This may not be the prettiest or most optimal solution, but it seems to work. Explanation: first convert all the dates to YYYY-MM-DD HH24 format, next gather number sampled/received/completed/authorized by date+HH24, finally join together.
with sample_hour as
(select sample_id,
to_char(sampled_on, 'YYYY-MM-DD HH24') sampled_on,
to_char(received_on, 'YYYY-MM-DD HH24') received_on,
to_char(completed_on, 'YYYY-MM-DD HH24') completed_on,
to_char(authorized_on, 'YYYY-MM-DD HH24') authorized_on
from sample),
s as
(select sampled_on thedate, count(*) num_sampled
from sample_hour
group by sampled_on),
r as
(select received_on thedate, count(*) num_received
from sample_hour
group by received_on),
c as
(select completed_on thedate, count(*) num_completed
from sample_hour
group by completed_on),
a as
(select authorized_on thedate, count(*) num_authorized
from sample_hour
group by authorized_on)
select s.thedate, num_sampled, num_received, num_completed, num_authorized
from s
left join r on s.thedate = r.thedate
left join c on s.thedate = c.thedate
left join a on s.thedate = a.thedate
;
This assumes a table "sample" created something like this:
create table sample
(sample_id number not null primary key,
sampled_on date,
received_on date,
completed_on date,
authorized_on date);
Here is an example. First create the table and insert some random data.
SQL> create table sample
2 ( sample_id number primary key
3 , sampled_on date
4 , received_on date
5 , completed_on date
6 , authorized_on date
7 )
8 /
Table created.
SQL> insert into sample
2 select level
3 , trunc(sysdate) + dbms_random.value(0,2)
4 , trunc(sysdate) + dbms_random.value(0,2)
5 , trunc(sysdate) + dbms_random.value(0,2)
6 , trunc(sysdate) + dbms_random.value(0,2)
7 from dual
8 connect by level <= 1000
9 /
1000 rows created.
Then introduce the variables for your given date range and fill them.
SQL> var DATE_RANGE_START varchar2(10)
SQL> var DATE_RANGE_END varchar2(10)
SQL> exec :DATE_RANGE_START := '2009-10-23'
PL/SQL procedure successfully completed.
SQL> exec :DATE_RANGE_END := '2009-10-24'
PL/SQL procedure successfully completed.
First you'll have to generate all hours in your given date range. This makes sure that in case you have an hour where no dates are present, you'll still have a record with 4 zeros. The implementation is in the all_hours query. The rest of the query (with only one table access to your sample table!) can then be quite simple like this.
SQL> with all_hours as
2 ( select to_date(:DATE_RANGE_START,'yyyy-mm-dd') + numtodsinterval(level-1,'hour') hour
3 from dual
4 connect by level <=
5 ( to_date(:DATE_RANGE_END,'yyyy-mm-dd')
6 - to_date(:DATE_RANGE_START,'yyyy-mm-dd')
7 + 1
8 ) * 24
9 )
10 select h.hour
11 , count(case when h.hour = trunc(s.sampled_on,'hh24') then 1 end) sampled#
12 , count(case when h.hour = trunc(s.received_on,'hh24') then 1 end) received#
13 , count(case when h.hour = trunc(s.completed_on,'hh24') then 1 end) completed#
14 , count(case when h.hour = trunc(s.authorized_on,'hh24') then 1 end) authorized#
15 from all_hours h
16 cross join sample s
17 group by h.hour
18 /
HOUR SAMPLED# RECEIVED# COMPLETED# AUTHORIZED#
------------------- ---------- ---------- ---------- -----------
23-10-2009 00:00:00 18 25 20 20
23-10-2009 01:00:00 26 24 16 13
23-10-2009 02:00:00 16 26 17 15
23-10-2009 03:00:00 19 18 27 13
23-10-2009 04:00:00 28 20 18 23
23-10-2009 05:00:00 17 13 19 21
23-10-2009 06:00:00 18 23 16 15
23-10-2009 07:00:00 19 24 14 22
23-10-2009 08:00:00 21 19 23 22
23-10-2009 09:00:00 25 20 23 24
23-10-2009 10:00:00 16 21 25 18
23-10-2009 11:00:00 21 29 21 18
23-10-2009 12:00:00 33 28 24 20
23-10-2009 13:00:00 24 19 15 15
23-10-2009 14:00:00 20 27 16 25
23-10-2009 15:00:00 15 25 27 13
23-10-2009 16:00:00 19 14 27 18
23-10-2009 17:00:00 22 22 15 27
23-10-2009 18:00:00 20 19 29 23
23-10-2009 19:00:00 20 18 17 23
23-10-2009 20:00:00 11 18 20 27
23-10-2009 21:00:00 13 25 24 19
23-10-2009 22:00:00 22 13 22 29
23-10-2009 23:00:00 20 20 19 24
24-10-2009 00:00:00 18 17 18 29
24-10-2009 01:00:00 23 30 26 21
24-10-2009 02:00:00 28 19 28 25
24-10-2009 03:00:00 21 21 11 23
24-10-2009 04:00:00 23 20 21 17
24-10-2009 05:00:00 24 16 23 23
24-10-2009 06:00:00 23 26 22 30
24-10-2009 07:00:00 25 26 18 12
24-10-2009 08:00:00 24 20 23 17
24-10-2009 09:00:00 18 26 15 19
24-10-2009 10:00:00 20 19 25 18
24-10-2009 11:00:00 19 27 17 20
24-10-2009 12:00:00 23 16 18 20
24-10-2009 13:00:00 15 15 22 19
24-10-2009 14:00:00 23 23 16 29
24-10-2009 15:00:00 18 31 32 28
24-10-2009 16:00:00 22 15 18 13
24-10-2009 17:00:00 25 17 20 26
24-10-2009 18:00:00 19 20 21 16
24-10-2009 19:00:00 22 13 28 29
24-10-2009 20:00:00 23 17 23 14
24-10-2009 21:00:00 18 18 21 22
24-10-2009 22:00:00 22 20 18 21
24-10-2009 23:00:00 21 18 22 22
48 rows selected.
Hope this helps.
Regards,
Rob.
I'd run 4 queries like this (one for each date):
SELECT <date to hour>, count(*) FROM sample GROUP BY <date to hour>
And then put the data together in the application. If you really want a single query, you can join the individual queries on hour.
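As a rough illustration of the "put the data together in the application" step, here is a sketch in Python. It assumes each of the four queries returns (hour, count) rows; the function name merge_hourly_counts and the hour strings are made up for the example, and hours missing from a result set are filled with zero.

```python
EVENTS = ("sampled_on", "received_on", "completed_on", "authorized_on")


def merge_hourly_counts(per_event_rows):
    """Merge four (hour, count) result sets into one row per hour.

    per_event_rows maps an event column name to the rows returned by
    its GROUP BY query; the output columns follow the order of EVENTS.
    """
    merged = {}
    for position, event in enumerate(EVENTS):
        for hour, count in per_event_rows.get(event, []):
            # Start every hour with zeros so absent events count as 0.
            row = merged.setdefault(hour, [0] * len(EVENTS))
            row[position] = count
    return [(hour, *merged[hour]) for hour in sorted(merged)]


# Hypothetical results of the four per-column queries.
query_results = {
    "sampled_on": [("2009-10-23 08", 2)],
    "received_on": [("2009-10-23 09", 2)],
    "completed_on": [("2009-10-23 09", 1)],
    "authorized_on": [],
}

combined = merge_hourly_counts(query_results)
for row in combined:
    print(row)
```

Only hours that appear in at least one result set are returned, so this does not produce rows of four zeros for empty hours the way the generated all_hours list does.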
Try this...
WITH src_data AS
( SELECT sample_id
, TRUNC( sampled_on, 'HH24' ) sampled_on
, TRUNC( received_on, 'HH24' ) received_on
, TRUNC( completed_on, 'HH24' ) completed_on
, TRUNC( authorized_on, 'HH24' ) authorized_on
FROM sample
)
, src_hours AS
( SELECT sampled_on the_date
FROM src_data
WHERE sampled_on IS NOT NULL
UNION
SELECT received_on the_date
FROM src_data
WHERE received_on IS NOT NULL
UNION
SELECT completed_on the_date
FROM src_data
WHERE completed_on IS NOT NULL
UNION
SELECT authorized_on the_date
FROM src_data
WHERE authorized_on IS NOT NULL
)
SELECT h.the_date
, ( SELECT COUNT(*)
FROM src_data s
WHERE s.sampled_on = h.the_date ) num_sampled_on
, ( SELECT COUNT(*)
FROM src_data r
WHERE r.received_on = h.the_date ) num_received_on
, ( SELECT COUNT(*)
FROM src_data c
WHERE c.completed_on = h.the_date ) num_completed_on
, ( SELECT COUNT(*)
FROM src_data a
WHERE a.authorized_on = h.the_date ) num_authorized_on
FROM src_hours h
Maybe something like creating this view:
create view hours as
select hour, max(cnt_sample) cnt_sample, max(cnt_received) cnt_received, max(cnt_completed) cnt_completed, max(cnt_authorized) cnt_authorized
from (
select to_char(sampled_on , 'yyyymmddhh24') hour,
count(sample_id) over (partition by to_char(sampled_on ,'yyyymmddhh24')) cnt_sample,
0 cnt_received,
0 cnt_completed,
0 cnt_authorized from sample union all
select to_char(received_on , 'yyyymmddhh24') hour,
0 cnt_sample,
count(sample_id) over (partition by to_char(received_on ,'yyyymmddhh24')) cnt_received,
0 cnt_completed,
0 cnt_authorized from sample union all
select to_char(completed_on , 'yyyymmddhh24') hour,
0 cnt_sample,
0 cnt_received,
count(sample_id) over (partition by to_char(completed_on ,'yyyymmddhh24')) cnt_completed,
0 cnt_authorized from sample union all
select to_char(authorized_on, 'yyyymmddhh24') hour,
0 cnt_sample,
0 cnt_received,
0 cnt_completed,
count(sample_id) over (partition by to_char(authorized_on ,'yyyymmddhh24')) cnt_authorized from sample
)
group by hour
;
and then selecting from the view:
select * from hours where hour >= '2001010102' and hour <= '2001010105'
order by hour;
I now propose:
create view hours_ as
with four as (
select 1 as n from dual union all
select 2 as n from dual union all
select 3 as n from dual union all
select 4 as n from dual )
select
case when four.n = 1 then trunc(sampled_on , 'hh24')
when four.n = 2 then trunc(received_on , 'hh24')
when four.n = 3 then trunc(completed_on , 'hh24')
when four.n = 4 then trunc(authorized_on, 'hh24')
end hour_,
sum ( case when four.n = 1 then 1
else 0
end ) sample_,
sum ( case when four.n = 2 then 1
else 0
end ) receive_,
sum ( case when four.n = 3 then 1
else 0
end ) complete_,
sum ( case when four.n = 4 then 1
else 0
end ) authorize_
from
four cross join sample
group by
case when four.n = 1 then trunc(sampled_on , 'hh24')
when four.n = 2 then trunc(received_on , 'hh24')
when four.n = 3 then trunc(completed_on , 'hh24')
when four.n = 4 then trunc(authorized_on, 'hh24')
end ;
To check that the table is indeed accessed only once:
explain plan for select * from hours_
where hour_ between sysdate -1 and sysdate;
select * from table (dbms_xplan.display);
Which results in:
--------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 61 | 16 (7)| 00:00:01 |
| 1 | VIEW | HOURS_ | 1 | 61 | 16 (7)| 00:00:01 |
| 2 | HASH GROUP BY | | 1 | 39 | 16 (7)| 00:00:01 |
|* 3 | FILTER | | | | | |
| 4 | NESTED LOOPS | | 1 | 39 | 15 (0)| 00:00:01 |
| 5 | VIEW | | 4 | 12 | 8 (0)| 00:00:01 |
| 6 | UNION-ALL | | | | | |
| 7 | FAST DUAL | | 1 | | 2 (0)| 00:00:01 |
| 8 | FAST DUAL | | 1 | | 2 (0)| 00:00:01 |
| 9 | FAST DUAL | | 1 | | 2 (0)| 00:00:01 |
| 10 | FAST DUAL | | 1 | | 2 (0)| 00:00:01 |
|* 11 | TABLE ACCESS FULL| SAMPLE | 1 | 36 | 2 (0)| 00:00:01 |
--------------------------------------------------------------------------------
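A different technique that should also read the table only once is UNPIVOT followed by PIVOT. This is only a sketch and not part of the answer above: it assumes Oracle 11g or later, that all four date columns are of type DATE, and the names event_time, event_type and hour_ are made up for illustration.

```sql
-- Sketch only: UNPIVOT turns the four date columns into rows
-- (one row per non-null date), then PIVOT counts them per hour and type.
select *
from (
  select trunc(event_time, 'hh24') as hour_,  -- hour bucket
         event_type                           -- which date column the row came from
  from sample
  unpivot (event_time for event_type in (
    sampled_on    as 'S',
    received_on   as 'R',
    completed_on  as 'C',
    authorized_on as 'A'))
)
pivot (count(*) for event_type in (
  'S' as sample_,
  'R' as receive_,
  'C' as complete_,
  'A' as authorize_))
order by hour_;
```

UNPIVOT excludes NULL values by default, so rows with a missing date do not produce a NULL hour bucket here. It would still be worth checking the plan with explain plan, as above, before relying on it.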
Here's what I'm thinking, but I'm not sure it's optimal enough for a view.
select
the_date,
sum(decode(the_type,'S',the_count,0)) samples,
sum(decode(the_type,'R',the_count,0)) receipts,
sum(decode(the_type,'C',the_count,0)) completions,
sum(decode(the_type,'A',the_count,0)) authorizations
from(
select
trunc(sampled_on,'HH24') the_date,
'S' the_type,
count(1) the_count
FROM sample
group by trunc(sampled_on,'HH24')
union all
select
trunc(received_on,'HH24'),
'R',
count(1)
FROM sample
group by trunc(received_on,'HH24')
union all
select
trunc(completed_on,'HH24'),
'C',
count(1)
FROM sample
group by trunc(completed_on,'HH24')
union all
select
trunc(authorized_on,'HH24'),
'A',
count(1)
FROM sample
group by trunc(authorized_on,'HH24')
)
group by the_date
Then, to query, you could just use normal date constructs:
select * from magic_view where the_date > sysdate-1;
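Since the_date is truncated to the hour, a filter on sysdate-1 cuts through the middle of an hour bucket. A sketch of a half-open range that returns only the last 24 complete hours, assuming the query above has been saved as the view magic_view:

```sql
-- Assumes magic_view wraps the query above; the_date holds hour-truncated dates.
-- Lower bound is inclusive, upper bound is exclusive, so the current
-- (still incomplete) hour is left out.
select *
from magic_view
where the_date >= trunc(sysdate, 'HH24') - 1
and the_date < trunc(sysdate, 'HH24')
order by the_date;
```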
EDIT
Okay, so I created a sample table and did some metrics:
create table sample (
sample_id number primary key,
sampled_on date,
received_on date,
completed_on date,
authorized_on date
);
insert into sample (
select
level,
trunc(sysdate) + dbms_random.value(0,2),
trunc(sysdate) + dbms_random.value(0,2),
trunc(sysdate) + dbms_random.value(0,2),
trunc(sysdate) + dbms_random.value(0,2)
from dual
connect by level <= 1000
);
The explain plan is:
---------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)|
---------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 4000 | 97K| 25 (20)|
| 1 | HASH GROUP BY | | 4000 | 97K| 25 (20)|
| 2 | VIEW | | 4000 | 97K| 24 (17)|
| 3 | UNION-ALL | | | | |
| 4 | HASH GROUP BY | | 1000 | 9000 | 6 (17)|
| 5 | TABLE ACCESS FULL| SAMPLE | 1000 | 9000 | 5 (0)|
| 6 | HASH GROUP BY | | 1000 | 9000 | 6 (17)|
| 7 | TABLE ACCESS FULL| SAMPLE | 1000 | 9000 | 5 (0)|
| 8 | HASH GROUP BY | | 1000 | 9000 | 6 (17)|
| 9 | TABLE ACCESS FULL| SAMPLE | 1000 | 9000 | 5 (0)|
| 10 | HASH GROUP BY | | 1000 | 9000 | 6 (17)|
| 11 | TABLE ACCESS FULL| SAMPLE | 1000 | 9000 | 5 (0)|
---------------------------------------------------------------------
On my machine, a query against this view for the past 24 hours completes in 23 ms. Not bad, but it's only 1,000 rows. Before you discount the 4 separate queries, you'll need to do a performance analysis of the individual solutions.
This is similar to René Nyffenegger's idea: filter by each type of date field, and then amalgamate the counts.
Note that I don't think this can be done in a single SELECT, because you need to both group and order by each date field, which means splitting the work into separate sub-queries.
I have hard-coded a date range of '2009-11-04' to '2009-11-04 23:59:59' for this example:
SELECT
DateHour,
SUM(sampled) total_sampled,
SUM(received) total_received,
SUM(completed) total_completed,
SUM(authorized) total_authorized
FROM
(SELECT
TO_CHAR(sampled_on, 'YYYY-MM-DD HH24') DateHour,
1 sampled,
0 received,
0 completed,
0 authorized
FROM
SAMPLE
WHERE
sampled_on >= TO_DATE('2009-11-04', 'YYYY-MM-DD')
AND sampled_on <= TO_DATE('2009-11-04 23:59:59', 'YYYY-MM-DD HH24:MI:SS')
UNION ALL
SELECT
TO_CHAR(received_on, 'YYYY-MM-DD HH24') DateHour,
0 sampled,
1 received,
0 completed,
0 authorized
FROM
SAMPLE
WHERE
received_on >= TO_DATE('2009-11-04', 'YYYY-MM-DD')
AND received_on <= TO_DATE('2009-11-04 23:59:59', 'YYYY-MM-DD HH24:MI:SS')
UNION ALL
SELECT
TO_CHAR(completed_on, 'YYYY-MM-DD HH24') DateHour,
0 sampled,
0 received,
1 completed,
0 authorized
FROM
SAMPLE
WHERE
completed_on >= TO_DATE('2009-11-04', 'YYYY-MM-DD')
AND completed_on <= TO_DATE('2009-11-04 23:59:59', 'YYYY-MM-DD HH24:MI:SS')
UNION ALL
SELECT
TO_CHAR(authorized_on, 'YYYY-MM-DD HH24') DateHour,
0 sampled,
0 received,
0 completed,
1 authorized
FROM
SAMPLE
WHERE
authorized_on >= TO_DATE('2009-11-04', 'YYYY-MM-DD')
AND authorized_on <= TO_DATE('2009-11-04 23:59:59', 'YYYY-MM-DD HH24:MI:SS'))
GROUP BY
DateHour
ORDER BY
DateHour