How to use a window function during a merge in SQL

I am working in Oracle SQL. I have two tables that are linked to each other by one column, company_id (see the picture). I want to merge table 1 into table 2 and calculate the 6-month average of income (over the 6 months before each date in table 2) for each company_id and each date of table2. I would appreciate any code or ideas on how to solve this task.

You can use an analytic range window to calculate the averages for table1 and then JOIN the result to table2:
SELECT t2.*,
       t1.avg_income_6,
       t1.avg_income_12
FROM   table2 t2
       LEFT OUTER JOIN (
         SELECT company_id,
                dt,
                ROUND(AVG(income) OVER (
                  PARTITION BY company_id
                  ORDER BY dt
                  RANGE BETWEEN INTERVAL '5' MONTH PRECEDING
                            AND INTERVAL '0' MONTH FOLLOWING
                ), 2) AS avg_income_6,
                ROUND(AVG(income) OVER (
                  PARTITION BY company_id
                  ORDER BY dt
                  RANGE BETWEEN INTERVAL '11' MONTH PRECEDING
                            AND INTERVAL '0' MONTH FOLLOWING
                ), 2) AS avg_income_12
         FROM   table1
       ) t1
       ON (t2.company_id = t1.company_id AND t2.dt = t1.dt);
Which, for the sample data:
CREATE TABLE table1 (company_id, dt, income) AS
SELECT 1, date '2019-01-01', 65 FROM DUAL UNION ALL
SELECT 1, date '2019-02-01', 58 FROM DUAL UNION ALL
SELECT 1, date '2019-03-01', 12 FROM DUAL UNION ALL
SELECT 1, date '2019-04-01', 81 FROM DUAL UNION ALL
SELECT 1, date '2019-05-01', 38 FROM DUAL UNION ALL
SELECT 1, date '2019-06-01', 81 FROM DUAL UNION ALL
SELECT 1, date '2019-07-01', 38 FROM DUAL UNION ALL
SELECT 1, date '2019-08-01', 69 FROM DUAL UNION ALL
SELECT 1, date '2019-09-01', 54 FROM DUAL UNION ALL
SELECT 1, date '2019-10-01', 90 FROM DUAL UNION ALL
SELECT 1, date '2019-11-01', 10 FROM DUAL UNION ALL
SELECT 1, date '2019-12-01', 12 FROM DUAL UNION ALL
SELECT 1, date '2020-01-01', 11 FROM DUAL UNION ALL
SELECT 1, date '2020-02-01', 83 FROM DUAL UNION ALL
SELECT 1, date '2020-03-01', 18 FROM DUAL UNION ALL
SELECT 1, date '2020-04-01', 28 FROM DUAL UNION ALL
SELECT 1, date '2020-05-01', 52 FROM DUAL UNION ALL
SELECT 1, date '2020-06-01', 21 FROM DUAL UNION ALL
SELECT 1, date '2020-07-01', 54 FROM DUAL UNION ALL
SELECT 1, date '2020-08-01', 30 FROM DUAL UNION ALL
SELECT 1, date '2020-09-01', 12 FROM DUAL UNION ALL
SELECT 1, date '2020-10-01', 25 FROM DUAL UNION ALL
SELECT 1, date '2020-11-01', 86 FROM DUAL UNION ALL
SELECT 1, date '2020-12-01', 4 FROM DUAL UNION ALL
SELECT 1, date '2021-01-01', 10 FROM DUAL UNION ALL
SELECT 1, date '2021-02-01', 72 FROM DUAL UNION ALL
SELECT 1, date '2021-03-01', 65 FROM DUAL UNION ALL
SELECT 1, date '2021-04-01', 25 FROM DUAL;
CREATE TABLE table2 (company_id, dt) AS
SELECT 1, date '2019-06-01' FROM DUAL UNION ALL
SELECT 1, date '2019-09-01' FROM DUAL UNION ALL
SELECT 1, date '2019-12-01' FROM DUAL UNION ALL
SELECT 1, date '2020-01-01' FROM DUAL UNION ALL
SELECT 1, date '2020-07-01' FROM DUAL UNION ALL
SELECT 1, date '2020-08-01' FROM DUAL UNION ALL
SELECT 1, date '2021-03-01' FROM DUAL UNION ALL
SELECT 1, date '2021-04-01' FROM DUAL;
Outputs:
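| COMPANY_ID | DT                  | AVG_INCOME_6 | AVG_INCOME_12 |
|------------|---------------------|--------------|---------------|
| 1          | 2019-06-01 00:00:00 | 55.83        | 55.83         |
| 1          | 2019-09-01 00:00:00 | 60.17        | 55.11         |
| 1          | 2019-12-01 00:00:00 | 45.5         | 50.67         |
| 1          | 2020-01-01 00:00:00 | 41           | 46.17         |
| 1          | 2020-07-01 00:00:00 | 42.67        | 41.83         |
| 1          | 2020-08-01 00:00:00 | 33.83        | 38.58         |
| 1          | 2021-03-01 00:00:00 | 43.67        | 38.25         |
| 1          | 2021-04-01 00:00:00 | 43.67        | 38            |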

I don't think you need any window function here (if you were thinking of analytic functions); ordinary avg with appropriate join conditions should do the job.
Sample data:
with
  table1 (company_id, datum, income) as
    (select 1, date '2019-01-01', 65 from dual union all
     select 1, date '2019-02-01', 58 from dual union all
     select 1, date '2019-03-01', 12 from dual union all
     select 1, date '2019-04-01', 81 from dual union all
     select 1, date '2019-05-01', 38 from dual union all
     select 1, date '2019-06-01', 81 from dual union all
     select 1, date '2019-07-01', 38 from dual union all
     select 1, date '2019-08-01', 69 from dual union all
     select 1, date '2019-09-01', 54 from dual union all
     select 1, date '2019-10-01', 90 from dual union all
     select 1, date '2019-11-01', 10 from dual union all
     select 1, date '2019-12-01', 12 from dual
    ),
  table2 (company_id, datum) as
    (select 1, date '2019-06-01' from dual union all
     select 1, date '2019-09-01' from dual union all
     select 1, date '2019-12-01' from dual union all
     select 1, date '2020-01-01' from dual union all
     select 1, date '2020-07-01' from dual
    )
-- Query begins here:
select b.company_id,
       b.datum,
       round(avg(a.income), 2) result
from table1 a join table2 b on a.company_id = b.company_id
                           and a.datum > add_months(b.datum, -6)
                           and a.datum <= b.datum
group by b.company_id, b.datum;

COMPANY_ID DATUM        RESULT
---------- -------- ----------
         1 01.06.19      55,83
         1 01.09.19      60,17
         1 01.12.19       45,5
         1 01.01.20         47
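The join-based average can also be checked outside Oracle. The following is a minimal sketch, assuming Python's built-in sqlite3 module as a stand-in database: dates are stored as ISO strings, and SQLite's date(x, '-6 months') replaces ADD_MONTHS(x, -6). Neither substitution comes from the answer itself; only the join conditions do.

```python
import sqlite3

# In-memory SQLite database standing in for Oracle (an assumption of this sketch).
# Dates are stored as ISO-8601 strings so that string comparison orders them correctly.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (company_id INTEGER, datum TEXT, income INTEGER)")
conn.execute("CREATE TABLE table2 (company_id INTEGER, datum TEXT)")

incomes_2019 = [65, 58, 12, 81, 38, 81, 38, 69, 54, 90, 10, 12]  # Jan..Dec 2019
conn.executemany(
    "INSERT INTO table1 VALUES (1, ?, ?)",
    [(f"2019-{month:02d}-01", income)
     for month, income in enumerate(incomes_2019, start=1)],
)
conn.executemany(
    "INSERT INTO table2 VALUES (1, ?)",
    [("2019-06-01",), ("2019-09-01",), ("2019-12-01",),
     ("2020-01-01",), ("2020-07-01",)],
)

# Same join conditions as the Oracle query; only the date arithmetic differs.
rows = conn.execute(
    """
    SELECT b.company_id, b.datum, ROUND(AVG(a.income), 2) AS result
    FROM table1 a
    JOIN table2 b
      ON a.company_id = b.company_id
     AND a.datum > date(b.datum, '-6 months')
     AND a.datum <= b.datum
    GROUP BY b.company_id, b.datum
    ORDER BY b.datum
    """
).fetchall()

for row in rows:
    print(row)
```

Because this is an inner join, 2020-07-01 produces no row at all: table1 has no income in the preceding six months. A LEFT JOIN from table2 would keep that date with a NULL average instead.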

Related

How can I get comma separated values from a table in a single cell in Oracle SQL? How do I do it?

For example, if the input table I have is the following:
| id   | value | datetime            |
|------|-------|---------------------|
| 9245 | 44    | 2021-10-15 00:00:00 |
| 9245 | 42    | 2021-09-14 00:00:00 |
| 9245 | 41    | 2021-08-13 00:00:00 |
| 9245 | 62    | 2021-05-14 00:00:00 |
| 9245 | 100   | 2021-04-15 00:00:00 |
| 9245 | 131   | 2021-03-16 00:00:00 |
| 9245 | 125   | 2021-02-12 00:00:00 |
| 9245 | 137   | 2021-01-18 00:00:00 |
| 8873 | 358   | 2021-10-15 00:00:00 |
| 8873 | 373   | 2021-09-14 00:00:00 |
| 8873 | 373   | 2021-08-13 00:00:00 |
| 8873 | 411   | 2021-07-14 00:00:00 |
| 8873 | 381   | 2021-06-14 00:00:00 |
| 8873 | 275   | 2021-05-14 00:00:00 |
| 8873 | 216   | 2021-04-15 00:00:00 |
| 8873 | 189   | 2021-03-16 00:00:00 |
| 8873 | 157   | 2021-02-12 00:00:00 |
| 8873 | 191   | 2021-01-18 00:00:00 |
My idea would be to achieve a grouping like the one below:
| id   | grouped_values                             |
|------|--------------------------------------------|
| 8873 | 191,157,Null,Null,Null,381,411,373,373,358 |
| 9245 | 137,125,131,100,62,Null,Null,41,42,44      |
As you can see, in this case I have 2 different ids. When I group by id, I would like the first value to correspond to the first date for that id, and any date that has no value to appear as a null.
How can I put those null values in the correct place? How do I detect the missing values and set them to null? How do I make the positions of the values line up with the dates?
I've been trying to use the LISTAGG or XMLAGG function to group, but at the moment I don't know how to fill in the missing places.
Another option; read comments within code. Sample data in lines #1 - 9; query begins at line #10.
SQL> with test(id, value, datum) as
2 (select 1, 5, date '2021-01-10' from dual union all --> missing February and March
3 select 1, 8, date '2021-04-13' from dual union all
4 select 1, 3, date '2021-05-22' from dual union all
5 --
6 select 2, 1, date '2021-03-21' from dual union all
7 select 2, 7, date '2021-04-22' from dual union all --> missing May and June
8 select 2, 9, date '2021-07-10' from dual
9 ),
10 -- calendar per ID
11 minimax as
12 (select id, trunc(min(datum), 'mm') mindat, trunc(max(datum), 'mm') maxdat
13 from test
14 group by id
15 ),
16 calendar as
17 (select m.id,
18 'null' value,
19 add_months(m.mindat, column_value - 1) datum
20 from minimax m
21 cross join table(cast(multiset(select level from dual
22 connect by level <= ceil(months_between(maxdat, mindat)) + 1
23 ) as sys.odcinumberlist))
24 )
25 select c.id,
26 listagg(nvl(to_char(t.value), c.value), ', ') within group (order by c.datum) result
27 from calendar c left join test t on t.id = c.id and trunc(t.datum, 'mm') = c.datum
28 group by c.id;
ID RESULT
---------- ----------------------------------------
1 5, null, null, 8, 3
2 1, 7, null, null, 9
SQL>
Use a PARTITIONed OUTER JOIN:
WITH calendar (day) AS (
SELECT DATE '2021-01-18' FROM DUAL UNION ALL
SELECT DATE '2021-02-12' FROM DUAL UNION ALL
SELECT DATE '2021-03-16' FROM DUAL UNION ALL
SELECT DATE '2021-04-15' FROM DUAL UNION ALL
SELECT DATE '2021-05-14' FROM DUAL UNION ALL
SELECT DATE '2021-06-14' FROM DUAL UNION ALL
SELECT DATE '2021-07-14' FROM DUAL UNION ALL
SELECT DATE '2021-08-13' FROM DUAL UNION ALL
SELECT DATE '2021-09-14' FROM DUAL UNION ALL
SELECT DATE '2021-10-15' FROM DUAL
-- Or
-- SELECT DISTINCT datetime FROM table_name
)
SELECT t.id,
LISTAGG(COALESCE(TO_CHAR(t.value), 'null'), ',')
WITHIN GROUP (ORDER BY c.day)
AS grouped_values
FROM calendar c
LEFT OUTER JOIN table_name t
PARTITION BY (t.id)
ON (c.day = t.datetime)
GROUP BY t.id
Or:
WITH calendar (day) AS (
SELECT ADD_MONTHS(DATE '2021-01-01', LEVEL - 1)
FROM DUAL
CONNECT BY LEVEL <= 10
-- or
-- SELECT ADD_MONTHS(min_dt, LEVEL - 1)
-- FROM (
-- SELECT MIN(TRUNC(datetime, 'MM')) AS min_dt,
-- MAX(TRUNC(datetime, 'MM')) AS max_dt
-- FROM table_name
-- )
-- CONNECT BY ADD_MONTHS(min_dt, LEVEL - 1) <= max_dt
)
SELECT t.id,
LISTAGG(COALESCE(TO_CHAR(t.value), 'null'), ',') WITHIN GROUP (ORDER BY c.day)
AS grouped_values
FROM calendar c
LEFT OUTER JOIN table_name t
PARTITION BY (t.id)
ON (c.day = TRUNC(t.datetime, 'MM'))
GROUP BY t.id
Which, for the sample data:
CREATE TABLE table_name (id, value, datetime) AS
SELECT 9245, 137, DATE '2021-01-18' FROM DUAL UNION ALL
SELECT 9245, 125, DATE '2021-02-12' FROM DUAL UNION ALL
SELECT 9245, 131, DATE '2021-03-16' FROM DUAL UNION ALL
SELECT 9245, 100, DATE '2021-04-15' FROM DUAL UNION ALL
SELECT 9245, 62, DATE '2021-05-14' FROM DUAL UNION ALL
SELECT 9245, 41, DATE '2021-08-13' FROM DUAL UNION ALL
SELECT 9245, 42, DATE '2021-09-14' FROM DUAL UNION ALL
SELECT 9245, 44, DATE '2021-10-15' FROM DUAL UNION ALL
SELECT 8873, 191, DATE '2021-01-18' FROM DUAL UNION ALL
SELECT 8873, 157, DATE '2021-02-12' FROM DUAL UNION ALL
SELECT 8873, 189, DATE '2021-03-16' FROM DUAL UNION ALL
SELECT 8873, 216, DATE '2021-04-15' FROM DUAL UNION ALL
SELECT 8873, 275, DATE '2021-05-14' FROM DUAL UNION ALL
SELECT 8873, 381, DATE '2021-06-14' FROM DUAL UNION ALL
SELECT 8873, 411, DATE '2021-07-14' FROM DUAL UNION ALL
SELECT 8873, 373, DATE '2021-08-13' FROM DUAL UNION ALL
SELECT 8873, 373, DATE '2021-09-14' FROM DUAL UNION ALL
SELECT 8873, 358, DATE '2021-10-15' FROM DUAL;
Both output:
| ID   | GROUPED_VALUES                          |
|------|-----------------------------------------|
| 8873 | 191,157,189,216,275,381,411,373,373,358 |
| 9245 | 137,125,131,100,62,null,null,41,42,44   |
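The gap-filling idea behind the calendar join (one slot per calendar month, with a placeholder where an id has no row) can be sketched in plain Python. This is an illustration only, not part of either query. The helper names month_index and grouped_values are made up for the example, and it assumes at most one row per id per month.

```python
from datetime import date

# Sample rows (id, value, datetime) copied from the CREATE TABLE statement above.
rows = [
    (9245, 137, date(2021, 1, 18)), (9245, 125, date(2021, 2, 12)),
    (9245, 131, date(2021, 3, 16)), (9245, 100, date(2021, 4, 15)),
    (9245, 62, date(2021, 5, 14)), (9245, 41, date(2021, 8, 13)),
    (9245, 42, date(2021, 9, 14)), (9245, 44, date(2021, 10, 15)),
    (8873, 191, date(2021, 1, 18)), (8873, 157, date(2021, 2, 12)),
    (8873, 189, date(2021, 3, 16)), (8873, 216, date(2021, 4, 15)),
    (8873, 275, date(2021, 5, 14)), (8873, 381, date(2021, 6, 14)),
    (8873, 411, date(2021, 7, 14)), (8873, 373, date(2021, 8, 13)),
    (8873, 373, date(2021, 9, 14)), (8873, 358, date(2021, 10, 15)),
]


def month_index(d):
    """Months counted from year 0, so consecutive months differ by exactly 1."""
    return d.year * 12 + (d.month - 1)


def grouped_values(rows, placeholder="null"):
    """Return {id: comma-separated values} with one slot per calendar month."""
    months = [month_index(d) for _, _, d in rows]
    first, last = min(months), max(months)  # shared calendar, as in the SQL
    by_id = {}
    for id_, value, d in rows:
        by_id.setdefault(id_, {})[month_index(d)] = str(value)
    return {
        id_: ",".join(values.get(m, placeholder) for m in range(first, last + 1))
        for id_, values in sorted(by_id.items())
    }


result = grouped_values(rows)
for id_, text in result.items():
    print(id_, text)
```

The calendar here runs from the earliest to the latest month across all ids, which matches the shared calendar CTE in the queries above. A per-id calendar would instead take the minimum and maximum inside each id's group.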
You can run this query directly without creating any tables. Here is a version with start date and end date with parameters:
SELECT
FE.id
,LISTAGG(NVL(TO_CHAR(TRUNC(CON.value)), 'null'), ',') WITHIN GROUP (ORDER BY FE.the_date ASC) GROUPED_VALUES
FROM
(--begin from1
SELECT id
,EXTRACT (YEAR FROM the_date) the_year
,EXTRACT (MONTH FROM the_date) the_month
,the_date
FROM
(
SELECT distinct id
FROM
(
SELECT 9245 id, 137 value, DATE '2021-01-18' datetime FROM DUAL UNION ALL
SELECT 9245, 125, DATE '2021-02-12' FROM DUAL UNION ALL
SELECT 9245, 131, DATE '2021-03-16' FROM DUAL UNION ALL
SELECT 9245, 100, DATE '2021-04-15' FROM DUAL UNION ALL
SELECT 9245, 62, DATE '2021-05-14' FROM DUAL UNION ALL
SELECT 9245, 41, DATE '2021-08-13' FROM DUAL UNION ALL
SELECT 9245, 42, DATE '2021-09-14' FROM DUAL UNION ALL
SELECT 9245, 44, DATE '2021-10-15' FROM DUAL UNION ALL
SELECT 8873, 191, DATE '2021-01-18' FROM DUAL UNION ALL
SELECT 8873, 157, DATE '2021-02-12' FROM DUAL UNION ALL
SELECT 8873, 189, DATE '2021-03-16' FROM DUAL UNION ALL
SELECT 8873, 216, DATE '2021-04-15' FROM DUAL UNION ALL
SELECT 8873, 275, DATE '2021-05-14' FROM DUAL UNION ALL
SELECT 8873, 381, DATE '2021-06-14' FROM DUAL UNION ALL
SELECT 8873, 411, DATE '2021-07-14' FROM DUAL UNION ALL
SELECT 8873, 373, DATE '2021-08-13' FROM DUAL UNION ALL
SELECT 8873, 373, DATE '2021-09-14' FROM DUAL UNION ALL
SELECT 8873, 358, DATE '2021-10-15' FROM DUAL
) table_name
) PS CROSS JOIN
( -- in this sub query you can change the **start date** and **end date** to change the ranges
SELECT
MIN(TO_DATE('2021-01-01' /*start date*/, 'YYYY-MM-DD') + LEVEL - 1) the_date
FROM DUAL
CONNECT BY
TO_DATE('2021-01-01' /*start date*/, 'YYYY-MM-DD') + LEVEL - 1 <= TO_DATE('2021-10-01' /*end date*/, 'YYYY-MM-DD')
GROUP BY EXTRACT (YEAR FROM TO_DATE('2021-01-01' /*start date*/, 'YYYY-MM-DD') + LEVEL - 1)
,EXTRACT (MONTH FROM TO_DATE('2021-01-01' /*start date*/, 'YYYY-MM-DD') + LEVEL - 1)
) the_dates
) FE LEFT OUTER JOIN --end from1
(
SELECT
table_name.id id
, EXTRACT(MONTH FROM table_name.datetime) the_month
, EXTRACT(YEAR FROM table_name.datetime) the_year
,MAX(table_name.datetime) datetime
,SUM(table_name.value) value
FROM
(
SELECT 9245 id, 137 value, DATE '2021-01-18' datetime FROM DUAL UNION ALL
SELECT 9245, 125, DATE '2021-02-12' FROM DUAL UNION ALL
SELECT 9245, 131, DATE '2021-03-16' FROM DUAL UNION ALL
SELECT 9245, 100, DATE '2021-04-15' FROM DUAL UNION ALL
SELECT 9245, 62, DATE '2021-05-14' FROM DUAL UNION ALL
SELECT 9245, 41, DATE '2021-08-13' FROM DUAL UNION ALL
SELECT 9245, 42, DATE '2021-09-14' FROM DUAL UNION ALL
SELECT 9245, 44, DATE '2021-10-15' FROM DUAL UNION ALL
SELECT 8873, 191, DATE '2021-01-18' FROM DUAL UNION ALL
SELECT 8873, 157, DATE '2021-02-12' FROM DUAL UNION ALL
SELECT 8873, 189, DATE '2021-03-16' FROM DUAL UNION ALL
SELECT 8873, 216, DATE '2021-04-15' FROM DUAL UNION ALL
SELECT 8873, 275, DATE '2021-05-14' FROM DUAL UNION ALL
SELECT 8873, 381, DATE '2021-06-14' FROM DUAL UNION ALL
SELECT 8873, 411, DATE '2021-07-14' FROM DUAL UNION ALL
SELECT 8873, 373, DATE '2021-08-13' FROM DUAL UNION ALL
SELECT 8873, 373, DATE '2021-09-14' FROM DUAL UNION ALL
SELECT 8873, 358, DATE '2021-10-15' FROM DUAL
) table_name
GROUP BY table_name.id, EXTRACT(YEAR FROM table_name.datetime), EXTRACT(MONTH FROM table_name.datetime)
) Con ON FE.id = Con.id AND FE.the_year = CON.the_year AND FE.the_month = CON.the_month
GROUP BY FE.id
Note: this query also recognizes the missing dates automatically

Complex query analyzing historical records

I am using Oracle and trying to retrieve the total number of days a person was out of the office during the year. I have 2 tables involved:
Statuses
1 - Active
2 - Out of the Office
3 - Other
ScheduleHistory
RecordID - primary key
PersonID
PreviousStatusID
NextStatusID
DateChanged
I can easily find when the person went on vacation and when they came back, using
SELECT DateChanged FROM ScheduleHistory WHERE PersonID=111 AND NextStatusID = 2
and
SELECT DateChanged FROM ScheduleHistory WHERE PersonID=111 AND PreviousStatusID = 2
But if a person went on vacation more than once, how can I calculate the total number of days the person was out of the office? Is it possible to do this programmatically, given only the PersonID?
Here is some sample data:
RecordID PersonID PreviousStatusID NextStatusID DateChanged
-----------------------------------------------------------------------------
1 111 1 2 03/11/2020
2 111 2 1 03/13/2020
3 111 1 3 04/01/2020
4 111 3 1 04/07/2020
5 111 1 2 06/03/2020
6 111 2 1 06/05/2020
7 111 1 2 09/14/2020
8 111 2 1 09/17/2020
So from the data above, for the year 2020 for PersonID 111 the query should return 7
Try this:
with aux1 AS (
SELECT
a.*,
to_date(datechanged, 'MM/DD/YYYY') - LAG(to_date(datechanged, 'MM/DD/YYYY')) OVER(
PARTITION BY personid
ORDER BY
recordid
) lag_date
FROM
ScheduleHistory a
)
SELECT
personid,
SUM(lag_date) tot_days_ooo
FROM
aux1
WHERE
previousstatusid = 2
GROUP BY
personid;
If you want total days (or weekdays) for each year (and to account for periods when it goes over the year boundary) then:
WITH date_ranges ( personid, status, start_date, end_date ) AS (
SELECT personid,
nextstatusid,
datechanged,
LEAD(datechanged, 1, datechanged) OVER(
PARTITION BY personid
ORDER BY datechanged
)
FROM table_name
),
split_year_ranges ( personid, year, start_date, end_date, max_date ) AS (
SELECT personid,
TRUNC( start_date, 'YY' ),
start_date,
LEAST(
end_date,
ADD_MONTHS( TRUNC( start_date, 'YY' ), 12 )
),
end_date
FROM date_ranges
WHERE status = 2
UNION ALL
SELECT personid,
end_date,
end_date,
LEAST( max_date, ADD_MONTHS( end_date, 12 ) ),
max_date
FROM split_year_ranges
WHERE end_date < max_date
)
SELECT personid,
EXTRACT( YEAR FROM year) AS year,
SUM( end_date - start_date ) AS total_days,
SUM(
( TRUNC( end_date, 'IW' ) - TRUNC( start_date, 'IW' ) ) * 5 / 7
+ LEAST( end_date - TRUNC( end_date, 'IW' ), 5 )
- LEAST( start_date - TRUNC( start_date, 'IW' ), 5 )
) AS total_weekdays
FROM split_year_ranges
GROUP BY personid, year
ORDER BY personid, year
Which, for the sample data:
CREATE TABLE table_name ( RecordID, PersonID, PreviousStatusID, NextStatusID, DateChanged ) AS
SELECT 1, 111, 1, 2, DATE '2020-03-11' FROM DUAL UNION ALL
SELECT 2, 111, 2, 1, DATE '2020-03-13' FROM DUAL UNION ALL
SELECT 3, 111, 1, 3, DATE '2020-04-01' FROM DUAL UNION ALL
SELECT 4, 111, 3, 1, DATE '2020-04-07' FROM DUAL UNION ALL
SELECT 5, 111, 1, 2, DATE '2020-06-03' FROM DUAL UNION ALL
SELECT 6, 111, 2, 1, DATE '2020-06-05' FROM DUAL UNION ALL
SELECT 7, 111, 1, 2, DATE '2020-09-14' FROM DUAL UNION ALL
SELECT 8, 111, 2, 1, DATE '2020-09-17' FROM DUAL UNION ALL
SELECT 9, 222, 1, 2, DATE '2019-12-31' FROM DUAL UNION ALL
SELECT 10, 222, 2, 2, DATE '2020-12-01' FROM DUAL UNION ALL
SELECT 11, 222, 2, 2, DATE '2021-01-02' FROM DUAL;
Outputs:
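| PERSONID | YEAR | TOTAL_DAYS | TOTAL_WEEKDAYS |
|----------|------|------------|----------------|
| 111      | 2020 | 7          | 7              |
| 222      | 2019 | 1          | 1              |
| 222      | 2020 | 366        | 262            |
| 222      | 2021 | 1          | 1              |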
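To make the year-splitting and weekday arithmetic easier to follow, here is a rough Python sketch of the same logic. It is an illustration rather than a translation of the recursive query. The function names are invented for the example, and it assumes each out-of-office period runs from a row with NextStatusID = 2 to the next row for that person, or has zero length if there is no next row.

```python
from collections import defaultdict
from datetime import date, timedelta

# (PersonID, NextStatusID, DateChanged) rows from the sample data above.
history = [
    (111, 2, date(2020, 3, 11)), (111, 1, date(2020, 3, 13)),
    (111, 3, date(2020, 4, 1)), (111, 1, date(2020, 4, 7)),
    (111, 2, date(2020, 6, 3)), (111, 1, date(2020, 6, 5)),
    (111, 2, date(2020, 9, 14)), (111, 1, date(2020, 9, 17)),
    (222, 2, date(2019, 12, 31)), (222, 2, date(2020, 12, 1)),
    (222, 2, date(2021, 1, 2)),
]

OUT_OF_OFFICE = 2


def weekdays_between(start, end):
    """Count Monday-Friday dates in the half-open range [start, end)."""
    def week_start(d):
        return d - timedelta(days=d.weekday())  # Monday of d's ISO week

    whole_weeks = (week_start(end) - week_start(start)).days // 7
    return whole_weeks * 5 + min(end.weekday(), 5) - min(start.weekday(), 5)


def out_of_office_by_year(history):
    """Return {(person, year): (total_days, total_weekdays)}."""
    by_person = defaultdict(list)
    for person, status, changed in history:
        by_person[person].append((changed, status))

    totals = defaultdict(lambda: [0, 0])
    for person, events in by_person.items():
        events.sort()
        for i, (start, status) in enumerate(events):
            if status != OUT_OF_OFFICE:
                continue
            # Like LEAD(datechanged, 1, datechanged): the last row ends where it starts.
            end = events[i + 1][0] if i + 1 < len(events) else start
            while start < end:
                # Cut the range at the next 1 January so each piece stays in one year.
                piece_end = min(end, date(start.year + 1, 1, 1))
                totals[(person, start.year)][0] += (piece_end - start).days
                totals[(person, start.year)][1] += weekdays_between(start, piece_end)
                start = piece_end
    return {key: tuple(value) for key, value in sorted(totals.items())}


result = out_of_office_by_year(history)
for (person, year), (days, weekdays) in result.items():
    print(person, year, days, weekdays)
```

The weekday count uses the same idea as the TRUNC(..., 'IW') expression: count whole weeks between the two Mondays, then add the weekdays elapsed in the end week and subtract those elapsed in the start week.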
Provided no vacation crosses a year boundary:
with grps as (
SELECT sh.*,
row_number() over (partition by PersonID, NextStatusID order by DateChanged) grp
FROM ScheduleHistory sh
WHERE NextStatusID in (1,2) and 3 not in (NextStatusID, PreviousStatusID)
), durations as (
SELECT PersonID, min(DateChanged) DateChanged, max(DateChanged) - min(DateChanged) duration
FROM grps
GROUP BY PersonID, grp
)
SELECT PersonID, sum(duration) days_out
FROM durations
GROUP BY PersonID;
year_span is used to split an interval that spans two years into two separate records.
H1 adds a row number per PersonID to get the right sequence for each person.
H2 gets the period for each status change and extracts the first day of the year in which the interval ends.
H3 splits records that span two years and calculates the correct start and end dates (D1 and D2) for each interval.
H calculates the days elapsed in each interval for each year.
The final query sums the records to produce the output.
EDIT
If you need workdays instead of total days, you should not use total_days/7*5, because it is a poor approximation and in some cases gives odd results.
I have posted a solution that skips from Fridays to Mondays here
with
statuses (sid, sdescr) as (
select 1, 'Active' from dual union all
select 2, 'Out of the Office' from dual union all
select 3, 'Other' from dual
),
ScheduleHistory(RecordID, PersonID, PreviousStatusID, NextStatusID , DateChanged) as (
select 1, 111, 1, 2, date '2020-03-11' from dual union all
select 2, 111, 2, 1, date '2020-03-13' from dual union all
select 3, 111, 1, 3, date '2020-04-01' from dual union all
select 4, 111, 3, 1, date '2020-04-07' from dual union all
select 5, 111, 1, 2, date '2020-06-03' from dual union all
select 6, 111, 2, 1, date '2020-06-05' from dual union all
select 7, 111, 1, 2, date '2020-09-14' from dual union all
select 8, 111, 2, 1, date '2020-09-17' from dual union all
SELECT 9, 222, 1, 2, date '2019-12-31' from dual UNION ALL
SELECT 10, 222, 2, 2, date '2020-12-01' from dual UNION ALL
SELECT 11, 222, 2, 2, date '2021-01-02' from dual
),
year_span (n) as (
select 1 from dual union all
select 2 from dual
),
H1 AS (
SELECT ROW_NUMBER() OVER (PARTITION BY PersonID ORDER BY RecordID) PID, H.*
FROM ScheduleHistory H
),
H2 as (
SELECT
H1.*, H2.DateChanged DateChanged2,
EXTRACT(YEAR FROM H2.DateChanged) - EXTRACT(YEAR FROM H1.DateChanged) + 1 Y,
trunc(H2.DateChanged,'YEAR') Y2
FROM H1 H1
LEFT JOIN H1 H2 ON H1.PID = H2.PID-1 AND H1.PersonID = H2.PersonID
),
H3 AS (
SELECT Y, N, H2.PID, H2.RecordID, H2.PersonID, H2.NextStatusID,
CASE WHEN Y=1 THEN H2.DateChanged ELSE CASE WHEN N=1 THEN H2.DateChanged ELSE Y2 END END D1,
CASE WHEN Y=1 THEN H2.DateChanged2 ELSE CASE WHEN N=1 THEN Y2 ELSE H2.DateChanged2 END END D2
FROM H2
JOIN year_span N ON N.N <=Y
),
H AS (
SELECT PersonID, NextStatusID, EXTRACT(year FROM d1) Y, d2-d1 D
FROM H3
)
select PersonID, sdescr Status, Y, sum(d) d
from H
join statuses s on NextStatusID = s.sid
group by PersonID, sdescr, Y
order by PersonID, sdescr, Y
Output:
PersonID Status             Y      d
111      Active             2020 177
111      Other              2020   6
111      Out of the Office  2020   7
222      Out of the Office  2019   1
222      Out of the Office  2020 366
222      Out of the Office  2021   1

SQL: Create multiple rows for a record based on months between two dates

My table has records like the one below, for different IDs and different start and end dates:
ID, Startdate, Enddate
1, 2017-02-14, 2018-11-05
I want to write SQL, without using a date dimension table, that gives the output below: basically one record for each month between the start and end dates.
1, 2017, 02
1, 2017, 03
1, 2017, 04
1, 2017, 05
1, 2017, 06
1, 2017, 07
1, 2017, 08
1, 2017, 09
1, 2017, 10
1, 2017, 11
1, 2017, 12
1, 2018, 01
1, 2018, 02
1, 2018, 03
1, 2018, 04
1, 2018, 05
1, 2018, 06
1, 2018, 07
1, 2018, 08
1, 2018, 09
1, 2018, 10
1, 2018, 11
Please use the query example below:
-- MySQL syntax: user variables start with @ (a # would begin a comment)
set @start_date = '2017-02-14';
set @end_date = LAST_DAY('2018-11-05');
WITH RECURSIVE date_range AS
(
    select MONTH(@start_date) as month_,
           YEAR(@start_date) as year_,
           DATE_ADD(@start_date, INTERVAL 1 MONTH) as next_month_date
    UNION
    SELECT MONTH(dr.next_month_date) as month_,
           YEAR(dr.next_month_date) as year_,
           DATE_ADD(dr.next_month_date, INTERVAL 1 MONTH) as next_month_date
    FROM date_range dr
    where next_month_date <= @end_date
)
select month_, year_ from date_range
order by next_month_date desc
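The recursion above steps forward one month at a time until it passes the end date. The same stepping logic, written as a small Python sketch, may make the loop easier to see. The function name months_between is invented for this example and is unrelated to the SQL function of that name.

```python
from datetime import date


def months_between(start, end):
    """Yield (year, month) for every calendar month from start to end, inclusive."""
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        yield year, month
        # Step to the next month, rolling over into January when needed.
        year, month = (year + 1, 1) if month == 12 else (year, month + 1)


months = list(months_between(date(2017, 2, 14), date(2018, 11, 5)))
for year, month in months:
    print(f"1, {year}, {month:02d}")
```

Comparing (year, month) tuples avoids any dependence on the day of the month, which is why the SQL version pushes its end date to the last day of the final month.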
This is what I did and it worked like a charm:
-- sample data
WITH table_data
AS (
SELECT 1 AS id
,cast('2017-08-14' AS DATE) AS start_dt
,cast('2018-12-16' AS DATE) AS end_dt
UNION ALL
SELECT 2 AS id
,cast('2017-09-14' AS DATE) AS start_dt
,cast('2019-01-16' AS DATE) AS end_dt
)
-- find minimum date from the data
,starting_date (start_date)
AS (
SELECT min(start_dt)
FROM TABLE_DATA
)
--get all months between min and max dates
,all_dates
AS (
SELECT last_day(add_months(date_trunc('month', start_date), idx * 1)) month_date
FROM starting_date
CROSS JOIN _v_vector_idx
WHERE month_date <= add_months(start_date, abs(months_between((
SELECT min(start_dt) FROM TABLE_DATA), (SELECT max(end_dt) FROM TABLE_DATA))) + 1)
ORDER BY month_date
)
SELECT id
,extract(year FROM month_date)
,extract(month FROM month_date)
,td.start_dt
,td.end_dt
FROM table_data td
INNER JOIN all_dates ad
ON ad.month_date > td.start_dt
AND ad.month_date <= last_day(td.end_dt)
ORDER BY 1
,2
You have to generate the dates and then pick the year and month from them:
select distinct year(date),month( date) from
(select * from (
select
date_add('2017-02-14 00:00:00.000', INTERVAL n5.num*10000+n4.num*1000+n3.num*100+n2.num*10+n1.num DAY ) as date
from
(select 0 as num
union all select 1
union all select 2
union all select 3
union all select 4
union all select 5
union all select 6
union all select 7
union all select 8
union all select 9) n1,
(select 0 as num
union all select 1
union all select 2
union all select 3
union all select 4
union all select 5
union all select 6
union all select 7
union all select 8
union all select 9) n2,
(select 0 as num
union all select 1
union all select 2
union all select 3
union all select 4
union all select 5
union all select 6
union all select 7
union all select 8
union all select 9) n3,
(select 0 as num
union all select 1
union all select 2
union all select 3
union all select 4
union all select 5
union all select 6
union all select 7
union all select 8
union all select 9) n4,
(select 0 as num
union all select 1
union all select 2
union all select 3
union all select 4
union all select 5
union all select 6
union all select 7
union all select 8
union all select 9) n5
) a
where date >'2017-02-14 00:00:00.000' and date < '2018-11-05'
) as t

Select Query Oracle

My Table Structure is like below:
Carrier Terminal timestamp1
1 1 21-Mar-17
2 101 21-Mar-17
3 2 21-Mar-17
4 202 21-Mar-17
5 3 21-Mar-17
6 303 21-Mar-17
where carrier
flight 1,2 = Delta
flight 3,4 = Air France
flight 5,6 = Lufthanse
and
Terminal 1,101 = T1
terminal 2,202 = T2
terminal 3,303 = T3
I am trying output like below:
count(Delta), count(Air France), count(Lufthansa), terminal as column output
2, 0, 0, T1
0, 2, 0, T2
0, 0, 2, T3
I have started like this
select count(Delta), count(Air France), count(Lufthansa), terminal
from table_name
where timestamp between '01-Mar-18 07.00.00.000000 AM' and '30-Mar-18 07.59.59.999999 AM'
I am trying to write a query that counts the different carriers flown on a particular day for each terminal.
Any advice will be highly appreciated.
I'm making a whole lot of assumptions for this to work... I've extracted all the rules you've mentioned in your text and I've assumed that those structures are already in place. Otherwise, let us know :)
with flights(carrier, terminal, departure) as(
select 1, 1, timestamp '2017-03-01 01:00:00' from dual union all
select 2, 101, timestamp '2017-03-01 02:00:00' from dual union all
select 3, 2, timestamp '2017-03-01 03:00:00' from dual union all
select 4, 202, timestamp '2017-03-01 04:00:00' from dual union all
select 5, 3, timestamp '2017-03-01 05:00:00' from dual union all
select 6, 303, timestamp '2017-03-01 06:00:00' from dual
)
,carriers(carrier, carrier_name) as(
select 1, 'Delta' from dual union all
select 2, 'Delta' from dual union all
select 3, 'Air France' from dual union all
select 4, 'Air France' from dual union all
select 5, 'Lufthanse' from dual union all
select 6, 'Lufthanse' from dual
)
,terminals(terminal, terminal_name) as(
select 1, 'T1' from dual union all
select 101, 'T1' from dual union all
select 2, 'T2' from dual union all
select 202, 'T2' from dual union all
select 3, 'T3' from dual union all
select 303, 'T3' from dual
)
select terminal_name
,count(case when carrier_name = 'Delta' then 1 end) as "Delta"
,count(case when carrier_name = 'Air France' then 1 end) as "Air France"
,count(case when carrier_name = 'Lufthanse' then 1 end) as "Lufthanse"
from flights f
join carriers c using(carrier)
join terminals t using(terminal)
where departure >= timestamp '2017-03-01 00:00:00'
and departure < timestamp '2017-04-01 00:00:00'
group by terminal_name
order by terminal_name;
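The COUNT(CASE WHEN ... THEN 1 END) pivot is standard SQL, so it can be tried without an Oracle instance. Below is a small sketch that runs the same pattern through Python's built-in sqlite3 module. The choice of SQLite, and storing timestamps as text, are assumptions of this example; table and column names follow the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE flights (carrier INTEGER, terminal INTEGER, departure TEXT);
    INSERT INTO flights VALUES
        (1, 1,   '2017-03-01 01:00:00'),
        (2, 101, '2017-03-01 02:00:00'),
        (3, 2,   '2017-03-01 03:00:00'),
        (4, 202, '2017-03-01 04:00:00'),
        (5, 3,   '2017-03-01 05:00:00'),
        (6, 303, '2017-03-01 06:00:00');

    CREATE TABLE carriers (carrier INTEGER, carrier_name TEXT);
    INSERT INTO carriers VALUES
        (1, 'Delta'), (2, 'Delta'),
        (3, 'Air France'), (4, 'Air France'),
        (5, 'Lufthanse'), (6, 'Lufthanse');

    CREATE TABLE terminals (terminal INTEGER, terminal_name TEXT);
    INSERT INTO terminals VALUES
        (1, 'T1'), (101, 'T1'),
        (2, 'T2'), (202, 'T2'),
        (3, 'T3'), (303, 'T3');
    """
)

# CASE returns NULL when the carrier does not match, and COUNT ignores NULLs,
# so each column counts only its own carrier within the terminal group.
rows = conn.execute(
    """
    SELECT terminal_name,
           COUNT(CASE WHEN carrier_name = 'Delta'      THEN 1 END) AS "Delta",
           COUNT(CASE WHEN carrier_name = 'Air France' THEN 1 END) AS "Air France",
           COUNT(CASE WHEN carrier_name = 'Lufthanse'  THEN 1 END) AS "Lufthanse"
    FROM flights f
    JOIN carriers c USING (carrier)
    JOIN terminals t USING (terminal)
    WHERE departure >= '2017-03-01 00:00:00'
      AND departure <  '2017-04-01 00:00:00'
    GROUP BY terminal_name
    ORDER BY terminal_name
    """
).fetchall()

for row in rows:
    print(row)
```

The half-open range (>= start, < end) avoids the boundary problems that BETWEEN has with fractional seconds.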
with
t ( flight, gate, ts ) as (
select 1, 1, to_timestamp('21-Mar-17', 'dd-Mon-rr') from dual union all
select 2, 101, to_timestamp('21-Mar-17', 'dd-Mon-rr') from dual union all
select 3, 2, to_timestamp('21-Mar-17', 'dd-Mon-rr') from dual union all
select 4, 202, to_timestamp('21-Mar-17', 'dd-Mon-rr') from dual union all
select 5, 3, to_timestamp('21-Mar-17', 'dd-Mon-rr') from dual union all
select 6, 303, to_timestamp('21-Mar-17', 'dd-Mon-rr') from dual
)
-- End of simulated inputs (for testing only, not part of the solution).
-- SQL query begins below this line. Use your actual table and column names.
select count (case when flight in (1, 2) then 1 end) as delta
, count (case when flight in (3, 4) then 1 end) as air_france
, count (case when flight in (5, 6) then 1 end) as lufthansa
, case when gate in (1, 101) then 'T1'
when gate in (2, 202) then 'T2'
when gate in (3, 303) then 'T3' end as terminal
from t
where ts >= timestamp '2017-03-21 00:00:00' and ts < timestamp '2017-03-22 00:00:00'
group by case when gate in (1, 101) then 'T1'
when gate in (2, 202) then 'T2'
when gate in (3, 303) then 'T3' end
order by terminal
;
DELTA AIR_FRANCE LUFTHANSA TERMINAL
---------- ---------- ---------- --------
2 0 0 T1
0 2 0 T2
0 0 2 T3

Oracle: Identifying peak values in a time series

I have the following values in one column of a table. The table has two columns; the other column holds distinct dates in descending order.
3
4
3
21
4
4
-1
3
21
-1
4
4
8
3
3
-1
21
-1
4
The graph will be
I need only the peaks (highlighted with circles in the graph) in the output:
4
21
21
8
21
4
SQL Fiddle
Oracle 11g R2 Schema Setup:
CREATE TABLE TEST ( datetime, value ) AS
SELECT DATE '2015-01-01', 3 FROM DUAL
UNION ALL SELECT DATE '2015-01-02', 4 FROM DUAL
UNION ALL SELECT DATE '2015-01-03', 3 FROM DUAL
UNION ALL SELECT DATE '2015-01-04', 21 FROM DUAL
UNION ALL SELECT DATE '2015-01-05', 4 FROM DUAL
UNION ALL SELECT DATE '2015-01-06', 4 FROM DUAL
UNION ALL SELECT DATE '2015-01-07', -1 FROM DUAL
UNION ALL SELECT DATE '2015-01-08', 3 FROM DUAL
UNION ALL SELECT DATE '2015-01-09', 21 FROM DUAL
UNION ALL SELECT DATE '2015-01-10', -1 FROM DUAL
UNION ALL SELECT DATE '2015-01-11', 4 FROM DUAL
UNION ALL SELECT DATE '2015-01-12', 4 FROM DUAL
UNION ALL SELECT DATE '2015-01-13', 8 FROM DUAL
UNION ALL SELECT DATE '2015-01-14', 3 FROM DUAL
UNION ALL SELECT DATE '2015-01-15', 3 FROM DUAL
UNION ALL SELECT DATE '2015-01-16', -1 FROM DUAL
UNION ALL SELECT DATE '2015-01-17', 21 FROM DUAL
UNION ALL SELECT DATE '2015-01-18', -1 FROM DUAL
UNION ALL SELECT DATE '2015-01-19', 4 FROM DUAL
Query 1:
SELECT datetime, value
FROM (
SELECT datetime,
LAG( value ) OVER ( ORDER BY datetime ) AS prv,
value,
LEAD( value ) OVER ( ORDER BY datetime ) AS nxt
FROM test
)
WHERE ( prv IS NULL OR prv < value )
AND ( nxt IS NULL OR nxt < value )
Results:
| DATETIME | VALUE |
|---------------------------|-------|
| January, 02 2015 00:00:00 | 4 |
| January, 04 2015 00:00:00 | 21 |
| January, 09 2015 00:00:00 | 21 |
| January, 13 2015 00:00:00 | 8 |
| January, 17 2015 00:00:00 | 21 |
| January, 19 2015 00:00:00 | 4 |
So a peak is defined as a row whose previous and next values are both less than the current value, and you can retrieve the previous and next values using the LAG() and LEAD() functions.
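The same neighbour comparison can be written as a short Python sketch, which is handy for checking the rule against the sample values. The function name find_peaks is invented for this example; the list index plays the role of the ORDER BY datetime ordering.

```python
# Values from the question, in date order.
values = [3, 4, 3, 21, 4, 4, -1, 3, 21, -1, 4, 4, 8, 3, 3, -1, 21, -1, 4]


def find_peaks(series):
    """Return values strictly greater than both neighbours.

    A missing neighbour (first or last row) counts as lower, mirroring
    the "prv IS NULL OR prv < value" conditions in the query.
    """
    peaks = []
    for i, value in enumerate(series):
        prv = series[i - 1] if i > 0 else None                # LAG(value)
        nxt = series[i + 1] if i + 1 < len(series) else None  # LEAD(value)
        if (prv is None or prv < value) and (nxt is None or nxt < value):
            peaks.append(value)
    return peaks


print(find_peaks(values))
```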
You really need some other column (e.g. my_date) to define the order of the rows, then you can:
select my_date,
       value
from (select my_date,
             value,
             lag(value)  over (order by my_date) lag_value,
             lead(value) over (order by my_date) lead_value
      from my_table)
where value > coalesce(lag_value , value - 1) and
      value > coalesce(lead_value, value - 1);
This would not allow for a "double-peak" such as:
1,
15,
15,
4
... for which much more complex logic would be needed.
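One possible way to handle such plateaus, sketched here in Python rather than SQL, is to collapse each run of equal consecutive values into a single point before applying the strict comparison. This is only one interpretation of what a "double peak" should mean, and the function name is made up for the example.

```python
from itertools import groupby


def find_peaks_with_plateaus(series):
    """Treat each run of equal consecutive values as one point, then apply the strict test."""
    collapsed = [value for value, _run in groupby(series)]
    peaks = []
    for i, value in enumerate(collapsed):
        # A missing neighbour (first or last point) counts as lower.
        lower_before = i == 0 or collapsed[i - 1] < value
        lower_after = i == len(collapsed) - 1 or collapsed[i + 1] < value
        if lower_before and lower_after:
            peaks.append(value)
    return peaks


print(find_peaks_with_plateaus([1, 15, 15, 4]))
```

On the question's data this returns the same six peaks as the strict rule, because none of the repeated values there sits above both of its neighbouring runs.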
Just for completeness the row pattern matching example:
WITH source_data(datetime, value) AS (
SELECT DATE '2015-01-01', 3 FROM DUAL UNION ALL
SELECT DATE '2015-01-02', 4 FROM DUAL UNION ALL
SELECT DATE '2015-01-03', 3 FROM DUAL UNION ALL
SELECT DATE '2015-01-04', 21 FROM DUAL UNION ALL
SELECT DATE '2015-01-05', 4 FROM DUAL UNION ALL
SELECT DATE '2015-01-06', 4 FROM DUAL UNION ALL
SELECT DATE '2015-01-07', -1 FROM DUAL UNION ALL
SELECT DATE '2015-01-08', 3 FROM DUAL UNION ALL
SELECT DATE '2015-01-09', 21 FROM DUAL UNION ALL
SELECT DATE '2015-01-10', -1 FROM DUAL UNION ALL
SELECT DATE '2015-01-11', 4 FROM DUAL UNION ALL
SELECT DATE '2015-01-12', 4 FROM DUAL UNION ALL
SELECT DATE '2015-01-13', 8 FROM DUAL UNION ALL
SELECT DATE '2015-01-14', 3 FROM DUAL UNION ALL
SELECT DATE '2015-01-15', 3 FROM DUAL UNION ALL
SELECT DATE '2015-01-16', -1 FROM DUAL UNION ALL
SELECT DATE '2015-01-17', 21 FROM DUAL UNION ALL
SELECT DATE '2015-01-18', -1 FROM DUAL UNION ALL
SELECT DATE '2015-01-19', 4 FROM DUAL
)
SELECT *
FROM
source_data MATCH_RECOGNIZE (
ORDER BY datetime
MEASURES
LAST(UP.datetime) AS datetime,
LAST(UP.value) AS value
ONE ROW PER MATCH
PATTERN ((UP DOWN) | UP$)
DEFINE
DOWN AS DOWN.value < PREV(DOWN.value),
UP AS UP.value > PREV(UP.value)
)
ORDER BY
datetime
There is a much more sophisticated method available in Oracle 12c, which is to use pattern matching SQL.
http://docs.oracle.com/database/121/DWHSG/pattern.htm#DWHSG8966
It would be overkill for a situation like this, but if you needed more complex patterns matched, such as W shaped patterns, then it would be worth investigating.
Using the LAG and LEAD functions you can compare values from different rows. I assume the result set you showed is ordered by another column named position.
select value
from
  (select value,
          lag(value)  over (order by position) prev_value,
          lead(value) over (order by position) next_value
   from my_table)
where value > prev_value
and value > next_value