Oracle: Generate rows for missing years within range (sysyear + 9)

I have an Oracle 18c table that has rows for certain years:
with data (year_, amount) as (
select 2024, 100 from dual union all
select 2025, 200 from dual union all
select 2025, 300 from dual union all
select 2026, 400 from dual union all
select 2027, 500 from dual union all
select 2028, 600 from dual union all
select 2028, 700 from dual union all
select 2028, 800 from dual union all
select 2029, 900 from dual union all
select 2031, 100 from dual
)
select
*
from
data
YEAR_ AMOUNT
---------- ----------
2024 100
2025 200
2025 300
2026 400
2027 500
2028 600
2028 700
2028 800
2029 900
2031 100
I want at least one row for each year within this range: sysyear + 9. In other words, I want rows for 10 years, starting with the current year (currently 2023).
I'm missing rows for certain years: 2023, 2030, and 2032. So I want to generate filler rows for those missing years. The amount for the filler rows would be null.
It would look like this:
YEAR_ AMOUNT
---------- ----------
2023 --filler
2024 100
2025 200
2025 300
2026 400
2027 500
2028 600
2028 700
2028 800
2029 900
2030 --filler
2031 100
2032 --filler
In an Oracle SQL query, how can I select the rows and generate filler rows within the 10 year range?
Edit: I would prefer not to manually create a list of years in the query or in a table. I would rather create a dynamic range within the query.

Try it like this:
Select y.YEAR_, t.AMOUNT
From (Select EXTRACT(YEAR From SYSDATE) + LEVEL - 1 "YEAR_" From Dual Connect By LEVEL <= 10) y
Left Join tbl t ON(t.YEAR_ = y.YEAR_)
Order By y.YEAR_, t.AMOUNT
With your sample data:
WITH
tbl (YEAR_, AMOUNT) AS
(
Select 2024, 100 From Dual Union All
Select 2025, 200 From Dual Union All
Select 2025, 300 From Dual Union All
Select 2026, 400 From Dual Union All
Select 2027, 500 From Dual Union All
Select 2028, 600 From Dual Union All
Select 2028, 700 From Dual Union All
Select 2028, 800 From Dual Union All
Select 2029, 900 From Dual Union All
Select 2031, 100 From Dual
)
... the result is:
YEAR_ AMOUNT
---------- ----------
2023
2024 100
2025 200
2025 300
2026 400
2027 500
2028 600
2028 700
2028 800
2029 900
2030
2031 100
2032

I broke out the CTE into multiple bits to help explain, but this should do the trick
with data (year_, amount) as (
select 2024, 100 from dual union all
select 2025, 200 from dual union all
select 2025, 300 from dual union all
select 2026, 400 from dual union all
select 2027, 500 from dual union all
select 2028, 600 from dual union all
select 2028, 700 from dual union all
select 2028, 800 from dual union all
select 2029, 900 from dual union all
select 2031, 100 from dual
),
boundaries as (
select max(year_) maxy, min(year_) miny
from data
),
all_the_years as
( select miny+rownum-1 yr
from boundaries
connect by level <= maxy-miny+1
)
select *
from all_the_years a
left outer join data d
on ( a.yr = d.year_ )
order by 1;
YR YEAR_ AMOUNT
---------- ---------- ----------
2024 2024 100
2025 2025 200
2025 2025 300
2026 2026 400
2027 2027 500
2028 2028 700
2028 2028 800
2028 2028 600
2029 2029 900
2030
2031 2031 100
11 rows selected.
If it's fixed at 10 years, then you don't need the MAX: just use connect by level <= 10.
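For the fixed 10-year case, the boundaries CTE can be dropped and the list of years built from the current year. The following is a minimal sketch of that idea using the sample data from the question; the CTE name ten_years is only an illustrative choice.

```sql
with data (year_, amount) as (
  select 2024, 100 from dual union all
  select 2025, 200 from dual union all
  select 2025, 300 from dual union all
  select 2026, 400 from dual union all
  select 2027, 500 from dual union all
  select 2028, 600 from dual union all
  select 2028, 700 from dual union all
  select 2028, 800 from dual union all
  select 2029, 900 from dual union all
  select 2031, 100 from dual
),
-- ten_years (illustrative name): one row per year, starting with the current year
ten_years as (
  select extract(year from sysdate) + level - 1 as yr
  from dual
  connect by level <= 10
)
-- Years without a match in data come back with a null amount (the filler rows)
select y.yr as year_, d.amount
from ten_years y
left outer join data d
  on d.year_ = y.yr
order by y.yr, d.amount;
```

Because the range is derived from sysdate, the set of filler years shifts as the current year changes.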

Related

How to perform rolling sum in BigQuery

I have sample data in BigQuery as -
with temp as (
select DATE("2016-10-02") date_field , 200 as salary
union all
select DATE("2016-10-09"), 500
union all
select DATE("2016-10-16"), 350
union all
select DATE("2016-10-23"), 400
union all
select DATE("2016-10-30"), 190
union all
select DATE("2016-11-06"), 550
union all
select DATE("2016-11-13"), 610
union all
select DATE("2016-11-20"), 480
union all
select DATE("2016-11-27"), 660
union all
select DATE("2016-12-04"), 690
union all
select DATE("2016-12-11"), 810
union all
select DATE("2016-12-18"), 950
union all
select DATE("2016-12-25"), 1020
union all
select DATE("2017-01-01"), 680
) ,
temp2 as (
select * , DATE("2017-01-01") as current_date
from temp
)
select * from temp2
I want to perform a rolling sum on this table. As an example, I have set the current date to 2017-01-01. With this as the current date, I want to go back 30 days and take the sum of the salary field. Hence, with 2017-01-01 being the current date, the total that should be returned is for the month of December 2016, which is 690+810+950+1020. How can I do this using Standard SQL?
Below is a BigQuery Standard SQL query for a rolling SUM over the last 30 days:
#standardSQL
SELECT *,
SUM(salary) OVER(
ORDER BY UNIX_DATE(date_field)
RANGE BETWEEN 30 PRECEDING AND 1 PRECEDING
) AS rolling_30_days_sum
FROM `project.dataset.your_table`
You can test and play with the above using the sample data from your question, as below:
#standardSQL
WITH temp AS (
SELECT DATE("2016-10-02") date_field , 200 AS salary UNION ALL
SELECT DATE("2016-10-09"), 500 UNION ALL
SELECT DATE("2016-10-16"), 350 UNION ALL
SELECT DATE("2016-10-23"), 400 UNION ALL
SELECT DATE("2016-10-30"), 190 UNION ALL
SELECT DATE("2016-11-06"), 550 UNION ALL
SELECT DATE("2016-11-13"), 610 UNION ALL
SELECT DATE("2016-11-20"), 480 UNION ALL
SELECT DATE("2016-11-27"), 660 UNION ALL
SELECT DATE("2016-12-04"), 690 UNION ALL
SELECT DATE("2016-12-11"), 810 UNION ALL
SELECT DATE("2016-12-18"), 950 UNION ALL
SELECT DATE("2016-12-25"), 1020 UNION ALL
SELECT DATE("2017-01-01"), 680
)
SELECT *,
SUM(salary) OVER(
ORDER BY UNIX_DATE(date_field)
RANGE BETWEEN 30 PRECEDING AND 1 PRECEDING
) AS rolling_30_days_sum
FROM temp
-- ORDER BY date_field
with result
Row date_field salary rolling_30_days_sum
1 2016-10-02 200 null
2 2016-10-09 500 200
3 2016-10-16 350 700
4 2016-10-23 400 1050
5 2016-10-30 190 1450
6 2016-11-06 550 1440
7 2016-11-13 610 1490
8 2016-11-20 480 1750
9 2016-11-27 660 1830
10 2016-12-04 690 2300
11 2016-12-11 810 2440
12 2016-12-18 950 2640
13 2016-12-25 1020 3110
14 2017-01-01 680 3470
This is not exactly a "rolling sum", but it's the exact answer to "I want to go back 30 days and take sum of salary field. Hence, with 2017-01-01 being the current date, the total that should be returned is for the month of December"
with temp as (
select DATE("2016-10-02") date_field , 200 as salary
union all
select DATE("2016-10-09"), 500
union all
select DATE("2016-10-16"), 350
union all
select DATE("2016-10-23"), 400
union all
select DATE("2016-10-30"), 190
union all
select DATE("2016-11-06"), 550
union all
select DATE("2016-11-13"), 610
union all
select DATE("2016-11-20"), 480
union all
select DATE("2016-11-27"), 660
union all
select DATE("2016-12-04"), 690
union all
select DATE("2016-12-11"), 810
union all
select DATE("2016-12-18"), 950
union all
select DATE("2016-12-25"), 1020
union all
select DATE("2017-01-01"), 680
) ,
temp2 as (
select * , DATE("2017-01-01") as current_date_x
from temp
)
select SUM(salary)
from temp2
WHERE date_field BETWEEN DATE_SUB(current_date_x, INTERVAL 30 DAY) AND DATE_SUB(current_date_x, INTERVAL 1 DAY)
3470
Note that I wasn't able to use current_date as a variable name, as it gets replaced by the actual current date.
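One way to sidestep the naming clash, assuming BigQuery scripting is available in your environment, is to hold the reference date in a script variable rather than in a column. The variable name as_of_date below is a hypothetical choice, and the sample is trimmed to the rows around the window.

```sql
-- as_of_date is a hypothetical variable name; anything other than current_date will do
DECLARE as_of_date DATE DEFAULT DATE '2017-01-01';

WITH temp AS (
  SELECT DATE '2016-11-27' AS date_field, 660 AS salary UNION ALL
  SELECT DATE '2016-12-04', 690 UNION ALL
  SELECT DATE '2016-12-11', 810 UNION ALL
  SELECT DATE '2016-12-18', 950 UNION ALL
  SELECT DATE '2016-12-25', 1020 UNION ALL
  SELECT DATE '2017-01-01', 680
)
-- Sum the 30 days before as_of_date, excluding as_of_date itself
SELECT SUM(salary) AS last_30_days_sum
FROM temp
WHERE date_field BETWEEN DATE_SUB(as_of_date, INTERVAL 30 DAY)
                     AND DATE_SUB(as_of_date, INTERVAL 1 DAY);
```

For these rows the window covers 2016-12-02 through 2016-12-31, so the sum is 690+810+950+1020 = 3470, the same total as above.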

Oracle Query to rollup QTY by Year

I have a requirement to find the maximum value of the sum of quantities divided by the number of years (I need to write an Oracle query).
For Example
ITEM_ID ORG_ID YEAR QTY
100 121 2015 10
100 121 2016 5
100 121 2017 8
101 146 2014 10
101 146 2015 11
101 146 2016 12
101 146 2017 13
My output should be like this:
for Item_id 100,121 the max_avg should be max((10+5+8)/3, (5+8)/2, 8/1)... max(7.67, 6.5, 8) = 8
for Item_id 101,146 the max_avg should be max((10+11+12+13)/4, (11+12+13)/3, (12+13)/2, 13/1)... max(11.5, 12, 12.5, 13) = 13
ITEM_ID ORG_ID YEAR QTY MAX_AVG
100 121 2015 10 8
100 121 2016 5 8
100 121 2017 8 8
101 146 2014 10 13
101 146 2015 11 13
101 146 2016 12 13
101 146 2017 13 13
Any help would be greatly appreciated.
You need two layers of analytic functions: You need analytic MAX (of something) because you want to return all rows from the original table; and within the MAX you need analytic (rolling) average. Analytic functions can't be nested, so you need a subquery and an outer query. Something like this:
with inputs ( item_id, org_id, yr, qty ) as (
select 100, 121, 2015, 10 from dual
union all select 100, 121, 2016, 5 from dual
union all select 100, 121, 2017, 8 from dual
union all select 101, 146, 2014, 10 from dual
union all select 101, 146, 2015, 11 from dual
union all select 101, 146, 2016, 12 from dual
union all select 101, 146, 2017, 13 from dual
)
-- End of simulated inputs (for testing only, not part of the solution).
-- SQL query begins BELOW THIS LINE. Use your actual table and column names.
select item_id, org_id, yr, qty,
max(forward_avg) over ( partition by item_id, org_id ) as max_avg
from ( select item_id, org_id, yr, qty,
avg(qty) over ( partition by item_id, org_id
order by yr desc ) as forward_avg
from inputs i
) b
order by item_id, org_id, yr -- If needed
;
ITEM_ID ORG_ID YR QTY MAX_AVG
---------- ---------- ---------- ---------- ----------
100 121 2015 10 8
100 121 2016 5 8
100 121 2017 8 8
101 146 2014 10 13
101 146 2015 11 13
101 146 2016 12 13
101 146 2017 13 13
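To see what the inner layer contributes, the subquery can be run on its own. This is just the inner query from the answer above with the same simulated inputs; nothing new is assumed beyond running it standalone.

```sql
with inputs ( item_id, org_id, yr, qty ) as (
            select 100, 121, 2015, 10 from dual
  union all select 100, 121, 2016,  5 from dual
  union all select 100, 121, 2017,  8 from dual
  union all select 101, 146, 2014, 10 from dual
  union all select 101, 146, 2015, 11 from dual
  union all select 101, 146, 2016, 12 from dual
  union all select 101, 146, 2017, 13 from dual
)
-- With ORDER BY yr DESC and the default window, each row averages its own
-- quantity together with the quantities of all later years in the partition.
select item_id, org_id, yr, qty,
       avg(qty) over ( partition by item_id, org_id
                       order by yr desc ) as forward_avg
from   inputs
order by item_id, org_id, yr;
-- forward_avg for item 100: 2015 -> (10+5+8)/3, 2016 -> (5+8)/2 = 6.5, 2017 -> 8
-- forward_avg for item 101: 2014 -> 11.5, 2015 -> 12, 2016 -> 12.5, 2017 -> 13
```

The outer MAX(...) OVER (PARTITION BY item_id, org_id) then picks 8 and 13 respectively and repeats them on every row of the partition.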

How to finish this LAG calculation in Oracle

I have month and value columns in a table, like:
Month Value Market
2010/01 100 1
2010/02 200 1
2010/03 300 1
2010/04 400 1
2010/05 500 1
2010/01 100 2
2010/02 200 2
2010/03 300 2
2010/04 400 2
2010/05 500 2
What I want to do is compute a new value for each month as (value in month n-1 + value in month n) / 2. The calculation is done per market, that is, grouped by the market number. So, for the above example, the new month and value combinations should be:
Month Value Market
2010/01 null 1
2010/02 (100+200)/2 1
2010/03 (200+300)/2 1
2010/04 (300+400)/2 1
2010/05 (400+500)/2 1
2010/01 null 2
2010/02 (100+200)/2 2
2010/03 (200+300)/2 2
2010/04 (300+400)/2 2
2010/05 (400+500)/2 2
Do you know how to achieve it in Oracle? Thank you!
If there is no gap in your data, you can use LAG:
WITH DATA AS (
SELECT DATE '2010-01-01' mon, 100 val FROM dual UNION ALL
SELECT DATE '2010-02-01' mon, 200 val FROM dual UNION ALL
SELECT DATE '2010-03-01' mon, 300 val FROM dual UNION ALL
SELECT DATE '2010-04-01' mon, 400 val FROM dual UNION ALL
SELECT DATE '2010-05-01' mon, 500 val FROM dual
)
SELECT mon, (LAG(val) OVER (ORDER BY mon) + val) / 2 avg_val FROM DATA;
MON AVG_VAL
----------- ----------
01/01/2010
01/02/2010 150
01/03/2010 250
01/04/2010 350
01/05/2010 450
However, if there is a gap the result might not be what you expect. In that case, you can either use a self-join or narrow the windowing clause:
WITH DATA AS (
SELECT DATE '2010-01-01' mon, 100 val FROM dual UNION ALL
SELECT DATE '2010-02-01' mon, 200 val FROM dual UNION ALL
SELECT DATE '2010-03-01' mon, 300 val FROM dual UNION ALL
/* gap ! */
SELECT DATE '2010-05-01' mon, 400 val FROM dual UNION ALL
SELECT DATE '2010-06-01' mon, 500 val FROM dual
)
SELECT mon, (first_value(val)
OVER (ORDER BY mon
RANGE BETWEEN INTERVAL '1' MONTH PRECEDING
AND INTERVAL '1' MONTH PRECEDING)
+ val) / 2 avg_val
FROM DATA;
MON AVG_VAL
----------- ----------
01/01/2010
01/02/2010 150
01/03/2010 250
01/05/2010
01/06/2010 450
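The self-join alternative mentioned above is not shown in the answer. A minimal sketch of it, on the same gapped sample data, might look like this; the aliases cur and prev are illustrative.

```sql
with data as (
  select date '2010-01-01' mon, 100 val from dual union all
  select date '2010-02-01' mon, 200 val from dual union all
  select date '2010-03-01' mon, 300 val from dual union all
  /* gap: no row for 2010-04 */
  select date '2010-05-01' mon, 400 val from dual union all
  select date '2010-06-01' mon, 500 val from dual
)
-- Join each month to the row exactly one month earlier; when that row is
-- missing, prev.val is null and so is the average.
select cur.mon,
       (prev.val + cur.val) / 2 as avg_val
from   data cur
left outer join data prev
  on   prev.mon = add_months(cur.mon, -1)
order by cur.mon;
```

On this sample it gives null for January and May and 150, 250, 450 for the other months, matching the windowed version.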
This does it:
select month,
(value+lag(value) over (order by month))/2 as value
from t1
MONTH VALUE
---------- ----------
2010/01
2010/02 150
2010/03 250
2010/04 350
2010/05 450
5 rows selected.
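The question asks for the calculation to be done per market, which the queries above do not cover. One way to sketch that is to add PARTITION BY to the LAG window so the first month of each market has no predecessor. The column names mon, val and market are assumptions for illustration, and the data is assumed to have no gaps.

```sql
with data (mon, val, market) as (
  select date '2010-01-01', 100, 1 from dual union all
  select date '2010-02-01', 200, 1 from dual union all
  select date '2010-03-01', 300, 1 from dual union all
  select date '2010-04-01', 400, 1 from dual union all
  select date '2010-05-01', 500, 1 from dual union all
  select date '2010-01-01', 100, 2 from dual union all
  select date '2010-02-01', 200, 2 from dual union all
  select date '2010-03-01', 300, 2 from dual union all
  select date '2010-04-01', 400, 2 from dual union all
  select date '2010-05-01', 500, 2 from dual
)
-- LAG restarts within each market, so 2010-01 is null for both markets
select mon,
       (lag(val) over (partition by market order by mon) + val) / 2 as avg_val,
       market
from   data
order by market, mon;
```

Without the PARTITION BY, the January row of market 2 would be averaged with a row from market 1.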

Oracle SQL Analytic query - recursive spreadsheet-like running total

I have the following data, composed of the A value, ordered by MM (month).
The B column is computed as GREATEST(current value of A + previous value of B, 0) in a spreadsheet-like fashion.
How can I compute B using a SQL Query?
I tried using Analytic Functions, but I was unable to succeed.
I know there is the Model Clause; I found a similar example, but I don't know where to begin.
I am using Oracle 10g, therefore I cannot use recursive queries.
Here is my test data:
MM | A | B
-----------+--------+------
2012-01-01 | 800 | 800
2012-02-01 | 1900 | 2700
2012-03-01 | 1750 | 4450
2012-04-01 | -20000 | 0
2012-05-01 | 900 | 900
2012-06-01 | 3900 | 4800
2012-07-01 | -2600 | 2200
2012-08-01 | -2600 | 0
2012-09-01 | 2100 | 2100
2012-10-01 | -2400 | 0
2012-11-01 | 1100 | 1100
2012-12-01 | 1300 | 2400
And here is the "table definition":
select t.* from (
select date'2012-01-01' as mm, 800 as a from dual union all
select date'2012-02-01' as mm, 1900 as a from dual union all
select date'2012-03-01' as mm, 1750 as a from dual union all
select date'2012-04-01' as mm, -20000 as a from dual union all
select date'2012-05-01' as mm, 900 as a from dual union all
select date'2012-06-01' as mm, 3900 as a from dual union all
select date'2012-07-01' as mm, -2600 as a from dual union all
select date'2012-08-01' as mm, -2600 as a from dual union all
select date'2012-09-01' as mm, 2100 as a from dual union all
select date'2012-10-01' as mm, -2400 as a from dual union all
select date'2012-11-01' as mm, 1100 as a from dual union all
select date'2012-12-01' as mm, 1300 as a from dual
) t;
So let's unleash the MODEL clause (a device whose mystery is only exceeded by its power) on this problem:
with data as (
select date'2012-01-01' as mm, 800 as a from dual union all
select date'2012-02-01' as mm, 1900 as a from dual union all
select date'2012-03-01' as mm, 1750 as a from dual union all
select date'2012-04-01' as mm, -20000 as a from dual union all
select date'2012-05-01' as mm, 900 as a from dual union all
select date'2012-06-01' as mm, 3900 as a from dual union all
select date'2012-07-01' as mm, -2600 as a from dual union all
select date'2012-08-01' as mm, -2600 as a from dual union all
select date'2012-09-01' as mm, 2100 as a from dual union all
select date'2012-10-01' as mm, -2400 as a from dual union all
select date'2012-11-01' as mm, 1100 as a from dual union all
select date'2012-12-01' as mm, 1300 as a from dual
)
select mm, a, b
from (
-- Add a dummy value for b, making it available to the MODEL clause
select mm, a, 0 b
from data
)
-- Generate a ROW_NUMBER() dimension, in order to access rows by RN
model dimension by (row_number() over (order by mm) rn)
-- Spreadsheet values / measures involved in calculations are mm, a, b
measures (mm, a, b)
-- A single rule will do. Any value of B should be calculated according to
-- GREATEST([previous value of B] + [current value of A], 0)
rules (
b[any] = greatest(nvl(b[cv(rn) - 1], 0) + a[cv(rn)], 0)
)
The above yields:
MM A B
01.01.2012 800 800
01.02.2012 1900 2700
01.03.2012 1750 4450
01.04.2012 -20000 0
01.05.2012 900 900
01.06.2012 3900 4800
01.07.2012 -2600 2200
01.08.2012 -2600 0
01.09.2012 2100 2100
01.10.2012 -2400 0
01.11.2012 1100 1100
01.12.2012 1300 2400
I came up with a user-defined aggregate function
create or replace type tsum1 as object
(
total number,
static function ODCIAggregateInitialize(nctx IN OUT tsum1 )
return number,
member function ODCIAggregateIterate(self IN OUT tsum1 ,
value IN number )
return number,
member function ODCIAggregateTerminate(self IN tsum1,
retVal OUT number,
flags IN number)
return number,
member function ODCIAggregateMerge(self IN OUT tsum1,
ctx2 IN tsum1)
return number
)
/
create or replace type body tsum1
is
static function ODCIAggregateInitialize(nctx IN OUT tsum1)
return number
is
begin
nctx := tsum1(0);
return ODCIConst.Success;
end;
member function ODCIAggregateIterate(self IN OUT tsum1,
value IN number )
return number
is
begin
self.total := self.total + value;
if (self.total < 0) then
self.total := 0;
end if;
return ODCIConst.Success;
end;
member function ODCIAggregateTerminate(self IN tsum1,
retVal OUT number,
flags IN number)
return number
is
begin
retVal := self.total;
return ODCIConst.Success;
end;
member function ODCIAggregateMerge(self IN OUT tsum1,
ctx2 IN tsum1)
return number
is
begin
self.total := self.total + ctx2.total;
return ODCIConst.Success;
end;
end;
/
CREATE OR REPLACE FUNCTION sum1(input number)
RETURN number
PARALLEL_ENABLE AGGREGATE USING tsum1;
/
Here is the query
with T1 as(
select date'2012-01-01' as mm, 800 as a from dual union all
select date'2012-02-01' as mm, 1900 as a from dual union all
select date'2012-03-01' as mm, 1750 as a from dual union all
select date'2012-04-01' as mm, -20000 as a from dual union all
select date'2012-05-01' as mm, 900 as a from dual union all
select date'2012-06-01' as mm, 3900 as a from dual union all
select date'2012-07-01' as mm, -2600 as a from dual union all
select date'2012-08-01' as mm, -2600 as a from dual union all
select date'2012-09-01' as mm, 2100 as a from dual union all
select date'2012-10-01' as mm, -2400 as a from dual union all
select date'2012-11-01' as mm, 1100 as a from dual union all
select date'2012-12-01' as mm, 1300 as a from dual
)
select mm
, a
, sum1(a) over(order by mm) as b
from t1
Mm a b
----------------------------
01.01.2012 800 800
01.02.2012 1900 2700
01.03.2012 1750 4450
01.04.2012 -20000 0
01.05.2012 900 900
01.06.2012 3900 4800
01.07.2012 -2600 2200
01.08.2012 -2600 0
01.09.2012 2100 2100
01.10.2012 -2400 0
01.11.2012 1100 1100
01.12.2012 1300 2400
with sample_data as (
select date'2012-01-01' as mm, 800 as a from dual union all
select date'2012-02-01' as mm, 1900 as a from dual union all
select date'2012-03-01' as mm, 1750 as a from dual union all
select date'2012-04-01' as mm, -20000 as a from dual union all
select date'2012-05-01' as mm, 900 as a from dual union all
select date'2012-06-01' as mm, 3900 as a from dual union all
select date'2012-07-01' as mm, -2600 as a from dual union all
select date'2012-08-01' as mm, -2600 as a from dual union all
select date'2012-09-01' as mm, 2100 as a from dual union all
select date'2012-10-01' as mm, -2400 as a from dual union all
select date'2012-11-01' as mm, 1100 as a from dual union all
select date'2012-12-01' as mm, 1300 as a from dual
)
select mm,
a,
greatest(nvl(a,0) + lag(a,1,0) over (order by mm), 0) as b
from sample_data;
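However, it does not produce this line:
2012-05-01 | 900 | 900
because it calculates 900 - 20000 in that row, and zero is bigger than the result of that. The underlying issue is that LAG reads the previous value of A, not the previous value of B, so other rows differ too (2012-03-01 gives 1750 + 1900 = 3650 rather than 4450). Wrapping the lagged value in the abs function gets rid of the negative term, but it does not reproduce the expected B either (900 + 20000 = 20900 for 2012-05-01).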
Sorry if this is off topic, given the Oracle version of the question, but we can now use the SQL:2016 MATCH_RECOGNIZE clause:
select * from t
match_recognize(
order by mm
measures case classifier() when 'POS' then sum(a) else 0 end as b
all rows per match
pattern (pos* neg{0,1})
define pos as sum(a) > 0
);
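The question rules out recursive queries because of Oracle 10g. On versions that support recursive subquery factoring (11g Release 2 and later), the spreadsheet rule B = GREATEST(A + previous B, 0) can also be written directly. This is a sketch under that version assumption; the CTE names numbered and running are illustrative.

```sql
with data as (
  select date'2012-01-01' as mm, 800 as a from dual union all
  select date'2012-02-01' as mm, 1900 as a from dual union all
  select date'2012-03-01' as mm, 1750 as a from dual union all
  select date'2012-04-01' as mm, -20000 as a from dual union all
  select date'2012-05-01' as mm, 900 as a from dual union all
  select date'2012-06-01' as mm, 3900 as a from dual union all
  select date'2012-07-01' as mm, -2600 as a from dual union all
  select date'2012-08-01' as mm, -2600 as a from dual union all
  select date'2012-09-01' as mm, 2100 as a from dual union all
  select date'2012-10-01' as mm, -2400 as a from dual union all
  select date'2012-11-01' as mm, 1100 as a from dual union all
  select date'2012-12-01' as mm, 1300 as a from dual
),
-- Number the rows so each step can find its successor
numbered as (
  select mm, a, row_number() over (order by mm) as rn
  from data
),
-- Anchor: first row; recursive step: carry b forward and clamp it at 0
running (rn, mm, a, b) as (
  select rn, mm, a, greatest(a, 0)
  from numbered
  where rn = 1
  union all
  select n.rn, n.mm, n.a, greatest(r.b + n.a, 0)
  from running r
  join numbered n
    on n.rn = r.rn + 1
)
select mm, a, b
from running
order by rn;
```

Each recursive step reads the previous row's B, which is the dependency that a plain analytic SUM or LAG cannot express.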

help me in executing the sql query

I have a table like the one below. I want to calculate the sum of Amount for the first 5% of customers, then the next 20%, the next 25%, the next 25%, and finally the remainder. This is just a sample of the DB table.
5%=1, so the sum is 100
Next 20%=4, so sum=1800(200+500+300+800)
Next 25%=5, so sum=2900(600+800+500+400+600)
Next 25%=5, so sum=2500(300+800+300+800+300)
Rest=1400
Cus_ID Amount
1004 100
1064 200
1126 500
1280 300
1678 800
1719 600
1862 800
2109 500
2892 400
2957 600
3097 300
3205 800
3399 300
3460 800
4169 300
4380 800
4689 100
4886 200
4906 300
Result
5% 20% 25% next 25% Rest
100 1800 2900 2500 1400
WITH T(Cus_ID,Amount ) AS
(
SELECT 1004, 100 UNION ALL
SELECT 1064, 200 UNION ALL
SELECT 1126, 500 UNION ALL
SELECT 1280, 300 UNION ALL
SELECT 1678, 800 UNION ALL
SELECT 1719, 600 UNION ALL
SELECT 1862, 800 UNION ALL
SELECT 2109, 500 UNION ALL
SELECT 2892, 400 UNION ALL
SELECT 2957, 600 UNION ALL
SELECT 3097, 300 UNION ALL
SELECT 3205, 800 UNION ALL
SELECT 3399, 300 UNION ALL
SELECT 3460, 800 UNION ALL
SELECT 4169, 300 UNION ALL
SELECT 4380, 800 UNION ALL
SELECT 4689, 100 UNION ALL
SELECT 4886, 200 UNION ALL
SELECT 4906, 300
), T2 AS
(
SELECT *,
ROW_NUMBER() OVER (ORDER BY Cus_ID) AS RN,
ROW_NUMBER() OVER (ORDER BY Cus_ID)/ CAST(COUNT(*) OVER() AS FLOAT) AS Pct
FROM T
), T3(Amount, Grp) AS
(
SELECT a.Amount, CASE WHEN ISNULL(b.Pct,0) < 0.05 THEN 1
WHEN b.Pct < 0.25 THEN 2
WHEN b.Pct < 0.50 THEN 3
WHEN b.Pct < 0.75 THEN 4
ELSE 5
END
FROM T2 a LEFT JOIN T2 b ON b.RN=a.RN-1
)
SELECT SUM(Amount) AS Amount, Grp
FROM T3
GROUP BY Grp
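The query above returns one row per group, while the question shows the result as a single row. A sketch of that shape for SQL Server, using conditional aggregation, is below. It keeps the same thresholds but computes the share of preceding rows directly as (row number - 1) / count instead of self-joining; PrevPct and the bracketed column names are illustrative.

```sql
WITH T(Cus_ID, Amount) AS
(
    SELECT Cus_ID, Amount
    FROM (VALUES (1004, 100), (1064, 200), (1126, 500), (1280, 300), (1678, 800),
                 (1719, 600), (1862, 800), (2109, 500), (2892, 400), (2957, 600),
                 (3097, 300), (3205, 800), (3399, 300), (3460, 800), (4169, 300),
                 (4380, 800), (4689, 100), (4886, 200), (4906, 300)) AS v(Cus_ID, Amount)
), T2 AS
(
    -- Share of customers that come before the current row (0 for the first row)
    SELECT Amount,
           (ROW_NUMBER() OVER (ORDER BY Cus_ID) - 1)
               / CAST(COUNT(*) OVER () AS FLOAT) AS PrevPct
    FROM T
)
-- One SUM per bucket; rows outside a bucket contribute NULL and are ignored
SELECT SUM(CASE WHEN PrevPct < 0.05 THEN Amount END)                      AS [5%],
       SUM(CASE WHEN PrevPct >= 0.05 AND PrevPct < 0.25 THEN Amount END) AS [20%],
       SUM(CASE WHEN PrevPct >= 0.25 AND PrevPct < 0.50 THEN Amount END) AS [25%],
       SUM(CASE WHEN PrevPct >= 0.50 AND PrevPct < 0.75 THEN Amount END) AS [next 25%],
       SUM(CASE WHEN PrevPct >= 0.75 THEN Amount END)                    AS [Rest]
FROM T2;
```

With the 19 sample rows this returns 100, 1800, 2900, 2500 and 1400, the figures listed in the question.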