SQL NOOB - Oracle joins and Row Number - sql

I was hoping to get some guidance on a SQL script I am trying to put together for Oracle database 11g.
I am attempting to perform a count of claims from the 'claim' table, and order them by year / month / and enterprise.
I was able to get a count of claims and order them like I would like, however I need to pull data from another table and I am having trouble combining the 'row_number' function with a join.
Here is my script so far:
SELECT TO_CHAR (SYSTEM_ENTRY_DATE, 'YYYY') YEAR,
TO_CHAR (SYSTEM_ENTRY_DATE, 'MM') MONTH,
ENTERPRISE_IID,
COUNT (*) CLAIMS
FROM (SELECT CLAIM.CLAIM_EID,
CLAIM.SYSTEM_ENTRY_DATE,
CLAIM.ENTERPRISE_IID,
ROW_NUMBER () OVER (PARTITION BY CLAIM.CLAIM_EID, CLAIM.ENTERPRISE_IID
ORDER BY CLAIM.SYSTEM_ENTRY_DATE DESC) RN
FROM CLAIM
WHERE CLAIM_IID IN (SELECT DISTINCT (CLAIM_IID)
FROM CLAIM_LINE
WHERE STATUS <> 'D')
AND CLAIM.CONTEXT = '1'
AND CLAIM.CLAIM_STATUS = 'A'
AND CLAIM.LAST_ANALYSIS_DATE IS NOT NULL)
WHERE RN = 1
GROUP ENTERPRISE_IID,
TO_CHAR (SYSTEM_ENTRY_DATE, 'YYYY'),
TO_CHAR (SYSTEM_ENTRY_DATE, 'MM');
So far all of my data is coming from the 'claim' table. This pulls the following result:
YEAR MONTH ENTERPRISE_IID CLAIMS
---- ----- -------------- ----------
2016 01 6 1
2015 08 6 3
2016 02 6 2
2015 09 6 2
2015 07 6 2
2015 09 5 22
2015 11 5 29
2015 12 5 27
2016 04 5 8
2015 07 5 29
2015 05 5 15
2015 06 5 5
2015 10 5 45
2016 03 5 54
2015 03 5 10
2016 02 5 70
2016 01 5 55
2015 08 5 32
2015 04 5 12
19 rows selected.
The enterprise_IID is the primary key on the 'enterprise' table. The 'enterprise' table also contains the 'name' attribute for each entry. I would like to join the claim and enterprise table in order to show the enterprise name for this count, and not the enterprise_IID.
As you can tell I am rather new to Oracle and SQL, and I am a bit stuck on this one. I was thinking that I should do an inner join between the two tables, but I am not quite sure how to do that when using the row_number function.
Or perhaps I am taking the wrong approach here, and someone could push me in another direction.
Here is what I tried:
SELECT TO_CHAR (SYSTEM_ENTRY_DATE, 'YYYY') YEAR,
TO_CHAR (SYSTEM_ENTRY_DATE, 'MM') MONTH,
ENTERPRISE_IID,
ENTERPRISE.NAME,
COUNT (*) CLAIMS
FROM (SELECT CLAIM.CLAIM_EID,
CLAIM.SYSTEM_ENTRY_DATE,
CLAIM.ENTERPRISE_IID,
ROW_NUMBER () OVER (PARTITION BY CLAIM.CLAIM_EID, CLAIM.ENTERPRISE_IID
ORDER BY CLAIM.SYSTEM_ENTRY_DATE DESC) RN
FROM CLAIM, enterprise
INNER JOIN ENTERPRISE
ON CLAIM.ENTERPRISE_IID = ENTERPRISE.ENTERPRISE_IID
WHERE CLAIM_IID IN (SELECT DISTINCT (CLAIM_IID)
FROM CLAIM_LINE
WHERE STATUS <> 'D')
AND CLAIM.CONTEXT = '1'
AND CLAIM.CLAIM_STATUS = 'A'
AND CLAIM.LAST_ANALYSIS_DATE IS NOT NULL)
WHERE RN = 1
GROUP BY ENTERPRISE.NAME,
ENTERPRISE_IID,
TO_CHAR (SYSTEM_ENTRY_DATE, 'YYYY'),
TO_CHAR (SYSTEM_ENTRY_DATE, 'MM');
Thank you in advance!
"Desired Output"
YEAR MONTH NAME CLAIMS
---- ----- ---- ----------
2016 01 Ent1 1
2015 08 Ent1 3
2016 02 Ent1 2
2015 09 Ent1 2
2015 07 Ent1 2
2015 09 Ent2 22
2015 11 Ent2 29
2015 12 Ent2 27
2016 04 Ent2 8
2015 07 Ent2 29
2015 05 Ent2 15
2015 06 Ent2 5
2015 10 Ent2 45
2016 03 Ent2 54
2015 03 Ent2 10
2016 02 Ent2 70
2016 01 Ent2 55
2015 08 Ent2 32
2015 04 Ent2 12
19 rows selected.

You can try this. Joins can be used when calculating row numbers with row_number function.
SELECT TO_CHAR (SYSTEM_ENTRY_DATE, 'YYYY') YEAR,
TO_CHAR (SYSTEM_ENTRY_DATE, 'MM') MONTH,
ENTERPRISE_IID,
NAME,
COUNT (*) CLAIMS
FROM (SELECT CLAIM.CLAIM_EID,
CLAIM.SYSTEM_ENTRY_DATE,
CLAIM.ENTERPRISE_IID,
ENTERPRISE.NAME,
ROW_NUMBER () OVER (PARTITION BY CLAIM.CLAIM_EID, CLAIM.ENTERPRISE_IID
ORDER BY CLAIM.SYSTEM_ENTRY_DATE DESC) RN
FROM CLAIM --, enterprise (this is not required as the table is being joined already)
INNER JOIN ENTERPRISE ON CLAIM.ENTERPRISE_IID = ENTERPRISE.ENTERPRISE_IID
INNER JOIN (SELECT DISTINCT CLAIM_IID FROM CLAIM_LINE WHERE STATUS <> 'D') CLAIM_LINE
ON CLAIM.CLAIM_IID = CLAIM_LINE.CLAIM_IID
WHERE CLAIM.CONTEXT = '1'
AND CLAIM.CLAIM_STATUS = 'A'
AND CLAIM.LAST_ANALYSIS_DATE IS NOT NULL) t
WHERE RN = 1
GROUP BY NAME, --ENTERPRISE.NAME (The alias ENTERPRISE is not accessible here.)
ENTERPRISE_IID,
TO_CHAR(SYSTEM_ENTRY_DATE, 'YYYY'),
TO_CHAR(SYSTEM_ENTRY_DATE, 'MM');

I'd write the query like this:
SELECT TO_CHAR(TRUNC(c.system_entry_date,'MM'),'YYYY') AS year
, TO_CHAR(TRUNC(c.system_entry_date,'MM'),'MM') AS month
, e.enterprise_name AS name
, COUNT(*) AS claims
FROM (
SELECT r.claim_eid
, r.enterprise_iid
, MAX(r.system_entry_date) AS system_entry_date
FROM ( SELECT DISTINCT l.claim_iid
FROM claim_line l
WHERE l.status <> 'D'
) d
JOIN claim r
ON r.claim_iid = d.claim_iid
AND r.context = '1'
AND r.claim_status = 'A'
AND r.last_analysis_date IS NOT NULL
GROUP
BY r.claim_eid
, r.enterprise_iid
) c
JOIN enterprise e
ON e.enterprise_iid = c.enterprise_iid
GROUP
BY c.enterprise_iid
, TRUNC(c.system_entry_date,'MM')
, e.enterprise_name
ORDER
BY e.enterprise_name
, TRUNC(c.system_entry_date,'MM')
A few notes:
I prefer to qualify ALL column references with the table name or short table alias, and assign aliases to all inline views.
Since the usage of ROW_NUMBER() appears to be get the "latest" system_entry_date for a claim and eliminate duplicates, I'd prefer to use a GROUP BY and a MAX() aggregate.
I prefer to use a join operation rather than the NOT IN (subquery) pattern. (Or, I would tend to use a NOT EXISTS (correlated subquery) pattern.
I don't think it matters too much if you use TO_CHAR or EXTRACT. The TO_CHAR gets you the leading zero in the month, I don't think EXTRACT(MONTH ) gets you the leading zero. I'd use whichever gets me closest to the resultset I need.Personally, I would return just a single column, either containing the year and month as one string e.g. TO_CHAR( , 'YYYYMM') or just a DATE value. It all depends what I'm going to be doing with that.

Just hypothesis to start with, because requirement of query output unclear:
SELECT
C.ENTERPRISE_IID,
E.ENTERPRISE_NAME,
extract(year from CLAIM.SYSTEM_ENTRY_DATE) SYSTEM_ENTRY_YEAR,
extract(month from CLAIM.SYSTEM_ENTRY_DATE) SYSTEM_ENTRY_MONTH,
count(distinct C.CLAIM_EID) CLAIM_COUNT
FROM
CLAIM C,
ENTERPRISE E
WHERE
C.CLAIM_IID IN (
SELECT DISTINCT (CLAIM_IID)
FROM CLAIM_LINE
WHERE STATUS <> 'D'
)
AND C.CONTEXT = '1'
AND C.CLAIM_STATUS = 'A'
AND C.LAST_ANALYSIS_DATE IS NOT NULL
AND E.ENTERPRISE_IID = C.ENTERPRISE_IID
GROUP BY
C.ENTERPRISE_IID,
E.ENTERPRISE_NAME,
extract(year from CLAIM.SYSTEM_ENTRY_DATE),
extract(month from CLAIM.SYSTEM_ENTRY_DATE)
ORDER BY
extract(year from CLAIM.SYSTEM_ENTRY_DATE),
extract(month from CLAIM.SYSTEM_ENTRY_DATE),
E.ENTERPRISE_NAME

Related

Generate a range of records depending on from-to dates

I have a table of records like this:
Item
From
To
A
2018-01-03
2018-03-16
B
2021-05-25
2021-11-10
The output of select should look like:
Item
Month
Year
A
01
2018
A
02
2018
A
03
2018
B
05
2021
B
06
2021
B
07
2021
B
08
2021
Also the range should not exceed the current month. In example above we are asuming current day is 2021-08-01.
I am trying to do something similar to THIS with CONNECT BY LEVEL but as soon as I also select my table next to dual and try to order the records the selection never completes. I also have to join few other tables to the selection but I don't think that would make a difference.
I would very much appreciate your help.
Row generator it is, but not as you did it; most probably you're missing lines #11 - 16 in my query (or their alternative).
SQL> with test (item, date_from, date_to) as
2 -- sample data
3 (select 'A', date '2018-01-03', date '2018-03-16' from dual union all
4 select 'B', date '2021-05-25', date '2021-11-10' from dual
5 )
6 -- query that returns desired result
7 select item,
8 extract(month from (add_months(date_from, column_value - 1))) month,
9 extract(year from (add_months(date_from, column_value - 1))) year
10 from test cross join
11 table(cast(multiset
12 (select level
13 from dual
14 connect by level <=
15 months_between(trunc(least(sysdate, date_to), 'mm'), trunc(date_from, 'mm')) + 1
16 ) as sys.odcinumberlist))
17 order by item, year, month;
ITEM MONTH YEAR
----- ---------- ----------
A 1 2018
A 2 2018
A 3 2018
B 5 2021
B 6 2021
B 7 2021
B 8 2021
7 rows selected.
SQL>
Recursive CTEs are the standard SQL approach to this type of problem. In Oracle, this looks like:
with cte(item, fromd, tod) as (
select item, fromd, tod
from t
union all
select item, add_months(fromd, 1), tod
from cte
where add_months(fromd, 1) < last_day(tod)
)
select item, extract(year from fromd) as year, extract(month from fromd) as month
from cte
order by item, fromd;
Here is a db<>fiddle.

How to remove duplicates records based on some condition in oracle sql?

I have obtained this table by using multiple joins
E_name s_date year h_value l_value update_date
a 01-08-2012 2012 25 70 01-01-2012
a 23-06-2012 2010 20 55 01-01-2009
a 19-03-2020 2020 210 540 29-04-2020
a 14-02-2020 2020 78 765 29-04-2020
b 27-12-2018 2018 14 29 31-01-2019
b 19-12-2018 2018 17 30 19-12-2018
I want to remove duplicates based on E_name and year.
if the next record has the same E_name and year as previous, then
row with most recent update_date will be considered
if both update_date are the same then the row with the most recent s_date will be considered
Required Output
E_name s_date year h_value l_value update_date
a 01-08-2012 2012 25 70 01-01-2012
a 23-06-2012 2010 20 55 01-01-2009
a 19-03-2020 2020 210 540 29-04-2020
b 27-12-2018 2018 14 29 31-01-2019
You need a group by and a row_number() on top
Select * from
( Select e_name,"year",
maxdate,update_date,
row_number() over (partition by e_name,"year" order by
update_date desc) as rn
from
( Select e_name,"year",
update_date,max(s_date) as maxdate from
sample
group by
e_name,"year",update_date
)
)
where rn =1
check this output link fiddle :http://sqlfiddle.com/#!4/c1646/23
A query like below may do the tricks. Put your oriented data into a Temp Table and apply below query on your Temp Table
with MyCTE
as
(
select
E_Name
,S_Date
,year
,H_value
,L_value
,update_date
, RANK() over (partition by E_Name,year order by update_date desc,S_DATE desc) as ranking
from TempTable
)
select * from MyCTE where ranking=1

add a list of custom values as an additional column to the end of an SQL query

I have performed a query like this
SELECT date_trunc('month', "submitted_at" AS "submitted_at", count(distinct "user_id") AS "count"
FROM table
GROUP BY date_trunc('month', "submitted_at")
ORDER BY date_trunc('month', "submitted_at") ASC
which returns a table like this (date simplified)
Date count
may 2020 7
june 2020 9
july 2020 20
aug 2020 35
sep 2020 89
oct 2020 104
I want to add a 3rd column with custom values like below
Date count value
may 2020 7 9
june 2020 9 17
july 2020 20 38
aug 2020 35 94
sep 2020 89 41
oct 2020 104 78
how do I change my query such that I can add the additional column and get the above table with manually entered values? I've tried the below with no luck. The data is sat on different databases so I can't join the data directly. I'd also need to perform this query without creating any new tables within the database as am going to have to perform a similar query over a wide variety of cases and can't create tables each time I need to do this kind of thing. This is in a postgres database if that's relevant.
SELECT date_trunc('month', "submitted_at" AS "submitted_at", count(distinct "user_id") AS "count", (9,17,38,94,41,78) as "value"
FROM table
GROUP BY date_trunc('month', "submitted_at")
ORDER BY date_trunc('month', "submitted_at") ASC
I would use a join with values:
SELECT date_trunc('month', "submitted_at") AS "submitted_at",
count(distinct "user_id") AS "count",
v.val as "value"
FROM table t LEFT JOIN
(VALUES ('2020-05-01', 9),
('2020-06-01', 17),
. . .
) v(mon, val)
ON date_trunc('month', "submitted_at") = v.mon
GROUP BY date_trunc('month', "submitted_at"), v.val
ORDER BY date_trunc('month', "submitted_at") ASC;
I am making what seems like a reasonable assumption that you want the values aligned by month.

SQL - use only clients that are present in all months

I have a dataset with different clients, and their sales count. Over time, some clients get added and deleted from the data. How do I make sure that when I look at the sales counts, that I am only using a selection of the clients that were in the data set all the time? Ie if I have a client that doesn't have a record for 2018-03, then I don't want that client to be part of the entire query. If a clients does not have a record in 2020-03, then I also do not want this client to be part of the entire query.
For example, the following query:
select DATE_PART (y, sold_date)as year, DATE_PART (mm, sold_date) as month, count(distinct(client))
from sales_data
where sold_date > '2018-01-01'
group by year, month
order by year,month
Yields
year month count
2018 1 78
2018 2 83
2018 3 80
2018 4 83
2018 5 84
2018 6 81
2018 7 83
2018 8 90
2018 9 89
2018 10 95
2018 11 94
2018 12 97
2019 1 102
2019 2 103
2019 3 102
2019 4 105
2019 5 103
2019 6 104
2019 7 104
2019 8 106
2019 9 106
2019 10 108
2019 11 109
2019 12 104
2020 1 104
2020 2 102
2020 3 103
2020 4 98
2020 5 97
2020 6 79
So I want to only use the clients that are in all months, they should not be more than 78, because there can not be more users than the minimal month (2018-1).
FYI, I am using Amazon Redshift here but I am OK with a query that's rdbms agnostic or works for SQL-Server/Oracle/MySQL/PostgreSQL, I am just interested in a pattern on how to solve this issue effectively.
If I'm understanding what you want correctly, and if this is just a one-off query, you could use a correlated subquery in the where clause:
SELECT
DATE_PART(y, s.sold_date) AS year,
DATE_PART(mm, s.sold_date) AS month,
COUNT(DISTINCT s.client)
FROM
sales_data AS s
WHERE
EXISTS (
SELECT sd.client FROM sales_data AS sd WHERE DATE_PART(y,
sd.sold_date) = 2018 AND DATE_PART(mm, sd.sold_date) = 1 AND
sd.client = s.client
) AND
s.sold_date > '2018-01-01'
GROUP BY
year,
month
ORDER
DATE_PART(y, s.sold_date),
DATE_PART(mm, s.sold_date)
presence in all months can be done with 2-step aggregation:
group sales data by customer ID having all months
group sales data joined to (1) by year, month
like this (=12 can be a dynamic expression, depending on the amount of history you have)
with
stable_customers as (
select customer_id
from sales_data
group by 1
having count(distinct date_trunc('month' from sold_date)=12
)
select
DATE_PART (y, sold_date) as year
,DATE_PART (mm, sold_date) as month,
,count(1)
from sales_date
join stable_customers
using (customer_id)
where sold_date > '2018-01-01'
group by year, month
order by year,month
Use window functions. Unfortunately, SQL Server does not support count(distinct) as a window function. Fortunately, there is a simple work-around using dense_rank():
select year, month, count(distinct client)
from (select sd.*, year, month,
(dense_rank() over (order by year, month) +
dense_rank() over (order by year desc, month desc)
) as num_months,
(dense_rank() over (partition by client order by year, month) +
dense_rank() over (partition by client order by year desc, month desc)
) as num_months_client
from sales_data sd cross apply
(values (year(sold_date), month(sold_date))) v(year, month)
where sd.sold_date > '2018-01-01'
) sd
where num_months_client = num_months
group by year, month
order by year, month;
Note: This looks at all months that are in the data. If all clients are missing 2019-03, then that months is not considered at all.

Add missing data from previous month or year cumulatively

Say I have the following data:
select 1 id, 'A' name, '2007' year, '04' month, 5 sales from dual union all
select 2 id, 'A' name, '2007' year, '05' month, 2 sales from dual union all
select 3 id, 'B' name, '2008' year, '12' month, 3 sales from dual union all
select 4 id, 'B' name, '2009' year, '12' month, 56 sales from dual union all
select 5 id, 'C' name, '2009' year, '08' month, 89 sales from dual union all
select 13 id,'B' name, '2016' year, '01' month, 10 sales from dual union all
select 14 id,'A' name, '2016' year, '02' month, 8 sales from dual union all
select 15 id,'D' name, '2016' year, '03' month, 12 sales from dual union all
select 16 id,'E' name, '2016' year, '04' month, 34 sales from dual
I want to cumulatively add up all the sales across all years and their respective periods (months). The output should look like the following:
name year month sale opening bal closing bal
A 2007 04 5 0 5
A 2007 05 2 5 7
B 2008 12 3 12 15
A 2008 04 0 5 5 -- to be generated
A 2008 05 0 7 7 -- to be generated
B 2009 12 56 15 71
C 2009 08 89 71 160
A 2009 04 0 5 5 -- to be generated
A 2009 05 0 7 7 -- to be generated
B 2016 01 10 278 288
B 2016 12 0 71 71 -- to be generated
A 2016 02 8 288 296
A 2016 04 0 5 5 -- to be generated
A 2016 05 0 7 7 -- to be generated
D 2016 03 12 296 308
E 2016 04 34 308 342
C 2016 08 0 160 160 -- to be generated
The Opening balance is the closing balance of previous month, and if it goes into next year than the opening balance for next year is the closing balance of the previous year. It should be able to work like this for subsequent years. I've got this part working. However, I don't know how to get around ths missing in say 2009 that exists in 2008. For instance the key A,2008,04 and also A,2008,05 does not exist in 2009 and the code should be able to add it in 2009 like above. Same applies for other years and months.
I'm working on Oracle 12c.
Thanks in advance.
A variation on #boneists approach, starting with your sample data in a CTE:
with t as (
select 1 id, 'A' name, '2007' year, '04' month, 5 sales from dual union all
select 2 id, 'A' name, '2007' year, '05' month, 2 sales from dual union all
select 3 id, 'B' name, '2008' year, '12' month, 3 sales from dual union all
select 4 id, 'B' name, '2009' year, '12' month, 56 sales from dual union all
select 5 id, 'C' name, '2009' year, '08' month, 89 sales from dual union all
select 13 id,'B' name, '2016' year, '01' month, 10 sales from dual union all
select 14 id,'A' name, '2016' year, '02' month, 8 sales from dual union all
select 15 id,'D' name, '2016' year, '03' month, 12 sales from dual union all
select 16 id,'E' name, '2016' year, '04' month, 34 sales from dual
),
y (year, rnk) as (
select year, dense_rank() over (order by year)
from (select distinct year from t)
),
r (name, year, month, sales, rnk) as (
select t.name, t.year, t.month, t.sales, y.rnk
from t
join y on y.year = t.year
union all
select r.name, y.year, r.month, 0, y.rnk
from y
join r on r.rnk = y.rnk - 1
where not exists (
select 1 from t where t.year = y.year and t.month = r.month and t.name = r.name
)
)
select name, year, month, sales,
nvl(sum(sales) over (partition by name order by year, month
rows between unbounded preceding and 1 preceding), 0) as opening_bal,
nvl(sum(sales) over (partition by name order by year, month
rows between unbounded preceding and current row), 0) as closing_bal
from r
order by year, month, name;
Which gets the same result too, though it also doesn't match the expected results in the question:
NAME YEAR MONTH SALES OPENING_BAL CLOSING_BAL
---- ---- ----- ---------- ----------- -----------
A 2007 04 5 0 5
A 2007 05 2 5 7
A 2008 04 0 7 7
A 2008 05 0 7 7
B 2008 12 3 0 3
A 2009 04 0 7 7
A 2009 05 0 7 7
C 2009 08 89 0 89
B 2009 12 56 3 59
B 2016 01 10 59 69
A 2016 02 8 7 15
D 2016 03 12 0 12
A 2016 04 0 15 15
E 2016 04 34 0 34
A 2016 05 0 15 15
C 2016 08 0 89 89
B 2016 12 0 69 69
The y CTE (feel free to use more meaningful names!) generates all the distinct years from your original data, and also adds a ranking, so 2007 is 1, 2008 is 2, 2009 is 3, and 2016 is 4.
The r recursive CTE combines your actual data with dummy rows with zero sales, based on the name/month data from previous years.
From what that recursive CTE produces you can do your analytic cumulative sum to add the opening/closing balances. This is using windowing clauses to decide which sales values to include - essentially the opening and closing balances are the sum of all values up to this point, but opening doesn't include the current row.
This is the closest I can get to your result, although I realise it's not an exact match. For example, your opening balances don't look correct (where did the opening balance of 12 come from for the output row for id = 3?). Anyway, hopefully the following will enable you to amend as appropriate:
with sample_data as (select 1 id, 'A' name, '2007' year, '04' month, 5 sales from dual union all
select 2 id, 'A' name, '2007' year, '05' month, 2 sales from dual union all
select 3 id, 'B' name, '2008' year, '12' month, 3 sales from dual union all
select 4 id, 'B' name, '2009' year, '12' month, 56 sales from dual union all
select 5 id, 'C' name, '2009' year, '08' month, 89 sales from dual union all
select 13 id, 'B' name, '2016' year, '01' month, 10 sales from dual union all
select 14 id, 'A' name, '2016' year, '02' month, 8 sales from dual union all
select 15 id, 'D' name, '2016' year, '03' month, 12 sales from dual union all
select 16 id, 'E' name, '2016' year, '04' month, 34 sales from dual),
dts as (select distinct year
from sample_data),
res as (select sd.name,
dts.year,
sd.month,
nvl(sd.sales, 0) sales,
min(sd.year) over (partition by sd.name, sd.month) min_year_per_name_month,
sum(nvl(sd.sales, 0)) over (partition by name order by to_date(dts.year||'-'||sd.month, 'yyyy-mm')) - nvl(sd.sales, 0) as opening,
sum(nvl(sd.sales, 0)) over (partition by name order by to_date(dts.year||'-'||sd.month, 'yyyy-mm')) as closing
from dts
left outer join sample_data sd partition by (sd.name, sd.month) on (sd.year = dts.year))
select name,
year,
month,
sales,
opening,
closing
from res
where (opening != 0 or closing != 0)
and year >= min_year_per_name_month
order by to_date(year||'-'||month, 'yyyy-mm'),
name;
NAME YEAR MONTH SALES OPENING CLOSING
---- ---- ----- ---------- ---------- ----------
A 2007 04 5 0 5
A 2007 05 2 5 7
A 2008 04 0 7 7
A 2008 05 0 7 7
B 2008 12 3 0 3
A 2009 04 0 7 7
A 2009 05 0 7 7
C 2009 08 89 0 89
B 2009 12 56 3 59
B 2016 01 10 59 69
A 2016 02 8 7 15
D 2016 03 12 0 12
A 2016 04 0 15 15
E 2016 04 34 0 34
A 2016 05 0 15 15
C 2016 08 0 89 89
B 2016 12 0 69 69
I've used Partition Outer Join to link any month and name combination in the table (in my query, the sample_data subquery - you wouldn't need that subquery, you'd just use your table instead!) to any year in the same table, and then working out the opening / closing balances. I then discard any rows that have an opening and closing balance of 0.