Add missing data from previous month or year cumulatively - sql

Say I have the following data:
select 1 id, 'A' name, '2007' year, '04' month, 5 sales from dual union all
select 2 id, 'A' name, '2007' year, '05' month, 2 sales from dual union all
select 3 id, 'B' name, '2008' year, '12' month, 3 sales from dual union all
select 4 id, 'B' name, '2009' year, '12' month, 56 sales from dual union all
select 5 id, 'C' name, '2009' year, '08' month, 89 sales from dual union all
select 13 id,'B' name, '2016' year, '01' month, 10 sales from dual union all
select 14 id,'A' name, '2016' year, '02' month, 8 sales from dual union all
select 15 id,'D' name, '2016' year, '03' month, 12 sales from dual union all
select 16 id,'E' name, '2016' year, '04' month, 34 sales from dual
I want to cumulatively add up all the sales across all years and their respective periods (months). The output should look like the following:
name year month sale opening bal closing bal
A 2007 04 5 0 5
A 2007 05 2 5 7
B 2008 12 3 12 15
A 2008 04 0 5 5 -- to be generated
A 2008 05 0 7 7 -- to be generated
B 2009 12 56 15 71
C 2009 08 89 71 160
A 2009 04 0 5 5 -- to be generated
A 2009 05 0 7 7 -- to be generated
B 2016 01 10 278 288
B 2016 12 0 71 71 -- to be generated
A 2016 02 8 288 296
A 2016 04 0 5 5 -- to be generated
A 2016 05 0 7 7 -- to be generated
D 2016 03 12 296 308
E 2016 04 34 308 342
C 2016 08 0 160 160 -- to be generated
The opening balance is the closing balance of the previous month, and when a new year starts, its opening balance is the closing balance of the previous year. It should work the same way for subsequent years. I've got this part working. However, I don't know how to handle keys that exist in, say, 2008 but are missing in 2009. For instance, the keys A,2008,04 and A,2008,05 do not exist in 2009, and the code should add them for 2009 as shown above. The same applies to other years and months.
I'm working on Oracle 12c.
Thanks in advance.

A variation on @boneist's approach, starting with your sample data in a CTE:
with t as (
select 1 id, 'A' name, '2007' year, '04' month, 5 sales from dual union all
select 2 id, 'A' name, '2007' year, '05' month, 2 sales from dual union all
select 3 id, 'B' name, '2008' year, '12' month, 3 sales from dual union all
select 4 id, 'B' name, '2009' year, '12' month, 56 sales from dual union all
select 5 id, 'C' name, '2009' year, '08' month, 89 sales from dual union all
select 13 id,'B' name, '2016' year, '01' month, 10 sales from dual union all
select 14 id,'A' name, '2016' year, '02' month, 8 sales from dual union all
select 15 id,'D' name, '2016' year, '03' month, 12 sales from dual union all
select 16 id,'E' name, '2016' year, '04' month, 34 sales from dual
),
y (year, rnk) as (
select year, dense_rank() over (order by year)
from (select distinct year from t)
),
r (name, year, month, sales, rnk) as (
select t.name, t.year, t.month, t.sales, y.rnk
from t
join y on y.year = t.year
union all
select r.name, y.year, r.month, 0, y.rnk
from y
join r on r.rnk = y.rnk - 1
where not exists (
select 1 from t where t.year = y.year and t.month = r.month and t.name = r.name
)
)
select name, year, month, sales,
nvl(sum(sales) over (partition by name order by year, month
rows between unbounded preceding and 1 preceding), 0) as opening_bal,
nvl(sum(sales) over (partition by name order by year, month
rows between unbounded preceding and current row), 0) as closing_bal
from r
order by year, month, name;
Which gets the same result too, though it also doesn't match the expected results in the question:
NAME YEAR MONTH SALES OPENING_BAL CLOSING_BAL
---- ---- ----- ---------- ----------- -----------
A 2007 04 5 0 5
A 2007 05 2 5 7
A 2008 04 0 7 7
A 2008 05 0 7 7
B 2008 12 3 0 3
A 2009 04 0 7 7
A 2009 05 0 7 7
C 2009 08 89 0 89
B 2009 12 56 3 59
B 2016 01 10 59 69
A 2016 02 8 7 15
D 2016 03 12 0 12
A 2016 04 0 15 15
E 2016 04 34 0 34
A 2016 05 0 15 15
C 2016 08 0 89 89
B 2016 12 0 69 69
The y CTE (feel free to use more meaningful names!) generates all the distinct years from your original data, and also adds a ranking, so 2007 is 1, 2008 is 2, 2009 is 3, and 2016 is 4.
The r recursive CTE combines your actual data with dummy rows with zero sales, based on the name/month data from previous years.
From what that recursive CTE produces you can do your analytic cumulative sum to add the opening/closing balances. This is using windowing clauses to decide which sales values to include - essentially the opening and closing balances are the sum of all values up to this point, but opening doesn't include the current row.
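To see the two window frames in isolation, here is a minimal sketch that runs the same opening/closing logic in SQLite through Python's sqlite3 module. The table name `sales`, the use of SQLite (3.25 or later, for window functions) and `coalesce` in place of Oracle's `nvl` are assumptions made for illustration; the zero-sales row stands in for a filler row of the kind the recursive CTE generates.

```python
import sqlite3

# Hypothetical in-memory table; SQLite 3.25+ is assumed for window functions.
conn = sqlite3.connect(":memory:")
conn.execute("create table sales (name text, year text, month text, sales integer)")
conn.executemany(
    "insert into sales values (?, ?, ?, ?)",
    [
        ("A", "2007", "04", 5),
        ("A", "2007", "05", 2),
        ("A", "2008", "04", 0),  # zero-sales filler row, like those the recursive CTE adds
        ("A", "2016", "02", 8),
    ],
)

balances = conn.execute("""
    select name, year, month, sales,
           -- sum of all earlier rows; NULL on the first row, hence coalesce
           coalesce(sum(sales) over (partition by name order by year, month
               rows between unbounded preceding and 1 preceding), 0) as opening_bal,
           -- sum of all rows up to and including the current one
           sum(sales) over (partition by name order by year, month
               rows between unbounded preceding and current row) as closing_bal
    from sales
    order by year, month
""").fetchall()

for row in balances:
    print(row)
```

The only difference between the two columns is where the frame ends: `1 preceding` excludes the current row, `current row` includes it.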

This is the closest I can get to your result, although I realise it's not an exact match. For example, your opening balances don't look correct (where did the opening balance of 12 come from for the output row for id = 3?). Anyway, hopefully the following will enable you to amend as appropriate:
with sample_data as (select 1 id, 'A' name, '2007' year, '04' month, 5 sales from dual union all
select 2 id, 'A' name, '2007' year, '05' month, 2 sales from dual union all
select 3 id, 'B' name, '2008' year, '12' month, 3 sales from dual union all
select 4 id, 'B' name, '2009' year, '12' month, 56 sales from dual union all
select 5 id, 'C' name, '2009' year, '08' month, 89 sales from dual union all
select 13 id, 'B' name, '2016' year, '01' month, 10 sales from dual union all
select 14 id, 'A' name, '2016' year, '02' month, 8 sales from dual union all
select 15 id, 'D' name, '2016' year, '03' month, 12 sales from dual union all
select 16 id, 'E' name, '2016' year, '04' month, 34 sales from dual),
dts as (select distinct year
from sample_data),
res as (select sd.name,
dts.year,
sd.month,
nvl(sd.sales, 0) sales,
min(sd.year) over (partition by sd.name, sd.month) min_year_per_name_month,
sum(nvl(sd.sales, 0)) over (partition by name order by to_date(dts.year||'-'||sd.month, 'yyyy-mm')) - nvl(sd.sales, 0) as opening,
sum(nvl(sd.sales, 0)) over (partition by name order by to_date(dts.year||'-'||sd.month, 'yyyy-mm')) as closing
from dts
left outer join sample_data sd partition by (sd.name, sd.month) on (sd.year = dts.year))
select name,
year,
month,
sales,
opening,
closing
from res
where (opening != 0 or closing != 0)
and year >= min_year_per_name_month
order by to_date(year||'-'||month, 'yyyy-mm'),
name;
NAME YEAR MONTH SALES OPENING CLOSING
---- ---- ----- ---------- ---------- ----------
A 2007 04 5 0 5
A 2007 05 2 5 7
A 2008 04 0 7 7
A 2008 05 0 7 7
B 2008 12 3 0 3
A 2009 04 0 7 7
A 2009 05 0 7 7
C 2009 08 89 0 89
B 2009 12 56 3 59
B 2016 01 10 59 69
A 2016 02 8 7 15
D 2016 03 12 0 12
A 2016 04 0 15 15
E 2016 04 34 0 34
A 2016 05 0 15 15
C 2016 08 0 89 89
B 2016 12 0 69 69
I've used a partitioned outer join to link every month and name combination in the table (in my query, the sample_data subquery; you wouldn't need that subquery, you'd just use your table instead!) to every year in the same table, and then worked out the opening / closing balances. I then discard any rows that have an opening and closing balance of 0.

Related

Counts and divide from two different selects with dates

I have a table with this kind of structure (Sample only)
ID | STATUS | DATE |
--- -------- ------
1 OPEN 31-01-2022
2 CLOSE 15-11-2021
3 CLOSE 21-10-2021
4 OPEN 11-10-2021
5 OPEN 28-09-2021
I would like to know the counts of close vs open records by week. So it will be count(close)/count(open) where close.week = open.week
If there are no matching values, need to return 0 of course.
I got to this query below
SELECT *
FROM
(SELECT COUNT(*) AS 'CLOSE', DATEPART(WEEK, DATE) AS 'WEEKSA', DATEPART(YEAR, DATE) AS 'YEARA' FROM TABLE
WHERE STATUS IN ('CLOSE')
GROUP BY DATEPART(WEEK, DATE),DATEPART(YEAR, DATE)) TMPA
FULL OUTER JOIN
(SELECT COUNT(*) AS 'OPEN', DATEPART(WEEK, DATE) AS 'WEEKSB', DATEPART(YEAR, DATE) AS 'YEARB' FROM TABLE
WHERE STATUS IN ('OPEN')
GROUP BY DATEPART(WEEK, DATE),DATEPART(YEAR, DATE)) TMPB
ON TMPA.WEEKSA = TMPB.WEEKSB AND TMPA.YEARA = TMPB.YEARB
My results are as below (sample only)
close | weeksa | yeara | open | weeksb | yearb |
------ -------- ------ ------- ------- ------
3 2 2021
1 3 2021
1 4 2021
2 20 2021 2 20 2021
7 22 2021
2 23 2021
7 26 2021
7 27 2021
2 28 2021 14 28 2021
2 29 2021
10 30
24 31 2021
2 32 2021 5 32
4 33 2021
1 34 2021 13 34 2021
6 35 2021
1 36 2021
1 38 2021
1 39 2021
2 41 2021
4 43 2021
1 45 2021
2 46 2021 25 46 2021
1 47 2021 5 47 2021
4 48 2021
1 49 2021 20 49 2021
1 50 2021 17 50 2021
1 51 2021
How do I do the math now?
If I do another select the query fails. So I guess either syntax is bad or the whole concept is wrong.
The required result should look like this (Sample)
WEEK | YEAR | RATIO |
----- ------ -------
2 2021 0
3 2021 0
4 2021 0
5 2021 0.93
20 2021 0.1
22 2021 0
23 2021 0
26 2021 0
1 2022 0.75
2 2022 0.23
4 2022 0.07
Cheers!
I have added some test data to check the logic, adding the same in the code.
;with cte as(
select 1 ID, 'OPEN' as STATUS, cast('2021-01-31' as DATE) DATE
union select 10 ID, 'CLOSE' as STATUS, cast('2021-01-31' as DATE) DATE
union select 11 ID, 'CLOSE' as STATUS, cast('2021-01-31' as DATE) DATE
union select 12 ID, 'CLOSE' as STATUS, cast('2021-01-31' as DATE) DATE
union select 22 ID, 'CLOSE' as STATUS, cast('2021-01-31' as DATE) DATE
union select 32 ID, 'CLOSE' as STATUS, cast('2021-01-31' as DATE) DATE
union select 2,'CLOSE',cast('2021-11-28' as DATE)
union select 3,'CLOSE',cast('2021-10-21' as DATE)
union select 8,'CLOSE',cast('2021-10-21' as DATE)
union select 9,'CLOSE',cast('2021-10-21' as DATE)
union select 4,'OPEN', cast('2021-10-11' as DATE)
union select 5,'CLOSE', cast('2021-09-28' as DATE)
union select 6,'OPEN', cast('2021-09-27' as DATE)
union select 7,'CLOSE', cast('2021-09-26' as DATE) )
, cte2 as (
select DATEPART(WEEK, date) as week_number, * from cte)
, cte3 as (
select week_number,
year(date) as yr,
count(case when status = 'open' then 1 end) as open_count,
count(case when status <> 'open' then 1 end) as close_count
from cte2
group by week_number, year(date))
select week_number as week, yr as year,
cast(case when open_count = 0 then 1.0 else open_count end /
case when close_count = 0 then 1.0 else close_count end as numeric(3,2)) as ratio
from cte3
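As a cross-check of the conditional-aggregation idea, here is a small sketch that computes the ratio exactly as the question words it: closed count divided by open count per week, with 0 when a week has no open records. It runs in SQLite through Python's sqlite3 module, so the table name `tickets`, the column `d` and the use of `strftime('%W', ...)` are assumptions; SQLite's week numbering is not the same as SQL Server's `DATEPART(WEEK, ...)`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table tickets (id integer, status text, d text)")  # hypothetical table
conn.executemany(
    "insert into tickets values (?, ?, ?)",
    [
        (1, "OPEN", "2021-09-28"), (2, "OPEN", "2021-09-29"), (3, "CLOSE", "2021-09-29"),
        (4, "OPEN", "2021-10-12"), (5, "CLOSE", "2021-10-13"), (6, "CLOSE", "2021-10-13"),
        (7, "CLOSE", "2021-11-17"),   # a week with closes only
        (8, "OPEN", "2022-01-31"),    # a week with opens only
    ],
)

ratios = conn.execute("""
    with weekly as (
        select cast(strftime('%Y', d) as integer) as yr,
               cast(strftime('%W', d) as integer) as wk,
               sum(case when status = 'CLOSE' then 1 else 0 end) as closed,
               sum(case when status = 'OPEN'  then 1 else 0 end) as opened
        from tickets
        group by 1, 2
    )
    select yr, wk,
           -- guard against division by zero: no open records means ratio 0
           case when opened = 0 then 0.0
                else round(1.0 * closed / opened, 2) end as ratio
    from weekly
    order by yr, wk
""").fetchall()

for yr, wk, ratio in ratios:
    print(yr, wk, ratio)
```

Counting both statuses in one pass over the table avoids the FULL OUTER JOIN of two subqueries, and every week that has any record appears exactly once.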

Generate a range of records depending on from-to dates

I have a table of records like this:
Item | From       | To         |
---- ------------ ------------
A      2018-01-03   2018-03-16
B      2021-05-25   2021-11-10
The output of select should look like:
Item | Month | Year |
---- ------- ------
A      01      2018
A      02      2018
A      03      2018
B      05      2021
B      06      2021
B      07      2021
B      08      2021
Also, the range should not exceed the current month. In the example above we are assuming the current day is 2021-08-01.
I am trying to do something similar to THIS with CONNECT BY LEVEL, but as soon as I also select from my table next to dual and try to order the records, the query never completes. I also have to join a few other tables to the selection, but I don't think that would make a difference.
I would very much appreciate your help.
Row generator it is, but not as you did it; most probably you're missing lines #11 - 16 in my query (or their alternative).
SQL> with test (item, date_from, date_to) as
2 -- sample data
3 (select 'A', date '2018-01-03', date '2018-03-16' from dual union all
4 select 'B', date '2021-05-25', date '2021-11-10' from dual
5 )
6 -- query that returns desired result
7 select item,
8 extract(month from (add_months(date_from, column_value - 1))) month,
9 extract(year from (add_months(date_from, column_value - 1))) year
10 from test cross join
11 table(cast(multiset
12 (select level
13 from dual
14 connect by level <=
15 months_between(trunc(least(sysdate, date_to), 'mm'), trunc(date_from, 'mm')) + 1
16 ) as sys.odcinumberlist))
17 order by item, year, month;
ITEM MONTH YEAR
----- ---------- ----------
A 1 2018
A 2 2018
A 3 2018
B 5 2021
B 6 2021
B 7 2021
B 8 2021
7 rows selected.
SQL>
Recursive CTEs are the standard SQL approach to this type of problem. In Oracle, this looks like:
with cte(item, fromd, tod) as (
select item, fromd, tod
from t
union all
select item, add_months(fromd, 1), tod
from cte
where add_months(fromd, 1) < last_day(tod)
)
select item, extract(year from fromd) as year, extract(month from fromd) as month
from cte
order by item, fromd;
Here is a db<>fiddle.
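For comparison, here is a minimal sketch of the recursive approach that also caps the range at the current month, which the question asks for. It is written for SQLite and run through Python's sqlite3 module, so the date functions (`date(..., 'start of month')`, `date(..., '+1 month')`) are SQLite's rather than Oracle's, and the helper name `months_for_items` and the `today` parameter are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table t (item text, fromd text, tod text)")
conn.executemany(
    "insert into t values (?, ?, ?)",
    [("A", "2018-01-03", "2018-03-16"), ("B", "2021-05-25", "2021-11-10")],
)

def months_for_items(conn, today):
    """Return one (item, month, year) row per month, never past the month of `today`."""
    return conn.execute("""
        with recursive cte(item, month_start, last_month) as (
            -- anchor: first month of each range, plus the capped final month
            select item,
                   date(fromd, 'start of month'),
                   min(date(tod, 'start of month'), date(:today, 'start of month'))
            from t
            where date(fromd, 'start of month') <= date(:today, 'start of month')
            union all
            -- step: advance one month until the capped final month is reached
            select item, date(month_start, '+1 month'), last_month
            from cte
            where date(month_start, '+1 month') <= last_month
        )
        select item, strftime('%m', month_start), strftime('%Y', month_start)
        from cte
        order by item, month_start
    """, {"today": today}).fetchall()

rows = months_for_items(conn, "2021-08-01")
for row in rows:
    print(row)
```

Truncating both ends to the first of the month before stepping keeps the comparison simple and avoids end-of-month surprises when adding a month to a date such as the 31st.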

Dynamically adding zero-valued records for subsequent APs for analytical function to work

with data as (
select 1 id, 'A' name, 'fruit' r_group, '2007' year, '04' month, 5 sales from dual union all
select 2 id, 'Z' name, 'fruit' r_group, '2007' year, '04' month, 99 sales from dual union all
select 3 id, 'A' name, 'fruit' r_group, '2008' year, '05' month, 10 sales from dual union all
select 4 id, 'B' name, 'vegetable' r_group, '2008' year, '07' month, 20 sales from dual
)
select t.*,
(sum(sales) over (partition by name, r_group
order by year, month
rows between unbounded preceding and current row
) -sales ) as opening,
sum(sales) over (partition by name, r_group
order by year, month
rows between unbounded preceding and current row
) as closing
from data t
order by year , month
Output will be:
year | month | name | r_group | sales | opening | closing |
2007 | 04 | 'A' | fruit | 5 | 0 | 5 |
2007 | 04 | 'Z' | fruit | 99 | 0 | 99 |
2008 | 05 | 'A' | fruit | 10 | 5 | 15 |
2008 | 07 | 'B' | vegetable | 20 | 0 | 20 |
If I aggregate now on top of this select statement using this:
select year, month, r_group, sum(sales) sales, sum(opening) opening, sum(closing) closing from (
select t.*,
(sum(sales) over........
)
group by year, month, r_group
order by year, month
I get the following result:
year | month | r_group | sales | opening | closing |
2007 | 04 | fruit | 104 | 0 | 104 |
2008 | 05 | fruit | 10 | 5 | 15 |
2008 | 07 | vegetable | 20 | 0 | 20 |
which is wrong. Notice that the value of name='Z' has not been taken into account at all in 2008. Since the cumulative function works backwards it didn't have a name='Z' record in 2008 to go backwards with. If I put a zero-value record in 2008, for name = 'Z' then it will work. I want to avoid adding dummy zero-valued records and have this done dynamically in the query. If I add the zero-valued record in the data like this:
select 1 id, 'A' name, 'fruit' r_group, '2007' year, '04' month, 5 sales from dual union all
select 2 id, 'Z' name, 'fruit' r_group, '2007' year, '04' month, 99 sales from dual union all
select 3 id, 'A' name, 'fruit' r_group, '2008' year, '05' month, 10 sales from dual union all
select 4 id, 'Z' name, 'fruit' r_group, '2008' year, '05' month, 0 sales from dual union all
select 5 id, 'B' name, 'vegetable' r_group, '2008' year, '07' month, 20 sales from dual
then the first query will output:
year | month | name | r_group | sales | opening | closing |
2007 | 04 | 'A' | fruit | 5 | 0 | 5 |
2007 | 04 | 'Z' | fruit | 99 | 0 | 99 |
2008 | 05 | 'A' | fruit | 10 | 5 | 15 |
2008 | 05 | 'Z' | fruit | 0 | 99 | 99 |
2008 | 07 | 'B' | vegetable | 20 | 0 | 20 |
and if I aggregate again using the second outer select I will get:
year | month | r_group | sales | opening | closing |
2007 | 04 | fruit | 104 | 0 | 104 |
2008 | 05 | fruit | 10 | 104 | 114 |
2008 | 07 | vegetable | 20 | 0 | 20 |
which is correct. However, as I mentioned, I do not want to add zero-valued records. There is discussion on just this topic here: https://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:8912311513313 but I haven't been able to make this work.
A fairly simplistic approach (and similar to what that AskTom link shows) is to extract all the year/month pairs, and all the name/r_group pairs, and then cross-join those:
with data as (
select 1 id, 'A' name, 'fruit' r_group, '2007' year, '04' month, 5 sales from dual union all
select 2 id, 'Z' name, 'fruit' r_group, '2007' year, '04' month, 99 sales from dual union all
select 3 id, 'A' name, 'fruit' r_group, '2008' year, '05' month, 10 sales from dual union all
select 4 id, 'B' name, 'vegetable' r_group, '2008' year, '07' month, 20 sales from dual
)
select a.year, a.month, b.name, b.r_group, nvl(d.sales, 0) as sales
from (select distinct year, month from data) a
cross join (select distinct name, r_group from data) b
left join data d on d.year = a.year and d.month = a.month and d.name = b.name and d.r_group = b.r_group
order by year, month, name, r_group;
YEAR MO N R_GROUP SALES
---- -- - --------- ----------
2007 04 A fruit 5
2007 04 B vegetable 0
2007 04 Z fruit 99
2008 05 A fruit 10
2008 05 B vegetable 0
2008 05 Z fruit 0
2008 07 A fruit 0
2008 07 B vegetable 20
2008 07 Z fruit 0
But that produces more rows than you wanted with your first level of aggregation:
YEAR MO N R_GROUP SALES OPENING CLOSING
---- -- - --------- ---------- ---------- ----------
2007 04 A fruit 5 0 5
2007 04 B vegetable 0 0 0
2007 04 Z fruit 99 0 99
2008 05 A fruit 10 5 15
2008 05 B vegetable 0 0 0
2008 05 Z fruit 0 99 99
2008 07 A fruit 0 15 15
2008 07 B vegetable 20 0 20
2008 07 Z fruit 0 99 99
and when aggregated with your second level (from the other query) would produce extra rows for, say, 2007/04/vegetable:
YEAR MO R_GROUP SALES OPENING CLOSING
---- -- --------- ---------- ---------- ----------
2007 04 fruit 104 0 104
2007 04 vegetable 0 0 0
2008 05 fruit 10 104 114
2008 05 vegetable 0 0 0
2008 07 fruit 0 114 114
2008 07 vegetable 20 0 20
You could partially filter those out before aggregating, because all the intermediate columns would be zero:
with data as (
select 1 id, 'A' name, 'fruit' r_group, '2007' year, '04' month, 5 sales from dual union all
select 2 id, 'Z' name, 'fruit' r_group, '2007' year, '04' month, 99 sales from dual union all
select 3 id, 'A' name, 'fruit' r_group, '2008' year, '05' month, 10 sales from dual union all
select 4 id, 'B' name, 'vegetable' r_group, '2008' year, '07' month, 20 sales from dual
)
select year,
month,
r_group,
sum(sales) sales,
sum(opening) opening,
sum(closing) closing
from (
select t.*,
(sum(sales) over (partition by name, r_group
order by year, month
rows between unbounded preceding and current row
) -sales ) as opening,
sum(sales) over (partition by name, r_group
order by year, month
rows between unbounded preceding and current row
) as closing
from (
select a.year, a.month, b.name, b.r_group, nvl(d.sales, 0) as sales
from (select distinct year, month from data) a
cross join (select distinct name, r_group from data) b
left join data d
on d.year = a.year and d.month = a.month and d.name = b.name and d.r_group = b.r_group
) t
)
where sales != 0 or opening != 0 or closing != 0
group by year, month, r_group
order by year, month;
to get:
YEAR MO R_GROUP SALES OPENING CLOSING
---- -- --------- ---------- ---------- ----------
2007 04 fruit 104 0 104
2008 05 fruit 10 104 114
2008 07 fruit 0 114 114
2008 07 vegetable 20 0 20
You could further filter that result to remove rows where the aggregated sales value is still zero, though if you're doing that the filter before aggregation isn't needed any more; but it's still a bit messy. And it isn't clear if your outermost aggregation can be modified to do that.
This can be done using a partitioned outer join - but first you have to find the distinct name/r_group combinations and then partition outer join accordingly:
with data as (select 1 id, 'A' name, 'fruit' r_group, '2007' year, '04' month, 5 sales from dual union all
select 2 id, 'Z' name, 'fruit' r_group, '2007' year, '04' month, 99 sales from dual union all
select 3 id, 'A' name, 'fruit' r_group, '2008' year, '05' month, 10 sales from dual union all
select 4 id, 'B' name, 'vegetable' r_group, '2008' year, '07' month, 20 sales from dual),
data2 as (select distinct name, r_group
from data),
res as (select d.year,
d.month,
d2.r_group,
d.id,
d2.name,
nvl(d.sales, 0) sales,
sum(nvl(d.sales, 0)) over (partition by d2.name, d2.r_group
order by d.year, d.month
rows between unbounded preceding and current row) - nvl(d.sales,0) as opening,
sum(nvl(d.sales, 0)) over (partition by d2.name, d2.r_group
order by d.year, d.month
rows between unbounded preceding and current row) as closing
from data2 d2
left outer join data d partition by (d.year, d.month) on (d.name = d2.name and d.r_group = d2.r_group))
select year,
month,
r_group,
sum(sales) sales,
sum(opening) opening,
sum(closing) closing
from res
where sales != 0
or opening != 0
or closing != 0
group by year,
month,
r_group
order by year,
month;
YEAR MONTH R_GROUP SALES OPENING CLOSING
---- ----- --------- ---------- ---------- ----------
2007 04 fruit 104 0 104
2008 05 fruit 10 104 114
2008 07 fruit 0 114 114
2008 07 vegetable 20 0 20
This is very similar to Alex's answer, but the use of the partition outer join negates the need to find the distinct year/month pairs, as that is taken care of in the join clause.
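For databases without partitioned outer joins, the same densification can be sketched with a cross join plus a left join. The version below runs in SQLite through Python's sqlite3 module; the table name `sales_data`, the CTE names and the use of `coalesce` instead of `nvl` are assumptions made for illustration, and the opening balance is derived as closing minus sales.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table sales_data (id integer, name text, r_group text, "
    "year text, month text, sales integer)"
)
conn.executemany(
    "insert into sales_data values (?, ?, ?, ?, ?, ?)",
    [
        (1, "A", "fruit", "2007", "04", 5),
        (2, "Z", "fruit", "2007", "04", 99),
        (3, "A", "fruit", "2008", "05", 10),
        (4, "B", "vegetable", "2008", "07", 20),
    ],
)

totals = conn.execute("""
    with periods as (select distinct year, month from sales_data),
         names   as (select distinct name, r_group from sales_data),
         dense   as (
             -- every name/r_group in every period, with 0 where no sale exists
             select p.year, p.month, n.name, n.r_group, coalesce(d.sales, 0) as sales
             from periods p
             cross join names n
             left join sales_data d
               on d.year = p.year and d.month = p.month
              and d.name = n.name and d.r_group = n.r_group
         ),
         running as (
             select dense.*,
                    sum(sales) over (partition by name, r_group order by year, month
                        rows between unbounded preceding and current row) as closing
             from dense
         )
    select year, month, r_group,
           sum(sales) as sales, sum(closing - sales) as opening, sum(closing) as closing
    from running
    where sales != 0 or closing != 0   -- drop filler rows before the first real sale
    group by year, month, r_group
    order by year, month, r_group
""").fetchall()

for row in totals:
    print(row)
```

The filler row for Z in 2008 is what lets its 99 carry forward into the fruit totals for later periods.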

SQL NOOB - Oracle joins and Row Number

I was hoping to get some guidance on a SQL script I am trying to put together for Oracle database 11g.
I am attempting to perform a count of claims from the 'claim' table, and order them by year / month / and enterprise.
I was able to get a count of claims and order them like I would like, however I need to pull data from another table and I am having trouble combining the 'row_number' function with a join.
Here is my script so far:
SELECT TO_CHAR (SYSTEM_ENTRY_DATE, 'YYYY') YEAR,
TO_CHAR (SYSTEM_ENTRY_DATE, 'MM') MONTH,
ENTERPRISE_IID,
COUNT (*) CLAIMS
FROM (SELECT CLAIM.CLAIM_EID,
CLAIM.SYSTEM_ENTRY_DATE,
CLAIM.ENTERPRISE_IID,
ROW_NUMBER () OVER (PARTITION BY CLAIM.CLAIM_EID, CLAIM.ENTERPRISE_IID
ORDER BY CLAIM.SYSTEM_ENTRY_DATE DESC) RN
FROM CLAIM
WHERE CLAIM_IID IN (SELECT DISTINCT (CLAIM_IID)
FROM CLAIM_LINE
WHERE STATUS <> 'D')
AND CLAIM.CONTEXT = '1'
AND CLAIM.CLAIM_STATUS = 'A'
AND CLAIM.LAST_ANALYSIS_DATE IS NOT NULL)
WHERE RN = 1
GROUP BY ENTERPRISE_IID,
TO_CHAR (SYSTEM_ENTRY_DATE, 'YYYY'),
TO_CHAR (SYSTEM_ENTRY_DATE, 'MM');
So far all of my data is coming from the 'claim' table. This pulls the following result:
YEAR MONTH ENTERPRISE_IID CLAIMS
---- ----- -------------- ----------
2016 01 6 1
2015 08 6 3
2016 02 6 2
2015 09 6 2
2015 07 6 2
2015 09 5 22
2015 11 5 29
2015 12 5 27
2016 04 5 8
2015 07 5 29
2015 05 5 15
2015 06 5 5
2015 10 5 45
2016 03 5 54
2015 03 5 10
2016 02 5 70
2016 01 5 55
2015 08 5 32
2015 04 5 12
19 rows selected.
The enterprise_IID is the primary key on the 'enterprise' table. The 'enterprise' table also contains the 'name' attribute for each entry. I would like to join the claim and enterprise table in order to show the enterprise name for this count, and not the enterprise_IID.
As you can tell I am rather new to Oracle and SQL, and I am a bit stuck on this one. I was thinking that I should do an inner join between the two tables, but I am not quite sure how to do that when using the row_number function.
Or perhaps I am taking the wrong approach here, and someone could push me in another direction.
Here is what I tried:
SELECT TO_CHAR (SYSTEM_ENTRY_DATE, 'YYYY') YEAR,
TO_CHAR (SYSTEM_ENTRY_DATE, 'MM') MONTH,
ENTERPRISE_IID,
ENTERPRISE.NAME,
COUNT (*) CLAIMS
FROM (SELECT CLAIM.CLAIM_EID,
CLAIM.SYSTEM_ENTRY_DATE,
CLAIM.ENTERPRISE_IID,
ROW_NUMBER () OVER (PARTITION BY CLAIM.CLAIM_EID, CLAIM.ENTERPRISE_IID
ORDER BY CLAIM.SYSTEM_ENTRY_DATE DESC) RN
FROM CLAIM, enterprise
INNER JOIN ENTERPRISE
ON CLAIM.ENTERPRISE_IID = ENTERPRISE.ENTERPRISE_IID
WHERE CLAIM_IID IN (SELECT DISTINCT (CLAIM_IID)
FROM CLAIM_LINE
WHERE STATUS <> 'D')
AND CLAIM.CONTEXT = '1'
AND CLAIM.CLAIM_STATUS = 'A'
AND CLAIM.LAST_ANALYSIS_DATE IS NOT NULL)
WHERE RN = 1
GROUP BY ENTERPRISE.NAME,
ENTERPRISE_IID,
TO_CHAR (SYSTEM_ENTRY_DATE, 'YYYY'),
TO_CHAR (SYSTEM_ENTRY_DATE, 'MM');
Thank you in advance!
"Desired Output"
YEAR MONTH NAME CLAIMS
---- ----- ---- ----------
2016 01 Ent1 1
2015 08 Ent1 3
2016 02 Ent1 2
2015 09 Ent1 2
2015 07 Ent1 2
2015 09 Ent2 22
2015 11 Ent2 29
2015 12 Ent2 27
2016 04 Ent2 8
2015 07 Ent2 29
2015 05 Ent2 15
2015 06 Ent2 5
2015 10 Ent2 45
2016 03 Ent2 54
2015 03 Ent2 10
2016 02 Ent2 70
2016 01 Ent2 55
2015 08 Ent2 32
2015 04 Ent2 12
19 rows selected.
You can try this. Joins can be used when calculating row numbers with row_number function.
SELECT TO_CHAR (SYSTEM_ENTRY_DATE, 'YYYY') YEAR,
TO_CHAR (SYSTEM_ENTRY_DATE, 'MM') MONTH,
ENTERPRISE_IID,
NAME,
COUNT (*) CLAIMS
FROM (SELECT CLAIM.CLAIM_EID,
CLAIM.SYSTEM_ENTRY_DATE,
CLAIM.ENTERPRISE_IID,
ENTERPRISE.NAME,
ROW_NUMBER () OVER (PARTITION BY CLAIM.CLAIM_EID, CLAIM.ENTERPRISE_IID
ORDER BY CLAIM.SYSTEM_ENTRY_DATE DESC) RN
FROM CLAIM --, enterprise (this is not required as the table is being joined already)
INNER JOIN ENTERPRISE ON CLAIM.ENTERPRISE_IID = ENTERPRISE.ENTERPRISE_IID
INNER JOIN (SELECT DISTINCT CLAIM_IID FROM CLAIM_LINE WHERE STATUS <> 'D') CLAIM_LINE
ON CLAIM.CLAIM_IID = CLAIM_LINE.CLAIM_IID
WHERE CLAIM.CONTEXT = '1'
AND CLAIM.CLAIM_STATUS = 'A'
AND CLAIM.LAST_ANALYSIS_DATE IS NOT NULL) t
WHERE RN = 1
GROUP BY NAME, --ENTERPRISE.NAME (The alias ENTERPRISE is not accessible here.)
ENTERPRISE_IID,
TO_CHAR(SYSTEM_ENTRY_DATE, 'YYYY'),
TO_CHAR(SYSTEM_ENTRY_DATE, 'MM');
I'd write the query like this:
SELECT TO_CHAR(TRUNC(c.system_entry_date,'MM'),'YYYY') AS year
, TO_CHAR(TRUNC(c.system_entry_date,'MM'),'MM') AS month
, e.enterprise_name AS name
, COUNT(*) AS claims
FROM (
SELECT r.claim_eid
, r.enterprise_iid
, MAX(r.system_entry_date) AS system_entry_date
FROM ( SELECT DISTINCT l.claim_iid
FROM claim_line l
WHERE l.status <> 'D'
) d
JOIN claim r
ON r.claim_iid = d.claim_iid
AND r.context = '1'
AND r.claim_status = 'A'
AND r.last_analysis_date IS NOT NULL
GROUP
BY r.claim_eid
, r.enterprise_iid
) c
JOIN enterprise e
ON e.enterprise_iid = c.enterprise_iid
GROUP
BY c.enterprise_iid
, TRUNC(c.system_entry_date,'MM')
, e.enterprise_name
ORDER
BY e.enterprise_name
, TRUNC(c.system_entry_date,'MM')
A few notes:
I prefer to qualify ALL column references with the table name or a short table alias, and assign aliases to all inline views.
Since the usage of ROW_NUMBER() appears to be to get the "latest" system_entry_date for a claim and eliminate duplicates, I'd prefer to use a GROUP BY and a MAX() aggregate.
I prefer to use a join operation rather than the IN (subquery) pattern. (Or, I would tend to use an EXISTS (correlated subquery) pattern.)
I don't think it matters too much if you use TO_CHAR or EXTRACT. TO_CHAR gets you the leading zero in the month; EXTRACT(MONTH ) returns a number, so there is no leading zero. I'd use whichever gets me closest to the resultset I need. Personally, I would return just a single column, either containing the year and month as one string e.g. TO_CHAR( , 'YYYYMM') or just a DATE value. It all depends what I'm going to be doing with that.
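The point about ROW_NUMBER() versus GROUP BY with MAX() can be checked on a toy table. This is only a sketch, run in SQLite through Python's sqlite3 module; the sample rows are made up, and the table is reduced to the three columns that matter here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table claim (claim_eid text, enterprise_iid integer, system_entry_date text)"
)
conn.executemany(
    "insert into claim values (?, ?, ?)",
    [
        ("C1", 5, "2016-01-10"), ("C1", 5, "2016-02-03"),  # duplicate claim, two entry dates
        ("C2", 5, "2016-02-20"),
        ("C3", 6, "2015-08-01"), ("C3", 6, "2015-08-15"),
    ],
)

# Variant 1: number the rows per claim, newest first, and keep row 1.
latest_by_row_number = conn.execute("""
    select claim_eid, enterprise_iid, system_entry_date
    from (select c.*,
                 row_number() over (partition by claim_eid, enterprise_iid
                                    order by system_entry_date desc) as rn
          from claim c)
    where rn = 1
    order by claim_eid, enterprise_iid
""").fetchall()

# Variant 2: one row per claim with the greatest entry date.
latest_by_group_by = conn.execute("""
    select claim_eid, enterprise_iid, max(system_entry_date)
    from claim
    group by claim_eid, enterprise_iid
    order by claim_eid, enterprise_iid
""").fetchall()

print(latest_by_row_number == latest_by_group_by)
```

The two variants agree here because only the partition columns and the ordering column are selected; if other columns from the newest row were needed, ROW_NUMBER() would be the more direct tool.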
Just a hypothesis to start with, because the required query output is unclear:
SELECT
C.ENTERPRISE_IID,
E.ENTERPRISE_NAME,
extract(year from C.SYSTEM_ENTRY_DATE) SYSTEM_ENTRY_YEAR,
extract(month from C.SYSTEM_ENTRY_DATE) SYSTEM_ENTRY_MONTH,
count(distinct C.CLAIM_EID) CLAIM_COUNT
FROM
CLAIM C,
ENTERPRISE E
WHERE
C.CLAIM_IID IN (
SELECT DISTINCT (CLAIM_IID)
FROM CLAIM_LINE
WHERE STATUS <> 'D'
)
AND C.CONTEXT = '1'
AND C.CLAIM_STATUS = 'A'
AND C.LAST_ANALYSIS_DATE IS NOT NULL
AND E.ENTERPRISE_IID = C.ENTERPRISE_IID
GROUP BY
C.ENTERPRISE_IID,
E.ENTERPRISE_NAME,
extract(year from C.SYSTEM_ENTRY_DATE),
extract(month from C.SYSTEM_ENTRY_DATE)
ORDER BY
extract(year from C.SYSTEM_ENTRY_DATE),
extract(month from C.SYSTEM_ENTRY_DATE),
E.ENTERPRISE_NAME

SQL Aggregate function: Sum of some rows in another row

I have the following scenario with SQL Server 2008:
**** Original Result ****
================================================
Year month Category Count_days
================================================
2001 09 Leave 03
2001 09 Worked Below 8hrs 18
2001 09 Worked Above 8hrs 05
2001 09 Present 0 <----- current value
2001 10 Leave 01
2001 10 Worked Below 8hrs 10
2001 10 Worked Above 8hrs 09
2001 10 Present 0 <------ current value
Following is the criteria
criteria
===========
Present Count of 'x'th Month = SUM(Worked Below 8hrs count of 'x'th month) +
SUM(Worked Above 8hrs count of 'x'th month )
;where x is the month
I want the following result, satisfying the above criteria:
**** Expected Result ****
===============================================
Year month Category Count_days
================================================
2001 09 Leave 03
2001 09 Worked Below 8hrs 18
2001 09 Worked Above 8hrs 05
2001 09 Present 23 <-----(expecting sum 18+05 =23)
2001 10 Leave 01
2001 10 Worked Below 8hrs 10
2001 10 Worked Above 8hrs 09
2001 10 Present 19 <-----(expecting sum 10+09 = 19)
The problem is that the original result is generated by a very complex query, so I can't query the same set twice, i.e.
I cannot use this (it would hurt the performance of my application):
=================
select * from original (some join) select * from original
It may need to be a single query, or it could use a subquery, an aggregate function, etc.
Is there any aggregation trick that would generate my expected result?
Please help me out, guys.
You can use SUM as an analytic function:
SELECT
year, month, cat, count_days as count_days_orig,
case cat
when 'Present'
then
sum (
case
when cat in ('Worked Below 8hrs', 'Worked Above 8hrs')
then count_days
else 0
end
)
over (partition by year, month)
else count_days
end as count_days_calc
FROM
(
SELECT 2001 as year, 09 as month , 'Leave ' as cat , 03 as count_days FROM dual
UNION all
SELECT 2001 as year, 09 as month , 'Worked Below 8hrs' as cat , 18 as count_days FROM dual
UNION all
SELECT 2001 as year, 09 as month , 'Worked Above 8hrs' as cat , 05 as count_days FROM dual
UNION all
SELECT 2001 as year, 09 as month , 'Present' as cat , 0 as count_days FROM dual
UNION all
SELECT 2001 as year, 10 as month , 'Leave ' as cat , 01 as count_days FROM dual
UNION all
SELECT 2001 as year, 10 as month , 'Worked Below 8hrs' as cat , 10 as count_days FROM dual
UNION all
SELECT 2001 as year, 10 as month , 'Worked Above 8hrs' as cat , 09 as count_days FROM dual
UNION all
SELECT 2001 as year, 10 as month , 'Present' as cat , 0 as count_days FROM dual
)
;
year month cat count_days_orig count_days_calc
--------------------------------------------------------------------------
2001 9 Leave 3 3
2001 9 Worked Below 8hrs 18 18
2001 9 Worked Above 8hrs 5 5
2001 9 Present 0 23
2001 10 Leave 1 1
2001 10 Worked Below 8hrs 10 10
2001 10 Worked Above 8hrs 9 9
2001 10 Present 0 19
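The same conditional windowed sum can be tried without an Oracle instance. The sketch below runs it in SQLite (3.25 or later) through Python's sqlite3 module; the table name `attendance` is an assumption standing in for the result of the complex query, which is read only once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table attendance (year integer, month integer, cat text, count_days integer)"
)
conn.executemany(
    "insert into attendance values (?, ?, ?, ?)",
    [
        (2001, 9, "Leave", 3), (2001, 9, "Worked Below 8hrs", 18),
        (2001, 9, "Worked Above 8hrs", 5), (2001, 9, "Present", 0),
        (2001, 10, "Leave", 1), (2001, 10, "Worked Below 8hrs", 10),
        (2001, 10, "Worked Above 8hrs", 9), (2001, 10, "Present", 0),
    ],
)

result = conn.execute("""
    select year, month, cat,
           case cat
               when 'Present' then
                   -- per-month total of the two "Worked" categories only
                   sum(case when cat in ('Worked Below 8hrs', 'Worked Above 8hrs')
                            then count_days else 0 end)
                       over (partition by year, month)
               else count_days
           end as count_days_calc
    from attendance
    order by year, month, rowid
""").fetchall()

present = [row[3] for row in result if row[2] == "Present"]
print(present)
```

Because the window is partitioned by year and month, each 'Present' row sees only its own month's rows, so no self-join on the source query is needed.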
Something like this; I don't know if the column names are correct.
SELECT Year, month, category,
CASE Category
WHEN 'Present'
THEN (
SELECT Sum(T2.Count_days)
FROM table T2
WHERE T2.year = T.year
AND T2.month = T.month
AND T2.Category NOT IN ('Present', 'Leave')
)
ELSE Count_days
END
FROM table T
But this really feels like a wrong design...