SQL payment distribution

I am looking for a query where a certain amount gets distributed to each invoice below, based on account_num and item_order. Also, if partial_payment_allowed is set to 'N', the amount should only be distributed when the amount remaining to distribute is greater than the invoice_amt; otherwise it should skip the row and carry on to the next invoice of the account.
Item_order  inv_amount  Partial_pmt  account_num  cr_amt
         1        1256  Y                     12    1000
         2        1134  Y                     12    1000
         1         800  Y                     13    1200
         2         200  N                     13    1200
         3         156  N                     13    1200
In the above data, each account has a cr_amt which can be distributed according to item_order. So after distribution the result would be:
account_num  Item_order  inv_amount  Partial_pmt  Dist_amt  Bal_amt
         12           1        1256  Y                1000      256
         12           2        1134  Y                 256      878
         13           1         800  Y                 800      400
         13           2         200  N                 200      200
         13           3         156  N                 100      100
We are trying to avoid loops; any comments are highly appreciated. Thank you.

Extending the answer to this question:
payment distrubution oracle sql query
You can still use the SQL MODEL clause. In this version, you need separate calculations for each distinct account_num. You can achieve this using the PARTITION keyword of the SQL MODEL clause to partition by account_num.
Like this (see SQL comments for step-by-step explanation):
-- Set up test data (since I don't have your table)
WITH inv_raw (item_order, inv_amount, partial_pmt_allowed, account_num, cr_amt) AS (
  SELECT 1, 1256, 'Y', 12, 1000 FROM DUAL UNION ALL
  SELECT 2, 1134, 'Y', 12, 1000 FROM DUAL UNION ALL
  SELECT 3,  800, 'Y', 13, 1200 FROM DUAL UNION ALL
  SELECT 4,  200, 'N', 13, 1200 FROM DUAL UNION ALL
  SELECT 5,  156, 'N', 13, 1200 FROM DUAL
),
-- Ensure that the column we are ordering by is densely populated
inv_dense (dense_item_order, item_order, inv_amount, partial_pmt_allowed, account_num, cr_amt) AS (
  SELECT DENSE_RANK() OVER ( PARTITION BY account_num ORDER BY item_order ),
         item_order, inv_amount, partial_pmt_allowed, account_num, cr_amt
  FROM   inv_raw
)
-- (No separate parameter CTE is needed in this version, unlike the original answer:
--  the amount to distribute comes from cr_amt in the data itself.)
-- The actual query starts here
SELECT
account_num,
item_order,
inv_amount,
partial_pmt_allowed,
applied dist_amount,
remaining_out balance_amt,
cr_amt
FROM inv_dense
MODEL
-- We want a completely separate calculation for each distinct account_num
PARTITION BY ( account_num )
-- We'll output one row for each value of dense_item_order.
-- We made item_order "dense" so we can do things like CV()-1 to get the
-- previous row's values.
DIMENSION BY ( dense_item_order )
MEASURES ( cr_amt, item_order, inv_amount,
partial_pmt_allowed, 0 applied,
0 remaining_in, 0 remaining_out )
RULES AUTOMATIC ORDER (
-- The amount carried into the first row is the payment amount
remaining_in[1] = cr_amt[1],
-- The amount carried into subsequent rows is the amount we carried out of the prior row
remaining_in[dense_item_order > 1] = remaining_out[CV()-1],
-- The amount applied depends on whether the amount remaining can cover the invoice
-- and whether partial payments are allowed
applied[ANY] = CASE WHEN remaining_in[CV()] >= inv_amount[CV()] OR partial_pmt_allowed[CV()] = 'Y' THEN LEAST(inv_amount[CV()], remaining_in[CV()]) ELSE 0 END,
-- The amount we carry out is the amount we brought in minus what we applied
remaining_out[ANY] = remaining_in[CV()] - applied[CV()]
)
ORDER BY account_num, item_order;
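As a quick trace of the rules for account 13 (cr_amt 1200): remaining_in[1] = 1200, and the first invoice (800, partial payment allowed) gives applied[1] = LEAST(800, 1200) = 800 with remaining_out[1] = 400; then remaining_in[2] = 400 and, although partial payment is not allowed on the 200 invoice, 400 >= 200, so applied[2] = 200 and remaining_out[2] = 200.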

Related

subtract and add between columns and rows

I have some data that looks like this:
id  date        total_amount  adj_amount
1   2017-01-02           100          50
1   2017-01-02            50           0
2   2017-01-15           100          35
2   2017-01-15            35           0
3   2017-01-30           120          50
3   2017-01-30          -120         -50
3   2017-01-30           100          50
3   2017-01-30            50           0
3   2017-01-30            60          40
The output should look like the table below; I have no clue how to do the subtraction between rows and columns.
id  date        due_amount
1   2017-01-02           0
2   2017-01-15           0
3   2017-01-30          40
Here is my current code, but it only works for ids 1 and 2 - definitely not for 3.
The logic is to find the due amount across the entries for each id. For example, id 1 has two entries: the total amount is 100, then he paid 50, so the adj amount is 50; in the second entry the total amount is 50, he paid 50, so the adj amount is 0. So id 1's due amount is 0 in the end.
Id 3 has 5 entries. The first entry shows that the total amount for id 3 is 120 and he paid 70, so the adj amount is 50; but that first entry was a mistake, so all the amounts were reversed. The third entry then shows the total amount is 100, id 3 paid 50, so the adj amount is 50. The fourth entry shows the total amount is 50, id 3 also paid 50, so the adj amount is 0. And the fifth entry shows that the total amount is 60, and id 3 paid 20, so the adj amount is 40. So in the end, id 3's due amount is 40.
select distinct a.id,
a.date,
case when a.date=b.date and a.total_amount = b.adj_amount then a.adj_amount
when a.date=b.date and a.total_amount <> b.adj_amount then ABS(a.adj_amount + b.adj_amount)
else a.adj_amount
end as due_amount
from table a,
table b
where a.id=b.id;
I just wonder if there is any function which can do this kind of calculation between rows and columns.
Use GROUP BY and SUM().
SELECT the_date, SUM(due_amount)
FROM tab
GROUP BY the_date;
Something like this could work - if the transactions can be ordered. Note that I've renamed some of the columns to help clarify their meaning. I've also added a trans_seq_num column to indicate the order of a customer's transactions on a particular date. I think you're looking for the amount that the customer still owes as of their last payment.
WITH sample (id, trans_seq_num, some_date, starting_balance, ending_balance) AS
(
SELECT '1',1,'2017-01-02','100','50' FROM dual UNION ALL
SELECT '1',2,'2017-01-02','50','0' FROM dual UNION ALL
SELECT '2',1,'2017-01-15','35','0' FROM dual UNION ALL
SELECT '2',2,'2017-01-15','100','35' FROM dual UNION ALL
SELECT '3',1,'2017-01-30','120','50' FROM dual UNION ALL
SELECT '3',2,'2017-01-30','-120','-50' FROM dual UNION ALL
SELECT '3',3,'2017-01-30','100','50' FROM dual UNION ALL
SELECT '3',4,'2017-01-30','50','0' FROM dual UNION ALL
SELECT '3',5,'2017-01-30','60','40' FROM dual
)
SELECT DISTINCT id,
some_date,
LAST_VALUE(ending_balance) OVER (PARTITION BY id ORDER BY trans_seq_num RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) amount_due
FROM sample
ORDER BY 1,2,3;
ID SOME_DATE AMOUNT_DUE
----- --------------- ---------------
1 2017-01-02 0
2 2017-01-15 35
3 2017-01-30 40
The others have already said it: you should have some way of numbering the rows; a simple sequence would do the job. With such a unique column the solution is trivial - we only need to find the last row for each id.
But you have no ordering. Here is my attempt, which looks OK so far and may help temporarily:
with q as (
select table_a.*,
row_number() over (partition by id, date_, total_amount, adj_amount
order by null) rn
from table_a),
t as (
select a.*,
row_number() over (partition by id, date_, total_amount
order by null) r1,
row_number() over (partition by id, date_, adj_amount
order by null) r2
from q a
where not exists (
select 1 from q b
where a.id = b.id and a.date_ = b.date_ and a.rn = b.rn
and a.total_amount = -b.total_amount and a.adj_amount = -b.adj_amount))
select id, date_, max(adj_amount) due
from t
where connect_by_isleaf = 1
connect by prior id = id and prior date_ = date_
and prior adj_amount = total_amount and prior r2 = r1
group by id, date_;
dbfiddle
First I eliminate the mistakes. Subquery t does this; it is a simple NOT EXISTS with an added ROW_NUMBER to properly handle repeated cases (like (120, 50) => (-120, -50) and then (120, 50) again).
With the data cleaned up, we can recursively find connected rows where the previous adj_amount equals the current total_amount. We have to use row numbers again to handle identical rows, such as (60, 40) => (40, 0) => (60, 40) again.
Then only the leaves are taken, and finally the max value of those leaves, which should contain the orphaned non-zero value (if one exists) for each id. You can add SYS_CONNECT_BY_PATH() to the select list to see whether the connection works properly.
Hierarchical queries are slower than most alternatives, so be warned if your table is big; filter the data first if needed.
This query works for your examples and for some others which I imagined and tested. But even if it works, you should add an ordering column (if possible) so you have a guaranteed, simple way to obtain correct results (see the sketch below).
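If you can add such an ordering column (say a seq_id populated from a sequence - a hypothetical name), the whole problem collapses to "take the last row per id". A minimal sketch under that assumption:
select id, date_, adj_amount as due_amount
from (select a.*,
             -- seq_id is the hypothetical insertion-order column mentioned above
             row_number() over (partition by id order by seq_id desc) rn
      from table_a a)
where rn = 1;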

Max Grade_points for Repeated Course by Student

I am working on a student enrollment database project. Students enroll in courses and receive grades. Sometimes a student repeats a course and gets a better grade. I need to calculate the sum of grade points and credits using only the best grade, so for each student who repeats a course in a different semester I have to determine the highest grade. emplid identifies the student, course_id identifies a course, credit_hr is the credit hours of the course, grade_pt is the numeric value of the letter grade, and term is the semester session.
Here is an example of what I am trying to accomplish.
emplid  course_id  credit_hr  grade_pt  term
0001    6001               3       4    Fall15
0001    6002               3       3.5  Fall15
0001    6003               3       2    Fall15
0001    6004               4       2.5  Sp16
0001    6002               3       3.0  Sp16
total (required)          13      12
Sample code from OP's comment:
SELECT a.emplid
,a.subject
,a.CATALOG_NBR
,a.strm
,a.CRSE_GRADE_OFF
,a.REPEAT_CANDIDATE
,a.unt_taken AS cr
,a.CRSE_ID
,MAX(a.grade_points)
OVER (PARTITION BY A.emplid ,crse_id)
FROM ps_CLASS_TBL_SE_VW a
WHERE emplid LIKE '06381313011%'
Here is one way to do this.
The problem is complicated, because - it seems - you want to show ALL the input rows in the result set, but then the various aggregates should consider only some rows.
The first part is easy: the subquery orders the rows for each combination of EMPLID and COURSE_ID in decreasing order of grade received and assigns them a row number (within the group) based on that order.
The outer query does the aggregation. I use GROUP BY ROLLUP, which allows a lot of flexibility. When the "rollup" is in fact each row by itself I show the actual credit hours and grade points for that row, but in the aggregates I sum over something else: namely, over the credit hours (and the credit hours multiplied by grade points) when the row number is 1, and NULL otherwise (which is treated as if it didn't exist when computing SUM()).
I created a second employee, who got the same score twice for the same course (so I can check that my solution doesn't give wrong answers in a case like that). I simulate a separate table of credit hours for each course_id, and the join needed to bring that into the result. Also, I do not simply sum grade points, since that is not how grade point calculations are done; rather, I use the correct calculation, where each grade point is multiplied by the course hours, and then these products (only for the highest grade for each course, separately for each EMPLID) are added together.
with
grades ( emplid, course_id, grade_pt, term ) as (
select '0001', '6001', 4 , 'Fall15' from dual union all
select '0001', '6002', 3.5, 'Fall15' from dual union all
select '0001', '6003', 2 , 'Fall15' from dual union all
select '0001', '6004', 2.5, 'Sp16' from dual union all
select '0001', '6002', 3 , 'Sp16' from dual union all
select '0003', '6002', 3.5, 'Sp16' from dual union all
select '0003', '6003', 2.5, 'Fall16' from dual union all
select '0003', '6003', 2.5, 'Sp15' from dual
),
credits ( course_id, credit_hr ) as (
select '6001', 3 from dual union all
select '6002', 3 from dual union all
select '6003', 3 from dual union all
select '6004', 4 from dual
)
-- End of simulated inputs (for testing purposes only, not part of the solution).
-- SQL query begins BELOW THIS LINE.
select emplid, course_id,
case when grouping(term) = 0 then credit_hr
else sum(case when rn = 1 then credit_hr end) end as credit_hr,
case when grouping(term) = 0 then grade_pt
else sum(case when rn = 1 then credit_hr * grade_pt end)
end as total_grade_pt,
term
from ( select g.emplid, g.course_id, c.credit_hr, g.grade_pt, g.term,
row_number() over (partition by g.emplid, g.course_id
order by g.grade_pt desc) as rn
from grades g join credits c on g.course_id = c.course_id
)
group by rollup(emplid, course_id, credit_hr, grade_pt, term)
having grouping(term) = 0 or (grouping(course_id) = 1 and grouping(emplid) = 0)
;
Output:
EMPLID COURSE_ID  CREDIT_HR TOTAL_GRADE_PT TERM
------ --------- ---------- -------------- ------
0001   6001               3              4 Fall15
0001   6002               3              3 Sp16
0001   6002               3            3.5 Fall15
0001   6003               3              2 Fall15
0001   6004               4            2.5 Sp16
0001                      13           38.5
0003   6002               3            3.5 Sp16
0003   6003               3            2.5 Sp15
0003   6003               3            2.5 Fall16
0003                       6             18
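As a quick check on the aggregate rows: for EMPLID 0001 the best attempts are 6001 (3 cr, grade 4), 6002 (3 cr, grade 3.5), 6003 (3 cr, grade 2) and 6004 (4 cr, grade 2.5), so CREDIT_HR = 3 + 3 + 3 + 4 = 13 and TOTAL_GRADE_PT = 3*4 + 3*3.5 + 3*2 + 4*2.5 = 38.5. For EMPLID 0003 only one of the two identical 6003 attempts is counted, giving 3 + 3 = 6 and 3*3.5 + 3*2.5 = 18.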
-- Simpler alternative if you only need the totals per student:
-- keep the best attempt per (emplid, course_id), then aggregate.
select emplid, sum(credit_hr), sum(grade_pt)
from (
  select emplid, course_id, credit_hr, grade_pt, term,
         row_number() over (partition by emplid, course_id
                            order by grade_pt desc) rn
  from your_table
)
where rn = 1
group by emplid;

Oracle SQL Trending MTD Data

I am trying to solve a trending problem at work very similar to the below example. I think I have a method but don't know how to do it in SQL.
The input data is:
MTD LOC_ID RAINED
1-Apr-16 1 Y
1-Apr-16 2 N
1-May-16 1 N
1-May-16 2 N
1-Jun-16 1 N
1-Jun-16 2 N
1-Jul-16 1 Y
1-Jul-16 2 N
1-Aug-16 1 N
1-Aug-16 2 Y
The desired output is:
MTD LOC_ID RAINED TRENDS
1-Apr-16 1 Y New
1-May-16 1 N No Rain
1-Jun-16 1 N No Rain
1-Jul-16 1 Y Carryover
1-Aug-16 1 N No Rain
1-Apr-16 2 N No Rain
1-May-16 2 N No Rain
1-Jun-16 2 N No Rain
1-Jul-16 2 N No Rain
1-Aug-16 2 Y New
I'm trying to produce the output from the input by trending on MTD without depending on its specific values. This way, when new months are added to the input, the output changes without editing the query.
The logic for TRENDS will occur on each unique LOC_ID. Trends will have three values: "New" in the first month RAINED is "Y", "Carryover" in any following months where RAINED is "Y", and "No Rain" in any months where RAINED is "N".
I'd like to automate this problem by introducing an intermediate step with a listagg. For example, for LOC_ID = "1":
MTD LOC_ID RAINED PREV_RAINED
1-Apr-16 1 Y (null) / 0 / (I don't care)
1-May-16 1 N Y
1-Jun-16 1 N Y;N
1-Jul-16 1 Y Y;N;N
1-Aug-16 1 N Y;N;N;Y
This way, to produce "TRENDS" in the output, I can say:
case when RAINED = 'Y' then
case when not regexp_like(PREV_RAINED, 'Y', 'i') then
'New'
else
'Carryover'
end
else
'No Rain'
end as TRENDS
My problem is that I'm not sure how to produce PREV_RAINED for each unique LOC_ID. I have a feeling it needs to combine LAG() statements and partition by LOC_ID order by MTD, but the number of lags I need to do depends on each month.
Is there an easy way to produce PREV_RAINED or a simpler way to solve my overall problem while preserving automation each month?
Thanks for reading all of this! :)
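One way to get the PREV_RAINED idea without chaining LAG() calls is a cumulative window aggregate over all preceding months. A rough, untested sketch using the question's column names (and assuming the data sits in a table or view called input_data):
select mtd, loc_id, rained,
       -- 'Y' if it rained in any earlier month for this loc_id, otherwise NULL
       max(case when rained = 'Y' then 'Y' end)
         over (partition by loc_id
               order by mtd
               rows between unbounded preceding and 1 preceding) as rained_before
from input_data;
-- TRENDS then becomes:
--   case when rained = 'N'        then 'No Rain'
--        when rained_before = 'Y' then 'Carryover'
--        else 'New' end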
In the SQL below there are two parts:
(i) calculate a ROW_NUMBER value for the rained attribute at the (loc_id, rained) level;
(ii) get the count over the same (loc_id, rained) partition.
With these two values we can write the CASE WHEN logic to calculate the trends based on your requirement.
SELECT mtd,
loc_id,
rained,
CASE WHEN rained = 'N' THEN 'No Rain'
WHEN rained = 'Y' AND rn = 1 THEN 'New'
ELSE 'Carry Over'
END AS Trends
FROM
(
SELECT mtd,
loc_id,
rained,
ROW_NUMBER() OVER ( PARTITION BY loc_id,rained ORDER BY mtd ) AS rn,
COUNT(*) OVER ( PARTITION BY loc_id,rained ) AS count_locid_rained
FROM INPUT
ORDER BY loc_id,mtd,rained,rn
) X;
Here is a solution for older versions. The WITH clause is for input data; the solution starts right after the WITH clause.
I'll work on a MATCH_RECOGNIZE solution next; I may add it to this answer.
with
input_data ( mtd, loc_id, rained ) as (
select to_date('1-Apr-16', 'dd-Mon-rr'), 1, 'Y' from dual union all
select to_date('1-Apr-16', 'dd-Mon-rr'), 2, 'N' from dual union all
select to_date('1-May-16', 'dd-Mon-rr'), 1, 'N' from dual union all
select to_date('1-May-16', 'dd-Mon-rr'), 2, 'N' from dual union all
select to_date('1-Jun-16', 'dd-Mon-rr'), 1, 'N' from dual union all
select to_date('1-Jun-16', 'dd-Mon-rr'), 2, 'N' from dual union all
select to_date('1-Jul-16', 'dd-Mon-rr'), 1, 'Y' from dual union all
select to_date('1-Jul-16', 'dd-Mon-rr'), 2, 'N' from dual union all
select to_date('1-Aug-16', 'dd-Mon-rr'), 1, 'N' from dual union all
select to_date('1-Aug-16', 'dd-Mon-rr'), 2, 'Y' from dual
)
select mtd, loc_id, rained,
case rained when 'N' then 'No Rain'
else case when rn = 1 then 'New'
else 'Carryover' end
end as trends
from ( select mtd, loc_id, rained,
row_number() over (partition by loc_id, rained order by mtd) rn
from input_data
)
order by loc_id, mtd
;
Output
MTD LOC_ID RAINED TRENDS
------------------- ---------- ------ ---------
01/04/2016 00:00:00 1 Y New
01/05/2016 00:00:00 1 N No Rain
01/06/2016 00:00:00 1 N No Rain
01/07/2016 00:00:00 1 Y Carryover
01/08/2016 00:00:00 1 N No Rain
01/04/2016 00:00:00 2 N No Rain
01/05/2016 00:00:00 2 N No Rain
01/06/2016 00:00:00 2 N No Rain
01/07/2016 00:00:00 2 N No Rain
01/08/2016 00:00:00 2 Y New
10 rows selected
Solution using MATCH_RECOGNIZE (for Oracle 12c only). Test the different solutions on your dataset; I am told that MATCH_RECOGNIZE may be significantly faster than other solutions, but this depends on many factors.
select loc_id, mtd, rained, trends
from input_data
match_recognize (
partition by loc_id, rained
order by mtd
measures mtd as mtd,
case when rained = 'N' then 'No Rain'
else case when match_number() = 1 then 'New' else 'Carryover' end
end as trends
pattern (a)
define a as 0 = 0
)
order by loc_id, mtd;

Oracle SQL sum up values till another value is reached

I hope I can describe my challenge in an understandable way.
I have two tables in an Oracle Database 12c which look like this:
Table name "Invoices"
I_ID | invoice_number | creation_date | i_amount
------------------------------------------------------
1 | 10000000000 | 01.02.2016 00:00:00 | 30
2 | 10000000001 | 01.03.2016 00:00:00 | 25
3 | 10000000002 | 01.04.2016 00:00:00 | 13
4 | 10000000003 | 01.05.2016 00:00:00 | 18
5 | 10000000004 | 01.06.2016 00:00:00 | 12
Table name "payments"
P_ID | reference | received_date | p_amount
------------------------------------------------------
1 | PAYMENT01 | 12.02.2016 13:14:12 | 12
2 | PAYMENT02 | 12.02.2016 15:24:21 | 28
3 | PAYMENT03 | 08.03.2016 23:12:00 | 2
4 | PAYMENT04 | 23.03.2016 12:32:13 | 30
5 | PAYMENT05 | 12.06.2016 00:00:00 | 15
So I want to have a select statement (maybe with Oracle analytic functions, but I am not really familiar with them) where the payments are summed up until the amount of an invoice is reached, ordered by date. If the sum of, for example, two payments is more than the invoice amount, the rest of the last payment should be used for the next invoice.
In this example the result should be like this:
invoice_number | reference | used_pay_amount | open_inv_amount
----------------------------------------------------------
10000000000 | PAYMENT01 | 12 | 18
10000000000 | PAYMENT02 | 18 | 0
10000000001 | PAYMENT02 | 10 | 15
10000000001 | PAYMENT03 | 2 | 13
10000000001 | PAYMENT04 | 13 | 0
10000000002 | PAYMENT04 | 13 | 0
10000000003 | PAYMENT04 | 4 | 14
10000000003 | PAYMENT05 | 14 | 0
10000000004 | PAYMENT05 | 1 | 11
It would be nice if there were a solution with a "simple" select statement.
Thanks in advance for your time.
Oracle Setup:
CREATE TABLE invoices ( i_id, invoice_number, creation_date, i_amount ) AS
SELECT 1, 100000000, DATE '2016-01-01', 30 FROM DUAL UNION ALL
SELECT 2, 100000001, DATE '2016-02-01', 25 FROM DUAL UNION ALL
SELECT 3, 100000002, DATE '2016-03-01', 13 FROM DUAL UNION ALL
SELECT 4, 100000003, DATE '2016-04-01', 18 FROM DUAL UNION ALL
SELECT 5, 100000004, DATE '2016-05-01', 12 FROM DUAL;
CREATE TABLE payments ( p_id, reference, received_date, p_amount ) AS
SELECT 1, 'PAYMENT01', DATE '2016-01-12', 12 FROM DUAL UNION ALL
SELECT 2, 'PAYMENT02', DATE '2016-01-13', 28 FROM DUAL UNION ALL
SELECT 3, 'PAYMENT03', DATE '2016-02-08', 2 FROM DUAL UNION ALL
SELECT 4, 'PAYMENT04', DATE '2016-02-23', 30 FROM DUAL UNION ALL
SELECT 5, 'PAYMENT05', DATE '2016-05-12', 15 FROM DUAL;
Query:
WITH total_invoices ( i_id, invoice_number, creation_date, i_amount, i_total ) AS (
SELECT i.*,
SUM( i_amount ) OVER ( ORDER BY creation_date, i_id )
FROM invoices i
),
total_payments ( p_id, reference, received_date, p_amount, p_total ) AS (
SELECT p.*,
SUM( p_amount ) OVER ( ORDER BY received_date, p_id )
FROM payments p
)
SELECT invoice_number,
reference,
LEAST( p_total, i_total )
- GREATEST( p_total - p_amount, i_total - i_amount ) AS used_pay_amount,
GREATEST( i_total - p_total, 0 ) AS open_inv_amount
FROM total_invoices
INNER JOIN
total_payments
ON ( i_total - i_amount < p_total
AND i_total > p_total - p_amount );
Explanation:
The two sub-query factoring (WITH ... AS ()) clauses just add an extra virtual column to the invoices and payments tables with the cumulative sum of the invoice/payment amount.
You can associate a range with each invoice (or payment) as the cumulative amount owing (paid) before the invoice (payment) was placed and the cumulative amount owing (paid) after. The two tables can then be joined where there is an overlap of these ranges.
The open_inv_amount is the positive difference between the cumulative amount invoiced and the cumulative amount paid.
The used_pay_amount is slightly more complicated: it is the difference between the lower of the current cumulative invoice and payment totals and the higher of the previous cumulative invoice and payment totals.
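For example, with the sample data above the cumulative invoice totals are 30, 55, 68, 86, 98 and the cumulative payment totals are 12, 40, 42, 72, 87. Invoice 100000001 occupies the range (30, 55] and PAYMENT02 the range (12, 40]; the ranges overlap, so the join pairs them and used_pay_amount = LEAST(40, 55) - GREATEST(12, 30) = 10 while open_inv_amount = GREATEST(55 - 40, 0) = 15, matching that row of the output below.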
Output:
INVOICE_NUMBER REFERENCE USED_PAY_AMOUNT OPEN_INV_AMOUNT
-------------- --------- --------------- ---------------
100000000 PAYMENT01 12 18
100000000 PAYMENT02 18 0
100000001 PAYMENT02 10 15
100000001 PAYMENT03 2 13
100000001 PAYMENT04 13 0
100000002 PAYMENT04 13 0
100000003 PAYMENT04 4 14
100000003 PAYMENT05 14 0
100000004 PAYMENT05 1 11
Update:
Based on mathguy's method of using UNION to join the data, I came up with a different solution re-using some of my code.
WITH combined ( invoice_number, reference, i_amt, i_total, p_amt, p_total, total ) AS (
SELECT invoice_number,
NULL,
i_amount,
SUM( i_amount ) OVER ( ORDER BY creation_date, i_id ),
NULL,
NULL,
SUM( i_amount ) OVER ( ORDER BY creation_date, i_id )
FROM invoices
UNION ALL
SELECT NULL,
reference,
NULL,
NULL,
p_amount,
SUM( p_amount ) OVER ( ORDER BY received_date, p_id ),
SUM( p_amount ) OVER ( ORDER BY received_date, p_id )
FROM payments
ORDER BY 7,
2 NULLS LAST,
1 NULLS LAST
),
filled ( invoice_number, reference, i_prev, i_total, p_prev, p_total ) AS (
SELECT FIRST_VALUE( invoice_number ) IGNORE NULLS OVER ( ORDER BY ROWNUM ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING ),
FIRST_VALUE( reference ) IGNORE NULLS OVER ( ORDER BY ROWNUM ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING ),
FIRST_VALUE( i_total - i_amt ) IGNORE NULLS OVER ( ORDER BY ROWNUM ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING ),
FIRST_VALUE( i_total ) IGNORE NULLS OVER ( ORDER BY ROWNUM ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING ),
FIRST_VALUE( p_total - p_amt ) IGNORE NULLS OVER ( ORDER BY ROWNUM ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING ),
COALESCE(
p_total,
LEAD( p_total ) IGNORE NULLS OVER ( ORDER BY ROWNUM ),
LAG( p_total ) IGNORE NULLS OVER ( ORDER BY ROWNUM )
)
FROM combined
),
vals ( invoice_number, reference, upa, oia, prev_invoice ) AS (
SELECT invoice_number,
reference,
COALESCE( LEAST( p_total, i_total ) - GREATEST( p_prev, i_prev ), 0 ),
GREATEST( i_total - p_total, 0 ),
LAG( invoice_number ) OVER ( ORDER BY ROWNUM )
FROM filled
)
SELECT invoice_number,
reference,
upa AS used_pay_amount,
oia AS open_inv_amount
FROM vals
WHERE upa > 0
OR ( reference IS NULL AND invoice_number <> prev_invoice AND oia > 0 );
Explanation:
The combined sub-query factoring clause joins the two tables with a UNION ALL and generates the cumulative totals for the amounts invoiced and paid. The final thing it does is order the rows by their ascending cumulative total (and if there are ties it will put the payments, in order created, before the invoices).
The filled sub-query factoring clause will fill the previously generated table so that if a value is null then it will take the value from the next non-null row (and if there is an invoice with no payments then it will find the total of the previous payments from the preceding rows).
The vals sub-query factoring clause applies the same calculations as my previous query (see above). It also adds the prev_invoice column to help identify invoices which are entirely unpaid.
The final SELECT takes the values and filters out the unnecessary rows.
Here is a solution that doesn't require a join. This is important if the amount of data is significant. I did some testing on my laptop (nothing commercial), using the free edition (XE) of Oracle 11.2. Using MT0's solution, the query with the join takes about 11 seconds if there are 10k invoices and 10k payments. For 50k invoices and 50k payments, the query took 287 seconds (almost 5 minutes). This is understandable, since joining two 50k tables requires 2.5 billion comparisons.
The alternative below uses a union. It uses lag() and last_value() to do the work the join does in the other solution. This union-based solution, with 50k invoices and 50k payments, took less than 0.5 seconds on my laptop (!)
I simplified the setup a bit; i_id, invoice_number and creation_date are all used for one purpose only: to order the invoice amounts. I use just an inv_id (invoice id) for that purpose, and similarly for payments.
For testing purposes, I created tables invoices and payments like so:
create table invoices (inv_id, inv_amt) as
(select level, trunc(dbms_random.value(20, 80)) from dual connect by level <= 50000);
create table payments (pmt_id, pmt_amt) as
(select level, trunc(dbms_random.value(20, 80)) from dual connect by level <= 50000);
Then, to test the solutions, I use the queries to populate a CTAS, like this:
create table bal_of_pmts as
[select query, including the WITH clause but without the setup CTE's, comes here]
In my solution, I look to show the allocation of payments to one or more invoices, and the payment of invoices from one or more payments; the output discussed in the original post only covers half of this information, but for symmetry it makes more sense to me to show both halves. The output (for the same inputs as in the original post) looks like this, with my version of inv_id and pmt_id:
    INV_ID       PAID     UNPAID     PMT_ID       USED  AVAILABLE
---------- ---------- ---------- ---------- ---------- ----------
         1         12         18        101         12          0
         1         18          0        103         18         10
         2         10         15        103         10          0
         2          2         13        105          2          0
         2         13          0        107         13         17
         3         13          0        107         13          4
         4          4         14        107          4          0
         4         14          0        109         14          1
         5          1         11        109          1          0
         5         11          0                    11
Notice how the left half is what the original post requested. There is an extra row at the end. Notice the NULL for the payment id, against an amount of 11 - that shows how much of the last invoice is left uncovered. If there were an invoice with id = 6, for an amount of, say, 22, then there would be one more row - showing the entire amount (22) of that invoice as "paid" from a payment with no id - meaning it is actually not covered (yet).
The query may be a little easier to understand than the join approach. To see what it does, it may help to look closely at intermediate results, especially the CTE c (in the WITH clause).
with invoices (inv_id, inv_amt) as (
select 1, 30 from dual union all
select 2, 25 from dual union all
select 3, 13 from dual union all
select 4, 18 from dual union all
select 5, 12 from dual
),
payments (pmt_id, pmt_amt) as (
select 101, 12 from dual union all
select 103, 28 from dual union all
select 105, 2 from dual union all
select 107, 30 from dual union all
select 109, 15 from dual
),
c (kind, inv_id, inv_cml, pmt_id, pmt_cml, cml_amt) as (
select 'i', inv_id, sum(inv_amt) over (order by inv_id), null, null,
sum(inv_amt) over (order by inv_id)
from invoices
union all
select 'p', null, null, pmt_id, sum(pmt_amt) over (order by pmt_id),
sum(pmt_amt) over (order by pmt_id)
from payments
),
d (inv_id, paid, unpaid, pmt_id, used, available) as (
select last_value(inv_id) ignore nulls over (order by cml_amt desc),
cml_amt - lead(cml_amt, 1, 0) over (order by cml_amt desc),
case kind when 'i' then 0
else last_value(inv_cml) ignore nulls
over (order by cml_amt desc) - cml_amt end,
last_value(pmt_id) ignore nulls over (order by cml_amt desc),
cml_amt - lead(cml_amt, 1, 0) over (order by cml_amt desc),
case kind when 'p' then 0
else last_value(pmt_cml) ignore nulls
over (order by cml_amt desc) - cml_amt end
from c
)
select inv_id, paid, unpaid, pmt_id, used, available
from d
where paid != 0
order by inv_id, pmt_id
;
In most cases, CTE d is all we need. However, if the cumulative sum for several invoices is exactly equal to the cumulative sum for several payments, my query would add a row with paid = unpaid = 0. (MT0's join solution does not have this problem.) To cover all possible cases, and not have rows with no information, I had to add the filter for paid != 0.

Query the Minimum Value per day within a month's worth of data

I have two sets of pricing data (A and B). Set A consists of all of my pricing data per order over a month. Set B consists of all of my competitor's pricing data over the same month. I want to compare my competitor's lowest price to each of my prices per day.
Graphically, the data appears like this:
Date   Set A   Set B
1         25      31
1         54      47
1         23      56
1         12      23
1         76      40
1         42
I want to pass only the lowest price to a case statement which evaluates which prices are better. I would like to process an entire month's worth of data all at once, so in my example, Dates 1 thru 30(1) would be included and crunched at once, and for each day there would only be one value from Set B included: the lowest price in the set.
Important note: Set B does not have a data point for each point in Set A.
Hopefully this makes sense. Thanks in advance for any help you may be able to render.
That's a strange example you have - do you really have prices ranging from 12 to 76 within a single day?
Anyway, left joining your (grouped) data with their (grouped) data should work (untested):
with
my_min_prices as (
  select price_date, min(price_value) min_price from my_prices group by price_date),
their_min_prices as (
  select price_date, min(price_value) min_price from their_prices group by price_date)
select
  mine.price_date,
  (case
     when theirs.min_price is null then mine.min_price
     when theirs.min_price >= mine.min_price then mine.min_price
     else theirs.min_price
   end) min_price
from
  my_min_prices mine
  left join their_min_prices theirs on mine.price_date = theirs.price_date
I'm still not sure that I understand your requirements. My best guess is that you want something like
with your_data as (
  select 1 date_id, 25 price_a, 31 price_b from dual union all
  select 1, 54, 47 from dual union all
  select 1, 23, 56 from dual union all
  select 1, 12, 23 from dual union all
  select 1, 76, 40 from dual union all
  select 1, 42, null from dual)
select date_id,
       sum( case when price_a < min_price_b then 1 else 0 end) better,
       sum( case when price_a = min_price_b then 1 else 0 end) tie,
       sum( case when price_a > min_price_b then 1 else 0 end) worse
from ( select date_id,
              price_a,
              min(price_b) over (partition by date_id) min_price_b
       from your_data )
group by date_id;
DATE_ID BETTER TIE WORSE
---------- ---------- ---------- ----------
1 1 1 4