How to shift a year-week field in bigquery - sql

This question is about shifting values of a year-week field in bigquery.
run_id year_week value
0001 201451 13
0001 201452 6
0001 201503 3
0003 201351 8
0003 201352 5
0003 201403 1
Here for each year the week can range from 01 to 53. For example year 2014 has last week which is 201452 but year 2015 has last week which is 201553.
Now I want to shift the values for each year_week in each run_id by 5 weeks. For the weeks there is no value it is assumed that they have a value of 0. For example the output from the example table above should look like this:
run_id year_week value
0001 201504 13
0001 201505 6
0001 201506 0
0001 201507 0
0001 201508 3
0003 201404 8
0003 201405 5
0003 201406 0
0003 201407 0
0003 201408 1
Explanation of the output: In the table above for run_id 0001 the year_week 201504 has a value of 13 because in the input table we had a value of 13 for year_week 201451 which is 5 weeks before 201504.
I could create a table programmatically by creating a mapping from a year_week to a shifted year_week and then doing a join to get the output, but I was wondering if there is any other way to do it by just using sql.

#standardSQL
WITH `project.dataset.table` AS (
SELECT '001' run_id, 201451 year_week, 13 value UNION ALL
SELECT '001', 201452, 6 UNION ALL
SELECT '001', 201503, 3
), weeks AS (
SELECT 100 * year + week year_week
FROM UNNEST([2013, 2014, 2015, 2016, 2017]) year,
UNNEST(GENERATE_ARRAY(1, IF(EXTRACT(ISOWEEK FROM DATE(1+year,1,1)) = 1, 52, 53))) week
), temp AS (
SELECT i.run_id, w.year_week, d.year_week week2, value
FROM weeks w
CROSS JOIN (SELECT DISTINCT run_id FROM `project.dataset.table`) i
LEFT JOIN `project.dataset.table` d
USING(year_week, run_id)
)
SELECT * FROM (
SELECT run_id, year_week,
SUM(value) OVER(win) value
FROM temp
WINDOW win AS (
PARTITION BY run_id ORDER BY year_week ROWS BETWEEN 5 PRECEDING AND 5 PRECEDING
)
)
WHERE NOT value IS NULL
ORDER BY run_id, year_week
with result as
Row run_id year_week value
1 001 201504 13
2 001 201505 6
3 001 201508 3
if you need to "preserve" zero rows - just change below portion
SELECT i.run_id, w.year_week, d.year_week week2, value
FROM weeks w
to
SELECT i.run_id, w.year_week, d.year_week week2, IFNULL(value, 0) value
FROM weeks w
or
SUM(value) OVER(win) value
FROM temp
to
SUM(IFNULL(value, 0)) OVER(win) value
FROM temp

If you have data in the table for all year-weeks, then you can do:
with yw as (
select year_week, row_number() over (order by year_week) as seqnum
from t
group by year_week
)
select t.*, yw5, year_week as new_year_week
from t join
yw
on t.year_week = yw.year_week left join
yw yw5
on yw5.seqnum = yw.seqnum + 5;
If you don't have a table of year weeks, then I would advise you to create such a table, so you can do such manipulations -- or a more general calendar table.

Related

How can I select records from the last value accumulated

I have the next data: TABLE_A
RegisteredDate
Quantity
2022-03-01 13:00
100
2022-03-01 13:10
20
2022-03-01 13:20
-80
2022-03-01 13:30
-40
2022-03-02 09:00
10
2022-03-02 22:00
-5
2022-03-03 02:00
-5
2022-03-03 03:00
25
2022-03-03 03:20
-10
If I add cumulative column
select RegisteredDate, Quantity
, sum(Quantity) over ( order by RegisteredDate ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) as Summary
from TABLE_A
RegisteredDate
Quantity
Summary
2022-03-01 13:00
100
100
2022-03-01 13:10
20
120
2022-03-01 13:20
-80
40
2022-03-01 13:30
-40
0
2022-03-02 09:00
10
10
2022-03-02 22:00
-5
5
2022-03-03 02:00
-5
0
2022-03-03 03:00
25
25
2022-03-03 03:20
-10
15
Is there a way to get the following result with a query?
RegisteredDate
Quantity
Summary
2022-03-03 03:00
25
25
2022-03-03 03:20
-10
15
This result is the last records after the last zero.
EDIT:
Really for the solution to this problem I need the: 2022-03-03 03:00 is the first date of the last records after the last zero.
You can try to use SUM aggregate window function to calculation grp column which part represent to last value accumulated.
Query 1:
WITH cte AS
(
SELECT RegisteredDate,
Quantity,
sum(Quantity) over (order by RegisteredDate ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) as Summary
FROM TABLE_A
), cte2 AS (
SELECT *,
SUM(CASE WHEN Summary = 0 THEN 1 ELSE 0 END) OVER(order by RegisteredDate desc) grp
FROM cte
)
SELECT RegisteredDate,
Quantity
FROM cte2
WHERE grp = 0
ORDER BY RegisteredDate
Results:
| RegisteredDate | Quantity |
|----------------------|----------|
| 2022-03-03T03:00:00Z | 25 |
| 2022-03-03T03:20:00Z | -10 |
Use a CTE that returns the summary column and NOT EXISTS to filter out the rows that you don't need:
WITH cte AS (SELECT *, SUM(Quantity) OVER (ORDER BY RegisteredDate) Summary FROM TABLE_A)
SELECT c1.*
FROM cte c1
WHERE NOT EXISTS (
SELECT 1
FROM cte c2 WHERE c2.RegisteredDate >= c1.RegisteredDate AND c2.Summary = 0
)
ORDER BY c1.RegisteredDate;
There is no need for ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW in the OVER clause of the window function, because this is the default behavior.
See the demo.
Try this:
with u as
(select RegisteredDate,
Quantity,
sum(Quantity) over (order by RegisteredDate ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) as Summary
from TABLE_A)
select * from u
where RegisteredDate >= all(select RegisteredDate from u where Summary = 0)
and Summary <> 0;
Fiddle
Basically what you want is for RegisteredDate to be >= all RegisteredDatess where Summary = 0, and you want Summary <> 0.
When using window functions, it is necessary to take into account that RegisteredDate column is not unique in TABLE_A, so ordering only by RegisteredDate column is not enough to get a stable result on the same dataset.
With A As (
Select ROW_NUMBER() Over (Order by RegisteredDate, Quantity) As ID, RegisteredDate, Quantity
From TABLE_A),
B As (
Select A.*, SUM(Quantity) Over (Order by ID) As Summary
From A)
Select Top 1 *
From B
Where ID > (Select MAX(ID) From B Where Summary=0)
ID
RegisteredDate
Quantity
Summary
8
2022-03-03 03:00
25
25

Sliding window aggregate for year-week in bigquery

My question is about sliding window sum up in bigquery.
I have a table like the following
run_id year_week value
001 201451 5
001 201452 8
001 201501 1
001 201505 5
003 201352 8
003 201401 1
003 201405 5
Here for each year the week can range from 01 to 53. For example year 2014 has last week which is 201452 but year 2015 has last week which is 201553. If it makes life easier I only have 5 years, 2013, 2014, 2015, 2016 and 2017 and only year 2015 has weeks those go upto 53.
Now for each run I am trying to get a sliding window sum of the values. Each year_week would assume the sum of the values next 5 year_week (including itself) for the current run_id (e.g. 001). For example the following could be a an output from the current table
run_id year_week aggregate_sum
001 201451 5+8+1+0+0
001 201452 8+1+0+0+0
001 201501 1+0+0+0+5
001 201502 0+0+0+5+0
001 201503 0+0+5+0+0
001 201504 0+5+0+0+0
001 201505 5+0+0+0+0
003 201352 8+1+0+0+0
003 201401 1+0+0+0+5
003 201402 0+0+0+5+0
003 201403 0+0+5+0+0
003 201404 0+5+0+0+0
003 201405 5+0+0+0+0
To explain what is happening, the next 5 weeks for 201451 including itself would be 201451,201452,201501,201502,201503 . If there is a value for those weeks in the table for current run_id we just sum them up which would be, 5+8+1+0+0, because the corresponding value for a year_week is 0 if it is not in the table.
Is it possible to do it using sliding window operation in bigquery?
Below is for BigQuery Standard SQL
#standardSQL
WITH weeks AS (
SELECT 100* year + week year_week
FROM UNNEST([2013, 2014, 2015, 2016, 2017]) year,
UNNEST(GENERATE_ARRAY(1, IF(EXTRACT(ISOWEEK FROM DATE(1+year,1,1)) = 1, 52, 53))) week
), temp AS (
SELECT i.run_id, w.year_week, d.year_week week2, value
FROM weeks w
CROSS JOIN (SELECT DISTINCT run_id FROM `project.dataset.table`) i
LEFT JOIN `project.dataset.table` d
USING(year_week, run_id)
)
SELECT * FROM (
SELECT run_id, year_week,
SUM(value) OVER(win) aggregate_sum
FROM temp
WINDOW win AS (
PARTITION BY run_id ORDER BY year_week ROWS BETWEEN CURRENT row AND 4 FOLLOWING
)
)
WHERE NOT aggregate_sum IS NULL
You can test / play with above using dummy data from your question as below
#standardSQL
WITH `project.dataset.table` AS (
SELECT '001' run_id, 201451 year_week, 5 value UNION ALL
SELECT '001', 201452, 8 UNION ALL
SELECT '001', 201501, 1 UNION ALL
SELECT '001', 201505, 5
), weeks AS (
SELECT 100* year + week year_week
FROM UNNEST([2013, 2014, 2015, 2016, 2017]) year,
UNNEST(GENERATE_ARRAY(1, IF(EXTRACT(ISOWEEK FROM DATE(1+year,1,1)) = 1, 52, 53))) week
), temp AS (
SELECT i.run_id, w.year_week, d.year_week week2, value
FROM weeks w
CROSS JOIN (SELECT DISTINCT run_id FROM `project.dataset.table`) i
LEFT JOIN `project.dataset.table` d
USING(year_week, run_id)
)
SELECT * FROM (
SELECT run_id, year_week,
SUM(value) OVER(win) aggregate_sum
FROM temp
WINDOW win AS (
PARTITION BY run_id ORDER BY year_week ROWS BETWEEN CURRENT row AND 4 FOLLOWING
)
)
WHERE NOT aggregate_sum IS NULL
-- ORDER BY run_id, year_week
with result as
Row run_id year_week aggregate_sum
1 001 201447 5
2 001 201448 13
3 001 201449 14
4 001 201450 14
5 001 201451 14
6 001 201452 9
7 001 201501 6
8 001 201502 5
9 001 201503 5
10 001 201504 5
11 001 201505 5
12 003 201348 8
13 003 201349 9
14 003 201350 9
15 003 201351 9
16 003 201352 9
17 003 201401 6
18 003 201402 5
19 003 201403 5
20 003 201404 5
21 003 201405 5
note; this is for - I only have 5 years, 2013, 2014, 2015, 2016 and 2017 but can easily be extended in weeks CTE

Querying average and rolling 12 month average

I want to be able to find out the average per month and rolling average over the last 12 months of a count for the number of changes per customer.
SELECT
crq_requested_by_company as 'Customer',
COUNT(crq_number) as 'Number of Changes'
FROM
change_information ci1
GROUP BY
crq_requested_by_company
At the moment I am just doing the count of the total and my results look like this
crq_requested_by_company count
A 4
B 2
C 2269
D 7696
E 110
F 91
G 33
The date column I will be using is called 'start_date'.
I assume GETDATE() will be needed to work out the rolling average for the last 12 months.
Additional info after comments:
Using the code
;WITH CTE as
(
SELECT
crq_requested_by_company as Customer,
COUNT(crq_number) Nuc,
dateadd(month, datediff(month, 0, crq_start_date),0) m
FROM
change_information ci1
WHERE
crq_start_date >= dateadd(month,datediff(month, 0,getdate()) - 12,0)
GROUP BY
crq_requested_by_company,
datediff(month, 0, crq_start_date)
)
SELECT
Customer,
avg(Nuc) over (partition by Customer order by m) running_avg,
m start_month,
avg(Nuc) over (partition by Customer) simply_average
FROM
CTE
ORDER BY Customer, start_month
This gives the results
Customer running_avg start_month simply_average
A 8 01/01/2016 00:00 13
A 10 01/02/2016 00:00 13
A 10 01/03/2016 00:00 13
A 11 01/04/2016 00:00 13
A 14 01/05/2016 00:00 13
A 13 01/06/2016 00:00 13
B 1 01/01/2016 00:00 1
C 3 01/01/2016 00:00 2
C 3 01/02/2016 00:00 2
C 2 01/03/2016 00:00 2
C 2 01/04/2016 00:00 2
C 2 01/05/2016 00:00 2
C 2 01/06/2016 00:00 2
It needs to look like this so the average of the results above - the average of the 6 months above (I only currently have 6 months of data and needs to be 12 eventually)
Customer avg_of_running_avg
A 11
B 1
C 2
Try this, it should work for sqlserver 2012 using running average:
;WITH CTE as
(
SELECT
crq_requested_by_company as Customer,
COUNT(crq_number) Nuc,
dateadd(month, datediff(month, 0, start_date),0) m
FROM
change_information ci1
WHERE
start_date >= dateadd(month,datediff(month, 0,getdate()) - 12,0)
GROUP BY
crq_requested_by_company,
datediff(month, 0, start_date)
)
SELECT
Customer,
avg(Nuc) over (partition by Customer order by m) running_avg,
m start_month,
avg(Nuc) over (partition by Customer) simply_average
FROM
CTE
ORDER BY Customer, start_month

Oracle SQL sum up values till another value is reached

I hope I can describe my challenge in an understandable way.
I have two tables on a Oracle Database 12c which look like this:
Table name "Invoices"
I_ID | invoice_number | creation_date | i_amount
------------------------------------------------------
1 | 10000000000 | 01.02.2016 00:00:00 | 30
2 | 10000000001 | 01.03.2016 00:00:00 | 25
3 | 10000000002 | 01.04.2016 00:00:00 | 13
4 | 10000000003 | 01.05.2016 00:00:00 | 18
5 | 10000000004 | 01.06.2016 00:00:00 | 12
Table name "payments"
P_ID | reference | received_date | p_amount
------------------------------------------------------
1 | PAYMENT01 | 12.02.2016 13:14:12 | 12
2 | PAYMENT02 | 12.02.2016 15:24:21 | 28
3 | PAYMENT03 | 08.03.2016 23:12:00 | 2
4 | PAYMENT04 | 23.03.2016 12:32:13 | 30
5 | PAYMENT05 | 12.06.2016 00:00:00 | 15
So I want to have a select statement (maybe with oracle analytic functions but I am not really familiar with it) where the payments are getting summed up till the amount of an invoice is reached, ordered by dates. If the sum of for example two payments is more than the invoice amount the rest of the last payment amount should be used for the next invoice.
In this example the result should be like this:
invoice_number | reference | used_pay_amount | open_inv_amount
----------------------------------------------------------
10000000000 | PAYMENT01 | 12 | 18
10000000000 | PAYMENT02 | 18 | 0
10000000001 | PAYMENT02 | 10 | 15
10000000001 | PAYMENT03 | 2 | 13
10000000001 | PAYMENT04 | 13 | 0
10000000002 | PAYMENT04 | 13 | 0
10000000003 | PAYMENT04 | 4 | 14
10000000003 | PAYMENT05 | 14 | 0
10000000004 | PAYMENT05 | 1 | 11
It would be nice if there is a solution with a "simple" select statement.
thx in advance for your time ...
Oracle Setup:
CREATE TABLE invoices ( i_id, invoice_number, creation_date, i_amount ) AS
SELECT 1, 100000000, DATE '2016-01-01', 30 FROM DUAL UNION ALL
SELECT 2, 100000001, DATE '2016-02-01', 25 FROM DUAL UNION ALL
SELECT 3, 100000002, DATE '2016-03-01', 13 FROM DUAL UNION ALL
SELECT 4, 100000003, DATE '2016-04-01', 18 FROM DUAL UNION ALL
SELECT 5, 100000004, DATE '2016-05-01', 12 FROM DUAL;
CREATE TABLE payments ( p_id, reference, received_date, p_amount ) AS
SELECT 1, 'PAYMENT01', DATE '2016-01-12', 12 FROM DUAL UNION ALL
SELECT 2, 'PAYMENT02', DATE '2016-01-13', 28 FROM DUAL UNION ALL
SELECT 3, 'PAYMENT03', DATE '2016-02-08', 2 FROM DUAL UNION ALL
SELECT 4, 'PAYMENT04', DATE '2016-02-23', 30 FROM DUAL UNION ALL
SELECT 5, 'PAYMENT05', DATE '2016-05-12', 15 FROM DUAL;
Query:
WITH total_invoices ( i_id, invoice_number, creation_date, i_amount, i_total ) AS (
SELECT i.*,
SUM( i_amount ) OVER ( ORDER BY creation_date, i_id )
FROM invoices i
),
total_payments ( p_id, reference, received_date, p_amount, p_total ) AS (
SELECT p.*,
SUM( p_amount ) OVER ( ORDER BY received_date, p_id )
FROM payments p
)
SELECT invoice_number,
reference,
LEAST( p_total, i_total )
- GREATEST( p_total - p_amount, i_total - i_amount ) AS used_pay_amount,
GREATEST( i_total - p_total, 0 ) AS open_inv_amount
FROM total_invoices
INNER JOIN
total_payments
ON ( i_total - i_amount < p_total
AND i_total > p_total - p_amount );
Explanation:
The two sub-query factoring (WITH ... AS ()) clauses just add an extra virtual column to the invoices and payments tables with the cumulative sum of the invoice/payment amount.
You can associate a range with each invoice (or payment) as the cumulative amount owing (paid) before the invoice (payment) was placed and the cumulative amount owing (paid) after. The two tables can then be joined where there is an overlap of these ranges.
The open_inv_amount is the positive difference between the cumulative amount invoiced and the cumulative amount paid.
The used_pay_amount is slightly more complicated but you need to find the difference between the lower of the current cumulative invoice and payment totals and the higher of the previous cumulative invoice and payment totals.
Output:
INVOICE_NUMBER REFERENCE USED_PAY_AMOUNT OPEN_INV_AMOUNT
-------------- --------- --------------- ---------------
100000000 PAYMENT01 12 18
100000000 PAYMENT02 18 0
100000001 PAYMENT02 10 15
100000001 PAYMENT03 2 13
100000001 PAYMENT04 13 0
100000002 PAYMENT04 13 0
100000003 PAYMENT04 4 14
100000003 PAYMENT05 14 0
100000004 PAYMENT05 1 11
Update:
Based on mathguy's method of using UNION to join the data, I came up with a different solution re-using some of my code.
WITH combined ( invoice_number, reference, i_amt, i_total, p_amt, p_total, total ) AS (
SELECT invoice_number,
NULL,
i_amount,
SUM( i_amount ) OVER ( ORDER BY creation_date, i_id ),
NULL,
NULL,
SUM( i_amount ) OVER ( ORDER BY creation_date, i_id )
FROM invoices
UNION ALL
SELECT NULL,
reference,
NULL,
NULL,
p_amount,
SUM( p_amount ) OVER ( ORDER BY received_date, p_id ),
SUM( p_amount ) OVER ( ORDER BY received_date, p_id )
FROM payments
ORDER BY 7,
2 NULLS LAST,
1 NULLS LAST
),
filled ( invoice_number, reference, i_prev, i_total, p_prev, p_total ) AS (
SELECT FIRST_VALUE( invoice_number ) IGNORE NULLS OVER ( ORDER BY ROWNUM ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING ),
FIRST_VALUE( reference ) IGNORE NULLS OVER ( ORDER BY ROWNUM ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING ),
FIRST_VALUE( i_total - i_amt ) IGNORE NULLS OVER ( ORDER BY ROWNUM ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING ),
FIRST_VALUE( i_total ) IGNORE NULLS OVER ( ORDER BY ROWNUM ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING ),
FIRST_VALUE( p_total - p_amt ) IGNORE NULLS OVER ( ORDER BY ROWNUM ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING ),
COALESCE(
p_total,
LEAD( p_total ) IGNORE NULLS OVER ( ORDER BY ROWNUM ),
LAG( p_total ) IGNORE NULLS OVER ( ORDER BY ROWNUM )
)
FROM combined
),
vals ( invoice_number, reference, upa, oia, prev_invoice ) AS (
SELECT invoice_number,
reference,
COALESCE( LEAST( p_total - i_total ) - GREATEST( p_prev, i_prev ), 0 ),
GREATEST( i_total - p_total, 0 ),
LAG( invoice_number ) OVER ( ORDER BY ROWNUM )
FROM filled
)
SELECT invoice_number,
reference,
upa AS used_pay_amount,
oia AS open_inv_amount
FROM vals
WHERE upa > 0
OR ( reference IS NULL AND invoice_number <> prev_invoice AND oia > 0 );
Explanation:
The combined sub-query factoring clause joins the two tables with a UNION ALL and generates the cumulative totals for the amounts invoiced and paid. The final thing it does is order the rows by their ascending cumulative total (and if there are ties it will put the payments, in order created, before the invoices).
The filled sub-query factoring clause will fill the previously generated table so that if a value is null then it will take the value from the next non-null row (and if there is an invoice with no payments then it will find the total of the previous payments from the preceding rows).
The vals sub-query factoring clause applies the same calculations as my previous query (see above). It also adds the prev_invoice column to help identify invoices which are entirely unpaid.
The final SELECT takes the values and filters out the unnecessary rows.
Here is a solution that doesn't require a join. This is important if the amount of data is significant. I did some testing on my laptop (nothing commercial), using the free edition (XE) of Oracle 11.2. Using MT0's solution, the query with the join takes about 11 seconds if there are 10k invoices and 10k payments. For 50k invoices and 50k payments, the query took 287 seconds (almost 5 minutes). This is understandable, since joining two 50k tables requires 2.5 billion comparisons.
The alternative below uses a union. It uses lag() and last_value() to do the work the join does in the other solution. This union-based solution, with 50k invoices and 50k payments, took less than 0.5 seconds on my laptop (!)
I simplified the setup a bit; i_id, invoice_number and creation_date are all used for one purpose only: to order the invoice amounts. I use just an inv_id (invoice id) for that purpose, and similar for payments..
For testing purposes, I created tables invoices and payments like so:
create table invoices (inv_id, inv_amt) as
(select level, trunc(dbms_random.value(20, 80)) from dual connect by level <= 50000);
create table payments (pmt_id, pmt_amt) as
(select level, trunc(dbms_random.value(20, 80)) from dual connect by level <= 50000);
Then, to test the solutions, I use the queries to populate a CTAS, like this:
create table bal_of_pmts as
[select query, including the WITH clause but without the setup CTE's, comes here]
In my solution, I look to show the allocation of payments to one or more invoice, and the payment of invoices from one or more payments; the output discussed in the original post only covers half of this information, but for symmetry it makes more sense to me to show both halves. The output (for the same inputs as in the original post) looks like this, with my version of inv_id and pmt_id:
INV_ID PAID UNPAID PMT_ID USED AVAILABLE
---------- ---------- ---------- ---------- ---------- ----------
1 12 18 101 12 0
1 18 0 103 18 10
2 10 15 103 10 0
2 2 13 105 2 0
2 13 0 107 13 17
3 13 0 107 13 4
4 4 14 107 4 0
4 14 0 109 14 1
5 1 11 109 1 0
5 11 0 11
Notice how the left half is what the original post requested. There is an extra row at the end. Notice the NULL for payment id, for a payment of 11 - that shows how much of the last payment is left uncovered. If there was an invoice with id = 6, for an amount of, say, 22, then there would be one more row - showing the entire amount (22) of that invoice as "paid" from a payment with no id - meaning actually not covered (yet).
The query may be a little easier to understand than the join approach. To see what it does, it may help to look closely at intermediate results, especially the CTE c (in the WITH clause).
with invoices (inv_id, inv_amt) as (
select 1, 30 from dual union all
select 2, 25 from dual union all
select 3, 13 from dual union all
select 4, 18 from dual union all
select 5, 12 from dual
),
payments (pmt_id, pmt_amt) as (
select 101, 12 from dual union all
select 103, 28 from dual union all
select 105, 2 from dual union all
select 107, 30 from dual union all
select 109, 15 from dual
),
c (kind, inv_id, inv_cml, pmt_id, pmt_cml, cml_amt) as (
select 'i', inv_id, sum(inv_amt) over (order by inv_id), null, null,
sum(inv_amt) over (order by inv_id)
from invoices
union all
select 'p', null, null, pmt_id, sum(pmt_amt) over (order by pmt_id),
sum(pmt_amt) over (order by pmt_id)
from payments
),
d (inv_id, paid, unpaid, pmt_id, used, available) as (
select last_value(inv_id) ignore nulls over (order by cml_amt desc),
cml_amt - lead(cml_amt, 1, 0) over (order by cml_amt desc),
case kind when 'i' then 0
else last_value(inv_cml) ignore nulls
over (order by cml_amt desc) - cml_amt end,
last_value(pmt_id) ignore nulls over (order by cml_amt desc),
cml_amt - lead(cml_amt, 1, 0) over (order by cml_amt desc),
case kind when 'p' then 0
else last_value(pmt_cml) ignore nulls
over (order by cml_amt desc) - cml_amt end
from c
)
select inv_id, paid, unpaid, pmt_id, used, available
from d
where paid != 0
order by inv_id, pmt_id
;
In most cases, CTE d is all we need. However, if the cumulative sum for several invoices is exactly equal to the cumulative sum for several payments, my query would add a row with paid = unpaid = 0. (MT0's join solution does not have this problem.) To cover all possible cases, and not have rows with no information, I had to add the filter for paid != 0.

SQL Ranking Dates to Get Year Over Year Order

I have a list of dates in a YYYYMM format and I am trying to rank them in a Year over Year format that would look like the following:
MonthDisplay YearMonth Rank MonthNumber YearNumber
Aug-2013 201308 1 8 2013
Aug-2012 201208 2 8 2012
Jul-2013 201307 3 7 2013
Jul-2012 201207 4 7 2012
I have been able to get it close by using the following Rank and get the results below:
RANK() OVER(PARTITION BY 1 ORDER BY MonthNumber DESC, YearNumber DESC)
Month YearMonth Rank
Dec-2012 201212 1
Dec-2011 201112 2
Nov-2012 201211 114
Nov-2011 201111 115
Oct-2012 201210 227
Oct-2011 201110 228
However, this starts with Dec-2012 instead of the Aug-2013 (current month). I can't figure out how to get it to start with the current month. I am sure it something super easy and I am just missing it. Thanks!
select
T.YearMonth,
rank() over (order by R.rnk asc, D.YearNumber desc) as [Rank],
D.MonthNumber, D.YearNumber
from Table1 as T
outer apply (
select
month(getdate()) as CurMonthNumber,
cast(right(T.YearMonth, 2) as int) as MonthNumber,
cast(left(T.YearMonth, 4) as int) as YearNumber
) as D
outer apply (
select
case
when D.MonthNumber <= D.CurMonthNumber then
D.CurMonthNumber - D.MonthNumber
else
12 + D.CurMonthNumber - D.MonthNumber
end as rnk
) as R
sql fiddle example