Oracle SQL recursive adding values

I have the following data in the table
Period Total_amount R_total
01/01/20 2 2
01/02/20 5 null
01/03/20 3 null
01/04/20 8 null
01/05/20 31 null
Based on the above data, I would like to get the following result:
Period Total_amount R_total
01/01/20 2 2
01/02/20 5 3
01/03/20 3 0
01/04/20 8 8
01/05/20 31 23
Additional data:
01/06/20 21 0 (without the rule below it would be -2)
01/07/20 25 25
01/08/20 29 4
The additional data follows one more rule:
if total_amount < previous(R_total) then R_total = 0
From the filled-in data, the general pattern is:
R_total = total_amount - previous(R_total)
Could you please help me out with this issue?

As Gordon Linoff suspected, it is possible to solve this problem with analytic functions. The benefit is that the query will likely be much faster. The price to pay for that benefit is that you need to do a bit of math beforehand (before ever thinking about "programming" and "computers").
A bit of elementary arithmetic shows that R_TOTAL is an alternating sum of TOTAL_AMOUNT. This can be arranged easily by using ROW_NUMBER() (to get the signs) and then an analytic SUM(), as shown below.
Table setup:
create table sample_data (period, total_amount) as
select to_date('01/01/20', 'mm/dd/rr'), 2 from dual union all
select to_date('01/02/20', 'mm/dd/rr'), 5 from dual union all
select to_date('01/03/20', 'mm/dd/rr'), 3 from dual union all
select to_date('01/04/20', 'mm/dd/rr'), 8 from dual union all
select to_date('01/05/20', 'mm/dd/rr'), 31 from dual
;
Query and result:
with
prep (period, total_amount, sgn) as (
select period, total_amount,
case mod(row_number() over (order by period), 2) when 0 then 1 else -1 end
from sample_data
)
select period, total_amount,
sgn * sum(sgn * total_amount) over (order by period) as r_total
from prep
;
PERIOD TOTAL_AMOUNT R_TOTAL
-------- ------------ ----------
01/01/20 2 2
01/02/20 5 3
01/03/20 3 0
01/04/20 8 8
01/05/20 31 23
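The arithmetic behind the alternating sum can be checked outside the database. The following is a minimal Python sketch, not part of the original answer; the function names `r_total_recursive` and `r_total_alternating` are made up for illustration. It compares the direct recurrence from the question with the signed running sum used in the query, on the five sample rows.

```python
from itertools import accumulate

def r_total_recursive(amounts):
    """R_total[n] = total_amount[n] - R_total[n-1]; the first row keeps its own amount."""
    out = []
    for a in amounts:
        out.append(a - out[-1] if out else a)
    return out

def r_total_alternating(amounts):
    """Same values via a signed running sum, mirroring the SQL query."""
    # Row numbers start at 1: odd rows get -1, even rows +1, as in the CASE expression.
    signs = [1 if n % 2 == 0 else -1 for n in range(1, len(amounts) + 1)]
    running = accumulate(s * a for s, a in zip(signs, amounts))
    return [s * r for s, r in zip(signs, running)]

amounts = [2, 5, 3, 8, 31]
print(r_total_recursive(amounts))
print(r_total_alternating(amounts))
```

Unrolling the recurrence gives R_n = a_n - a_(n-1) + a_(n-2) - ..., which is why the two functions agree. The identity covers only the plain recurrence; it does not account for the question's extra rule that a negative result becomes 0.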

This may be possible with window functions, but the simplest method is probably a recursive CTE:
with t as (
select t.*, row_number() over (order by period) as seqnum
from yourtable t
),
cte(period, total_amount, r_amount, seqnum) as (
select period, total_amount, r_total, seqnum -- the table column is r_total; the CTE column list names it r_amount
from t
where seqnum = 1
union all
select t.period, t.total_amount, t.total_amount - cte.r_amount, t.seqnum
from cte join
t
on t.seqnum = cte.seqnum + 1
)
select *
from cte;
This question explicitly talks about "recursively" adding values. If you want to solve this using another mechanism, you might explain the logic in detail and ask if there is a non-recursive CTE solution.
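The question's additional rule (a result below zero becomes 0) makes the calculation depend on the previous output row, which is what the recursive approach handles naturally. Below is a small Python sketch of that recurrence, given only as an illustration; `r_total_clamped` is a hypothetical name, and the input is the question's five rows plus its three additional rows.

```python
def r_total_clamped(amounts):
    """R_total[n] = max(total_amount[n] - R_total[n-1], 0), first row kept as is."""
    out = []
    for a in amounts:
        prev = out[-1] if out else 0  # no previous row for the first period
        out.append(max(a - prev, 0))  # a negative difference is replaced by 0
    return out

amounts = [2, 5, 3, 8, 31, 21, 25, 29]
print(r_total_clamped(amounts))
```

In the recursive CTE, the same clamp could presumably be written as GREATEST(t.total_amount - cte.r_amount, 0) in the recursive member; that is an assumption about the intended rule, not something the original answer states.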

Related

take sum of last 7 days from the observed date in BigQuery

I have a table on which I want to compute the sum of revenue on last 7 days from the observed day. Here is my table -
with temp as
(
select DATE('2019-06-29') as transaction_date, "x" as id, 0 as revenue
union all
select DATE('2019-06-30') as transaction_date, "x" as id, 80 as revenue
union all
select DATE('2019-07-04') as transaction_date, "x" as id, 64 as revenue
union all
select DATE('2019-07-06') as transaction_date, "x" as id, 64 as revenue
union all
select DATE('2019-07-11') as transaction_date, "x" as id, 75 as revenue
union all
select DATE('2019-07-12') as transaction_date, "x" as id, 0 as revenue
)
select * from temp
I want the sum of the last 7 days for each transaction_date. For instance, for the last record, which has transaction_date = 2019-07-12, I would like to add another column that adds up the revenue for the last 7 days from 2019-07-12 (that is, back to 2019-07-05), so the value of the new rollup_revenue column would be 0 + 75 + 64 = 139. Likewise, I need to compute the rollup for all the dates for every ID.
Note - the ID may or may not appear daily.
I have tried self join but I am unable to figure it out.
Below is for BigQuery Standard SQL
#standardSQL
SELECT *,
SUM(revenue) OVER(
PARTITION BY id ORDER BY UNIX_DATE(transaction_date)
RANGE BETWEEN 6 PRECEDING AND CURRENT ROW
) rollup_revenue
FROM `project.dataset.temp`
You can test and play with the above using the sample data from your question, as in the example below:
#standardSQL
WITH `project.dataset.temp` AS (
SELECT DATE '2019-06-29' AS transaction_date, 'x' AS id, 0 AS revenue UNION ALL
SELECT '2019-06-30', 'x', 80 UNION ALL
SELECT '2019-07-04', 'x', 64 UNION ALL
SELECT '2019-07-06', 'x', 64 UNION ALL
SELECT '2019-07-11', 'x', 75 UNION ALL
SELECT '2019-07-12', 'x', 0
)
SELECT *,
SUM(revenue) OVER(
PARTITION BY id ORDER BY UNIX_DATE(transaction_date)
RANGE BETWEEN 6 PRECEDING AND CURRENT ROW
) rollup_revenue
FROM `project.dataset.temp`
-- ORDER BY transaction_date
with the result:
Row transaction_date id revenue rollup_revenue
1 2019-06-29 x 0 0
2 2019-06-30 x 80 80
3 2019-07-04 x 64 144
4 2019-07-06 x 64 208
5 2019-07-11 x 75 139
6 2019-07-12 x 0 139
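To see what the RANGE BETWEEN 6 PRECEDING AND CURRENT ROW frame computes, here is a plain-Python sketch of the same rollup over the sample rows. It is an illustration only; `rollup_revenue` and its `days` parameter are hypothetical names, and the window is taken to be the current date plus the 6 preceding calendar dates, per id.

```python
from datetime import date, timedelta

rows = [
    (date(2019, 6, 29), "x", 0),
    (date(2019, 6, 30), "x", 80),
    (date(2019, 7, 4), "x", 64),
    (date(2019, 7, 6), "x", 64),
    (date(2019, 7, 11), "x", 75),
    (date(2019, 7, 12), "x", 0),
]

def rollup_revenue(rows, days=7):
    """For each row, sum revenue of the same id over the last `days` calendar dates."""
    result = []
    for d, id_, _ in rows:
        start = d - timedelta(days=days - 1)  # 6 days back for a 7-date window
        total = sum(rev for d2, id2, rev in rows
                    if id2 == id_ and start <= d2 <= d)
        result.append((d, id_, total))
    return result

for d, id_, total in rollup_revenue(rows):
    print(d, id_, total)
```

Because the window is defined on date values rather than on row counts, missing days simply contribute nothing, which is the reason the SQL orders by UNIX_DATE(transaction_date) and uses RANGE instead of ROWS.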
One option uses a correlated subquery to find the rolling sum:
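SELECT
transaction_date,
id,
revenue,
(SELECT SUM(t2.revenue) FROM temp t2
WHERE t2.id = t1.id -- roll up per id, as the question asks
-- BETWEEN is inclusive: INTERVAL 7 DAY reaches back to 2019-07-05 for 2019-07-12,
-- as in the question; use INTERVAL 6 DAY for a window of exactly 7 dates
AND t2.transaction_date
BETWEEN DATE_SUB(t1.transaction_date, INTERVAL 7 DAY) AND
t1.transaction_date) AS rev_7_days
FROM temp t1
ORDER BY
transaction_date;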

Percentile for Year-to-Day (successive YtD)

I have the following data:
ID |MPERIOD|FRDATE |FR
===+=======+==========+==
100|2017M01|01.01.2017|60 \ \ \
101|2017M01|02.01.2017|75 > YtD 2017M01 | |
103|2017M01|08.01.2017|48 / > Ytd 2017M02 |
104|2017M02|06.02.2017|55 | > YtD 2017M03
105|2017M02|15.02.2017|63 / |
106|2017M03|18.03.2017|41 |
107|2017M03|22.03.2017|71 /
...|.......|..........|..
I need to calculate the 80th percentile for each month and for the YtD period up to that month (from the start of the year up to the month being calculated).
I use the following SQL query:
SELECT DISTINCT mperiod,
ROUND(PERCENTILE_CONT(0.8) WITHIN GROUP (ORDER BY fr) OVER (PARTITION BY mperiod),2) "80%_FR",
ROUND(PERCENTILE_CONT(0.8) WITHIN GROUP (ORDER BY fr) OVER (PARTITION BY SUBSTR(mperiod,1,4)),2) "80%_FR_YtD"
FROM mytable
ORDER BY 1
If I run this query on the last day of a month, when there is no data for the following month yet, it calculates the YtD value correctly. For example, if I have data for the first six months but none for the seventh, the year partition OVER (PARTITION BY SUBSTR(mperiod,1,4)) gives the correct YtD value for the sixth month. But if data exists after that month, it is included in the partition as well, so the result no longer stops at that month.
How can I calculate YtD retroactively for previous months? For example, the YtD calculation for the third month should include only the first three months of the year, not all months in the year.
Since you can't use a windowing clause or add in additional order by columns in PERCENTILE_CONT (boo!), here's one way of achieving your aims. N.B. it's not pretty, and I'm sure it won't be terrifically performant, but it should work at least!
WITH mytable AS (SELECT 100 ID, '2017M01' mperiod, to_date('01/01/2017', 'dd/mm/yyyy') frdate, 60 fr FROM dual UNION ALL
SELECT 101 ID, '2017M01' mperiod, to_date('02/01/2017', 'dd/mm/yyyy') frdate, 75 fr FROM dual UNION ALL
SELECT 103 ID, '2017M01' mperiod, to_date('08/01/2017', 'dd/mm/yyyy') frdate, 48 fr FROM dual UNION ALL
SELECT 104 ID, '2017M02' mperiod, to_date('06/02/2017', 'dd/mm/yyyy') frdate, 55 fr FROM dual UNION ALL
SELECT 105 ID, '2017M02' mperiod, to_date('15/02/2017', 'dd/mm/yyyy') frdate, 63 fr FROM dual UNION ALL
SELECT 106 ID, '2017M03' mperiod, to_date('18/03/2017', 'dd/mm/yyyy') frdate, 41 fr FROM dual UNION ALL
SELECT 107 ID, '2017M03' mperiod, to_date('22/03/2017', 'dd/mm/yyyy') frdate, 71 fr FROM dual UNION ALL
SELECT 108 ID, '2016M12' mperiod, to_date('22/12/2016', 'dd/mm/yyyy') frdate, 42 fr FROM dual UNION ALL
SELECT 109 ID, '2016M11' mperiod, to_date('22/11/2016', 'dd/mm/yyyy') frdate, 32 fr FROM dual),
unpckd AS (SELECT mt.ID,
mt.mperiod,
mt.frdate,
mt.fr,
CASE WHEN substr(mt.mperiod, -2) <= d.id THEN SUBSTR(mt.mperiod, 1, 5) || to_char(d.id, 'fm09')
END new_mperiod,
d.id dummy_id
FROM mytable mt
INNER JOIN (SELECT LEVEL ID
FROM dual
CONNECT BY LEVEL <= 12) d ON substr(mt.mperiod, -2) <= d.id),
res AS (SELECT mperiod,
new_mperiod,
ROUND(PERCENTILE_CONT(0.8) WITHIN GROUP (ORDER BY fr) OVER (PARTITION BY CASE WHEN mperiod = new_mperiod THEN mperiod END),2) fr_80,
ROUND(PERCENTILE_CONT(0.8) WITHIN GROUP (ORDER BY fr) OVER (PARTITION BY new_mperiod),2) fr_80_ytd
FROM unpckd)
SELECT DISTINCT new_mperiod mperiod,
fr_80 "80%_FR",
fr_80_ytd "80%_FR_YtD"
FROM res
WHERE new_mperiod = mperiod
ORDER BY 1;
MPERIOD 80%_FR 80%_FR_YtD
-------- ---------- ----------
2016M11 32 32
2016M12 42 40
2017M01 69 69
2017M02 61.4 65.4
2017M03 65 69.4
This works by doing a partial cross join between the numbers 1 to 12 (the 12 months of the year) and the last two digits of mperiod. Once we have that, we know which YtD periods each row belongs to (i.e. number 1 matches the 2017M01 rows, 2 matches the 2017M01 and 2017M02 rows, etc.), so you can now produce a label for this calculated value (which I've called new_mperiod) and use that to partition against.
It's obviously going to be inefficient (the partial cross join generates more rows than necessary for a year that doesn't have data for all its months, and those get filtered out later), but I can't think of a better way of doing it.
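The figures in the result can be reproduced by hand with a few lines of Python. This is a sketch for checking the numbers, not a replacement for the query: `percentile_cont` below is a hypothetical helper that applies the linear interpolation PERCENTILE_CONT is documented to use (position p * (N - 1) in the zero-based sorted list), and `monthly_and_ytd` is an illustrative name. Only the 2017 rows from the question are used.

```python
import math
from collections import defaultdict

def percentile_cont(values, p):
    """Linear interpolation between the two sorted values around position p*(N-1)."""
    s = sorted(values)
    pos = p * (len(s) - 1)
    lo, hi = math.floor(pos), math.ceil(pos)
    return s[lo] + (pos - lo) * (s[hi] - s[lo])

data = [("2017M01", 60), ("2017M01", 75), ("2017M01", 48),
        ("2017M02", 55), ("2017M02", 63),
        ("2017M03", 41), ("2017M03", 71)]

def monthly_and_ytd(data, p=0.8):
    """Return {mperiod: (monthly percentile, year-to-date percentile)}."""
    by_month = defaultdict(list)
    for mperiod, fr in data:
        by_month[mperiod].append(fr)
    out = {}
    for mperiod in sorted(by_month):
        year = mperiod[:4]
        # YtD: every row of the same year up to and including this month
        ytd = [fr for m, fr in data if m[:4] == year and m <= mperiod]
        out[mperiod] = (round(percentile_cont(by_month[mperiod], p), 2),
                        round(percentile_cont(ytd, p), 2))
    return out

for mperiod, (monthly, ytd) in monthly_and_ytd(data).items():
    print(mperiod, monthly, ytd)
```

The YtD list for each month is rebuilt from scratch, which corresponds to the row duplication that the partial cross join performs in the SQL.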

How to make a time dependent distribution in SQL?

I have a SQL table in which I keep project information coming from Primavera.
Suppose that I have columns for Start Date, End Date, Duration, and Total Qty, as shown below.
How can I distribute Total Qty over the months using this information? What additional columns or SQL queries do I need in order to get a correct monthly distribution?
Thanks in advance.
Columns in order:
itemname,quantity,startdate,duration,enddate
item1 -- 108 -- 2013-03-25 -- 720 -- 2013-07-26
item2 -- 640 -- 2013-03-25 -- 720 -- 2013-07-26
.
.
I think the key is to break the records apart by month. Here is an example of how to do it:
with months as (
select 1 as mon union all select 2 union all select 3 union all
select 4 as mon union all select 5 union all select 6 union all
select 7 as mon union all select 8 union all select 9 union all
select 10 as mon union all select 11 union all select 12
)
select t.itemname, m.mon, t.quantity * 1.0 / t.nummonths as monthly_qty -- * 1.0 guards against integer division
from (select t.*, (month(enddate) - month(startdate) + 1) as nummonths
from t
) t join
months m
on month(t.startDate) <= m.mon and
month(t.endDate) >= m.mon;
This works because all the months are within the same year -- as in your example. You are quite vague on how the split should be calculated. So, I assumed that every month from the start to the end gets an equal amount.
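The equal-split assumption can be written out as a short Python sketch. It is only an illustration of that assumption; `distribute_evenly` is a hypothetical name, and the sample call uses item1 from the question.

```python
from datetime import date

def distribute_evenly(quantity, start, end):
    """Split quantity equally across every calendar month touched by [start, end]."""
    months = []
    y, m = start.year, start.month
    while (y, m) <= (end.year, end.month):
        months.append((y, m))
        y, m = (y + 1, 1) if m == 12 else (y, m + 1)  # roll over into the next year
    share = quantity / len(months)
    return {ym: share for ym in months}

item1 = distribute_evenly(108, date(2013, 3, 25), date(2013, 7, 26))
for (year, month), qty in item1.items():
    print(year, month, qty)
```

Unlike the query, this sketch walks (year, month) pairs, so it also covers ranges that cross a year boundary. A split weighted by the number of days in each month would need a different share calculation.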

SQL - Count number of changes in an ordered list

Say I've got a table with two columns (date and price). If I select over a range of dates, then is there a way to count the number of price changes over time?
For instance:
Date | Price
22-Oct-11 | 3.20
23-Oct-11 | 3.40
24-Oct-11 | 3.40
25-Oct-11 | 3.50
26-Oct-11 | 3.40
27-Oct-11 | 3.20
28-Oct-11 | 3.20
In this case, I would like it to return a count of 4 price changes.
Thanks in advance.
You can use the analytic functions LEAD and LAG to access the prior and next rows of a result set and then use that to see if there are changes.
with t as (
select date '2011-10-22' dt, 3.2 price from dual union all
select date '2011-10-23', 3.4 from dual union all
select date '2011-10-24', 3.4 from dual union all
select date '2011-10-25', 3.5 from dual union all
select date '2011-10-26', 3.4 from dual union all
select date '2011-10-27', 3.2 from dual union all
select date '2011-10-28', 3.2 from dual
)
select sum(is_change)
from (
select dt,
price,
lag(price) over (order by dt) prior_price,
(case when lag(price) over (order by dt) != price
then 1
else 0
end) is_change
from t);
SUM(IS_CHANGE)
--------------
4
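The LAG comparison amounts to comparing each price with the one before it in date order. A minimal Python sketch of the same idea, using the prices from the question (`count_changes` is an illustrative name):

```python
prices = [3.20, 3.40, 3.40, 3.50, 3.40, 3.20, 3.20]  # already ordered by date

def count_changes(values):
    """Count positions where a value differs from its predecessor."""
    # zip pairs each value with the next one; the first row has no predecessor,
    # just as LAG returns NULL for it and the CASE falls through to 0.
    return sum(1 for prev, cur in zip(values, values[1:]) if prev != cur)

print(count_changes(prices))
```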
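Try this
-- counts distinct (date, price) pairs in the range; with one row per date
-- this returns the number of rows (7 for the sample data), not the number of transitions
select count(*)
from
(select date,price from table where date between X and Y
group by date,price )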
Depending on the Oracle version, use either analytic functions (see the answer from Justin Cave) or this:
SELECT
SUM (CASE WHEN PREVPRICE != PRICE THEN 1 ELSE 0 END) CNTCHANGES
FROM
(
SELECT
C.DATE,
C.PRICE,
MAX ( D.PRICE ) PREVPRICE
FROM
(
SELECT
A.Date,
A.Price,
(SELECT MAX (B.DATE) FROM MyTable B WHERE B.DATE < A.DATE) PrevDate
FROM MyTable A
WHERE A.DATE BETWEEN YourStartDate AND YourEndDate
) C
INNER JOIN MyTable D ON D.DATE = C.PREVDATE
GROUP BY C.DATE, C.PRICE
)

Query the Minimum Value per day within a month's worth of data

I have two sets of pricing data (A and B). Set A consists of all of my pricing data per order over a month. Set B consists of all of my competitor's pricing data over the same month. I want to compare my competitor's lowest price to each of my prices per day.
Graphically, the data appears like this:
Date:-- Set A: -- Set B:
1---------25---------31
1---------54---------47
1---------23---------56
1---------12---------23
1---------76---------40
1---------42
I want to pass only the lowest price to a case statement which evaluates which prices are better. I would like to process an entire month's worth of data all at one time, so in my example, Dates 1 thru 30(1) would be included and crunched all at once, and for each day, there would only be one value from set B included: the lowest price in the set.
Important notes: Set B does not have a datapoint for each point in Set A
Hopefully this makes sense. Thanks in advance for any help you may be able to render.
That's a strange example you have - do you really have prices ranging from 12 to 76 within a single day?
Anyway, left joining your (grouped) data with their (grouped) data should work (untested):
with
my_min_prices as (
select price_date, min(price_value) min_price from my_prices group by price_date),
their_min_prices as (
select price_date, min(price_value) min_price from their_prices group by price_date)
select
mine.price_date,
(case
when theirs.min_price is null then mine.min_price
when theirs.min_price >= mine.min_price then mine.min_price
else theirs.min_price
end) min_price
from
my_min_prices mine
left join their_min_prices theirs on mine.price_date = theirs.price_date
I'm still not sure that I understand your requirements. My best guess is that you want something like
with your_data as (
select 1 date_id, 25 price_a,31 price_b from dual
union all
select 1, 54, 47 from dual union all
select 1, 23, 56 from dual union all
select 1, 12, 23 from dual union all
select 1, 76, 40 from dual union all
select 1, 42, null from dual)
select date_id,
sum( case when price_a < min_price_b
then 1
else 0
end) better,
sum( case when price_a = min_price_b
then 1
else 0
end) tie,
sum( case when price_a > min_price_b
then 1
else 0
end) worse
from( select date_id,
price_a,
min(price_b) over (partition by date_id) min_price_b
from your_data )
group by date_id;
DATE_ID BETTER TIE WORSE
---------- ---------- ---------- ----------
1 1 1 4
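The same tally for a single day can be sketched in Python, which makes it easy to see how the missing Set B value is handled. This is an illustration under the same guess about the requirements; `compare_to_lowest` is a hypothetical name, and None stands in for the missing data point.

```python
def compare_to_lowest(prices_a, prices_b):
    """Count Set A prices below, equal to, and above the lowest Set B price."""
    # Skip missing Set B values, as MIN() ignores NULLs in SQL.
    lowest_b = min(p for p in prices_b if p is not None)
    better = sum(1 for p in prices_a if p < lowest_b)
    tie = sum(1 for p in prices_a if p == lowest_b)
    worse = sum(1 for p in prices_a if p > lowest_b)
    return better, tie, worse

day1_a = [25, 54, 23, 12, 76, 42]
day1_b = [31, 47, 56, 23, 40, None]
print(compare_to_lowest(day1_a, day1_b))
```

For a whole month, the function would be called once per date after grouping both sets by day; a day with no Set B prices at all would need separate handling, since `min` raises an error on an empty sequence.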