Estimating a cumulative value every 3 months in SQL

I have a table like this:
ID Date Prod
1 1/1/2009 5
1 2/1/2009 5
1 3/1/2009 5
1 4/1/2009 5
1 5/1/2009 5
1 6/1/2009 5
1 7/1/2009 5
1 8/1/2009 5
1 9/1/2009 5
And I need to get the following result:
ID Date Prod CumProd
1 2009/03/01 5 15 ---Each 3 months
1 2009/06/01 5 30 ---Each 3 months
1 2009/09/01 5 45 ---Each 3 months
What could be the best approach to take in SQL?

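You can try the query below, which uses window functions. The row number is ordered by dateval descending so that the row kept for each quarter is its last month (March, June, September), which matches the expected output. It assumes a single ID and a single calendar year, as in the sample data:
select * from
(
  select *,
         sum(prod) over (order by DATEPART(qq, dateval)) as cum_sum,
         row_number() over (partition by DATEPART(qq, dateval) order by dateval desc) as rn
  from t
) A
where rn = 1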

How about just filtering on the month number?
select t.*
from (select id, date, prod, sum(prod) over (partition by id order by date) as running_prod
from t
) t
where month(date) in (3, 6, 9, 12);
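The month-filter approach can be checked end to end with a small script. This is a minimal sketch, assuming SQLite (via Python's sqlite3 module) rather than the asker's database, dates stored as ISO text, and strftime('%m', ...) standing in for month(); the table and column names follow the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, date TEXT, prod INTEGER)")
# Sample data from the question; ISO date strings are an assumption of this sketch.
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [(1, f"2009-{month:02d}-01", 5) for month in range(1, 10)],
)

query = """
SELECT id, date, prod, running_prod
FROM (
    -- running total per id, computed over every month
    SELECT id, date, prod,
           SUM(prod) OVER (PARTITION BY id ORDER BY date) AS running_prod
    FROM t
) AS x
-- keep only the quarter-end months after the total has been computed
WHERE CAST(strftime('%m', date) AS INTEGER) IN (3, 6, 9, 12)
ORDER BY id, date
"""
quarter_rows = conn.execute(query).fetchall()
for row in quarter_rows:
    print(row)
```

The filter has to sit outside the subquery: filtering first would drop the other months before the running sum sees them.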

Related

Select max of nested id from amazon redshift

My database is an amazon redshift.
I have a table that looks like this -
id nested_id date value
1 10 '2021-01-01' 5
1 20 '2021-01-01' 10
1 10 '2021-01-02' 6
1 20 '2021-01-02' 11
1 10 '2021-01-03' 7
1 20 '2021-01-03' 12
2 30 '2021-01-01' 5
2 40 '2021-01-01' 10
2 30 '2021-01-02' 6
2 40 '2021-01-02' 11
2 30 '2021-01-03' 7
2 40 '2021-01-03' 12
So this is basically a table that tracks values by id over time, except for every id there can be a nested_id. And the dates and values are primarily connected to the nested_id.
However, let's say I'm starting with the id field, but for each id I want to only return the points over time for the nested_id that has the greater sum of points.
So right now I'm just grabbing it like this...
select *
from mytable
where id in (1, 2)
except I only want it to return nested_id rows where the maximum value of that nested_id is the greatest.
So here's how I would do this manually.
For id of 1, the maximum value is 12, and the nested_id of that value is 20
For id of 2, the maximum value is 12, and the nested_id of that value is 40
So my return table should be
id nested_id date value
1 20 '2021-01-01' 10
1 20 '2021-01-02' 11
1 20 '2021-01-03' 12
2 40 '2021-01-01' 10
2 40 '2021-01-02' 11
2 40 '2021-01-03' 12
Is there an easy way of performing this query? I'm assuming you have to partition somehow?
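You can solve this with the row_number() window function. The row with rn = 1 identifies the nested_id holding the highest value for each id, and the join then pulls every row for that nested_id. An inner join is enough, because the filter on maxs.rn discards unmatched rows anyway:
with maxs as (
  select id,
         nested_id,
         value,
         row_number() over (partition by id order by value desc) rn
  from mytable
)
select mt.*
from mytable mt
join maxs on mt.id = maxs.id and mt.nested_id = maxs.nested_id
where maxs.rn = 1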
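To see the row_number() approach produce the requested table, here is a minimal sketch that loads the sample data into an in-memory SQLite database. SQLite stands in for Redshift here, which is an assumption; the query uses only standard window-function syntax. The final ORDER BY is added just to make the output deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable (id INTEGER, nested_id INTEGER, date TEXT, value INTEGER)"
)
# Sample rows from the question.
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?, ?)",
    [
        (1, 10, "2021-01-01", 5), (1, 20, "2021-01-01", 10),
        (1, 10, "2021-01-02", 6), (1, 20, "2021-01-02", 11),
        (1, 10, "2021-01-03", 7), (1, 20, "2021-01-03", 12),
        (2, 30, "2021-01-01", 5), (2, 40, "2021-01-01", 10),
        (2, 30, "2021-01-02", 6), (2, 40, "2021-01-02", 11),
        (2, 30, "2021-01-03", 7), (2, 40, "2021-01-03", 12),
    ],
)

query = """
WITH maxs AS (
    -- rn = 1 marks the single highest-value row within each id
    SELECT id, nested_id,
           ROW_NUMBER() OVER (PARTITION BY id ORDER BY value DESC) AS rn
    FROM mytable
)
SELECT mt.id, mt.nested_id, mt.date, mt.value
FROM mytable AS mt
JOIN maxs ON mt.id = maxs.id AND mt.nested_id = maxs.nested_id
WHERE maxs.rn = 1
ORDER BY mt.id, mt.date
"""
top_rows = conn.execute(query).fetchall()
for row in top_rows:
    print(row)
```

If two nested_ids tie for the highest value within an id, row_number() picks one of them arbitrarily; adding a second ORDER BY key inside the window would make that choice explicit.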

How to use 2 columns as "key" to get MAX value of selection (and on to next "key") in a SQL query

Using a SQL query, I am trying to get the max value from multiple rows, using 2 columns as the 'key', then sum those values and move on to the next 'key'.
Here is an example table. It has years, userid and points. Each year has several weeks.
What I want to do is take each user's MAX points for each year and SUM them.
year userid week points
2020 1 1 3
2020 1 3 3
2020 1 3 5
2020 1 4 12
2020 2 1 4
2020 2 2 4
2020 2 3 6
2020 2 4 10
2021 1 1 4
2021 1 2 5
2021 1 3 8
2021 1 4 9
2021 2 1 3
2021 2 2 6
2021 2 3 7
2021 2 4 13
I'd like the result for each year to be
User 1:
2020, 1, 12
2021, 1, 9
User 2:
2020, 2, 10
2021, 2, 13
...and after summing them, sorted by points:
userid points
2 23
1 21
...and so forth (adding on users and years)
Any help is very much appreciated.
Per Gordon's helpful answer this is the query:
SELECT username, userdb.userid, SUM(points) AS points
FROM (
  SELECT standing.*,
         row_number() OVER (PARTITION BY standing.userid, year ORDER BY points DESC) AS seqnum
  FROM standing
) t
JOIN userdb ON userdb.userid = t.userid
WHERE seqnum = 1
GROUP BY username, userdb.userid
ORDER BY points DESC
You can use two levels of aggregation:
select userid, sum(max_points)
from (select userid, year, max(points) as max_points
from t
group by userid, year
) uy
group by userid;
Alternatively, you can keep only each user's top row per year with a window function and then sum:
select userid, sum(points)
from (select t.*,
row_number() over (partition by userid, year order by points desc) as seqnum
from t
) t
where seqnum = 1
group by userid;
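As a quick check of the two-level aggregation against the sample data, here is a minimal sketch using an in-memory SQLite database (an assumption; the asker's engine is not stated). The total_points alias and the ORDER BY are additions to get the sorted result the question asks for.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (year INTEGER, userid INTEGER, week INTEGER, points INTEGER)"
)
# Sample rows from the question, copied as given.
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?)",
    [
        (2020, 1, 1, 3), (2020, 1, 3, 3), (2020, 1, 3, 5), (2020, 1, 4, 12),
        (2020, 2, 1, 4), (2020, 2, 2, 4), (2020, 2, 3, 6), (2020, 2, 4, 10),
        (2021, 1, 1, 4), (2021, 1, 2, 5), (2021, 1, 3, 8), (2021, 1, 4, 9),
        (2021, 2, 1, 3), (2021, 2, 2, 6), (2021, 2, 3, 7), (2021, 2, 4, 13),
    ],
)

query = """
SELECT userid, SUM(max_points) AS total_points
FROM (
    -- inner level: best score per (userid, year)
    SELECT userid, year, MAX(points) AS max_points
    FROM t
    GROUP BY userid, year
) AS uy
-- outer level: add up the yearly bests per user
GROUP BY userid
ORDER BY total_points DESC
"""
totals = conn.execute(query).fetchall()
for row in totals:
    print(row)
```

User 2 totals 10 + 13 = 23 and user 1 totals 12 + 9 = 21, so user 2 sorts first.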

Writing an Oracle stored procedure to get data between months

I have a procedure sp_data_between_months (p_from_date DATE, p_to_date DATE), for example with p_from_date = '01-jan-2021' and p_to_date = '31-mar-2021'.
For each ID, I need to get the latest record in each month, add these values, and populate the sum against p_to_date, from the table below, using PL/SQL.
Table Name: ID_Value
ID Date value
1 1-jan-2021 10
1 10-jan-2021 20
2 15-jan-2021 15
2 16-jan-2021 20
2 02-feb-2021 10
2 06-feb-2021 15
1 17-feb-2021 10
1 5-mar-2021 15
1 17-mar-2021 10
2 10-mar-2021 10
The expected output is the latest value for each ID in each month within the range, summed per ID and reported against p_to_date.
Output (p_to_date, ID, sum of the latest value in each month):
DATE ID VALUE
31-Mar-2021 1 40 //(20+10+10) sum of the latest record for each month
31-Mar-2021 2 45 //(20+15+10)
Here you are. Read comments within code.
with
temp as
  -- analytic function will return 1 for the latest row for that ID in that month
  (select id, datum, value,
          row_number() over (partition by id, trunc(datum, 'mm') order by datum desc) rn
   from id_value
  )
-- finally, select last day in MAX month and sum all values for RN = 1
select
  id,
  last_day(max(datum)) datum,
  sum(value)
from temp
where rn = 1
group by id;

ID DATUM SUM(VALUE)
---------- ----------- ----------
1 31-mar-2021 40
2 31-mar-2021 45
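The same latest-row-per-month logic can be tried outside Oracle. The following is a minimal sketch, not the PL/SQL procedure itself: it assumes SQLite, ISO text dates, and strftime('%Y-%m', datum) in place of trunc(datum, 'mm'). The BETWEEN filter and the :p_from_date / :p_to_date bind parameters are additions modeled on the procedure's arguments, so the result is reported against p_to_date as the question requests.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE id_value (id INTEGER, datum TEXT, value INTEGER)")
# Sample rows from the question, with dates rewritten as ISO strings.
conn.executemany(
    "INSERT INTO id_value VALUES (?, ?, ?)",
    [
        (1, "2021-01-01", 10), (1, "2021-01-10", 20),
        (2, "2021-01-15", 15), (2, "2021-01-16", 20),
        (2, "2021-02-02", 10), (2, "2021-02-06", 15),
        (1, "2021-02-17", 10), (1, "2021-03-05", 15),
        (1, "2021-03-17", 10), (2, "2021-03-10", 10),
    ],
)

query = """
WITH ranked AS (
    -- rn = 1 is the latest row for that id within that calendar month
    SELECT id, datum, value,
           ROW_NUMBER() OVER (
               PARTITION BY id, strftime('%Y-%m', datum)
               ORDER BY datum DESC
           ) AS rn
    FROM id_value
    WHERE datum BETWEEN :p_from_date AND :p_to_date
)
SELECT :p_to_date AS datum, id, SUM(value) AS total
FROM ranked
WHERE rn = 1
GROUP BY id
ORDER BY id
"""
params = {"p_from_date": "2021-01-01", "p_to_date": "2021-03-31"}
monthly_sums = conn.execute(query, params).fetchall()
for row in monthly_sums:
    print(row)
```

In Oracle the equivalent filter would compare the DATE column against the procedure's DATE parameters directly; the string comparison here works only because ISO dates sort lexically.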

Current record with group by function

I'm trying to get the most recent aggregate value for each session_id of a given userid:
session_id 3 has two records, recent agg value is 80.00
session_id 4 has four records, recent agg value is 95.00
session_id 6 has three records, recent agg value is 72.00
Table:session_agg
id session_id userid agg date
-- ---------- ------ ----- -------
1 3 11 60.00 1573561586
4 3 11 80.00 1573561586
6 4 11 35.00 1573561749
7 4 11 50.00 1573561751
8 4 11 70.00 1573561912
10 4 11 95.00 1573561921
11 6 14 40.00 1573561945
12 6 14 67.00 1573561967
13 6 14 72.00 1573561978
select id, session_id, userid, agg, date from session_agg
WHERE date IN (select MAX(date) from session_agg GROUP BY session_id) AND
userid = 11
If you want to stick with your current approach, then you need to correlate the subquery on session_id, so that each row is compared with the max date of its own session:
SELECT id, session_id, userid, agg, date
FROM session_agg sa1
WHERE
  date = (SELECT MAX(date) FROM session_agg sa2 WHERE sa2.session_id = sa1.session_id) AND
  userid = 11;
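But, if your version of SQL supports analytic functions, ROW_NUMBER is an easier way to do this. Note that both rows of session_id 3 share the same date in the sample data, so id is added as a tie-breaker to pick the later-inserted row (agg 80.00):
WITH cte AS (
  SELECT *, ROW_NUMBER() OVER (PARTITION BY session_id ORDER BY date DESC, id DESC) rn
  FROM session_agg
)
SELECT id, session_id, userid, agg, date
FROM cte
WHERE rn = 1 AND userid = 11;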
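The effect of the duplicate timestamp in session_id 3 is easy to see by running both approaches on the sample data. This is a minimal sketch, assuming SQLite and treating a higher id as "more recent" when two dates are equal (an assumption the question does not confirm).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE session_agg "
    "(id INTEGER, session_id INTEGER, userid INTEGER, agg REAL, date INTEGER)"
)
# Sample rows from the question.
conn.executemany(
    "INSERT INTO session_agg VALUES (?, ?, ?, ?, ?)",
    [
        (1, 3, 11, 60.00, 1573561586), (4, 3, 11, 80.00, 1573561586),
        (6, 4, 11, 35.00, 1573561749), (7, 4, 11, 50.00, 1573561751),
        (8, 4, 11, 70.00, 1573561912), (10, 4, 11, 95.00, 1573561921),
        (11, 6, 14, 40.00, 1573561945), (12, 6, 14, 67.00, 1573561967),
        (13, 6, 14, 72.00, 1573561978),
    ],
)

# Correlated subquery: every row that matches its session's max date is returned,
# so the tied rows of session 3 both come back.
correlated = """
SELECT id
FROM session_agg AS sa1
WHERE date = (SELECT MAX(date) FROM session_agg AS sa2
              WHERE sa2.session_id = sa1.session_id)
  AND userid = 11
ORDER BY id
"""
correlated_ids = [r[0] for r in conn.execute(correlated)]

# ROW_NUMBER with id as a tie-breaker: exactly one row per session.
ranked = """
WITH cte AS (
    SELECT id, session_id, userid, agg, date,
           ROW_NUMBER() OVER (
               PARTITION BY session_id ORDER BY date DESC, id DESC
           ) AS rn
    FROM session_agg
)
SELECT id, session_id, userid, agg, date
FROM cte
WHERE rn = 1 AND userid = 11
ORDER BY session_id
"""
latest_rows = conn.execute(ranked).fetchall()

print(correlated_ids)
for row in latest_rows:
    print(row)
```

The correlated version returns ids 1, 4 and 10, while the ranked version returns one row per session (ids 4 and 10).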

SELECT query for skipping rows with duplicates but leaving the first and the last occurrences in PostgreSQL

I have a table with items, dates, and prices, and I am trying to write a SELECT query in PostgreSQL that skips rows with duplicate prices, so that only the first and last occurrence in each consecutive run of the same price remain. After a price change, the price can go back to a previous value, and that later run should be preserved as well.
id date price item
1 20.10.2018 10 a
2 21.10.2018 10 a
3 22.10.2018 10 a
4 23.10.2018 15 a
5 24.10.2018 15 a
6 25.10.2018 15 a
7 26.10.2018 10 a
8 27.10.2018 10 a
9 28.10.2018 10 a
10 29.10.2018 10 a
11 26.10.2018 3 b
12 27.10.2018 3 b
13 28.10.2018 3 b
14 29.10.2018 3 c
Result:
id date price item
1 20.10.2018 10 a
3 22.10.2018 10 a
4 23.10.2018 15 a
6 25.10.2018 15 a
7 26.10.2018 10 a
10 29.10.2018 10 a
11 26.10.2018 3 b
13 28.10.2018 3 b
14 29.10.2018 3 c
You can use lag() and lead():
select id, date, price, item
from (select t.*,
lag(price) over (partition by item order by date) as prev_price,
lead(price) over (partition by item order by date) as next_price
from t
) t
where prev_price is null or prev_price <> price or
next_price is null or next_price <> price
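The lag()/lead() filter can be verified against the sample table. This is a minimal sketch, assuming SQLite in place of PostgreSQL and dates rewritten as ISO strings so that they sort correctly; the window-function syntax is the same in both engines. The final ORDER BY id is added only for a stable output order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, date TEXT, price INTEGER, item TEXT)")
# Sample rows from the question.
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?)",
    [
        (1, "2018-10-20", 10, "a"), (2, "2018-10-21", 10, "a"),
        (3, "2018-10-22", 10, "a"), (4, "2018-10-23", 15, "a"),
        (5, "2018-10-24", 15, "a"), (6, "2018-10-25", 15, "a"),
        (7, "2018-10-26", 10, "a"), (8, "2018-10-27", 10, "a"),
        (9, "2018-10-28", 10, "a"), (10, "2018-10-29", 10, "a"),
        (11, "2018-10-26", 3, "b"), (12, "2018-10-27", 3, "b"),
        (13, "2018-10-28", 3, "b"), (14, "2018-10-29", 3, "c"),
    ],
)

query = """
SELECT id, date, price, item
FROM (
    SELECT t.*,
           LAG(price)  OVER (PARTITION BY item ORDER BY date) AS prev_price,
           LEAD(price) OVER (PARTITION BY item ORDER BY date) AS next_price
    FROM t
) AS x
-- a row starts a run if the previous price differs (or is missing),
-- and ends a run if the next price differs (or is missing)
WHERE prev_price IS NULL OR prev_price <> price
   OR next_price IS NULL OR next_price <> price
ORDER BY id
"""
kept_ids = [row[0] for row in conn.execute(query)]
print(kept_ids)
```

Rows in the middle of a run (ids 2, 5, 8, 9 and 12) have the same price on both sides and are the only ones dropped.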