Estimation of Cumulative value every 3 months in SQL

I have a table like this:
ID Date Prod
1 1/1/2009 5
1 2/1/2009 5
1 3/1/2009 5
1 4/1/2009 5
1 5/1/2009 5
1 6/1/2009 5
1 7/1/2009 5
1 8/1/2009 5
1 9/1/2009 5
And I need to get the following result:
ID Date Prod CumProd
1 2009/03/01 5 15 ---Each 3 months
1 2009/06/01 5 30 ---Each 3 months
1 2009/09/01 5 45 ---Each 3 months
What could be the best approach to take in SQL?

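You can try the following, which uses window functions (DATEPART is SQL Server syntax; dateval is the date column):
select * from
(
-- cum_sum: running total up to and including the row's quarter
-- rn: 1 marks the last month of each quarter, so the row returned is the quarter end
select *,sum(prod) over(order by DATEPART(qq,dateval)) as cum_sum,
row_number() over(partition by DATEPART(qq,dateval) order by dateval desc) as rn
from t
)A where rn=1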

How about just filtering on the month number?
select t.*
from (select id, date, prod, sum(prod) over (partition by id order by date) as running_prod
from t
) t
where month(date) in (3, 6, 9, 12);
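
To check the month-filter idea against the sample data, here is a minimal sketch using Python's built-in sqlite3 module. It assumes ISO-formatted dates and an SQLite build with window-function support (3.25 or later); SQLite has no month() function, so strftime('%m', ...) stands in for it.

```python
import sqlite3

# In-memory table mirroring the question's sample data (dates rewritten as ISO strings).
conn = sqlite3.connect(":memory:")
conn.execute("create table t (id integer, date text, prod integer)")
rows = [(1, f"2009-{m:02d}-01", 5) for m in range(1, 10)]
conn.executemany("insert into t values (?, ?, ?)", rows)

# Running total per id, then keep only the quarter-end months.
query = """
select id, date, prod, running_prod
from (select id, date, prod,
             sum(prod) over (partition by id order by date) as running_prod
      from t) as q
where cast(strftime('%m', date) as integer) in (3, 6, 9, 12)
order by id, date
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

The running total has to be computed in the subquery before filtering; filtering first would sum only the quarter-end rows.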

Related

Select max of nested id from amazon redshift

My database is Amazon Redshift.
I have a table that looks like this -
id nested_id date value
1 10 '2021-01-01' 5
1 20 '2021-01-01' 10
1 10 '2021-01-02' 6
1 20 '2021-01-02' 11
1 10 '2021-01-03' 7
1 20 '2021-01-03' 12
2 30 '2021-01-01' 5
2 40 '2021-01-01' 10
2 30 '2021-01-02' 6
2 40 '2021-01-02' 11
2 30 '2021-01-03' 7
2 40 '2021-01-03' 12
So this is basically a table that tracks values by id over time, except for every id there can be a nested_id. And the dates and values are primarily connected to the nested_id.
However, let's say I'm starting with the id field, but for each id I want to only return the points over time for the nested_id that has the greater sum of points.
So right now I'm just grabbing it like this...
select *
from mytable
where id in (1, 2)
except I only want it to return nested_id rows where the maximum value of that nested_id is the greatest.
So here's how I would do this manually.
For id of 1, the maximum value is 12, and the nested_id of that value is 20
For id of 2, the maximum value is 12, and the nested_id of that value is 40
So my return table should be
id nested_id date value
1 20 '2021-01-01' 10
1 20 '2021-01-02' 11
1 20 '2021-01-03' 12
2 40 '2021-01-01' 10
2 40 '2021-01-02' 11
2 40 '2021-01-03' 12
Is there an easy way of performing this query? I'm assuming you have to partition somehow?

You can solve this with the row_number window function. The filter on maxs.rn already discards unmatched rows, so a plain join is enough:
with maxs as (
select id,
nested_id,
value,
row_number() over (partition by id order by value desc) rn
from mytable
)
select mt.*
from mytable mt
join maxs on mt.id = maxs.id and mt.nested_id = maxs.nested_id
where maxs.rn = 1
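
As a quick check, the same query can be run against the sample rows with Python's sqlite3 module. This is only a sketch: SQLite (3.25 or later) stands in for Redshift here, and the final order by is added so the output is deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table mytable (id integer, nested_id integer, date text, value integer)"
)
# Sample rows from the question.
data = [
    (1, 10, "2021-01-01", 5), (1, 20, "2021-01-01", 10),
    (1, 10, "2021-01-02", 6), (1, 20, "2021-01-02", 11),
    (1, 10, "2021-01-03", 7), (1, 20, "2021-01-03", 12),
    (2, 30, "2021-01-01", 5), (2, 40, "2021-01-01", 10),
    (2, 30, "2021-01-02", 6), (2, 40, "2021-01-02", 11),
    (2, 30, "2021-01-03", 7), (2, 40, "2021-01-03", 12),
]
conn.executemany("insert into mytable values (?, ?, ?, ?)", data)

# rn = 1 marks the row holding each id's largest value; joining back on
# (id, nested_id) returns every row of the nested_id that owns that value.
query = """
with maxs as (
  select id, nested_id, value,
         row_number() over (partition by id order by value desc) as rn
  from mytable
)
select mt.id, mt.nested_id, mt.date, mt.value
from mytable mt
join maxs on mt.id = maxs.id and mt.nested_id = maxs.nested_id
where maxs.rn = 1
order by mt.id, mt.date
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

If two nested_ids share the same maximum value for an id, row_number picks one of them arbitrarily; add a tie-breaker to the order by if that matters.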

How to use 2 columns as "key" to get MAX value of selection (and on to next "key") in a SQL query

Using a SQL query, I am trying to get the max value from multiple rows, using 2 columns as the 'key', then sum those max values and move on to the next 'key'.
Here is an example table. It has years, userid and points. Each year has several weeks.
What I want to do is take each user's MAX points for each year and SUM them.
year userid week points
2020 1 1 3
2020 1 3 3
2020 1 3 5
2020 1 4 12
2020 2 1 4
2020 2 2 4
2020 2 3 6
2020 2 4 10
2021 1 1 4
2021 1 2 5
2021 1 3 8
2021 1 4 9
2021 2 1 3
2021 2 2 6
2021 2 3 7
2021 2 4 13
I'd like the result for each year to be
User 1:
2020, 1, 12
2021, 1, 9
User 2:
2020, 2, 10
2021, 2, 13
...and after summing them, sorted by points (10 + 13 = 23 for user 2, 12 + 9 = 21 for user 1):
userid points
2 23
1 21
...and so forth (adding on users and years)
Any help is very much appreciated.
Per Gordon's helpful answer this is the query:
SELECT username, userdb.userid, SUM(points) as points
FROM (SELECT standing.*,
row_number() over (partition by standing.userid, year ORDER BY points desc) AS seqnum
FROM standing
) t
JOIN userdb on userdb.userid = t.userid
WHERE seqnum = 1
GROUP BY userid
ORDER BY points DESC

You can use two levels of aggregation:
select userid, sum(max_points)
from (select userid, year, max(points) as max_points
from t
group by userid, year
) uy
group by userid;
Alternatively, you can filter with a window function and keep only each user's top row per year:
select userid, sum(points)
from (select t.*,
row_number() over (partition by userid, year order by points desc) as seqnum
from t
) t
where seqnum = 1
group by userid;
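
Here is a small sketch of the two-level aggregation run against the question's sample table, using Python's sqlite3 module. The table name t comes from the answer; the total_points alias and the final order by are additions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table t (year integer, userid integer, week integer, points integer)"
)
# Sample rows from the question, copied as given.
data = [
    (2020, 1, 1, 3), (2020, 1, 3, 3), (2020, 1, 3, 5), (2020, 1, 4, 12),
    (2020, 2, 1, 4), (2020, 2, 2, 4), (2020, 2, 3, 6), (2020, 2, 4, 10),
    (2021, 1, 1, 4), (2021, 1, 2, 5), (2021, 1, 3, 8), (2021, 1, 4, 9),
    (2021, 2, 1, 3), (2021, 2, 2, 6), (2021, 2, 3, 7), (2021, 2, 4, 13),
]
conn.executemany("insert into t values (?, ?, ?, ?)", data)

# Inner query: best points per (userid, year). Outer query: sum those per user.
query = """
select userid, sum(max_points) as total_points
from (select userid, year, max(points) as max_points
      from t
      group by userid, year) as uy
group by userid
order by total_points desc
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

With this data the yearly maxima are 12 and 9 for user 1 and 10 and 13 for user 2.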

To write a Oracle stored procedure to get data between the months

I have a procedure sp_data_between_months (p_from_date DATE, p_to_date DATE), called for example with p_from_date = '01-jan-2021' and p_to_date = '31-mar-2021'.
For each ID, I need to get the latest record in each month, add up those values, and report the total against p_to_date, reading from the table below using PL/SQL.
Table Name: ID_Value
ID Date value
1 1-jan-2021 10
1 10-jan-2021 20
2 15-jan-2021 15
2 16-jan-2021 20
2 02-feb-2021 10
2 06-feb-2021 15
1 17-feb-2021 10
1 5-mar-2021 15
1 17-mar-2021 10
2 10-mar-2021 10
The expected output is, for each ID, the sum of the latest value in each month within the given range, reported against p_to_date.
Output: p_to_date, ID, sum of the latest record's value for each month
DATE ID VALUE
31-Mar-2021 1 40 //(20+10+10) sum of the latest record's value for each month
31-Mar-2021 2 45 //(20+15+10)

Here you are. Read the comments within the code.
with
temp as
-- analytic function will return 1 for the latest row for that ID in that month
(select id, datum, value,
row_number() over (partition by id, trunc(datum, 'mm') order by datum desc) rn
from id_value
)
-- finally, select last day in MAX month and sum all values for RN = 1
select
id,
last_day(max(datum)) datum,
sum(value)
from temp
where rn = 1
group by id;

ID DATUM SUM(VALUE)
---------- ----------- ----------
1 31-mar-2021 40
2 31-mar-2021 45
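
The query above does not use the procedure's two parameters. As a hedged sketch of how the date range could be applied, the version below runs the same row_number idea in SQLite through Python's sqlite3 module. It is not Oracle PL/SQL: strftime('%Y-%m', ...) stands in for trunc(datum, 'mm'), dates are assumed to be ISO strings, and :p_from_date and :p_to_date are bound parameters named after the question's arguments.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table id_value (id integer, datum text, value integer)")
# Sample rows from the question, with dates rewritten as ISO strings.
conn.executemany("insert into id_value values (?, ?, ?)", [
    (1, "2021-01-01", 10), (1, "2021-01-10", 20),
    (2, "2021-01-15", 15), (2, "2021-01-16", 20),
    (2, "2021-02-02", 10), (2, "2021-02-06", 15),
    (1, "2021-02-17", 10), (1, "2021-03-05", 15),
    (1, "2021-03-17", 10), (2, "2021-03-10", 10),
])

# rn = 1 marks the latest row per id and month inside the requested range;
# the outer query sums those rows and reports them against p_to_date.
query = """
with temp as (
  select id, datum, value,
         row_number() over (partition by id, strftime('%Y-%m', datum)
                            order by datum desc) as rn
  from id_value
  where datum between :p_from_date and :p_to_date
)
select :p_to_date as datum, id, sum(value) as total
from temp
where rn = 1
group by id
order by id
"""
params = {"p_from_date": "2021-01-01", "p_to_date": "2021-03-31"}
result = conn.execute(query, params).fetchall()
for row in result:
    print(row)
```

In Oracle the same where clause could go inside the temp subquery, with the procedure's DATE parameters used directly.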

Current record with group by function

I am trying to get the most recent agg value for each session_id of a given userid:
session_id 3 has two records; the most recent agg value is 80.00
session_id 4 has four records; the most recent agg value is 95.00
session_id 6 has three records; the most recent agg value is 72.00
Table:session_agg
id session_id userid agg date
-- ---------- ------ ----- -------
1 3 11 60.00 1573561586
4 3 11 80.00 1573561586
6 4 11 35.00 1573561749
7 4 11 50.00 1573561751
8 4 11 70.00 1573561912
10 4 11 95.00 1573561921
11 6 14 40.00 1573561945
12 6 14 67.00 1573561967
13 6 14 72.00 1573561978
select id, session_id, userid, agg, date from session_agg
WHERE date IN (select MAX(date) from session_agg GROUP BY session_id) AND
userid = 11

If you want to stick with your current approach, then you need to correlate the session_id in the subquery which checks for the max date for each session:
SELECT id, session_id, userid, agg, date
FROM session_agg sa1
WHERE
date = (SELECT MAX(date) FROM session_agg sa2 WHERE sa2.session_id = sa1.session_id) AND
userid = 11;
But, if your version of SQL supports analytic functions, ROW_NUMBER is an easier way to do this:
WITH cte AS (
SELECT *, ROW_NUMBER() OVER (PARTITION BY session_id ORDER BY date DESC) rn
FROM session_agg
)
SELECT id, session_id, userid, agg, date
FROM cte
WHERE rn = 1;
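
One detail in the sample data is worth testing: session_id 3 has two rows with the same date value, so ordering by date alone does not decide which one is "most recent". The sketch below, run in SQLite through Python's sqlite3 module, adds id as a tie-breaker. That choice is an assumption (it treats the higher id as the later record), and the userid = 11 filter is carried over from the question's query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table session_agg "
    "(id integer, session_id integer, userid integer, agg real, date integer)"
)
# Sample rows from the question.
conn.executemany("insert into session_agg values (?, ?, ?, ?, ?)", [
    (1, 3, 11, 60.00, 1573561586), (4, 3, 11, 80.00, 1573561586),
    (6, 4, 11, 35.00, 1573561749), (7, 4, 11, 50.00, 1573561751),
    (8, 4, 11, 70.00, 1573561912), (10, 4, 11, 95.00, 1573561921),
    (11, 6, 14, 40.00, 1573561945), (12, 6, 14, 67.00, 1573561967),
    (13, 6, 14, 72.00, 1573561978),
])

# Order by date, then by id (assumed tie-breaker), newest first.
query = """
with cte as (
  select *,
         row_number() over (partition by session_id
                            order by date desc, id desc) as rn
  from session_agg
)
select id, session_id, userid, agg, date
from cte
where rn = 1 and userid = 11
order by session_id
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

Without the tie-breaker, the correlated-subquery version would return both rows for session_id 3, and the ROW_NUMBER version would return one of them arbitrarily.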

SELECT query for skipping rows with duplicates but leaving the first and the last occurrences in PostgreSQL

I have a table with items, dates, and prices and I am trying to find a way to write a SELECT query in PostgreSQL which will skip rows with duplicate prices so that, only the first and last occurrence of the same price in a row would stay. After the price change, it can go back to the previous value and it should be preserved as well.
id date price item
1 20.10.2018 10 a
2 21.10.2018 10 a
3 22.10.2018 10 a
4 23.10.2018 15 a
5 24.10.2018 15 a
6 25.10.2018 15 a
7 26.10.2018 10 a
8 27.10.2018 10 a
9 28.10.2018 10 a
10 29.10.2018 10 a
11 26.10.2018 3 b
12 27.10.2018 3 b
13 28.10.2018 3 b
14 29.10.2018 3 c
Result:
id date price item
1 20.10.2018 10 a
3 22.10.2018 10 a
4 23.10.2018 15 a
6 25.10.2018 15 a
7 26.10.2018 10 a
10 29.10.2018 10 a
11 26.10.2018 3 b
13 28.10.2018 3 b
14 29.10.2018 3 c

You can use lag() and lead():
select id, date, price, item
from (select t.*,
lag(price) over (partition by item order by date) as prev_price,
lead(price) over (partition by item order by date) as next_price
from t
) t
where prev_price is null or prev_price <> price or
next_price is null or next_price <> price
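
To see the lag()/lead() filter at work on the sample rows, here is a minimal sketch using Python's sqlite3 module instead of PostgreSQL. The dates are rewritten as ISO strings so they sort chronologically, and the final order by id is added only to make the output stable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table t (id integer, date text, price integer, item text)")
# Sample rows from the question.
conn.executemany("insert into t values (?, ?, ?, ?)", [
    (1, "2018-10-20", 10, "a"), (2, "2018-10-21", 10, "a"),
    (3, "2018-10-22", 10, "a"), (4, "2018-10-23", 15, "a"),
    (5, "2018-10-24", 15, "a"), (6, "2018-10-25", 15, "a"),
    (7, "2018-10-26", 10, "a"), (8, "2018-10-27", 10, "a"),
    (9, "2018-10-28", 10, "a"), (10, "2018-10-29", 10, "a"),
    (11, "2018-10-26", 3, "b"), (12, "2018-10-27", 3, "b"),
    (13, "2018-10-28", 3, "b"), (14, "2018-10-29", 3, "c"),
])

# A row is kept when it starts a run (previous price differs or is missing)
# or ends a run (next price differs or is missing).
query = """
select id, date, price, item
from (select t.*,
             lag(price) over (partition by item order by date) as prev_price,
             lead(price) over (partition by item order by date) as next_price
      from t) as q
where prev_price is null or prev_price <> price
   or next_price is null or next_price <> price
order by id
"""
result = conn.execute(query).fetchall()
kept_ids = [row[0] for row in result]
print(kept_ids)
```

Rows in the middle of a run have matching neighbours on both sides, so they are the only ones dropped.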
