How to get most recent balance for every user and its corresponding dates - sql

I have a table called balances. I want to get the most recent balance for each user, for every financial year, along with the date it was updated.
name   balance  financial_year  date_updated
Bob    20       2021            2021-04-03
Bob    58       2019            2019-11-13
Bob    43       2019            2022-01-24
Bob    -4       2019            2019-12-04
James  92       2021            2021-09-11
James  86       2021            2021-08-18
James  33       2019            2019-03-24
James  46       2019            2019-02-12
James  59       2019            2019-08-12
So my desired output would be:
name   balance  financial_year  date_updated
Bob    20       2021            2021-04-03
Bob    43       2019            2022-01-24
James  92       2021            2021-09-11
James  59       2019            2019-08-12
I've attempted this, but found that using max() does not always work, since I apply it to more than one column:
SELECT name, max(balance), financial_year, max(date_updated)
FROM balances
group by name, financial_year

select NAME
,BALANCE
,FINANCIAL_YEAR
,DATE_UPDATED
from (
select t.*
,row_number() over(partition by name, financial_year order by date_updated desc) as rn
from t
) t
where rn = 1
NAME   BALANCE  FINANCIAL_YEAR  DATE_UPDATED
Bob    43       2019            24-JAN-22
Bob    20       2021            03-APR-21
James  59       2019            12-AUG-19
James  92       2021            11-SEP-21
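For anyone who wants to try the ROW_NUMBER approach locally, here is a minimal sketch that runs it against the question's sample rows. It uses Python's built-in sqlite3 module as a stand-in engine (an assumption, the question does not name one) and needs SQLite 3.25 or newer for window functions. The names ROWS, QUERY and latest_balances are made up for this example.

```python
import sqlite3

# Sample rows from the question: (name, balance, financial_year, date_updated).
ROWS = [
    ("Bob", 20, 2021, "2021-04-03"),
    ("Bob", 58, 2019, "2019-11-13"),
    ("Bob", 43, 2019, "2022-01-24"),
    ("Bob", -4, 2019, "2019-12-04"),
    ("James", 92, 2021, "2021-09-11"),
    ("James", 86, 2021, "2021-08-18"),
    ("James", 33, 2019, "2019-03-24"),
    ("James", 46, 2019, "2019-02-12"),
    ("James", 59, 2019, "2019-08-12"),
]

# Number the rows of each (name, financial_year) group from newest to oldest
# and keep only the newest one.
QUERY = """
SELECT name, balance, financial_year, date_updated
FROM (
    SELECT b.*,
           ROW_NUMBER() OVER (
               PARTITION BY name, financial_year
               ORDER BY date_updated DESC
           ) AS rn
    FROM balances AS b
) AS ranked
WHERE rn = 1
ORDER BY name, financial_year
"""


def latest_balances(rows):
    """Return the most recently updated row per (name, financial_year)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE balances (name TEXT, balance INTEGER, "
        "financial_year INTEGER, date_updated TEXT)"
    )
    conn.executemany("INSERT INTO balances VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


for row in latest_balances(ROWS):
    print(row)
```

The dates are stored as ISO text (YYYY-MM-DD) here, which sorts correctly as plain strings; with a real DATE column the ORDER BY works the same way.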

The problem is not that you use max() across multiple columns, but that max() is evaluated for each column independently: it returns the maximum value of that column within the group, not the value from the row with the latest date. In your example, Bob's highest balance in financial year 2019 was 58. The 'highest' (last) date_updated was 2022-01-24, but at that time the balance was 43.
What you're looking for is the balance as of the last update within a financial year per user, something like:
SELECT b.name, b.financial_year, b.balance, b.date_updated
FROM balances b
INNER JOIN (SELECT name, financial_year, max(date_updated) last_updated
FROM balances GROUP BY name, financial_year) u
ON b.name = u.name AND b.financial_year = u.financial_year AND b.date_updated = u.last_updated;
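To make the difference concrete, here is a small sketch that runs both the original max() query and the join on Bob's three rows for 2019. It uses Python's sqlite3 module purely as a convenient test engine (an assumption; the question does not say which database is in use), and the variable names naive and joined are made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE balances (name TEXT, balance INTEGER, "
    "financial_year INTEGER, date_updated TEXT)"
)
# Bob's three rows for financial year 2019, taken from the question.
conn.executemany(
    "INSERT INTO balances VALUES (?, ?, ?, ?)",
    [
        ("Bob", 58, 2019, "2019-11-13"),
        ("Bob", 43, 2019, "2022-01-24"),
        ("Bob", -4, 2019, "2019-12-04"),
    ],
)

# Each MAX() is computed on its own column, so the two values can come
# from different rows.
naive = conn.execute(
    """
    SELECT name, MAX(balance), financial_year, MAX(date_updated)
    FROM balances
    GROUP BY name, financial_year
    """
).fetchall()

# Joining back on the latest date returns the balance stored in that row.
joined = conn.execute(
    """
    SELECT b.name, b.financial_year, b.balance, b.date_updated
    FROM balances b
    INNER JOIN (
        SELECT name, financial_year, MAX(date_updated) AS last_updated
        FROM balances
        GROUP BY name, financial_year
    ) u
      ON b.name = u.name
     AND b.financial_year = u.financial_year
     AND b.date_updated = u.last_updated
    """
).fetchall()
conn.close()

print(naive)   # [('Bob', 58, 2019, '2022-01-24')]
print(joined)  # [('Bob', 2019, 43, '2022-01-24')]
```

One thing to keep in mind: if two rows in the same group share the latest date_updated, the join returns both of them, whereas a ROW_NUMBER() filter keeps exactly one.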

Related

How do you get the last entry for each month in SQL?

I am looking to filter very large tables down to the latest entry per user per month. I'm not sure if I found the best way to do this. I know I "should" trust the SQL engine (Snowflake), but there is a part of me that does not like the join on three columns.
Note that this is a very common operation on many big tables, and I want to use it in dbt views, which means it will be run all the time.
To illustrate, my data is of this form:
mytable
userId  loginDate   year  month  value
1       2021-01-04  2021  1      41.1
1       2021-01-06  2021  1      411.1
1       2021-01-25  2021  1      251.1
2       2021-01-05  2021  1      4369
2       2021-02-06  2021  2      32
2       2021-02-14  2021  2      731
3       2021-01-20  2021  1      258
3       2021-02-19  2021  2      4251
3       2021-03-15  2021  3      171
And I'm trying to use SQL to get the last value (by loginDate) for each month.
I'm currently doing a groupby & a join as follows:
WITH latest_entry_by_month AS (
SELECT "userId", "year", "month", max("loginDate") AS "loginDate"
FROM mytable
GROUP BY "userId", "year", "month"
)
SELECT * FROM mytable NATURAL JOIN latest_entry_by_month
The above results in my desired output:
userId  loginDate   year  month  value
1       2021-01-25  2021  1      251.1
2       2021-01-05  2021  1      4369
2       2021-02-14  2021  2      731
3       2021-01-20  2021  1      258
3       2021-02-19  2021  2      4251
3       2021-03-15  2021  3      171
But I'm not sure if it's optimal.
Any guidance on how to do this faster? Note that I am not materializing the underlying data, so it is effectively un-clustered (I'm getting it from a vendor via the Snowflake marketplace).
Using QUALIFY and a window function (ROW_NUMBER). The column names are quoted to match the case-sensitive identifiers used in the question:
SELECT *
FROM mytable
QUALIFY ROW_NUMBER() OVER(PARTITION BY "userId", "year", "month"
ORDER BY "loginDate" DESC) = 1
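QUALIFY is not standard SQL, so it may help to see what it is shorthand for: compute the window function in a subquery and filter on it one level up. Below is a rough sketch of that equivalent form on the question's sample data, run through Python's sqlite3 module. SQLite is only a stand-in here (it has no QUALIFY, and needs version 3.25 or newer for window functions); the names LOGINS, QUERY and last_entry_per_month are made up for this example.

```python
import sqlite3

# Sample rows from the question: (userId, loginDate, year, month, value).
LOGINS = [
    (1, "2021-01-04", 2021, 1, 41.1),
    (1, "2021-01-06", 2021, 1, 411.1),
    (1, "2021-01-25", 2021, 1, 251.1),
    (2, "2021-01-05", 2021, 1, 4369),
    (2, "2021-02-06", 2021, 2, 32),
    (2, "2021-02-14", 2021, 2, 731),
    (3, "2021-01-20", 2021, 1, 258),
    (3, "2021-02-19", 2021, 2, 4251),
    (3, "2021-03-15", 2021, 3, 171),
]

# Subquery form of "QUALIFY ROW_NUMBER() OVER (...) = 1".
QUERY = """
SELECT userId, loginDate, year, month, value
FROM (
    SELECT m.*,
           ROW_NUMBER() OVER (
               PARTITION BY userId, year, month
               ORDER BY loginDate DESC
           ) AS rn
    FROM mytable AS m
) AS ranked
WHERE rn = 1
ORDER BY userId, loginDate
"""


def last_entry_per_month(rows):
    """Return the latest row per (userId, year, month)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE mytable (userId INTEGER, loginDate TEXT, "
        "year INTEGER, month INTEGER, value REAL)"
    )
    conn.executemany("INSERT INTO mytable VALUES (?, ?, ?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


for row in last_entry_per_month(LOGINS):
    print(row)
```

Either form reads the table once and needs no join, which is the part of the original query the question was uneasy about.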

How to filter data based on top values from a specific year in SQL?

Let's assume my data looks like this:
year person cash
0 2020 personone 29
1 2021 personone 40
2 2020 persontwo 17
3 2021 persontwo 13
4 2020 personthree 62
5 2021 personthree 55
What I want to do is the following. I'd like to get the top 2 people comparing their cash based on year 2021. We can see that in 2021 personone and personthree are the top 2 people, then it can be ordered by cash in 2021. So the output I'm after is:
year person cash
0 2020 personthree 62
1 2021 personthree 55
2 2020 personone 29
3 2021 personone 40
I've been trying a similar approach to the one described here, without much luck.
We can use DENSE_RANK here, ranking all people by their 2021 cash and keeping those ranked first or second:
WITH cte AS (
SELECT *, DENSE_RANK() OVER (ORDER BY cash DESC) dr
FROM yourTable
WHERE year = 2021
)
SELECT *
FROM yourTable
WHERE person IN (SELECT person FROM cte WHERE dr <= 2);
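Here is a sketch of the top-2-by-2021 selection run end to end, including the ordering asked for in the question. The ranking is done over all 2021 rows (no PARTITION BY) and ranks 1 and 2 are kept. Python's sqlite3 module is used as a stand-in engine (an assumption, SQLite 3.25 or newer), and the names DATA, QUERY and top_two_by_2021 are made up for this example.

```python
import sqlite3

# Sample rows from the question: (year, person, cash).
DATA = [
    (2020, "personone", 29),
    (2021, "personone", 40),
    (2020, "persontwo", 17),
    (2021, "persontwo", 13),
    (2020, "personthree", 62),
    (2021, "personthree", 55),
]

# Rank people by their 2021 cash, keep ranks 1 and 2, then return all of
# their rows ordered by 2021 cash (highest first) and by year.
QUERY = """
WITH ranked AS (
    SELECT person,
           cash AS cash_2021,
           DENSE_RANK() OVER (ORDER BY cash DESC) AS dr
    FROM yourTable
    WHERE year = 2021
)
SELECT t.year, t.person, t.cash
FROM yourTable AS t
JOIN ranked AS r ON r.person = t.person
WHERE r.dr <= 2
ORDER BY r.cash_2021 DESC, t.year
"""


def top_two_by_2021(rows):
    """Return every row of the two people with the highest 2021 cash."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE yourTable (year INTEGER, person TEXT, cash INTEGER)")
    conn.executemany("INSERT INTO yourTable VALUES (?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


for row in top_two_by_2021(DATA):
    print(row)
```

Because DENSE_RANK gives tied values the same rank, more than two people can come back if there is a tie in 2021; ROW_NUMBER would cap it at exactly two.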

How Do I retrieve most Recent record in different years With Date date in different table

I'm working with a database that isn't structured that well and need to retrieve the row with the latest month used in specific years. The main data is stored in the member table, which lists one row per member month. The date for the member month is not stored there directly; instead, a foreign key Date_Key links to a Date table, from which the Year and Month can be derived. Each row in the Date table represents one new month of a year, and each of these rows has a unique sequential Date_Key.
I am using Microsoft SQL Server Studio as the environment
Member Table
MemberKey  Member_ID  Date_Key
100        1234       89
101        1234       96
102        1234       97
103        1236       96
104        1236       97
Date Table
Date_Key  Year  Month
89        2020  10
90        2020  11
91        2020  12
92        2021  1
93        2021  2
94        2021  3
95        2021  4
96        2021  5
97        2021  6
Looking for the following Results
Member_ID  Year  Month
1234       2020  10
1234       2021  6
1236       2021  6
2020/11 is NOT a date. It is a year/month pair. But it seems like a simple aggregate - select year, max(month) group by year. You join and include member ID so you include that column in the GROUP BY clause to get one row per member per year.
select mbr.Member_ID, dts.Year, max(dts.Month) as Month
from dbo.Members as mbr
inner join dbo.Dates as dts on mbr.Date_Key = dts.Date_Key
group by mbr.Member_ID, dts.Year
order by mbr.Member_ID, dts.Year
;
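As a quick check, here is a sketch that loads the two sample tables from the question and runs the same join and GROUP BY. It uses Python's sqlite3 module rather than SQL Server (an assumption made only so the example is runnable anywhere), so the dbo. schema prefix is dropped; the function name latest_month_per_member_year is made up for this example.

```python
import sqlite3


def latest_month_per_member_year():
    """Return (Member_ID, Year, latest Month) using the question's sample data."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Members (MemberKey INTEGER, Member_ID INTEGER, Date_Key INTEGER)"
    )
    conn.execute("CREATE TABLE Dates (Date_Key INTEGER, Year INTEGER, Month INTEGER)")
    conn.executemany(
        "INSERT INTO Members VALUES (?, ?, ?)",
        [(100, 1234, 89), (101, 1234, 96), (102, 1234, 97),
         (103, 1236, 96), (104, 1236, 97)],
    )
    conn.executemany(
        "INSERT INTO Dates VALUES (?, ?, ?)",
        [(89, 2020, 10), (90, 2020, 11), (91, 2020, 12),
         (92, 2021, 1), (93, 2021, 2), (94, 2021, 3),
         (95, 2021, 4), (96, 2021, 5), (97, 2021, 6)],
    )
    # One output row per member per year, carrying the highest month seen.
    rows = conn.execute(
        """
        SELECT mbr.Member_ID, dts.Year, MAX(dts.Month) AS Month
        FROM Members AS mbr
        INNER JOIN Dates AS dts ON mbr.Date_Key = dts.Date_Key
        GROUP BY mbr.Member_ID, dts.Year
        ORDER BY mbr.Member_ID, dts.Year
        """
    ).fetchall()
    conn.close()
    return rows


print(latest_month_per_member_year())
# [(1234, 2020, 10), (1234, 2021, 6), (1236, 2021, 6)]
```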

MS Access selecting by year intervals

I have a table where every row has its own date (year of purchase). I need to select the purchases grouped into year intervals.
Example:
Zetor 1993
Zetor 1993
JOHN DEERE 2001
JOHN DEERE 2001
JOHN DEERE 2001
This means I have 2 Zetor purchases in 1993 and 3 JOHN DEERE purchases in 2001. I need to select the count of purchases grouped into these year intervals:
<=1959
1960-1969
1970-1979
1980-1989
1990-1994
1995-1999
2000-2004
2004-2009
2010-2013
I have no idea how to do this.
The result should look like this on the example above:
<=1959 0
1960-1969 0
1970-1979 0
1980-1989 0
1990-1994 2
1995-1999 0
2000-2004 3
2004-2009 0
2010-2013 0
Create table with intervals:
tblRanges([RangeName],[Begins],[Ends])
Populate it with your intervals
Use GROUP BY with your table tblPurchases([Item],YearOfDeal):
SELECT tblRanges.RangeName, Count(tblPurchases.YearOfDeal)
FROM tblRanges INNER JOIN tblPurchases ON (tblRanges.Begins <= tblPurchases.YearOfDeal) AND (tblRanges.Ends >= tblPurchases.YearOfDeal)
GROUP BY tblRanges.RangeName;
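Note that an INNER JOIN only returns ranges that contain at least one purchase. To also get the rows with a count of 0, as in the desired output, the ranges table can be LEFT JOINed to the purchases. Here is a sketch of that variant, run through Python's sqlite3 module as a stand-in for Access (an assumption made only so it is runnable). The range bounds are my own reading of the question's list: 0 is used as the lower bound for "<=1959", and the "2004-2009" interval is entered as 2005-2009 so that no year falls into two ranges.

```python
import sqlite3

# (RangeName, Begins, Ends); bounds are assumptions, see the note above.
RANGES = [
    ("<=1959", 0, 1959),
    ("1960-1969", 1960, 1969),
    ("1970-1979", 1970, 1979),
    ("1980-1989", 1980, 1989),
    ("1990-1994", 1990, 1994),
    ("1995-1999", 1995, 1999),
    ("2000-2004", 2000, 2004),
    ("2005-2009", 2005, 2009),
    ("2010-2013", 2010, 2013),
]

# Sample purchases from the question: (Item, YearOfDeal).
PURCHASES = [
    ("Zetor", 1993),
    ("Zetor", 1993),
    ("JOHN DEERE", 2001),
    ("JOHN DEERE", 2001),
    ("JOHN DEERE", 2001),
]


def count_by_range(ranges, purchases):
    """Count purchases per range, keeping ranges that have none."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tblRanges (RangeName TEXT, Begins INTEGER, Ends INTEGER)")
    conn.execute("CREATE TABLE tblPurchases (Item TEXT, YearOfDeal INTEGER)")
    conn.executemany("INSERT INTO tblRanges VALUES (?, ?, ?)", ranges)
    conn.executemany("INSERT INTO tblPurchases VALUES (?, ?)", purchases)
    # COUNT(p.YearOfDeal) ignores the NULLs produced by unmatched ranges,
    # so empty ranges are reported as 0.
    rows = conn.execute(
        """
        SELECT r.RangeName, COUNT(p.YearOfDeal) AS Purchases
        FROM tblRanges AS r
        LEFT JOIN tblPurchases AS p
               ON p.YearOfDeal BETWEEN r.Begins AND r.Ends
        GROUP BY r.RangeName, r.Begins
        ORDER BY r.Begins
        """
    ).fetchall()
    conn.close()
    return rows


for name, count in count_by_range(RANGES, PURCHASES):
    print(name, count)
```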
You may wish to consider Partition for future use:
SELECT Partition([Year],1960,2014,10) AS [Group], Count(Stock.Year) AS CountOfYear
FROM Stock
GROUP BY Partition([Year],1960,2014,10)
Input:
Tractor Year
Zetor 1993
Zetor 1993
JOHN DEERE 2001
JOHN DEERE 2001
JOHN DEERE 2001
Pre 59 1945
1960 1960
Result:
Group CountOfYear
:1959 1
1960:1969 1
1990:1999 2
2000:2009 3
Reference: http://office.microsoft.com/en-ie/access-help/partition-function-HA001228892.aspx

How to calculate Rank SQL query

Hi, I have the following table, which saves agent rankings on a daily basis, based on ticket status.
No.  Agent Name  Incidents  workorder  Rank  TimeStamp
1    cedric      200        29         1     21 Jan 2011
2    poul        100        10         2     21 Jan 2011
3    dan         200        20         1     21 Jan 2011
4    cedric      100        19         2     22 Jan 2011
5    poul        200        26         1     22 Jan 2011
6    dan         150        20         2     22 Jan 2011
Now I need a query that fetches the ranking between two dates. That is, if I select the dates 21 Jan 2011 to 22 Jan 2011, the query should return each agent's average ranking between those two dates, not the agent's ranking details for each date. I need a single row per agent with his ranking.
Regards,
Iftikhar hashmi
Try
SELECT [Agent Name], AVG(RANK) FROM MY_TABLE WHERE [TimeStamp] BETWEEN DATE1 AND DATE2
GROUP BY [Agent Name]
(Update)
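Thanks to Martin, who reminded me that I need to cast RANK: in SQL Server, AVG over an integer column returns an integer, so the fractional part would be lost.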
SELECT [Agent Name], AVG(CAST(RANK AS FLOAT)) FROM MY_TABLE WHERE [TimeStamp] BETWEEN DATE1 AND DATE2
GROUP BY [Agent Name]
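To see why the cast matters, here is a small sketch on the question's sample data using Python's sqlite3 module. SQLite is only a stand-in (an assumption): its AVG always returns a floating-point value, so the integer truncation is imitated with SUM / COUNT, which performs integer division on integer inputs. The table and column names (agent_ranks, agent_name, daily_rank, ts) and the function average_ranks are made up for this example, and the dates are stored as ISO text.

```python
import sqlite3

# Sample rows from the question: (agent_name, daily_rank, ts).
RANKS = [
    ("cedric", 1, "2011-01-21"),
    ("poul", 2, "2011-01-21"),
    ("dan", 1, "2011-01-21"),
    ("cedric", 2, "2011-01-22"),
    ("poul", 1, "2011-01-22"),
    ("dan", 2, "2011-01-22"),
]


def average_ranks(rows, date_from, date_to):
    """Return (agent, truncated average, floating-point average) per agent."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE agent_ranks (agent_name TEXT, daily_rank INTEGER, ts TEXT)")
    conn.executemany("INSERT INTO agent_ranks VALUES (?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT agent_name,
               SUM(daily_rank) / COUNT(*) AS truncated_avg,  -- integer division
               AVG(daily_rank)            AS float_avg       -- fractional part kept
        FROM agent_ranks
        WHERE ts BETWEEN ? AND ?
        GROUP BY agent_name
        ORDER BY agent_name
        """,
        (date_from, date_to),
    ).fetchall()
    conn.close()
    return result


for row in average_ranks(RANKS, "2011-01-21", "2011-01-22"):
    print(row)
```

With this data every agent has ranks 1 and 2 over the two days, so the true average is 1.5 for each, while the integer version reports 1 for everyone.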