How can I select an ID each month with the highest Mark - sql

I am fairly new to SQL. My table is
id mark datetimes
------|-----|------------
1001 | 10 | 2011-12-20
1002 | 11 | 2012-01-10
1005 | 12 | 2012-01-10
1003 | 10 | 2012-01-10
1004 | 11 | 2018-10-10
1006 | 12 | 2018-10-19
1007 | 13 | 2018-03-12
1008 | 15 | 2018-03-13
I need to select the ID with the highest mark at the end of each month (the year also matters), and an ID can be repeated.
My desired output would be
id mark
-----|----
1001 | 10
1005 | 12
1006 | 12
1008 | 15
So far I've only been able to get the highest mark in each month:
Select Max(Mark)'HighestMark'
From StudentMark
Group BY Year(datetimes), Month(datetimes)
When I tried to
Select Max(Mark)'HighestMark', ID
From StudentMark
Group BY Year(datetimes), Month(datetimes), ID
I get
Id HighestMark
----------- ------------
1001 10
1002 11
1003 12
1004 10
1005 11
1006 12
1007 13
1008 15

You can try either of the following.
Using ROW_NUMBER()
SELECT * FROM
(
SELECT *,
ROW_NUMBER() OVER(PARTITION BY YEAR(DATETIMES)
,MONTH(DATETIMES) ORDER BY MARK DESC) AS RN
FROM [MY_TABLE]
)T WHERE RN=1
Using WITH TIES
SELECT TOP 1 WITH TIES ID, mark AS HighestMarks
FROM [MY_TABLE]
ORDER BY ROW_NUMBER() OVER (PARTITION BY YEAR(datetimes)
,MONTH(datetimes) ORDER BY mark DESC)
Example:
WITH MY AS
(
SELECT
* FROM (VALUES
(1001 , 10 , '2011-12-20'),
(1002 , 11 , '2012-01-10'),
(1005 , 12 , '2012-01-10'),
(1003 , 10 , '2012-01-10'),
(1004 , 11 , '2018-10-10'),
(1006 , 12 , '2018-10-19'),
(1007 , 13 , '2018-03-12'),
(1008 , 15 , '2018-03-13')
) T( id , mark , datetimes)
)
SELECT ID,Mark as HighestMark FROM
(
SELECT *,
ROW_NUMBER() OVER(PARTITION BY YEAR(DATETIMES),MONTH(DATETIMES) ORDER BY MARK DESC) AS RN
FROM MY
)T WHERE RN=1
Output:
ID HighestMark
1001 10
1005 12
1008 15
1006 12

I don't see a way of doing this in a single query. But we can easily enough use one subquery to find the final mark in the month for each student, and another to find the student with the highest final mark.
WITH cte AS (
SELECT *,
ROW_NUMBER() OVER (PARTITION BY ID, CONVERT(varchar(7), datetimes, 126)
ORDER BY datetimes DESC) rn
FROM StudentMark
)
SELECT ID, Mark AS HighestMark
FROM
(
SELECT *,
RANK() OVER (PARTITION BY CONVERT(varchar(7), datetimes, 126)
ORDER BY Mark DESC) rk
FROM cte
WHERE rn = 1
) t
WHERE rk = 1
ORDER BY ID;

In the query below you included the ID column in the GROUP BY, so every ID forms its own group and one row is returned per ID.
Select Max(Mark)'HighestMark', ID From StudentMark Group BY Year(datetimes), Month(datetimes), ID
Remove the ID column from the GROUP BY and try again.

Use RANK in case there are more than 1 student having the same highest mark.
select id, mark
from
(select *,
rank() over( partition by convert(char(7), datetimes, 111) order by mark desc) seqnum
from studentMark ) t
where seqnum = 1

This should work:
select s.ID, t.HighestMark, t.[Month year] from StudentMark s
inner join (
Select
Max(Mark)'HighestMark'
,cast(Year(datetimes) as varchar(10)) +
cast(Month(datetimes) as varchar(10)) [month year]
From StudentMark
Group BY cast(Year(datetimes) as varchar(10))
+ cast(Month(datetimes) as varchar(10))) t on t.HighestMark = s.mark and
t.[month year] = cast(Year(s.datetimes) as varchar(10)) + cast(Month(s.datetimes) as varchar(10))

If for some reason you abhor subqueries, you can actually do this as:
select distinct
first_value(id) over (partition by year(datetimes), month(datetimes) order by mark desc) as id,
max(mark) over (partition by year(datetimes), month(datetimes)) as mark
from StudentMark;
Or:
select top (1) with ties id, mark
from StudentMark
order by row_number() over (partition by year(datetimes), month(datetimes) order by mark desc);
In this case, you can get all students in the event of ties by using rank() or dense_rank() instead of row_number().
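To make the ROW_NUMBER versus RANK difference concrete, here is a minimal sketch that runs the same idea against SQLite through Python's sqlite3 module. The choice of SQLite is an assumption made only so the example is runnable: strftime('%Y-%m', ...) stands in for YEAR()/MONTH(), window functions need SQLite 3.25 or later, and the row for ID 1009 is a hypothetical addition that creates a tie in March 2018.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE StudentMark (id INTEGER, mark INTEGER, datetimes TEXT)")
conn.executemany(
    "INSERT INTO StudentMark VALUES (?, ?, ?)",
    [
        (1001, 10, "2011-12-20"),
        (1002, 11, "2012-01-10"),
        (1005, 12, "2012-01-10"),
        (1003, 10, "2012-01-10"),
        (1004, 11, "2018-10-10"),
        (1006, 12, "2018-10-19"),
        (1007, 13, "2018-03-12"),
        (1008, 15, "2018-03-13"),
        (1009, 15, "2018-03-20"),  # hypothetical row, not in the question: ties with 1008
    ],
)

# One query template; only the ranking function and its ORDER BY change.
QUERY = """
SELECT id, mark
FROM (
    SELECT id, mark,
           {fn}() OVER (
               PARTITION BY strftime('%Y-%m', datetimes)
               ORDER BY {order}
           ) AS rn
    FROM StudentMark
) AS t
WHERE rn = 1
ORDER BY id
"""

# ROW_NUMBER keeps exactly one row per month; id is added as a tie-breaker
# so the choice between tied students is deterministic.
one_per_month = conn.execute(
    QUERY.format(fn="ROW_NUMBER", order="mark DESC, id")
).fetchall()

# RANK gives every tied top mark the value 1, so all tied students are returned.
all_ties = conn.execute(QUERY.format(fn="RANK", order="mark DESC")).fetchall()

print(one_per_month)
print(all_ties)
```

With ROW_NUMBER only one of the two tied students survives for March 2018, while RANK returns both.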

Related

DB2 Toad SQL - Group by Certain Columns using Max Command

I am having some trouble with the below query. I do understand I need to group by ID and Category, but I only want to group by ID while keeping the rest of the columns based on Rank being max. Is there a way to only group by certain columns?
select ID, Category, max(rank)
from schema.table1
group by ID
Input:
ID Category Rank
111 3 4
111 1 5
123 5 3
124 7 2
Current Output
ID Category Rank
111 3 4
111 9 1
123 5 3
124 7 2
Desired Output
ID Category Rank
111 1 5
123 5 3
124 7 2
You can use:
select *
from table1
where (id, rank) in (select id, max(rank) from table1 group by id)
Result:
ID CATEGORY RANK
---- --------- ----
111 1 5
123 5 3
124 7 2
Or you can use the ROW_NUMBER() window function. For example:
select *
from (
select *,
row_number() over(partition by id order by rank desc) as rn
from table1
) x
where rn = 1
See running example at db<>fiddle.
You can try using row_number():
select * from
(
select ID, Category,rank, row_number() over(partition by id order by rank desc) as rn
from schema.table1
)A where rn=1
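As a quick cross-check, the sketch below runs the tuple IN approach and the ROW_NUMBER approach side by side on the question's input rows. It uses SQLite through Python's sqlite3 module rather than DB2, which is an assumption made only to keep the example runnable; the column name rank is quoted to avoid any clash with the window function of the same name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE table1 (id INTEGER, category INTEGER, "rank" INTEGER)')
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?)",
    [(111, 3, 4), (111, 1, 5), (123, 5, 3), (124, 7, 2)],
)

# Approach 1: keep rows whose (id, rank) pair equals the per-id maximum.
via_in = conn.execute(
    """
    SELECT id, category, "rank"
    FROM table1
    WHERE (id, "rank") IN (SELECT id, MAX("rank") FROM table1 GROUP BY id)
    ORDER BY id
    """
).fetchall()

# Approach 2: number the rows of each id by descending rank and keep the first.
via_row_number = conn.execute(
    """
    SELECT id, category, "rank"
    FROM (
        SELECT id, category, "rank",
               ROW_NUMBER() OVER (PARTITION BY id ORDER BY "rank" DESC) AS rn
        FROM table1
    ) AS x
    WHERE rn = 1
    ORDER BY id
    """
).fetchall()

print(via_in)
print(via_row_number)
```

The two agree on this data. They differ when several rows of one ID share the maximum rank: the IN version returns all of them, while ROW_NUMBER returns exactly one.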

SQL getting top 2 rows by date per PolicyId but with distinct dates

ValId | PolicyId | Date | Value
------+----------+------------+-------
1 | 11 | 2020-06-01 | 2000
2 | 11 | 2020-06-03 | 3000
3 | 11 | 2020-06-03 | 4000
4 | 12 | 2020-06-02 | 8000
5 | 12 | 2020-06-03 | 8500
I wanted to get top 2 latest Val rows for each PolicyId but they cannot be from the same date.
Rows for PolicyId = 12 are returned correctly - ValId 4 and 5.
For PolicyId = 11, rows with ValId 2 and 3 are returned but as they are on the same date I wanted row of ValId 1 to be returned instead of ValId 2.
SELECT
V.ValId, V.PolicyId, V.Value, V.Date
FROM
(SELECT
ValId, PolicyId, Value, Date,
ROW_NUMBER() OVER (PARTITION BY PolicyId ORDER BY Date Desc, ValId DESC) AS RowNum
FROM
TVal) V
WHERE
RowNum <= 2
You can enumerate the rows by dates and within dates:
select t.*
from (select t.*,
dense_rank() over (partition by policyid order by date desc) as seqnum,
rank() over (partition by policyid, date order by valId desc) as seqnum_within_date
from tval t
) t
where seqnum <= 2 and seqnum_within_date = 1;
Using the suggestion from Gordon Linoff, I was able to complete the SQL as below:
Select v.* from
(
select t.*,
row_number() over (partition by policyid order by date desc, valId desc) as seqnum
from (select t.*,
dense_rank() over (partition by policyid, date order by valId desc) as seqnum_within_date
from tval t
) t where seqnum_within_date = 1
)v where seqnum <= 2
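The "latest two rows per policy, at most one per date" idea can be checked with a small runnable sketch. It uses SQLite through Python's sqlite3 module, which is an assumption made for runnability (the thread does not name SQLite); the table and column names follow the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TVal (ValId INTEGER, PolicyId INTEGER, Date TEXT, Value INTEGER)")
conn.executemany(
    "INSERT INTO TVal VALUES (?, ?, ?, ?)",
    [
        (1, 11, "2020-06-01", 2000),
        (2, 11, "2020-06-03", 3000),
        (3, 11, "2020-06-03", 4000),
        (4, 12, "2020-06-02", 8000),
        (5, 12, "2020-06-03", 8500),
    ],
)

latest_two = conn.execute(
    """
    SELECT ValId, PolicyId, Date, Value
    FROM (
        SELECT ValId, PolicyId, Date, Value,
               -- rows sharing a date get the same number, so this counts distinct dates
               DENSE_RANK() OVER (PARTITION BY PolicyId ORDER BY Date DESC) AS seqnum,
               -- within one date, the highest ValId gets 1
               ROW_NUMBER() OVER (PARTITION BY PolicyId, Date ORDER BY ValId DESC)
                   AS seqnum_within_date
        FROM TVal
    ) AS t
    WHERE seqnum <= 2 AND seqnum_within_date = 1
    ORDER BY PolicyId, Date DESC
    """
).fetchall()

for row in latest_two:
    print(row)
```

For PolicyId 11 this keeps ValId 3 and ValId 1 and skips ValId 2, which shares a date with ValId 3.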

Getting the third moving date

I need to get the third moving date based on address for each person. Either status change or address change will make DT change. For example, for PERSON_ID:1, the third moving date should be 07/16/2016. Thanks!
The data is as below:
PERSON_ID STATUS DT ADDRESS
1 12 5/6/2016 3
1 6 5/8/2016 3
1 7 6/5/2016 3
1 1 6/13/2016 3
1 12 6/20/2016 1
1 17 7/8/2016 1
1 1 7/11/2016 1
1 12 7/16/2016 2
1 3 12/6/2016 2
2 5 3/11/2016 5
2 1 5/15/2016 4
2 6 7/18/2016 6
2 12 7/21/2016 6
Using row_number() and group by for the min(dt) per address:
Note: This will not work correctly if the person moves between the same addresses.
select
Person_id
, dt = convert(char(10),dt,120)
, Address
from (
select
person_id
, dt = min(dt)
, address
, rn = row_number() over (partition by person_id order by min(dt))
from t
group by person_id, address
) s
where rn = 3
rextester demo: http://rextester.com/VLTUU16478
returns:
+-----------+------------+---------+
| Person_id | dt | Address |
+-----------+------------+---------+
| 1 | 2016-07-16 | 2 |
| 2 | 2016-07-18 | 6 |
+-----------+------------+---------+
To solve this correctly for a person moving between the same addresses, you have to address the gaps and islands problem.
Adding an additional subquery to the above solution so we can identify and group by the islands:
select
Person_id
, dt = convert(char(10),dt,120)
, Address
from (
select
person_id
, dt = min(dt)
, address
, rn = row_number() over (partition by person_id order by min(dt))
from (
select
person_id
, address
, dt
, island = row_number() over (partition by person_id order by dt)
- row_number() over (partition by person_id, address order by dt)
from t
) s
group by person_id, address, island
) s
where rn = 3
rextester demo: http://rextester.com/PPIH49666
returns:
+-----------+------------+---------+
| Person_id | dt | Address |
+-----------+------------+---------+
| 1 | 2016-07-16 | 3 |
| 2 | 2016-07-18 | 5 |
+-----------+------------+---------+
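The island calculation is easier to follow when it is split into steps. The sketch below does that with CTEs in SQLite through Python's sqlite3 module. Several things here are assumptions for illustration: the SQLite dialect, the dates rewritten in ISO format so that text ordering matches date ordering, the CTE names (marked, stays, numbered), and the three rows for person 3, a hypothetical person who moves back to an earlier address.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (person_id INTEGER, status INTEGER, dt TEXT, address INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?)",
    [
        (1, 12, "2016-05-06", 3), (1, 6, "2016-05-08", 3), (1, 7, "2016-06-05", 3),
        (1, 1, "2016-06-13", 3), (1, 12, "2016-06-20", 1), (1, 17, "2016-07-08", 1),
        (1, 1, "2016-07-11", 1), (1, 12, "2016-07-16", 2), (1, 3, "2016-12-06", 2),
        (2, 5, "2016-03-11", 5), (2, 1, "2016-05-15", 4), (2, 6, "2016-07-18", 6),
        (2, 12, "2016-07-21", 6),
        # Hypothetical person, not in the question: moves 7 -> 8 -> back to 7.
        (3, 1, "2016-01-01", 7), (3, 1, "2016-02-01", 8), (3, 1, "2016-03-01", 7),
    ],
)

third_moves = conn.execute(
    """
    WITH marked AS (
        -- The difference of the two row numbers is constant within one
        -- uninterrupted stay at an address and changes when the person
        -- leaves and later returns.
        SELECT person_id, address, dt,
               ROW_NUMBER() OVER (PARTITION BY person_id ORDER BY dt)
             - ROW_NUMBER() OVER (PARTITION BY person_id, address ORDER BY dt) AS island
        FROM t
    ),
    stays AS (
        -- One row per stay, dated by its first record.
        SELECT person_id, address, MIN(dt) AS moved_on
        FROM marked
        GROUP BY person_id, address, island
    ),
    numbered AS (
        SELECT person_id, address, moved_on,
               ROW_NUMBER() OVER (PARTITION BY person_id ORDER BY moved_on) AS rn
        FROM stays
    )
    SELECT person_id, moved_on, address
    FROM numbered
    WHERE rn = 3
    ORDER BY person_id
    """
).fetchall()

for row in third_moves:
    print(row)
```

For the hypothetical person 3, the return to address 7 counts as a third stay. Grouping by person_id and address alone would merge both stays at address 7 and find no third row.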
I'm not sure I understood what you need, but I think you're trying to do something like:
SELECT * FROM
(
SELECT person_id,address,[status],DT, rank() over (partition by person_id order by address,dt)-1 as movement
FROM #t
WHERE person_id=1
) t
WHERE t.movement=3
The question is not entirely clear, but this may be what you want:
select PERSON_ID, STATUS, DT, ADDRESS
from ( select PERSON_ID, STATUS, DT, ADDRESS
, row_number() over (partition by PERSON_ID order by dt) as rn
from ( select PERSON_ID, STATUS, DT, ADDRESS
, row_number() over ( partition by PERSON_ID, address order by dt) as rn
from table
) tt
where tt.rn = 1
) rr
where rr.rn = 3

select top N records for each entity

I have a table like below -
ID | Reported Date | Device_ID
-------------------------------------------
1 | 2016-03-09 09:08:32.827 | 1
2 | 2016-03-08 09:08:32.827 | 1
3 | 2016-03-08 09:08:32.827 | 1
4 | 2016-03-10 09:08:32.827 | 2
5 | 2016-03-05 09:08:32.827 | 2
Now I want the top 1 row by date for each Device_ID.
Expected Output
ID | Reported Date | Device_ID
-------------------------------------------
1 | 2016-03-09 09:08:32.827 | 1
4 | 2016-03-10 09:08:32.827 | 2
I am using SQL Server 2008 R2. I could write a stored procedure to handle it, but I would like to do it with a simple query.
****************EDIT**************************
The answer by 'Felix Pamittan' worked well. For the top 'N' rows, just change it to
SELECT
Id, [Reported Date], Device_ID
FROM (
SELECT *,
Rn = ROW_NUMBER() OVER(PARTITION BY Device_ID ORDER BY [Reported Date] DESC)
FROM tbl
)t
WHERE Rn <= N
He mentioned this in a comment; I am adding it to the question so that nobody misses it.
Use ROW_NUMBER:
SELECT
Id, [Reported Date], Device_ID
FROM (
SELECT *,
Rn = ROW_NUMBER() OVER(PARTITION BY Device_ID ORDER BY [Reported Date] DESC)
FROM tbl
)t
WHERE Rn = 1
You can also try using a CTE:
With DeviceCTE AS
(SELECT *, ROW_NUMBER() OVER(PARTITION BY Device_ID ORDER BY [Reported Date] DESC) AS Num
FROM tblname)
SELECT Id, [Reported Date], Device_ID
From DeviceCTE
Where Num = 1
If you can't use an analytic function, e.g. because your application layer won't allow it, then you can try the following solution which uses a subquery to arrive at the answer:
SELECT t1.ID, t2.maxDate, t1.Device_ID
FROM yourTable t1
INNER JOIN
(
SELECT Device_ID, MAX([Reported Date]) AS maxDate
FROM yourTable
GROUP BY Device_ID
) t2
ON t1.Device_ID = t2.Device_ID AND t1.[Reported Date] = t2.maxDate
Select * from DEVICE_TABLE D
where [Reported Date] = (Select Max([Reported Date]) from DEVICE_TABLE where Device_ID = D.Device_ID)
That should do the trick, assuming "top 1 row based on date column" means you want the row with the latest reported date for each Device_ID.
As for your title, to select the top 5 rows of each Device_ID:
Select * from DEVICE_TABLE D
where [Reported Date] in (Select top 5 [Reported Date] from DEVICE_TABLE I where I.Device_ID = D.Device_ID order by [Reported Date] desc)
order by Device_ID, [Reported Date] desc
This will give you the 5 latest reports for each device ID.
The inner ORDER BY matters: without it, TOP 5 returns arbitrary dates rather than the latest ones.
Again with no analytic functions, you can use CROSS APPLY:
DECLARE @tbl TABLE(Id INT,[Reported Date] DateTime , Device_ID INT)
INSERT INTO @tbl
VALUES
(1,'2016-03-09 09:08:32.827',1),
(2,'2016-03-08 09:08:32.827',1),
(3,'2016-03-08 09:08:32.827',1),
(4,'2016-03-10 09:08:32.827',2),
(5,'2016-03-05 09:08:32.827',2)
SELECT r.*
FROM ( SELECT DISTINCT Device_ID FROM @tbl ) d
CROSS APPLY ( SELECT TOP 1 *
FROM @tbl t
WHERE d.Device_ID = t.Device_ID
ORDER BY t.[Reported Date] DESC ) r
This can be easily modified to support N records by changing TOP 1 to TOP N.
Credits go to wBob answering this question here
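For the general "top N per device" case, here is a minimal runnable sketch with N passed as a parameter. It uses SQLite through Python's sqlite3 module instead of SQL Server, which is an assumption made only so the example can be executed; the helper name top_n_per_device is hypothetical, and Id is added to the ORDER BY as a tie-breaker because rows 2 and 3 share a timestamp.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE tbl (Id INTEGER, "Reported Date" TEXT, Device_ID INTEGER)')
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?, ?)",
    [
        (1, "2016-03-09 09:08:32.827", 1),
        (2, "2016-03-08 09:08:32.827", 1),
        (3, "2016-03-08 09:08:32.827", 1),
        (4, "2016-03-10 09:08:32.827", 2),
        (5, "2016-03-05 09:08:32.827", 2),
    ],
)


def top_n_per_device(n):
    """Return (Id, Device_ID) for the n most recent rows of each device."""
    return conn.execute(
        """
        SELECT Id, Device_ID
        FROM (
            SELECT Id, Device_ID,
                   ROW_NUMBER() OVER (
                       PARTITION BY Device_ID
                       ORDER BY "Reported Date" DESC, Id
                   ) AS rn
            FROM tbl
        ) AS t
        WHERE rn <= ?
        ORDER BY Device_ID, rn
        """,
        (n,),
    ).fetchall()


print(top_n_per_device(1))
print(top_n_per_device(2))
```

The filter is rn <= N, so N = 1 reproduces the expected output from the question and larger values simply keep more rows per device.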

Oracle select query to filter rows

Say I have a table with the following values
id (PK) a_num a_code effect_dt expire_dt
32 1234 abcd 01/01/2015 05/30/2015
9 1234 abcd 06/01/2015 12/31/2015
5 1234 efgh 01/01/2015 05/30/2015
14 1234 efgh 06/01/2015 12/31/2015
How can I select just one record for each a_num, a_code pair, i.e. either rows 1 and 3 or rows 2 and 4? There may be scenarios where there are more than 2 records for an a_num, a_code pair.
UPDATE - ID will not necessarily always be in order, it is just a primary key.
This will give you the row with the lowest id in each pair (rows 1 and 3 when the ids are in ascending order):
Select * from (
Select t.* , Row_number() Over(Partition by a_num, a_code order by id) r_num from Your_Table t ) result
Where r_num = 1
Just use DESC in the ORDER BY and you will get the row with the highest id in each pair (rows 2 and 4 when the ids are in ascending order):
Select * from (
Select t.* , Row_number() Over(Partition by a_num, a_code order by id desc) r_num from Your_Table t ) result
Where r_num = 1
One way would be to use the row_number window function:
SELECT id, a_num, a_code, effect_dt, expire_dt
FROM (SELECT id, a_num, a_code, effect_dt, expire_dt,
ROW_NUMBER() OVER (PARTITION BY a_num, a_code
ORDER BY 1) AS rn
FROM mytable) t
WHERE rn = 1
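Since the update says the id values are not in any particular order, ordering the window by a date column may be the safer way to decide which row of each pair is kept. The sketch below orders by effect_dt. It is an illustration under assumptions: it runs on SQLite through Python's sqlite3 module rather than Oracle, the dates are rewritten in ISO format so that text ordering matches date ordering, and the helper name one_per_pair is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable (id INTEGER PRIMARY KEY, a_num INTEGER, a_code TEXT,"
    " effect_dt TEXT, expire_dt TEXT)"
)
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?, ?, ?)",
    [
        (32, 1234, "abcd", "2015-01-01", "2015-05-30"),
        (9, 1234, "abcd", "2015-06-01", "2015-12-31"),
        (5, 1234, "efgh", "2015-01-01", "2015-05-30"),
        (14, 1234, "efgh", "2015-06-01", "2015-12-31"),
    ],
)

QUERY = """
SELECT id, a_code
FROM (
    SELECT id, a_code,
           ROW_NUMBER() OVER (
               PARTITION BY a_num, a_code
               ORDER BY effect_dt {direction}
           ) AS rn
    FROM mytable
) AS t
WHERE rn = 1
ORDER BY a_code
"""


def one_per_pair(direction):
    """direction is 'ASC' for the earliest row of each pair, 'DESC' for the latest."""
    if direction not in ("ASC", "DESC"):
        raise ValueError("direction must be 'ASC' or 'DESC'")
    return conn.execute(QUERY.format(direction=direction)).fetchall()


earliest = one_per_pair("ASC")
latest = one_per_pair("DESC")
print(earliest)
print(latest)
```

Ordering by effect_dt picks rows 1 and 3 (ascending) or rows 2 and 4 (descending) regardless of how the primary key values happen to be assigned.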