Select top 1 row with aggregate function - sql

I have data in table like this:
UserData table:
|ID| Name | TeamName |
| 1| Peter | Alpha |
| 1| Peter | Beta |
| 1| Peter | Gamma |
| 2| Mary | Gamma |
| 2| Mary | Omega |
| 3| John | Kappa |
| 3| John | Delta |
Combinations of Name and TeamName are always unique. For each unique ID and Name, I need to get the top 1 TeamName and the number of team relations, like this:
table #FinalTable
|ID| Name | TeamName | NumberOfRelations |
| 1| Peter | Alpha | 3 |
| 2| Mary | Gamma | 2 |
| 3| John | Kappa | 2 |
Question: is there a way of doing this in one query, or do I have to use temporary tables to select the top 1 team and to count the number of relations, and then select the data into a separate final table?
I tried something like this:
;WITH cte AS
(
SELECT *, ROW_NUMBER() OVER (PARTITION BY ID ORDER BY TeamName Asc) AS rn
FROM UserData
)
SELECT * into #tempTable1
FROM cte
WHERE rn = 1
and this:
insert into #tempTable2 (ID, Name, NumberOfRelations)
select ID, Name, count(*) as NumberOfRelations
from UserData
group by ID, Name
...and then selecting data from two temp tables.
I wonder if there's a simpler way of doing it.

For SQL Server:
You don't specify an ORDER BY, so I chose one below (ordering by team name)...
select top 1 with ties id,playen,count(id) over (partition by id,playen) as countt
,temaname
from #temp t1
order by row_number() over (partition by id,playen order by id,playen,temaname)
Output:
id playen countt temaname
1 Peter 3 Alpha
2 Mary 2 Gamma
3 John 2 Delta

Assuming this is SQL Server, try this:
Select t.ID, t.Name, team.TeamName, count(t.TeamName) countt
from #temp t join
(Select id, TeamName, Row_Number() over (Partition By ID Order By TeamName asc) as rn
from #temp) team on (team.ID = t.ID and team.rn=1)
Group by t.ID, t.Name, team.TeamName

SQL tables represent unordered sets. There is no first team name, unless a column specifies the ordering. You don't seem to have such a column.
If you had such a column:
WITH cte AS (
SELECT ud.*,
ROW_NUMBER() OVER (PARTITION BY ID ORDER BY ??) as seqnum,
COUNT(*) OVER (PARTITION BY ID) as cnt
FROM UserData ud
)
SELECT cte.*
FROM cte
WHERE seqnum = 1;
Note the ??. This is to specify the ordering for getting the team name. Depending on the database, you can use NULL or (SELECT NULL) to get an arbitrary team name.
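As a minimal sketch of how this looks end to end, the script below fills in the ?? with a hypothetical RowSeq column. That column is an assumption and is not in the original table; it simply records the order in which the question lists the teams. The rest is the sample data from the question, written in standard SQL that should also run on SQL Server.

```sql
-- Sample data from the question, plus a hypothetical RowSeq column
-- (an assumption, not in the original table) that defines which team is "first".
CREATE TABLE UserData (
    ID       INT         NOT NULL,
    Name     VARCHAR(50) NOT NULL,
    TeamName VARCHAR(50) NOT NULL,
    RowSeq   INT         NOT NULL
);

INSERT INTO UserData (ID, Name, TeamName, RowSeq) VALUES
    (1, 'Peter', 'Alpha', 1),
    (1, 'Peter', 'Beta',  2),
    (1, 'Peter', 'Gamma', 3),
    (2, 'Mary',  'Gamma', 4),
    (2, 'Mary',  'Omega', 5),
    (3, 'John',  'Kappa', 6),
    (3, 'John',  'Delta', 7);

-- One pass over the table: number the rows per ID and count them per ID,
-- then keep only the first row of each ID.
WITH cte AS (
    SELECT ud.ID, ud.Name, ud.TeamName,
           ROW_NUMBER() OVER (PARTITION BY ud.ID ORDER BY ud.RowSeq) AS seqnum,
           COUNT(*)     OVER (PARTITION BY ud.ID)                    AS NumberOfRelations
    FROM UserData ud
)
SELECT ID, Name, TeamName, NumberOfRelations
FROM cte
WHERE seqnum = 1
ORDER BY ID;
```

With this data the query returns (1, Peter, Alpha, 3), (2, Mary, Gamma, 2) and (3, John, Kappa, 2), which matches the result the question asks for. Ordering by TeamName instead of RowSeq would return Delta for John.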

Related

find each student's highest score

I have this table, which contains student IDs and their scores:
|student id |score|
| aac | 3 |
| aaa | 6 |
| aac | 5 |
| aaa | 7 |
| aad | 3 |
I want to find the highest score for each student. How do I do it?
I tried going through every student ID on the list but it is not efficient.
For the exact table you gave, a simple group by query should work:
SELECT student_id, MAX(score) AS max_score
FROM yourTable
GROUP BY student_id;
You can also use the window function ROW_NUMBER:
select
student_id,
score
from
(
select
*,
row_number() over (partition by student_id order by score desc) as rn
from yourTable
) subq
where rn = 1

Select the highest value of column 2 per column 1

Given the following table P_PROV
+----+-----------+-----------+
| id | date | person_id |
+----+-----------+-----------+
| 1 |19/06/2019 | 1 |
| 2 |18/07/2010 | 2 |
| 3 |19/06/2020 | 1 |
| 4 |17/06/2020 | 2 |
| 5 |28/06/2020 | 3 |
+----+-----------+-----------+
I want this output
+----+-----------+-----------+
| id | date | person_id |
+----+-----------+-----------+
| 3 |19/06/2020 | 1 |
| 4 |17/06/2020 | 2 |
| 5 |28/06/2020 | 3 |
+----+-----------+-----------+
In other words, I want to return the row with the maximum date for each person. I tried something like this:
SELECT DISTINCT pp.date, pp.id FROM P_PROV pp
WHERE (SELECT MAX(aa.date)
FROM P_PROV aa) = pp.date;
This returns only one row (of course, because MAX returns just the overall maximum date), but I really don't know how to approach this issue. Any kind of help would be appreciated.
ROW_NUMBER provides one way to handle this:
SELECT id, date, person_id
FROM
(
SELECT t.*, ROW_NUMBER() OVER (PARTITION BY person_id ORDER BY date DESC) rn
FROM yourTable t
) t
WHERE rn = 1;
Oracle has a fun way to do this using aggregation:
select max(id) keep (dense_rank first order by date desc) as id,
max(date) as date, person_id
from P_PROV
group by person_id;
If your ids always increase along with the date for each person, this probably also does what you want:
select max(id) as id, max(date) as date, person_id
from P_PROV
group by person_id;
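To illustrate why that shortcut depends on the ids and dates increasing together, here is a small sketch. The table name P_PROV_DEMO, the column name prov_date (used to avoid the reserved word date) and the extra row with id 6 are all assumptions added for the demonstration; the other rows are the sample data from the question.

```sql
-- Hypothetical copy of the sample data with one extra row (id 6)
-- whose date is older than an earlier id for the same person.
CREATE TABLE P_PROV_DEMO (
    id        INT  NOT NULL,
    prov_date DATE NOT NULL,
    person_id INT  NOT NULL
);

INSERT INTO P_PROV_DEMO (id, prov_date, person_id) VALUES
    (1, '2019-06-19', 1),
    (2, '2010-07-18', 2),
    (3, '2020-06-19', 1),
    (4, '2020-06-17', 2),
    (5, '2020-06-28', 3),
    (6, '2018-01-01', 1);  -- assumed row: highest id, but not the latest date

-- Independent aggregates: for person 1 this pairs id 6 with 2020-06-19,
-- a combination that is not a row of the table.
SELECT MAX(id) AS id, MAX(prov_date) AS prov_date, person_id
FROM P_PROV_DEMO
GROUP BY person_id
ORDER BY person_id;

-- ROW_NUMBER keeps whole rows together: person 1 gets id 3 with 2020-06-19.
SELECT id, prov_date, person_id
FROM (
    SELECT t.*,
           ROW_NUMBER() OVER (PARTITION BY person_id ORDER BY prov_date DESC) AS rn
    FROM P_PROV_DEMO t
) ranked
WHERE rn = 1
ORDER BY person_id;
```

The two queries agree for persons 2 and 3 and differ only for person 1, where the ids and dates are no longer in the same order.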

Getting distinct values of individual columns

I have a table A as below
id| Name|Subject
--|-----|-------
1 |Mano |Science
2 |Pavi |Maths
3 |Mano |Social
1 |Kalai|Maths
4 |Kalai|Science
I want distinct values for each column.
So my output should look like this:
id|Name | Subject
--|-----|--------
1 |Mano |Science
2 |Pavi |Maths
3 |Kalai|Social
4 | |
I have tried using cursors, but I didn't get what I needed.
Can anybody help me get this?
You seem to just want a list of the distinct values, without regard to what appears together. This isn't very SQL'ish, but it can be done:
select row_number() over (order by n.seqnum) as firstcol, n.name, s.subject
from (select name, row_number() over (order by name) as seqnum
from t
group by name
) n full outer join
(select subject, row_number() over (order by subject) as seqnum
from t
group by subject
) s
on s.seqnum = n.seqnum;
Another option, using UNPIVOT and PIVOT (this appears to be Oracle syntax):
select *
from (select col,val,dense_rank () over (partition by col order by val) as dr
from mytable unpivot (val for col in (name,subject)) u
) pivot (min(val) for col in ('NAME','SUBJECT'))
order by dr
+----+-------+---------+
| DR | NAME | SUBJECT |
+----+-------+---------+
| 1 | Kalai | Maths |
| 2 | Mano | Science |
| 3 | Pavi | Social |
+----+-------+---------+

SQL subquery to return rank 2

I have a question about writing a subquery in Microsoft T-SQL. From the original table I need to return the name of the person with the second most pets. I am able to write a query that returns the number of pets per person, but I'm not sure how to write a subquery to return rank #2.
Original table:
+-------+--------+
| Name  | Pet    |
+-------+--------+
| Kathy | dog    |
| Kathy | cat    |
| Nick  | gerbil |
| Bob   | turtle |
| Bob   | cat    |
| Bob   | snake  |
+-------+--------+
I have the following query:
SELECT Name, COUNT(Pet) AS NumPets
FROM PetTable
GROUP BY Name
ORDER BY NumPets DESC
Which returns:
+-------+---------+
| Name  | NumPets |
+-------+---------+
| Bob   | 3       |
| Kathy | 2       |
| Nick  | 1       |
+-------+---------+
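You are using T-SQL, so:
WITH C AS (
-- Count pets per person and rank the counts from highest to lowest.
-- Filtering on cnt = 2 would only match people with exactly two pets,
-- so filter on the rank of the count instead.
SELECT Name
,COUNT(Pet) AS cnt
,DENSE_RANK() OVER (ORDER BY COUNT(Pet) DESC) AS rnk
FROM PetTable
GROUP BY Name
)
SELECT TOP 1 Name, cnt AS NumPets
FROM C
WHERE rnk = 2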
The ANSI standard method is:
OFFSET 1 ROWS FETCH FIRST 1 ROW ONLY
However, most databases have their own syntax for this, using limit, top or rownum. You don't specify the database, so I'm sticking with the standard.
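For context, here is a minimal sketch of that clause attached to the counting query from the question, written with the ROWS keywords that SQL Server (2012 and later) expects. The CREATE TABLE and INSERT statements just rebuild the sample data, and the extra Name sort key is my own addition so that ties are broken deterministically.

```sql
-- Sample data from the question.
CREATE TABLE PetTable (
    Name VARCHAR(50) NOT NULL,
    Pet  VARCHAR(50) NOT NULL
);

INSERT INTO PetTable (Name, Pet) VALUES
    ('Kathy', 'dog'),
    ('Kathy', 'cat'),
    ('Nick',  'gerbil'),
    ('Bob',   'turtle'),
    ('Bob',   'cat'),
    ('Bob',   'snake');

-- Sort people by pet count, skip the first row, return the next one.
SELECT Name, COUNT(Pet) AS NumPets
FROM PetTable
GROUP BY Name
ORDER BY NumPets DESC, Name   -- Name is an assumed tie-breaker
OFFSET 1 ROWS FETCH NEXT 1 ROWS ONLY;
```

With the sample data this returns Kathy with 2 pets. Note that this always returns exactly one row (or none), so it does not report ties for second place.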
This is how you could use ROW_NUMBER to get the result.
SELECT *
FROM(
SELECT ROW_NUMBER() OVER (ORDER BY COUNT(name) DESC) as RN, Name, COUNT(NAME) AS COUNT
FROM PetTable
GROUP BY Name
) T
WHERE T.RN = 2
In MSSQL you can do this:
SELECT PetCounts.Name, PetCounts.NumPets FROM (
SELECT
RANK() OVER (ORDER BY COUNT(Pet) DESC) AS rank,
Name, COUNT(Pet)as NumPets
FROM PetTable
GROUP BY Name
) AS PetCounts
WHERE rank = 2
This will return multiple rows if they have the same rank. If you want to return just one row, you can replace RANK() with ROW_NUMBER().
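The choice of ranking function matters most when counts are tied. The sketch below is an illustration under assumed data: the table PetTies and the person Ann (tied with Bob at three pets) are hypothetical and not part of the question. It compares ROW_NUMBER(), RANK() and DENSE_RANK() side by side.

```sql
-- Hypothetical data: the question's pets plus Ann, who ties Bob at 3 pets.
CREATE TABLE PetTies (
    Name VARCHAR(50) NOT NULL,
    Pet  VARCHAR(50) NOT NULL
);

INSERT INTO PetTies (Name, Pet) VALUES
    ('Kathy', 'dog'),    ('Kathy', 'cat'),
    ('Nick',  'gerbil'),
    ('Bob',   'turtle'), ('Bob',   'cat'),  ('Bob', 'snake'),
    ('Ann',   'parrot'), ('Ann',   'fish'), ('Ann', 'rabbit');

-- All three ranking functions over the per-person counts.
-- Counts:       Ann 3, Bob 3, Kathy 2, Nick 1
-- ROW_NUMBER:   Ann 1, Bob 2, Kathy 3, Nick 4
-- RANK:         Ann 1, Bob 1, Kathy 3, Nick 4  (no rank 2 at all)
-- DENSE_RANK:   Ann 1, Bob 1, Kathy 2, Nick 3
SELECT Name,
       COUNT(Pet) AS NumPets,
       ROW_NUMBER() OVER (ORDER BY COUNT(Pet) DESC, Name) AS rn,
       RANK()       OVER (ORDER BY COUNT(Pet) DESC)       AS rnk,
       DENSE_RANK() OVER (ORDER BY COUNT(Pet) DESC)       AS dense_rnk
FROM PetTies
GROUP BY Name
ORDER BY rn;

-- "Second most pets" in the sense of the second-highest count.
SELECT Name, NumPets
FROM (
    SELECT Name,
           COUNT(Pet) AS NumPets,
           DENSE_RANK() OVER (ORDER BY COUNT(Pet) DESC) AS dense_rnk
    FROM PetTies
    GROUP BY Name
) AS PetCounts
WHERE dense_rnk = 2;
```

With this data, filtering on RANK() = 2 returns no rows, ROW_NUMBER() = 2 returns Bob (who is tied for first), and DENSE_RANK() = 2 returns Kathy. Which of these is right depends on what "second most" should mean when there are ties.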

Select entire partition where max row in partition is greater than 1

I'm partitioning by some non-unique identifier, but I'm only interested in the partitions with at least two results. How can I filter out all the instances where there's exactly one row for the specified identifier?
Query I'm using:
SELECT ROW_NUMBER() OVER
(PARTITION BY nonUniqueId ORDER BY nonUniqueId, aTimeStamp) as row
,nonUniqueId
,aTimeStamp
FROM myTable
What I'm getting:
row | nonUniqueId | aTimeStamp
---------------------------------
1 | 1234 | 2014-10-08...
2 | 1234 | 2014-10-09...
1 | 1235 | 2014-10-08...
1 | 1236 | 2014-10-08...
2 | 1236 | 2014-10-09...
What I want:
row | nonUniqueId | aTimeStamp
---------------------------------
1 | 1234 | 2014-10-08...
2 | 1234 | 2014-10-09...
1 | 1236 | 2014-10-08...
2 | 1236 | 2014-10-09...
Thanks for any direction :)
Based on syntax, I'm assuming this is SQL Server 2005 or higher. My answer will be meant for that.
You have a couple of options.
One, use a CTE:
;WITH CTE AS (
SELECT ROW_NUMBER() OVER
(PARTITION BY nonUniqueId ORDER BY nonUniqueId, aTimeStamp) as row
,nonUniqueId
,aTimeStamp
FROM myTable
)
SELECT *
FROM CTE t
WHERE EXISTS (SELECT 1 FROM CTE WHERE row = 2 and nonUniqueId = t.nonUniqueId);
Or, you can use subqueries:
SELECT ROW_NUMBER() OVER
(PARTITION BY nonUniqueId ORDER BY nonUniqueId, aTimeStamp) as row
,nonUniqueId
,aTimeStamp
FROM myTable t
WHERE EXISTS (SELECT 1 FROM myTable
WHERE nonUniqueId = t.nonUniqueId GROUP BY nonUniqueId HAVING COUNT(*) >= 2);
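A third possibility, not one of the two options above, is to compute the partition size with a windowed COUNT(*) next to the row number and filter on it. This is a minimal sketch under a few assumptions: the timestamps are simplified to DATE values, and the aliases rn, partitionSize and numbered are names I introduced (rn replaces row, which is a reserved word in some databases).

```sql
-- Sample data modelled on the question, with dates standing in for timestamps.
CREATE TABLE myTable (
    nonUniqueId INT  NOT NULL,
    aTimeStamp  DATE NOT NULL
);

INSERT INTO myTable (nonUniqueId, aTimeStamp) VALUES
    (1234, '2014-10-08'),
    (1234, '2014-10-09'),
    (1235, '2014-10-08'),
    (1236, '2014-10-08'),
    (1236, '2014-10-09');

-- Number the rows in each partition and attach the partition size to every row,
-- then keep only partitions that contain at least two rows.
SELECT rn, nonUniqueId, aTimeStamp
FROM (
    SELECT ROW_NUMBER() OVER (PARTITION BY nonUniqueId ORDER BY aTimeStamp) AS rn,
           COUNT(*)     OVER (PARTITION BY nonUniqueId)                     AS partitionSize,
           nonUniqueId,
           aTimeStamp
    FROM myTable
) AS numbered
WHERE partitionSize >= 2
ORDER BY nonUniqueId, rn;
```

With the sample data this returns the two rows for 1234 and the two rows for 1236, and drops the single row for 1235. The table is read once, without a correlated subquery.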