Group by two columns, take sum, then max - sql

There are three columns: Id (char), Name (char), and Score (int).
First, we group by Id and Name and add Score for each group. Let us call the added score total_score.
Then, we group by Name and take only the maximum of total_score and its corresponding Id and Name. I've got everything else, but I'm having a hard time figuring out how to get the Id. The error I get is:
Column 'Id' is invalid in the select list because
it is not contained in either an aggregate function or the GROUP BY
clause.
WITH Tmp AS
(SELECT Id,
Name,
SUM(Score) AS total_score
FROM Mytable
GROUP BY Id,
Name)
SELECT Name, -- Id,
MAX(total_score) AS max_score
FROM Tmp
GROUP BY Name
ORDER BY max_score DESC

Just add row_number() partitioned by Name to your query, ordered by total_score descending, and keep the first row of each partition:
select *
from
(
-- your existing `total_score` query
SELECT Id, Name,
SUM(Score) AS total_score,
r = row_number() over (partition by Name order by SUM(Score) desc)
FROM Mytable
GROUP BY Id, Name
) d
where r = 1
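For anyone who wants to try the row_number() idea without a SQL Server instance, here is a minimal sketch using Python's built-in sqlite3 module (assuming SQLite 3.25 or later for window functions). The sample rows are hypothetical, and the aggregation is done in the question's Tmp CTE first, with `AS r` instead of the T-SQL `r = ...` alias form.

```python
import sqlite3

# Hypothetical sample rows (Id, Name, Score); not taken from the question.
rows = [("a1", "abc", 100), ("a2", "abc", 150), ("a2", "abc", 50),
        ("d1", "def", 300), ("d1", "def", 400), ("p1", "pqr", 500)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Mytable (Id TEXT, Name TEXT, Score INTEGER)")
conn.executemany("INSERT INTO Mytable VALUES (?, ?, ?)", rows)

query = """
WITH Tmp AS (
    -- step 1: total score per (Id, Name)
    SELECT Id, Name, SUM(Score) AS total_score
    FROM Mytable
    GROUP BY Id, Name
)
SELECT Id, Name, total_score AS max_score
FROM (
    -- step 2: number the totals within each Name, highest first
    SELECT Id, Name, total_score,
           ROW_NUMBER() OVER (PARTITION BY Name ORDER BY total_score DESC) AS r
    FROM Tmp
) d
WHERE r = 1
ORDER BY max_score DESC
"""
top_per_name = conn.execute(query).fetchall()
print(top_per_name)
```

Each Name appears once, together with the Id that produced its highest total.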

WITH Tmp AS
(SELECT Id,
Name,
SUM(Score) AS total_score
FROM Mytable
GROUP BY Id,
Name)
SELECT Name, Id,
MAX(total_score) AS max_score
FROM Tmp
GROUP BY Name,id
ORDER BY max_score DESC
Try this. Hope this will help.

WITH Tmp AS
(
SELECT Id,
Name,
SUM(Score) AS total_score
FROM Mytable
GROUP BY Id,
NAME
)
SELECT Name, Id,
MAX(total_score) AS max_score
FROM Tmp
GROUP BY Name,id
ORDER BY max_score DESC
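Note: when a query uses an aggregate function, every non-aggregated column in the select list must appear in the GROUP BY clause.
In your case Id is selected alongside MAX(total_score), so it has to be added to GROUP BY. Be aware that grouping by both Name and Id returns one row per (Name, Id) pair rather than a single row per Name.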
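A quick way to see what grouping by both Name and Id returns is to run the query on a few rows. This is only a sketch in Python's sqlite3 (SQLite standing in for SQL Server), using the sample rows from the "Tested" block in the window-function answer on this page.

```python
import sqlite3

# Sample rows borrowed from the "Tested" data elsewhere in this thread.
rows = [(1, "abc", 100), (2, "abc", 200), (3, "def", 300),
        (3, "def", 400), (4, "pqr", 500)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Mytable (Id INTEGER, Name TEXT, Score INTEGER)")
conn.executemany("INSERT INTO Mytable VALUES (?, ?, ?)", rows)

grouped = conn.execute("""
WITH Tmp AS (
    SELECT Id, Name, SUM(Score) AS total_score
    FROM Mytable
    GROUP BY Id, Name
)
SELECT Name, Id, MAX(total_score) AS max_score
FROM Tmp
GROUP BY Name, Id
ORDER BY max_score DESC
""").fetchall()

# Name 'abc' shows up twice because (abc, 1) and (abc, 2) are separate groups.
for row in grouped:
    print(row)
```

The error goes away, but the result still has both rows for 'abc', so an extra filtering step (for example a ranking function) is needed to keep only the maximum per Name.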

I am not sure about the performance of the query below, but we can use a window function to get the maximum value within each partition.
SELECT
Id,
Name,
SUM(Score) AS total_score,
MAX(SUM(Score)) OVER(Partition by Name) AS max_score
FROM Mytable
GROUP BY Id, Name;
Tested -
declare @Mytable table (id int, name varchar(10), score int);
insert into @Mytable values
(1,'abc', 100),
(2,'abc', 200),
(3,'def', 300),
(3,'def', 400),
(4,'pqr', 500);
Output -
Id  Name  total_score  max_score
1   abc   100          200
2   abc   200          200
3   def   700          700
4   pqr   500          500

You can compute DENSE_RANK() over total_score within each Name and then keep the rows with Rank = 1. This also handles ties: if several Ids under the same Name share the highest total_score, all of them are returned.
WITH Tmp AS
(SELECT Id,
Name,
SUM(Score) AS total_score
FROM Mytable
GROUP BY Id, Name)
SELECT Id,
Name,
total_score AS max_score
FROM (SELECT Id,
Name,
total_score,
DENSE_RANK() OVER (PARTITION BY Name ORDER BY total_score DESC) AS Rank
FROM Tmp) AS Tmp2
WHERE Rank = 1
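To illustrate the tie behaviour, here is a small sketch in Python's sqlite3 (assuming SQLite 3.25+; the rows are hypothetical and the alias is `rnk` rather than `Rank`). Two Ids under 'abc' both total 200, so DENSE_RANK() keeps both, while ROW_NUMBER() would keep only one of them.

```python
import sqlite3

# Hypothetical rows: a1 and a2 both total 200 under Name 'abc'.
rows = [("a1", "abc", 200), ("a2", "abc", 150), ("a2", "abc", 50),
        ("d1", "def", 700)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Mytable (Id TEXT, Name TEXT, Score INTEGER)")
conn.executemany("INSERT INTO Mytable VALUES (?, ?, ?)", rows)

template = """
WITH Tmp AS (
    SELECT Id, Name, SUM(Score) AS total_score
    FROM Mytable
    GROUP BY Id, Name
)
SELECT Id, Name, total_score AS max_score
FROM (
    SELECT Id, Name, total_score,
           {func}() OVER (PARTITION BY Name ORDER BY total_score DESC) AS rnk
    FROM Tmp
) AS Tmp2
WHERE rnk = 1
ORDER BY Name, Id
"""
# Same query, only the ranking function differs.
with_dense_rank = conn.execute(template.format(func="DENSE_RANK")).fetchall()
with_row_number = conn.execute(template.format(func="ROW_NUMBER")).fetchall()

print(with_dense_rank)       # both tied Ids for 'abc' are present
print(len(with_row_number))  # one row per Name
```

Which one you want depends on whether a tie should produce several rows or exactly one.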

You can try this as well:
select id,name,max(total_score) over (partition by name) max_score from (
select id,name,sum(score) as total_score from YOURTABLE
group by id,name
) t

Related

Find first n sums of a column, grouped by two other columns

I have a table in SQL like this. Now I want to find sum of score grouped by column ID & Name, and show just two highest sums for each ID as below, so how can I solve this?
Can you try something like this (partition by ID only, and keep the first two rows of each partition):
Select ID, Name, Score From (
Select ID, Name, SUM(Score) score, row_number() over (partition by ID order by SUM(Score) desc) rn
from Table
group by ID,Name) allScores
where rn <= 2
You can try this (note that it returns the two highest sums overall, not the two highest per ID):
select top(2) Id , name , sum(Score) as sumScore
from table
group by id , name
order by sum(Score) desc
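As a rough check of the "two highest sums for each ID" requirement, here is a sketch in Python's sqlite3 (SQLite 3.25+ assumed). The table name `Scores` and the rows are made up for illustration; the sums are computed in a subquery first and then numbered per ID.

```python
import sqlite3

# Hypothetical rows (ID, Name, Score).
rows = [(1, "x", 10), (1, "x", 5), (1, "y", 20), (1, "z", 1),
        (2, "x", 7), (2, "y", 3)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Scores (ID INTEGER, Name TEXT, Score INTEGER)")
conn.executemany("INSERT INTO Scores VALUES (?, ?, ?)", rows)

top_two = conn.execute("""
SELECT ID, Name, sumScore
FROM (
    SELECT ID, Name, sumScore,
           -- restart the numbering for every ID, highest sum first
           ROW_NUMBER() OVER (PARTITION BY ID ORDER BY sumScore DESC) AS rn
    FROM (
        SELECT ID, Name, SUM(Score) AS sumScore
        FROM Scores
        GROUP BY ID, Name
    ) sums
) ranked
WHERE rn <= 2
ORDER BY ID, rn
""").fetchall()
print(top_two)
```

ID 1 has three names, but only its two largest sums come back; ID 2 keeps both of its rows.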

how to show a column thats removed by a group by in sql

Please see my query below. How can I set it up to return the one subobject that has the max age for each object? If I group by subobject, it shows all of them.
select
object,
max(Age) as Age
from table
group by 1
Try the query below:
;with cte as
(
select Object,SubObject,ROW_NUMBER() over (partition by Object order by age desc) as rowNum
from table
)
select * from cte where rowNum=1
You can use an analytic function (MAX OVER):
select *
from
(
select t.*, max(age) over (partition by object) as max_age
from table t
) with_max_age
where age = max_age
order by object;
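The two answers differ when several subobjects share the max age. A small sketch in Python's sqlite3 (SQLite 3.25+ assumed) shows this; the table name `items` and its rows are hypothetical, and a tie-breaker on subobject is added to the ROW_NUMBER() ordering so the result is deterministic.

```python
import sqlite3

# Hypothetical rows (object, subobject, age); 'pc' has a tie on age.
rows = [("car", "wheel", 3), ("car", "door", 7),
        ("pc", "cpu", 2), ("pc", "gpu", 2)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (object TEXT, subobject TEXT, age INTEGER)")
conn.executemany("INSERT INTO items VALUES (?, ?, ?)", rows)

# ROW_NUMBER: exactly one row per object.
one_per_object = conn.execute("""
WITH cte AS (
    SELECT object, subobject, age,
           ROW_NUMBER() OVER (PARTITION BY object
                              ORDER BY age DESC, subobject) AS rowNum
    FROM items
)
SELECT object, subobject, age FROM cte WHERE rowNum = 1 ORDER BY object
""").fetchall()

# MAX OVER: every row whose age equals the max of its object.
with_ties = conn.execute("""
SELECT object, subobject, age
FROM (
    SELECT t.*, MAX(age) OVER (PARTITION BY object) AS max_age
    FROM items t
) with_max_age
WHERE age = max_age
ORDER BY object, subobject
""").fetchall()

print(one_per_object)
print(with_ties)
```

So ROW_NUMBER() fits "return exactly one subobject", and MAX OVER fits "return all subobjects with the max age".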

How do I create a new SQL table with custom column names and populate these columns

I currently have an SQL statement that returns the most frequently occurring value and the least frequently occurring value in a table. However, the result has 2 rows, each with a name and its frequency. I need a custom result with 2 columns, min and max, and a single row holding one value for each. The values for these columns need to be in the same row.
(SELECT name, COUNT(name) AS frequency
FROM firefighter_certifications
GROUP BY name
ORDER BY frequency DESC limit 1)
UNION
(SELECT name, COUNT(name) AS frequency
FROM firefighter_certifications
GROUP BY name
ORDER BY frequency ASC limit 1);
So for the above query I would need the names of the min and max values in one row. I need to be able to define the name of new columns for the generated SQL query as well.
Min_Name | Max_Name
Certif_1 | Certif_2
I think this query should give you the results you want. It ranks each name according to the number of times it appears in the table, then uses conditional aggregation to select the min and max frequency names in one row:
with cte as (
select name,
row_number() over (order by count(*) desc) as maxr,
row_number() over (order by count(*)) as minr
from firefighter_certifications
group by name
)
select max(case when minr = 1 then name end) as Min_Name,
max(case when maxr = 1 then name end) as Max_Name
from cte
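Here is a small sketch of the conditional-aggregation idea, run through Python's sqlite3 (SQLite 3.25+ assumed) rather than the asker's database. The certification names are made up, the counts are computed in a separate CTE first, and `name` is added as a tie-breaker so equal frequencies resolve the same way every time.

```python
import sqlite3

# Hypothetical data: "EMT" is the most frequent name, "Hazmat" the least.
names = ["EMT", "EMT", "EMT", "Rescue", "Rescue", "Hazmat"]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE firefighter_certifications (name TEXT)")
conn.executemany("INSERT INTO firefighter_certifications VALUES (?)",
                 [(n,) for n in names])

min_max = conn.execute("""
WITH counts AS (
    SELECT name, COUNT(*) AS cnt
    FROM firefighter_certifications
    GROUP BY name
),
ranked AS (
    SELECT name,
           ROW_NUMBER() OVER (ORDER BY cnt DESC, name) AS maxr,
           ROW_NUMBER() OVER (ORDER BY cnt ASC, name) AS minr
    FROM counts
)
-- CASE yields NULL for non-matching rows, and MAX ignores NULLs,
-- so each column picks out exactly one name.
SELECT MAX(CASE WHEN minr = 1 THEN name END) AS Min_Name,
       MAX(CASE WHEN maxr = 1 THEN name END) AS Max_Name
FROM ranked
""").fetchone()
print(min_max)
```

The result is a single row with the two custom column names, Min_Name and Max_Name.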
Postgres doesn't offer "first" and "last" aggregation functions. But there are other, similar methods:
select distinct first_value(name) over (order by cnt desc, name) as name_at_max,
first_value(name) over (order by cnt asc, name) as name_at_min
from (select name, count(*) as cnt
from firefighter_certifications
group by name
) n;
Or without any subquery at all:
select first_value(name) over (order by count(*) desc, name) as name_at_max,
first_value(name) over (order by count(*) asc, name) as name_at_min
from firefighter_certifications
group by name
limit 1;

T-SQL : return latest value of each ID

I have a table that looks like this:
ID, name, Likes, Login_time
select * from mytbl
I want to filter this table to get one row per distinct ID:
ID, name, likes, login_time (last login time)
I tried this query, but it didn't work.
select *
from
(select
name, likes, login_time
rank() over (partition by id order by login_time desc) as rank
from
mytbl) t
where
t.rank = 1
Use ROW_NUMBER() instead of RANK(), so that ties on login_time still yield a single row per id:
select *
from
(
select
id, name, likes, login_time,
ROW_NUMBER() over (partition by id order by login_time desc) as rank
from
mytbl )t
where
t.rank = 1
Try this:
;WITH temps AS
(
SELECT Id, name, likes, login_time, row_number() over(PARTITION BY Id ORDER BY login_time desc) AS RowIndex
FROM mytbl
)
SELECT Id, name, likes, login_time FROM temps
WHERE RowIndex = 1
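A minimal sketch of the "latest row per ID" pattern, using Python's sqlite3 (SQLite 3.25+ assumed) in place of SQL Server. The rows are hypothetical, and login_time is stored as ISO-formatted text, which sorts chronologically.

```python
import sqlite3

# Hypothetical rows (ID, name, likes, login_time).
rows = [(1, "ann", 5, "2020-01-01 10:00"),
        (1, "ann", 8, "2020-03-01 09:00"),
        (2, "bob", 2, "2020-02-01 12:00")]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytbl (ID INTEGER, name TEXT, likes INTEGER, login_time TEXT)")
conn.executemany("INSERT INTO mytbl VALUES (?, ?, ?, ?)", rows)

latest = conn.execute("""
WITH temps AS (
    SELECT ID, name, likes, login_time,
           -- newest login gets RowIndex 1 within each ID
           ROW_NUMBER() OVER (PARTITION BY ID ORDER BY login_time DESC) AS RowIndex
    FROM mytbl
)
SELECT ID, name, likes, login_time
FROM temps
WHERE RowIndex = 1
ORDER BY ID
""").fetchall()
print(latest)
```

Each ID comes back once, with the likes value from its most recent login.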

Grouping while maintaining next record

I have a table (NerdsTable) with some of this data:
-------------+-----------+----------------
id name school
-------------+-----------+----------------
1 Joe ODU
2 Mike VCU
3 Ane ODU
4 Trevor VT
5 Cools VCU
When I run the following query
SELECT id, name, LEAD(id) OVER (ORDER BY id) as next_id
FROM dbo.NerdsTable where school = 'ODU';
I get these results:
[id=1,name=Joe,nextid=3]
[id=3,name=Ane,nextid=NULL]
I want to write a query that does not need the static check for
where school = 'odu'
but gives back the same results as above. In other words, I want to select all results in the database, and have them grouped correctly as if I went through individually and ran queries for:
SELECT id, name, LEAD(id) OVER (ORDER BY id) as next_id FROM dbo.NerdsTable where school = 'ODU';
SELECT id, name, LEAD(id) OVER (ORDER BY id) as next_id FROM dbo.NerdsTable where school = 'VCU';
SELECT id, name, LEAD(id) OVER (ORDER BY id) as next_id FROM dbo.NerdsTable where school = 'VT';
Here is the output I am hoping to see:
[id=1,name=Joe,nextid=3]
[id=3,name=Ane,nextid=NULL]
[id=2,name=Mike,nextid=5]
[id=5,name=Cools,nextid=NULL]
[id=4,name=Trevor,nextid=NULL]
Here is what I have tried, but am failing miserably:
SELECT id, name,
LEAD(id) OVER (ORDER BY id) as next_id
FROM dbo.NerdsTable
ORDER BY school;
-- Problem, as this does not sort by the id. I need the lowest id first for the group
SELECT id, name,
LEAD(id) OVER (ORDER BY id) as next_id
FROM dbo.NerdsTable
ORDER BY id, school;
-- Sorts by id, but the grouping is not correct, thus next_id is wrong
I then looked at the Microsoft documentation for aggregate functions, but I do not see how I can use any of them to group my results correctly. I tried to use GROUPING_ID, as follows:
SELECT id, GROUPING_ID(name),
LEAD(id) OVER (ORDER BY id) as next_id
FROM dbo.NerdsTable
group by school;
But I get an error:
is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause
Any idea as to what I am missing here?
From your desired output it looks like you are just trying to order the records by school. You can do that like this:
SELECT id, name
FROM dbo.NerdsTable
ORDER BY school ASC, id ASC
I don't know what next ID is supposed to mean.
create table schools (id int, name varchar(50), school varchar(3))
insert into schools values (1, 'Joe', 'ODU'), (2, 'Mike', 'VCU'), (3, 'Ane',
'ODU'), (4, 'Trevor', 'VT'), (5, 'Cools', 'VCU'), (6, 'Sarah', 'VCU')
select n.id, n.name, min(g.id) nextid
from schools n
left join
(
select id, school
from schools
) g on g.school = n.school and g.id > n.id
group by n.id, n.name
drop table schools
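The self-join above can be checked against the asker's five rows with a short sketch in Python's sqlite3 (SQLite 3.25+ assumed). For comparison it also runs LEAD() with a PARTITION BY school clause, which is not part of the answers above but restarts the "next id" lookup for every school; treat it as an alternative to try, not as the accepted method.

```python
import sqlite3

# The five rows from the question's NerdsTable.
rows = [(1, "Joe", "ODU"), (2, "Mike", "VCU"), (3, "Ane", "ODU"),
        (4, "Trevor", "VT"), (5, "Cools", "VCU")]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE NerdsTable (id INTEGER, name TEXT, school TEXT)")
conn.executemany("INSERT INTO NerdsTable VALUES (?, ?, ?)", rows)

# Self-join: the next id is the smallest larger id within the same school.
join_result = conn.execute("""
SELECT n.id, n.name, MIN(g.id) AS next_id
FROM NerdsTable n
LEFT JOIN NerdsTable g ON g.school = n.school AND g.id > n.id
GROUP BY n.school, n.id, n.name
ORDER BY n.school, n.id
""").fetchall()

# Window function: LEAD restarts for each school because of PARTITION BY.
lead_result = conn.execute("""
SELECT id, name,
       LEAD(id) OVER (PARTITION BY school ORDER BY id) AS next_id
FROM NerdsTable
ORDER BY school, id
""").fetchall()

print(join_result)
print(lead_result)
```

Both queries return the rows in the order the question asks for, with NULL (None in Python) as next_id for the last row of each school.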