Find the list of toppers in each class when given individual scores for each subject - sql

I need help in writing an efficient query to find a list of toppers (students with maximum total marks in each class) when we are given individual scores for each subject across different classes. We are required to return 3 columns: class, topper_student name and topper_student_total marks.
I have used multiple sub-queries to find a solution. I am sure there would be much better implementations available for this problem (maybe via joins or window functions?).
Input table and my solution can be found at SQL Fiddle link.
http://www.sqlfiddle.com/#!15/2919e/1/0
Input table:

It would be clearer to use temporary tables to store results along the way and make the result traceable, but the solution can be achieved with a single query:
WITH student_marks AS (
    SELECT Class_num, Name, SUM(Marks) AS student_total_marks
    FROM School
    GROUP BY Class_num, Name
)
SELECT Class_num, Name, student_total_marks
FROM (
    SELECT Class_num, Name, student_total_marks,
           ROW_NUMBER() OVER (PARTITION BY Class_num ORDER BY student_total_marks DESC) AS beststudentfirst
    FROM student_marks
) A
WHERE A.beststudentfirst = 1
The query inside the WITH clause calculates the total marks for every student in a class. At this point, the subject is no longer needed. The result is available under the name student_marks for the rest of the statement.
Next, we create a counter (beststudentfirst) using ROW_NUMBER to number the totals from the highest to the lowest within each class (order by student_total_marks desc). Sorting by Class_num as well would have no effect, because every row in a partition has the same class. The counter restarts each time the class changes (partition by Class_num). If two students tie for the top total, ROW_NUMBER picks one of them arbitrarily.
From this last result, we keep only the rows where the counter (beststudentfirst) equals 1. That row is the top student in each class.

Window functions are the most natural way to approach this. If you always want exactly one student per class, then use row_number():
select Class_num, Name, total_marks
from (select name, class_num, sum(marks) as total_marks,
row_number() over (partition by class_num order by sum(marks) desc) as seqnum
from School
group by Class_num, Name
) s
where seqnum <= 1
order by class_num, total_marks desc;
If you want to take ties into account, then use rank() or dense_rank().
Here is the SQL Fiddle.
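To make the difference between row_number() and rank() concrete, here is a minimal sketch that runs both variants against an in-memory SQLite database from Python. The sample rows are invented for illustration, the School columns are assumed from the queries above (the Subject column in particular is a guess), and SQLite 3.25 or newer is assumed for window-function support.

```python
import sqlite3

# In-memory database; the rows below are invented sample data.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE School (Class_num INTEGER, Name TEXT, Subject TEXT, Marks INTEGER)"
)
conn.executemany(
    "INSERT INTO School VALUES (?, ?, ?, ?)",
    [
        (1, "Ann", "Math", 90), (1, "Ann", "Art", 80),  # Ann: 170
        (1, "Bob", "Math", 85), (1, "Bob", "Art", 85),  # Bob: 170, ties with Ann
        (2, "Cid", "Math", 70), (2, "Cid", "Art", 60),  # Cid: 130
        (2, "Dee", "Math", 95), (2, "Dee", "Art", 50),  # Dee: 145
    ],
)

# {fn} is replaced by the window function under test.
QUERY = """
WITH totals AS (
    SELECT Class_num, Name, SUM(Marks) AS total_marks
    FROM School
    GROUP BY Class_num, Name
)
SELECT Class_num, Name, total_marks
FROM (
    SELECT Class_num, Name, total_marks,
           {fn}() OVER (PARTITION BY Class_num ORDER BY total_marks DESC) AS seqnum
    FROM totals
) s
WHERE seqnum = 1
ORDER BY Class_num, Name
"""

# row_number(): exactly one row per class, ties broken arbitrarily.
one_per_class = conn.execute(QUERY.format(fn="row_number")).fetchall()
# rank(): every student who shares the top total is returned.
with_ties = conn.execute(QUERY.format(fn="rank")).fetchall()

print(one_per_class)
print(with_ties)
```

With this data, the rank() variant returns both Ann and Bob for class 1, while the row_number() variant returns only one of them.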

select Class_num, [Name], total_marks
from
(
    select Row_number() over (partition by Class_num order by SUM(Marks) desc) as [RN],
           Class_num, [Name], SUM(Marks) as total_marks
    from School
    group by Class_num, [Name]
) A
where RN = 1

Related

SQL - finding the minimal value of a specific group and giving extended information about it

I would like to introduce myself as someone who has just recently started to fiddle a bit with SQL. Throughout my learning process I have come across a very specific problem and thus, my question is very specific too. Given the following table:
How should my list of commands look in order to get this following table:
In other words, what should I write to basically show the minimal salary and the id of its owner for each country. I have tried using GROUP BY but all I could get is the minimal salary per country whereas my goal was to show the id that belongs to the minimal salary too.
Hope I got my question clear and I thank everyone for the support.
This is a typical greatest-n-per-group problem.
One cross-database solution is to filter with a subquery:
select t.*
from mytable t
where t.salary = (select min(t1.salary) from mytable t1 where t1.country = t.country)
For performance with this query, you want an index on (country, salary).
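As a rough illustration of the subquery approach and the suggested index, the sketch below builds a small in-memory SQLite table from Python. The sample rows and the index name idx_country_salary are invented, and the column types are assumptions; only the table and column names come from the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable (id INTEGER PRIMARY KEY, country TEXT, salary INTEGER)"
)
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [(1, "US", 5000), (2, "US", 3000), (3, "DE", 4000), (4, "DE", 4500), (5, "FR", 2000)],
)
# Composite index on (country, salary); the index name is made up.
conn.execute("CREATE INDEX idx_country_salary ON mytable (country, salary)")

QUERY = """
SELECT t.id, t.country, t.salary
FROM mytable t
WHERE t.salary = (SELECT MIN(t1.salary) FROM mytable t1 WHERE t1.country = t.country)
ORDER BY t.country
"""
lowest_paid = conn.execute(QUERY).fetchall()
print(lowest_paid)

# The last column of each EXPLAIN QUERY PLAN row is a human-readable step.
# Whether the index shows up there is the planner's decision.
for step in conn.execute("EXPLAIN QUERY PLAN " + QUERY):
    print(step[-1])
```

The plan output differs between database engines and versions, so it is worth checking on the actual system rather than relying on this sketch.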
You can also use window functions, if your database supports that:
select id, country, salary
from (
select t.*, rank() over(partition by country order by salary) rn
from mytable t
) t
where rn = 1
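You can also do it with row_number(), which returns exactly one row per country even when several employees share the minimum salary:
select
    id,
    country,
    salary
from
(
    select
        id,
        country,
        salary,
        row_number() over (partition by country order by salary) as rnk
    from mytable
) val
where rnk = 1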

How to show 'the most popular name' in each year using SQL

I am practicing SQL in Google Cloud Platform and have a table with the most popular names in USA.
I need to write a query that returns the most popular name in every year.
The Table has 6 columns: id, state, gender, year, name, Number of occurrences of the name
So far I have:
SELECT DISTINCT year
FROM `table`
But I don't know what to do next...
What you are looking for is called the mode in statistics. One way to get this uses aggregation and window functions:
select year, name
from (select year, name, count(*) as cnt,
row_number() over (partition by year order by count(*) desc) as seqnum
from t
group by year, name
) yn
where seqnum = 1;
The above version returns one arbitrary name if there are ties for the most frequent. If you do want ties, use rank() instead of row_number().
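One caveat worth checking: count(*) counts rows, so if each row already carries a pre-aggregated count (the question mentions a "Number of occurrences" column), summing that column may be what is actually wanted. The sketch below contrasts the two using Python's sqlite3 module. The column name occurrences and all sample rows are hypothetical, and SQLite 3.25 or newer is assumed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "occurrences" is a hypothetical name for the count column.
conn.execute(
    "CREATE TABLE t (id INTEGER, state TEXT, gender TEXT, year INTEGER,"
    " name TEXT, occurrences INTEGER)"
)
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, "CA", "F", 2000, "Emma", 500),
        (2, "NY", "F", 2000, "Emma", 300),
        (3, "CA", "M", 2000, "Liam", 200),
        (4, "NY", "M", 2000, "Liam", 100),
        (5, "TX", "M", 2000, "Liam", 50),
        (6, "CA", "F", 2001, "Olivia", 400),
        (7, "CA", "M", 2001, "Noah", 400),
    ],
)

# {agg} is replaced by the aggregate used to measure popularity.
QUERY = """
WITH per_name AS (
    SELECT year, name, {agg} AS cnt
    FROM t
    GROUP BY year, name
)
SELECT year, name, cnt
FROM (
    SELECT year, name, cnt,
           RANK() OVER (PARTITION BY year ORDER BY cnt DESC) AS seqnum
    FROM per_name
) yn
WHERE seqnum = 1
ORDER BY year, name
"""

# Counting rows: Liam wins 2000 because he appears in three states.
by_rows = conn.execute(QUERY.format(agg="COUNT(*)")).fetchall()
# Summing occurrences: Emma wins 2000 with 800 against Liam's 350.
by_occurrences = conn.execute(QUERY.format(agg="SUM(occurrences)")).fetchall()

print(by_rows)
print(by_occurrences)
```

Because rank() is used here, the tied names for 2001 are both returned; swapping in row_number() would keep only one of them.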

Counting the unique values after group by clause

I need to count the students in every major for an academic year. There are three terms in a year, and a student may declare a different major in each term. I need to take the last major each student declared and count the students per major, so that each student is counted under only one major.
When I group by major, I can't avoid the duplicates.
I have only one table, and it has everything I need.
I wrote this code, but it gives me duplicated counts.
SELECT MAJR_CODE, MAJR_DESC, COUNT(DISTINCT ID_KEY)
FROM STUDENT_ENROLLMENT
WHERE TERM in ('201830','201910','201920')
and REGISTERED='Y'
GROUP BY MAJR_CODE, MAJR_DESC
ORDER BY MAJR_CODE
How can I get the result I want?
You can use window functions to get the data for the most recent term:
SELECT MAJR_CODE, MAJR_DESC, COUNT(*)
FROM (SELECT se.*, ROW_NUMBER() OVER (PARTITION BY ID_KEY ORDER BY TERM DESC) as seqnum
      FROM STUDENT_ENROLLMENT se
      WHERE TERM in ('201830', '201910', '201920') AND
            REGISTERED = 'Y'
     ) se
WHERE seqnum = 1
GROUP BY MAJR_CODE, MAJR_DESC
ORDER BY MAJR_CODE
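The following sketch compares the original COUNT(DISTINCT ...) query with the "latest term per student" approach on a few invented rows, using Python's sqlite3 module. The column types and sample data are assumptions, and it is also assumed that sorting the term codes in descending order puts the most recent term first. SQLite 3.25 or newer is needed for ROW_NUMBER().

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE STUDENT_ENROLLMENT (ID_KEY TEXT, TERM TEXT, MAJR_CODE TEXT,"
    " MAJR_DESC TEXT, REGISTERED TEXT)"
)
conn.executemany(
    "INSERT INTO STUDENT_ENROLLMENT VALUES (?, ?, ?, ?, ?)",
    [
        ("S1", "201830", "BIO", "Biology", "Y"),
        ("S1", "201910", "CHEM", "Chemistry", "Y"),
        ("S1", "201920", "CHEM", "Chemistry", "Y"),
        ("S2", "201830", "BIO", "Biology", "Y"),
        ("S2", "201910", "BIO", "Biology", "Y"),
        ("S3", "201830", "CHEM", "Chemistry", "Y"),
        ("S3", "201920", "BIO", "Biology", "Y"),
        ("S4", "201910", "BIO", "Biology", "Y"),
        ("S4", "201920", "MATH", "Mathematics", "N"),  # not registered, ignored
    ],
)

TERM_FILTER = "TERM IN ('201830', '201910', '201920') AND REGISTERED = 'Y'"

# Original approach: a student is counted once per major ever declared.
naive_counts = conn.execute(f"""
    SELECT MAJR_CODE, MAJR_DESC, COUNT(DISTINCT ID_KEY)
    FROM STUDENT_ENROLLMENT
    WHERE {TERM_FILTER}
    GROUP BY MAJR_CODE, MAJR_DESC
    ORDER BY MAJR_CODE
""").fetchall()

# Window approach: keep only each student's latest registered term, then count.
last_major_counts = conn.execute(f"""
    SELECT MAJR_CODE, MAJR_DESC, COUNT(*)
    FROM (SELECT se.*,
                 ROW_NUMBER() OVER (PARTITION BY ID_KEY ORDER BY TERM DESC) AS seqnum
          FROM STUDENT_ENROLLMENT se
          WHERE {TERM_FILTER}) latest
    WHERE seqnum = 1
    GROUP BY MAJR_CODE, MAJR_DESC
    ORDER BY MAJR_CODE
""").fetchall()

print(naive_counts)
print(last_major_counts)
```

In this sample the naive counts add up to six for four students, while the window version counts each student exactly once.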

I need the Top 10 results from table

I need to get the Top 10 results for each Region, Market and Name, ranked by the highest counts (Gaps). There are 4 Regions, each with 1 to N Markets. I can get the overall Top 10 but cannot figure out how to do this without using a Union for every Market. Any ideas on how to do this?
SELECT DISTINCT TOP 10
Region, Market, Name, Gaps
FROM
TableName
ORDER BY
Region, Market, Gaps DESC
One approach would be to use a CTE (Common Table Expression) together with ROW_NUMBER() if you're on SQL Server 2005 or newer (you didn't say which version you use).
Inside this CTE, you can partition your data by some criteria - i.e. your Region, Market, Name - and have SQL Server number all your rows starting at 1 for each of those "partitions", ordered by some criteria.
So try something like this:
;WITH RegionsMarkets AS
(
SELECT
Region, Market, Name, Gaps,
RN = ROW_NUMBER() OVER(PARTITION BY Region, Market, Name ORDER BY Gaps DESC)
FROM
dbo.TableName
)
SELECT
Region, Market, Name, Gaps
FROM
RegionsMarkets
WHERE
RN <= 10
Here, I am selecting only the first 10 entries for each "partition" (i.e. for each Region, Market, Name tuple) - ordered by Gaps in a descending fashion.
With this, you get the top 10 rows for each (Region, Market, Name) tuple - does that approach what you're looking for?
I think you want row_number():
select t.*
from (select t.*,
row_number() over (partition by region, market order by gaps desc) as seqnum
from tablename t
) t
where seqnum <= 10;
I am not sure if you want name in the partition by clause. If you have more than one name within a market, that may be what you are looking for. (Hint: Sample data and desired results can really help clarify a question.)
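As a small runnable sketch of the top-N-per-group pattern, the example below partitions by region and market only and keeps the top 2 rows per market so the sample stays short. It uses Python's sqlite3 module rather than SQL Server; the rows are invented, the column types are assumptions, and SQLite 3.25 or newer is assumed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tablename (region TEXT, market TEXT, name TEXT, gaps INTEGER)")
conn.executemany(
    "INSERT INTO tablename VALUES (?, ?, ?, ?)",
    [
        ("East", "NYC", "a", 9), ("East", "NYC", "b", 7), ("East", "NYC", "c", 5),
        ("East", "BOS", "d", 3),
        ("West", "LA", "e", 8), ("West", "LA", "f", 6), ("West", "LA", "g", 4),
    ],
)

# The "?" placeholder is the per-market limit (10 in the question, 2 here).
QUERY = """
SELECT region, market, name, gaps
FROM (
    SELECT t.*,
           ROW_NUMBER() OVER (PARTITION BY region, market ORDER BY gaps DESC) AS seqnum
    FROM tablename t
) ranked
WHERE seqnum <= ?
ORDER BY region, market, gaps DESC
"""

top2 = conn.execute(QUERY, (2,)).fetchall()
for row in top2:
    print(row)
```

Markets with fewer rows than the limit (BOS here) simply return everything they have, so no UNION per market is needed.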

SQL - Select top 1 with according to values from two columns

I know the title doesn't say much, but let me explain my situation:
I have the following table:
Now, I would like to select top 1 from each department, but I don't want to get duplicate position id, so I want the top employee from each department by number of projects, but distinct position ids. The results are the highlighted rows.
You cannot guarantee that the returned positions will be the best. One position might be the best in two departments, in which case, one of the results constraints will need to be relaxed.
So, here is a method to get some (perhaps all) departments with the highest ranking but distinct positions. Start by choosing only the highest ranked employees for each department. These are the ones with the most projects.
Then, for each PositionTypeId choose a random department from among these alternatives. Then, for each department, choose a random position type. The following query takes this approach:
select DepID, EmplyeeID, PositionTypeId, NumProjects
from (select t.*, row_number() over (partition by DepId order by newid()) as seqnum
      from (select t.*, row_number() over (partition by PositionTypeId order by newid()) as position_seqnum
            from (select t.*,
                         dense_rank() over (partition by DepId order by NumProjects desc
                                           ) as rank_seqnum
                  from t
                 ) t
            where rank_seqnum = 1
           ) t
      where position_seqnum = 1
     ) t
where seqnum = 1;
This is not guaranteed to return a row for each department. But, it is guaranteed that all departments returned will have different position types and the rows will be best for that department. You could probably work to tweak the middle step to ensure a greater coverage of departments. However, because the problem is not guaranteed to have a solution, such tweaks may be more effort than they are worth.
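The three-step approach can be tried out with the sketch below, which runs the same query shape on SQLite from Python. It substitutes SQLite's random() for SQL Server's newid(); the table contents are invented, the column names and types (including the EmployeeID spelling) are assumptions, and SQLite 3.25 or newer is assumed. Since the picks are random, the helper only checks the stated guarantees instead of a fixed result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (DepID INTEGER, EmployeeID INTEGER,"
    " PositionTypeId TEXT, NumProjects INTEGER)"
)
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?)",
    [
        (1, 10, "A", 5), (1, 11, "B", 5), (1, 12, "C", 2),  # two leaders in dept 1
        (2, 20, "A", 7), (2, 21, "B", 3),
        (3, 30, "B", 4),
    ],
)

QUERY = """
SELECT DepID, EmployeeID, PositionTypeId, NumProjects
FROM (
    SELECT p.*, ROW_NUMBER() OVER (PARTITION BY DepID ORDER BY random()) AS seqnum
    FROM (
        SELECT b.*,
               ROW_NUMBER() OVER (PARTITION BY PositionTypeId ORDER BY random()) AS position_seqnum
        FROM (
            SELECT t.*,
                   DENSE_RANK() OVER (PARTITION BY DepID ORDER BY NumProjects DESC) AS rank_seqnum
            FROM t
        ) b
        WHERE rank_seqnum = 1
    ) p
    WHERE position_seqnum = 1
) d
WHERE seqnum = 1
"""

# Highest project count per department, used to verify the "best" property.
best = dict(conn.execute("SELECT DepID, MAX(NumProjects) FROM t GROUP BY DepID"))


def satisfies_guarantees(rows):
    """True if departments and positions are distinct and every row is a department best."""
    deps = [r[0] for r in rows]
    positions = [r[2] for r in rows]
    return (
        len(set(deps)) == len(deps)
        and len(set(positions)) == len(positions)
        and all(r[3] == best[r[0]] for r in rows)
    )


rows = conn.execute(QUERY).fetchall()
print(rows, satisfies_guarantees(rows))
```

In this sample only position types A and B occur among the department leaders, so at most two of the three departments can appear in any run, which illustrates why full coverage is not guaranteed.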