Assume there's a table students containing student id, semester id and number of courses taken in each semester:
student semester num_courses
10001 23 3
10002 23 5
10003 23 1
10004 25 2
10005 25 3
10003 25 5
10000 26 4
10013 26 2
...
How can I find students in each semester who took the largest number of courses?
Expected output would be:
student semester num_courses
10002 23 5
10003 25 5
10000 26 4
...
I thought of using OVER PARTITION BY, but I'm not sure how to use it correctly with this type of query. What I got is not working as expected.
SELECT s.student, s.semester, MAX(s.num_courses) OVER (PARTITION BY s.semester) AS max
FROM students s;
Your idea is good. Now that you have the students' semester course counts along with the maximum semester course counts, compare the two:
SELECT semester, num_courses, student
FROM
(
SELECT
student,
semester,
num_courses,
MAX(num_courses) OVER (PARTITION BY semester) AS semester_max_num_courses
FROM students s
) with_max
WHERE num_courses = semester_max_num_courses
ORDER BY semester, student;
Another approach would be to select all maximum semester course counts and then use this to get the students:
SELECT semester, num_courses, student
FROM students
WHERE (semester, num_courses) IN
(
SELECT semester, MAX(num_courses)
from students
GROUP BY semester
)
ORDER BY semester, student;
With row_number() you can easily have a serial number in descending order of num_courses. Then just pick the rows with first serial number for each semister.
Subquery:
select student ,semester, num_courses from
(
Select student ,semester, num_courses, row_number()over(partition by semester
order by num_courses desc)rn from students
)t
where rn=1
Common table expression
With cte as
(
Select student ,semester, num_courses, row_number()over(partition by semester
order by num_courses desc)rn from students
)
select student ,semester, num_courses from cte where rn=1
Another approach, also if more than one student has max number of courses
WITH cte as
(
SELECT s.semester, MAX(s.num_courses) as num
FROM students s
GROUP BY s.semester
)
SELECT s.student, s.semester, s.num_courses FROM students s JOIN cte on s.semester =
cte.semester
WHERE cte.num = s.course
ORDER BY s.semester
If you want one row per semester, then in Postgres, I would recommend distinct on:
select distinct on (semester) s.*
from students s
order by semester, num_courses desc;
If you want all students who have the maximum number, then Thorsten's answer is appropriate.
Got this as an interview question from Amazon to test basic SQL skills and I kind of flopped it. Consider the following tables:
Student - Stid, Stname, Details
Subject - Subid, Subname
Marks - Stid, Subid, mark
Write a query to print the list of names of students who have scored the maximum mark in each subject.
The wrong answer which I gave was:
select A.Stname from A as Student, B as
(select Stid, Subid, max(mark) from Marks groupby Subid) where A.Stid = B.Stid
I was thinking you can have a table B in which you can get the top marks alone and match it with the names in the student table A. But turns out my "groupby" is wrong.
One more variation of the question which I felt was can be asked is that, if there is more than one student having the highest mark in a subject, even their names should be included.
Can you please help solve these queries. They seem to be simple, but I am not able to get a hang of it.
Thanks!
You just have to partition the problem into some bite-sized steps and solve each one by one
First, get the maximum score in each subject:
select SubjectID, max(MarkRate)
from Mark
group by SubjectID;
Then query who are those that has SubjectID with max MarkRate:
select SubjectID, MarkRate, StudentID
from Mark
where (SubjectID,MarkRate)
in
(
select SubjectID, max(MarkRate)
from Mark
group by SubjectID
)
order by SubjectID, StudentID;
Then obtain the Student's name, instead of displaying just the StudentID:
select SubjectName, MarkRate, StudentName
from Mark
join Student using(StudentID)
join Subject using(SubjectID)
where (SubjectID,MarkRate)
in
(
select SubjectID, max(MarkRate)
from Mark
group by SubjectID
)
order by SubjectName, StudentName
Database vendors' artificial differences aside with regards to joining and correlating results, the basic step is the same; first, partition the problem in a bite-sized parts, and then integrate them as you solved each one of them, so it won't be as confusing.
Sample data:
CREATE TABLE Student
(StudentID int, StudentName varchar(6), Details varchar(1));
INSERT INTO Student
(StudentID, StudentName, Details)
VALUES
(1, 'John', 'X'),
(2, 'Paul', 'X'),
(3, 'George', 'X'),
(4, 'Paul', 'X');
CREATE TABLE Subject
(SubjectID varchar(1), SubjectName varchar(7));
INSERT INTO Subject
(SubjectID, SubjectName)
VALUES
('M', 'Math'),
('E', 'English'),
('H', 'History');
CREATE TABLE Mark
(StudentID int, SubjectID varchar(1), MarkRate int);
INSERT INTO Mark
(StudentID, SubjectID, MarkRate)
VALUES
(1, 'M', 90),
(1, 'E', 100),
(2, 'M', 95),
(2, 'E', 70),
(3, 'E', 95),
(3, 'H', 98),
(4, 'H', 90),
(4, 'E', 100);
Live test here: http://www.sqlfiddle.com/#!1/08728/3
IN tuple test is still a join by any other name:
Convert this..
select SubjectName, MarkRate, StudentName
from Mark
join Student using(StudentID)
join Subject using(SubjectID)
where (SubjectID,MarkRate)
in
(
select SubjectID, max(MarkRate)
from Mark
group by SubjectID
)
order by SubjectName, StudentName
..to JOIN:
select SubjectName, MarkRate, StudentName
from Mark
join Student using(StudentID)
join Subject using(SubjectID)
join
(
select SubjectID, max(MarkRate) as MarkRate
from Mark
group by SubjectID
) as x using(SubjectID,MarkRate)
order by SubjectName, StudentName
Contrast this code with the code immediate it. See how JOIN on independent query look like an IN construct? They almost look the same, and IN was just replaced with JOIN keyword; and the replaced IN keyword with JOIN is in fact longer, you need to alias the independent query's column result(max(MarkRate) AS MarkRate) and also the subquery itself (as x). Anyway, this are just matter of style, I prefer IN clause, as the intent is clearer. Using JOINs merely to reflect the data relationship.
Anyway, here's the query that works on all databases that doesn't support tuple test(IN):
select sb.SubjectName, m.MarkRate, st.StudentName
from Mark as m
join Student as st on st.StudentID = m.StudentID
join Subject as sb on sb.SubjectID = m.SubjectID
join
(
select SubjectID, max(MarkRate) as MaxMarkRate
from Mark
group by SubjectID
) as x on m.SubjectID = x.SubjectID AND m.MarkRate = x.MaxMarkRate
order by sb.SubjectName, st.StudentName
Live test: http://www.sqlfiddle.com/#!1/08728/4
SQL> select * from stud;
STUDENTID NAME DETAILS
---------- ------------------------------ ------------------------------
1 Alfred AA
2 Betty BB
3 Chris CC
SQL> select * from subject;
SUBJECTID NAME
---------- ------------------------------
1 Maths
2 Science
3 English
SQL> select * from marks;
STUDENTID SUBJECTID MARK
---------- ---------- ----------
1 1 61
1 2 75
1 3 87
2 1 82
2 2 64
2 3 77
3 1 82
3 2 83
3 3 67
9 rows selected.
SQL> select name, subjectid, mark
2 from (select name, subjectid, mark, dense_rank() over(partition by subjectid order by mark desc) rank
3 from stud st, marks mk
4 where st.studentid=mk.studentid)
5 where rank=1;
NAME SUBJECTID MARK
------------------------------ ---------- ----------
Betty 1 82
Chris 1 82
Chris 2 83
Alfred 3 87
SQL>
My attempt - I'd start with the max mark and build from there
Schema:
CREATE TABLE Student (
StudentId int,
Name nvarchar(30),
Details nvarchar(30)
)
CREATE TABLE Subject (
SubjectId int,
Name nvarchar(30)
)
CREATE TABLE Marks (
StudentId int,
SubjectId int,
Mark int
)
Data:
INSERT INTO Student (StudentId, Name, Details)
VALUES (1,'Alfred','AA'), (2,'Betty','BB'), (3,'Chris','CC')
INSERT INTO Subject (SubjectId, Name)
VALUES (1,'Maths'), (2, 'Science'), (3, 'English')
INSERT INTO Marks (StudentId, SubjectId, Mark)
VALUES
(1,1,61),(1,2,75),(1,3,87),
(2,1,82),(2,2,64),(2,3,77),
(3,1,82),(3,2,83),(3,3,67)
GO
My query would have been:
;WITH MaxMarks AS (
SELECT SubjectId, MAX(Mark) as MaxMark
FROM Marks
GROUP BY SubjectId
)
SELECT s.Name as [StudentName], sub.Name AS [SubjectName],m.Mark
FROM MaxMarks mm
INNER JOIN Marks m
ON m.SubjectId = mm.SubjectId
AND m.Mark = mm.MaxMark
INNER JOIN Student s
ON s.StudentId = m.StudentId
INNER JOIN Subject sub
ON sub.SubjectId = mm.SubjectId
SQL Fiddle Example
Find the max mark for each subject
Join Marks, Student and Subject to find the relevant details of that highest mark
This also take care of duplicate students with the highest mark
Results:
STUDENTNAME SUBJECTNAME MARK
Alfred English 87
Betty Maths 82
Chris Maths 82
Chris Science 83
I would have said:
select s.stname, s2.subname, highmarks.mark
from students s
join marks m on s.stid = m.stid
join Subject s2 on m.subid = s2.subid
join (select subid, max(mark) as mark
from marks group by subid) as highmarks
on highmarks.subid = m.subid and highmarks.mark = m.mark
order by subname, stname;
SQLFiddle here: http://sqlfiddle.com/#!2/5ef84/3
This is a:
select on the students table to get the possible students
a join to the marks table to match up students to marks,
a join to the subjects table to resolve subject ids into names.
a join to a derived table of the maximum marks in each subject.
Only the students that get maximum marks will meet all three join conditions. This lists all students who got that maximum mark, so if there are ties, both get listed.
I like the simple solution using windows functions:
select t.*
from (select student.*, su.subname, max(mark) over (partition by subid) as maxmark
from marks m join
students st
on m.stid = st.stid join
subject su
on m.subid = su.subid
) t
where t.mark = maxmark
Or, alternatively:
select t.*
from (select student.*, su.subname, rank(mark) over (partition by subid order by mark desc) as markseqnum
from marks m join
students st
on m.stid = st.stid join
subject su
on m.subid = su.subid
) t
where markseqnum = 1
Just for fun, consider the different question one would get with a very literal interpretation of the OP's description: "Write a query to print the list of names of students who have scored the maximum mark in each subject."
Those who've answered here have written queries to list a student if he or she scored the maximum mark in any one subject, but not necessarily in all subjects. Since the question posed by the OP does not ask for subject names in the output, this is a plausible interpretation.
To list the names of students (if any) who have scored the maximum mark in all subjects (excluding subjects with no marks, since arguably there is then no maximum mark then), I think this works, using column names from Michael's SQL Fiddle, which I've adapted here.
select StudentName
from Student
where not exists (
select * from Subject
where exists (
select * from Mark as M1
where M1.SubjectID = Subject.SubjectID
and M1.StudentID <> Student.StudentID
and not exists (
select * from Mark as M2
where M2.StudentID = Student.StudentID
and M2.SubjectID = M1.SubjectID
and M2.MarkRate >= M1.MarkRate
)
)
)
In other words, select a student X's name if there is no subject in which someone's mark for that subject is not matched or exceeded by some mark belonging to X for the same subject. (This query returns a student's name if the student has received more than one mark in a subject, so long as one of those marks is the highest for the subject.)
Using a Subquery:
select m.subid, m.stid, m.mark
from marks m,
(select m2.subid, max(m2.mark) max_mark
from marks m2
group by subid) subq
where subq.subid = m.subid
and subq.max_mark = m.mark
order by 1,2;
Using Rank with Partition:
select subid, stid, mark
from (select m.subid, m.stid, m.mark,
rank() over (partition by m.subid order by m.mark desc) Mark_Rank
from marks m)
where Mark_Rank = 1
order by 1,2;
select max(m.mark) as maxMarkObtained,su.Subname from Student s
inner join Marks m on s.Stid=m.Stid inner join [Subject] su on
su.Subid=m.Subid group by su.Subname
I have execute it , this should work.
Select S.StudentName
From Student S
where S.StudentID IN
(Select StudentID from (
( Select Max(MarkRate)as MarkRate,SubjectID From Mark Group by SubjectID)) MaxMarks, Mark
where MaxMarks.SubjectID= Mark.SubjectID AND MaxMarks.MarkRate=Mark.MarkRate)
I will try to get the answer with one query using CTE and window function rank()
create the tables
create table Students
(student_id int,
Name varchar(255),
details varchar(255));
create table Subject(
Sub_id int,
name varchar(255));
create table marks
(student_id int,
subject_id int,
mark int);
the answer should be a table with the below fields
student_name | subject_name | mark
plan the execution steps
Create a CTE
join the tables;
rank the subjects, order them by mark descending
Select the output fields from the CTE
with CTE as (select s.name, sb.name as subject_name, m.mark, rank() over(partition by sb.name order by m.mark desc) as rn
from Students s
join marks m on s.student_id = m.student_id
join subject sb
on sb.Sub_id = m.subject_id)
select name , subject_name, mark
from CTE
where rn = 1
select Student.Name,dt.Mark from Student inner join
(
select StudentId, Mark from Marks where Mark in
(
select Max(Mark) from Marks group by SubjectId
)
)dt
on dt.StudentId = Student.StudentId;
SELECT subjectname,
studentname
FROM student s
INNER JOIN mark m
ON s.studid = m.studid
INNER JOIN subject su
ON su.subjectid = m.subjectid
INNER JOIN (
SELECT subjectid,
max(value) AS maximum
FROM mark
GROUP BY subjectid
) highmark h
ON h.subjectid = m.subjectid
AND h.maximum = m.value;