Trying to figure out how to join these queries - sql

I have a table named Grades with columns Student, Practical, and Written. I am trying to find the top 5 students by total score on the test. Here are the queries I have so far; I am not sure how to combine them correctly. I am using Oracle 11g.
This gets me the total for each student:
SELECT Student, Practical, Written, (Practical+Written) AS SumColumn
FROM Grades;
This gets the top 5 students:
SELECT Student
FROM ( SELECT Student
, DENSE_RANK() OVER (ORDER BY Score DESC) as Score_dr
FROM Grades )
WHERE Score_dr <= 5
order by Score_dr;

The approach I prefer is data-centric, rather than row-position centric:
SELECT g.Student, g.Practical, g.Written, (g.Practical+g.Written) AS SumColumn
FROM Grades g
LEFT JOIN Grades g2 on g2.Practical+g2.Written > g.Practical+g.Written
GROUP BY g.Student, g.Practical, g.Written
HAVING COUNT(*) < 5
ORDER BY g.Practical+g.Written DESC
This works by joining each student with all students that have a greater total score, then using a HAVING clause to keep only those with fewer than 5 such students, which gives you the top 5.
The left join is needed to return the top scorer(s), who have no students with greater scores to join to.
Ties are all returned, so a tie for 5th place yields more than 5 rows.
Because it does not use row-position logic, which varies from database to database, this query is also completely portable.
Note that the ORDER BY is optional.
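To see the tie behaviour concretely, here is a minimal sketch that runs the same self-join idea through Python's built-in sqlite3 module, used only as a stand-in for Oracle. The sample rows are made up; only the Grades column names come from the question. It counts matches with COUNT(g2.Student) rather than COUNT(*), so the top scorer gets a count of 0; both forms select the same rows here.

```python
import sqlite3

# Hypothetical sample data; Eve and Fay tie for 5th place with a total of 85.
rows = [
    ("Ann", 50, 45), ("Bob", 48, 45), ("Cid", 47, 44), ("Dee", 40, 50),
    ("Eve", 45, 40), ("Fay", 45, 40), ("Gus", 30, 30),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Grades (Student TEXT, Practical INTEGER, Written INTEGER)")
conn.executemany("INSERT INTO Grades VALUES (?, ?, ?)", rows)

# Join each student to every student with a strictly greater total, then keep
# the students that have fewer than 5 such higher scorers.
top5 = conn.execute("""
    SELECT g.Student, g.Practical + g.Written AS SumColumn
    FROM Grades g
    LEFT JOIN Grades g2
           ON g2.Practical + g2.Written > g.Practical + g.Written
    GROUP BY g.Student, g.Practical, g.Written
    HAVING COUNT(g2.Student) < 5
    ORDER BY SumColumn DESC, g.Student
""").fetchall()

for student, total in top5:
    print(student, total)
```

With this data the query returns six rows, because both students tied for 5th place are kept, while Gus (six higher scorers) is dropped.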

In Oracle SQL you can do:
SELECT g.Student, g.Practical, g.Written, (g.Practical+g.Written) as SumColumn
FROM ( SELECT Student, DENSE_RANK() OVER (ORDER BY (Practical+Written) DESC) as Score_dr
FROM Grades ) score, Grades g
WHERE score.Score_dr <= 5
and score.Student = g.Student
order by score.Score_dr;

You can easily include the projection of the first query in the sub-query of the second.
SELECT Student
, Practical
, Written
, tot_score
FROM (
SELECT Student
, Practical
, Written
, (Practical+Written) AS tot_score
, DENSE_RANK() OVER (ORDER BY (Practical+Written) DESC) as Score_dr
FROM Grades
)
WHERE Score_dr <= 5
order by Score_dr;
One virtue of analytic functions is that we can just use them in any query. This distinguishes them from aggregate functions, where we need to include all non-aggregate columns in the GROUP BY clause (at least with Oracle).
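One detail worth checking with ties is which ranking function you want: DENSE_RANK numbers distinct scores without gaps, so a filter of 5 keeps the top 5 distinct totals, while RANK leaves gaps after ties. The sketch below compares the two using Python's built-in sqlite3 module as a stand-in for Oracle (window functions need SQLite 3.25 or later). The sample rows and the helper name top5 are made up for illustration.

```python
import sqlite3

# Hypothetical scores: totals are 95, 95, 90, 85, 85, 80, 70, 60.
rows = [
    ("Ann", 50, 45), ("Bob", 45, 50), ("Cid", 45, 45), ("Dee", 40, 45),
    ("Eve", 45, 40), ("Fay", 40, 40), ("Gus", 35, 35), ("Hal", 30, 30),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Grades (Student TEXT, Practical INTEGER, Written INTEGER)")
conn.executemany("INSERT INTO Grades VALUES (?, ?, ?)", rows)


def top5(rank_function):
    """Return (Student, tot_score) rows whose rank by total score is at most 5."""
    # rank_function is only ever one of the fixed names passed below.
    query = f"""
        SELECT Student, tot_score
        FROM (
            SELECT Student,
                   Practical + Written AS tot_score,
                   {rank_function}() OVER (ORDER BY Practical + Written DESC) AS score_rank
            FROM Grades
        )
        WHERE score_rank <= 5
        ORDER BY score_rank, Student
    """
    return conn.execute(query).fetchall()


dense = top5("DENSE_RANK")  # ranks 1,1,2,3,3,4,5,6: seven students pass the filter
plain = top5("RANK")        # ranks 1,1,3,4,4,6,7,8: five students pass the filter

print(len(dense), len(plain))
```

Which of the two is right depends on whether "top 5" means the five best scores or roughly five students.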

Related

SQL What do I need to group by?

I am trying to get a better understanding of GROUP BY and COUNT in SQL, and tried to find the student who has been studying for the second-longest time.
For it to work I also need to group by s.semester; grouping by s.name alone (which is what I did initially) does not work. Why is this? I know the query below is right, but I want to understand why, for future practice questions.
select s.name
from students s
group by s.name, s.semester
having 1 = (select count (gold.name)
from students gold
where gold.name <> s.name
and gold.semester > s.semester
)
Thanks in advance!
Grouping alone is not enough to solve this task; you need a ranking function.
It assigns each student a rank based on the number of semesters.
You can then select students by rank in the WHERE clause.
select
*
from (
select
row_number() over (partition by name_students, semestr order by semestr_count
asc) r_n,
name_students,
semestr
from (
select name_students, semestr, count(semestr_count) as semestr_count from studies
group by name_students, semestr
) agregate_studies
) ranked_studies
where r_n = 1
Use r_n to pick other positions in the ranking, for example r_n = 2 or r_n = 3.

SQL Select all MIN Values for each group

So I need to select all the students, having the minimum grade for each prof. For example, if Augustinus had two students with grade 1.0, then I would like to see both in the result.
Table of my data
What the result could look like, if the LIMIT was set to 10
So what I basically want is to see the best students that each prof has.
What I have tried is the following:
SELECT professor, student, min(note)
FROM temp
GROUP BY professor
ORDER BY note
The problem of course being that I only get one minimum value for each prof and not all minimum values.
*temp is just the table name
One way to solve these types of problems is to use a subquery to rank the grades for each class in a descending order. This involves a window function. With a second query you can limit the results based on your criteria of 10.
SELECT professor, student, note
FROM
(
SELECT professor,student,note,
row_number() over(partition by professor order by note desc) as downwardrank
FROM temp
) as rankings
WHERE
downwardrank <= 10
Just found a solution myself:
SELECT professor, student, note
FROM temp
WHERE (professor, note) IN
(SELECT professor, min(note)
FROM temp
GROUP BY professor
ORDER BY note)
ORDER BY note, professor, student
LIMIT 10
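As a quick check that the (professor, note) IN (...) form really keeps every student tied for a professor's minimum, here is a small sketch using Python's built-in sqlite3 module in place of the original database. The table is called notes here instead of temp, and the rows are invented; only the column names and the professor Augustinus come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "notes" is a hypothetical stand-in for the question's table named temp.
conn.execute("CREATE TABLE notes (professor TEXT, student TEXT, note REAL)")
conn.executemany("INSERT INTO notes VALUES (?, ?, ?)", [
    ("Augustinus", "Ann", 1.0), ("Augustinus", "Bob", 1.0), ("Augustinus", "Cid", 2.0),
    ("Kant", "Dee", 1.3), ("Kant", "Eve", 2.7),
])

# The subquery yields one (professor, minimum note) pair per professor; the
# outer query keeps every row matching one of those pairs, ties included.
best = conn.execute("""
    SELECT professor, student, note
    FROM notes
    WHERE (professor, note) IN
          (SELECT professor, MIN(note)
           FROM notes
           GROUP BY professor)
    ORDER BY note, professor, student
    LIMIT 10
""").fetchall()

for row in best:
    print(row)
```

Both of Augustinus's students with grade 1.0 appear, followed by Kant's single best student.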

SQL Server Get one row for each student with highest date

I have two tables as follows:
I want to find the StudentId, FirstName, StudentLoginInfoId, and LoginDate. I expect only one entry per student, the one with the latest LoginDate.
Expected result:
You could use ROW_NUMBER in a subquery to number the rows within each partition (here, each student), then keep only the rows numbered 1, which gives one row per student.
select studentid, firstname, studentlogininfoid, logindate
from (
select
s.studentid, s.firstname, sl.studentlogininfoid, sl.logindate,
row_number() over (partition by sl.studentid order by sl.logindate desc) as rn
from student s
inner join studentlogininfo sl on s.studentid = sl.studentid
) t
where rn = 1
Explaining arguments for row_number:
PARTITION BY specifies the groups to enumerate separately (numbering restarts at 1 for each group)
ORDER BY specifies the order in which rows are enumerated within each group
If we enumerate the rows for each student, sorted from the latest date downwards, then the first row for each student (the row with rn = 1) contains the highest login date for that student.
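To make the numbering visible, the sketch below prints the rn value assigned to every login row before any filtering. It uses Python's built-in sqlite3 module as a stand-in for SQL Server (window functions need SQLite 3.25 or later); the sample rows are invented, and the table layout is assumed from the column names used in the queries.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Student (StudentId INTEGER, FirstName TEXT);
    CREATE TABLE StudentLoginInfo (StudentLoginInfoId INTEGER, StudentId INTEGER, LoginDate TEXT);
    INSERT INTO Student VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO StudentLoginInfo VALUES
        (10, 1, '2020-01-05'), (11, 1, '2020-03-01'),
        (12, 2, '2020-02-10'), (13, 2, '2020-01-20');
""")

# Number each student's logins from newest to oldest; numbering restarts at 1
# for every StudentId because of PARTITION BY.
numbered = conn.execute("""
    SELECT s.StudentId, sl.LoginDate,
           ROW_NUMBER() OVER (PARTITION BY sl.StudentId
                              ORDER BY sl.LoginDate DESC) AS rn
    FROM Student s
    INNER JOIN StudentLoginInfo sl ON s.StudentId = sl.StudentId
    ORDER BY s.StudentId, rn
""").fetchall()

for row in numbered:
    print(row)

# Keeping only rn = 1 leaves the latest login of each student.
latest = [(student_id, login_date) for student_id, login_date, rn in numbered if rn == 1]
print(latest)
```

The dates are stored as ISO-formatted text here, which sorts correctly as strings; a real SQL Server table would normally use a date type.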
You can use "CROSS APPLY" to find what you want:
SELECT S.StudentId
, S.FirstName
, SLI.StudentLoginInfoId
, SLI.LoginDate
FROM Student S
CROSS APPLY (SELECT TOP 1 * FROM StudentLoginInfo SLI WHERE S.StudentId = SLI.StudentId ORDER BY LoginDate DESC) SLI

SQL Finding maximum value without top command

Let's say I have a database with a table:
-courses (key: name [ofthecourse], other attributes: year in which the course takes place)
I want to write a query that answers the question:
Which year of study has the maximum number of courses?
Normally, the query would be:
SELECT TOP 1 STUDYEAR
FROM COURSES
GROUP BY STUDYEAR
ORDER BY COUNT(CNO) DESC;
But my question is, which query could complete this without using the TOP 1 phrase?
You can use an inner query to get the maximum count. The only difference is that it can return more than one record if several years share the same count.
SELECT STUDYEAR
FROM COURSES
GROUP BY STUDYEAR
HAVING COUNT(CNO) = (SELECT MAX(CNOCount) FROM
(SELECT COUNT(CNO) CNOCount
FROM COURSES
GROUP BY STUDYEAR) X)
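The tie case can be tried out with a few made-up rows. This is only a sketch: it runs the HAVING query through Python's built-in sqlite3 module rather than the original database, and the course data is invented so that years 1 and 2 both have two courses.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE COURSES (CNO TEXT, STUDYEAR INTEGER)")
# Hypothetical data: two courses in year 1, two in year 2, one in year 3.
conn.executemany("INSERT INTO COURSES VALUES (?, ?)", [
    ("C1", 1), ("C2", 1), ("C3", 2), ("C4", 2), ("C5", 3),
])

# Keep every year whose course count equals the largest per-year count.
max_years = [row[0] for row in conn.execute("""
    SELECT STUDYEAR
    FROM COURSES
    GROUP BY STUDYEAR
    HAVING COUNT(CNO) = (SELECT MAX(CNOCount) FROM
                         (SELECT COUNT(CNO) AS CNOCount
                          FROM COURSES
                          GROUP BY STUDYEAR) X)
    ORDER BY STUDYEAR
""")]

print(max_years)
```

Both tied years come back, whereas a TOP 1 query would return only one of them.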
Another version with only one inner query:
SELECT STUDYEAR
FROM
(SELECT STUDYEAR, ROW_NUMBER() OVER (ORDER BY COUNT(CNO) DESC) RowNumber
FROM COURSES
GROUP BY STUDYEAR) X
WHERE RowNumber = 1

sql query finding most often level appear

I have a table Student in SQL Server with these columns:
[ID], [Age], [Level]
I want the query that returns each age value that appears in Students, and finds the level value that appears most often. For example, if there are more 'a' level students aged 18 than 'b' or 'c' it should print the pair (18, a).
I am new to SQL Server and I want a simple answer with nested query.
You can do this using window functions:
select t.*
from (select age, level, count(*) as cnt,
row_number() over (partition by age order by count(*) desc) as seqnum
from student s
group by age, level
) t
where seqnum = 1;
The inner query aggregates the data to count the number of students at each level for each age. The row_number() enumerates these rows within each age (the partition by), largest count first. The where clause then keeps the top row for each age.
In the case of ties, this returns just one of the values. If you want all of them, use rank() instead of row_number().
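The difference between the two functions shows up as soon as two levels are equally common for one age. The sketch below is an illustration under assumptions: it uses Python's built-in sqlite3 module instead of SQL Server (SQLite 3.25 or later), invented rows, a hypothetical helper named most_common, and it moves the counting into a CTE rather than ordering the window by COUNT(*) directly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Student (ID INTEGER, Age INTEGER, Level TEXT)")
# Hypothetical data: level 'a' wins at age 18; 'b' and 'c' tie at age 19.
conn.executemany("INSERT INTO Student VALUES (?, ?, ?)", [
    (1, 18, "a"), (2, 18, "a"), (3, 18, "b"),
    (4, 19, "b"), (5, 19, "c"),
])


def most_common(rank_function):
    """Return (Age, Level) rows ranked first within their age by count."""
    # rank_function is only ever one of the fixed names passed below.
    query = f"""
        WITH counts AS (
            SELECT Age, Level, COUNT(*) AS cnt
            FROM Student
            GROUP BY Age, Level
        )
        SELECT Age, Level
        FROM (
            SELECT Age, Level,
                   {rank_function}() OVER (PARTITION BY Age ORDER BY cnt DESC) AS seqnum
            FROM counts
        )
        WHERE seqnum = 1
        ORDER BY Age, Level
    """
    return conn.execute(query).fetchall()


with_ties = most_common("RANK")          # both tied levels for age 19 are kept
one_per_age = most_common("ROW_NUMBER")  # exactly one row per age

print(with_ties)
print(len(one_per_age))
```

With ROW_NUMBER, which of the tied levels is returned for age 19 is not defined by the query, so the sketch only prints how many rows came back.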
One more option uses the ROW_NUMBER ranking function in the ORDER BY clause. WITH TIES returns every row that ties for last place in the limited result set; here that means every row numbered 1, one per age.
SELECT TOP 1 WITH TIES age, level
FROM dbo.Student
GROUP BY age, level
ORDER BY ROW_NUMBER() OVER(PARTITION BY age ORDER BY COUNT(*) DESC)
Or a second version of the query that compares the count for each (age, level) pair with the maximum of those counts per age.
SELECT *
FROM (
SELECT age, level, COUNT(*) AS cnt,
MAX(COUNT(*)) OVER(PARTITION BY age) AS mCnt
FROM dbo.Student
GROUP BY age, level
)x
WHERE x.cnt = x.mCnt
Another option, though it requires a later version of SQL Server:
;WITH x AS
(
SELECT age,
level,
occurrences = COUNT(*)
FROM Student
GROUP BY age,
level
)
SELECT *
FROM x x
WHERE EXISTS (
SELECT *
FROM x y
WHERE x.age = y.age AND x.occurrences > y.occurrences
)
I realise it doesn't quite answer the question, as it only returns age/level combinations for ages that have more than one level.
Maybe someone can help amend it so that it also includes the single-level ages in the result set: http://sqlfiddle.com/#!3/d597b/9
with combinations as (
select age, level, count(*) occurrences
from Student
group by age, level
)
select age, level
from combinations c
where occurrences = (select max(occurrences)
from combinations
where age = c.age)
This finds every age and level combination in the Student table and counts the number of occurrences of each.
Then, for each age, it keeps the combination whose number of occurrences is the highest for that age, and returns the age and level for that row.
This has the advantage of not being tied to SQL Server - it's vanilla SQL. However, a window function like Gordon pointed out may perform better on SQL Server.