How to get MAX value out of the GROUPs COUNT - sql

I've recently started to learn T-SQL beyond basic inserts and selects. I have a test database that I train on, and there is one query that I can't really get to work.
There are 3 tables used in that query; the picture shows their simplified fields and relations.
I have the following 2 queries. The first one simply displays students and the number of marks from each subject. The second does almost what I want to achieve: it shows students and the maximum number of marks they got in any one subject, e.g.
subject1 - (marks) 1, 5, 3, 4 - count - 4
subject2 - (marks) 5, 4, 5 - count - 3
Query 2 shows 4, and from what I checked it returns correct results, but I want one more thing: to show the name of the subject that has the maximum number of marks, so in the example case subject1.
--Query 1--
SELECT s.Surname, subj.SubjectName, COUNT(m.Mark) as Marks_count
FROM marks m, students s, subjects subj
WHERE m.StudentId = s.StudentNumber and subj.SubjectNumber = m.SubjectId
GROUP BY s.Surname, subj.SubjectName
ORDER BY s.Surname
--Query 2--
SELECT query.Surname, MAX(Marks_count) as Maximum_marks_count
FROM (SELECT s.Surname, subj.SubjectName, COUNT(m.Mark) as Marks_count
FROM marks m, students s, subjects subj
WHERE m.StudentId = s.StudentNumber and subj.SubjectNumber = m.SubjectId
GROUP BY s.Surname, subj.SubjectName) as query
GROUP BY query.Surname
ORDER BY query.Surname
--Query 3 - not working as supposed--
SELECT query.Surname, query.SubjectName, MAX(Marks_count) as Maximum_marks_count
FROM (SELECT s.Surname, subj.SubjectName, COUNT(m.Mark) as Marks_count
FROM marks m, students s, subjects subj
WHERE m.StudentId = s.StudentNumber and subj.SubjectNumber = m.SubjectId
GROUP BY s.Surname, subj.SubjectName) as query
GROUP BY query.Surname, query.SubjectName
ORDER BY query.Surname
Part of the query 1 result
Part of the query 2 and unfortunately query 3 result
The problem is that when I add the subject name to the SELECT statement, I get the same results as from query 1: there is no longer a maximum number of marks, just students, subjects and the number of marks from each subject.
If someone could tell me what I'm missing, I would much appreciate it :)

Here's a query that gets the highest mark per student, put it at the top of your sql file/batch and it will make another "table" you can join to your other tables to get the student name and the subject name:
WITH studentBest AS (
SELECT * FROM (
SELECT *, ROW_NUMBER() OVER(PARTITION BY studentid ORDER BY mark DESC) rown
FROM marks) a
WHERE rown = 1)
You use it like this (for example)
--the WITH bit goes above this line
SELECT *
FROM
studentBest sb
INNER JOIN
subjects s
ON sb.subjectid = s.subjectnumber
Etc
That's also how you should be doing your joins
How does it work? It establishes an incrementing counter that restarts every time studentid changes (the PARTITION BY clause), and the numbering goes in descending mark order (the ORDER BY clause). An outer query then selects only the rows whose row number is 1, i.e. the top mark per student
Why can't I use group by?
You can, but you have to write a query that summarises the marks table into the top mark (MAX) per student, and then join that data back to the marks table to retrieve the subject. All in all it's a lot more faff, and often less efficient
What if there are two subjects with the same mark?
Use RANK instead of ROW_NUMBER if you want to see both
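To make the difference concrete, here is a minimal sketch that runs the same "top row per student" query once with ROW_NUMBER and once with RANK. It uses Python's built-in sqlite3 module (which needs an SQLite build with window functions, 3.25 or later) rather than SQL Server, and the sample rows are made up for illustration; only the marks table and its studentid/mark columns come from the answer above.

```python
import sqlite3

# Hypothetical sample data: student 1 has two subjects tied on the top mark.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE marks (studentid INTEGER, subjectid INTEGER, mark INTEGER);
INSERT INTO marks VALUES
    (1, 10, 5), (1, 20, 5), (1, 30, 3),
    (2, 10, 4), (2, 20, 2);
""")

def top_marks(fn):
    # fn is ROW_NUMBER or RANK; the counter restarts for every studentid
    # and follows descending mark order within each student.
    sql = f"""
        SELECT studentid, subjectid, mark
        FROM (SELECT *, {fn}() OVER (PARTITION BY studentid
                                     ORDER BY mark DESC) AS rown
              FROM marks) a
        WHERE rown = 1
        ORDER BY studentid, subjectid
    """
    return conn.execute(sql).fetchall()

row_number_rows = top_marks("ROW_NUMBER")  # exactly one row per student
rank_rows = top_marks("RANK")              # tied rows all get rank 1
```

With ROW_NUMBER, which of the two tied subjects is returned for student 1 is arbitrary unless you add a tie-breaker to the ORDER BY.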
Edit in response to your comment:
An extension of the above method:
SELECT * FROM
(
SELECT *, ROW_NUMBER() OVER(PARTITION BY st ORDER BY c DESC) rn FROM
(
SELECT studentid st, subjectid su, count(*) c
FROM marks
GROUP BY studentid, subjectid
) a
) b
INNER JOIN students stu on b.st = stu.studentnumber
INNER JOIN subjects sub on b.su = sub.subjectnumber
WHERE
b.rn = 1
We count the marks by student/subject, then row-number those counts per student in descending order of count, then choose only the first row for each student and join in the other wanted data
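As a rough check of this approach, the sketch below runs the count-then-number query against the example from the question (4 marks in subject1, 3 in subject2), using Python's sqlite3 module instead of SQL Server. The surnames and the second student are invented for illustration, and the row number is partitioned by student only, so that each student keeps just the subject with the highest count.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE students (StudentNumber INTEGER PRIMARY KEY, Surname TEXT);
CREATE TABLE subjects (SubjectNumber INTEGER PRIMARY KEY, SubjectName TEXT);
CREATE TABLE marks (StudentId INTEGER, SubjectId INTEGER, Mark INTEGER);
INSERT INTO students VALUES (1, 'Adams'), (2, 'Brown');  -- hypothetical names
INSERT INTO subjects VALUES (1, 'subject1'), (2, 'subject2');
-- Adams: marks 1,5,3,4 in subject1 and 5,4,5 in subject2 (from the question)
INSERT INTO marks VALUES (1,1,1),(1,1,5),(1,1,3),(1,1,4),(1,2,5),(1,2,4),(1,2,5);
-- Brown: one mark in subject1, two in subject2 (made up)
INSERT INTO marks VALUES (2,1,3),(2,2,4),(2,2,2);
""")

rows = conn.execute("""
    SELECT stu.Surname, sub.SubjectName, b.c AS Marks_count
    FROM (
        -- number the per-subject counts within each student, largest first
        SELECT a.*, ROW_NUMBER() OVER (PARTITION BY st ORDER BY c DESC) AS rn
        FROM (
            SELECT StudentId AS st, SubjectId AS su, COUNT(*) AS c
            FROM marks
            GROUP BY StudentId, SubjectId
        ) a
    ) b
    INNER JOIN students stu ON b.st = stu.StudentNumber
    INNER JOIN subjects sub ON b.su = sub.SubjectNumber
    WHERE b.rn = 1
    ORDER BY stu.Surname
""").fetchall()

print(rows)
```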

OK, thanks to Caius Jard, another Stack Overflow question and a bit of experimenting, I managed to write a working query, so this is how I did it.
First I created a view from query 1 and added one more column to it: the student Id.
Then I wrote a query which almost satisfied me. That question helped me a lot with that task: Question
SELECT marks.Surname,
marks.SubjectName,
marks.Marks_count,
ROW_NUMBER() OVER(PARTITION BY marks.Surname ORDER BY marks.Surname) as RowNum
FROM MarksAmountPerStudentAndSubject marks
INNER JOIN (SELECT MarksAmountPerStudentAndSubject.Id,
MAX(MarksAmountPerStudentAndSubject.Marks_count) as MaxAmount
FROM MarksAmountPerStudentAndSubject
GROUP BY MarksAmountPerStudentAndSubject.Id) m
ON m.Id = marks.Id and marks.Marks_count = m.MaxAmount
It gives the following results
That's what I wanted to achieve, with one exception: if a student has the same number of marks in multiple subjects, it displays all of them. That's fine, but I decided to restrict this to the first result for each student. I couldn't simply put TOP(1) there, so I used a solution similar to the one Caius Jard showed, ROW_NUMBER with a window function, which let me choose the records whose row number equals 1.
I created another view from this query, and then I could simply write the final one
SELECT marks.Surname, marks.SubjectName, marks.Marks_count
FROM StudentsMaxMarksAmount marks
WHERE marks.RowNum = 1
ORDER BY marks.Surname
With result

Related

How to add this condition to this query?

I'm working on this query:
SELECT s.studentname,
Avg(cs.exam_season_one
+ cs.exam_season_two
+ cs.degree_season_one
+ cs.degree_season_two ) / 4 AS average
FROM courses_student cs
join students s
ON s.student_id = cs.student_id
join SECTION se
ON s.sectionid = se.sectionid
WHERE cs.courses_id = 1
AND ( se.classes_id = 2
OR se.classes_id = 5 )
AND s.studentname LIKE 'm%'
GROUP BY s.studentname
Up to this point everything works perfectly, but I need to add one last condition and I don't know how.
I need to get the students with the same average,
I mean only students with count(average) > 1 (I don't know if this is right).
Does anyone know how to solve this problem in this query?
PS: I use Oracle.
Edit:
The create tables statements and sample data are here:
http://sqlfiddle.com/#!4/ebf636
Let me try to explain the problem better, because maybe I did it the wrong way the first time!
First, the average is the average of 4 columns.
The output I expect is the names of the students who belong to class 2 or class 5 (classes_id = 2, classes_id = 5) and whose name starts with M.
I want to check their average in a specific course (courses_id = 1),
and the last condition I'm asking about is that I want to get only the students who share the same average in this course.
For example:
if we have 4 students and the averages in the course are (60,70,80,80) then I want to get only the last 2 student names because they have the same average. Hope it's clear now!
You appear to be asking for the students where the average of the averages of their exam and degree seasons 1 and 2 are greater than 1 for certain sections.
Without sample data and expected output to validate against, it's difficult to answer. However, for each student you want to GROUP BY the primary key that uniquely identifies the student (otherwise you may aggregate two students with the same name together). You also only need to check that a matching section exists, rather than JOINing the section, as a join would create duplicate rows if there are multiple matching sections and skew the averages.
Then, regarding:
"if we have 4 students and the averages in the course are (60,70,80,80) then I want to get only the last 2 student names because they have the same average"
You want to count how many students have the same average and then filter out those with unique averages:
SELECT studentname,
average
FROM (
SELECT a.*,
COUNT(*) OVER (PARTITION BY average) AS num_with_average
FROM (
SELECT MAX(s.studentname) AS studentname,
( Avg(cs.exam_season_one)
+ Avg(cs.exam_season_two)
+ Avg(cs.degree_season_one)
+ Avg(cs.degree_season_two) ) / 4 AS average
FROM courses_student cs
join students s
ON s.student_id = cs.student_id
WHERE cs.courses_id = 1
AND s.studentname LIKE 'm%'
AND EXISTS(
SELECT 1
FROM section se
WHERE s.sectionid = se.sectionid
AND se.classes_id IN (2, 5)
)
GROUP BY
s.student_id
) a
)
WHERE num_with_average > 1;
db<>fiddle here
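The filtering step can be tried in isolation. The sketch below uses Python's sqlite3 module rather than Oracle, and a hypothetical student_average table that stands in for the inner aggregated query; the names are invented and the averages are the (60, 70, 80, 80) example from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Stand-in for the per-student averages produced by the inner query.
CREATE TABLE student_average (studentname TEXT, average REAL);
INSERT INTO student_average VALUES
    ('mark', 60), ('mary', 70), ('mia', 80), ('mona', 80);
""")

rows = conn.execute("""
    SELECT studentname, average
    FROM (
        -- count how many students share each average value
        SELECT a.*, COUNT(*) OVER (PARTITION BY average) AS num_with_average
        FROM student_average a
    )
    WHERE num_with_average > 1
    ORDER BY studentname
""").fetchall()

print(rows)
```

Only the two students sharing the average of 80 remain. Note that comparing averages for equality can be fragile if they are non-terminating decimals, so rounding before partitioning may be worth considering.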

Grouping multiple rows into one string postgres for each position in select

I have a table education that has a column university. For each of the rows in the table I want to find 3 most similar universities from the table.
Here is my query that finds 3 most similar universities to a given input:
select distinct(university),
similarity(unaccent(lower(university)),
unaccent(lower('Boston university')))
from education
order by similarity(unaccent(lower(university)),
unaccent(lower('Boston university'))) desc
limit 3;
It works fine. But now I would like to modify this query so that I get two columns and a row for each existing university in the table: the first column would be the university name and the second would be the three most similar universities found in the database (or, if it's easier, four columns where the first is the university and the next 3 are the most similar ones).
What should this statement look like?
You could use an inline aggregated query:
with u as (select distinct university from education)
select
university,
(
-- order and limit the candidates first, then aggregate the 3 survivors
select string_agg(s.university, ',' order by s.sim desc)
from (
select u.university,
similarity(
unaccent(lower(u.university)),
unaccent(lower(e.university))
) as sim
from u
where u.university != e.university
order by sim desc
limit 3
) s
) similar_universities
from education e
This assumes that a given university may occur more than once in the education table (hence the need for a cte).
I think a lateral join and aggregation does what you want:
select e.university, e2.universities
from education e cross join lateral
(select array_agg(university order by sim desc) as universities
from (select e2.university,
similarity(unaccent(lower(e2.university)),
unaccent(lower(e.university))
) as sim
from education e2
order by sim desc
limit 3
) e2
) e2;
Note: The most similar university is probably the one with the same name. (You can filter that out with a where clause in the subquery.)
This returns the value as an array, because I prefer working with arrays rather than strings in Postgres.
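The "for each row, take the top 3 by score" shape of both answers can be sketched outside the database. The Python example below is only an illustration of that logic: difflib.SequenceMatcher is a stand-in score, not pg_trgm's similarity(), and the university names are made-up sample values.

```python
from difflib import SequenceMatcher

# Hypothetical sample values for the university column.
universities = ["Boston University", "Boston Univ", "Boston College",
                "Harvard University", "MIT"]

def sim(a, b):
    # Stand-in for similarity(lower(a), lower(b)). This is NOT trigram
    # similarity, just a comparable 0..1 score for illustration.
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def top_similar(name, names, k=3):
    # Same steps as the lateral subquery: score every other distinct name,
    # order by score descending, keep the first k.
    others = [n for n in sorted(set(names)) if n != name]
    return sorted(others, key=lambda n: sim(n, name), reverse=True)[:k]

similar = {u: top_similar(u, universities) for u in universities}

for name, matches in similar.items():
    print(name, "->", matches)
```

Excluding the name itself corresponds to the where clause mentioned in the note above.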

SQL add multiple "Count" together

I'm trying to add the counts together and output the one with the max counts.
The question is: Display the person with the most medals (gold as place = 1, silver as place = 2, bronze as place = 3)
Add all the medals together and display the person with the most medals
Below is the code I have thought about (obviously doesn't work)
Any ideas?
Select cm.Givenname, cm.Familyname, count(*)
FROM Competitors cm JOIN Results re ON cm.competitornum = re.competitornum
WHERE re.place between '1' and '3'
group by cm.Givenname, cm.Familyname
having max (count(re.place = 1) + count(re.place = 2) + count(re.place = 3))
Sorry, I forgot to add that we're not allowed to use ORDER BY.
Some data in the table
Competitors Table
Competitornum GivenName Familyname gender Dateofbirth Countrycode
219153 Imri Daniel Male 1988-02-02 Aus
Results Table
Eventid Competitornum Place Lane Elapsedtime
SWM111 219153 1 2 20 02
From what you've described it sounds like you just need to take the "Top" individual in the total medal count. In order to do that you would write something like this.
Select top 1 cm.Givenname, cm.Familyname, count(*)
FROM Competitors cm JOIN Results re ON cm.competitornum = re.competitornum
WHERE re.place between '1' and '3'
group by cm.Givenname, cm.Familyname
order by count(*) desc
Without using order by you have a couple of other options though I'm glossing over whatever syntax peculiarities sqlfire may use.
You could determine the max medal count of any user and then only select competitors that have that count. You could do this by saving it out to a variable or using a subquery.
Select cm.Givenname, cm.Familyname, count(*)
FROM Competitors cm JOIN Results re ON cm.competitornum = re.competitornum
WHERE re.place between '1' and '3'
group by cm.Givenname, cm.Familyname
having count(*) = (
Select max(medal_count)
FROM (
Select count(*) as medal_count
FROM Competitors cm JOIN Results re ON cm.competitornum = re.competitornum
WHERE re.place between '1' and '3'
group by cm.Givenname, cm.Familyname
) medal_counts
)
Just a note here. This second method can be inefficient if the engine re-evaluates the subquery for every group. Because the subquery is not correlated with the outer query, many optimizers will compute it only once, but that is not guaranteed. If sqlfire supports it, you could instead calculate the max medal count ahead of time, store it in a variable and use that in the HAVING clause.
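Here is a small sketch of the ORDER BY-free approach, run with Python's sqlite3 module rather than sqlfire. The first competitor and result row come from the question's sample data; the remaining rows are invented. Because SQLite does not accept an aggregate nested directly inside another aggregate, the maximum is taken over a derived table of per-competitor counts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Competitors (Competitornum INTEGER PRIMARY KEY,
                          Givenname TEXT, Familyname TEXT);
CREATE TABLE Results (Eventid TEXT, Competitornum INTEGER, Place INTEGER);
INSERT INTO Competitors VALUES
    (219153, 'Imri', 'Daniel'),
    (100001, 'Ann', 'Lee');          -- hypothetical second competitor
INSERT INTO Results VALUES
    ('SWM111', 219153, 1),
    ('SWM112', 219153, 3),           -- hypothetical extra results
    ('SWM113', 219153, 5),
    ('SWM111', 100001, 2);
""")

rows = conn.execute("""
    SELECT cm.Givenname, cm.Familyname, COUNT(*) AS medals
    FROM Competitors cm
    JOIN Results re ON cm.Competitornum = re.Competitornum
    WHERE re.Place BETWEEN 1 AND 3
    GROUP BY cm.Competitornum, cm.Givenname, cm.Familyname
    HAVING COUNT(*) = (
        -- highest medal count of any competitor
        SELECT MAX(medals)
        FROM (SELECT COUNT(*) AS medals
              FROM Results
              WHERE Place BETWEEN 1 AND 3
              GROUP BY Competitornum) per_competitor
    )
""").fetchall()

print(rows)
```

If several competitors tie for the most medals, this form returns all of them, which a TOP 1 query would not.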
You are grouping by re.place, is that what you want? You want the results per ... ? :)
[edit] Good, now that's fixed you're almost there :)
The having is not needed in this case, you simply need to add a count(re.EventID) to your select and make a subquery out of it with a max(that_count_column).

SQL Select Statement

I think this is a pretty basic question and I have looked around on the site but I am not sure what to search on to find the answer.
I have an SQL table that looks like:
studentId period class
1 1 math
1 2 english
2 1 math
2 2 history
I am looking for a SELECT statement that finds the studentId that is taking math 1st period and english 2nd period. I have tried something like SELECT studentID WHERE ( period = 1 AND class= "math" ) AND ( period = 2 AND class = "english" ) but that has not worked.
I have also thought about changing my table to be:
studentId period1 period2 period3 period4 period5 etc
But I think I want to be adding things besides classes like after school activities and wanted to be able to expand easily without constantly having to add columns.
Thanks for any help you can give me.
try something like:
select studentid
from table
where ( period = 1 AND class = "math" ) or ( period = 2 AND class = "english" )
group by studentid
having count(*) >= 2
the idea is to select all who meet the first criteria or the second criteria, group it by person and see where all are met by checking the number of rows grouped
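A quick way to convince yourself of this is to run it on the sample table from the question. The sketch below uses Python's sqlite3 module; the table name schedule is an assumption (the question does not name the table), and string literals use single quotes, which is the portable SQL form.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- "schedule" is a hypothetical name for the table shown in the question.
CREATE TABLE schedule (studentId INTEGER, period INTEGER, class TEXT);
INSERT INTO schedule VALUES
    (1, 1, 'math'), (1, 2, 'english'),
    (2, 1, 'math'), (2, 2, 'history');
""")

rows = conn.execute("""
    SELECT studentId
    FROM schedule
    WHERE (period = 1 AND class = 'math')
       OR (period = 2 AND class = 'english')
    GROUP BY studentId
    HAVING COUNT(*) = 2      -- both criteria matched for this student
""").fetchall()

print(rows)
```

Student 2 matches only the first criterion, so its group has a single row and is filtered out. This relies on (studentId, period) being unique; otherwise duplicate rows could inflate the count.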
You can use subqueries to do each individually and get only results where both subqueries match.
Select StudentId FROM table WHERE
StudentId IN
(SELECT studentID FROM table WHERE ( period = 1 AND class= "math" ) )
AND
StudentId IN
(SELECT studentID FROM table WHERE ( period = 2 AND class= "english" ) )
Edit - added
I have not tested this myself, but I was curious about performance considerations, so I looked it up. I found this quote:
"Many Transact-SQL statements that include subqueries can be alternatively formulated as joins. Other questions can be posed only with subqueries. In Transact-SQL, there is usually no performance difference between a statement that includes a subquery and a semantically equivalent version that does not. However, in some cases where existence must be checked, a join yields better performance. Otherwise, the nested query must be processed for each result of the outer query to ensure elimination of duplicates. In such cases, a join approach would yield better results. The following is an example showing both a subquery SELECT and a join SELECT that return the same result set:"
here: http://technet.microsoft.com/en-us/library/ms189575.aspx
You could also do a self join
SELECT t1.studentID
FROM table t1
JOIN table t2 ON t1.studentID = t2.studentID
WHERE ( t1.period = 1 AND t1.class= "math" )
AND ( t2.period = 2 AND t2.class = "english" )
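For comparison, the sketch below runs the subquery form and the self-join form side by side with Python's sqlite3 module. The table name schedule and the third student are assumptions added for illustration. One detail worth noting: without DISTINCT, the IN version returns the matching student once per row that student has in the table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- "schedule" is a hypothetical table name; student 3 is made-up data.
CREATE TABLE schedule (studentId INTEGER, period INTEGER, class TEXT);
INSERT INTO schedule VALUES
    (1, 1, 'math'), (1, 2, 'english'),
    (2, 1, 'math'), (2, 2, 'history'),
    (3, 1, 'art'),  (3, 2, 'english');
""")

# Subquery form: the student must appear in both result sets.
in_rows = conn.execute("""
    SELECT DISTINCT studentId
    FROM schedule
    WHERE studentId IN (SELECT studentId FROM schedule
                        WHERE period = 1 AND class = 'math')
      AND studentId IN (SELECT studentId FROM schedule
                        WHERE period = 2 AND class = 'english')
""").fetchall()

# Self-join form: t1 is the period-1 row, t2 the period-2 row.
join_rows = conn.execute("""
    SELECT t1.studentId
    FROM schedule t1
    JOIN schedule t2 ON t1.studentId = t2.studentId
    WHERE t1.period = 1 AND t1.class = 'math'
      AND t2.period = 2 AND t2.class = 'english'
""").fetchall()

print(in_rows, join_rows)
```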

calculate rank in highscore from 2 tables

I have a trivia game and I want to reward users for 2 events:
1) answering correctly
2) sending a question to the questions pool
I want to query for the score and rank of a specific player, and I use this query:
SELECT (correct*10+sent*30) AS score, @rank:=@rank+1 AS rank
FROM ( trivia_players
JOIN ( SELECT COUNT(*) AS sent, senderid
FROM trivia_questions
WHERE senderid='$userid'
) a
ON trivia_players.userid=a.senderid
)
ORDER BY score DESC
It works if the player is in both tables, i.e. answered correctly AND sent a question,
but it doesn't work if a player hasn't sent a question.
Any idea how to fix this query? ($userid is the given parameter)
Thanks!
Thanks Tom! only problem is the ranks are not correct:
userid score rank
58217 380 1
12354 80 3
32324 0 2
I would probably do it like this:
SELECT
user_id,
score,
rank
FROM
(
SELECT
TP.user_id,
(TP.correct * 10) + (COUNT(TQ.sender_id) * 30) AS score,
@rank:=@rank + 1 AS rank
FROM
Trivia_Players TP
LEFT OUTER JOIN Trivia_Questions TQ ON
TQ.sender_id = TP.user_id
GROUP BY
TP.user_id,
TP.correct
ORDER BY
score DESC
) AS SQ
WHERE
SQ.user_id = $user_id
I don't use MySQL much, so the syntax may not be perfect. I think that you can use a subquery like this in MySQL. Assuming that MySQL handles COUNT() by only counting rows with a non-null value for the counted column, this should work.
The keys are that you do a COUNT over a non-null column from Trivia Questions so that it counts them up by the user and you need to use a subquery so that you can get ranks for everyone BEFORE constraining to a particular user id.
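The two key points can be sketched as follows, using Python's sqlite3 module instead of MySQL. Since SQLite has no user variables, the rank here is computed as 1 plus the number of players with a strictly higher score, which is a swapped-in technique, not the variable-based one above. The correct and sent counts are invented so that the scores match the 380 / 80 / 0 table from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE trivia_players (user_id INTEGER PRIMARY KEY, correct INTEGER);
CREATE TABLE trivia_questions (id INTEGER PRIMARY KEY, sender_id INTEGER);
-- Hypothetical data: only player 58217 has sent questions (10 of them).
INSERT INTO trivia_players VALUES (58217, 8), (12354, 8), (32324, 0);
INSERT INTO trivia_questions (sender_id) VALUES
    (58217), (58217), (58217), (58217), (58217),
    (58217), (58217), (58217), (58217), (58217);
""")

# LEFT JOIN keeps players with no questions; COUNT(tq.sender_id) skips the
# NULLs the join produces for them, so their "sent" count is 0.
SCORES = """
    SELECT tp.user_id,
           tp.correct * 10 + COUNT(tq.sender_id) * 30 AS score
    FROM trivia_players tp
    LEFT OUTER JOIN trivia_questions tq ON tq.sender_id = tp.user_id
    GROUP BY tp.user_id, tp.correct
"""

def score_and_rank(user_id):
    # The rank is computed over everyone's score; only then is the
    # result narrowed down to the requested user.
    sql = f"""
        WITH scores AS ({SCORES})
        SELECT s.user_id, s.score,
               1 + (SELECT COUNT(*) FROM scores o
                    WHERE o.score > s.score) AS player_rank
        FROM scores s
        WHERE s.user_id = ?
    """
    return conn.execute(sql, (user_id,)).fetchone()

ranks = [score_and_rank(u) for u in (58217, 12354, 32324)]
print(ranks)
```

With this definition, players with equal scores share the same rank.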
Have you tried using a RIGHT JOIN or LEFT JOIN? Just off the top of my head!