Limitations of GROUP BY

Limitations of GROUP BY - sql

Disclaimer: I'm an SQL newb and this is for a class, but I could really use a poke in the right direction.
I've got these three tables:
student(_sid_, sname, sex, age, year, gpa)
section(_dname_, _cno_, _sectno_, pname)
enroll(_sid_, grade, _dname_, _cno_, _sectno_)
(primary keys denoted by underscores)
I'm trying to write an Oracle-compatible SQL query that returns a table with the student's name (student.sname) that has the highest gpa in each section (that's including section.cno and section.sectno) as well as all the other attributes from section.
I've managed to use an aggregate query and GROUP BY to get the maximum GPA for each section:
SELECT MAX(s.gpa), e.cno, e.sectno
FROM enroll e,
student s
WHERE s.sid = e.sid
GROUP BY e.cno, e.sectno
Let alone the other section attributes, I can't even figure out how to tack on the student name (student.sname). If I add it to the SELECT clause, it has to be included in GROUP BY which messes up the rest of the query. If I use this entire query inside the WHERE or FROM clause of an outer query, I can only access the three fields in the table, which isn't that much use.
I know you can't give me the exact answer, but any hints would be appreciated!

Assuming Oracle 9i+, to get only one of the students with the highest GPA (in the event of ties) use:
WITH summary AS (
SELECT e.*,
s.name,
ROW_NUMBER() OVER(PARTITION BY e.cno, e.sectno
ORDER BY s.gpa DESC) AS rank
FROM ENROLL e
JOIN STUDENT s ON s.sid = e.sid)
SELECT s.*
FROM summary s
WHERE s.rank = 1
Non CTE equivalent:
SELECT s.*
FROM (SELECT e.*,
s.name,
ROW_NUMBER() OVER(PARTITION BY e.cno, e.sectno
ORDER BY s.gpa DESC) AS rank
FROM ENROLL e
JOIN STUDENT s ON s.sid = e.sid) s
WHERE s.rank = 1
If you want to see all students who tied for GPA, use:
WITH summary AS (
SELECT e.*,
s.name,
DENSE_RANK OVER(PARTITION BY e.cno, e.sectno
ORDER BY s.gpa DESC) AS rank
FROM ENROLL e
JOIN STUDENT s ON s.sid = e.sid)
SELECT s.*
FROM summary s
WHERE s.rank = 1

Hint: consider that there might be more than one student with the highest GPA in a class. An outer query only needs the three fields.

Here are some pointers :-
You are on the right track with your Group By query
This returns you the max GPA for each section based on the cno and sectno fields
Now, you have the Max GPA value for each cno and sectno combination.
Use this data that you have in like a reverse manner if u like to now find all the students matching these combination values.
HINT : Consider the results of your Group By query as a table and Use a INNER JOIN
Even If it is possible that there is more than 1 students for the same max GPA, you will still get them all
Hope this helps!!

This should give you what you are looking for. See Oracle's RANK() function for details on how the GPAs are ranked from highest to lowest by section.
Requirements:
Return a table with the student's name (student.sname) that has the highest gpa in each section (section.cno and section.sectno) as well as all the other attributes from section.
SELECT * FROM
(
SELECT
s.sname,
s.gpa,
sec.dname,
sec.cno,
sec.sectno,
sec.pname,
/* for each "sec.cno, sec.sectno", this will rank each GPA in order from highest to lowest. Ties will have the same rank. */
RANK() OVER(PARTITION BY sec.cno, sec.sectno ORDER BY s.gpa DESC) as r_rank
FROM
enroll e,
student s,
section sec
WHERE
/* join enroll with student */
s.sid = e.sid
/* join section with enroll */
AND sec.dname = e.dname
AND sec.cno = e.cno
AND sec.sectno = e.sectno
)
WHERE r_rank = 1 /* this returns only the highest GPA (maybe multiple students) for each "sec.cno, sec.sectno" combination */
;
Note: If you do not want ties, change RANK() to ROW_NUMBER()

Maybe shortest:
SELECT DISTINCT e.cno, e.sectno , e...,
FIRST_VALUE(s.sname) OVER
(PARTITION BY e.cno, e.sectno ORDER BY s.gpa DESC)
FROM enroll e,
student s
WHERE s.sid = e.sid
Or
SELECT A.*
FROM
( SELECT s._sid, s.sname, e.cno, e.sectno ,..., s.gpa
MAX(s.gpa) OVER (PARTITION BY e.cno, e.sectno) AS maxgpa
FROM enroll e,
student s
WHERE s.sid = e.sid
) A
WHERE A.maxgpa = A.gpa

Although this is answered long back but still I want to present a beautiful explanation , this is very useful for newbie. Also Introduction to SQL has such rules.

Related

'ALL' concept in SQL queries

Relational Schema:
Students (**sid**, name, age, major)
Courses (**cid**, name)
Enrollment (**sid**, **cid**, year, term, grade)
Write a SQL query that returns the name of the students who took all courses.I'm not sure how I capture the concept of 'ALL' in a SQL query.
EDIT:
I want to be able write it without aggregation as I want to use the same logic for writing the query in relational algebra as well.
Thanks for the help!

One way of writing such queries is to count the number of course and number of courses each student took, and compare them:
SELECT s.*
FROM students s
JOIN (SELECT sid, COUNT(DISTINCT cid) AS student_courses
FROM enrollment
GROUP BY sid) e ON s.sid = e.sid
JOIN (SELECT COUNT(*) AS cnt
FROM courses) c ON cnt = student_cursed

This gives course combinations that are possible but haven't been taken...
SELECT s.sid, c.cid FROM students CROSS JOIN courses
EXCEPT
SELECT sid, cid FROM enrollment
So, you can then do the same with the student list...
SELECT sid FROM students
EXCEPT
(
SELECT DISTINCT
sid
FROM
(
SELECT s.sid, c.cid FROM students CROSS JOIN courses
EXCEPT
SELECT sid, cid FROM enrollment
)
AS not_enrolled
)
AS slacker_students
I don't like it, but it avoids aggregation...

SELECT *
FROM Students
WHERE NOT EXISTS (
SELECT 1 FROM Courses
LEFT OUTER JOIN Enrollment ON Courses.cid = Enrollment.cid
AND Enrollment.sid = Students.sid
WHERE Enrollment.sid IS NULL
)
btw. names of tables should be in singular form, not plural

How to create a good request with "max(count(*))"?

I have to say who is the scientist who have been the most in mission. I tried this code but it wasn't successful:
select name
from scientist, mission
where mission.nums = chercheur.nums
having count(*) = (select max(count(numis)) from mission, scientist where
mission.nums = chercheur.nums
group by name)
I have done several modifications for this request but I only obtain errors (ora-0095 and ora-0096 if I remember correctly).
Also, I create my tables with:
CREATE TABLE Scientist
(NUMS NUMBER(8),
NAME VARCHAR2 (15),
CONSTRAINT CP_CHER PRIMARY KEY (NUMS));
CREATE TABLE MISSION
(NUMIS NUMBER(8),
Country VARCHAR2 (15),
NUMS NUMBER(8),
CONSTRAINT CP_MIS PRIMARY KEY (NUMIS),
CONSTRAINT CE_MIS FOREIGN KEY (NUMS) REFERENCES SCIENTIST (NUMC));

You could count the missions each scientist participated in, and wrap that query in a query with a window function that will rank them according to their participation:
SELECT name
FROM (SELECT name, RANK() OVER (PARTITION BY name ORDER BY cnt DESC) AS rk
FROM (SELECT name, COUNT(*) AS cnt
FROM scientist s
JOIN mission m ON s.nums = m.nums
GROUP BY name) t
) q
WHERE rk = 1

Step 0 : Format your code :-) It would make it much easier to visualize
Step 1 : Get the count of Numis by Nums in the Mission table. This will tell you how many missions were done by each Nums
This is done in the cte block cnt_by_nums
Next to get the name of the scientist by joining cnt_by_nums with scientist table.
After that you want to get only those scientists who have the cnt_by_missions as the max available value from cnt_by_num
with cnt_by_nums
as (select Nums,count(Numis) as cnt_missions
from mission
group by Nums
)
select a.Nums,max(b.Name) as name
from cnt_by_nums a
join scientist b
on a.Nums=b.Nums
group by a.Nums
having count(a.cnt_missions)=(select max(a1.cnt_missions) from cnt_by_nums a1)

I'd write a query like this:
SELECT NAME, COUNTER
FROM
(SELECT NAME, COUNT(*) AS COUNTER
FROM SCIENTIST S
LEFT JOIN MISSION M
ON S.NUMS=M.NUMS
GROUP BY NAME) NUM
INNER JOIN
(SELECT MAX(COUNTER) AS MAX_COUNTER FROM
(SELECT NAME, COUNT(*) AS COUNTER
FROM SCIENTIST S
LEFT JOIN MISSION M
ON S.NUMS=M.NUMS
GROUP BY NAME) C) MAX
ON NUM.COUNTER=MAX.MAX_COUNTER;
(it works on MYSQL, I hope it's the same in Oracle)

As you don't select the name of your scientist (only count their missions) you don't need to join those tables within the subquery. Grouping over the foreign key would be sufficient:
select count(numis) from mission group by nums
You column names are a bit weird but that's your choice ;-)
Selecting only the scientist with the most mission references could be achieved in two ways. One way would be your approach where you may get multiple scientists if they have the same max missions.
The first problem you have in your query is that you are checking an aggregation (HAVING COUNT(*) = ) without grouping. You are only grouping your subselect.
Second, you could not aggregate an aggregation (MAX(COUNT)) but you may select only the first row of that subselect ordered by it's size or select the max of it by subquerying the subquery.
Approach with only one line:
select s.name from scientist s, mission m
where m.nums = s.nums
group by name
having count(*) =
(select count(numis) from mission
group by nums
order by 1 desc
fetch first 1 row only)
Approach with double subquery:
select s.name from scientist s, mission m
where m.nums = s.nums
group by name having count(*) =
(select max(numis) from
(select count(numis) numis from mission group by nums)
)
Second approach would be doing the FETCH FIRST on yur final result but this would give you exactly 1 scientist even if there are multiple with the same max missions:
select s.name from scientist s, mission m
where m.nums = s.nums
group by name
order by count(*) desc
fetch first 1 row only
Doing a cartisian product is not state of the art but the optimizer would make it a "good join" with the given reference in the where clause.
Made these with IBM Db2 but should also work on Oracle.

If you want one row, then in Oracle 12+, you can do:
SELECT name, COUNT(*) AS cnt
FROM scientist s JOIN
mission m
ON s.nums = m.nums
GROUP BY name
ORDER BY COUNT(*) DESC
FETCH FIRST 1 ROW ONLY;
In earlier versions, you would generally use a subquery:
SELECT s.*
FROM (SELECT name, COUNT(*) AS cnt
FROM scientist s JOIN
mission m
ON s.nums = m.nums
GROUP BY name
ORDER BY COUNT(*) DESC
) sm
WHERE rownum = 1;
If you want ties, then generally window functions would be a simple solution:
SELECT s.*
FROM (SELECT name, COUNT(*) AS cnt,
RANK() OVER (ORDER BY COUNT(*) DESC) as seqnum
FROM scientist s JOIN
mission m
ON s.nums = m.nums
GROUP BY name
ORDER BY COUNT(*) DESC
) sm
WHERE seqnum = 1;

How to find the following SQL query?

there is a table in SQL database, called Players:
Players (ID, name, age, gender, score)
where ID is the primary key.
Now I want to write a query to find the following results:
For each age, find the name and age of the player(s) with the highest score among all players of this age.
I wrote the following query:
SELECT P.name, P.age
FROM Players P
WHERE P.score = (SELECT MAX(P2.score) FROM Players P2)
GROUP BY P.age, P.name
ORDER BY S.age
However, the result of the above query is a list of players with the highest score among ALL players across all ages, not for EACH age.
Then I changed my query to the following:
SELECT P.name, P.age, MAX(P.score)
FROM Players P
GROUP BY P.age, P.name
ORDER BY P.age
However, the second query I wrote gives a list of players with each age, but for each age, there are not only the players with the highest score, but also other players with lower scores within this age group.
How should I fix my logic/query code?
Thank you!

You can use rank to do this.
select name, age
from (
SELECT *,
rank() over(partition by age order by score desc) rnk
FROM Players) t
where rnk = 1

Your original query is quite close. You just need to change the subquery to be a correlated subquery and remove the GROUP BY clause:
SELECT P.name, P.age
FROM Players P
WHERE P.score = (SELECT MAX(P2.score) FROM Players P2 WHERE p2.age = p.age)
ORDER BY P.age;
The analytic ranking functions are another very viable method for processing this question. Both methods can take advantage of an index on Players(age, score). This also wants an index on Players(score). With that index, this should have better performance on large data sets.

You can try it also.
SELECT p.name, p.age, p.score
FROM players p
INNER JOIN
(SELECT `age`, MAX(`score`) AS Maxscore
FROM players
GROUP BY `age`) pp
ON p.`age` = pp.`age`
AND p.`score` = pp.Maxscore;

Try it this will resolve your issue :
select p1.name,p1.age,p1.score from players p1 where p1.score =
(SELECT max(score) from players where age = p1.age) group by p1.age;
If you will required all records having same maximum score :
Then you will use this. I have tested both the query on my localhost.
SELECT p1.name,p1.age,p1.score FROM players p1
WHERE p1.score IN (SELECT MAX(score) FROM players GROUP BY age)

Writing two queries (sort students and cities according to average grades)

I have three tables:
1) Students: studentID (KEY), name, surname, address
2) Exams: examID (KEY), examName
3) Grades: studenID (KEY), examID(KEY), grade
How to write SQL query to show the best students (for example those with average grade above 9)?
How to write SQL query to rank Cities (column address) according to the average grade of their students?
I'm a system engineer, working with Unix and Linux systems and I am new in SQL, I only know about SQL basics, and I was trying to do this for past three days, with no success, so please help me. I presume it's not a complex thing for one who's experienced in SQL. Thanks a lot.

your first query to show the best students :
SELECT student.surname, students.surename, students.address
FROM
students INNER JOIN Grades ON Grades.StudentID=Students.StudentID
INNER JOIN Exams ON Grades.examID=exams.examID WHERE Grades.grade=
(SELECT MAX(grade) FROM Grades WHERE examID=exams.examID)
your second query to rank cities:
SELECT students.address
FROM
students INNER JOIN Grades ON Grades.StudentID=Students.StudentID
INNER JOIN Exams ON Grades.examID=exams.examID order by grades.grade DESC

Refer the fiddle here:
LINK 1 : http://sqlfiddle.com/#!4/ab4de6/19
LINK 2 : http://sqlfiddle.com/#!4/ab4de6/32
Below Queries should help you in Oracle:
--List of Students having Average grade >=9
SELECT S.studentID, S.NAME, S.SURNAME, S.ADDRESS, A.AVG_GRADE FROM
STUDENTS S JOIN
(
SELECT studentID, AVG(GRADE) AVG_GRADE FROM GRADES
GROUP BY studentID
) A
ON S.studentID = A.studentID
AND A.AVG_GRADE >=9
ORDER BY A.AVG_GRADE, S.studentID;
--------------------------------------------------------------------
--- Rank cities
SELECT A.ADDRESS, A.AVG_GRADE, ROWNUM RANKING FROM
(
SELECT S.ADDRESS, AVG(G.GRADE) AVG_GRADE FROM
STUDENTS S JOIN GRADES G
ON S.STUDENTID = G.STUDENTID
GROUP BY S.ADDRESS
ORDER BY 2 DESC
) A;
You need to know about the following concepts.
INNER QUERY / SUB QUERY
JOINS
AGGREGATE FUNCTIONS (Average Calculations)
GROUP BY
ORDER BY
ROWNUM

Adding 0 or null to missing fields

I have the following tables:
student(sid, sname)
teacher(tid, tname)
enrollment(sid, cid, tid)
course(cid, course)
rank(sid, tid, grade, date, valid)
I need to calculate the average grade of all the teachers (data is in rank table), when only the grade from the most recent date counts (and if it's invalid - ignore it).
I wrote the following query, and it's working nice. The problem is that I also need the average for ALL the teachers, including those who were not ranked yet/their rank is invalid (their average grade will be 0 in that case, and I'll have to count their students like I did for the others).
I think it's something with LEFT OUTER JOIN, but all the examples I see online have only two tables in FROM, and I can't figure out the right syntax in my case.
SELECT teacher.tid,
tname,
AVG(grade) AS avgGrade,
COUNT(DISTINCT enrollment.sid) AS studCount
FROM rank,
teacher,
enrollment,
( SELECT rank.sid, rank.tid, MAX(date) AS maxDate
FROM rank
GROUP BY sid, tid
) lastGrades
WHERE teacher.tid=enrollment.tid
AND rank.tid=teacher.tid
AND rank.tid=lastGrades.tid
AND rank.sid=lastGrades.sid
AND rank.date=lastGrades.maxDate
AND valid = TRUE
GROUP BY teacher.tid, tname

You could use a subquer to look up the latest rank per (teacher, student) combination. Use a left join to count enrollments that have not been ranked:
select t.tid
, t.tname
, avg(r.grade) as AverageRank
, count(distinct e.sid) as StudentCount
from teacher t
join enrollment e
on t.tid = e.tid
left join
rank r
on r.tid = t.tid
and r.sid = e.sid
and r.valid = true
and r.date =
(
select max(date)
from rank r2
where r2.sid = r.sid
and r2.tid = r.tid
and r2.valid = true
)
group by
t.tid
, t.tname
Example without data at SQL Fiddle.
The table design is kind of strange. You'd expect a student to enroll in a course, not in a teacher!

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas

Limitations of GROUP BY - sql

Hint: consider that there might be more than one student with the highest GPA in a class. An outer query only needs the three fields.

Although this is answered long back but still I want to present a beautiful explanation , this is very useful for newbie. Also Introduction to SQL has such rules.

Related

'ALL' concept in SQL queries

How to create a good request with "max(count(*))"?

How to find the following SQL query?

Writing two queries (sort students and cities according to average grades)

Adding 0 or null to missing fields

Categories

Resources