SQL What do I need to group by? - sql

I am trying to get a better understanding of group by and count in SQL and tried to find the student who has been studying for the second longest time.
I need to also group by s.semester for it to work, just group by s.name alone (which is what I had done initially) does not work - why is this? I know this is right, but am trying to understand why for future practice questions.
select s.name
from students s
group by s.name, s.semester
having 1 = (select count (gold.name)
from students gold
where gold.name <> s.name
and gold.semester > s.semester
)
Thanks in advance!

To solve this task, it is not enough to use grouping. You need to use ranking functions.
You will receive the rank of each subsequent student or students, based on the number of semesters.
You can get students by rank using "where".
select
*
from(
select
row_number() over (partition by name_students, semestr order by semestr_count
asc) r_n,
name_students,
semestr
from (
select name_students, semestr, count(semestr_count) from studies
group by name_students, semestr
) agregate_studies
where r_n = 1
Using r_n for get some students top _ n : 2..3

Related

How to Rank Based on Multiple Columns

I'm trying to score people in Microsoft Access based on the count they have for a particular category.
There are 7 possible categories a person can have against them, and I want to assigned each person a score from 1-7, with 1 being assigned to the highest scoring category, 7 being the lowest. They might not have an answer for every category, in which case that category can be ignored.
The aim would be to have an output result as shown in this image:
I've tried a few different things, including partition over and joins, but none have worked. To be honest I think I'm way off the mark with the queries I've been trying. I've tried to write the code in SQL from scratch, and used query builder.
Any help is really appreciated!
As you for an email can have duplicated counts, you will need two subqueries for this:
SELECT
Score.email,
Score.category,
Score.[Count],
(Select Count(*) From Score As T Where
T.email = Score.email And
T.[Count] >= Score.[Count])-
(Select Count(*) From Score As S Where
S.email = Score.email And
S.[Count] = Score.[Count] And
S.category > Score.category) AS Rank
FROM
Score
ORDER BY
Score.email,
Score.[Count] DESC,
Score.category;
For categories with equal Count values for the same email, the following will rank the records alphabetically descending by Category name (since this is what is shown in your example):
select t.email, t.category, t.count,
(
select count(*) from YourTable u
where t.email = u.email and
((t.count = u.count and t.category <= u.category) or t.count < u.count)
) as rank
from YourTable t
order by t.email, t.count desc, t.category desc
Change both references of YourTable to the name of your table.

Sql max trophy count

I Create DataBase in SQL about Basketball. Teacher give me the task, I need print out basketball players from my database with the max trophy count. So, I wrote this little bit of code:
select surname ,count(player_id) as trophy_count
from dbo.Players p
left join Trophies t on player_id=p.id
group by p.surname
and SQL gave me this:
but I want, that SQL will print only this:
I read info about select in selects, but I don't know how it works, I tried but it doesn't work.
Use TOP:
SELECT TOP 1 surname, COUNT(player_id) AS trophy_count -- or TOP 1 WITH TIES
FROM dbo.Players p
LEFT JOIN Trophies t
ON t.player_id = p.id
GROUP BY p.surname
ORDER BY COUNT(player_id) DESC;
If you want to get all ties for the highest count, then use SELECT TOP 1 WITH TIES.
;WITH CTE AS
(
select surname ,count(player_id) as trophy_count
from dbo.Players p
group by p.surname;
)
select *
from CTE
where trophy_count = (select max(trophy_count) from CTE)
While select top with ties works (and is probably more efficient) I would say this code is probably more useful in the real world as it could be used to find the max, min or specific trophy count if needed with a very simple modification of the code.
This is basically getting your group by first, then allowing you to specify what results you want back. In this instance you can use
max(trophy_count) - get the maximum
min(trophy_count) - get the minimum
# i.e. - where trophy_count = 3 - to get a specific trophy count
avg(trophy_count) - get the average trophy_count
There are many others. Google "SQL Aggregate functions"
You will eventually go down the rabbit hole of needing to subsection this (examples are by week or by league). Then you are going to want to use windows functions with a cte or subquery)
For your example:
;with cte_base as
(
-- Set your detail here (this step is only needed if you are looking at aggregates)
select surname,Count(*) Ct
left join Trophies t on player_id=p.id
group by p.surname
, cte_ranked as
-- Dense_rank is chosen because of ties
-- Add to the partition to break out your detail like by league, surname
(
select *
, dr = DENSE_RANK() over (partition by surname order by Ct desc)
from cte_base
)
select *
from cte_ranked
where dr = 1 -- Bring back only the #1 of each partition
This is by far overkill but helping you lay the foundation to handle much more complicated queries. Tim Biegeleisen's answer is more than adequate to answer you question.

SQL Query: Find the name of the company that has been assigned the highest number of patents

Using this query I can find the Company Assignee number for company with most patents but I can't seem to print the company name.
SELECT count(*), patent.assignee
FROM Patent
GROUP BY patent.assignee
HAVING count(*) =
(SELECT max(count(*))
FROM Patent
Group by patent.assignee);
COUNT(*) --- ASSIGNEE
9 19715
9 27895
Nesting above query into
SELECT company.compname
FROM company
WHERE ( company.assignee = ( *above query* ) );
would give an error "too many values" since there are two companies with most patents but above query takes only one assignee number in the WHERE clause. How do I solve this problem? I need to print name of BOTH companies with assignee number 19715 and 27895. Thank you.
You have started down the path of using nested queries. All you need to do is remove COUNT(*):
SELECT company.compname
FROM company
WHERE company.assignee IN
(SELECT patent.assignee
FROM Patent
GROUP BY patent.assignee
HAVING count(*) = (SELECT max(count(*))
FROM Patent
GROUP BY patent.assignee
)
);
I wouldn't write the query this way. The use of max(count(*)) is particularly jarring, but it is valid Oracle syntax.
Applying an aggregate function on another aggregate function (like max(count(*))) is illegal in many databases but I believe using the ALL operator instead and a join to get the company name would solve your problem.
Try this:
SELECT COUNT(*), p.assignee, c.compname
FROM Patent p
JOIN Company c ON c.assignee = p.assignee
GROUP BY p.assignee, c.compname
HAVING COUNT(*) >= ALL -- this predicate will return those rows
( -- for which the comparison holds true
SELECT COUNT(*) -- for all instances.
FROM Patent -- it can only be true for the highest count
GROUP BY assignee
);
Assuming you have Oracle, I thought about this a bit differently:
select
c.compname
from
company c
join
(
select
assignee,
dense_rank() over (order by count(1) desc) rnk
from
patent
group by
assignee
) p
on p.assignee = c.assignee
where
p.rnk = 1
;
I like this because is lets you find the any rank. For example, if you want the top 3 you would just change p.rnk = 1 to p.rnk <= 3. If you want 10th place, you just change it to p.rnk = 10. Adding the total count and rank into the results would be easy from here too. Overall I think it's more versatile.

Trying to figure out how to join these queries

I have a table named grades. A column named Students, Practical, Written. I am trying to figure out the top 5 students by total score on the test. Here are the queries that I have not sure how to join them correctly. I am using oracle 11g.
This get's me the total sums from each student:
SELECT Student, Practical, Written, (Practical+Written) AS SumColumn
FROM Grades;
This gets the top 5 students:
SELECT Student
FROM ( SELECT Student,
, DENSE_RANK() OVER (ORDER BY Score DESC) as Score_dr
FROM Grades )
WHERE Student_dr <= 5
order by Student_dr;
The approach I prefer is data-centric, rather than row-position centric:
SELECT g.Student, g.Practical, g.Written, (g.Practical+g.Written) AS SumColumn
FROM Grades g
LEFT JOIN Grades g2 on g2.Practical+g2.Written > g.Practical+g.Written
GROUP BY g.Student, g.Practical, g.Written, (g.Practical+g.Written) AS SumColumn
HAVING COUNT(*) < 5
ORDER BY g.Practical+g.Written DESC
This works by joining with all students that have greater scores, then using a HAVING clause to filter out those that have less than 5 with a greater score - giving you the top 5.
The left join is needed to return the top scorer(s), which have no other students with greater scores to join to.
Ties are all returned, leading to more than 5 rows in the case of a tie for 5th.
By not using row position logic, which varies from darabase to database, this query is also completely portable.
Note that the ORDER BY is optional.
With Oracle's PLSQL you can do:
SELECT score.Student, Practical, Written, (Practical+Written) as SumColumn
FROM ( SELECT Student, DENSE_RANK() OVER (ORDER BY Score DESC) as Score_dr
FROM VOTES ) as score, students
WHERE score.score_dr <= 5
and score.Student = students.Student
order by score.Score_dr;
You can easily include the projection of the first query in the sub-query of the second.
SELECT Student
, Practical
, Written
, tot_score
FROM (
SELECT Student
, Practical
, Written
, (Practical+Written) AS tot_score
, DENSE_RANK() OVER (ORDER BY (Practical+Written) DESC) as Score_dr
FROM Grades
)
WHERE Student_dr <= 5
order by Student_dr;
One virtue of analytic functions is that we can just use them in any query. This distinguishes them from aggregate functions, where we need to include all non-aggregate columns in the GROUP BY clause (at least with Oracle).

SQL Finding maximum value without top command

Let's say I have a bases with a table:
-courses (key: name [ofthecourse], other attributes: year in which the course takes place)
I want to complete a query looking for an answer to the question:
On which year of study there is a maximum number of courses?
Normally, the query would be:
SELECT TOP 1 STUDYEAR
FROM COURSES
GROUP BY STUDYEAR
ORDER BY COUNT(CNO) DESC;
But my question is, which query could complete this without using the TOP 1 phrase?
You can use an inner query to get the maximum count. The only difference is though that it can return more than one record if they have the same count.
SELECT STUDYEAR
FROM COURSES
GROUP BY STUDYEAR
HAVING COUNT(CNO) = (SELECT MAX(CNOCount) FROM
(SELECT COUNT(CNO) CNOCount
FROM COURSES
GROUP BY STUDYEAR) X)
Another version with only one inner query:
SELECT STUDYEAR
FROM
(SELECT STUDYEAR, ROW_NUMBER() OVER (ORDER BY COUNT(CNO) DESC) RowNumber
FROM COURSES
GROUP BY STUDYEAR) X
WHERE RowNumber = 1