I have read answers to similar questions but I cannot find a solution to my particular problem.
I will use a simple example to demonstrate my question.
I have a table called 'Prizes' with two columns: Employee and Award.
The Employee column holds the employee's ID, and the Award column holds a single award won by that employee. If an employee has won multiple awards, their ID appears in multiple rows of the table, once for each award.
The table would look as follows:
Employee AWARD
1 Best dressed
1 Most attractive
2 Biggest time waster
1 Most talkative
3 Hardest worker
4 Most shady
3 Most positive
3 Heaviest drinker
2 Most facebook friends
Using this table, how would I select the IDs of the employees who won the most awards?
The output should be:
Employee
1
3
In this example, both of these employees won 3 awards.
Currently, the query below outputs the employee ID along with the number of awards they have won in descending order:
SELECT employee,COUNT(*) AS num_awards
FROM prizes
GROUP BY employee
ORDER BY num_awards DESC;
Would output:
employee num_awards
1 3
3 3
2 2
4 1
How could I change my query to select the employee(s) with the most awards?
A simple way to express this is using rank() or dense_rank():
SELECT p.*
FROM (SELECT employee, COUNT(*) AS num_awards,
RANK() OVER (ORDER BY COUNT(*) DESC) as seqnum
FROM prizes
GROUP BY employee
) p
WHERE seqnum = 1;
Being able to combine aggregation functions and analytic functions can make these queries much more concise.
You can use dense_rank to get all the rows with highest counts.
with cnts as (
SELECT employee, count(*) cnt
FROM prizes
GROUP BY employee)
, ranks as (select employee, cnt, dense_rank() over(order by cnt desc) rnk
from cnts)
select employee, cnt
from ranks where rnk = 1
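The dense_rank approach above can be tried end to end with a small sketch. This one loads the question's sample rows into an in-memory SQLite database through Python's sqlite3 module; SQLite (3.25 or later, for window functions) and the helper name top_award_winners are assumptions made for illustration, not part of the original answers.

```python
import sqlite3

# Sample rows copied from the question's Prizes table.
ROWS = [
    (1, "Best dressed"), (1, "Most attractive"), (2, "Biggest time waster"),
    (1, "Most talkative"), (3, "Hardest worker"), (4, "Most shady"),
    (3, "Most positive"), (3, "Heaviest drinker"), (2, "Most facebook friends"),
]


def top_award_winners(rows):
    """Return the employee IDs tied for the highest award count."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE prizes (employee INTEGER, award TEXT)")
    con.executemany("INSERT INTO prizes VALUES (?, ?)", rows)
    query = """
        WITH cnts AS (
            SELECT employee, COUNT(*) AS cnt
            FROM prizes
            GROUP BY employee
        ),
        ranks AS (
            SELECT employee, cnt,
                   DENSE_RANK() OVER (ORDER BY cnt DESC) AS rnk
            FROM cnts
        )
        SELECT employee FROM ranks WHERE rnk = 1 ORDER BY employee
    """
    result = [row[0] for row in con.execute(query)]
    con.close()
    return result


print(top_award_winners(ROWS))  # [1, 3]
```

Because only rows with rank 1 are kept, RANK() and DENSE_RANK() return the same employees here; the two functions differ only in how they number the rows after a tie.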
Suppose I have the following applicant data for jobs in a company:
id position salary
——————————————————————
0 senior 20000
1 senior 15000
2 associate 10000
The budget is 40000 and the preference is to hire seniors first. Which PostgreSQL constructs should I use to get the following result for the number of hires?
seniors associates
———————————————————
2 0
Any directions would be appreciated.
Here is a starting sqlfiddle: http://sqlfiddle.com/#!17/2cef4/1
Using PostgreSQL filters and window functions, I was able to come up with a query that produced the result.
select
count(*) filter(where s.position = 'senior') as seniors,
count(*) filter(where s.position = 'associate') as associates
from (
select
position,
sum(salary) over(order by position desc rows between unbounded preceding and current row) as salary
from
candidates
) as s
where s.salary <= 40000;
Example: http://sqlfiddle.com/#!17/2cef4/10
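The running-total idea can be checked with a minimal sketch that uses SQLite through Python's sqlite3 module instead of PostgreSQL. Two things are swapped in as assumptions: SUM(CASE ...) replaces count(*) filter(...) so the query also runs on older SQLite builds, and id is added to the window ORDER BY so that the order among candidates in the same position is deterministic. The function name hires_within_budget is hypothetical.

```python
import sqlite3

# Applicant rows from the question: (id, position, salary).
CANDIDATES = [(0, "senior", 20000), (1, "senior", 15000), (2, "associate", 10000)]


def hires_within_budget(candidates, budget):
    """Return (seniors, associates) hired while the running salary total fits the budget."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE candidates (id INTEGER, position TEXT, salary INTEGER)")
    con.executemany("INSERT INTO candidates VALUES (?, ?, ?)", candidates)
    query = """
        SELECT
            COALESCE(SUM(CASE WHEN s.position = 'senior' THEN 1 ELSE 0 END), 0),
            COALESCE(SUM(CASE WHEN s.position = 'associate' THEN 1 ELSE 0 END), 0)
        FROM (
            SELECT position,
                   -- 'senior' sorts after 'associate', so DESC puts seniors first
                   SUM(salary) OVER (
                       ORDER BY position DESC, id
                       ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
                   ) AS running_salary
            FROM candidates
        ) AS s
        WHERE s.running_salary <= ?
    """
    row = con.execute(query, (budget,)).fetchone()
    con.close()
    return row


print(hires_within_budget(CANDIDATES, 40000))  # (2, 0)
```

The running totals are 20000, 35000 and 45000, so the associate is the first row to exceed the budget of 40000.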
I have this employees table.
I need to write an SQL query that will bring me the max salary of an employee and the second highest salary of an employee by city id (id is linked to a table with cities. there are 5 cities.)
My query looks like this:
select MAX([dbo.Employees].Salary) as Salary from [dbo.Employees]
where [dbo.Employees].Salary not in(select MAX([dbo.Employees].Salary) from [dbo.Employees])
UNION select MAX([dbo.Employees].Salary) from [dbo.Employees] group by [dbo.Employees].Id
I try to select the highest salary and then the highest excluding it. The query is supposed to return 7 values overall, but it returns only 5. (There are 5 cities, but the 5th is not in use, so 4 cities remain. Three of them have 2 employees each, which should give 6 values, and one city has only 1 employee, which gives the only possible value: 7 overall.)
Another problem is that I don't know how to make it return 2 columns, one for the city id and one for the salaries themselves, because I get an error about the GROUP BY.
You can use ROW_NUMBER() to create a sequence for each city, with the highest salary getting value 1, second highest getting value 2, etc. Then you just need a WHERE clause.
WITH
ranked AS
(
SELECT
*,
ROW_NUMBER() OVER (PARTITION BY id ORDER BY Salary DESC) AS city_salary_id
FROM
dbo.Employees
)
SELECT
*
FROM
ranked
WHERE
city_salary_id IN (1, 2)
If multiple people are tied with the same salary, it will arbitrarily pick the order for you, and always return (at most) 2 employees per city.
Adding a column to the ORDER BY lets you be more specific about how to deal with ties. For example, ORDER BY Salary DESC, id ASC will prioritise the lowest id in the event of a tie (the tiebreaker should be a column that varies within a city, such as an employee key).
Changing to RANK() will give tied Salaries the same rank, and so will return more than two employees if there are ties.
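The difference between ROW_NUMBER() and RANK() on ties can be seen in a small sketch. It runs on SQLite through Python's sqlite3 module instead of SQL Server, and the rows, the name column and the helper top_two_per_city are all hypothetical; id is the city id, as in the question.

```python
import sqlite3

# Hypothetical rows: (name, id, salary), where id is the city id.
EMPLOYEES = [
    ("Ann", 1, 5000), ("Bob", 1, 4000), ("Cid", 1, 4000),
    ("Dee", 2, 7000), ("Eli", 2, 6000),
    ("Fay", 3, 3000),
]

# Fixed window expressions, so no user input is ever formatted into the SQL.
WINDOWS = {
    "row_number": "ROW_NUMBER() OVER (PARTITION BY id ORDER BY salary DESC, name ASC)",
    "rank": "RANK() OVER (PARTITION BY id ORDER BY salary DESC)",
}


def top_two_per_city(rows, method):
    """Return the names whose per-city position is 1 or 2 under the given method."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE employees (name TEXT, id INTEGER, salary INTEGER)")
    con.executemany("INSERT INTO employees VALUES (?, ?, ?)", rows)
    query = f"""
        WITH ranked AS (
            SELECT name, id, salary, {WINDOWS[method]} AS city_salary_id
            FROM employees
        )
        SELECT name FROM ranked
        WHERE city_salary_id IN (1, 2)
        ORDER BY id, salary DESC, name
    """
    names = [row[0] for row in con.execute(query)]
    con.close()
    return names


print(top_two_per_city(EMPLOYEES, "row_number"))  # ['Ann', 'Bob', 'Dee', 'Eli', 'Fay']
print(top_two_per_city(EMPLOYEES, "rank"))        # ['Ann', 'Bob', 'Cid', 'Dee', 'Eli', 'Fay']
```

Bob and Cid share the second-highest salary in city 1, so RANK() keeps both while ROW_NUMBER() with the name tiebreaker keeps only Bob. The city with a single employee contributes one row in both cases.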
Tables:
Department (dept_id,dept_name)
Students(student_id,student_name,dept_id)
I am using Oracle. I have to print the name of that department that has the minimum no. of students. Since I am new to SQL, I am stuck on this problem. So far, I have done this:
select d.department_id,d.department_name,
from Department d
join Student s on s.department_id=d.department_id
where rownum between 1 and 3
group by d.department_id,d.department_name
order by count(s.student_id) asc;
The output is incorrect. It is coming as IT,SE,CSE whereas the output should be IT,CSE,SE! Is my query right? Or is there something missing in my query?
What am I doing wrong?
One of the possibilities:
select dept_id, dept_name
from (
select dept_id, dept_name,
rank() over (order by cnt nulls first) rn
from department
left join (select dept_id, count(1) cnt
from students
group by dept_id) using (dept_id) )
where rn = 1
First group the data from table students, then join table department, rank the counts, and take the first row(s).
left join is used to guarantee that departments without students are also checked.
rank() is used in case two or more departments share the minimal number of students.
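The left join plus rank() steps described above can be sketched as follows. The sketch uses SQLite through Python's sqlite3 module instead of Oracle, replaces "nulls first" with COALESCE(cnt, 0) so that departments without students count as zero, and uses made-up rows; the function name smallest_departments is hypothetical.

```python
import sqlite3

# Hypothetical rows; EE deliberately has no students.
DEPARTMENTS = [(1, "IT"), (2, "CSE"), (3, "SE"), (4, "EE")]
STUDENTS = [
    (1, "a", 1),
    (2, "b", 2), (3, "c", 2),
    (4, "d", 3), (5, "e", 3), (6, "f", 3),
]


def smallest_departments(departments, students):
    """Return the names of the department(s) with the fewest students."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE department (dept_id INTEGER, dept_name TEXT)")
    con.execute(
        "CREATE TABLE students (student_id INTEGER, student_name TEXT, dept_id INTEGER)"
    )
    con.executemany("INSERT INTO department VALUES (?, ?)", departments)
    con.executemany("INSERT INTO students VALUES (?, ?, ?)", students)
    query = """
        SELECT dept_name
        FROM (
            SELECT d.dept_name,
                   -- a missing count means the department has no students
                   RANK() OVER (ORDER BY COALESCE(c.cnt, 0)) AS rn
            FROM department d
            LEFT JOIN (SELECT dept_id, COUNT(*) AS cnt
                       FROM students
                       GROUP BY dept_id) c
                   ON c.dept_id = d.dept_id
        ) ranked
        WHERE rn = 1
        ORDER BY dept_name
    """
    names = [row[0] for row in con.execute(query)]
    con.close()
    return names


print(smallest_departments(DEPARTMENTS, STUDENTS))  # ['EE']
```

With an inner join instead of the left join, EE would disappear from the result and IT (one student) would be reported as the smallest department.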
To find the department(s) with the minimum number of students, you'll have to count per department ID and then take the ID(s) with the minimum count.
As of Oracle 12c this is simply:
select department_id
from student
group by department_id
order by count(*)
fetch first row with ties
You then select the departments with an ID in the found set.
select * from department where id in (<above query>);
In older versions you could use RANK instead to rank the departments by count:
select department_id, rank() over (order by count(*)) as rnk
from student
group by department_id
The rows with rnk = 1 would be the department IDs with the lowest count. So you could select the departments with:
select * from department where (id, 1) in (<above query>);
My problem
I want to return the top n rows per group ordered by a date in Oracle 10g
My table
EMPLOYEE|START_DATE|DEPARTMENT
Amy |01-02-1901|Sales
Edwina |01-02-1902|Mergers
Tawnee |01-02-1904|Legal
Trudy |01-02-1998|Sales
Tanner |01-02-1967|Sales
Kelly |01-02-1954|Mergers
Jenny |01-02-1991|Sales
Jacinta |01-02-1924|Legal
Suzanne |01-02-1976|Legal
Jacqui |01-02-1989|Legal
Jill |01-02-1989|Mergers
Kate |01-02-1998|Mergers
Jane |01-02-1900|Sales
Louise |01-02-1912|Mergers
Kim |01-02-1976|Sales
Cara |01-02-1955|Sales
Kirsten |01-02-1933|Legal
Sarah |01-02-1998|Legal
Desired outcome
EMPLOYEE|START_DATE|DEPARTMENT
Jane |01-02-1900|Sales
Amy |01-02-1901|Sales
Tawnee |01-02-1904|Legal
Jacinta |01-02-1924|Legal
Sarah |01-02-1998|Legal
Edwina |01-02-1902|Mergers
Louise |01-02-1912|Mergers
What I've tried
(select * from
employees where
DEPARTMENT = 'Sales' and
rownum <3;)
UNION
(select * from
employees where
DEPARTMENT = 'Legal' and
rownum <3;)
UNION
(select * from
employees where
DEPARTMENT = 'Mergers' and
rownum <3;)
REALLY ugly query
I'm wondering whether there is a way to use an
OVER (PARTITION BY DEPARTMENT)
but from what I read, this needs to be preceded by an analytic function (count, sum whatever). Is there a more elegant, inexpensive solution?
Consider this approach without window functions, using a correlated count subquery. The idea is to compute a rank within each department in a subquery, and then use that in a derived table so the outer query can filter on this department rank. Note that your desired results are not ordered by START_DATE; they simply follow the query's row numbers.
SELECT main.EMPLOYEE, main.START_DATE, main.DEPARTMENT
FROM
(SELECT t.EMPLOYEE, t.START_DATE, t.DEPARTMENT,
(SELECT Count(*) FROM Employees sub
WHERE sub.START_DATE <= t.START_DATE
AND sub.Department = t.Department) AS DeptRank
FROM Employees t) main
WHERE main.DeptRank <= 3
ORDER BY main.DEPARTMENT, main.START_DATE;
-- EMPLOYEE START_DATE DEPARTMENT
-- Tawnee 1/2/1904 Legal
-- Jacinta 1/2/1924 Legal
-- Kirsten 1/2/1933 Legal
-- Edwina 1/2/1902 Mergers
-- Louise 1/2/1912 Mergers
-- Kelly 1/2/1954 Mergers
-- Jane 1/2/1900 Sales
-- Amy 1/2/1901 Sales
-- Cara 1/2/1955 Sales
For the window function counterpart:
SELECT main.EMPLOYEE, main.START_DATE, main.DEPARTMENT
FROM
(SELECT t.EMPLOYEE, t.START_DATE, t.DEPARTMENT,
RANK() OVER (PARTITION BY Department
ORDER BY START_DATE) AS DeptRank
FROM Employees t) main
WHERE main.DeptRank <= 3
ORDER BY main.DEPARTMENT, main.START_DATE;
And as @Matt comments, you may want to handle ties (i.e., employees who started on the same day). Both solutions above output every tied employee that passes the rank filter. To take only one of the tied rows in the correlated subquery, use the employee name as a tiebreaker (or, better yet, a unique ID if available):
SELECT main.EMPLOYEE, main.START_DATE, main.DEPARTMENT
FROM
(SELECT t.EMPLOYEE, t.START_DATE, t.DEPARTMENT,
(SELECT Count(*) FROM Employees sub
WHERE sub.Department = t.Department
AND (sub.START_DATE < t.START_DATE
OR (sub.START_DATE = t.START_DATE
AND sub.EMPLOYEE <= t.EMPLOYEE))) AS DeptRank
FROM Employees t) main
WHERE main.DeptRank <= 3
ORDER BY main.DEPARTMENT, main.START_DATE;
And for window-function query use ROW_NUMBER() in place of RANK().
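As a rough check that the tiebreaker subquery and ROW_NUMBER() agree, the sketch below loads the question's rows into SQLite through Python's sqlite3 module (Oracle itself is not used). The dates are stored as ISO text, assuming the 01-02 part is the same for every row as in the question, and the function names are hypothetical.

```python
import sqlite3

# (employee, start year, department) from the question's table.
DATA = [
    ("Amy", 1901, "Sales"), ("Edwina", 1902, "Mergers"), ("Tawnee", 1904, "Legal"),
    ("Trudy", 1998, "Sales"), ("Tanner", 1967, "Sales"), ("Kelly", 1954, "Mergers"),
    ("Jenny", 1991, "Sales"), ("Jacinta", 1924, "Legal"), ("Suzanne", 1976, "Legal"),
    ("Jacqui", 1989, "Legal"), ("Jill", 1989, "Mergers"), ("Kate", 1998, "Mergers"),
    ("Jane", 1900, "Sales"), ("Louise", 1912, "Mergers"), ("Kim", 1976, "Sales"),
    ("Cara", 1955, "Sales"), ("Kirsten", 1933, "Legal"), ("Sarah", 1998, "Legal"),
]

CORRELATED = """
    SELECT main.employee, main.department
    FROM (SELECT t.employee, t.start_date, t.department,
                 (SELECT COUNT(*) FROM employees sub
                  WHERE sub.department = t.department
                    AND (sub.start_date < t.start_date
                         OR (sub.start_date = t.start_date
                             AND sub.employee <= t.employee))) AS dept_rank
          FROM employees t) main
    WHERE main.dept_rank <= ?
    ORDER BY main.department, main.start_date, main.employee
"""

WINDOWED = """
    SELECT main.employee, main.department
    FROM (SELECT employee, start_date, department,
                 ROW_NUMBER() OVER (PARTITION BY department
                                    ORDER BY start_date, employee) AS dept_rank
          FROM employees) main
    WHERE main.dept_rank <= ?
    ORDER BY main.department, main.start_date, main.employee
"""


def top_n_per_department(query, n):
    """Run one of the two queries and return (employee, department) rows."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE employees (employee TEXT, start_date TEXT, department TEXT)")
    con.executemany(
        "INSERT INTO employees VALUES (?, ?, ?)",
        [(name, f"{year}-01-02", dept) for name, year, dept in DATA],
    )
    rows = con.execute(query, (n,)).fetchall()
    con.close()
    return rows


print(top_n_per_department(WINDOWED, 3))
```

On this data no two employees in the same department share a start date, so the tiebreaker never fires; it only matters once such ties exist.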
Hi, I am facing a problem: students can have more than 5 subjects, but I have to sum the total marks of only the 5 subjects in which the student scored highest. In other words, I need the sum of the top 5 total marks obtained by each student.
How do I proceed? Please help me. Thanks in advance.
In SQL Server 2000 you will need to use a subselect to determine how many rows with the same ID have a higher mark. Then filter for rows that have fewer than 5 higher-marked rows above them:
select
ID, Sum(Mark)
From Table1 t
where
(Select count(*)
from Table1 it
where it.id=t.id and it.mark>t.mark) <5
group by ID
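One property of the counting subselect is worth seeing on data: marks tied at the fifth place all pass the filter, so more than 5 rows can be summed. The sketch below illustrates this on SQLite through Python's sqlite3 module instead of SQL Server 2000; the marks and the function name top5_sum_by_count are made up.

```python
import sqlite3


def top5_sum_by_count(marks):
    """Sum, per ID, the marks that have fewer than 5 strictly higher marks."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE Table1 (ID INTEGER, Mark INTEGER)")
    con.executemany("INSERT INTO Table1 VALUES (?, ?)", marks)
    query = """
        SELECT t.ID, SUM(t.Mark)
        FROM Table1 t
        WHERE (SELECT COUNT(*)
               FROM Table1 it
               WHERE it.ID = t.ID AND it.Mark > t.Mark) < 5
        GROUP BY t.ID
        ORDER BY t.ID
    """
    result = con.execute(query).fetchall()
    con.close()
    return result


# Distinct marks: exactly the five highest are summed (90 + 80 + 70 + 60 + 50).
print(top5_sum_by_count([(1, m) for m in (90, 80, 70, 60, 50, 40, 30)]))  # [(1, 350)]

# Two marks tied for fifth place: both are included, so six rows are summed.
print(top5_sum_by_count([(1, m) for m in (90, 80, 70, 60, 50, 50)]))  # [(1, 400)]
```

If exactly five rows per student are required even with ties, the comparison needs a tiebreaker column, such as a subject code.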
ROW_NUMBER isn't in sql-server-2000 unfortunately. You can achieve the same result with a subquery though. Hopefully this is what you're looking for:
SELECT s.studentid, SUM(s.total_marks)
FROM students s
WHERE s.sub_code IN (SELECT TOP 5 sub_code
FROM students a
WHERE a.studentid = s.studentid
ORDER BY total_marks DESC)
GROUP BY studentid
Working in fiddle
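Here is a query that gives you only the 5 highest marks per student (ROW_NUMBER requires SQL Server 2005 or later). The row number has to be computed in a derived table, because a window function alias cannot be referenced in the WHERE clause of the same query level:
SELECT studentID, total_marks
FROM
(
SELECT studentID, total_marks,
row_number() OVER (PARTITION BY studentID ORDER BY total_marks DESC) as rowN
FROM studentTable
) T
WHERE rowN <= 5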
So to get the total:
SELECT studentID, SUM(total_marks)
FROM
(
SELECT studentID, total_marks,
row_number() OVER (PARTITION BY studentID ORDER BY total_marks DESC) as rowN
FROM studentTable
) T
WHERE rowN <= 5
GROUP BY studentID
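A minimal sketch of the ROW_NUMBER-based total, run on SQLite (3.25 or later) through Python's sqlite3 module instead of SQL Server. The sample marks and the function name top5_totals are assumptions for illustration; the filter on rowN sits outside the derived table.

```python
import sqlite3

# Hypothetical marks: student 1 has seven subjects, student 2 only three.
MARKS = [(1, m) for m in (90, 80, 70, 60, 50, 40, 30)] + [(2, m) for m in (55, 65, 75)]


def top5_totals(marks):
    """Return (studentID, sum of the five highest total_marks) per student."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE studentTable (studentID INTEGER, total_marks INTEGER)")
    con.executemany("INSERT INTO studentTable VALUES (?, ?)", marks)
    query = """
        SELECT studentID, SUM(total_marks)
        FROM (
            SELECT studentID, total_marks,
                   ROW_NUMBER() OVER (PARTITION BY studentID
                                      ORDER BY total_marks DESC) AS rowN
            FROM studentTable
        ) T
        WHERE rowN <= 5
        GROUP BY studentID
        ORDER BY studentID
    """
    result = con.execute(query).fetchall()
    con.close()
    return result


print(top5_totals(MARKS))  # [(1, 350), (2, 195)]
```

A student with fewer than five subjects simply gets the sum of all their marks, as student 2 does here.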