Get top results for each group (in Oracle) - sql

How can I get N results for each of several groups in
an Oracle query?
For example, given the following table:
|--------+------------+------------|
| emp_id | name | occupation |
|--------+------------+------------|
| 1 | John Smith | Accountant |
| 2 | Jane Doe | Engineer |
| 3 | Jack Black | Funnyman |
|--------+------------+------------|
There are many more rows with more occupations. I would like to get
three employees (let's say) from each occupation.
Is there a way to do this without using a subquery?

I don't have an Oracle instance handy right now, so I have not tested this:
select *
from (select emp_id, name, occupation,
rank() over ( partition by occupation order by emp_id) rank
from employee)
where rank <= 3
Here is a link on how rank works: http://www.psoug.org/reference/rank.html
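Since the query above was posted untested, here is a minimal sketch that runs the same RANK() pattern against SQLite through Python's sqlite3 module. This is a stand-in, not Oracle: it assumes the bundled SQLite is 3.25 or newer (needed for window functions), and every row beyond the three in the question is made up for illustration. The alias is renamed to rnk to avoid clashing with the function name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT, occupation TEXT)"
)
# The first three rows come from the question; the rest are hypothetical.
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [
        (1, "John Smith", "Accountant"),
        (2, "Jane Doe", "Engineer"),
        (3, "Jack Black", "Funnyman"),
        (4, "Ann Lee", "Engineer"),
        (5, "Bob Ray", "Engineer"),
        (6, "Cy Young", "Engineer"),
        (7, "Dee Fox", "Accountant"),
    ],
)

# Rank employees inside each occupation by emp_id, then keep ranks 1 to 3.
top3 = conn.execute(
    """
    SELECT emp_id, name, occupation
    FROM (SELECT emp_id, name, occupation,
                 RANK() OVER (PARTITION BY occupation ORDER BY emp_id) AS rnk
          FROM employee)
    WHERE rnk <= 3
    ORDER BY occupation, emp_id
    """
).fetchall()

for row in top3:
    print(row)
```

With four engineers in the sample, only the three with the lowest emp_id come back; occupations with fewer than three employees return all of their rows.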

This produces what you want, and it uses no vendor-specific SQL features like TOP N or RANK().
SELECT MAX(e.name) AS name, MAX(e.occupation) AS occupation
FROM emp e
LEFT OUTER JOIN emp e2
ON (e.occupation = e2.occupation AND e.emp_id >= e2.emp_id)
GROUP BY e.emp_id
HAVING COUNT(*) <= 3
ORDER BY occupation;
In this example it gives the three employees with the lowest emp_id values per occupation. You can change the attribute used in the inequality comparison, to make it give the top employees by name, or whatever.
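The direction of the inequality in the join decides which end of each group survives, which is easy to get backwards. A small sketch, again using SQLite through Python's sqlite3 as a stand-in for Oracle, with hypothetical rows added to the question's three: with >= each row is counted against the rows at or below it, so the lowest emp_id values are kept; with <= the highest are kept. The helper name kept_ids is made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (emp_id INTEGER, name TEXT, occupation TEXT)")
# First three rows are from the question; the others are hypothetical.
conn.executemany(
    "INSERT INTO emp VALUES (?, ?, ?)",
    [
        (1, "John Smith", "Accountant"),
        (2, "Jane Doe", "Engineer"),
        (3, "Jack Black", "Funnyman"),
        (4, "Ann Lee", "Engineer"),
        (5, "Bob Ray", "Engineer"),
        (6, "Cy Young", "Engineer"),
        (7, "Dee Fox", "Accountant"),
    ],
)


def kept_ids(op):
    """Return the emp_ids kept by the self-join when comparing with op."""
    # op is spliced into the SQL text, so accept only the two known operators.
    if op not in ("<=", ">="):
        raise ValueError(op)
    sql = f"""
        SELECT e.emp_id
        FROM emp e
        LEFT OUTER JOIN emp e2
          ON (e.occupation = e2.occupation AND e.emp_id {op} e2.emp_id)
        GROUP BY e.emp_id
        HAVING COUNT(*) <= 3
        ORDER BY e.emp_id
    """
    return [row[0] for row in conn.execute(sql)]


lowest = kept_ids(">=")   # engineers 2, 4, 5 survive; 6 is dropped
highest = kept_ids("<=")  # engineers 4, 5, 6 survive; 2 is dropped
print(lowest, highest)
```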

Add ROWNUM to the rank ordering so that ties are broken:
select * from
(select emp_id, name, occupation,rank() over ( partition by occupation order by emp_id,RowNum) rank
from employee)
where rank <= 3

Tested this in SQL Server (and it uses a subquery):
select emp_id, name, occupation
from employees t1
where emp_id IN (select top 3 emp_id from employees t2 where t2.occupation = t1.occupation)
Just add an ORDER BY in the subquery to suit your needs.

I'm not sure this is very efficient, but maybe a starting place?
select *
from people p1
join people p2
on p1.occupation = p2.occupation
join people p3
on p1.occupation = p3.occupation
and p2.occupation = p3.occupation
where p1.emp_id != p2.emp_id
and p1.emp_id != p3.emp_id
This should give you rows that contain 3 distinct employees all in the same occupation. Unfortunately, it will give you ALL combinations of those.
Can anyone pare this down please?

Related

How do I find the highest salary from each department using SUBQUERIES

I'm really new to this and this particular question has been bugging me for days. I do know there are similar questions to this but I kept wondering how it would be done in subqueries.
SALARY TABLE
[Emp_ID] [SalaryPM]
001 | 10,500
002 | 50,000
003 | 8,000
004 | 10,000
DEPT TABLE
[Emp_ID] [Dept_ID]
001 | A
002 | B
003 | C
004 | C
I want it to look like this
[Emp_ID] [Dept_ID] [SalaryPM]
001 | A | 10,500
002 | B | 50,000
004 | C | 10,000
What I have tried so far, but it only gives the employee with the highest salary overall:
SELECT * FROM DEPT
WHERE EMP_ID IN
(SELECT Emp_ID
FROM SALARY
WHERE SalaryPM = (SELECT MAX(SalaryPM)
FROM SALARY));
Would this qualify as a subquery solution?
select *
from (
select s.*, e.deptid,
rank() over(partition by e.deptid order by s.salaries desc) rn
from employees e
inner join salaries s on s.id = e.id
) t
where rn = 1
Note: your design does not look good. The data you are showing suggests a 1-1 relationship between the two tables (which I called employees and salaries): if so, both tables should be combined into a single table.
It would be good to know what you mean by "using subqueries" ;)
Here's another solution using a subquery in the SELECT clause:
SELECT
d.id,
d.deptid,
(
SELECT
MAX(s.salary)
FROM
my_salaries s
WHERE
s.id = d.id
) max_salary
FROM
my_departments d;
Here's a solution without joining tables (IMHO not joining tables is just overcomplicating things). Why is 'not using join' so important for you?
SELECT
sub.id,
MAX(sub.deptid),
MAX(sub.salary)
FROM
(
SELECT
d.id,
d.deptid,
NULL salary
FROM
my_departments d
UNION ALL
SELECT
s.id,
NULL deptid,
s.salary
FROM
my_salaries s
) sub
GROUP BY
sub.id
ORDER BY
sub.id;
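For the SALARY and DEPT tables exactly as posted in the question, one way to get the top earner per department with subqueries only (no JOIN keyword) is a correlated comparison against the department maximum. The sketch below runs it on SQLite through Python's sqlite3, so treat it as an illustration rather than tested Oracle; storing Emp_ID as text and SalaryPM as an integer are assumptions made here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE SALARY (Emp_ID TEXT, SalaryPM INTEGER);
    CREATE TABLE DEPT (Emp_ID TEXT, Dept_ID TEXT);
    INSERT INTO SALARY VALUES ('001', 10500), ('002', 50000),
                              ('003', 8000), ('004', 10000);
    INSERT INTO DEPT VALUES ('001', 'A'), ('002', 'B'),
                            ('003', 'C'), ('004', 'C');
    """
)

# Keep a DEPT row only when that employee's salary equals the highest
# salary found among all employees of the same department.
top_paid = conn.execute(
    """
    SELECT d.Emp_ID, d.Dept_ID,
           (SELECT s.SalaryPM FROM SALARY s WHERE s.Emp_ID = d.Emp_ID) AS SalaryPM
    FROM DEPT d
    WHERE (SELECT s.SalaryPM FROM SALARY s WHERE s.Emp_ID = d.Emp_ID) =
          (SELECT MAX(s2.SalaryPM)
           FROM SALARY s2
           WHERE s2.Emp_ID IN (SELECT d2.Emp_ID
                               FROM DEPT d2
                               WHERE d2.Dept_ID = d.Dept_ID))
    ORDER BY d.Dept_ID
    """
).fetchall()

for row in top_paid:
    print(row)
```

If two employees in a department share the maximum, both rows are returned.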

Is there an analytic function for count in oracle sql

select manager, count(*) over (partition by manager) cnt
from dbtable
group by manager
This will provide me the count of manager but if I need a count of senior_manager how will I get it?
|--------------------|------------------|
| Manager |Senior_Manager |
|--------------------|------------------|
| John |Arpit |
| John |govind |
| John |olive |
| Domnic |kelvin |
| Domnic |paul |
|--------------------|------------------|
Result
John 3
Domnic 2
Your code returns "1" for all managers -- because it counts the number of rows after the group by.
If you want to count the number of rows in the table for a given manager, then you want aggregation, not analytic functions:
Select manager, count(*) as cnt
from dbtable
group by manager;
I'm not sure if this answers your question, but it at least addresses the issue that your query does not do much that is useful.
EDIT:
For the revised question, it simply seems:
Select senior_manager, count(*) as cnt
from dbtable
group by senior_manager;
The result you wanted can be retrieved by
select manager, count(*) over (partition by manager) cnt
from dbtable
This means each manager will be associated with the count of rows in the partition where {manager} value equals that exact manager. According to the table above this is what you expect to get.
Your example:
select manager, count(*) over (partition by manager) cnt
from dbtable
group by manager
Yields the following results:
MANAGER CNT
Domnic 1
John 1
If you drop the group by, you get:
MANAGER CNT
Domnic 2
Domnic 2
John 3
John 3
John 3
Are those the counts you're looking for? If so, then you can eliminate the duplicate rows with distinct:
select distinct manager, count(*) over (partition by manager) cnt
from dbtable
Which gives:
MANAGER CNT
John 3
Domnic 2
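The three variants discussed in this thread can be compared side by side. The sketch below loads the question's five rows into SQLite through Python's sqlite3 (a stand-in for Oracle, assuming SQLite 3.25 or newer for window functions) and runs the original query, the plain aggregate, and the DISTINCT analytic version. An ORDER BY is added only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dbtable (manager TEXT, senior_manager TEXT)")
conn.executemany(
    "INSERT INTO dbtable VALUES (?, ?)",
    [
        ("John", "Arpit"),
        ("John", "govind"),
        ("John", "olive"),
        ("Domnic", "kelvin"),
        ("Domnic", "paul"),
    ],
)

# Original query: the window function runs after GROUP BY, so each
# partition holds exactly one (already grouped) row.
after_group_by = conn.execute(
    "SELECT manager, COUNT(*) OVER (PARTITION BY manager) AS cnt "
    "FROM dbtable GROUP BY manager ORDER BY manager"
).fetchall()

# Plain aggregation: one row per manager with the real row count.
aggregated = conn.execute(
    "SELECT manager, COUNT(*) AS cnt FROM dbtable "
    "GROUP BY manager ORDER BY manager"
).fetchall()

# Analytic count without GROUP BY, de-duplicated with DISTINCT.
windowed = conn.execute(
    "SELECT DISTINCT manager, COUNT(*) OVER (PARTITION BY manager) AS cnt "
    "FROM dbtable ORDER BY manager"
).fetchall()

print(after_group_by)
print(aggregated)
print(windowed)
```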

How do I return both values if the maximum of two rows is equal?

I need to find the persons with the maximum salaries in each department. I've got the code and found out the persons with the maximum salaries for each department. But then, when I looked at my data, there is another person that has the equal max value in the same department. Is there a way to return both persons' name?
example table:
Department Salary Name
Admin $1000 Amy
Admin $900 Ben
HR $1500 Cassy
HR $1500 Dan
I have tried this code:
SELECT department, Max(salary), name
FROM table
GROUP BY department
ORDER BY salary desc;
I've been getting Admin's person's details OK. But HR I can only get Cassy's name. Is there a way to get Dan's name in my output as well? Can anyone give me an example? Thank you
Hope this can help
SELECT department, salary, name
FROM table t
where salary= (select max(salary) from table where t.department = department)
You didn't mention the DBMS you are using.
With standard SQL, you can use window functions for this (which are supported by all modern DBMS):
select department, salary, name
from (
select department, salary, name,
dense_rank() over (partition by department order by salary desc) as rnk
from department
) t
where rnk = 1;
With NOT EXISTS:
SELECT department, salary, name
FROM tablename t
WHERE NOT EXISTS (
SELECT 1 FROM tablename
WHERE department = t.department and salary > t.salary
)
ORDER BY salary desc, name;
Results:
| Department | Salary | Name |
| ---------- | ------ | ----- |
| HR | 1500 | Cassy |
| HR | 1500 | Dan |
| Admin | 1000 | Amy |
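To check that the DENSE_RANK and NOT EXISTS approaches agree on ties, here is a small sketch using the question's four rows. It runs on SQLite through Python's sqlite3 (assuming SQLite 3.25 or newer for window functions), and the table name staff is made up, since the thread uses several different placeholder names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE staff (department TEXT, salary INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO staff VALUES (?, ?, ?)",
    [
        ("Admin", 1000, "Amy"),
        ("Admin", 900, "Ben"),
        ("HR", 1500, "Cassy"),
        ("HR", 1500, "Dan"),
    ],
)

# Window-function version: every row tied for the top salary gets rank 1.
ranked = conn.execute(
    """
    SELECT department, salary, name
    FROM (SELECT department, salary, name,
                 DENSE_RANK() OVER (PARTITION BY department
                                    ORDER BY salary DESC) AS rnk
          FROM staff) t
    WHERE rnk = 1
    ORDER BY salary DESC, name
    """
).fetchall()

# NOT EXISTS version: keep a row when nobody in the department earns more.
not_exists = conn.execute(
    """
    SELECT department, salary, name
    FROM staff t
    WHERE NOT EXISTS (SELECT 1 FROM staff s
                      WHERE s.department = t.department
                        AND s.salary > t.salary)
    ORDER BY salary DESC, name
    """
).fetchall()

print(ranked)
print(not_exists)
```

Both queries return Cassy and Dan for HR, because neither one forces a single winner per department.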
You can use two levels of aggregation if you want one row per department with the names listed on the row:
select dept, salary, names
from (select dept, salary, group_concat(name) as names,
row_number() over (partition by dept order by salary desc) as seqnum
from example
group by dept, salary
) t
where seqnum = 1;

Select records with highest salary from duplicate records

Select employee record with highest salary from duplicate records with same name and different salary
id|name|salary
1 | A | 500
2 | B | 100
3 | A | 400
3 | B | 200
Output
1 | A | 500
3 | B | 200
Please post the generic sql that will work on all the databases.
I have tried the query below, but it does not return a record if no duplicate exists.
select e.id,e.name,e.salary FROM employee e,
employee e1
WHERE
e.name = e1.name
AND e.salary > e1.salary
Using NOT EXISTS
SELECT s1.id, s1.name, s1.salary
FROM salaries s1
WHERE NOT EXISTS
(
SELECT *
FROM salaries s2
WHERE s1.name = s2.name AND s1.salary < s2.salary
)
Using ALL
SELECT s1.id, s1.name, s1.salary
FROM salaries s1
WHERE s1.salary >=
ALL(
SELECT salary
FROM salaries s2
WHERE s1.name = s2.name
)
Any reasonably modern database (MySQL before 8.0 being the notable exception) should support window functions. rank() should do the trick:
SELECT id, name, salary
FROM (SELECT id, name, salary,
RANK() OVER (PARTITION BY name ORDER BY salary DESC) rk
FROM some_table) t
WHERE rk = 1
In SQL Server this works (I don't have another DBMS at hand right now, but it shouldn't be difficult to adapt):
select s.*
from salaries s
join (
select name,MAX(salary) as maxsalary
from salaries
group by name) ms on s.name=ms.name and s.salary=ms.maxsalary
The subquery selects the maximum salary for each name. The main query then joins on both columns, name and maximum salary.
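The problem reported in the question (names without a duplicate disappear) can be reproduced with one extra row. In the sketch below, run on SQLite through Python's sqlite3 as a neutral test bed, the row for name C is hypothetical and added only to show the difference between the self-join attempt and the NOT EXISTS form.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (id INTEGER, name TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [
        (1, "A", 500),
        (2, "B", 100),
        (3, "A", 400),
        (3, "B", 200),
        (4, "C", 300),  # hypothetical row: a name with no duplicate
    ],
)

# The asker's self-join: a row survives only if a lower-paid row with the
# same name exists, so the single C row has nothing to pair with.
self_join = conn.execute(
    """
    SELECT e.id, e.name, e.salary
    FROM employee e, employee e1
    WHERE e.name = e1.name AND e.salary > e1.salary
    ORDER BY e.name
    """
).fetchall()

# NOT EXISTS: a row survives when no higher-paid row with the same name
# exists, which is also true for names that appear once.
not_exists = conn.execute(
    """
    SELECT s1.id, s1.name, s1.salary
    FROM employee s1
    WHERE NOT EXISTS (SELECT * FROM employee s2
                      WHERE s1.name = s2.name AND s1.salary < s2.salary)
    ORDER BY s1.name
    """
).fetchall()

print(self_join)
print(not_exists)
```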

Issue with returning distinct records based on single column (Oracle)

If I have the table "members" (shown below), how would I go about getting the record of the first occurrence of a membership_id (Oracle).
Expected results
123 John Doe A P
313 Michael Casey A A
113 Luke Skywalker A P
Table - members
membership_id first_name last_name status type
123 John Doe A P
313 Michael Casey A A
113 Luke Skywalker A P
123 Bob Dole A A
313 Lucas Smith A A
SELECT membership_id,
first_name,
last_name,
status,
type
FROM( SELECT membership_id,
first_name,
last_name,
status,
type,
rank() over (partition by membership_id
order by type desc) rnk
FROM members )
WHERE rnk = 1
will work for your sample data set. If you can have ties-- that is, multiple rows with the same membership_id and the same maximum type-- this query will return all those rows. If you only want to return one of the rows where there is a tie, you would either need to add additional criteria to the order by to ensure that all ties are broken or you would need to use the row_number function rather than rank which will arbitrarily break ties.
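The difference between rank and row_number on ties can be seen with the members rows as listed in the question, where both rows for membership_id 313 carry type A. The sketch below uses SQLite through Python's sqlite3 as a stand-in for Oracle (assuming SQLite 3.25 or newer for window functions); the helper name first_rows is made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE members (membership_id INTEGER, first_name TEXT, "
    "last_name TEXT, status TEXT, type TEXT)"
)
conn.executemany(
    "INSERT INTO members VALUES (?, ?, ?, ?, ?)",
    [
        (123, "John", "Doe", "A", "P"),
        (313, "Michael", "Casey", "A", "A"),
        (113, "Luke", "Skywalker", "A", "P"),
        (123, "Bob", "Dole", "A", "A"),
        (313, "Lucas", "Smith", "A", "A"),
    ],
)


def first_rows(func):
    """Run the 'rnk = 1' query with either RANK or ROW_NUMBER."""
    # func is spliced into the SQL text, so accept only the two known names.
    if func not in ("RANK", "ROW_NUMBER"):
        raise ValueError(func)
    sql = f"""
        SELECT membership_id, first_name, last_name
        FROM (SELECT membership_id, first_name, last_name,
                     {func}() OVER (PARTITION BY membership_id
                                    ORDER BY type DESC) AS rnk
              FROM members)
        WHERE rnk = 1
        ORDER BY membership_id, first_name
    """
    return conn.execute(sql).fetchall()


with_rank = first_rows("RANK")              # tied rows all get rank 1
with_row_number = first_rows("ROW_NUMBER")  # exactly one row per id

print(len(with_rank), len(with_row_number))
```

With RANK, both 313 rows are returned because they tie on type. ROW_NUMBER returns one row per membership_id, but which of the tied rows it picks is not defined unless the ORDER BY includes a tie-breaking column.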
Select A.*
FROM Members AS A inner join
(Select membership_id, first(first_name) AS FN, first(last_name) AS LN
From Members
Group by membership_id) AS B
ON A.membership_id=B.membership_id and A.first_name=B.FN and A.last_name=B.LN
Hope that helps!
select *
from members
where rowid in (
select min(rowid)
from members
group by membership_id
)