I have the following example tables
Employee (EmpID, DepID)
Order (OrderID, EmpID, description)
What I'm trying to achieve is to select the employees with the most orders in each department. I've been at it for about 4 hours already and can't find a solution to this probably easy problem.
All I get is either the number of orders per employee or the maximum number of orders by one employee in a department, but I'm struggling to get a result like:
DepID, EmpID, Number of orders
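Here's my solution for you:
WITH Temp AS (
    SELECT
        emp.EmpID
        ,emp.DepID
        ,COUNT(ord.OrderID) AS nb_order
        -- number the employees within each department by order count, highest first
        ,ROW_NUMBER() OVER (PARTITION BY emp.DepID ORDER BY COUNT(ord.OrderID) DESC) AS Ordre
    FROM
        "Order" ord  -- ORDER is a reserved word, so the table name has to be quoted
    INNER JOIN
        Employee emp
        ON emp.EmpID = ord.EmpID
    GROUP BY
        emp.EmpID
        ,emp.DepID)
SELECT *
FROM Temp
WHERE Ordre = 1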
I hope this will help you :)
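To check the idea on a small data set, here is a minimal sketch that runs the same technique through Python's built-in sqlite3 module. SQLite is only a stand-in for whatever database you use, the sample rows are made up, and the counts are computed in a first CTE (named counts here, my own name) before ROW_NUMBER() is applied.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Employee (EmpID INTEGER PRIMARY KEY, DepID INTEGER);
CREATE TABLE "Order" (OrderID INTEGER PRIMARY KEY, EmpID INTEGER, description TEXT);
-- hypothetical sample rows
INSERT INTO Employee VALUES (1, 10), (2, 10), (3, 20), (4, 20);
INSERT INTO "Order" VALUES
    (1, 1, 'a'), (2, 1, 'b'), (3, 2, 'c'),
    (4, 3, 'd'), (5, 4, 'e'), (6, 4, 'f'), (7, 4, 'g');
""")

top_per_department = conn.execute("""
WITH counts AS (
    -- one row per employee with the number of orders
    SELECT emp.DepID, emp.EmpID, COUNT(ord.OrderID) AS nb_order
    FROM "Order" ord
    INNER JOIN Employee emp ON emp.EmpID = ord.EmpID
    GROUP BY emp.DepID, emp.EmpID
),
ranked AS (
    -- number the employees inside each department, most orders first
    SELECT DepID, EmpID, nb_order,
           ROW_NUMBER() OVER (PARTITION BY DepID ORDER BY nb_order DESC) AS rn
    FROM counts
)
SELECT DepID, EmpID, nb_order FROM ranked WHERE rn = 1 ORDER BY DepID
""").fetchall()

print(top_per_department)
```

Note that ROW_NUMBER() keeps exactly one employee per department even when two employees are tied; swapping it for RANK() would keep all tied employees.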
Related
I am working on a table that contains employee data. The table has historical employee records based on department and year as follows:
Now I want to consolidate records based on EmployeeId, Department and get the Min FromYear and Max ToYear like this:
I tried this query:
Select EmployeeId, Department, MIN(FromYear), MAX(ToYear)
from Employee
GROUP BY EmployeeId, Department
But this query fails for the employee with ID 3, as it returns only 2 rows for that employee:
I have added a similar structure and query here: http://sqlfiddle.com/#!9/6f1e53/5
Any help would be highly appreciated!
This is a gaps-and-islands problem. Identify the islands using lag() and a cumulative sum. Then aggregate:
select employeeid, department, min(fromyear), max(toyear)
from (select e.*,
             sum(case when prev_toyear >= fromyear - 1 then 0 else 1 end) over (partition by employeeid order by fromyear) as grp
      from (select e.*,
                   lag(toyear) over (partition by employeeid, department order by fromyear) as prev_toyear
            from employee e
           ) e
     ) e
group by employeeid, department, grp
order by employeeid, min(fromyear);
Here is a db<>fiddle.
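Since the sample table is not reproduced above, here is a small sketch of the lag() plus cumulative sum technique on made-up rows, run through Python's sqlite3 module (SQLite is an assumption on my part, as are the data and the names is_start, flagged and grouped). A row starts a new island when the previous period of the same employee and department does not touch it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (employeeid INTEGER, department TEXT, fromyear INTEGER, toyear INTEGER);
-- hypothetical history: employee 3 leaves department A and later returns to it
INSERT INTO employee VALUES
    (1, 'A', 2010, 2011),
    (1, 'A', 2012, 2013),
    (3, 'A', 2010, 2011),
    (3, 'B', 2012, 2013),
    (3, 'A', 2014, 2015);
""")

islands = conn.execute("""
SELECT employeeid, department, MIN(fromyear), MAX(toyear)
FROM (
    -- running total of island starts gives every island its own number
    SELECT flagged.*,
           SUM(is_start) OVER (PARTITION BY employeeid ORDER BY fromyear) AS grp
    FROM (
        -- 1 when the row does not continue the previous period, else 0
        SELECT e.*,
               CASE WHEN LAG(toyear) OVER (PARTITION BY employeeid, department
                                           ORDER BY fromyear) >= fromyear - 1
                    THEN 0 ELSE 1 END AS is_start
        FROM employee e
    ) AS flagged
) AS grouped
GROUP BY employeeid, department, grp
ORDER BY employeeid, MIN(fromyear)
""").fetchall()

for row in islands:
    print(row)
```

With these rows, employee 1 collapses into a single 2010 to 2013 record, while employee 3 keeps three records because the two stays in department A are not adjacent.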
You can use a self join as well:
select a.employeeid, min(a.fromyear), max(b.toyear) from emp a
inner join emp b on a.employeeid=b.employeeid
group by a.employeeid
Schema for EMPLOYEE
(ID, EMPLOYEENAME, SALARY, ORGANIZATIONID)
Query to solve: find the names of the employees with the maximum salary in each organization, without a join.
SELECT E.*
FROM EMPLOYEE E,
     (SELECT EMP.ORGANIZATIONID, MAX(EMP.SALARY) AS SALARY
      FROM EMPLOYEE EMP
      GROUP BY EMP.ORGANIZATIONID) MAXSALARY
WHERE MAXSALARY.SALARY = E.SALARY
  AND E.ORGANIZATIONID = MAXSALARY.ORGANIZATIONID;
Is there a way to avoid the join? I am using Spark SQL API and joins cause an extra shuffle operation which is expensive. Is there a way to get the employee name while getting the max salary?
Assuming there is a single employee in each organization with the maximum salary, you can use PARTITION BY with Spark SQL as shown below (although it requires a subquery):
SELECT E.*
FROM
    (SELECT EMP.EMPLOYEENAME, EMP.ORGANIZATIONID, EMP.SALARY,
            row_number() OVER (PARTITION BY ORGANIZATIONID ORDER BY SALARY DESC) as rank
     FROM EMPLOYEE EMP
    ) AS E
WHERE E.rank = 1
Try this:
SELECT P.ORGANIZATIONID, P.EMPLOYEENAME
FROM EMPLOYEE P
WHERE P.SALARY = (SELECT MAX(E.SALARY) FROM EMPLOYEE E WHERE P.ORGANIZATIONID = E.ORGANIZATIONID)
GROUP BY P.ORGANIZATIONID, P.EMPLOYEENAME
Try this:
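SELECT EMPLOYEENAME FROM EMPLOYEE
-- compare organization and salary together, so that a salary equal to another organization's maximum does not match
WHERE (ORGANIZATIONID, SALARY) IN (SELECT ORGANIZATIONID, MAX(SALARY) FROM EMPLOYEE GROUP BY ORGANIZATIONID)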
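To compare the approaches side by side, here is a rough sketch using Python's sqlite3 module instead of Spark (an assumption made only so the snippet is self-contained; the rows and the helper names() are made up). It also shows one pitfall of IN-based filters: an IN on SALARY alone also matches an employee whose salary happens to equal another organization's maximum.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE EMPLOYEE (ID INTEGER, EMPLOYEENAME TEXT, SALARY INTEGER, ORGANIZATIONID INTEGER);
-- hypothetical rows: Bob's salary equals the maximum of organization 2
INSERT INTO EMPLOYEE VALUES
    (1, 'Ann', 100, 1),
    (2, 'Bob',  80, 1),
    (3, 'Cid',  80, 2),
    (4, 'Dee',  60, 2);
""")

def names(sql):
    """Run a query and return its first column as a list."""
    return [row[0] for row in conn.execute(sql)]

# Window function: no join, one pass over the table
window_top = names("""
SELECT EMPLOYEENAME FROM (
    SELECT EMPLOYEENAME,
           ROW_NUMBER() OVER (PARTITION BY ORGANIZATIONID ORDER BY SALARY DESC) AS rn
    FROM EMPLOYEE
) WHERE rn = 1 ORDER BY EMPLOYEENAME
""")

# IN on the salary only: Bob is matched by organization 2's maximum
salary_only = names("""
SELECT EMPLOYEENAME FROM EMPLOYEE
WHERE SALARY IN (SELECT MAX(SALARY) FROM EMPLOYEE GROUP BY ORGANIZATIONID)
ORDER BY EMPLOYEENAME
""")

# IN on (organization, salary): only the real top earners remain
org_and_salary = names("""
SELECT EMPLOYEENAME FROM EMPLOYEE
WHERE (ORGANIZATIONID, SALARY) IN
      (SELECT ORGANIZATIONID, MAX(SALARY) FROM EMPLOYEE GROUP BY ORGANIZATIONID)
ORDER BY EMPLOYEENAME
""")

print(window_top, salary_only, org_and_salary)
```

Whether the window version actually saves a shuffle in Spark depends on the physical plan, so it is worth checking the plan on your own data.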
I have two tables, 'salaries' and 'master1'.
The salaries are stored by year, and I can group them to get the sum for each player using
SELECT playerID, sum(salary) as sal
FROM salaries
GROUP BY playerID ORDER BY sal DESC LIMIT 10;
This returns the playerID and sum of salary, but I need the names of the players from the 'master1' table under column 'nameFirst' and 'nameLast'. They have the common column of 'playerID' in both 'master1' and 'salaries' but when I try to run
SELECT master1.nameFirst, master1.nameLast, sum(salary) as sal
FROM salaries, master1
GROUP BY salaries.playerID ORDER BY sal DESC LIMIT 10;
I get the error
Expression not in GROUP BY key 'nameFirst'
I have tried tinkering with it but keep getting errors.
Thanks!
You need to include nameFirst and nameLast in the GROUP BY, and join the two tables on playerID:
SELECT
    master1.nameFirst,
    master1.nameLast,
    sum(salary) as sal
FROM salaries JOIN master1 ON salaries.playerID = master1.playerID
GROUP BY master1.nameFirst, master1.nameLast, salaries.playerID
ORDER BY sal DESC LIMIT 10;
First, you need to use proper explicit JOIN syntax:
SELECT
    MAX(m.nameFirst) FirstName,
    MAX(m.nameLast) LastName,
    SUM(s.salary) Salary
FROM master1 m
INNER JOIN salaries s ON m.playerID = s.playerID
GROUP BY m.playerID
Use the master1 table to get the first and last names, and JOIN it with the salaries table to get the total salary of each player.
As for the error in your current query: when you use a GROUP BY clause, make sure that every column or expression in the SELECT list is either aggregated or listed in the GROUP BY.
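As a quick check of the join plus GROUP BY pattern, here is a small sketch on made-up rows using Python's sqlite3 module (SQLite, the yearID column and the sample players are my assumptions, not part of the question). Two players deliberately share a name, which shows why playerID belongs in the GROUP BY.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE master1 (playerID TEXT PRIMARY KEY, nameFirst TEXT, nameLast TEXT);
CREATE TABLE salaries (playerID TEXT, yearID INTEGER, salary INTEGER);
-- hypothetical rows: two different players are both called John Smith
INSERT INTO master1 VALUES
    ('a1', 'John', 'Smith'), ('b2', 'John', 'Smith'), ('c3', 'Amy', 'Lee');
INSERT INTO salaries VALUES
    ('a1', 2001, 100), ('a1', 2002, 200), ('b2', 2001, 50), ('c3', 2001, 400);
""")

top_earners = conn.execute("""
SELECT m.playerID, m.nameFirst, m.nameLast, SUM(s.salary) AS sal
FROM salaries s
JOIN master1 m ON m.playerID = s.playerID   -- the ON clause avoids a cross join
GROUP BY m.playerID, m.nameFirst, m.nameLast
ORDER BY sal DESC
LIMIT 10
""").fetchall()

for row in top_earners:
    print(row)
```

Grouping by the names alone would merge the two John Smiths into one row with a combined total, so keeping playerID in the GROUP BY is the safer choice.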
So I have been trying to solve this for a while, and even though I have found many interesting things here I simply could not solve it the way it has been requested.
I have two tables:
PROFESSOR (ID, NAME, DEPARTMENT_ID and SALARY) and
DEPARTMENT (ID, NAME).
I have to write a query that shows the DEPARTMENT NAME with the HIGHEST average SALARY. If more than one department has the highest average SALARY, the query should list all of them in any order.
I have tried so many things, and in the end I created a monster, I think. I tried using HAVING, but it did not work the way I did it. I'm lost. The problem is that I need to use two aggregate functions.
SELECT b.nam, b.average
FROM ( SELECT DEPARTMENT.NAME AS nam, AVG(PROFESSOR.SALARY) AS average
       FROM PROFESSOR JOIN DEPARTMENT ON (PROFESSOR.DEPARTMENT_ID = DEPARTMENT.ID)
       GROUP BY DEPARTMENT.NAME) AS b
GROUP BY b.nam, b.average
ORDER BY b.average DESC
But this query is bringing me all the departments with the average, not the maximum.
If someone could please assist me and explain it in an easy way, I would really appreciate it.
Thanks!
You can use this. If more than one row has the same maximum average, TOP 1 WITH TIES (SQL Server syntax) returns all of them.
SELECT TOP 1 WITH TIES DEPARTMENT.NAME AS nam, AVG(PROFESSOR.SALARY) AS average
FROM PROFESSOR
JOIN DEPARTMENT ON (PROFESSOR.DEPARTMENT_ID = DEPARTMENT.ID)
GROUP BY DEPARTMENT.NAME
ORDER BY AVG(PROFESSOR.SALARY) DESC
;WITH x AS (
    SELECT t.dept,
           t.avg_sal,
           rank() OVER(ORDER BY t.avg_sal DESC) AS rnk
    FROM
    (
        SELECT d.name AS 'dept',
               avg(p.salary) AS avg_sal
        FROM department AS d
        INNER JOIN
             professor AS p ON p.department_id = d.id
        GROUP BY d.name
    ) AS t
)
-- all depts with highest avg sal
SELECT dept, avg_sal
FROM x
WHERE rnk = 1
You can subquery for the MAX(avgSalary). The way I've done it here was to use a CTE.
WITH cte AS
(
SELECT DEPARTMENT_ID
,AVG(SALARY) [avgSalary]
FROM PROFESSOR
GROUP BY DEPARTMENT_ID
)
SELECT D.[NAME]
,cte.avgSalary
FROM cte INNER JOIN DEPARTMENT D
ON D.ID = cte.DEPARTMENT_ID
WHERE cte.avgSalary = (SELECT MAX(avgSalary)
FROM cte)
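To see how the CTE plus MAX approach handles ties, here is a small sketch on made-up rows, run through Python's sqlite3 module (SQLite is a stand-in, and the data and the CTE name dept_avg are hypothetical). Two departments share the highest average, so both should come back.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE DEPARTMENT (ID INTEGER PRIMARY KEY, NAME TEXT);
CREATE TABLE PROFESSOR (ID INTEGER PRIMARY KEY, NAME TEXT, DEPARTMENT_ID INTEGER, SALARY INTEGER);
-- hypothetical rows: Math and Physics both average 150
INSERT INTO DEPARTMENT VALUES (1, 'Math'), (2, 'Physics'), (3, 'Art');
INSERT INTO PROFESSOR VALUES
    (1, 'p1', 1, 100), (2, 'p2', 1, 200),
    (3, 'p3', 2, 150),
    (4, 'p4', 3, 90);
""")

highest_avg = conn.execute("""
WITH dept_avg AS (
    -- first aggregate: average salary per department
    SELECT DEPARTMENT_ID, AVG(SALARY) AS avg_salary
    FROM PROFESSOR
    GROUP BY DEPARTMENT_ID
)
SELECT d.NAME, a.avg_salary
FROM dept_avg a
INNER JOIN DEPARTMENT d ON d.ID = a.DEPARTMENT_ID
-- second aggregate: keep only rows equal to the highest average
WHERE a.avg_salary = (SELECT MAX(avg_salary) FROM dept_avg)
ORDER BY d.NAME
""").fetchall()

print(highest_avg)
```

The two aggregate functions the question asks about live in different query levels: AVG inside the CTE and MAX in the scalar subquery that filters it.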
I think what you want is:
select
    d.NAME,
    a.avg_salary as max_avg_salary
from
    DEPARTMENT d inner join
    (select
        DEPARTMENT_ID,
        avg(SALARY) as avg_salary
    from
        PROFESSOR
    group by
        DEPARTMENT_ID) a on
    d.ID = a.DEPARTMENT_ID
where
    -- keep only the department(s) whose average equals the highest average
    a.avg_salary = (select max(avg_salary)
                    from (select avg(SALARY) as avg_salary
                          from PROFESSOR
                          group by DEPARTMENT_ID) m)
There are other ways to do it, as you can see in the other answers, but I think you want the simplest solution possible, using GROUP BY to determine each average and MAX to pick the highest of all the averages. The only other thing you need is a subquery, which you're probably familiar with.
HTH
Tables:
Department (dept_id,dept_name)
Students(student_id,student_name,dept_id)
I am using Oracle. I have to print the name of the department that has the minimum number of students. Since I am new to SQL, I am stuck on this problem. So far, I have done this:
select d.department_id,d.department_name
from Department d
join Student s on s.department_id=d.department_id
where rownum between 1 and 3
group by d.department_id,d.department_name
order by count(s.student_id) asc;
The output is incorrect. It is coming as IT,SE,CSE whereas the output should be IT,CSE,SE! Is my query right? Or is there something missing in my query?
What am I doing wrong?
One of the possibilities:
select dept_id, dept_name
from (
select dept_id, dept_name,
rank() over (order by cnt nulls first) rn
from department
left join (select dept_id, count(1) cnt
from students
group by dept_id) using (dept_id) )
where rn = 1
Group the data from table students first, join table department, rank the counts, and take the first row(s).
A left join is used to guarantee that departments without students are also checked.
rank() is used in case two or more departments share the minimal number of students.
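The effect of the left join and of rank() can be seen on a tiny made-up data set. The sketch below uses Python's sqlite3 module rather than Oracle (an assumption, so the syntax differs slightly: the count is taken over the joined rows instead of relying on NULLS FIRST), and the helper smallest_departments() is my own name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Department (dept_id INTEGER PRIMARY KEY, dept_name TEXT);
CREATE TABLE Students (student_id INTEGER PRIMARY KEY, student_name TEXT, dept_id INTEGER);
-- hypothetical rows: EE and ME have no students at all
INSERT INTO Department VALUES (1, 'IT'), (2, 'CSE'), (3, 'SE'), (4, 'EE'), (5, 'ME');
INSERT INTO Students VALUES
    (1, 's1', 1),
    (2, 's2', 2), (3, 's3', 2),
    (4, 's4', 3), (5, 's5', 3), (6, 's6', 3);
""")

def smallest_departments(join_type):
    """Return the department(s) with the fewest students for the given join type."""
    query = f"""
    WITH counts AS (
        -- COUNT(s.student_id) is 0 for departments without matching students
        SELECT d.dept_name, COUNT(s.student_id) AS cnt
        FROM Department d
        {join_type} JOIN Students s ON s.dept_id = d.dept_id
        GROUP BY d.dept_id, d.dept_name
    ),
    ranked AS (
        -- rank() gives every tied department the same rank
        SELECT dept_name, cnt, RANK() OVER (ORDER BY cnt) AS rnk
        FROM counts
    )
    SELECT dept_name, cnt FROM ranked WHERE rnk = 1 ORDER BY dept_name
    """
    return conn.execute(query).fetchall()

with_left_join = smallest_departments("LEFT")
with_inner_join = smallest_departments("INNER")
print(with_left_join, with_inner_join)
```

With an inner join the empty departments never appear, so IT would be reported as the smallest; the left join brings back EE and ME with a count of zero, and rank() keeps both of them.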
To find the department(s) with the minimum number of students, you'll have to count per department ID and then take the ID(s) with the minimum count.
As of Oracle 12c this is simply:
select department_id
from student
group by department_id
order by count(*)
fetch first row with ties
You then select the departments with an ID in the found set.
select * from department where id in (<above query>);
In older versions you could use RANK instead to rank the departments by count:
select department_id, rank() over (order by count(*)) as rnk
from student
group by department_id
The rows with rnk = 1 would be the department IDs with the lowest count. So you could select the departments with:
select * from department where (id, 1) in (<above query>);