Using WHERE clause on a joined table - sql

I am trying to find the best way to filter out rows by using conditions based on a joined table. As an example I am joining the employees and salary grade table based on the salary grade for each employee. Then I want to show only the employees that have the same grade as a certain employee (Blake). I used the following code:
SELECT e.ename, e.sal, sg.grade
FROM emp e JOIN salgrade sg
ON(e.sal BETWEEN sg.losal AND sg.hisal)
WHERE sg.grade = (SELECT sg.grade FROM emp e JOIN salgrade sg ON(e.sal BETWEEN sg.losal AND sg.hisal) WHERE e.ename = 'BLAKE')
ORDER BY e.sal DESC
Is there a more optimal way to write the query?

Here is one method that uses window functions:
SELECT es.*
FROM (SELECT e.ename, e.sal, sg.grade,
MAX(CASE WHEN e.ename = 'BLAKE' THEN sg.grade END) OVER () as blake_grade
FROM emp e JOIN
salgrade sg
ON e.sal BETWEEN sg.losal AND sg.hisal
) es
WHERE grade = blake_grade
ORDER BY es.sal DESC;

I wouldn't use a join in the select in the WHERE clause; rather, I would use an inner scalar subquery to pick up BLAKE's salary and then an outer scalar subquery to pick up his salgrade. Otherwise very similar to your query:
select e.ename, e.sal, s.grade
from emp e inner join salgrade s on e.sal between s.losal and s.hisal
where s.grade = ( select grade
from salgrade
where (select sal from emp where ename = 'BLAKE')
between losal and hisal
)
order by sal desc
;
Using the same idea, you could do away with the first join as well (by returning the losal and hisal for BLAKE as well as his salgrade), but perhaps that is taking it too far.
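For what it's worth, here is a rough sketch of that last idea, as I read it: look up BLAKE's salgrade row once (grade, losal and hisal), then filter emp by those bounds instead of joining against the whole salgrade table. The alias b is just a name I picked, and the sketch assumes exactly one employee is named BLAKE and that the salgrade ranges do not overlap.

```sql
-- Sketch only: join emp to the single salgrade row that contains BLAKE's salary.
-- Assumes one employee named BLAKE and non-overlapping salgrade ranges.
SELECT e.ename, e.sal, b.grade
FROM emp e
JOIN (
    SELECT grade, losal, hisal
    FROM salgrade
    WHERE (SELECT sal FROM emp WHERE ename = 'BLAKE')
          BETWEEN losal AND hisal
) b
  ON e.sal BETWEEN b.losal AND b.hisal
ORDER BY e.sal DESC;
```

There is still a join here, but it is against a one-row derived table rather than the full salgrade table.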

If this is just about not having to write the same code twice, you can use a WITH clause:
WITH emps_and_sals AS
(
SELECT e.ename, e.sal, sg.grade
FROM emp e
JOIN salgrade sg ON e.sal BETWEEN sg.losal AND sg.hisal
)
SELECT *
FROM emps_and_sals
WHERE grade = (SELECT grade FROM emps_and_sals WHERE ename = 'BLAKE')
ORDER BY sal DESC;

Related

Select the manager name with the most employees

We have a table emp with columns empno, ename, job, mgr, hiredate, sal, comm, deptno
I have tried
SELECT m.ename, COUNT(e.empno) FROM emp e
INNER JOIN emp m ON e.empno = m.empno
GROUP BY m.ename HAVING COUNT(e.empno) = GREATEST(COUNT(e.empno));
My output is the names of the managers, each with the value 1.
How do we output the name of the manager with the most employees?
The following works:
SELECT COUNT(e.empno), m.ename FROM emp e
INNER JOIN emp m ON e.mgr = m.empno
GROUP BY m.ename HAVING GREATEST(COUNT(e.ename)) = COUNT(e.ename)
LIMIT 1;
First, fix the ON clause. Second, use ORDER BY and LIMIT if you want one row:
SELECT m.ename, COUNT(e.empno)
FROM emp e INNER JOIN
emp m
ON m.empno = e.mgr
GROUP BY m.ename
ORDER BY COUNT(e.empno) DESC
LIMIT 1;
Your HAVING clause does not filter anything because any (non-NULL) value is equal to itself.
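If you do want a HAVING clause that actually filters, one option is to compare each manager's count against the largest count computed in a separate subquery. This is only a sketch: the aliases num_reports, cnt and per_mgr are names I made up, and it assumes every non-NULL mgr value refers to an existing empno. Unlike LIMIT 1, it returns all managers who tie for the most employees.

```sql
-- Sketch: keep only the manager(s) whose employee count equals the maximum count.
-- Grouping by m.empno as well keeps two managers with the same name apart.
SELECT m.ename, COUNT(e.empno) AS num_reports
FROM emp e
INNER JOIN emp m ON m.empno = e.mgr
GROUP BY m.empno, m.ename
HAVING COUNT(e.empno) = (
    SELECT MAX(cnt)
    FROM (SELECT COUNT(*) AS cnt
          FROM emp
          WHERE mgr IS NOT NULL
          GROUP BY mgr) per_mgr
);
```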

Perform SELECT statement inside of SELECT Statement?

I'm using Emp, Dept... databases. I would like to get the Name, Salary, Deptno and Average salary in the department of those employees who earn more than the average of their department. So here is what I'm trying to do:
SELECT e.Ename, e.Sal, e.Deptno
, (
SELECT AVG(Sal)
FROM Emp b
WHERE b.Deptno = e.Deptno
GROUP BY Deptno
) AS 'Average Salary'
FROM Emp e
WHERE e.Sal > (
SELECT AVG(b.Sal)
FROM Emp b
WHERE b.Deptno = e.Deptno
GROUP BY Deptno
);
And I can't use AVG(Sal), because it will give the average salary of the employee, and not the department where he works.
You would use a correlated subquery:
SELECT e.Ename, e.Sal, e.Deptno,
(SELECT AVG(e2.Sal)
FROM Emp e2
WHERE e2.Deptno = e.Deptno
) AS [Average Salary]
FROM Emp e
WHERE e.Sal > (SELECT AVG(e2.Sal)
FROM Emp e2
WHERE e2.Deptno = e.Deptno
);
But in actuality, you would just use a window function:
select e.*
from (select e.*, avg(sal) over (partition by deptno) as avg_sal
from emp e
) e
where sal > avg_sal;
Just get rid of the group by.
SELECT e.Ename, e.Sal, e.Deptno, (SELECT AVG(Sal)
FROM Emp b
WHERE b.Deptno = e.Deptno
) AS 'Average Salary'
FROM Emp e
WHERE e.Sal > (SELECT AVG(b.Sal)
FROM Emp b
WHERE b.Deptno = e.Deptno
);
If you join to the subquery, you will not need to repeat it.
SELECT e.Ename, e.Sal, e.Deptno, dAvgs.avgSal AS 'Average Salary'
FROM Emp AS e
INNER JOIN (
SELECT Deptno, AVG(b.Sal) AS avgSal
FROM Emp b
GROUP BY Deptno
) AS dAvgs
ON e.Deptno = dAvgs.Deptno AND e.Sal > dAvgs.avgSal
;

Rewriting uncorrelated subquery to correlated subquery

I am working with the default Oracle SCOTT database plus an additional table, PROJECT, which has two columns: projectno and empno.
I want to select names of employees with the highest salaries for each project.
I know how to do it with uncorrelated subquery:
SELECT p.projno,
e.sal,
e.ename
FROM emp e
INNER
JOIN proj_emp p
ON e.empno = p.empno
WHERE (e.sal, p.projno)
IN (SELECT MAX(e.sal),
p.projno
FROM emp e INNER JOIN proj_emp p
ON e.empno = p.empno
GROUP BY p.projno)
However, I was asked to do it with a correlated subquery written in a WHERE clause, and I am wondering whether that is possible.
I would do:
SELECT t.*
FROM (SELECT p.projno, e.sal, e.ename,
DENSE_RANK() OVER (PARTITION BY p.projno ORDER BY e.sal DESC) AS Seq
FROM emp e INNER JOIN
proj_emp p
ON e.empno = p.empno
) t
WHERE Seq = 1;
EDIT: If you want to do it with a correlated subquery, then I would rewrite your query to make the subquery correlated:
SELECT p.projno, e.sal, e.ename
FROM emp e INNER JOIN
proj_emp p
ON e.empno = p.empno
WHERE e.sal = (SELECT MAX(e1.sal)
FROM emp e1 INNER JOIN
proj_emp p1
ON e1.empno = p1.empno
WHERE p1.projno = p.projno
);
Use window functions:
SELECT projno, sal, ename
FROM (SELECT p.projno, e.sal, e.ename,
MAX(e.sal) OVER (PARTITION BY p.projno) as max_sal
FROM emp e INNER JOIN
proj_emp p
ON e.empno = p.empno
) ps
WHERE sal = max_sal;

small tricky query on self join

I have a table EMP with columns as below:
create table emp(
empno number(4,0),
ename varchar2(10),
job varchar2(9),
mgr_id number(4,0),
sal number(7,2),
deptno number(2,0));
I want to list all employees' names along with their manager names, including those who do not have a manager. For those employees, their manager's name should be displayed as 'BOSS'.
The following query should work:
select e.ename, (case when m.ename is null then 'BOSS' else m.ename end) as mgrName
from emp e
left join emp m on m.empno = e.mgr_id
To my mind, the better solution is proposed by Charanjith.
In Oracle, we could also use the NVL function instead of CASE WHEN to replace a NULL value with a default. The result is the same.
select e.ename empName, NVL(m.ename, 'BOSS') mgrName from emp e
left join emp m on m.empno = e.mgr_id
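As a side note, the same query can be written with COALESCE, which is standard SQL and also available in Oracle, so it should carry over to other databases. This is a sketch against the emp table from the question, not something I have run against your data.

```sql
-- Sketch: COALESCE returns its first non-NULL argument,
-- so employees without a manager get 'BOSS'.
SELECT e.ename AS empName,
       COALESCE(m.ename, 'BOSS') AS mgrName
FROM emp e
LEFT JOIN emp m ON m.empno = e.mgr_id;
```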
Another solution is to use an inner join to pick up the employees who have a manager, then union that with the employees who don't have one.
select e.ename empName, m.ename mgrName from emp e inner join emp m on e.mgr_id = m.empno
union
select e.ename empName, 'BOSS' mgrName from emp e where not exists (select 1 from emp m where e.mgr_id = m.empno)
This works fine in Oracle:
SELECT e.ename,
nvl(m.ename, 'BOSS') mgr
FROM emp e
LEFT JOIN emp m
ON m.empno = e.mgr_id;

the grouping column for an aggregate function is as a join condition?

I was reading some Oracle SQL resources and I found this SQL code:
SELECT e.ename AS "NAME",
e.sal AS "Salary",
e.deptno,
AVG(a.sal) dept_avg
FROM emp e, emp a
WHERE e.deptno = a.deptno
AND e.sal > ( SELECT AVG(sal)
FROM emp
WHERE deptno = e.deptno )
GROUP BY e.ename, e.sal, e.deptno;
This SQL code is supposed to return every employee that gets more than the average salary of his department and display his name, his salary his department's ID and then the average salary in his department.
In order to return dept_avg, we have to group by deptno, but the grouping columns look odd. My guess is that the grouping column is the one used in the join condition, a.deptno. Is that true? If not, can someone please clarify?
Maybe re-writing with more modern conventions makes it clearer?
WITH avgbydept as
(
SELECT deptno, avg(sal) as avgsal
FROM emp
GROUP BY deptno
)
SELECT e.ename AS "NAME",
e.sal AS "Salary",
e.deptno,
AVG(a.sal) dept_avg
FROM emp e
JOIN emp a ON e.deptno = a.deptno
JOIN avgbydept abd ON e.deptno = abd.deptno
WHERE e.sal > abd.avgsal
GROUP BY e.ename, e.sal, e.deptno;
One thing this makes clear is that the query has a "bug": an extra join and GROUP BY. To do what you describe:
This SQL code is supposed to return every employee that gets more than
the average salary of his department and display his name, his salary
his department's ID and then the average salary in his department.
I believe you want this
WITH avgbydept as
(
SELECT deptno, avg(sal) as avgsal
FROM emp
GROUP BY deptno
)
SELECT e.ename AS "NAME",
e.sal AS "Salary",
e.deptno,
abd.avgsal as dept_avg
FROM emp e
JOIN avgbydept abd ON e.deptno = abd.deptno
WHERE e.sal > abd.avgsal
If you remove GROUP BY and use SELECT *, you'll see what's happening.
emp is joined to itself: every employee whose salary is above the department average is paired with every employee in the same department (himself included), which produces a lot of rows. Then the average salary is computed again from those paired rows using GROUP BY. It's impressively inefficient; look at the other answers to see how it should have been done.
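To look at those intermediate rows yourself, something along these lines should do. It is the query from the question with the GROUP BY and the AVG removed; the column aliases colleague and colleague_sal are names I added for readability.

```sql
-- Sketch: the self-join before grouping.
-- Each above-average employee (e) appears once per employee (a) in the same department.
SELECT e.ename, e.sal, e.deptno,
       a.ename AS colleague,
       a.sal   AS colleague_sal
FROM emp e, emp a
WHERE e.deptno = a.deptno
  AND e.sal > (SELECT AVG(sal)
               FROM emp
               WHERE deptno = e.deptno)
ORDER BY e.deptno, e.ename, a.ename;
```

Averaging colleague_sal per (ename, sal, deptno) group is what yields dept_avg in the original query.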
GROUP BY can throw us for a loop. Here's an easy way to think about grouping:
select field1, field2, sum(field3)
from ..
group by <all fields that do not participate in aggregate>
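As a concrete instance of that pattern, here is a small sketch. It assumes the standard SCOTT emp table with a job column (the question does not show one), and total_sal is just an alias I chose.

```sql
-- Sketch: deptno and job are not inside an aggregate, so both go in GROUP BY.
-- The result has one row per (deptno, job) pair with the summed salary.
SELECT deptno, job, SUM(sal) AS total_sal
FROM emp
GROUP BY deptno, job;
```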
The query you noticed could be re-written somewhat like this:
select e.*, t.avgsal
from emp e
inner join (select deptno, avg(sal) avgsal from emp group by deptno) t
on e.deptno = t.deptno
where e.sal > t.avgsal
Now you can see that the subquery aliased with t will get average salary by department. We then use departments to join employee and our derived avg salary by department and eliminate the need for grouping.