correlated subquery on same table - sql

here is table schema
emp(no, name, salary)
and insert data like this.
insert into emp values (1, 'gandalf', 3000);
insert into emp values (2, 'dobby', 4000);
insert into emp values (3, 'legolas', 5000);
and I select data, like this.
select no, name, salary
from emp e
where salary > (
select AVG(salary)
from emp
where no=e.no
);
but result is empty!!!
I don't understand...
I expected this
(3, 'legolas', 5000)
I try this query, It worked.
select no, name, salary
from emp
where salary > (
select AVG(salary)
from emp d
where no=d.no
);
So, Correlated Subquery must have to alias variable on same table?
At the same time, superquery must have not to alias variable?
and I don't understand this, too.
select no, name, salary
from emp s
where salary > (
select AVG(salary)
from emp d
where s.no=d.no
);
the result is empty, too..
why!!!????

Do not use correlation. At your example you are basically comparing employees salary with average salary (of the same person where s.no=d.no).
For instance for employee no = 1 you got:
WHERE 3000 > (3000) -- false no record returned
You probably want to pick employees which salary is higher than average of all employees. In that case use:
select no, name, salary
from emp
where salary > (
select AVG(salary)
from emp d
);
EDIT:
Scenario when to use correlation (deparment_id column added):
SELECT no, name, salary, deparment_id
FROM emp e1
WHERE salary >= (SELECT AVG(salary)
FROM emp e2
WHERE e1.department_id = e2.department_id);

Assuming that emp.no uniquely identifies each row, then the average in the correlated subquery is the salary for that employee. An employee's salary can never be greater than his/her salary. Presumably you intend:
select no, name, salary
from emp e
where salary > (select AVG(e2.salary)from emp e2);
You can also write this using window functions, though:
select no, name, salary
from (select e.*, avg(salary) over () as avg_salary
from emp e
) e
where salary > avg_salary;

Related

Analytic functions and plain SQL equivalent

I'm using Oracle and SQL Developer. I have downloaded HR schema and need to do some queries with it. Now I'm working with table Employees. As an user I need to see the list of employees with lowest salary in each department. I need to provide different solutions by means of plain SQL and one of analytic functions. About analytic functions, I have used RANK():
SELECT *
FROM
(SELECT
employee_id,
first_name,
department_id,
salary,
RANK() OVER (PARTITION BY department_id
ORDER BY salary) result
FROM
employees)
WHERE
result = 1
AND department_id IS NOT NULL;
The result seems correct:
but when I try to use plain SQL I actually get all employees with their salaries.
Here is my attempt with GROUP BY:
SELECT
department_id, MIN(salary) AS "Lowest salary"
FROM
employees
GROUP BY
department_id;
This code seems good, but I need to also get columns first_name and employee_id.
I tried to do something like this:
SELECT
employee_id,
first_name,
department_id,
MIN(salary) result
FROM
employees
GROUP BY
employee_id,
first_name,
department_id;
and this:
SELECT
employee_id,
first_name,
salary,
departments.department_id
FROM
employees
LEFT OUTER JOIN
departments ON (employees.department_id = departments.department_id)
WHERE
employees.salary = (SELECT MIN(salary)
FROM departments
WHERE department_id = employees.department_id)
These seem wrong. How can I change or modify my queries to get the same result as when I'm using RANK() by means of plain SQL (two solutions at least)?
One of the options could be like here (with old EMP table)...
SELECT EMPNO, ENAME, DEPTNO, SAL
FROM EMP e
WHERE SAL = (Select MIN_SAL From (SELECT DEPTNO, Min(SAL) "MIN_SAL"
FROM EMP
GROUP BY DEPTNO)
Where DEPTNO = e.DEPTNO)
ORDER BY DEPTNO, SAL;
Second option could be...
SELECT EMPNO, ENAME, DEPTNO, SAL
FROM (SELECT e.EMPNO, e.ENAME, e.DEPTNO, e. SAL, (Select Min(SAL) "MIN_SAL" From EMP Where DEPTNO = e.DEPTNO) "MIN_SAL" From EMP e)
WHERE SAL = MIN_SAL
ORDER BY DEPTNO, SAL;
Regards...
You can use a subquery to find the lowest salary per employee and use the main query to only show the information of those employees that are selected by this subquery:
SELECT
employee_id,
first_name,
department_id,
salary
FROM employees e1
WHERE salary =
(SELECT MIN(e2.salary)
FROM employees e2
WHERE e1.employee_id = e2.employee_id);
This will produce exactly the same outcome as your query with RANK.
I think it would make sense to apply some sorting which is missing in your query. I don't know how you want to sort, but here an example to sort by the employee's name:
SELECT
employee_id,
first_name,
department_id,
salary
FROM employees e1
WHERE salary =
(SELECT MIN(e2.salary)
FROM employees e2
WHERE e1.employee_id = e2.employee_id)
ORDER BY first_name;
Since you asked for at least two solutions, let's have a look on another option:
SELECT
e1.employee_id,
e1.first_name,
e1.department_id,
e1.salary
FROM employees e1
JOIN (
SELECT employee_id, MIN(salary) salary
FROM employees
GROUP BY employee_id ) e2
ON e1.employee_id = e2.employee_id AND e1.salary = e2.salary
ORDER BY first_name;
As you can see, this differs since the sub query will apply a GROUP BY clause and it can be successfully executed as own query which is not possible for the sub query used in the previous query.
The JOIN to the main query will then make sure to get again the desired result.
Here are some options to get the employees with the minimum salary in their department:
With MIN (salary) OVER (...)
select employee_id, first_name, department_id, salary
from
(
select e.*, min(salary) over (partition by department_id) as min_sal
from employees e
)
where sal = min_sal;
With RANK and FETCH FIRST
select *
from employees
order by rank() over (partition by department_id order by salary)
fetch first row with ties;
With IN
select *
from employees
where (department_id, salary) in
(
select department_id, min(salary)
from employees
group by department_id
);
With NOT EXISTS
select *
from employees e
where not exists
(
select null
from employees other
where other.department_id = e.department_id
and other.salary < e.salary
);
If you will only ever have one person with the minimum salary per department then you can use KEEP:
SELECT department_id,
MIN(employee_id) KEEP (DENSE_RANK FIRST ORDER BY salary) AS employee_id,
MIN(first_name) KEEP (DENSE_RANK FIRST ORDER BY salary, employee_id) AS first_name,
MIN(salary) AS min_salary
FROM employees
GROUP BY department_id
Which, for the sample data:
CREATE TABLE employees (employee_id, department_id, first_name, salary) AS
SELECT 1, 1, 'Alice', 1000 FROM DUAL UNION ALL
SELECT 2, 1, 'Betty', 2000 FROM DUAL UNION ALL
SELECT 3, 2, 'Carol', 3000 FROM DUAL UNION ALL
SELECT 4, 2, 'Debra', 3000 FROM DUAL UNION ALL
SELECT 5, 2, 'Emily', 4000 FROM DUAL;
Outputs:
DEPARTMENT_ID
EMPLOYEE_ID
FIRST_NAME
MIN_SALARY
1
1
Alice
1000
2
3
Carol
3000
Note: this will not match Debra, even though she also has the lowest salary in department 2, as it will only find a single employee with the minimum salary and the minimum employee id.
If you can have multiple employees with the same minimum-per-department then you can use a correlated sub-query:
SELECT department_id,
employee_id,
first_name,
salary
FROM employees e
WHERE EXISTS(
SELECT 1
FROM employees x
WHERE e.department_id = x.department_id
HAVING MIN(x.salary) = e.salary
);
Which, for the sample data, outputs:
DEPARTMENT_ID
EMPLOYEE_ID
FIRST_NAME
SALARY
1
1
Alice
1000
2
3
Carol
3000
2
4
Debra
3000
Which does return Debra.
fiddle

SQL combining two queries inside the same table without a subquery

A table, Employee has columns EmployeeID, Salary.
How do I find the EmployeeIDs which have a salary greater than the average salary?
Using Subqueries:
SELECT EmployeeID
FROM Employee
WHERE Salary > (SELECT AVG(Salary)
FROM Employee);
Is it possible using joins?
Is any other method possible?
It seems really silly to me to write it this way, but here it is without any subqueries:
SELECT a.EmployeeID
FROM Employee a
CROSS JOIN Employee b
GROUP BY a.EmployeeID, a.Salary
HAVING a.Salary > AVG(b.Salary)
SELECT EmployeeID
FROM Employee, (SELECT AVG(Salary) avg_savary
FROM Employee) sal
WHERE Salary > sal.avg_savary;
Move the average salary calculation to FROM to calculate it once. BTW most of moder DB can optimize your query to calculate it once.
SELECT * from
(SELECT EmployeeID,salary AVG(salary) OVER() avg_salary FROM Employee)
WHERE Salary >avg_salary
SELECT e.EmployeeID FROM Employee e
JOIN
(
SELECT avg(Salary) as Salary FROM Employee
) e1
ON e.Salary > e1.Salary
Declare #AverageSalary as Money
Select #AverageSalary = AVG(Salary) From Employee
Select * from Employee Where Salary > #AverageSalary

How to get all the rows of normal emp table and Avg(Sal) field along with it in SQl Server

In my recent interview i face the below question
Consider EMP table is having below columns
E_Name Salary empid
As a result, am expecting the output as follows
E_name Salary empid Avg(Salary)
Is it possible?
This is what window functions are for:
select e_name, salary, empid, avg(salary) over ()
from emp;
Just an alternative using group by aggregation:
SELECT
e_name,
salary,
empid,
(SELECT
AVG(salary) AS average
FROM (SELECT
1 AS nil,
x.salary
FROM emp x) sub
GROUP BY nil)
AS average
FROM emp

Subquery with multiple averages

I've been struggling with a problem that goes something like this, "Find all employees with a salary greater than the average salary of their department." My sql subquery lumps all the departments salaries together to make one average salary but I need a way to get the each individual department's average salary.
My sql statement looks like this.
SELECT EmployeeName
FROM dbo.EMP
WHERE Salary > (
SELECT AVG(Salary)
FROM dbo.EMP
)
GROUP BY DeptNo
here is quick variant:
select EmployeeName
from
dbo.EMP as a
inner join
(
SELECT DeptNo, AVG(Salary) as avgSalary
FROM dbo.EMP
GROUP BY DeptNo
) as b
on (a.DeptNo=b.DeptNo and a.Salary > b.avgSalary)
You can just finish off that subquery to make it a correlated subquery:
SELECT EmployeeName
FROM dbo.EMP as t1
WHERE Salary > (
SELECT AVG(Salary)
FROM dbo.EMP
WHERE dbo.EMP.DeptNo = t1.DeptNo
)
Alternatively you could use Window Functions:
SELECT
EmployeeName,
CASE WHEN Salary > AVG(Salary) OVER (PARTITION BY DeptNo) Then 'X' END as [HigherThanAverage]
FROM dbo.EMP
That will give you all employees and an indicator if their salary is higher than their department's average, which you could filter out later on. I figured I'd stick this in here since it gives you some options as the scale of your query grows.

Retrieve the names of employees who is making least in their departments

I have two tables
EMPLOYEE (Fname, Lname, Ssn, Salary, Dno)
DEPARTMENT (Dname, Dno, Location)
I want to list the names of all employees making the least in their department
I have come up with this
select min(E.Salary) from EMPLOYEE E group by E.Dno;
but how do I join the EMPLOYEE table with it and display the 'Fname' and 'Lname';
Analytic functions would be best but this would also work:
select *
from employee e
where salary = (select min(x.salary) from employee x where x.dno = e.dno)
Use Analytic function, ROW_NUMBER() OVER( PARTITION BY DNO ORDER BY SALARY) as RN. So, WHERE RN = 1 will give you the employee with least salary in each department.
Remember, if there are two employees with same salary, then you need to use DENSE_RANK to avoid similar rank.
Note : This answer is for Oracle.
A simple way to do it is to check that no one with a lower salary exists in the same department;
SELECT e1.*
FROM employee e1
WHERE NOT EXISTS(
SELECT 1 FROM employee e2 WHERE e1.dno = e2.dno AND e1.salary > e2.salary
);
An SQLfiddle to test with.