EXISTS command vs IN command - sql

For reference, I am trying to answer a SQL question at: https://www.w3resource.com/sql-exercises/sql-subqueries-exercise-21.php
And the answer is:
SELECT first_name, last_name, department_id
FROM employees
WHERE EXISTS (SELECT *
FROM employees
WHERE salary > 3700);
But can someone please explain why the above does not return the same result as:
SELECT e1.first_name, e1.last_name, e1.department_id
FROM employees AS e1
WHERE e1.employee_id IN (SELECT e2.employee_id
FROM employees AS e2
WHERE e2.salary > 3700);

The EXISTS clause gives a value of TRUE or FALSE. The statement that you've written will give a value of TRUE if there is any row at all in employees with a salary of more than 3700. You did not include any condition that requires that row to match the id of the employee you're checking.
If you want to add such a condition, you'd write (assuming a unique employee id exists and is called employee_id):
SELECT first_name, last_name, department_id
FROM employees E1
WHERE EXISTS
(SELECT *
FROM employees
WHERE salary > 3700 AND employee_id = E1.employee_id )
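To see the difference on a small scale, here is a minimal sketch using Python's built-in sqlite3 module. The three sample rows are made up for illustration and are not the exercise's data; only one of them earns more than 3700.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (
        employee_id   INTEGER PRIMARY KEY,
        first_name    TEXT,
        last_name     TEXT,
        department_id INTEGER,
        salary        INTEGER
    );
    -- Hypothetical sample rows, not taken from the exercise data.
    INSERT INTO employees VALUES
        (1, 'Ann', 'Lee', 10, 5000),
        (2, 'Bob', 'Ray', 10, 3000),
        (3, 'Cy',  'Fox', 20, 2500);
""")

# Uncorrelated: the subquery is the same for every outer row.
uncorrelated = conn.execute("""
    SELECT first_name FROM employees
    WHERE EXISTS (SELECT * FROM employees WHERE salary > 3700)
    ORDER BY employee_id
""").fetchall()

# Correlated: the subquery is tied to the outer row through employee_id.
correlated = conn.execute("""
    SELECT e1.first_name FROM employees AS e1
    WHERE EXISTS (SELECT * FROM employees AS e2
                  WHERE e2.salary > 3700
                    AND e2.employee_id = e1.employee_id)
    ORDER BY e1.employee_id
""").fetchall()

print(uncorrelated)  # all three names, since at least one salary exceeds 3700
print(correlated)    # only the employee whose own salary exceeds 3700
```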

The first query has no correlation clause. Hence, it returns either all rows in employees (if any row matching the condition exists) or no rows at all (if no such row exists).
The second is comparing the rows in the subquery to the id in the outer query. The equivalent query using exists would be:
SELECT e1.first_name, e1.last_name, e1.department_id
FROM employees AS e1
WHERE EXISTS (SELECT e2.employee_id
FROM employees AS e2
WHERE e2.employee_id = e1.employee_id AND
e2.salary > 3700
);
Of course, one would expect the employee_id to be the primary key for a table called employees. So this would do the same thing:
SELECT e1.first_name, e1.last_name, e1.department_id
FROM employees AS e1
WHERE e1.salary > 3700
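A quick way to check that the IN form, the correlated EXISTS form and the plain WHERE form agree is to run all three against the same data. This is only a sketch with invented rows, using Python's sqlite3 module; it assumes employee_id is the primary key, which is what makes the three forms interchangeable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (
        employee_id   INTEGER PRIMARY KEY,  -- assumed unique
        first_name    TEXT,
        department_id INTEGER,
        salary        INTEGER
    );
    -- Hypothetical sample rows.
    INSERT INTO employees VALUES
        (1, 'Ann', 10, 5000),
        (2, 'Bob', 10, 3000),
        (3, 'Cy',  20, 2500),
        (4, 'Di',  20, 4200);
""")

queries = {
    "in": """
        SELECT e1.employee_id FROM employees AS e1
        WHERE e1.employee_id IN (SELECT e2.employee_id FROM employees AS e2
                                 WHERE e2.salary > 3700)
        ORDER BY e1.employee_id""",
    "exists": """
        SELECT e1.employee_id FROM employees AS e1
        WHERE EXISTS (SELECT 1 FROM employees AS e2
                      WHERE e2.employee_id = e1.employee_id
                        AND e2.salary > 3700)
        ORDER BY e1.employee_id""",
    "where": """
        SELECT e1.employee_id FROM employees AS e1
        WHERE e1.salary > 3700
        ORDER BY e1.employee_id""",
}

# Run each form and collect the ids it returns.
results = {name: conn.execute(sql).fetchall() for name, sql in queries.items()}
print(results)
```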

Related

How do we remove the column that is created by LEAD function

When I run the query below:
SELECT
*
FROM
(
SELECT
LEAD(hire_date)
OVER(PARTITION BY department_id
ORDER BY
hire_date
) AS recent_joinee,
a.*
FROM
employees a
)
WHERE
recent_joinee IS NULL;
The result includes a first column, recent_joinee, which is created by the LEAD function. I don't need that column; I only need the remaining n columns of employees.
What should I do in this scenario?
You should specify the columns you need.
You can either get all fields (*) or specify the fields you want.
It can be cumbersome to list all the columns. Oracle doesn't have a simple way to remove columns. For your example, you can incur the overhead of an extra join (which isn't too much if the key is the primary key):
SELECT e.*
FROM employees e JOIN
(SELECT LEAD(hire_date) OVER (PARTITION BY department_id
ORDER BY hire_date
) AS recent_joinee,
e2.*
FROM employees e2
) e2
USING (employee_id)
WHERE e2.recent_joinee IS NULL;
Alternatively, you could use a correlated subquery:
select e.*
from employees e
where not exists (select 1
                  from employees e2
                  where e2.department_id = e.department_id and
                        e2.hire_date > e.hire_date
                 );
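Both approaches can be tried out with a small sketch. It uses Python's sqlite3 module rather than Oracle, needs an SQLite build with window functions (3.25 or later), and runs on invented rows whose hire dates are distinct within each department. With tied hire dates the two queries could return different rows, so treat the equality below as a property of this data only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (
        employee_id   INTEGER PRIMARY KEY,
        department_id INTEGER,
        hire_date     TEXT
    );
    -- Hypothetical rows; hire dates are distinct within each department.
    INSERT INTO employees VALUES
        (1, 10, '2019-01-01'),
        (2, 10, '2021-06-01'),
        (3, 20, '2020-03-15');
""")

# Join back to the base table so that e.* carries only the original columns.
cur = conn.execute("""
    SELECT e.*
    FROM employees e
    JOIN (SELECT employee_id,
                 LEAD(hire_date) OVER (PARTITION BY department_id
                                       ORDER BY hire_date) AS recent_joinee
          FROM employees) e2
      ON e2.employee_id = e.employee_id
    WHERE e2.recent_joinee IS NULL
    ORDER BY e.employee_id
""")
columns = [d[0] for d in cur.description]
via_join = cur.fetchall()

# Correlated NOT EXISTS: keep a row when nobody in the department was hired later.
via_not_exists = conn.execute("""
    SELECT e.*
    FROM employees e
    WHERE NOT EXISTS (SELECT 1 FROM employees e2
                      WHERE e2.department_id = e.department_id
                        AND e2.hire_date > e.hire_date)
    ORDER BY e.employee_id
""").fetchall()

print(columns)         # no recent_joinee column in the output
print(via_join)
print(via_not_exists)
```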

practice sql explanation

http://studybyyourself.com/seminar/sql/exercises/8-3/?lang=en
Please provide data about all employees whose salary is higher or equal to the average salary of employees working in the same department (regardless if employees have left the company or not). Required attributes are last name, first name, salary and department name.
SELECT emp.Last_name, emp.First_name, emp.Salary, D.Name
FROM Employee AS emp
INNER JOIN Department AS D ON emp.Department_id = D.ID
WHERE emp.Salary >=
(
SELECT AVG(e.Salary)
FROM Employee AS e
GROUP BY e.Department_id
HAVING e.Department_id = emp.Department_id
)
Can anyone please help explain the solution? Specifically, what does the HAVING clause do in this case that allows the subquery to work? Up to that point I am stuck: without the HAVING clause I get, as expected, the 'subquery returns more than 1 row' error, but I am not sure how the HAVING clause fixes the problem.
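One way to look at it: the GROUP BY alone produces one average per department, and the HAVING condition, which refers to the outer row's emp.Department_id, keeps only the group for that row's department, so a single value is left to compare against. The sketch below tries this with Python's sqlite3 module on invented rows (not the exercise's data) and also runs the same subquery with the filter moved to a WHERE clause for comparison.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Department (ID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Employee (
        ID INTEGER PRIMARY KEY, Last_name TEXT, First_name TEXT,
        Salary INTEGER, Department_id INTEGER
    );
    -- Hypothetical sample rows, not the exercise's data.
    INSERT INTO Department VALUES (1, 'Sales'), (2, 'Research');
    INSERT INTO Employee VALUES
        (1, 'Lee', 'Ann', 5000, 1),
        (2, 'Ray', 'Bob', 3000, 1),
        (3, 'Fox', 'Cy',  2500, 2),
        (4, 'Kim', 'Di',  3500, 2);
""")

# Without HAVING, the grouped subquery yields one average per department.
per_department = conn.execute("""
    SELECT e.Department_id, AVG(e.Salary)
    FROM Employee AS e
    GROUP BY e.Department_id
    ORDER BY e.Department_id
""").fetchall()

outer = """
    SELECT emp.Last_name, emp.First_name, emp.Salary, D.Name
    FROM Employee AS emp
    INNER JOIN Department AS D ON emp.Department_id = D.ID
    WHERE emp.Salary >= ({subquery})
    ORDER BY emp.ID
"""

# HAVING keeps only the group that belongs to the outer row's department.
with_having = conn.execute(outer.format(subquery="""
    SELECT AVG(e.Salary) FROM Employee AS e
    GROUP BY e.Department_id
    HAVING e.Department_id = emp.Department_id""")).fetchall()

# Same correlation expressed as a WHERE filter, with no grouping at all.
with_where = conn.execute(outer.format(subquery="""
    SELECT AVG(e.Salary) FROM Employee AS e
    WHERE e.Department_id = emp.Department_id""")).fetchall()

print(per_department)
print(with_having)
print(with_where)
```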

subquery and join not giving the same result

1
select *
from employees
where salary > (select max(salary) from employees where department_id=50)
2
select *
from employees e left join
employees d
on e.DEPARTMENT_ID =d.DEPARTMENT_ID
where d.salary > (select max(salary) from employees where department_id=50)
Why does the second query return multiple records?
I want to achieve the same result as the first query using a join.
Thanks in advance.
Rocky, the first select is correct. Why do you want to do any join? Without further information the objective of the second select is not clear (nonsense).
I can't see the point of joining the table to itself on DEPARTMENT_ID. Anyway, the duplicates appear because you are joining the table to itself on a key that is not the primary key, so you are basically multiplying each employee by all the employees of the same department. This version eliminates the duplicates but is still no improvement over the first query.
select *
from employees e left join
employees d
on e.employee_ID = d.employee_ID
where d.salary > (select max(salary) from employees where department_id=50)
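The multiplication is easy to reproduce. In this sketch (Python's sqlite3 module, invented rows) department 60 has three employees, two of whom earn more than the highest salary in department 50. Joining on department_id pairs every employee of department 60 with each of those two, including the one who earns less.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (
        employee_id   INTEGER PRIMARY KEY,
        department_id INTEGER,
        salary        INTEGER
    );
    -- Hypothetical rows: the highest salary in department 50 is 4000.
    INSERT INTO employees VALUES
        (1, 50, 3000), (2, 50, 4000),
        (3, 60, 5000), (4, 60, 6000), (5, 60, 1000);
""")

max50 = "(SELECT MAX(salary) FROM employees WHERE department_id = 50)"

# Query 1: plain scalar subquery.
subquery_only = conn.execute(
    f"SELECT employee_id FROM employees WHERE salary > {max50} ORDER BY employee_id"
).fetchall()

# Query 2: self-join on department_id, which is not a key.
joined_on_department = conn.execute(f"""
    SELECT e.employee_id
    FROM employees e LEFT JOIN employees d ON e.department_id = d.department_id
    WHERE d.salary > {max50}
    ORDER BY e.employee_id
""").fetchall()

# Self-join on the primary key: each row matches only itself.
joined_on_key = conn.execute(f"""
    SELECT e.employee_id
    FROM employees e LEFT JOIN employees d ON e.employee_id = d.employee_id
    WHERE d.salary > {max50}
    ORDER BY e.employee_id
""").fetchall()

print(subquery_only)
print(joined_on_department)  # duplicates, plus employee 5 who earns only 1000
print(joined_on_key)
```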
You are probably looking for an anti join. This is a pattern mainly used in a young DBMS where IN and EXISTS clauses are slow compared to joins, because the developers focused on joins only.
You are looking for all employees whose salaries are greater than all salaries in department 50. In other words: WHERE NOT EXISTS a greater or equal salary in department 50.
Your query can hence be written as:
select *
from employees e
where not exists
(
select null
from employees e50
where e50.department_id = 50
and e50.salary >= e.salary
);
As an anti join (an outer join where you dismiss all matches):
select *
from employees e
left join employees e50 on e50.department_id = 50 and e50.salary >= e.salary
where e50.salary is null;
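As a rough check, the MAX subquery, the NOT EXISTS form and the anti join can be run side by side. The sketch uses Python's sqlite3 module with invented rows. It assumes department 50 has at least one employee and that no salary is NULL; if department 50 were empty, the MAX version would return no rows while the other two would return every employee.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (
        employee_id   INTEGER PRIMARY KEY,
        department_id INTEGER,
        salary        INTEGER
    );
    -- Hypothetical rows: the highest salary in department 50 is 4000.
    INSERT INTO employees VALUES
        (1, 50, 3000), (2, 50, 4000),
        (3, 60, 5000), (4, 60, 6000), (5, 60, 1000);
""")

with_max = conn.execute("""
    SELECT employee_id FROM employees
    WHERE salary > (SELECT MAX(salary) FROM employees WHERE department_id = 50)
    ORDER BY employee_id
""").fetchall()

with_not_exists = conn.execute("""
    SELECT e.employee_id FROM employees e
    WHERE NOT EXISTS (SELECT NULL FROM employees e50
                      WHERE e50.department_id = 50
                        AND e50.salary >= e.salary)
    ORDER BY e.employee_id
""").fetchall()

# Anti join: outer join to the "blocking" rows, then keep only the non-matches.
with_anti_join = conn.execute("""
    SELECT e.employee_id FROM employees e
    LEFT JOIN employees e50
           ON e50.department_id = 50 AND e50.salary >= e.salary
    WHERE e50.salary IS NULL
    ORDER BY e.employee_id
""").fetchall()

print(with_max, with_not_exists, with_anti_join)
```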

somebody explain to me how this query works step by step?

I don't yet understand this SQL statement:
select FIRST_NAME
from EMPLOYEES e
where DEP_ID != (select DEP_ID
from EMPLOYEES
where e.MANAGER_ID = EMPLOYEE_ID);
I would write your query as:
select e.FIRST_NAME
from EMPLOYEES e
where e.DEP_ID <> (select e2.DEP_ID
from EMPLOYEES e2
where e.MANAGER_ID = e2.EMPLOYEE_ID
);
This does not functionally change the query but it qualifies all column references and uses <> which is the traditional SQL operator for not equals.
What this query does is return all employees whose department is not the same as their manager's department.
How does it do this? The subquery is a correlated subquery. For each row in employees, the subquery returns the department id of that employee's manager.
The where clause then checks whether that department differs from the employee's own department.
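To make the row-by-row evaluation visible, the sketch below (Python's sqlite3 module, invented rows) first lists what the subquery yields for each employee and then runs the query itself. An employee without a manager gets NULL from the subquery, the comparison is then neither true nor false, and the row is left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE EMPLOYEES (
        EMPLOYEE_ID INTEGER PRIMARY KEY,
        FIRST_NAME  TEXT,
        DEP_ID      INTEGER,
        MANAGER_ID  INTEGER
    );
    -- Hypothetical rows: Ann has no manager; Cy reports to Ann from another department.
    INSERT INTO EMPLOYEES VALUES
        (1, 'Ann', 10, NULL),
        (2, 'Bob', 10, 1),
        (3, 'Cy',  20, 1),
        (4, 'Di',  20, 3);
""")

# What the correlated subquery returns for each outer row: the manager's department.
manager_dept = conn.execute("""
    SELECT e.FIRST_NAME, e.DEP_ID,
           (SELECT e2.DEP_ID FROM EMPLOYEES e2
            WHERE e.MANAGER_ID = e2.EMPLOYEE_ID) AS MANAGER_DEP_ID
    FROM EMPLOYEES e
    ORDER BY e.EMPLOYEE_ID
""").fetchall()

# The query from the question, with qualified column names.
different_dept = conn.execute("""
    SELECT e.FIRST_NAME
    FROM EMPLOYEES e
    WHERE e.DEP_ID <> (SELECT e2.DEP_ID FROM EMPLOYEES e2
                       WHERE e.MANAGER_ID = e2.EMPLOYEE_ID)
    ORDER BY e.EMPLOYEE_ID
""").fetchall()

print(manager_dept)
print(different_dept)
```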
This subquery gets the department of the employee's manager:
select DEP_ID from EMPLOYEES where e.MANAGER_ID = EMPLOYEE_ID
So the main query returns only the employees whose department differs from their manager's department:
select FIRST_NAME from EMPLOYEES e where DEP_ID != (manager's DEP_ID)
It finds the employees who are not in the same department as their manager.
Let's look at the inner part first:
select DEP_ID from EMPLOYEES where e.MANAGER_ID = EMPLOYEE_ID
For the current outer row e, this fetches the department of that employee's manager.
Now the outer part:
select FIRST_NAME from EMPLOYEES e where DEP_ID != <manager's department>
The result is the list of first names of the employees whose department differs from their manager's.

SQL select with multiple different conditions

I am still learning SQL and can't find a proper way to find the following information:
I have created a table "employees" with the following columns:
'department', 'age', 'salary', 'bonus';
I am trying to design a query that will give me all employees that have someone the same age as them in another department and with a bonus superior to their salary.
(to be more precise, if someone in department 'SALES' has the same age as someone in department 'RESEARCH' and have a bonus that is superior to that guy in research's salary, then I would like to display both of them)
Is this possible to do in sql?
Thank you for your time,
-Tom
You can do this using exists. Because you care about the relationship in both directions, look for people of the same age in different departments where either person's bonus exceeds the other's salary:
select e.*
from employees e
where exists (select 1
              from employees e2
              where e2.department <> e.department and e2.age = e.age and
                    (e2.bonus > e.salary or e.bonus > e2.salary)
             );
To get the pairs on the same row, use a self-join:
select e1.*, e2.*
from employees e1 join
     employees e2
     on e1.age = e2.age and e1.department <> e2.department and
        e1.bonus > e2.salary;
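Here is a small sketch of the requirement as the question words it: same age, different departments, and one person's bonus higher than the other person's salary. It uses Python's sqlite3 module on invented rows. The name column is hypothetical and is there only to label the rows, since the question's table has no identifier.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (
        name       TEXT,     -- hypothetical column, added only to label the rows
        department TEXT,
        age        INTEGER,
        salary     INTEGER,
        bonus      INTEGER
    );
    -- Hypothetical rows: Ann's bonus (5000) exceeds Bob's salary (4000), same age.
    INSERT INTO employees VALUES
        ('Ann', 'SALES',    30, 3000, 5000),
        ('Bob', 'RESEARCH', 30, 4000, 1000),
        ('Cy',  'RESEARCH', 40, 4000, 1000),
        ('Di',  'SALES',    45, 3000, 9000);
""")

# Everyone who takes part in at least one qualifying pair, in either role.
both_sides = conn.execute("""
    SELECT e.name
    FROM employees e
    WHERE EXISTS (SELECT 1 FROM employees e2
                  WHERE e2.department <> e.department
                    AND e2.age = e.age
                    AND (e2.bonus > e.salary OR e.bonus > e2.salary))
    ORDER BY e.name
""").fetchall()

# One row per pair: e1 is the person with the bonus, e2 the person with the salary.
pairs = conn.execute("""
    SELECT e1.name, e2.name
    FROM employees e1
    JOIN employees e2
      ON e1.age = e2.age
     AND e1.department <> e2.department
     AND e1.bonus > e2.salary
    ORDER BY e1.name, e2.name
""").fetchall()

print(both_sides)
print(pairs)
```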