Join operation in SQL

I have this task: by joining the tables HR.DEPARTMENTS and HR.EMPLOYEES, display complete data on the departments in which the minimum salary is below 5000.
I tried the following, but it raises an error:
select distinct d.department_id,department_name,
d.manager_id, location_id
from hr.departments d
left join hr.employees e on e.department_id = d.department_id
where min(e.salary) < 5000
order by 1
Error: group function is not allowed here
This is what hr.employees looks like:
EMPLOYEE_ID FIRST_NAME LAST_NAME EMAIL PHONE_NUMBER HIRE_DATE JOB_ID SALARY COMMISSION_PCT MANAGER_ID DEPARTMENT_ID
100 Steven King SKING 515.123.4567 17-JUN-03 AD_PRES 24000 - - 90
hr.departments:
DEPARTMENT_ID DEPARTMENT_NAME MANAGER_ID LOCATION_ID
10 Administration 200 1700

You cannot use MIN in the WHERE clause, because MIN is an aggregation result over many rows, but in a WHERE clause you look at single rows (before any aggregation takes place).
The task to get the departments in question by joining the tables is a bit weird, because this is not how this should be done in SQL. If you must do it this way, then you only need a slight change to your query: Change the join into an inner join and check the rows' salary.
select distinct
d.department_id, department_name, d.manager_id, location_id
from hr.departments d
join hr.employees e on e.department_id = d.department_id
where e.salary < 5000
order by d.department_id;
The proper solution would use EXISTS or IN instead, so as not to create an unnecessarily large intermediate result that you must get rid of with DISTINCT:
select *
from hr.departments
where department_id in (select department_id from hr.employees where salary < 5000)
order by department_id;
or
select *
from hr.departments d
where exists
(
select null
from hr.employees e
where e.salary < 5000
and e.department_id = d.department_id
)
order by department_id;
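The row-level check works because a department's minimum salary is below 5000 exactly when at least one of its employees earns less than 5000. A minimal sketch of that equivalence, assuming SQLite (through Python's sqlite3 module) as a stand-in for Oracle and a few made-up rows shaped like the HR tables:

```python
import sqlite3

# Hypothetical sample data; only the columns needed for the check are created.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE departments (department_id INTEGER PRIMARY KEY, department_name TEXT);
    CREATE TABLE employees (employee_id INTEGER PRIMARY KEY, salary INTEGER, department_id INTEGER);
    INSERT INTO departments VALUES (10, 'Administration'), (20, 'Marketing'), (90, 'Executive');
    INSERT INTO employees VALUES
        (200, 4400, 10),
        (201, 13000, 20), (202, 4800, 20), (203, 4900, 20),
        (100, 24000, 90), (101, 17000, 90);
""")

# Row-level filter: at least one employee of the department earns less than 5000.
exists_sql = """
    SELECT d.department_id
    FROM departments d
    WHERE EXISTS (SELECT 1 FROM employees e
                  WHERE e.department_id = d.department_id AND e.salary < 5000)
    ORDER BY d.department_id
"""

# Aggregate filter: the minimum salary of each department group is below 5000.
having_sql = """
    SELECT d.department_id
    FROM departments d
    JOIN employees e ON e.department_id = d.department_id
    GROUP BY d.department_id
    HAVING MIN(e.salary) < 5000
    ORDER BY d.department_id
"""

exists_ids = [row[0] for row in conn.execute(exists_sql)]
having_ids = [row[0] for row in conn.execute(having_sql)]
print(exists_ids, having_ids)
```

Both queries list departments 10 and 20 for this data. Marketing has two employees under 5000 but appears once in each result, without DISTINCT.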

WHERE filters individual rows (for example gender = 'Male'), while HAVING filters on aggregate results (for example MIN(salary) < 5000). To use HAVING you need a GROUP BY on something such as the department.
SELECT
*
FROM
DimEmployee
WHERE
EmployeeID IN (
SELECT
EmployeeID
FROM
DimEmployee
GROUP BY
EmployeeID
HAVING
MIN(Salary) < 5000
)

First of all, don't use distinct, unless you have to. Secondly, you can't use group functions like that.
In order to solve this, you need to break the task into steps, breaking down your sentences.
"...the tables"
So we have this:
SELECT * FROM hr.departments;
... and ...
SELECT * FROM hr.employees;
"HR.DEPARTMENTS and HR.EMPLOYEES"
As you pointed out, the foreign key is the department ID.
(we first test the join, then add what we need)
(the 1 is just a placeholder; you can use EMPLOYEE_ID or COUNT(1), it's irrelevant)
SELECT 1
FROM hr.employees e
LEFT JOIN hr.departments d on e.department_id = d.department_id;
"display complete data on departments"
Well, this is simple, you just enumerate the columns you need or use d.*. We'll do this later.
"which the minimum salary is below 5000"
Now we get to the blocking issue. Let's list the records.
SELECT d.*
FROM hr.employees e, hr.departments d
WHERE e.department_id = d.department_id
AND EXISTS (SELECT 1 FROM hr.employees m WHERE m.department_id = d.department_id GROUP BY m.department_id HAVING min(m.salary) < 5000);
But what's this? We get a line for every employee of that department. We can either use DISTINCT, which is bad practice, or fix the query.
We'll just remove the employees from the join.
SELECT d.*
FROM hr.departments d
WHERE EXISTS (SELECT 1 FROM hr.employees e WHERE e.department_id = d.department_id GROUP BY e.department_id HAVING min(e.salary) < 5000);
UPDATE:
To respect the task "By joining the tables"
So we have this:
SELECT d.*
FROM hr.departments d,
(
SELECT e.department_id
FROM hr.employees e
GROUP BY e.department_id
HAVING min(salary) < 5000
) e
WHERE e.department_id = d.department_id;

Related

SQL: Unable to SELECT joined column

I've written an SQL statement to display the department_id, job_id and lowest salary of employees, but one of the conditions required me to exclude departments with the names 'IT' and 'SALES', which were only accessible from another table, departments. So I joined the two tables on the shared column department_id and managed to filter the results as needed. However, I am unable to select the department_id to display alongside the job_id and salaries. This is what I've managed so far:
SELECT EMPLOYEES.DEPARTMENT_ID JOB_ID, MIN(SALARY)
FROM EMPLOYEES JOIN DEPARTMENTS
ON DEPARTMENTS.DEPARTMENT_ID = EMPLOYEES.DEPARTMENT_ID
WHERE JOB_ID NOT LIKE '%REP'
AND DEPARTMENTS.DEPARTMENT_NAME NOT IN ('IT','SALES')
GROUP BY EMPLOYEES.DEPARTMENT_ID
HAVING MIN(SALARY) >= 6000 AND MIN(SALARY) <= 18000;
First, table aliases make the query much easier to write and read:
SELECT e.DEPARTMENT_ID, e.JOB_ID, MIN(e.SALARY)
FROM EMPLOYEES e JOIN
DEPARTMENTS d
ON d.DEPARTMENT_ID = e.DEPARTMENT_ID
WHERE e.JOB_ID NOT LIKE '%REP' AND d.DEPARTMENT_NAME NOT IN ('IT', 'SALES')
GROUP BY e.DEPARTMENT_ID, e.JOB_ID
HAVING MIN(e.SALARY) >= 6000 AND MIN(e.SALARY) <= 18000;
You need all non-aggregated columns in the GROUP BY.
A comma is missing after the department ID in your SELECT list, so JOB_ID is treated as an alias for DEPARTMENT_ID and the department ID is displayed under the heading JOB_ID. Once you add the comma, JOB_ID is a non-aggregated column, so you must either add it to the GROUP BY clause or wrap it in an aggregate function:
SELECT EMPLOYEES.DEPARTMENT_ID, JOB_ID, MIN(SALARY)
FROM EMPLOYEES JOIN DEPARTMENTS ON DEPARTMENTS.DEPARTMENT_ID=EMPLOYEES.DEPARTMENT_ID
WHERE JOB_ID NOT LIKE '%REP' AND DEPARTMENTS.DEPARTMENT_NAME NOT IN('IT','SALES')
GROUP BY EMPLOYEES.DEPARTMENT_ID,JOB_ID
HAVING MIN(SALARY) >= 6000 AND MIN(SALARY) <= 18000;
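The effect of the missing comma can be reproduced in isolation. This is a minimal sketch, assuming SQLite (through Python's sqlite3 module) instead of Oracle and a made-up three-row employees table; it only inspects the column names the database reports for each statement:

```python
import sqlite3

# Hypothetical sample rows; column names follow the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (employee_id INTEGER PRIMARY KEY, job_id TEXT,
                            salary INTEGER, department_id INTEGER);
    INSERT INTO employees VALUES
        (1, 'AD_VP', 17000, 90), (2, 'AD_PRES', 24000, 90), (3, 'MK_MAN', 13000, 20);
""")

# No comma between the two names: job_id becomes an alias for department_id.
cur = conn.execute(
    "SELECT department_id job_id, MIN(salary) FROM employees GROUP BY department_id"
)
missing_comma_cols = [col[0] for col in cur.description]
missing_comma_rows = sorted(cur.fetchall())

# With the comma, both columns are selected and both are listed in GROUP BY.
cur = conn.execute(
    "SELECT department_id, job_id, MIN(salary) AS min_salary FROM employees "
    "GROUP BY department_id, job_id ORDER BY department_id, job_id"
)
fixed_cols = [col[0] for col in cur.description]
fixed_rows = cur.fetchall()

print(missing_comma_cols, missing_comma_rows)
print(fixed_cols, fixed_rows)
```

In the first statement the result has only two columns and the first one is labelled job_id although it holds department IDs, which matches the symptom described in the question.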

SQL-HR Schema, Retrieving the Dept.Names,managers and employees per dept

Hello guys and thank you in advance for your time and help.
I am trying to get a list of the department names, each department's manager name, and the total number of employees per department.
My code so far looks like this:
select d.department_name,e.first_name,e.last_name
from employees e, departments d
where e.department_id = d.department_id and d.manager_id=e.employee_id
group by d.department_name,e.first_name,e.last_name
order by d.department_name;
This produces the manager of each department, but I am still missing the count of employees per department. Any ideas?
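You need the COUNT function, and you need employees twice: once joined on manager_id for the manager's name and once joined on department_id for the rows to count. Joining only on the manager row would always give a count of 1. Try this:
select d.department_name, m.first_name, m.last_name, count(e.employee_id) as total_no_of_employees
from departments d
join employees m on m.employee_id = d.manager_id
join employees e on e.department_id = d.department_id
group by d.department_name, m.first_name, m.last_name
order by d.department_name;
Also, try to avoid the old comma-separated join syntax; use explicit JOIN ... ON instead.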
After a lot of experimentation I got it. Posting it in case somebody might find it useful someday:
select distinct d.department_name,
(select e.first_name||', '||e.last_name from employees e
where d.department_id=e.department_id and
d.manager_id=e.employee_id)as "manager_name",
( select count( employee_id ) from employees e
where d.department_id=e.department_id ) as "total_no_of_employees"
from employees e
join departments d on d.department_id=e.department_id
order by d.department_name;
Try this:
select emp.manager_id, mgr.first_name, mgr.last_name, dept.department_name, count(emp.employee_id)
from hr.employees emp
join hr.employees mgr
on emp.manager_id = mgr.employee_id
join hr.departments dept
on mgr.department_id = dept.department_id
group by emp.manager_id, mgr.first_name, mgr.last_name, dept.department_name
order by department_name
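Both the manager's name and the head count come from employees, so that table has to play two roles in the query. A minimal sketch of the difference, assuming SQLite (through Python's sqlite3 module) instead of Oracle and a handful of illustrative rows in the shape of the HR tables:

```python
import sqlite3

# Illustrative rows only; the real HR schema has more columns and more data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE departments (department_id INTEGER PRIMARY KEY,
                              department_name TEXT, manager_id INTEGER);
    CREATE TABLE employees (employee_id INTEGER PRIMARY KEY, first_name TEXT,
                            last_name TEXT, department_id INTEGER);
    INSERT INTO departments VALUES (20, 'Marketing', 201), (90, 'Executive', 100);
    INSERT INTO employees VALUES
        (100, 'Steven', 'King', 90), (101, 'Neena', 'Kochhar', 90), (102, 'Lex', 'De Haan', 90),
        (201, 'Michael', 'Hartstein', 20), (202, 'Pat', 'Fay', 20);
""")

# One join restricted to the manager row: every group holds exactly one employee.
single_join = conn.execute("""
    SELECT d.department_name, COUNT(e.employee_id)
    FROM employees e
    JOIN departments d
      ON e.department_id = d.department_id AND d.manager_id = e.employee_id
    GROUP BY d.department_name
    ORDER BY d.department_name
""").fetchall()

# employees joined twice: m supplies the manager, e supplies the rows to count.
double_join = conn.execute("""
    SELECT d.department_name, m.first_name, m.last_name, COUNT(e.employee_id)
    FROM departments d
    JOIN employees m ON m.employee_id = d.manager_id
    JOIN employees e ON e.department_id = d.department_id
    GROUP BY d.department_id, d.department_name, m.first_name, m.last_name
    ORDER BY d.department_name
""").fetchall()

print(single_join)
print(double_join)
```

With these rows the single join reports a count of 1 for every department, while the double join reports 3 for Executive and 2 for Marketing.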

Employees with largest salary in department

I found a couple of SQL tasks on Hacker News today, however I am stuck on solving the second task in Postgres, which I'll describe here:
You have the following, simple table structure:
List the employees who have the biggest salary in their respective departments.
I set up an SQL Fiddle here for you to play with. It should return Terry Robinson, Laura White. Along with their names it should have their salary and department name.
Furthermore, I'd be curious to know of a query which would return Terry Robinson (maximum salary in the Sales department) and Laura White (maximum salary in the Marketing department) and an empty row for the IT department, with null as the employee, explicitly stating that there are no employees (thus nobody with the highest salary) in that department.
Return one employee with the highest salary per dept.
Use DISTINCT ON for a much simpler and faster query that does all you are asking for:
SELECT DISTINCT ON (d.id)
d.id AS department_id, d.name AS department
,e.id AS employee_id, e.name AS employee, e.salary
FROM departments d
LEFT JOIN employees e ON e.department_id = d.id
ORDER BY d.id, e.salary DESC;
Also note the LEFT [OUTER] JOIN that keeps departments with no employees in the result.
This picks only one employee per department. If there are multiple sharing the highest salary, you can add more ORDER BY items to pick one in particular. Else, an arbitrary one is picked from peers.
If there are no employees, the department is still listed, with NULL values for employee columns.
You can simply add any columns you need in the SELECT list.
Find a detailed explanation, links and a benchmark for the technique in this related answer:
Select first row in each GROUP BY group?
Aside: It is an anti-pattern to use non-descriptive column names like name or id. Should be employee_id, employee etc.
Return all employees with the highest salary per dept.
Use the window function rank() (like @Scotch already posted, just simpler and faster):
SELECT d.name AS department, e.employee, e.salary
FROM departments d
LEFT JOIN (
SELECT name AS employee, salary, department_id
,rank() OVER (PARTITION BY department_id ORDER BY salary DESC) AS rnk
FROM employees e
) e ON e.department_id = d.id AND e.rnk = 1;
Same result as with the above query with your example (which has no ties), just a bit slower.
This is with reference to your fiddle:
SELECT * -- or whatever is your columns list.
FROM employees e JOIN departments d ON e.Department_ID = d.id
WHERE (e.Department_ID, e.Salary) IN (SELECT Department_ID, MAX(Salary)
FROM employees
GROUP BY Department_ID)
EDIT :
As mentioned in a comment below, if you want to see the IT department also, with all NULL for the employee records, you can use the RIGHT JOIN and put the filter condition in the joining clause itself as follows:
SELECT e.name, e.salary, d.name -- or whatever is your columns list.
FROM employees e RIGHT JOIN departments d ON e.Department_ID = d.id
AND (e.Department_ID, e.Salary) IN (SELECT Department_ID, MAX(Salary)
FROM employees
GROUP BY Department_ID)
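Where the maximum-salary condition is placed decides whether the empty department survives the outer join. A minimal sketch, assuming SQLite through Python's sqlite3 module rather than Postgres: it uses LEFT JOIN with departments on the left (older SQLite versions have no RIGHT JOIN), a correlated subquery in place of the row-value IN, and invented salaries and co-workers around the two names from the question:

```python
import sqlite3

# Hypothetical data: salaries and the two extra employees are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE departments (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT,
                            salary INTEGER, department_id INTEGER);
    INSERT INTO departments VALUES (1, 'Sales'), (2, 'Marketing'), (3, 'IT');
    INSERT INTO employees VALUES
        (1, 'Terry Robinson', 39000, 1), (2, 'John Smith', 25000, 1),
        (3, 'Laura White', 38000, 2), (4, 'Amy Brown', 27000, 2);
""")

# Filter inside ON: departments without a matching employee are kept with NULLs.
filter_in_on = conn.execute("""
    SELECT d.name, e.name, e.salary
    FROM departments d
    LEFT JOIN employees e
      ON e.department_id = d.id
     AND e.salary = (SELECT MAX(m.salary) FROM employees m WHERE m.department_id = d.id)
    ORDER BY d.id
""").fetchall()

# Same filter in WHERE: the NULL-extended IT row fails the comparison and is dropped.
filter_in_where = conn.execute("""
    SELECT d.name, e.name, e.salary
    FROM departments d
    LEFT JOIN employees e ON e.department_id = d.id
    WHERE e.salary = (SELECT MAX(m.salary) FROM employees m WHERE m.department_id = d.id)
    ORDER BY d.id
""").fetchall()

print(filter_in_on)
print(filter_in_where)
```

The first result contains a row for IT with NULL employee columns; the second contains only Sales and Marketing.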
This is basically what you want. Rank() Over
SELECT ename ,
departments.name
FROM ( SELECT ename ,
dname
FROM ( SELECT employees.name as ename ,
departments.name as dname ,
rank() over (
PARTITION BY employees.department_id
ORDER BY employees.salary DESC
)
FROM Employees
JOIN Departments on employees.department_id = departments.id
) t
WHERE rank = 1
) s
RIGHT JOIN departments on s.dname = departments.name
Good old classic sql:
select e1.name, e1.salary, e1.department_id
from employees e1
where e1.salary=
(select max(e.salary)
from employees e
where e.department_id = e1.department_id
group by e.department_id
)
Table1 is emp - empno, ename, sal, deptno
Table2 is dept - deptno, dname.
Query could be (includes ties & runs on 11.2g):
select e1.empno, e1.ename, e1.sal, e1.deptno as department
from emp e1
where (e1.deptno, e1.sal) in
(SELECT e.deptno, max(e.sal) from emp e group by e.deptno)
order by e1.deptno asc;
SELECT
e.first_name, d.department_name, e.salary
FROM
employees e
JOIN
departments d
ON
(e.department_id = d.department_id)
WHERE
e.first_name
IN
(SELECT TOP 2
first_name
FROM
employees
WHERE
department_id = d.department_id);
select d.Name, e.Name, e.Salary from Employees e, Departments d,
(select DepartmentId as DeptId, max(Salary) as Salary
from Employees e
group by DepartmentId) m
where m.Salary = e.Salary
and m.DeptId = e.DepartmentId
and e.DepartmentId = d.DepartmentId
The max salary of each department is computed in inner query using GROUP BY. And then select employees who satisfy those constraints.
Assuming Postgres
Return the employees with the highest salary in each department, assuming a table emp with a dept_id column:
select e1.*
from emp e1
inner join (select max(sal) as max_sal, dept_id from emp group by dept_id) as e2
on e1.dept_id = e2.dept_id and e1.sal = e2.max_sal
Returns one or more people for each department with the highest salary:
SELECT result.Name Department, Employee2.Name Employee, result.salary Salary
FROM ( SELECT dept.name, dept.department_id, max(Employee1.salary) salary
FROM Departments dept
JOIN Employees Employee1 ON Employee1.department_id = dept.department_id
GROUP BY dept.name, dept.department_id ) result
JOIN Employees Employee2 ON Employee2.department_id = result.department_id
WHERE Employee2.salary = result.salary
SQL query:
select d.name,e.name,e.salary
from employees e, depts d
where e.dept_id = d.id
and (d.id,e.salary) in
(select dept_id,max(salary) from employees group by dept_id);
Take a look at this solution:
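SELECT
E.SALARY,
E.NAME,
D.NAME as Department
FROM employees E
INNER JOIN DEPARTMENTS D ON D.ID = E.DEPARTMENT_ID
-- keep only the rows whose salary equals the maximum of their own department
WHERE E.SALARY = (SELECT MAX(E2.SALARY) FROM employees E2 WHERE E2.DEPARTMENT_ID = E.DEPARTMENT_ID)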

Select the biggest value

I am trying to solve a simple problem but I am getting stuck on the details.
I have 2 tables, one has employees and the other one has departments. My problem: I am trying to check which department has the most employees and output only that specific department.
So far I have:
select count(*) Number_of_employees
from department d, employee e
where d.department_id = e.department_id
group by department_name
which outputs:
NUMBER_OF_EMPLOYEES
----------------------
2
4
3
3
3
My goal is to output only the department with the most employees, which is the department with 4 employees.
I tried using MAX and JOIN, but I am not so good with joins yet, so any suggestions will be appreciated.
@Zsolt Botykai
I think this is correct, apart from the ORDER BY, which needs to be DESC, and I don't think you can refer to number_of_employees inside the query (you can't in Oracle, anyway).
select department_name
from
(select department_name
,number_of_employees
from
( select department_name, count(*) Number_of_employees
from department d, employee e
where d.department_id = e.department_id
group by department_name)
order by Number_of_employees DESC)
where rownum = 1
You could do it this way to avoid the rownum:
select
max(d.department_name) keep (dense_rank first order by count(1) desc) as department_name
, count(1) as number_of_employees
from employee e
inner join department d on (e.department_id = d.department_id)
group by d.department_name
;
select department_name from
( select department_name, count(*) Number_of_employees
from department d, employee e
where d.department_id = e.department_id
group by department_name
order by 2 desc )
where rownum = 1
should do.
HTH

Single-row subquery returns more than one row

I need some help with Oracle SQL. The problem: I have two tables, employee and department. I got the average department salary from one query and I want to use it to see how many employees make more money than the average of their department. I have this so far.
This query returns the avg of the department:
select ROUND(AVG(Salary), 2) Dept_avg_sal
from employee, department
where department.department_id = employee.department_id
group by department_name
What I am trying to do is:
select employee_name,
salary,
d.department_name
from employee e,
department d
where salary > (select ROUND(AVG(Salary), 2) Dept_avg_sal
from employee,
department
where department.department_id = employee.department_id
group by department_name)
The error that I'm getting is: 01427. 00000 - "single-row subquery returns more than one row"
I know that 2 employees in the same department make more money than the average and i think this is what is causing the issue.
EMPLOYEE_NAME  SALARY   DEPARTMENT_NAME  DEPT_AVG_SAL
-------------  -------  ---------------  ------------
FISHER         3000.00  SALES            2500.00
JONES          3000.00  ACCOUNTING       2750.00
KING           5000.00  EXECUTIVE        4500.00
SCOTT          2500.00  IT               2100.00
SMITH          2900.00  IT               2100.00
WILSON         3000.00  RESEARCH         2633.33
Any help would be really appreciated.
Your initial query is missing any join condition on the outer query and any correlation condition in the inner query that would limit that to just the row for the department of interest. Also generally you do not want to group by name as presumably id is the primary key.
Resolving these issues to fix your correlated subquery gives
SELECT e.employee_name,
e.salary,
d.department_name
FROM employee e
JOIN department d
ON d.department_id = e.department_id
WHERE e.salary > (SELECT ROUND(AVG(Salary), 2) Dept_avg_sal
FROM employee e2
WHERE e2.department_id = e.department_id)
But you may find ditching the scalar correlated sub-query and replacing with a derived table works better.
SELECT e.employee_name,
e.salary,
d.department_name
FROM employee e
JOIN department d
ON d.department_id = e.department_id
JOIN (SELECT ROUND(AVG(Salary), 2) Dept_avg_sal,
department_id
FROM employee
GROUP BY department_id) e2
ON e2.department_id = e.department_id
AND e.salary > e2.Dept_avg_sal
For Oracle the following should also work I believe
SELECT employee_name,
salary,
department_name
FROM (SELECT employee_name,
salary,
d.department_name,
AVG(Salary) OVER (PARTITION BY e.department_id) AS AvgSalary
FROM employee e
JOIN department d
ON d.department_id = e.department_id)
WHERE salary > AvgSalary
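The two fixes above can be compared side by side on a small data set. This is a minimal sketch, assuming SQLite through Python's sqlite3 module instead of Oracle; the IT and SALES rows are chosen so the averages match the 2100 and 2500 shown in the question, and ADAMS and MILLER are invented filler employees. It also shows why the original subquery fails: grouped without correlation, it yields one average per department rather than a single value.

```python
import sqlite3

# Hypothetical rows; ROUND is left out because the averages are whole numbers here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE department (department_id INTEGER PRIMARY KEY, department_name TEXT);
    CREATE TABLE employee (employee_name TEXT, salary INTEGER, department_id INTEGER);
    INSERT INTO department VALUES (1, 'IT'), (2, 'SALES');
    INSERT INTO employee VALUES
        ('SCOTT', 2500, 1), ('SMITH', 2900, 1), ('ADAMS', 900, 1),
        ('FISHER', 3000, 2), ('MILLER', 2000, 2);
""")

# The uncorrelated, grouped subquery returns one row per department.
per_dept_avgs = conn.execute(
    "SELECT AVG(salary) FROM employee GROUP BY department_id"
).fetchall()

# Correlated scalar subquery: exactly one average for the current employee's department.
correlated = conn.execute("""
    SELECT e.employee_name, e.salary, d.department_name
    FROM employee e
    JOIN department d ON d.department_id = e.department_id
    WHERE e.salary > (SELECT AVG(e2.salary) FROM employee e2
                      WHERE e2.department_id = e.department_id)
    ORDER BY e.employee_name
""").fetchall()

# Derived table: compute every department average once, then join to it.
derived = conn.execute("""
    SELECT e.employee_name, e.salary, d.department_name
    FROM employee e
    JOIN department d ON d.department_id = e.department_id
    JOIN (SELECT department_id, AVG(salary) AS dept_avg_sal
          FROM employee GROUP BY department_id) a
      ON a.department_id = e.department_id AND e.salary > a.dept_avg_sal
    ORDER BY e.employee_name
""").fetchall()

print(len(per_dept_avgs), correlated, derived)
```

Both forms return FISHER, SCOTT and SMITH for this data, so two employees of the same department being above the average is not a problem once the subquery is tied to one department.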
The > operator accepts only one value, thus your inner SELECT has to return exactly 1 row. My guess is that you get multiple rows. Look at what your inner SELECT returns and try LIMIT 1.
I think you should put an extra d.department_id = department.department_id condition to the subquery (not tested):
select employee_name,
salary,
d.department_name
from employee e,
department d
where salary > (select ROUND(AVG(Salary), 2) Dept_avg_sal
from employee,
department
where department.department_id = employee.department_id
AND d.department_id = department.department_id
group by department_name)
Or just write:
select e.employee_name,
e.salary,
d.department_name
from employee e,
department d
where e.department_id = d.department_id
AND salary > (select ROUND(AVG(Salary), 2) Dept_avg_sal
from employee
where e.department_id = employee.department_id)