SQL Issue with my query

I have an issue with a query that operates on the following table:
+----+------+--------+------------+
| ID | NAME | SALARY | DEPARTMENT |
+----+------+--------+------------+
| 1  | John | 100    | Accounting |
| 2  | Mary | 200    | IT         |
+----+------+--------+------------+
What I am trying to achieve is a query that produces the following:
For each employee, find the average salary of those employees whose salary is within 100 (either more or less) of the given employee's salary and who work in the same department.
So far I have this:
SELECT E1.ID, AVG(E2.SALARY) FROM E1 EMP, E2 EMP
WHERE ABS(E1.SALARY-E2.SALARY)<= 100 AND E1.DEPARTMENT = E2.DEPARTMENT
GROUP BY E1.NAME
Is this correct?

You'd better use explicit join syntax:
SELECT E1.ID, AVG(E2.SALARY)
FROM EMP E1
JOIN EMP E2
ON E1.ID <> E2.ID AND
E1.DEPARTMENT = E2.DEPARTMENT AND
ABS(E1.SALARY - E2.SALARY) <= 100
GROUP BY E1.ID
The predicate E1.ID <> E2.ID is needed only if you don't want to include the employee's own salary in the average calculation.
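To see what the self-join returns with and without that predicate, here is a minimal sketch that runs the query in an in-memory SQLite database through Python's sqlite3 module. SQLite only stands in for whatever database the asker uses, the rows for Anna and Paul are invented for illustration, and the helper name average_nearby_salary is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE EMP (ID INTEGER PRIMARY KEY, NAME TEXT, SALARY INTEGER, DEPARTMENT TEXT);
    -- John and Mary come from the question; Anna and Paul are hypothetical.
    INSERT INTO EMP VALUES
        (1, 'John', 100, 'Accounting'),
        (2, 'Mary', 200, 'IT'),
        (3, 'Anna', 150, 'Accounting'),
        (4, 'Paul', 400, 'Accounting');
""")


def average_nearby_salary(exclude_self):
    """Run the self-join, optionally keeping each employee out of their own average."""
    self_filter = "E1.ID <> E2.ID AND" if exclude_self else ""
    sql = f"""
        SELECT E1.ID, AVG(E2.SALARY)
        FROM EMP E1
        JOIN EMP E2
          ON {self_filter}
             E1.DEPARTMENT = E2.DEPARTMENT AND
             ABS(E1.SALARY - E2.SALARY) <= 100
        GROUP BY E1.ID
        ORDER BY E1.ID
    """
    return conn.execute(sql).fetchall()


peers_only = average_nearby_salary(exclude_self=True)
with_self = average_nearby_salary(exclude_self=False)
print(peers_only)
print(with_self)
```

Note that with the predicate in place, employees who have no qualifying colleague (Mary and Paul in this sample) drop out of the result entirely, because the inner join finds no matching row for them.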

Related

SQL displaying staff with more salary than their managers

I'm trying to display staff from the same department who earn more than their managers.
SELECT ID, NAME, DEPARTMENT, SALARY, JOB
FROM STAFF
WHERE SALARY > ANY (SELECT SALARY FROM STAFF WHERE JOB = 'Manager')
This doesn't seem to work, and I'm really not sure why.
Here's a peek at how the table is formatted:
ID | NAME | DEPARTMENT | SALARY | JOB
20 | JOHN | 180 | 52000 | Manager
30 | KATY | 180 | 60000 | Analyst

The problem is that you need to correlate the subquery so that it matches the same department:
SELECT s1.ID, s1.NAME, s1.DEPARTMENT, s1.SALARY, s1.JOB
FROM STAFF s1
WHERE
    s1.JOB <> 'Manager' AND
    s1.SALARY > (SELECT s2.SALARY FROM STAFF s2
                 WHERE s2.DEPARTMENT = s1.DEPARTMENT AND s2.JOB = 'Manager');
This answer assumes that each department has only one manager. If there could be more than one manager, it would be safer to write the above using EXISTS logic:
SELECT s1.ID, s1.NAME, s1.DEPARTMENT, s1.SALARY, s1.JOB
FROM STAFF s1
WHERE
    s1.JOB <> 'Manager' AND
    NOT EXISTS (SELECT 1 FROM STAFF s2
                WHERE s2.DEPARTMENT = s1.DEPARTMENT AND
                      s2.JOB = 'Manager' AND
                      s2.SALARY >= s1.SALARY);
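The difference between the uncorrelated and the correlated form can be checked with a small sketch in Python's sqlite3 module. It assumes the table is named STAFF as in the question; JOHN and KATY are the question's rows and the other three are hypothetical. SQLite has no ANY operator, so the question's "> ANY (...)" is rewritten as a comparison against the minimum manager salary, which is equivalent for this data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE STAFF (ID INTEGER, NAME TEXT, DEPARTMENT INTEGER, SALARY INTEGER, JOB TEXT);
    -- JOHN and KATY come from the question; the other rows are hypothetical.
    INSERT INTO STAFF VALUES
        (20, 'JOHN', 180, 52000, 'Manager'),
        (30, 'KATY', 180, 60000, 'Analyst'),
        (40, 'MIKE', 180, 50000, 'Analyst'),
        (50, 'SARA', 200, 70000, 'Manager'),
        (60, 'TOM',  200, 65000, 'Analyst');
""")

# Uncorrelated comparison, as in the question: any manager in any department counts.
uncorrelated = conn.execute("""
    SELECT ID FROM STAFF
    WHERE SALARY > (SELECT MIN(SALARY) FROM STAFF WHERE JOB = 'Manager')
    ORDER BY ID
""").fetchall()

# Correlated NOT EXISTS: only managers of the same department are compared.
correlated = conn.execute("""
    SELECT s1.ID FROM STAFF s1
    WHERE s1.JOB <> 'Manager'
      AND NOT EXISTS (SELECT 1 FROM STAFF s2
                      WHERE s2.DEPARTMENT = s1.DEPARTMENT
                        AND s2.JOB = 'Manager'
                        AND s2.SALARY >= s1.SALARY)
    ORDER BY s1.ID
""").fetchall()

print(uncorrelated)
print(correlated)
```

In this sample the uncorrelated version also returns SARA (a manager herself) and TOM (who earns less than his own manager), while the correlated version returns only KATY.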

SQL - Retrieving data within groups before and after some condition

With the two following tables:
EMPLOYEE (Fname, Lname, SSN, DNO)
DEPARTMENT (Dname, Dnumber)
For each department that has more than five employees, retrieve the
department name and the number of its employees who are making more
than $40,000
Here is an incorrect solution to this:
SELECT
dname,
COUNT(*)
FROM
Department, Employee
WHERE
dnumber = dno
AND salary > 40000
GROUP BY
dname
HAVING
COUNT(*) > 5;
Clearly, it would not list a department with more than five employees unless more than five of them earn over $40,000, because the WHERE clause is applied before the GROUP BY clause, which is not what we want.
Here is the correct solution:
SELECT
dname, COUNT(*)
FROM
Department, Employee
WHERE
dnumber = dno
AND salary > 40000
AND dno IN (SELECT dno
FROM Employee
GROUP BY dno
HAVING COUNT(*) > 5)
GROUP BY
dname
I can't see why this is correct.
Isn't it going to restrict the rows first to employees who earn more than $40,000, then do the grouping just like the first query? What is different here?

Sub-query, the basics:
First, let's make this query a bit easier to read:
SELECT
dname,
COUNT(*)
FROM
Department,
Employee
WHERE
dnumber = dno
AND salary > 40000
AND dno IN (
SELECT dno
FROM Employee
GROUP BY dno
HAVING COUNT(*) > 5
)
GROUP BY dname
As you can see, there is what we call a "sub-query": a query inside the query.
This is the part in dno IN (/*HERE is the Sub-query*/).
As with parentheses in mathematics, you can think of the sub-query as being evaluated first: SQL finds the DNO values that have more than 5 employees, which effectively produces the following query:
SELECT
dname,
COUNT(*)
FROM
Department,
Employee
WHERE
dnumber = dno
AND salary > 40000
AND dno IN (
'dno10emp', 'dno24emp', 'dno45emp'
)
GROUP BY dname
Now you find yourself with a simple query that returns the departments that:
have at least one employee with a salary > $40k
and are among the departments with more than 5 employees
What's wrong?!
Well, I'd say your "good query" isn't that good, and that's why you're struggling: it won't return a department unless it has at least one employee earning > $40k.
Here is a query that will:
SELECT
Department.dname,
COUNT(Employee.salary)
FROM
Department
LEFT JOIN Employee
ON Department.dnumber = Employee.dno
AND Employee.salary > 40000
WHERE
Department.dnumber IN (
SELECT Employee.dno
FROM Employee
GROUP BY Employee.dno
HAVING COUNT(*) > 5
)
GROUP BY Department.dname
This returns every department that has at least 6 employees, then counts its employees earning more than $40k (a department could have 0).
Could you show me?
As a picture is worth a thousand words:
SQL Fiddle
MySQL 5.6 Schema Setup (nb is the number of employees at that salary):
| dname             | nb | salary |
|-------------------|----|--------|
| accounting        | 2  | 30000  |
| accounting        | 4  | 50000  |
| boss              | 6  | 150000 |
| garbage-collector | 6  | 15000  |
Query 1:
SELECT
dname,
COUNT(*)
FROM
Department,
Employee
WHERE
dnumber = dno
AND salary > 40000
GROUP BY dname
HAVING COUNT(*) > 5
Results:
| dname | COUNT(*) |
|-------|----------|
| boss | 6 |
Query 2:
SELECT
dname,
COUNT(*)
FROM
Department,
Employee
WHERE
dnumber = dno
AND salary > 40000
AND
dno IN (
SELECT dno FROM Employee
GROUP BY dno
HAVING COUNT(*) > 5
)
GROUP BY dname
Results:
| dname | COUNT(*) |
|------------|----------|
| accounting | 4 |
| boss | 6 |
Query 3:
SELECT
Department.dname,
COUNT(Employee.salary)
FROM
Department
LEFT JOIN Employee
ON Department.dnumber = Employee.dno
AND Employee.salary > 40000
WHERE
Department.dnumber IN (
SELECT Employee.dno
FROM Employee
GROUP BY Employee.dno
HAVING COUNT(*) > 5
)
GROUP BY Department.dname
Results:
| dname | COUNT(Employee.salary) |
|-------------------|------------------------|
| accounting | 4 |
| boss | 6 |
| garbage-collector | 0 |

See sample data below.
http://sqlfiddle.com/#!9/357d29/2
The first query only returns departments with 6 or more highly paid employees, while the second query counts the highly paid employees of those departments that have 6 or more employees. The sample below does not show up in the first query but does in the second.
Department   Employee      Salary
accounting   john doe      50k
             jan smith     55k
             dan brown     60k
             eric murphy   60k
             al daniels    70k
             ellen boyle   30k
1st query: nothing, because only five employees have a salary > 40k
2nd query: counts everyone except ellen boyle. The department has > 5 employees, and all except one earn > 40k
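The sample above can be reproduced with a short sketch using Python's sqlite3 module. The Employee table is simplified to the columns the queries need, and the department number 1 is an assumption, since the sample does not give one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Department (dname TEXT, dnumber INTEGER);
    CREATE TABLE Employee (fname TEXT, lname TEXT, dno INTEGER, salary INTEGER);
    -- The department number is hypothetical; names and salaries are from the sample.
    INSERT INTO Department VALUES ('accounting', 1);
    INSERT INTO Employee VALUES
        ('john',  'doe',     1, 50000),
        ('jan',   'smith',   1, 55000),
        ('dan',   'brown',   1, 60000),
        ('eric',  'murphy',  1, 60000),
        ('al',    'daniels', 1, 70000),
        ('ellen', 'boyle',   1, 30000);
""")

# 1st query: HAVING counts only the rows that survived the salary filter.
first = conn.execute("""
    SELECT dname, COUNT(*)
    FROM Department, Employee
    WHERE dnumber = dno AND salary > 40000
    GROUP BY dname
    HAVING COUNT(*) > 5
""").fetchall()

# 2nd query: the subquery counts all employees, regardless of salary.
second = conn.execute("""
    SELECT dname, COUNT(*)
    FROM Department, Employee
    WHERE dnumber = dno AND salary > 40000
      AND dno IN (SELECT dno FROM Employee GROUP BY dno HAVING COUNT(*) > 5)
    GROUP BY dname
""").fetchall()

print(first)
print(second)
```

The first query returns no rows because only five rows pass the salary filter before grouping, while the second returns the department with a count of 5.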

For the record, you already got correct answers. I'll just try to explain it in a different way.
Your first query has one SELECT statement. It first keeps only employees with a salary > 40k and then keeps only departments with more than 5 of those employees, so both conditions apply to the same filtered set of rows.
Your second query has 2 select statements:
This is the first one:
Select dname, count(*)
from Department, Employee
where dnumber = dno
and salary > 40000
It returns the count, by department name, of all employees who earn > 40000. There is no condition on the count(*) here, and the condition on the salary has no effect on the second select statement:
SELECT Employee.dno
FROM Employee
GROUP BY Employee.dno
HAVING COUNT(*) > 5
This one looks at ALL employees in all departments, regardless of salary. This is where we have the condition on the count(*), but it is applied only locally, to keep the departments that have more than 5 employees.
The two statements are then combined: first we limit the departments to the ones we are interested in, and then from those we count only the high-salary employees.

First, never use commas in the FROM clause. Always use proper, explicit JOIN syntax.
I think the best and simplest solution uses conditional aggregation:
SELECT d.dname, SUM(CASE WHEN e.salary > 40000 THEN 1 ELSE 0 END) AS num_40kplus
FROM Department d JOIN
     Employee e
     ON d.dnumber = e.dno
GROUP BY d.dname
HAVING COUNT(*) > 5;
I see no reason why a subquery would be necessary or desirable.
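As a rough check of the conditional-aggregation approach, the sketch below loads the head counts from the SQL Fiddle summary earlier in the thread into SQLite via Python's sqlite3 module. The department numbers 1 to 3 are assumptions, and the join condition is written as d.dnumber = e.dno to match the column names given in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Department (dname TEXT, dnumber INTEGER);
    CREATE TABLE Employee (dno INTEGER, salary INTEGER);
    -- Department numbers are hypothetical.
    INSERT INTO Department VALUES ('accounting', 1), ('boss', 2), ('garbage-collector', 3);
""")

# (department number, head count, salary), following the fiddle's summary table
groups = [(1, 2, 30000), (1, 4, 50000), (2, 6, 150000), (3, 6, 15000)]
for dno, head_count, salary in groups:
    conn.executemany("INSERT INTO Employee VALUES (?, ?)", [(dno, salary)] * head_count)

# COUNT(*) sees every employee; the CASE expression counts only the high earners.
rows = conn.execute("""
    SELECT d.dname,
           SUM(CASE WHEN e.salary > 40000 THEN 1 ELSE 0 END) AS num_40kplus
    FROM Department d
    JOIN Employee e ON d.dnumber = e.dno
    GROUP BY d.dname
    HAVING COUNT(*) > 5
    ORDER BY d.dname
""").fetchall()

print(rows)
```

Because the salary test lives inside the aggregate rather than in WHERE, departments with zero high earners still appear, which matches the result of Query 3 in the earlier answer.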

How to Improve This Self-Joins

I am learning Oracle SQL by working with its primitive HR schema, in which the EMPLOYEES table has three columns that I'm mainly interested in: MANAGER_ID (a self-reference to EMPLOYEES.EMPLOYEE_ID), DEPARTMENT_ID, and SALARY. (You can find the schema diagram and schema objects here).
For each employee, I wish to retrieve his/her SALARY alongside the average salary in the department of the employee's manager. For instance, suppose we have the following (EMPLOYEE_ID = 140 is the employee of interest here):
+-------------+--------+---------------+------------+
| EMPLOYEE_ID | SALARY | DEPARTMENT_ID | MANAGER_ID |
+-------------+--------+---------------+------------+
| 140 | 12000 | 50 | 110 |
| 110 | 20000 | 60 | 101 |
| 156 | 18000 | 60 | 101 |
| 175 | 15000 | 60 | 105 |
| 320 | 24000 | 60 | 105 |
+-------------+--------+---------------+------------+
I am interested in obtaining the average salary of all the managers (excluding non-managerial employees) in the department where the employee's manager works (in this case, DEPARTMENT_ID = 60), and comparing it with the employee's own salary (in this case, that of employee 140). For the sample data above, the output should be:
+-------------+--------+-------------+-------------+------------+
| EMPLOYEE_ID | SALARY | AVG_MGR_SAL | MGR_DEPT_ID | MANAGER_ID |
+-------------+--------+-------------+-------------+------------+
| 140 | 12000 | 19250 | 60 | 110 |
+-------------+--------+-------------+-------------+------------+
where we have four (4) managers working in department 60, and $19250 being calculated as (20000 + 18000 + 15000 + 24000) / 4. I have come up with the following query that seems to work (and excludes those employees that don't have a manager):
select
employee_id
, salary employee_salary
, trunc(mgr_info.avg_manager_salary_per_dept, 0) emp_manager_avg_sal_dept
, mgr_info.manager_dept_id
, mgr_info.manager_id
from employees
join (
select
e1.employee_id manager_id
, e1.department_id manager_dept_id
, e1.salary manager_salary
, avg(e1.salary) over (partition by e1.department_id) avg_manager_salary_per_dept
from employees e1
join (
select distinct manager_id
from employees
where manager_id is not null
) mgr_ids
on e1.employee_id = mgr_ids.manager_id
) mgr_info
on employees.manager_id = mgr_info.manager_id
order by employee_id
However, I feel that there should be a better way of getting the same result with fewer self-joins. Is there a way to get better performance?

Something like this... You only need one join, you can compute the average salary for the manager's department on the "manager" copy of the table. I only included a few columns, you may need more, or fewer, but I believe the core of what you wanted is covered.
(NOTE: Edited since I realized I missed one detail in the requirement)
select e.employee_id as employee_id,
e.salary as employee_salary,
m.employee_id as manager_id,
m.department_id as manager_dept_id,
m.avg_salary as avg_sal_of_mgr_dept
from hr.employees e inner join
( select employee_id, department_id,
avg(salary) over (partition by department_id) as avg_salary
from hr.employees
where employee_id in (select manager_id from hr.employees)
) m
on e.manager_id = m.employee_id
;
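The single-join idea can be tried out on the question's sample rows with a sketch in Python's sqlite3 module (window functions need SQLite 3.25 or later). The table is created without the hr schema prefix, and three hypothetical employees are added so that 156, 175 and 320 actually appear as someone's manager, which the sample data implies but does not show.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (employee_id INTEGER, salary INTEGER,
                            department_id INTEGER, manager_id INTEGER);
    -- Rows from the question's sample data.
    INSERT INTO employees VALUES
        (140, 12000, 50, 110),
        (110, 20000, 60, 101),
        (156, 18000, 60, 101),
        (175, 15000, 60, 105),
        (320, 24000, 60, 105);
    -- Hypothetical reports, so that 156, 175 and 320 count as managers.
    INSERT INTO employees VALUES
        (141, 9000, 50, 156),
        (142, 9500, 50, 175),
        (143, 8000, 50, 320);
""")

# The derived table m keeps only managers and averages their salaries per department.
row_140 = conn.execute("""
    SELECT e.employee_id, e.salary, m.employee_id, m.department_id, m.avg_salary
    FROM employees e
    JOIN (SELECT employee_id, department_id,
                 AVG(salary) OVER (PARTITION BY department_id) AS avg_salary
          FROM employees
          WHERE employee_id IN (SELECT manager_id FROM employees)
         ) m
      ON e.manager_id = m.employee_id
    WHERE e.employee_id = 140
""").fetchall()

print(row_140)
```

For employee 140 this yields manager 110, department 60 and an average of 19250, the figure the question expects.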

Here is an option which uses a series of joins to get your result:
SELECT DISTINCT t1.EMPLOYEE_ID,
t1.SALARY,
t1.DEPARTMENT_ID,
COALESCE(t2.SALARY, 0.0) AS ManagerAvgSal
FROM employees t1
LEFT JOIN
(
SELECT e1.DEPARTMENT_ID, AVG(e1.SALARY) AS SALARY
FROM employees e1
WHERE e1.EMPLOYEE_ID IN (SELECT DISTINCT MANAGER_ID FROM employees)
GROUP BY e1.DEPARTMENT_ID
) t2
ON t1.DEPARTMENT_ID = t2.DEPARTMENT_ID

Finding second highest record from a table

I want to find the second highest salary and the employee's name from the employee table. This table has the columns emp_id, emp_name and emp_salary. To be clear:
employee
-----------------------------
| emp_id|emp_name|emp_salary|
-----------------------------
| 100   |John    | 2500     |
| 200   |Nash    | 1500     |
| 300   |Koffe   | 100      |
| 400   |Anan    | 6000     |
| 500   |Moon    | 2600     |
-----------------------------
In the above table the second highest salary is 2600. How can I find this?

You can try:
SELECT max(emp_salary)
FROM employee
WHERE emp_salary < (SELECT max(emp_salary)
                    FROM employee);
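A quick way to check this approach is to run it against the question's five rows, here with Python's sqlite3 module and the table and column names from the question. The second statement, which looks up the name that goes with that salary, is an addition for illustration and not part of the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (emp_id INTEGER, emp_name TEXT, emp_salary INTEGER);
    INSERT INTO employee VALUES
        (100, 'John', 2500),
        (200, 'Nash', 1500),
        (300, 'Koffe', 100),
        (400, 'Anan', 6000),
        (500, 'Moon', 2600);
""")

# Highest salary that is strictly below the overall maximum.
second_highest = conn.execute("""
    SELECT MAX(emp_salary)
    FROM employee
    WHERE emp_salary < (SELECT MAX(emp_salary) FROM employee)
""").fetchone()[0]

# Look up who earns that salary (there could be more than one employee).
holders = conn.execute(
    "SELECT emp_name, emp_salary FROM employee WHERE emp_salary = ?",
    (second_highest,),
).fetchall()

print(second_highest, holders)
```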

We can achieve this with a correlated subquery:
select e1.emp_ename, e1.emp_salary from employee e1
where 2 = (select count(distinct e2.emp_salary) from employee e2 where e2.emp_salary >= e1.emp_salary);
See how it works:
select * from employee
emp_id emp_ename emp_salary
1 naresh 100
2 suresh 150
3 mahesh 200
4 sai 250
In the above table the second highest salary is 200.
In a correlated subquery, the child query is evaluated once for each row of the parent query, using that row's values.
See how this query works:
select e1.emp_ename, e1.emp_salary from employee e1
where 2 = (select count(distinct e2.emp_salary) from employee e2 where e2.emp_salary >= e1.emp_salary);
1st step: the first row's salary goes into the child query's WHERE clause:
select e1.emp_ename, e1.emp_salary from employee e1
where 2 = (select count(distinct e2.emp_salary) from employee e2 where e2.emp_salary >= 100);
It counts how many distinct salaries are greater than or equal to 100 (i.e. 4), so the condition is false:
select e1.emp_ename, e1.emp_salary from employee e1
where 2 = 4 (false)
2nd step:
select e1.emp_ename, e1.emp_salary from employee e1
where 2 = (select count(distinct e2.emp_salary) from employee e2 where e2.emp_salary >= 150);
select e1.emp_ename, e1.emp_salary from employee e1
where 2 = 3 (false)
3rd step:
select e1.emp_ename, e1.emp_salary from employee e1
where 2 = (select count(distinct e2.emp_salary) from employee e2 where e2.emp_salary >= 200);
select e1.emp_ename, e1.emp_salary from employee e1
where 2 = 2 (true)
So in the end we get 200.
I hope this is useful for you. All the best.
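The step-by-step counts above can be verified with a small sketch in Python's sqlite3 module, using the four sample rows from this answer. It assumes the subquery is correlated on the outer row's salary (e2.emp_salary >= e1.emp_salary), which is what the walkthrough describes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (emp_id INTEGER, emp_ename TEXT, emp_salary INTEGER);
    INSERT INTO employee VALUES
        (1, 'naresh', 100),
        (2, 'suresh', 150),
        (3, 'mahesh', 200),
        (4, 'sai', 250);
""")

# For every outer row, count the distinct salaries that are at least as high.
counts = conn.execute("""
    SELECT e1.emp_salary,
           (SELECT COUNT(DISTINCT e2.emp_salary)
            FROM employee e2
            WHERE e2.emp_salary >= e1.emp_salary) AS salaries_at_or_above
    FROM employee e1
    ORDER BY e1.emp_salary
""").fetchall()

# Keep the row whose count is 2: exactly one distinct salary is higher.
second = conn.execute("""
    SELECT e1.emp_ename, e1.emp_salary
    FROM employee e1
    WHERE 2 = (SELECT COUNT(DISTINCT e2.emp_salary)
               FROM employee e2
               WHERE e2.emp_salary >= e1.emp_salary)
""").fetchall()

print(counts)
print(second)
```

Replacing the literal 2 with another number n would select the n-th highest distinct salary in the same way.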

How to select data from two table with out using "NOT IN" in sql server?

I have 2 tables
Emp1
ID | Name
1 | X
2 | Y
3 | Z
Emp2
ID | Salary
1 | 10
2 | 20
I want to show the IDs from Emp1 that are not present in Emp2, without using NOT IN,
so the result should look like this:
ID
3
This is what I have done so far:
select e1.ID
from Emp1 e1 left join Emp2 e2
on e1.ID <> e2.ID
but I am getting this:
ID
1
2
3
3
So what should I do, WITHOUT using NOT IN?

Try a left join with an IS NULL condition, as below:
select e1.id
from emp1 e1
left join emp2 e2 on e2.id = e1.id
where e2.id is null
or a NOT EXISTS condition, as below:
select e1.id
from emp1 e1
where not exists
(
select 1
from emp2 e2
where e2.id = e1.id
)
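Both forms, and the reason the question's attempt returns extra rows, can be checked against the question's data with a sketch in Python's sqlite3 module. SQLite stands in for SQL Server here, which is an assumption; the queries themselves use only standard syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Emp1 (ID INTEGER, Name TEXT);
    CREATE TABLE Emp2 (ID INTEGER, Salary INTEGER);
    INSERT INTO Emp1 VALUES (1, 'X'), (2, 'Y'), (3, 'Z');
    INSERT INTO Emp2 VALUES (1, 10), (2, 20);
""")


def ids(sql):
    """Run a query and return its first column as a plain list."""
    return [row[0] for row in conn.execute(sql)]


# The question's attempt: every Emp1 row pairs with every Emp2 row that has a different ID.
attempt = ids("""
    SELECT e1.ID FROM Emp1 e1
    LEFT JOIN Emp2 e2 ON e1.ID <> e2.ID
    ORDER BY e1.ID
""")

# Anti-join: join on equality, then keep the Emp1 rows that found no partner.
left_join_is_null = ids("""
    SELECT e1.ID FROM Emp1 e1
    LEFT JOIN Emp2 e2 ON e2.ID = e1.ID
    WHERE e2.ID IS NULL
""")

not_exists = ids("""
    SELECT e1.ID FROM Emp1 e1
    WHERE NOT EXISTS (SELECT 1 FROM Emp2 e2 WHERE e2.ID = e1.ID)
""")

print(attempt, left_join_is_null, not_exists)
```

The attempt yields 1, 2, 3, 3 because ID 3 differs from both Emp2 rows, whereas both anti-join forms return only 3.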

Use this
select id from emp1
except
select id from emp2;
SQL Fiddle

Try this:
SELECT
e1.ID
FROM Emp1 e1 LEFT JOIN Emp2 e2 on e1.ID = e2.ID
WHERE e2.ID IS NULL

What you need is what parado already said. Here's a good picture for some other Joins and what Range they give back:
Sarajog
