SQL - How many people earn min and max salary?

I have a table with Employees, Departments and Salaries, and I would like to get the min and max salaries per department (which is just min/max with a group by on department). But how do I count how many employees earn that min and max salary per department?
Select Department,
Count(distinct EmployeeID) as Employees,
Min(Salary) as Min,
Max(Salary) as Max
From Employees
Group by Department;

You need to use your min/max query as a subquery of the following one:
Select Department, count(*) as "Count People"
From Employees
Where (Department,Salary) IN
(
Select Department, Min(Salary)
From Employees
Group by Department
Union all
Select Department, Max(Salary)
From Employees
Group by Department
)
Group by Department;
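To see what the row-value IN query above returns, here is a minimal sketch that runs it against SQLite through Python's sqlite3 module. The sample rows are made up for illustration; only the table and column names come from the question. Note that it yields one combined count per department (people at the min plus people at the max), not two separate counters.

```python
import sqlite3

# Hypothetical sample data: (EmployeeID, Department, Salary)
rows = [
    (1, "IT", 50), (2, "IT", 50), (3, "IT", 70), (4, "IT", 90),
    (5, "HR", 40), (6, "HR", 55), (7, "HR", 60),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Employees (EmployeeID INTEGER PRIMARY KEY, "
    "Department TEXT, Salary INTEGER)"
)
conn.executemany("INSERT INTO Employees VALUES (?, ?, ?)", rows)

# Keep only the rows whose (Department, Salary) pair is that
# department's minimum or maximum, then count them per department.
query = """
SELECT Department, COUNT(*) AS CountPeople
FROM Employees
WHERE (Department, Salary) IN (
    SELECT Department, MIN(Salary) FROM Employees GROUP BY Department
    UNION ALL
    SELECT Department, MAX(Salary) FROM Employees GROUP BY Department
)
GROUP BY Department
ORDER BY Department
"""
result = conn.execute(query).fetchall()
print(result)
```

Row-value comparisons like `(Department, Salary) IN (...)` are not available in every database (SQL Server, for one, does not accept them), so check your dialect first.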

Maybe you need 2 distinct values for the counters of minimum and maximum salaries of each department:
SELECT t1.Department, t1.MinCounter, t2.MaxCounter FROM
(SELECT t.Department, COUNT(*) AS MinCounter
FROM
(SELECT Department, MIN(Salary) AS MinSalary FROM Employees
GROUP BY Department) AS t
INNER JOIN Employees
ON (t.MinSalary = Employees.Salary) AND (t.Department = Employees.Department)
GROUP BY t.Department) AS t1
INNER JOIN
(SELECT t.Department, COUNT(*) AS MaxCounter
FROM
(SELECT Department, MAX(Salary) AS MaxSalary FROM Employees
GROUP BY Department) AS t
INNER JOIN Employees
ON (t.MaxSalary = Employees.Salary) AND (t.Department = Employees.Department)
GROUP BY t.Department) AS t2
ON t1.Department = t2.Department
The above query consists of 2 subqueries joined by Department; one fetches the counter for the minimum salary per department and the other the counter for the maximum. If you also want the query to return the minimum and maximum salary amounts per department, then check this:
SELECT t1.Department, t1.MinSalary, t1.MinCounter, t2.MaxSalary, t2.MaxCounter
FROM
(SELECT t.Department, t.MinSalary, COUNT(*) AS MinCounter
FROM (SELECT Department, MIN(Salary) AS MinSalary FROM Employees
GROUP BY Department) AS t
INNER JOIN Employees
ON (t.Department = Employees.Department)
AND (t.MinSalary = Employees.Salary)
GROUP BY t.Department, MinSalary) AS t1
INNER JOIN
(SELECT t.Department, t.MaxSalary, COUNT(*) AS MaxCounter
FROM (SELECT Department, MAX(Salary) AS MaxSalary FROM Employees
GROUP BY Department) AS t
INNER JOIN Employees
ON (t.Department = Employees.Department)
AND (t.MaxSalary = Employees.Salary)
GROUP BY t.Department, MaxSalary) AS t2
ON t1.Department = t2.Department

You can get the min and max salaries in a subquery, then count how many employees have those salaries in an outer query, like so: https://dbfiddle.uk/?rdbms=sqlserver_2017&fiddle=b89ac6b76112bf7db65ce6e37754198b
select agg.*
, count(case when e.Salary = agg.MinSalary then 1 end) EmployeesWithMin
, count(case when e.Salary = agg.MaxSalary then 1 end) EmployeesWithMax
from
(
select Department
, count(1) EmployeesInDepartment
, min(Salary) MinSalary
, max(Salary) MaxSalary
from Employees
group by Department
) agg
inner join Employees e
on e.Department = agg.department
and e.Salary in (agg.MaxSalary, agg.MinSalary)
group by agg.Department
, agg.EmployeesInDepartment
, agg.MinSalary
, agg.MaxSalary
Or an alternative approach: https://dbfiddle.uk/?rdbms=sqlserver_2017&fiddle=8a5e539fcd3e5985d44f86b1e331f030
select agg.Department
, min(agg.MinSalary) MinSalary
, max(agg.MaxSalary) MaxSalary
, count(case when Salary = MinSalary then 1 end) EmployeesWithMin
, count(case when Salary = MaxSalary then 1 end) EmployeesWithMax
from
(
select Department
, min(Salary) over (partition by Department) MinSalary
, max(Salary) over (partition by Department) MaxSalary
, Salary
from Employees
) agg
group by agg.Department
The former may be simpler to understand for a beginner. The latter is less verbose, so easier to read. Pick whichever you're more comfortable supporting or whichever makes the most sense to you.
In terms of performance, the second performed slightly better for my sample data; but a number of variables may influence this (number of rows, spread of values, availability of indexes), so it's best to test on data as similar to real-world as possible if you need to pick the optimal query.
Let me know if any of the methods used here aren't familiar to you and I can add explanations as required.
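As a quick sanity check, the sketch below runs trimmed versions of both approaches (aggregate subquery plus join, and window functions) on a small made-up data set and confirms they agree. It uses SQLite via Python's sqlite3 rather than SQL Server, so treat it as an illustration of the logic only; the sample rows are assumptions.

```python
import sqlite3

# Hypothetical sample data: (Department, Salary)
rows = [("IT", 50), ("IT", 50), ("IT", 70), ("IT", 90),
        ("HR", 40), ("HR", 55), ("HR", 60)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employees (Department TEXT, Salary INTEGER)")
conn.executemany("INSERT INTO Employees VALUES (?, ?)", rows)

# Approach 1: aggregate per department, then join back to the rows.
join_query = """
SELECT agg.Department, agg.MinSalary, agg.MaxSalary,
       COUNT(CASE WHEN e.Salary = agg.MinSalary THEN 1 END) AS EmployeesWithMin,
       COUNT(CASE WHEN e.Salary = agg.MaxSalary THEN 1 END) AS EmployeesWithMax
FROM (SELECT Department, MIN(Salary) AS MinSalary, MAX(Salary) AS MaxSalary
      FROM Employees
      GROUP BY Department) AS agg
JOIN Employees AS e
  ON e.Department = agg.Department
 AND e.Salary IN (agg.MinSalary, agg.MaxSalary)
GROUP BY agg.Department, agg.MinSalary, agg.MaxSalary
ORDER BY agg.Department
"""

# Approach 2: attach the per-department min/max to every row with
# window functions, then aggregate once.
window_query = """
SELECT Department, MIN(MinSalary), MAX(MaxSalary),
       COUNT(CASE WHEN Salary = MinSalary THEN 1 END),
       COUNT(CASE WHEN Salary = MaxSalary THEN 1 END)
FROM (SELECT Department, Salary,
             MIN(Salary) OVER (PARTITION BY Department) AS MinSalary,
             MAX(Salary) OVER (PARTITION BY Department) AS MaxSalary
      FROM Employees) AS agg
GROUP BY Department
ORDER BY Department
"""

join_result = conn.execute(join_query).fetchall()
window_result = conn.execute(window_query).fetchall()
print(join_result)
print(window_result)
```

Each tuple is (Department, MinSalary, MaxSalary, EmployeesWithMin, EmployeesWithMax).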

Related

How to retrieve highest salary for each department across employees?

I am trying to compile a query which gives me the highest salary per each department and for each unique employee. The complexity is that 1 employee can be part of multiple departments.
In case the same employee has the highest salary in several departments, only the department with a lower salary should show. This is my start but I am not sure how to continue from here:
select max(salary) as salary, dd.dept_name,d.emp_no
from salaries s
inner join dept_emp d on
s.emp_no=d.emp_no
inner join departments dd on
d.dept_no=dd.dept_no
group by 2,3;
My output is:
What should I modify from here?
For an employee, you seem to only want to include the department with the smallest salary. I would recommend using window functions:
select s.*
from (select s.*,
-- rank the remaining rows within each department, highest salary first
rank() over (partition by dept_name order by salary desc) as seqnum_d
from (select s.*, dd.dept_name,
-- per employee, rank their departments by salary, lowest first
rank() over (partition by s.emp_no order by s.salary) as seqnum_ed
from salaries s join
dept_emp d
on s.emp_no = d.emp_no join
departments dd
on d.dept_no = dd.dept_no
) s
where seqnum_ed = 1
) s
where seqnum_d = 1;
Something like this?
select m.salary, m.emp_no, salary.dept_name from salary,
(select emp_no, min(salary) salary from salary group by emp_no) m
where
m.emp_no=salary.emp_no and m.salary=salary.salary;

How to find dept name that has the highest average salary within two tables

Having two tables
Employee
Id
Name
Salary
DepartmentId
and
Departament
Id
Name
How can I get the department with the highest average salary across the two tables? For example:
Joe and Max belong to dept 1, so the avg is (70K + 90K)/2 = 80K
Henry and Sam belong to dept 2, so the avg is (80K + 60K)/2 = 70K
So how do I select the greatest avg salary by department? In this case:
IT 80K
I have been trying to group the salary by each department and use the Max function to obtain the highest one:
select
Department.Name as Department,
T.M as Salary
from
Employee,
Department,
(select DepartmentId as ID, Max(Salary) as M from Employee group by DepartmentId) as T
where
Employee.Salary = T.M and
Department.Id = T.ID and
Employee.DepartmentId = Department.Id
If multiple departments have the same maximum average salary, this solution returns multiple rows.
SELECT *
FROM(
SELECT d.Id, d.Name, AVG(e.Salary) avg_salary, RANK() OVER(ORDER BY AVG(e.Salary) DESC) AS rank_
FROM Employee e
INNER JOIN Departament d ON e.DepartmentId = d.Id
GROUP BY d.Id, d.Name
)T
WHERE rank_ = 1
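Here is a minimal sketch of the ranking idea, run on SQLite through Python's sqlite3. Joe, Max, Henry and Sam and the IT department come from the question; the extra employee (Ann) and the department names Sales and HR are hypothetical, added only to produce a tie. The averages are computed in a CTE first and ranked afterwards, which is a small variation on the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Department (Id INTEGER PRIMARY KEY, Name TEXT)")
conn.execute(
    "CREATE TABLE Employee (Id INTEGER PRIMARY KEY, Name TEXT, "
    "Salary INTEGER, DepartmentId INTEGER)"
)
# Department 1 is IT as in the question; the other names are made up.
conn.executemany("INSERT INTO Department VALUES (?, ?)",
                 [(1, "IT"), (2, "Sales"), (3, "HR")])
# Salaries in thousands; Ann is a hypothetical row that creates a tie.
conn.executemany("INSERT INTO Employee VALUES (?, ?, ?, ?)", [
    (1, "Joe", 70, 1), (2, "Max", 90, 1),
    (3, "Henry", 80, 2), (4, "Sam", 60, 2),
    (5, "Ann", 80, 3),
])

query = """
WITH dept_avg AS (
    SELECT d.Id, d.Name, AVG(e.Salary) AS avg_salary
    FROM Employee e
    JOIN Department d ON e.DepartmentId = d.Id
    GROUP BY d.Id, d.Name
)
SELECT Name, avg_salary
FROM (SELECT Name, avg_salary,
             RANK() OVER (ORDER BY avg_salary DESC) AS rank_
      FROM dept_avg) AS ranked
WHERE rank_ = 1
ORDER BY Name
"""
top_departments = conn.execute(query).fetchall()
print(top_departments)
```

Because RANK() gives tied rows the same rank, both departments averaging 80 come back; ROW_NUMBER() would keep only one of them.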
If you just want the average for each department, you can do it this way.
select em1.DepartmentId as ID, de.Name as Deptname, Avg(em1.Salary) as M from Employee em1
join Department de on de.Id = em1.DepartmentId
group by em1.DepartmentId, de.Name
If you want the employee names along with the highest average, you can use this approach as well.
select
z.Deptname as Department,
e.Name as Employeename,
z.M as Salary
from
Employee e
join
( select T.DepartmentId, T.Deptname, T.M, row_number() over (order by T.M desc) as rownum from ( select em1.DepartmentId, de.Name as Deptname, Avg(em1.Salary) as M from Employee em1
join Department de on de.Id = em1.DepartmentId
group by em1.DepartmentId, de.Name) as T) z
on
e.DepartmentId = z.DepartmentId and z.rownum = 1
If you want a full answer, you should provide DDL, sample data and desired result.
If I understand you correctly, you are looking for something like:
SELECT DepartmentID, AVG(Salary) AS AverageSalaryForDept
FROM Employee
GROUP BY DepartmentID
ORDER BY AverageSalaryForDept DESC;
This will give you all the averages, ordered from the highest to the lowest. Now if you want just the top one, add a FETCH clause:
SELECT DepartmentID, AVG(Salary) AS AverageSalaryForDept
FROM Employee
GROUP BY DepartmentID
ORDER BY AverageSalaryForDept DESC
OFFSET 0 ROWS FETCH NEXT 1 ROW ONLY;
HTH
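One thing worth knowing about the "top one row" approach: if two departments tie for the highest average, only one of them is returned. A small sketch with made-up data, using SQLite via Python's sqlite3; SQLite has no OFFSET ... FETCH syntax, so LIMIT 1 stands in for it here, and the rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employee (Salary INTEGER, DepartmentID INTEGER)")
# Hypothetical rows: departments 1 and 3 both average 80.
conn.executemany("INSERT INTO Employee VALUES (?, ?)",
                 [(70, 1), (90, 1), (80, 2), (60, 2), (80, 3)])

# All averages, highest first (DepartmentID breaks ties for a stable order).
all_averages = conn.execute("""
    SELECT DepartmentID, AVG(Salary) AS AverageSalaryForDept
    FROM Employee
    GROUP BY DepartmentID
    ORDER BY AverageSalaryForDept DESC, DepartmentID
""").fetchall()

# Only the first row; LIMIT 1 plays the role of FETCH NEXT 1 ROW ONLY.
top_one = conn.execute("""
    SELECT DepartmentID, AVG(Salary) AS AverageSalaryForDept
    FROM Employee
    GROUP BY DepartmentID
    ORDER BY AverageSalaryForDept DESC
    LIMIT 1
""").fetchall()

print(all_averages)
print(top_one)
```

If ties matter, a ranking function or a comparison against the maximum average is the safer choice.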

Max Salary with a single GroupBy without Joins

Schema for EMPLOYEE
(ID, EMPLOYEENAME, SALARY, ORGANIZATIONID)
Query to Solve: Find employee Names in each organization with Maximum Salary without a Join.
SELECT E.*
FROM EMPLOYEE E,
(SELECT EMP.ORGANIZATIONID, MAX(EMP.SALARY) AS SALARY
FROM EMPLOYEE EMP
GROUP BY EMP.ORGANIZATIONID) MAXSALARY
WHERE MAXSALARY.SALARY = E.SALARY
AND E.ORGANIZATIONID = MAXSALARY.ORGANIZATIONID;
Is there a way to avoid the join? I am using Spark SQL API and joins cause an extra shuffle operation which is expensive. Is there a way to get the employee name while getting the max salary?
Assume you have a single employee in each organization having the max salary
You can use PARTITION BY with Spark SQL as shown below (Although it will require a subquery)
SELECT E.*
FROM
(SELECT EMP.EMPLOYEENAME, EMP.ORGANIZATIONID, EMP.SALARY,
row_number() OVER (PARTITION BY ORGANIZATIONID ORDER BY SALARY DESC) as rank
FROM EMPLOYEE EMP
) AS E
WHERE E.rank=1
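A minimal sketch of the same window-function pattern, run on SQLite through Python's sqlite3 as a stand-in for Spark SQL (the window syntax used here is common to both, but this does not measure shuffles). The sample rows are made up, and the alias `rn` is used instead of `rank` to avoid clashing with the function name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE EMPLOYEE (ID INTEGER PRIMARY KEY, EMPLOYEENAME TEXT, "
    "SALARY INTEGER, ORGANIZATIONID INTEGER)"
)
# Hypothetical rows: two organizations, two employees each.
conn.executemany("INSERT INTO EMPLOYEE VALUES (?, ?, ?, ?)", [
    (1, "Ann", 100, 10), (2, "Bob", 120, 10),
    (3, "Cid", 90, 20), (4, "Dee", 95, 20),
])

# Number the rows inside each organization from highest salary down,
# then keep the first row of each organization. No self-join is needed.
query = """
SELECT EMPLOYEENAME, ORGANIZATIONID, SALARY
FROM (SELECT EMPLOYEENAME, ORGANIZATIONID, SALARY,
             ROW_NUMBER() OVER (PARTITION BY ORGANIZATIONID
                                ORDER BY SALARY DESC) AS rn
      FROM EMPLOYEE) AS E
WHERE rn = 1
ORDER BY ORGANIZATIONID
"""
top_earners = conn.execute(query).fetchall()
print(top_earners)
```

If several employees can share the top salary and you want all of them, swapping ROW_NUMBER() for RANK() keeps the ties.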
Try this:
SELECT P.ORGANIZATIONID, P.EMPLOYEENAME
FROM EMPLOYEE P
WHERE P.SALARY = (SELECT MAX(E.SALARY) FROM EMPLOYEE E WHERE P.ORGANIZATIONID = E.ORGANIZATIONID)
GROUP BY P.ORGANIZATIONID, P.EMPLOYEENAME
Try this:
SELECT EMPLOYEENAME FROM EMPLOYEE
WHERE (ORGANIZATIONID, SALARY) IN (SELECT ORGANIZATIONID, MAX(SALARY) FROM EMPLOYEE GROUP BY ORGANIZATIONID)

SQL COUNT modify

From the Oracle HR schema I used the following:
SELECT DEPARTMENT_ID, ROUND(AVG(SALARY),2)
FROM EMPLOYEES
WHERE DEPARTMENT_ID IS NOT NULL
GROUP BY DEPARTMENT_ID
ORDER BY DEPARTMENT_ID
To get:
DEPARTMENT_ID ROUND(AVG(SALARY),2)
10 4400
20 9500
30 4150
40 6500
50 3475,56
60 5760
...
How do I change it so that it only counts the departments that have the max avg salary (in my case 1) and also shows the max avg salary?
Thank you for your time!
If I understood you, this is one possible way:
SELECT DEPARTMENT_ID, ROUND(AVG(SALARY),2) AS AVG_SALARY
FROM EMPLOYEES
WHERE DEPARTMENT_ID IS NOT NULL
GROUP BY DEPARTMENT_ID
HAVING ROUND(AVG(SALARY),2) = (
SELECT MAX(T.AVG_SALARY)
FROM (
SELECT DEPARTMENT_ID, ROUND(AVG(SALARY),2) AS AVG_SALARY
FROM EMPLOYEES
WHERE DEPARTMENT_ID IS NOT NULL
GROUP BY DEPARTMENT_ID) T)
ORDER BY DEPARTMENT_ID
This will show you ALL THE DEPARTMENTS that have max avg salary. If you want only the count:
SELECT COUNT(*), AVG(A.AVG_SALARY)
FROM (
SELECT DEPARTMENT_ID, ROUND(AVG(SALARY),2) AS AVG_SALARY
FROM EMPLOYEES
WHERE DEPARTMENT_ID IS NOT NULL
GROUP BY DEPARTMENT_ID) A
WHERE A.AVG_SALARY = (
SELECT MAX(T.AVG_SALARY) AS MAX_AVG_SALARY
FROM (
SELECT DEPARTMENT_ID, ROUND(AVG(SALARY),2) AS AVG_SALARY
FROM EMPLOYEES
WHERE DEPARTMENT_ID IS NOT NULL
GROUP BY DEPARTMENT_ID) T)
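To try the idea without an Oracle instance, here is a minimal sketch on SQLite via Python's sqlite3. The data is made up (it is not the HR schema's real contents), and a CTE named dept_avg, which is my own naming, replaces the repeated subquery. An aggregate condition has to go in HAVING, or be applied to a derived table, because WHERE is evaluated before grouping.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMPLOYEES (DEPARTMENT_ID INTEGER, SALARY REAL)")
# Hypothetical rows: departments 20 and 30 tie at an average of 9500;
# the row without a department must be ignored.
conn.executemany("INSERT INTO EMPLOYEES VALUES (?, ?)", [
    (10, 4400), (20, 9000), (20, 10000), (30, 9500), (None, 20000),
])

# Departments whose rounded average equals the highest rounded average.
top_departments = conn.execute("""
    WITH dept_avg AS (
        SELECT DEPARTMENT_ID, ROUND(AVG(SALARY), 2) AS AVG_SALARY
        FROM EMPLOYEES
        WHERE DEPARTMENT_ID IS NOT NULL
        GROUP BY DEPARTMENT_ID
    )
    SELECT DEPARTMENT_ID, AVG_SALARY
    FROM dept_avg
    WHERE AVG_SALARY = (SELECT MAX(AVG_SALARY) FROM dept_avg)
    ORDER BY DEPARTMENT_ID
""").fetchall()

# Only the number of such departments, plus the max average itself.
count_and_max = conn.execute("""
    WITH dept_avg AS (
        SELECT DEPARTMENT_ID, ROUND(AVG(SALARY), 2) AS AVG_SALARY
        FROM EMPLOYEES
        WHERE DEPARTMENT_ID IS NOT NULL
        GROUP BY DEPARTMENT_ID
    )
    SELECT COUNT(*), MAX(AVG_SALARY)
    FROM dept_avg
    WHERE AVG_SALARY = (SELECT MAX(AVG_SALARY) FROM dept_avg)
""").fetchone()

print(top_departments)
print(count_and_max)
```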
Another way that should work, using joins:
SELECT t1.DEPARTMENT_ID, t1.AVG_SALARY
FROM (SELECT DEPARTMENT_ID, ROUND(AVG(SALARY),2) AS AVG_SALARY
FROM EMPLOYEES
WHERE DEPARTMENT_ID IS NOT NULL
GROUP BY DEPARTMENT_ID) t1
LEFT JOIN (SELECT MAX(ROUND(AVG(SALARY),2)) AS MAX_AVG_SALARY
FROM EMPLOYEES
WHERE DEPARTMENT_ID IS NOT NULL
GROUP BY DEPARTMENT_ID) t2
ON t1.AVG_SALARY = t2.MAX_AVG_SALARY
WHERE t1.AVG_SALARY = t2.MAX_AVG_SALARY
ORDER BY t1.DEPARTMENT_ID ASC;
I tested the idea on a sample table of mine using oracle, and on a w3schools sql testing page with this code:
SELECT Customers.CustomerName, Orders.maxid
FROM Customers
LEFT JOIN (select max(Orders.CustomerID) as maxid from Orders) orders
ON Customers.CustomerID=Orders.maxid
where customers.customername is not null and customers.customerid=orders.maxid
ORDER BY orders.maxid desc;
It should grab only the departments that match their average salary with the max average salary that was selected in the join statement.
If you are only looking for the count of the departments, and not a list of the department names, then this slight modification should work for you:
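SELECT COUNT(t1.DEPARTMENT_ID) AS Num_Of_Depts, t2.MAX_AVG_SALARY AS AVG_SALARY
FROM (SELECT DEPARTMENT_ID, ROUND(AVG(SALARY),2) AS AVG_SALARY
FROM EMPLOYEES
WHERE DEPARTMENT_ID IS NOT NULL
GROUP BY DEPARTMENT_ID) t1
LEFT JOIN (SELECT MAX(ROUND(AVG(SALARY),2)) AS MAX_AVG_SALARY
FROM EMPLOYEES
WHERE DEPARTMENT_ID IS NOT NULL
GROUP BY DEPARTMENT_ID) t2
ON t1.AVG_SALARY = t2.MAX_AVG_SALARY
WHERE t1.AVG_SALARY = t2.MAX_AVG_SALARY
GROUP BY t2.MAX_AVG_SALARY;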

How can I get all employees with a salary less than the average salary?

I can get the count of employees and the average salary, but when I try to add a select listing the number of employees paid below the average, it fails.
select count(employee_id),avg(salary)
from employees
Where salary < avg(salary);
select count(*), (select avg(salary) from employees)
from employees
where salary < (select avg(salary) from employees);
The problem is that AVG is an aggregation function. SQL is not smart enough to figure out how to mix aggregated results within the rows. The traditional way is to use a join:
select count(*), avg(e.salary),
sum(case when e.salary < const.AvgSalary then 1 else 0 end) as NumBelowAverage
from employees e cross join
(select avg(salary) as AvgSalary from employees) as const
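To make the cross join concrete, here is a minimal sketch on SQLite via Python's sqlite3 with made-up salaries. The one-row subquery attaches the overall average to every employee row, so the CASE expression can compare each salary against it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees (employee_id INTEGER PRIMARY KEY, salary REAL)"
)
# Hypothetical salaries; their average is 40.
conn.executemany("INSERT INTO employees (salary) VALUES (?)",
                 [(10,), (20,), (30,), (40,), (100,)])

query = """
SELECT COUNT(*),
       AVG(e.salary),
       SUM(CASE WHEN e.salary < const.AvgSalary THEN 1 ELSE 0 END)
           AS NumBelowAverage
FROM employees e
CROSS JOIN (SELECT AVG(salary) AS AvgSalary FROM employees) AS const
"""
total, average, below = conn.execute(query).fetchone()
print(total, average, below)
```

Since the subquery returns exactly one row, the cross join does not multiply the employee rows, and COUNT(*) is still the real head count.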
select TotalNumberOfEmployees,
AverageSalary,
count(e.employee_id) NumberOfEmployeesBelowAverageSalary
from (
select count(employee_id) TotalNumberOfEmployees,
avg(salary) AverageSalary
from employees
) preagg
left join employees e on e.salary < preagg.AverageSalary
group by TotalNumberOfEmployees,
AverageSalary
Note: I used a LEFT join so that if you had, say, 3 employees with equal salaries, it would show 0 instead of no results (nobody is below the average).
It isn't clear which columns you want in your result set, which makes it difficult to answer your question. Making the question clear improves the quality of the answers.
You seem to want 3 facts:
Number of employees.
Average salary.
Number of employees earning less than the average salary.
And you show a query which does the job for the first two facts:
SELECT COUNT(*) AS NumberOfEmployees,
AVG(Salary) AS AverageSalary
FROM Employees
What's the difference between COUNT(*) and COUNT(Employee_ID)? The difference is that the latter only counts the rows where there is a non-NULL value in the Employee_ID column. A good optimizer will recognize that Employee_ID is a primary key and contains no NULL values, and the query will be the same. But COUNT(*) is more conventional and less reliant on the optimizer.
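The difference only shows up when the counted column can be NULL. A minimal sketch on SQLite via Python's sqlite3; the nullable manager_id column is hypothetical and added purely to demonstrate the behaviour.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# manager_id is a made-up nullable column; employee_id is the primary key.
conn.execute(
    "CREATE TABLE employees (employee_id INTEGER PRIMARY KEY, "
    "manager_id INTEGER)"
)
conn.executemany("INSERT INTO employees VALUES (?, ?)",
                 [(1, None), (2, 1), (3, 1)])

# COUNT(*) counts rows; COUNT(column) skips rows where the column is NULL.
count_rows, count_ids, count_managers = conn.execute(
    "SELECT COUNT(*), COUNT(employee_id), COUNT(manager_id) FROM employees"
).fetchone()
print(count_rows, count_ids, count_managers)
```

On a primary key the two forms return the same number, which is why they are interchangeable in the query above.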
The other statistic can be generated as a simple value in the select-list via a sub-query:
SELECT COUNT(*) AS NumberOfEmployees,
AVG(Salary) AS AverageSalary,
(SELECT COUNT(*)
FROM Employees
WHERE Salary < (SELECT AVG(Salary) FROM Employees)
) AS NumberOfEmployeesPaidSubAverageWages
FROM Employees
Under many circumstances, it would not be appropriate to write the sub-query like that, but for the interpretation of the specified query, it is fine.
select * from <table name> where <salary column name> < (select avg(<salary column name>) from <table name>);
Example:
select * from EMPLOYEE where emp_sal < (select avg(emp_sal) from EMPLOYEE);
SELECT e.ename,e.deptno,e.sal,d.avg
FROM emp e,(SELECT deptno, avg(sal) avg
FROM emp
GROUP BY deptno) d
WHERE e.deptno=d.deptno
AND
e.sal < d.avg
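The last query compares each employee with the average of their own department rather than the company-wide average. A minimal sketch on SQLite via Python's sqlite3, with made-up rows; the alias avg_sal is mine and replaces avg to keep the alias distinct from the function name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (ename TEXT, deptno INTEGER, sal REAL)")
# Hypothetical rows: dept 10 averages 200, dept 20 averages 300.
conn.executemany("INSERT INTO emp VALUES (?, ?, ?)", [
    ("A", 10, 100), ("B", 10, 300),
    ("C", 20, 200), ("D", 20, 200), ("E", 20, 500),
])

# Join every employee to the average of their own department and keep
# those earning less than that average.
query = """
SELECT e.ename, e.deptno, e.sal, d.avg_sal
FROM emp e
JOIN (SELECT deptno, AVG(sal) AS avg_sal
      FROM emp
      GROUP BY deptno) d
  ON e.deptno = d.deptno
WHERE e.sal < d.avg_sal
ORDER BY e.ename
"""
below_dept_average = conn.execute(query).fetchall()
print(below_dept_average)
```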