Applying Aggregate Functions On all Columns - sql

I have an Employee table with a Salary column. I want to list Salary - Avg(Salary) for every employee. Can someone please help me with the SQL query for this?

You can do this using window functions:
select e.*,
(salary - avg(salary) over ()) as diff
from employees e;
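To try the window-function form without a database server, here is a minimal sketch that runs the same kind of query through Python's built-in sqlite3 module. SQLite only stands in for whatever database the question uses, the emp_id column and the sample rows are made up for illustration, and window functions need SQLite 3.25 or newer.

```python
import sqlite3

def salary_vs_average(rows):
    """Return (emp_id, salary, diff) where diff = salary - overall average."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical two-column table; the real Employee table is not shown in the question.
    conn.execute("CREATE TABLE employees (emp_id INTEGER PRIMARY KEY, salary REAL)")
    conn.executemany("INSERT INTO employees VALUES (?, ?)", rows)
    # AVG(...) OVER () treats the whole result set as one window, so every
    # row sees the same average and no rows are collapsed.
    result = conn.execute(
        "SELECT emp_id, salary, salary - AVG(salary) OVER () AS diff "
        "FROM employees ORDER BY emp_id"
    ).fetchall()
    conn.close()
    return result

print(salary_vs_average([(1, 1000.0), (2, 2000.0), (3, 3000.0)]))
# [(1, 1000.0, -1000.0), (2, 2000.0, 0.0), (3, 3000.0, 1000.0)]
```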

You could use an inline view to return a single row with the average salary, and join that row back to the employee table.
Something like this:
SELECT e.emp_id
, e.salary
, e.salary - a.avg_salary
FROM employee e
CROSS
JOIN ( SELECT AVG(t.salary) AS avg_salary
FROM employee t
) a
ORDER BY e.emp_id

Just a couple other variations; the others already posted should serve the purpose quite well -- assuming we've all correctly inferred the desired output, given no DDL nor sample data, nor the expected output from the given inputs.
If there is no requirement to include the averaged salary [as the average over all rows] in addition to the calculated difference, then the following [shown with an optional casting to a decimal result] uses a scalar subselect to get the value to subtract from each employee salary:
select emp.*
, dec( salary - ( select avg(salary)
from employee_table )
, 11, 0 ) as saldif
from employee_table as emp
Or to use the averaged salaries in both the difference and as a column by itself, then again the scalar subselect, but made available for lateral reference in the [explicitly joined-to] subquery; again, optional casting for decimal results:
select x.*
from table
( select avg(salary)
from employee_table
) as a ( avgsal )
cross join lateral
( select emp.*
, dec( salary - avgsal , 11 ) as saldif
, dec( avgsal , 11 ) as salavg
from employee_table as emp
) as x

Another solution:
with avgsalary as (
select avg(salary) avgsal from employee_table
)
select emp.*, case when emp.salary is null then cast(null as decimal) else round(emp.salary - avgs.avgsal, 2) end as diffsal, avgs.avgsal
from employee_table as emp cross join avgsalary avgs

And even another variation:
with EmpAvg (avgSalary)
as ( SELECT avg( salary ) from employee_table )
select e.*, a.avgSalary,
(e.salary - a.avgSalary) as diffAvg
from employee_table e cross join EmpAvg a
There may be many forms of queries that can give equivalent results. (Although I left off casting the calculated values, so not exactly equal result values.)

Related

SQL how can i detect if a value decrease over time?

Hi, how can I find the employees whose salary has fallen? This is what I have so far:
SELECT employees.emp_no, first_name, last_name, salary, from_date, to_date, hire_date
from employees
INNER JOIN salaries ON employees.emp_no = salaries.emp_no;
I only want to fetch the names of employees whose salary has fallen.
You can use the analytic function LAG() to find these rows. This is a standard SQL function that peeks at a previous row, according to a specified ordering.
For example:
select emp_no, first_name, last_name
from (
select
e.*,
s.salary,
lag(s.salary) over(partition by e.emp_no order by from_date) as prev_salary
from employees e
join salaries s on s.emp_no = e.emp_no
) x
where salary < prev_salary
You should look into using windowing functions. It should look something like this:
with salary as (
SELECT employees.emp_no, concat(first_name, ' ', last_name) as emp, salary, coalesce(hire_date, from_date) as from_date, to_date
from employees
INNER JOIN salaries ON employees.emp_no = salaries.emp_no
), last_sal as (
select emp_no, emp, salary, to_date, lag(salary) over (partition by emp_no order by to_date) as last_salary
from salary
)
select *
from last_sal
where salary < last_salary
A window function operates on a subset (a window) of the rows. In this case there is one window per employee, and each window is ordered by to_date. LAG tells it to look backwards, so each row effectively carries the prior row's salary next to its own columns.
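As a rough way to see LAG at work, the sketch below loads a few salary rows into an in-memory SQLite database through Python's sqlite3 module and returns the employees with at least one decrease. The cut-down salaries table and the sample data are assumptions for illustration, SQLite stands in for the real database, and window functions need SQLite 3.25 or newer.

```python
import sqlite3

def employees_with_pay_cut(salaries):
    """salaries: iterable of (emp_no, salary, from_date) with ISO date strings."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical cut-down version of the salaries table from the question.
    conn.execute("CREATE TABLE salaries (emp_no INTEGER, salary INTEGER, from_date TEXT)")
    conn.executemany("INSERT INTO salaries VALUES (?, ?, ?)", salaries)
    rows = conn.execute(
        """
        SELECT DISTINCT emp_no
        FROM (
            SELECT emp_no, salary,
                   LAG(salary) OVER (PARTITION BY emp_no ORDER BY from_date) AS prev_salary
            FROM salaries
        ) AS x
        -- The first row of each employee has a NULL prev_salary, so the
        -- comparison is not true and that row is filtered out.
        WHERE salary < prev_salary
        ORDER BY emp_no
        """
    ).fetchall()
    conn.close()
    return [r[0] for r in rows]

sample = [
    (1, 100, "2020-01-01"), (1, 120, "2021-01-01"),
    (2, 100, "2020-01-01"), (2, 90, "2021-01-01"), (2, 95, "2022-01-01"),
]
print(employees_with_pay_cut(sample))  # [2]
```

DISTINCT is there because an employee whose salary fell more than once would otherwise appear several times.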

basic sql employee

The query below groups the employee table (EMP) by department code and shows, for each department, the sum of salaries, the average salary (rounded down to an integer), and the number of people, listed in department code order. I would like to modify the following SQL to look up only the departments whose average salary exceeds 2800000.
SELECT
DEPT
, SUM(SALARY) 합계
, FLOOR(AVG(SALARY)) 평균
, COUNT(*) 인원수
FROM
EMP
GROUP BY
DEPT
ORDER BY DEPT ASC;
Question 1. Which conditions need to be modified?
Question 2. What should I add to the code above?
I can't read your aliases, so I'll just presume what they mean.
If the query you posted in the question works OK, then use it as a CTE and select the desired data from it:
with data as
(select dept,
sum(salary) sumsal,
floor(avg(salary)) avgsal,
count(*) cnt
from emp
group by dept
)
select *
from data
where avgsal > 2800000;
You can use the SQL below:
SELECT
DEPT
, SUM(SALARY) 합계
, FLOOR(AVG(SALARY)) 평균
, COUNT(*) 인원수
FROM
EMP
GROUP BY
DEPT HAVING FLOOR(AVG(SALARY)) > 2800000
ORDER BY DEPT ASC;
You can filter the aggregated result using HAVING:
SELECT
DEPT
, SUM(SALARY) 합계
, FLOOR(AVG(SALARY)) 평균
, COUNT(*) 인원수
FROM
EMP
GROUP BY DEPT
HAVING AVG(SALARY) >2800000
ORDER BY DEPT ASC;
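For anyone who wants to check the HAVING version on sample data, here is a small sketch using Python's sqlite3 module. The sample rows are made up, SQLite stands in for the real database, and CAST(... AS INTEGER) replaces FLOOR because FLOOR is not available in every SQLite build (for positive salaries both truncate the same way).

```python
import sqlite3

def departments_above(rows, threshold):
    """rows: iterable of (dept, salary). Returns (dept, total, avg, headcount)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE emp (dept TEXT, salary INTEGER)")
    conn.executemany("INSERT INTO emp VALUES (?, ?)", rows)
    result = conn.execute(
        """
        SELECT dept,
               SUM(salary)                   AS total,
               CAST(AVG(salary) AS INTEGER)  AS average,  -- stand-in for FLOOR
               COUNT(*)                      AS headcount
        FROM emp
        GROUP BY dept
        HAVING AVG(salary) > ?   -- HAVING filters groups after aggregation
        ORDER BY dept
        """,
        (threshold,),
    ).fetchall()
    conn.close()
    return result

sample = [("A", 3000000), ("A", 2700000), ("B", 2800000), ("B", 2800000), ("C", 2900000)]
print(departments_above(sample, 2800000))
# [('A', 5700000, 2850000, 2), ('C', 2900000, 2900000, 1)]
```

Department B is dropped because its average is exactly 2800000, which does not exceed the threshold.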

Max Salary with a single GroupBy without Joins

Schema for EMPLOYEE
(ID, EMPLOYEENAME, SALARY, ORGANIZATIONID)
Query to Solve: Find employee Names in each organization with Maximum Salary without a Join.
SELECT E.*
FROM EMPLOYEE E,
(SELECT EMP.ORGANIZATIONID, MAX(EMP.SALARY) AS SALARY
FROM EMPLOYEE EMP
GROUP BY EMP.ORGANIZATIONID) MAXSALARY
WHERE MAXSALARY.SALARY =E.SALARY
AND E.ORGANIZATIONID=MAXSALARY.ORGANIZATIONID ;
Is there a way to avoid the join? I am using Spark SQL API and joins cause an extra shuffle operation which is expensive. Is there a way to get the employee name while getting the max salary?
Assume you have a single employee in each organization having the max salary
You can use PARTITION BY with Spark SQL as shown below (although it will require a subquery):
SELECT E.*
FROM
(SELECT EMP.EMPLOYEENAME, EMP.ORGANIZATIONID, EMP.SALARY,
row_number() OVER (PARTITION BY ORGANIZATIONID ORDER BY SALARY DESC) as rank
FROM EMPLOYEE EMP
) AS E
WHERE E.rank=1
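The ROW_NUMBER approach can be tried locally with the sketch below. It uses Python's sqlite3 module rather than Spark, so it says nothing about shuffle cost; it only shows the shape of the result. The sample rows and the tie-breaker on ID are assumptions, and window functions need SQLite 3.25 or newer.

```python
import sqlite3

def top_earner_per_org(rows):
    """rows: iterable of (id, employeename, salary, organizationid)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE employee (id INTEGER, employeename TEXT, "
        "salary INTEGER, organizationid INTEGER)"
    )
    conn.executemany("INSERT INTO employee VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT organizationid, employeename, salary
        FROM (
            SELECT e.*,
                   -- Number the rows of each organization from highest salary down.
                   -- The extra "id" key is an assumed tie-breaker so that equal
                   -- salaries give a deterministic winner.
                   ROW_NUMBER() OVER (PARTITION BY organizationid
                                      ORDER BY salary DESC, id) AS rn
            FROM employee e
        ) AS ranked
        WHERE rn = 1
        ORDER BY organizationid
        """
    ).fetchall()
    conn.close()
    return result

sample = [(1, "Ann", 50, 10), (2, "Bob", 70, 10), (3, "Cy", 60, 20)]
print(top_earner_per_org(sample))  # [(10, 'Bob', 70), (20, 'Cy', 60)]
```

If ties should all be returned, RANK() in place of ROW_NUMBER() (without the extra key) would keep every employee who shares the maximum.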
Try this:
SELECT P.ORGANIZATIONID, P.EMPLOYEENAME
FROM EMPLOYEE P
WHERE P.SALARY = (SELECT MAX(E.SALARY) FROM EMPLOYEE E WHERE P.ORGANIZATIONID = E.ORGANIZATIONID)
GROUP BY P.ORGANIZATIONID, P.EMPLOYEENAME
Try this:
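-- Compare organization and salary together, so one organization's maximum cannot match an employee in another organization
SELECT EMPLOYEENAME FROM EMPLOYEE
WHERE (ORGANIZATIONID, SALARY) IN (SELECT ORGANIZATIONID, MAX(SALARY) FROM EMPLOYEE GROUP BY ORGANIZATIONID)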

SQL - Using Sum command multiplies the salary for the same person

I'm trying to increase the salary by 10% for those employees who treat at least 2 patients. My problem is that the salary is first multiplied by the number of patients they treat and then increased by 10%. For example, if the employee earns 25.000 and treats 3 patients, the new salary becomes 82.500.
select distinct t.empNbr, e.Salary, sum(e.Salary*1.1) as NewSalary from Treats t
inner join Employee e
on e.empNbr=t.empNbr
WHERE t.empNbr IN
(
SELECT empNbr
FROM Treats
GROUP BY empNbr
HAVING COUNT(*) >= 2)
group by t.empNbr, e.Salary
CROSS APPLY should help:
SELECT e.empNbr,
e.Salary,
e.Salary*1.1 as NewSalary
FROM Employee e
CROSS APPLY (
SELECT empNbr
FROM Treats
WHERE e.empNbr = empNbr
GROUP BY empNbr
HAVING COUNT(*) > 1
) as t
The t part returns the empNbr values we need (employees with more than one row in Treats). Then we select empNbr and Salary from the Employee table and do the math :)
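To see where the multiplication comes from, the sketch below reproduces it with Python's sqlite3 module. SQLite has no CROSS APPLY, so the corrected query here uses an IN subquery instead; the patientNbr column and the sample rows are made up for illustration.

```python
import sqlite3

def _load(employees, treats):
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Employee (empNbr INTEGER PRIMARY KEY, Salary REAL)")
    # patientNbr is a hypothetical column; the question does not show the table.
    conn.execute("CREATE TABLE Treats (empNbr INTEGER, patientNbr INTEGER)")
    conn.executemany("INSERT INTO Employee VALUES (?, ?)", employees)
    conn.executemany("INSERT INTO Treats VALUES (?, ?)", treats)
    return conn

def summed_over_join(employees, treats):
    """The join repeats each employee once per Treats row, so SUM adds the salary up."""
    conn = _load(employees, treats)
    rows = conn.execute(
        "SELECT e.empNbr, SUM(e.Salary) FROM Treats t "
        "JOIN Employee e ON e.empNbr = t.empNbr "
        "GROUP BY e.empNbr ORDER BY e.empNbr"
    ).fetchall()
    conn.close()
    return rows

def raised_salaries(employees, treats):
    """Read Employee once and use Treats only as a filter, so nothing is repeated."""
    conn = _load(employees, treats)
    rows = conn.execute(
        "SELECT empNbr, Salary, ROUND(Salary * 1.1, 2) AS NewSalary "
        "FROM Employee "
        "WHERE empNbr IN (SELECT empNbr FROM Treats GROUP BY empNbr HAVING COUNT(*) >= 2) "
        "ORDER BY empNbr"
    ).fetchall()
    conn.close()
    return rows

emps = [(1, 25000.0), (2, 30000.0)]
treats = [(1, 101), (1, 102), (1, 103), (2, 201)]
print(summed_over_join(emps, treats))  # employee 1 is counted three times
print(raised_salaries(emps, treats))   # only employee 1, salary raised once
```

With three Treats rows for employee 1, the joined SUM gives 75000, and applying 10% to that is the 82.500 from the question.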
One more way:
SELECT TOP 1 WITH TIES
e.empNbr,
e.Salary,
e.Salary*1.1 as NewSalary
FROM Employee e
INNER JOIN Treats t
ON e.empNbr = t.empNbr
ORDER BY
CASE WHEN COUNT(t.empNbr) OVER (PARTITION BY t.empNbr ORDER BY t.empNbr) > 1 THEN 1 ELSE 0 END DESC,
ROW_NUMBER() OVER (PARTITION BY t.empNbr ORDER BY t.empNbr)
This should be the right query. Let me know if this works:
select empNbr, Salary, Salary*1.1 as NewSalary
from employee
where empNbr in (select empNbr
from Treats
group by empNbr
having count(*) >=2)
Christopher, you can use the query below to get the result:
SELECT e.empNbr,
e.Salary,
(e.Salary + (e.Salary/10)) as NewSalary
from Employee e
INNER JOIN
(SELECT empNbr, COUNT(*) AS cnt FROM Treats GROUP BY empNbr) t
ON e.empNbr=t.empNbr
AND t.cnt >=2
Explanation
1. We can calculate the result without a HAVING clause.
2. In the inner join, empNbr and the count of Treats rows for that employee are derived.
3. The join condition uses this count to keep only employees with at least 2 patients, and the select adds 10% to the current salary once.
Hope this is what you need. If there are any issues, feel free to ask.

How can I get all employees with a salary less than the average salary?

I can get the count of employees and the average salary, but when I try to add a condition so that it lists the number of employees paid below the average, it fails.
select count(employee_id),avg(salary)
from employees
Where salary < avg(salary);
select count(*), (select avg(salary) from employees)
from employees
where salary < (select avg(salary) from employees);
The problem is that AVG is an aggregate function, and aggregate functions cannot be used in a WHERE clause, because WHERE filters individual rows before any aggregation happens. The traditional way is to use a join:
select count(*), avg(e.salary),
sum(case when e.salary < const.AvgSalary then 1 else 0 end) as NumBelowAverage
from employees e cross join
(select avg(salary) as AvgSalary from employees) as const
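Here is a small sketch of that cross join run through Python's sqlite3 module, in case it helps to see the numbers. SQLite stands in for the real database, and the sample salaries are made up.

```python
import sqlite3

def salary_stats(salaries):
    """Return (employee count, average salary, number below average)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE employees (employee_id INTEGER PRIMARY KEY, salary REAL)")
    conn.executemany("INSERT INTO employees (salary) VALUES (?)", [(s,) for s in salaries])
    row = conn.execute(
        """
        SELECT COUNT(*),
               AVG(e.salary),
               SUM(CASE WHEN e.salary < c.avg_salary THEN 1 ELSE 0 END)
        FROM employees e
        -- The derived table has exactly one row, so the cross join simply
        -- attaches the overall average to every employee row.
        CROSS JOIN (SELECT AVG(salary) AS avg_salary FROM employees) AS c
        """
    ).fetchone()
    conn.close()
    return row

print(salary_stats([1000.0, 2000.0, 6000.0]))  # (3, 3000.0, 2)
```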
select TotalNumberOfEmployees,
AverageSalary,
count(e.employee_id) NumberOfEmployeesBelowAverageSalary
from (
select count(employee_id) TotalNumberOfEmployees,
avg(salary) AverageSalary
from employees
) preagg
left join employees e on e.salary < preagg.AverageSalary
group by TotalNumberOfEmployees,
AverageSalary
Note: I used a LEFT join so that if you had 3 employees with equal salaries, it would show 0 instead of no results (nobody below average).
It isn't clear which columns you want in your result set, which makes it difficult to answer your question. Making the question clear improves the quality of the answers.
You seem to want 3 facts:
Number of employees.
Average salary.
Number of employees earning less than the average salary.
And you show a query which does the job for the first two facts:
SELECT COUNT(*) AS NumberOfEmployees,
AVG(Salary) AS AverageSalary
FROM Employees
What's the difference between COUNT(*) and COUNT(Employee_ID)? The difference is that the latter only counts the rows where there is a non-NULL value in the Employee_ID column. A good optimizer will recognize that Employee_ID is a primary key and contains no NULL values, and the query will be the same. But COUNT(*) is more conventional and less reliant on the optimizer.
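The COUNT(*) versus COUNT(column) difference is easy to check on a nullable column. The sketch below uses Python's sqlite3 module with a made-up single-column table; it is only an illustration, not the Employees table from the question.

```python
import sqlite3

def count_star_vs_column(values):
    """Return (COUNT(*), COUNT(v)) for a one-column table holding `values`."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (v INTEGER)")  # hypothetical nullable column
    conn.executemany("INSERT INTO t VALUES (?)", [(v,) for v in values])
    # COUNT(*) counts rows; COUNT(v) counts only rows where v is not NULL.
    row = conn.execute("SELECT COUNT(*), COUNT(v) FROM t").fetchone()
    conn.close()
    return row

print(count_star_vs_column([1, None, 3]))  # (3, 2)
```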
The other statistic can be generated as a simple value in the select-list via a sub-query:
SELECT COUNT(*) AS NumberOfEmployees,
AVG(Salary) AS AverageSalary,
(SELECT COUNT(*)
FROM Employees
WHERE Salary < (SELECT AVG(Salary) FROM Employees)
) AS NumberOfEmployeesPaidSubAverageWages
FROM Employees
Under many circumstances, it would not be appropriate to write the sub-query like that, but for the interpretation of the specified query, it is fine.
select * from <table name> where <salary column name> < (select avg(<salary column name>) from <table name>);
Example:
select * from EMPLOYEE where sal < (select avg(sal) from EMPLOYEE);
SELECT e.ename,e.deptno,e.sal,d.avg
FROM emp e,(SELECT deptno, avg(sal) avg
FROM emp
GROUP BY deptno) d
WHERE e.deptno=d.deptno
AND
e.sal < d.avg