Extra Fields with SQL MIN() & GROUP BY

When using the SQL MIN() function, along with GROUP BY, will any additional columns (not the MIN column, or one of the GROUP BY columns) match the data in the matching MIN row?
For example, given a table with department names, employee names, and salary:
SELECT MIN(e.salary), e.* FROM employee e GROUP BY department
Obviously I'll get two good columns, the minimum salary and the department. Will the employee name (and any other employee fields) be from the same row? Namely the row with the MIN(salary)?
I know there could very possibly be two employees with the same (and lowest) salary, but all I'm concerned with (now) is getting all the information on the (or a single) cheapest employee.
Would this select the cheapest salesman?
SELECT min(salary), e.* FROM employee e WHERE department = 'sales'
Essentially, can I be sure that the data returned along with the MIN() function will match the (or a single) record with that minimum value?
If the database matters, I'm working with MySQL.

If you wanted to get the "cheapest" employee in each department you would have two choices off the top of my head:
SELECT
E.* -- Don't actually use *, list out all of your columns
FROM
Employees E
INNER JOIN
(
SELECT
department,
MIN(salary) AS min_salary
FROM
Employees
GROUP BY
department
) AS SQ ON
SQ.department = E.department AND
SQ.min_salary = E.salary
Or you can use:
SELECT
E1.*
FROM
Employees E1
LEFT OUTER JOIN Employees E2 ON
E2.department = E1.department AND
E2.salary < E1.salary
WHERE
E2.employee_id IS NULL -- You can use any NOT NULL column here
The second statement works by effectively saying, show me all employees where you can't find another employee in the same department with a lower salary.
In both cases, if two or more employees have equal salaries that are the minimum you will get them both (all).
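To make the tie behavior concrete, here is a minimal sketch that runs both queries from Python against an in-memory SQLite database. The question targets MySQL; SQLite, the `name` column, and the sample rows are assumptions chosen only to keep the example self-contained.

```python
import sqlite3

# Hypothetical sample data; table and key column names follow the answer above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employees (
        employee_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL,
        department  TEXT NOT NULL,
        salary      INTEGER NOT NULL
    );
    INSERT INTO Employees (name, department, salary) VALUES
        ('Ann', 'sales', 100),
        ('Bob', 'sales', 100),  -- ties with Ann for the sales minimum
        ('Cy',  'sales', 250),
        ('Dee', 'ops',   300),
        ('Eli', 'ops',   180);
""")

# Approach 1: join each employee to the minimum salary of their department.
JOIN_ON_MIN = """
    SELECT E.name, E.department, E.salary
    FROM Employees E
    INNER JOIN (
        SELECT department, MIN(salary) AS min_salary
        FROM Employees
        GROUP BY department
    ) AS SQ
        ON SQ.department = E.department
       AND SQ.min_salary = E.salary
    ORDER BY E.department, E.name
"""

# Approach 2: keep employees for whom no cheaper colleague exists.
ANTI_JOIN = """
    SELECT E1.name, E1.department, E1.salary
    FROM Employees E1
    LEFT OUTER JOIN Employees E2
        ON E2.department = E1.department
       AND E2.salary < E1.salary
    WHERE E2.employee_id IS NULL
    ORDER BY E1.department, E1.name
"""

join_rows = conn.execute(JOIN_ON_MIN).fetchall()
anti_rows = conn.execute(ANTI_JOIN).fetchall()
print(join_rows)
print(anti_rows)
```

With this data both queries return Eli for ops and both Ann and Bob for sales, since the two sales employees share the minimum salary.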

SELECT e.*
FROM employee e
WHERE e.id =
(
SELECT id
FROM employee ei
WHERE ei.department = 'sales'
ORDER BY
ei.salary
LIMIT 1
)
To get values for each department, use:
SELECT e.*
FROM department d
LEFT JOIN
employee e
ON e.id =
(
SELECT id
FROM employee ei
WHERE ei.department = d.id
ORDER BY
ei.salary
LIMIT 1
)
To get values only for those departments that have employees, use:
SELECT e.*
FROM (
SELECT DISTINCT eo.department
FROM employee eo
) d
JOIN
employee e
ON e.id =
(
SELECT id
FROM employee ei
WHERE ei.department = d.department
ORDER BY
ei.salary
LIMIT 1
)
Of course, having an index on (department, salary) will greatly improve all three queries.
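As a rough illustration of the third query together with the suggested index, the following sketch uses Python's sqlite3 module with an in-memory database. SQLite stands in for MySQL here, and the sample rows, the `name` column, the index name, and the extra `ei.id` tie-breaker are all assumptions added for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (
        id         INTEGER PRIMARY KEY,
        name       TEXT NOT NULL,
        department TEXT NOT NULL,
        salary     INTEGER NOT NULL
    );
    -- Composite index on (department, salary), as suggested above.
    CREATE INDEX idx_employee_department_salary
        ON employee (department, salary);
    INSERT INTO employee (name, department, salary) VALUES
        ('Ann', 'sales', 100),
        ('Bob', 'sales', 100),
        ('Cy',  'sales', 250),
        ('Dee', 'ops',   300),
        ('Eli', 'ops',   180);
""")

# For every department that has employees, pick exactly one cheapest employee.
ONE_PER_DEPARTMENT = """
    SELECT e.name, e.department, e.salary
    FROM (SELECT DISTINCT eo.department FROM employee eo) d
    JOIN employee e
      ON e.id = (
            SELECT ei.id
            FROM employee ei
            WHERE ei.department = d.department
            ORDER BY ei.salary, ei.id   -- ei.id is an added tie-breaker
            LIMIT 1
         )
    ORDER BY e.department
"""

rows = conn.execute(ONE_PER_DEPARTMENT).fetchall()
print(rows)
```

Unlike the join on MIN(salary), this form returns a single row per department even when several employees share the lowest salary; here the tie between Ann and Bob is resolved in favor of the lower id.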

The fastest solution:
SET @dep := '';
SELECT * FROM (
SELECT * FROM `employee` ORDER BY `department`, `salary`
) AS t WHERE IF ( @dep = t.`department`, FALSE, ( @dep := t.`department` ) OR TRUE );

Another approach is to use analytic (window) functions. Here is a query using ROW_NUMBER():
select first_name, salary
from (
select first_name, salary,
row_number() over (partition by department_id order by salary asc) as row_count
from employees
) t
where row_count = 1;

Related

Repeat query for every value in a column and union all results

I wrote a query which gives me top 3 salaries for the specific department id. For example for DepartmentId=100:
select distinct top 3 e.Id, e.Salary, dep.Id
from Employee e
inner join Department dep on e.DepartmentId = dep.Id and dep.Id = 100
and it's working fine.
Now, I want to run previous query for every department id and union all results. Something like this (written in a pseudocode):
Result <- empty
foreach depId in [Department].id
Result = Result UNION run previous query with depId (instead of 100)
How can I achieve this with SQL?
If you only need the department's id in the results (and not its name also) then the join is not necessary because this id exists in the table Employee.
Use row_number() window function:
select e.Id, e.Salary, e.DepartmentId
from (
select
Id, Salary, DepartmentId,
row_number() over (partition by DepartmentId order by Salary desc) rn
from Employee
) e
where e.rn <= 3
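Here is a small sketch of the row_number() approach, run from Python against an in-memory SQLite database instead of SQL Server (an assumption made so the example is self-contained; it needs SQLite 3.25 or later for window functions). The sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employee (
        Id           INTEGER PRIMARY KEY,
        Salary       INTEGER NOT NULL,
        DepartmentId INTEGER NOT NULL
    );
    INSERT INTO Employee (Id, Salary, DepartmentId) VALUES
        (1, 50, 100), (2, 70, 100), (3, 90, 100), (4, 60, 100),
        (5, 40, 200), (6, 80, 200);
""")

# Number the rows of each department from highest to lowest salary,
# then keep the first three of every department.
TOP_THREE = """
    SELECT e.Id, e.Salary, e.DepartmentId
    FROM (
        SELECT Id, Salary, DepartmentId,
               ROW_NUMBER() OVER (
                   PARTITION BY DepartmentId
                   ORDER BY Salary DESC
               ) AS rn
        FROM Employee
    ) e
    WHERE e.rn <= 3
    ORDER BY e.DepartmentId, e.Salary DESC
"""

top_rows = conn.execute(TOP_THREE).fetchall()
print(top_rows)
```

Department 100 has four employees, so its lowest salary (50) is dropped; department 200 has only two, so both are returned. No loop or UNION over department ids is needed.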
You can use APPLY:
select t.Id, t.Salary, t.DepartmentId
from Department dep
cross apply (
select distinct top 3 e.Id, e.Salary, e.DepartmentId
from Employee e
where e.DepartmentId = dep.Id
order by e.Salary desc) t
This will only return Employee rows for which parent Department exists.

The best way to find the Department with the maximum total salary Postgresql

Suppose we have two standard tables, employees and departments:
CREATE TABLE departments (
id SERIAL PRIMARY KEY,
name VARCHAR
);
CREATE TABLE employees (
id SERIAL PRIMARY KEY,
department_id INTEGER,
name VARCHAR,
salary NUMERIC(13,2)
);
What is the best way to find the name of the department with the maximum total employee salary?
I've found two solutions, and they look too complicated for such a simple task.
Using rank()
SELECT name FROM (
SELECT name, rank() OVER ( ORDER BY salary DESC ) AS rank
FROM (
SELECT
departments.name,
sum(salary) AS salary
FROM employees
JOIN departments ON department_id = departments.id
GROUP BY departments.name
) AS t1
) AS t2
WHERE rank = 1;
Using subquery
WITH t1 AS (SELECT
departments.name,
sum(salary) AS salary
FROM employees
JOIN departments ON departments.id = employees.department_id
GROUP BY departments.name
)
SELECT name FROM t1
WHERE t1.salary = (SELECT max(salary) FROM t1);
At first glance using rank should be less efficient as it performs unnecessary sorting. Though EXPLAIN shows that the first option is more efficient.
Or maybe someone can suggest another solution.
So, what is the best way to find the Department with the maximum total salary using postgres?
I would write the rank() as:
SELECT *
FROM (SELECT d.name, SUM(e.salary) AS salary,
RANK() OVER (ORDER BY SUM(e.salary) DESC) AS rnk
FROM employees e JOIN
departments d
ON e.department_id = d.id
GROUP BY d.name
) d
WHERE rnk = 1;
(The additional subquery should not affect performance, but it adds nothing to clarify the query either.)
Because window functions are built into the database, the database has methods for making them more efficient. And there is overhead for getting the MAX() as well. But, to be honest, I would expect both methods to have similar performance.
I should note that if you want only one department returned -- even when there are ties -- then the simplest method is:
SELECT d.name, SUM(e.salary) AS salary
FROM employees e JOIN
departments d
ON e.department_id = d.id
GROUP BY d.name
ORDER BY SUM(e.salary) DESC
FETCH FIRST 1 ROW ONLY;
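The difference between the RANK() form and the single-row form only shows up when departments tie. The sketch below illustrates that from Python with an in-memory SQLite database, which is an assumption: SQLite stands in for PostgreSQL, `LIMIT 1` replaces `FETCH FIRST 1 ROW ONLY`, salaries are stored as integers, and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE departments (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE employees (
        id            INTEGER PRIMARY KEY,
        department_id INTEGER,
        name          TEXT,
        salary        INTEGER
    );
    INSERT INTO departments (id, name) VALUES (1, 'eng'), (2, 'ops'), (3, 'hr');
    INSERT INTO employees (department_id, name, salary) VALUES
        (1, 'a', 100), (1, 'b', 200),   -- eng total: 300
        (2, 'c', 300),                  -- ops total: 300 (ties with eng)
        (3, 'd', 150);                  -- hr total:  150
""")

# Total salary per department, reused by both variants below.
TOTALS = """
    SELECT d.name AS name, SUM(e.salary) AS total
    FROM employees e
    JOIN departments d ON e.department_id = d.id
    GROUP BY d.name
"""

# Variant 1: RANK() keeps every department tied for the maximum.
ALL_TIES = f"""
    SELECT name, total
    FROM (
        SELECT name, total, RANK() OVER (ORDER BY total DESC) AS rnk
        FROM ({TOTALS})
    )
    WHERE rnk = 1
    ORDER BY name
"""

# Variant 2: sort and keep one row, whichever tied department comes first.
ONE_ROW = f"""
    SELECT name, total
    FROM ({TOTALS})
    ORDER BY total DESC
    LIMIT 1
"""

all_ties = conn.execute(ALL_TIES).fetchall()
one_row = conn.execute(ONE_ROW).fetchall()
print(all_ties)
print(one_row)
```

With eng and ops both totaling 300, the RANK() variant returns both departments, while the single-row variant returns just one of them.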

TROUBLE IN MULTIPLE SUBSELECT QUERY

I need to find the name, function, office name, and salary of all employees with the same function as PETER or a salary greater than or equal to the salary of SANDERS, ordered by function and salary.
There are two tables: office and employee
table office contains:
officenumber
name
city
table employee contains:
employeenumber
name
function
manager
sal
officenumber
this is my current SQL query:
SELECT NAME,
FUNCTION,
SAL
FROM EMPLOYEE
WHERE FUNCTION = (SELECT FUNCTION
FROM EMPLOYEE
WHERE NAME = 'PIETERS')
I'm stuck with the query.
Assuming this is SQL Server (you never specified), something like this should work.
SELECT
e.name,
e.function,
e.sal,
o.name AS officename
FROM employee e
JOIN office o ON e.officenumber = o.officenumber
WHERE
e.function = (SELECT function FROM employee WHERE name = 'PIETERS') OR
e.sal >= (SELECT sal FROM employee WHERE name = 'SANDERS')
ORDER BY e.function, e.sal
You'll have to tweak this a bit if you're working with MySQL or something else.
Three things you need to do here:
1. join the two tables, since you need results from both tables
2. filter the results according to the two criteria
3. order the results
The first part is easy; you just need to join them on officenumber:
select e.name, e.function, o.name as officeName, e.sal from
employee e inner join office o
on e.officenumber = o.officenumber
The second part is a simple where clause:
where e.function = (select function from employee where name = 'PETER')
or e.sal >= (select sal from employee where name = 'SANDERS')
And the last part is the ordering:
order by e.function, e.sal
Putting it all together:
select e.name, e.function, o.name as officeName, e.sal from
employee e inner join office o
on e.officenumber = o.officenumber
where e.function = (select function from employee where name = 'PETER')
or e.sal >= (select sal from employee where name = 'SANDERS')
order by e.function, e.sal
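The three steps can be tried out with a minimal sketch like the following, which runs the combined query from Python against an in-memory SQLite database. The database engine, the office names and cities, and every employee row are assumptions invented for illustration; only the table and column names come from the question. The `function` column is quoted because it is a reserved word in some databases.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE office (
        officenumber INTEGER PRIMARY KEY,
        name         TEXT,
        city         TEXT
    );
    CREATE TABLE employee (
        employeenumber INTEGER PRIMARY KEY,
        name           TEXT,
        "function"     TEXT,
        manager        INTEGER,
        sal            INTEGER,
        officenumber   INTEGER
    );
    INSERT INTO office VALUES (1, 'HQ', 'Utrecht'), (2, 'Branch', 'Gent');
    INSERT INTO employee VALUES
        (1, 'PETER',   'clerk',    NULL, 2000, 1),
        (2, 'SANDERS', 'manager',  NULL, 5000, 1),
        (3, 'JONES',   'clerk',    2,    1800, 2),
        (4, 'KING',    'director', NULL, 9000, 1),
        (5, 'MILLS',   'analyst',  2,    3000, 2);
""")

QUERY = """
    SELECT e.name, e."function", o.name AS officename, e.sal
    FROM employee e
    INNER JOIN office o ON e.officenumber = o.officenumber      -- step 1: join
    WHERE e."function" = (SELECT "function" FROM employee
                          WHERE name = 'PETER')                 -- step 2: filter
       OR e.sal >= (SELECT sal FROM employee WHERE name = 'SANDERS')
    ORDER BY e."function", e.sal                                -- step 3: order
"""

result = conn.execute(QUERY).fetchall()
for row in result:
    print(row)
```

JONES and PETER qualify through the shared function, KING and SANDERS through the salary condition, and MILLS matches neither. Each scalar subquery is expected to match a single employee, so duplicate names would need extra handling.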

SQL Query: "Write one SQL Query to calculate the maximum salaries for employees by Job Classification"

The task is to "Write one SQL Query to calculate the maximum salaries for employees by Job Classification. (Output shows Alias)".
Two tables were created, Employee and Job_Title. The Employee table contains the Salary, while the Job_Title table contains the Job Classification, such as 'Manager'.
My current code shows the employee who has the max salary in that classification; however, the alias is not showing. It just displays all of the information for that employee.
Here is my code:
SELECT *
FROM Employee
WHERE Salary IN (
SELECT MAX(Salary) AS 'Maximum_Salary_Class'
FROM Employee
WHERE JobID IN ( SELECT JobID
FROM Job_Title_Table_
WHERE Job_Classification = 'Manager' ) );
Something like this:
select t.Job_Classification, max(e.salary) as 'Maximum_Salary_Class'
from Employee e join Job_Title_Table_ t on e.JobID = t.JobId
group by t.Job_Classification;
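A quick way to check that the alias appears in the output is to run the grouped query and inspect the result column names. The sketch below does that from Python with an in-memory SQLite database; the engine, the `Name` column, and the sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Job_Title_Table_ (
        JobID              INTEGER PRIMARY KEY,
        Job_Classification TEXT NOT NULL
    );
    CREATE TABLE Employee (
        EmpID  INTEGER PRIMARY KEY,
        Name   TEXT NOT NULL,
        Salary INTEGER NOT NULL,
        JobID  INTEGER NOT NULL
    );
    INSERT INTO Job_Title_Table_ VALUES (1, 'Manager'), (2, 'Manager'), (3, 'Clerk');
    INSERT INTO Employee (Name, Salary, JobID) VALUES
        ('Ann', 5000, 1), ('Bob', 7000, 2), ('Cy', 2000, 3), ('Dee', 2500, 3);
""")

# One output row per classification, with the maximum exposed under an alias.
MAX_BY_CLASS = """
    SELECT t.Job_Classification AS Job_Classification,
           MAX(e.Salary)        AS Maximum_Salary_Class
    FROM Employee e
    JOIN Job_Title_Table_ t ON e.JobID = t.JobID
    GROUP BY t.Job_Classification
    ORDER BY t.Job_Classification
"""

cursor = conn.execute(MAX_BY_CLASS)
column_names = [column[0] for column in cursor.description]
max_rows = cursor.fetchall()
print(column_names)
print(max_rows)
```

The alias only shows up because MAX(e.Salary) is in the outer select list. In the original query the alias lived inside a subquery used for filtering, so it never reached the output.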
Try this.
;with cte as
(
select E.*, J.Job_Classification, dense_rank() over (partition by J.Job_Classification order by E.Salary desc) as DenseRank
from Employee E
inner join Job_Title J on E.JobID = J.JobID
)
select * from cte
where DenseRank = 1
I tried to understand your purpose.
Let me know if I misunderstood.
I think you want this:
SELECT E.*,Job_Classification
FROM Employee AS E
,(
SELECT J.JobID,J.Job_Classification,MAX(Salary) AS 'Maximum_Salary_Class'
FROM Employee AS E
,Job_Title_Table_ AS J
WHERE E.JobID = J.JobID
AND Job_Classification = 'Manager'
GROUP BY J.JobID,J.Job_Classification
) AS EJ
WHERE E.JobID = EJ.JobID
AND Salary = Maximum_Salary_Class

Get MAX element based on two different tables

I have a problem with an SQL query on an Oracle DB. I have the following tables:
DEPARTMENT(`ID` NUMBER(11), `NAME` VARCHAR(25))
EMPLOYEE(`ID` INT(11), `LASTNAME` VARCHAR(25), `DEP_ID` INT(11));
SALARIES(`ID` INT(11), `EMPLOYEE_ID` INT(11), `SALARY` INT(11));
Now, I want to get the name of the department with the highest average sum of salary. Department isn't directly related to Salaries, so I probably need to use the Employee table as well.
I've created a query:
SELECT NAME, (SELECT SUM(SALARIES.SALARY) FROM SALARIES JOIN EMPLOYEE ON EMPLOYEE.EMPLOYEE_ID = EMPLOYEE.ID WHERE EMPLOYEE.DEP_ID = DEPARTMENT.ID GROUP BY EMPLOYEE.ID) AS AVG_OF_SUM FROM DEPARTMENT;
It returns a list of department names and the avg sum. But now I need to get only the one department name for the highest average row.
Is my query actually OK? Or can it be improved? And how can I get only one record?
Thanks for any help.
Regards,
D
Make use of the ANALYTIC function SUM...OVER
In the subquery, apply the analytic function, and then select only those rows which you desire.
For example,
SELECT DISTINCT DEPT, SUM(SAL) OVER (PARTITION BY DEPT ORDER BY DEPT) SUM_SAL
FROM EMPLOYEE
ORDER BY DEPT;
SELECT NAME, (SELECT MAX(SUM(SALARIES.SALARY))
FROM SALARIES
JOIN EMPLOYEE ON SALARIES.EMPLOYEE_ID = EMPLOYEE.ID
WHERE EMPLOYEE.DEP_ID = DEPARTMENT.ID
GROUP BY EMPLOYEE.ID) AS AVG_OF_SUM FROM DEPARTMENT;
SELECT NAME, avg_sal FROM
(SELECT d.NAME, avg(s.SALARY) avg_sal
FROM SALARIES s
JOIN EMPLOYEE e ON s.EMPLOYEE_ID = e.ID
JOIN DEPARTMENT d ON e.DEP_ID = d.ID
GROUP BY d.NAME
ORDER BY 2 DESC)
WHERE rownum = 1;
(This query shows the department with the highest average salary. If you need the sum, replace AVG with SUM.)
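The following sketch runs the same three-table aggregation from Python against an in-memory SQLite database, once with AVG and once with SUM. It is only an approximation of the Oracle query: SQLite has no `rownum`, so `ORDER BY ... LIMIT 1` is used instead, and the sample rows and the `top_department` helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE DEPARTMENT (ID INTEGER PRIMARY KEY, NAME TEXT);
    CREATE TABLE EMPLOYEE (ID INTEGER PRIMARY KEY, LASTNAME TEXT, DEP_ID INTEGER);
    CREATE TABLE SALARIES (ID INTEGER PRIMARY KEY, EMPLOYEE_ID INTEGER, SALARY INTEGER);
    INSERT INTO DEPARTMENT VALUES (1, 'R&D'), (2, 'Sales');
    INSERT INTO EMPLOYEE VALUES (1, 'A', 1), (2, 'B', 1), (3, 'C', 2);
    INSERT INTO SALARIES VALUES
        (1, 1, 1000), (2, 1, 3000),  -- two salary rows for employee A
        (3, 2, 2000),
        (4, 3, 2500);
""")


def top_department(aggregate):
    """Return (name, value) for the department with the highest aggregate.

    aggregate is interpolated into the SQL text, so only pass the
    trusted names "AVG" or "SUM".
    """
    query = f"""
        SELECT d.NAME, {aggregate}(s.SALARY) AS agg_sal
        FROM SALARIES s
        JOIN EMPLOYEE e ON s.EMPLOYEE_ID = e.ID
        JOIN DEPARTMENT d ON e.DEP_ID = d.ID
        GROUP BY d.NAME
        ORDER BY agg_sal DESC
        LIMIT 1
    """
    return conn.execute(query).fetchone()


best_by_avg = top_department("AVG")
best_by_sum = top_department("SUM")
print(best_by_avg)
print(best_by_sum)
```

With this data the two aggregates pick different departments: Sales has the higher average (2500 versus 2000), while R&D has the higher total (6000 versus 2500), so it matters which of the two the question actually needs.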