SQL Query - Unsure How to Fix Logical Error - sql

Edit: Sorry! I am using Microsoft SQL Server.
For clarification, you can have a department named "x" with a list of jobs, a department named "y" with a different list of jobs, etc.
I also need to use >= ALL instead of TOP 1 or MAX because I need it to return more than one value if necessary (if job1 has 20 employees, job2 has 20 employees and they are both the biggest values, they should both return).
In my query I'm trying to find the most common jobTitle and the number of employees that work under this jobTitle, which is under the department 'Research and Development'. The query I've written consists of joins to be able to return the necessary data.
The problem I am having is with the WHERE statement. The HAVING COUNT(JobTitle) >= ALL is finding the biggest number of employees that work under a job, however the problem is that my WHERE statement is saying the Department must be 'Research and Development', but the job with the most amount of employees comes from a different department, and thus the output produces only the column names and nothing else.
I want to redo the query so that it returns the job with the largest amount of employees that comes from the Research and Development department.
I know this is probably pretty simple, I'm a noob :3 Thanks a lot for the help!
SELECT JobTitle, COUNT(JobTitle) AS JobTitleCount, Department
FROM HumanResources.Employee AS EMP JOIN
HumanResources.EmployeeDepartmentHistory AS HIST
ON EMP.BusinessEntityID = HIST.BusinessEntityID JOIN
HumanResources.Department AS DEPT
ON HIST.DepartmentID = DEPT.DepartmentID
WHERE Department = 'Research and Development'
GROUP BY JobTitle, Department
HAVING COUNT(JobTitle) >= ALL (
SELECT COUNT(JobTitle) FROM HumanResources.Employee
GROUP BY JobTitle
)

If you only want one row, then a typical method is:
SELECT JobTitle, COUNT(*) AS JobTitleCount
FROM HumanResources.Employee AS EMP JOIN
HumanResources.EmployeeDepartmentHistory AS HIST
ON EMP.BusinessEntityID = HIST.BusinessEntityID JOIN
HumanResources.Department AS DEPT
ON HIST.DepartmentID = DEPT.DepartmentID
WHERE Department = 'Research and Development'
GROUP BY JobTitle
ORDER BY COUNT(*) DESC
FETCH FIRST 1 ROW ONLY;
Although FETCH FIRST 1 ROW ONLY is the ANSI standard, some databases spell it LIMIT or even SELECT TOP (1).
Note that I removed DEPARTMENT both from the SELECT and the GROUP BY. It seems redundant.
And, if I had to guess, your query is going to overstate results because of the history table. If this is the case, ask another question, with sample data and desired results.
EDIT:
In SQL Server, I would recommend using window functions. To get the one top job title:
SELECT JobTitle, JobTitleCount
FROM (SELECT JobTitle, COUNT(*) AS JobTitleCount,
ROW_NUMBER() OVER (ORDER BY COUNT(*) DESC) as seqnum
FROM HumanResources.Employee AS EMP JOIN
HumanResources.EmployeeDepartmentHistory AS HIST
ON EMP.BusinessEntityID = HIST.BusinessEntityID JOIN
HumanResources.Department AS DEPT
ON HIST.DepartmentID = DEPT.DepartmentID
WHERE Department = 'Research and Development'
GROUP BY JobTitle
) j
WHERE seqnum = 1;
To get all such titles, when there are duplicates, use RANK() or DENSE_RANK() instead of ROW_NUMBER().
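The difference between ROW_NUMBER() and RANK() on tied counts can be checked with a small sketch. It assumes an in-memory SQLite database (3.25 or later, for window functions) as a stand-in for SQL Server; the emp table and its rows are hypothetical and are not taken from AdventureWorks.

```python
import sqlite3

# Hypothetical stand-in data: one row per employee (job title, department).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (job_title TEXT, department TEXT)")
conn.executemany(
    "INSERT INTO emp VALUES (?, ?)",
    [("Engineer", "R&D")] * 3
    + [("Scientist", "R&D")] * 3
    + [("Manager", "R&D")]
    + [("Technician", "Production")] * 5,  # bigger count, but another department
)

# Count per job title inside one department, then number the counts.
query = """
WITH counts AS (
    SELECT job_title, COUNT(*) AS job_count
    FROM emp
    WHERE department = 'R&D'
    GROUP BY job_title
),
ranked AS (
    SELECT job_title, job_count,
           {fn}() OVER (ORDER BY job_count DESC) AS seqnum
    FROM counts
)
SELECT job_title, job_count
FROM ranked
WHERE seqnum = 1
ORDER BY job_title
"""

top_with_rank = conn.execute(query.format(fn="RANK")).fetchall()
top_with_row_number = conn.execute(query.format(fn="ROW_NUMBER")).fetchall()

print(top_with_rank)             # both tied titles
print(len(top_with_row_number))  # exactly one row, chosen arbitrarily among ties
```

Because the department filter is applied before the counts are ranked, the larger count from the other department never competes with the Research and Development titles.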

with employee_counts as (
select
hist.DepartmentID, emp.JobTitle, count(*) as cnt,
case when dept.Department = 'Research and Development' then 1 else 0 end as is_rd
from HumanResources.Employee as emp
inner join HumanResources.EmployeeDepartmentHistory as hist
on hist.BusinessEntityID = emp.BusinessEntityID
inner join HumanResources.Department as dept
on dept.DepartmentID = hist.DepartmentID
group by
hist.DepartmentID, dept.Department, emp.JobTitle
)
select * from employee_counts
where is_rd = 1 and cnt = (
select max(cnt) from employee_counts
where is_rd = 1 -- compare only against other Research and Development job titles
);

Related

SQL: How to concatenate two cells from one column of multiple rows if all of the other row's cells are equal

Looking for DepartmentName1 + ', ' + DepartmentName2
I'm trying to merge two rows into one row when only one column has different values. Specifically I'm trying to list the name, job title, gender, pay rate, hire date and department name of the top 100 highest paid employees of the AdventureWorks2017 database. Here is the code I have so far:
SELECT TOP 100 (P.FirstName + ' ' + P.LastName) AS Name, HRE.JobTitle, HRE.Gender,
CAST(HRPH.Rate AS Decimal(10,2)) AS PayRate, HRE.HireDate, HRD.Name AS Department
FROM ((((Person.Person AS P
INNER JOIN HumanResources.Employee AS HRE
ON P.BusinessEntityID = HRE.BusinessEntityID)
INNER JOIN
(SELECT BusinessEntityID, MAX(RateChangeDate) AS RCD, MAX(Rate) AS Rate
FROM HumanResources.EmployeePayHistory
GROUP BY BusinessEntityID) AS HRPH
ON HRE.BusinessEntityID = HRPH.BusinessEntityID)
INNER JOIN HumanResources.EmployeeDepartmentHistory AS HRDH
ON HRE.BusinessEntityID = HRDH.BusinessEntityID)
INNER JOIN HumanResources.Department AS HRD
ON HRDH.DepartmentID = HRD.DepartmentID)
ORDER BY HRPH.Rate DESC;
This gives me the following result:
Two questions:
How can I get every 'Name' to be listed only once, regardless of DepartmentName? For example: Rows 5 & 6 to be only Row 5: Laura Norman | Chief Financial Officer | F | 60.10 | 2009-01-31 | Executive, Finance.
OR, David Bradley...|...Marketing, Purchasing
Does my code include an employee that may have gotten a pay cut? Meaning, the RateChangeDate (RCD) is MAX but the Rate is not?
Using Microsoft SQL Server 2019
You can use STRING_AGG() to combine the department names into one delimited value per employee:
SELECT TOP 100 (P.FirstName + ' ' + P.LastName) AS Name, HRE.JobTitle, HRE.Gender,
CAST(HRPH.Rate AS Decimal(10,2)) AS PayRate, HRE.HireDate, STRING_AGG(HRD.Name,',') AS Department
FROM ((((Person.Person AS P
INNER JOIN HumanResources.Employee AS HRE
ON P.BusinessEntityID = HRE.BusinessEntityID)
INNER JOIN
(SELECT BusinessEntityID, MAX(RateChangeDate) AS RCD, MAX(Rate) AS Rate
FROM HumanResources.EmployeePayHistory
GROUP BY BusinessEntityID) AS HRPH
ON HRE.BusinessEntityID = HRPH.BusinessEntityID)
INNER JOIN HumanResources.EmployeeDepartmentHistory AS HRDH
ON HRE.BusinessEntityID = HRDH.BusinessEntityID)
INNER JOIN HumanResources.Department AS HRD
ON HRDH.DepartmentID = HRD.DepartmentID)
GROUP BY P.FirstName,P.LastName,HRE.JobTitle, HRE.Gender, HRPH.Rate, HRE.HireDate
ORDER BY HRPH.Rate DESC;
To answer the second part, I took the liberty of creating an example that you may be able to work into your solution. The data you are working with lacks a unique key, and the combination of FirstName, LastName, and Gender is a poor candidate for one. You also mention RateChangeDate but do not say how to handle that value when the data is aggregated. The query below basically ignores RateChangeDate in the output and marks the records that show a decrease in pay. Removing those records takes another pass over the data; below I did it with a HAVING clause.
DECLARE @X TABLE (ID INT, Rate MONEY, RateChangeDate DATETIME, Department NVARCHAR(50))
INSERT @X VALUES
(1,25.00,'01/01/2021','A'),
(1,23.00,'05/01/2021','A'),
(2,25.00,'01/01/2021','A'),
(3,25.00,'01/01/2021','A'),
(3,26.00,'02/01/2021','A'),
(4,25.00,'01/01/2021','A'),
(4,25.00,'01/01/2021','B')
SELECT
ID,
SUM(LatestRate) AS LatestRate,
MAX(MaxRateChange) AS RateChanges,
Departments
FROM
(
SELECT
ID,
STRING_AGG(Department,',') AS Departments,
Rate,
MAX(RateChangeDate) AS MaxRateChange,
CASE WHEN LAG(Rate) OVER (PARTITION BY ID ORDER BY RateChangeDate) > Rate THEN 1 ELSE 0 END AS DecreaseInPay,
CASE WHEN MAX(RateChangeDate)OVER(PARTITION BY ID) = RateChangeDate THEN Rate ELSE NULL END LatestRate
FROM
@X
GROUP BY
ID,Rate,RateChangeDate
)AS X
GROUP BY
ID,Departments
HAVING
MAX(DecreaseInPay) = 0
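On the pay-cut question itself: MAX(RateChangeDate) and MAX(Rate) are computed independently, so the two values need not come from the same row. The sketch below illustrates this under some assumptions: it uses an in-memory SQLite database instead of SQL Server, a hypothetical pay_history table, and ISO-formatted dates so that text ordering matches date ordering.

```python
import sqlite3

# Hypothetical pay history; employee 1 took a pay cut on 2021-05-01.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE pay_history (id INTEGER, rate REAL, rate_change_date TEXT)"
)
conn.executemany(
    "INSERT INTO pay_history VALUES (?, ?, ?)",
    [
        (1, 25.0, "2021-01-01"),
        (1, 23.0, "2021-05-01"),
        (2, 25.0, "2021-01-01"),
        (3, 25.0, "2021-01-01"),
        (3, 26.0, "2021-02-01"),
    ],
)

# rn = 1 marks the most recent row per employee; its rate is the current rate.
rows = conn.execute(
    """
    SELECT id,
           MAX(rate) AS max_rate,
           MAX(CASE WHEN rn = 1 THEN rate END) AS latest_rate
    FROM (SELECT id, rate,
                 ROW_NUMBER() OVER (PARTITION BY id
                                    ORDER BY rate_change_date DESC) AS rn
          FROM pay_history) AS h
    GROUP BY id
    ORDER BY id
    """
).fetchall()

for emp_id, max_rate, latest_rate in rows:
    print(emp_id, max_rate, latest_rate)
```

For employee 1 the two columns differ (25.0 versus 23.0), which is the case the question asks about: a query that reports MAX(Rate) shows a rate the employee no longer has.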

How to display last names and numbers of all managers together with the number of employees that are his/her subordinates

This is the schema
I tried using inner joins, but the results were wrong:
SELECT
employees.last_name AS last_name,
COUNT(employees.job_id) AS EMPLOYEES_Subordinates,
COUNT(employees.manager_id) AS Manager_Numbers
FROM
employees left
JOIN departments ON departments.manager_id = employees.manager_id
GROUP BY
employees.last_name
ORDER BY
EMPLOYEES_Subordinates desc;
(I really don't know how to show you the tables from HR.)
If anyone has the HR schema in Oracle Database and has time to help me, I would gladly appreciate it.
Not quite sure, but try something like this:
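SELECT
e.LAST_NAME
,(SELECT COUNT(ee.EMPLOYEE_ID) FROM EMPLOYEES ee WHERE ee.MANAGER_ID = e.EMPLOYEE_ID) AS "NUMBER OF WORKERS"
FROM EMPLOYEES e
-- keep only employees who manage at least one other employee
WHERE EXISTS (SELECT 1 FROM EMPLOYEES ee WHERE ee.MANAGER_ID = e.EMPLOYEE_ID)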

JOIN - 2 tasks - sql developer

I have 2 tasks:
1. FIRST TASK
Show first_name, last_name (from employees), job_title, employee_id (from jobs) start_date, end_date (from job_history)
My idea:
SELECT s.employee_id
, first_name
, last_name
, job_title
, employee_id
, start_date
, end_date
FROM employees
INNER JOIN jobs hp
on s.employee_id = hp.employee_id
INNER JOIN job_history
on hp.jobs = h.jobs
I know it doesn't work. I'm receiving: "HP"."EMPLOYEE_ID": invalid identifier
What does "on s.employee_id = hp.employee_id" mean? Maybe I should write something else instead.
2. SECOND TASK
Show department_name (from departments), average and max salary for each department (those data are from employees) and how many employees are working in those departments (from employees). Choose only departments with more than 1 person. The result round to 2 decimal places.
I have the pieces, but I don't know how to connect them.
My idea:
SELECT department_name,average(salary),max(salary),count(employees_id)
FROM employees
INNER JOIN departments
on employees_id = departments_id
HAVING count(department) > 1
SELECT ROUND(average(salary),2) from employees
I modified your queries a bit by improving table aliasing. Hopefully, if the right columns are present in the tables as you say, it should work:
SELECT s.employee_id, s.first_name, s.last_name,
hp.job_title, hp.employee_id,
h.start_date, h.end_date
FROM employees s
INNER JOIN jobs hp
on s.employee_id = hp.employee_id
INNER JOIN job_history h
on hp.jobs = h.jobs;
When we say on s.employee_id = hp.employee_id, it means that if, for example, employee_id = 1234 is present in both the employees and jobs tables, SQL brings all the columns from both tables onto the same row for employee_id = 1234. You can then pick columns in the SELECT clause as if they came from a single table (which was not the case before joining). This is the main logic behind SQL joins.
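The matching behaviour described above can be tried out with a minimal sketch. It assumes an in-memory SQLite database and two hypothetical, cut-down tables (employees and job_history with only the columns needed here), not the real schema from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE employees (employee_id INTEGER, last_name TEXT);
    CREATE TABLE job_history (employee_id INTEGER, start_date TEXT);
    INSERT INTO employees VALUES (1234, 'Smith'), (5678, 'Jones');
    INSERT INTO job_history VALUES (1234, '2020-01-01'), (9999, '2019-06-01');
    """
)

# Only employee_id 1234 exists in both tables, so only that row survives
# the inner join; columns from both tables sit side by side in the result.
joined = conn.execute(
    """
    SELECT s.employee_id, s.last_name, h.start_date
    FROM employees s
    INNER JOIN job_history h
        ON s.employee_id = h.employee_id
    """
).fetchall()

print(joined)
```

Employee 5678 has no history row and history row 9999 has no employee, so neither appears in the output of an inner join.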
As to your 2nd task, try the query below. I made some modifications to the aggregation by introducing COUNT(DISTINCT s.employee_id): if the same employee_id is present twice for some reason, you still want to count that as one person. I am also assuming that both tables carry a department_id column to join on, and I added ROUND(..., 2) because the task asks for two decimal places.
SELECT d.department_name, ROUND(avg(s.salary), 2), ROUND(max(s.salary), 2), count(distinct s.employee_id)
FROM employees s
INNER JOIN departments d
on s.department_id = d.department_id
GROUP BY d.department_name
HAVING COUNT(DISTINCT s.employee_id) > 1;
Let me know if there is still any issue. Hopefully, this works.

Limit the data set of a single table within a multi-table sql select statement

I'm working in an Oracle environment.
In a 1:M table relationship I want to write a query that will bring me each row from the "1" table and only 1 matching row from the "many" table.
To give a made up example... ( * = Primary Key/Foreign Key )
EMPLOYEE
*emp_id
name
department
PHONE_NUMBER
*emp_id
num
There are many phone numbers for one employee.
Let's say I wanted to return all employees and only one of their phone numbers. (Please forgive the far-fetched example. I'm trying to simulate a workplace scenario)
I tried to run:
SELECT emp.*, phone.num
FROM EMPLOYEE emp
JOIN PHONE_NUMBER phone
ON emp.emp_id = phone.emp_id
WHERE phone.ROWNUM <= 1;
It turns out (and it makes sense to me now) that ROWNUM only exists within the context of the results returned from the entire query. There is not a "ROWNUM" for each table's data set.
I also tried:
SELECT emp.*, phone.num
FROM EMPLOYEE emp
JOIN PHONE_NUMBER phone
ON emp.emp_id = phone.emp_id
WHERE phone.num = (SELECT MAX(num)
FROM PHONE_NUMBER);
That one just returned me one row total. I wanted the inner SELECT to run once for each row in EMPLOYEE.
I'm not sure how else to think about this. I basically want my result set to be the number of rows in the EMPLOYEE table and for each row the first matching row in the PHONE_NUMBER table.
Obviously there are all sorts of ways to do this with procedures and scripts and such but I feel like there is a single-query solution in there somewhere...
Any ideas?
I'd use a rank (or dense_rank or row_number depending on how you want to handle ties)
SELECT *
FROM (SELECT emp.*,
phone.num,
rank() over (partition by emp.emp_id
order by phone.num) rnk
FROM EMPLOYEE emp
JOIN PHONE_NUMBER phone
ON emp.emp_id = phone.emp_id)
WHERE rnk = 1
will rank the rows in phone for each emp_id by num and return the top row. If there could be two rows for the same emp_id with the same num, rank would assign both a rnk of 1 so you'd get duplicate rows. You could add additional conditions to the order by to break the tie. Or you could use row_number rather than rank to arbitrarily break the tie.
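The tie behaviour is easy to see on a tiny data set. This sketch assumes an in-memory SQLite database (3.25 or later) rather than Oracle, and the rows are hypothetical: employee 1 has the same number stored twice.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE employee (emp_id INTEGER, name TEXT);
    CREATE TABLE phone_number (emp_id INTEGER, num TEXT);
    INSERT INTO employee VALUES (1, 'AA'), (2, 'BB');
    INSERT INTO phone_number VALUES
        (1, '555-0100'), (1, '555-0100'), (1, '555-0199'), (2, '555-0200');
    """
)

# Same query shape as the answer; only the ranking function is swapped.
query = """
SELECT emp_id, num
FROM (SELECT emp.emp_id, phone.num,
             {fn}() OVER (PARTITION BY emp.emp_id
                          ORDER BY phone.num) AS rnk
      FROM employee emp
      JOIN phone_number phone
        ON emp.emp_id = phone.emp_id) AS ranked
WHERE rnk = 1
ORDER BY emp_id
"""

with_rank = conn.execute(query.format(fn="RANK")).fetchall()
with_row_number = conn.execute(query.format(fn="ROW_NUMBER")).fetchall()

print(with_rank)        # employee 1 appears twice because of the tie
print(with_row_number)  # one row per employee
```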
All above answers will work beautifully with the scenario you described.
But if you have some employees which are missing in phone tables, then you need to do a left outer join like below. (I faced similar scenario where I needed isolated parents also)
EMP
---------
emp_id Name
---------
1 AA
2 BB
3 CC
PHONE
----------
emp_id no
1 7555
1 7777
2 5555
select emp.emp_id,ph.no from emp left outer join
(
select emp_id,no,
ROW_NUMBER() OVER (PARTITION BY emp_id ORDER BY emp_id) as rnum
FROM phone) ph
on emp.emp_id = ph.emp_id
where ph.rnum = 1 or ph.rnum is null
Result
EMP_ID NO
1 7555
2 5555
3 (null)
If you want only one phone number, then use row_number():
SELECT e.*, p.num
FROM EMPLOYEE e JOIN
(SELECT p.*,
ROW_NUMBER() OVER (PARTITION BY emp_id ORDER BY emp_id) as seqnum
FROM PHONE_NUMBER p
) p
ON e.emp_id = p.emp_id and seqnum = 1;
Alternatively, you can use aggregation, to get the minimum or maximum value.
This is my solution. It is simple, but it may not scale well to a lot of columns.
select e.emp_id, e.name, e.dep, min(p.phone_num)
from
EMPLOYEE e inner join
PHONE_NUMBER p on e.emp_id = p.emp_id
group by e.emp_id, e.name, e.dep
order by e.emp_id;
And this fixes the query you tried:
SELECT emp.*, phone.num
FROM EMPLOYEE emp
JOIN PHONE_NUMBER phone
ON emp.emp_id = phone.emp_id
WHERE phone.num = (SELECT MAX(num)
FROM PHONE_NUMBER p
WHERE p.emp_id = emp.emp_id );

Every derived table needs an alias

Here's my code. I need the full names and salaries of the faculty with the lowest salary in every department. My subquery works on its own, but I can't get the rest to work together.
SELECT CONCAT(FName,' ',LName) AS 'Faculty',DepartmentID,Salary
FROM Faculty,
(SELECT DISTINCT DepartmentID AS 'Department', MIN(Salary) AS 'MinSalary'
FROM Faculty GROUP BY DepartmentID)
WHERE Faculty.DepartmentID= 'Department' AND Salary= 'MinSalary'
ORDER BY DepartmentID
You need to put an alias on your subquery:
SELECT CONCAT(FName,' ',LName) AS 'Faculty',DepartmentID,Salary
FROM Faculty,
(
SELECT DISTINCT DepartmentID AS 'Department',
MIN(Salary) AS 'MinSalary'
FROM Faculty GROUP BY DepartmentID
) xx -- <<< this is the alias. (Don't forget this)
WHERE Faculty.DepartmentID= 'Department' AND Salary= 'MinSalary'
ORDER BY DepartmentID
xx is the name of your subquery in the example above. (and I think the query is not giving you the results you want)
To modify your query for better performance (assuming you are using MySQL, because of the CONCAT function):
SELECT CONCAT(a.FName,' ',a.LName) AS FacultyName,
a.DepartmentID,
a.Salary
FROM Faculty a
INNER JOIN
(
SELECT DepartmentID ,
MIN(Salary) AS MinSalary
FROM Faculty
GROUP BY DepartmentID
) xx -- <<< this is the alias.
ON a.DepartmentID = xx.DepartmentID AND
a.Salary = xx.MinSalary
-- WHERE .. (add extra condition here)
ORDER BY DepartmentID
UPDATE 1
It might also be SQL Server 2012. (already supports CONCAT())
SELECT CONCAT([FName], ' ',[LName]) FullName, [DepartmentID], [Salary]
FROM
(
SELECT [FName], [LName], [DepartmentID], [Salary],
ROW_NUMBER() OVER (partition BY DepartmentID
ORDER BY Salary) rn
FROM Faculty
) x
WHERE x.rn = 1
As the error message in the title hints, Standard SQL requires sub-queries in the FROM clause to have names. You should also learn to use the JOIN notation, not the comma-separated list of table names. You need to know the old (pre-SQL92) notation to recognize it; you should not use it yourself.
SELECT CONCAT(F.FName, ' ', F.LName) AS 'Faculty', F.DepartmentID, F.Salary
FROM Faculty AS F
JOIN (SELECT DISTINCT DepartmentID AS 'Department', MIN(Salary) AS 'MinSalary'
FROM Faculty
GROUP BY Department) AS D
ON F.DepartmentID = D.Department AND F.Salary = D.MinSalary
ORDER BY F.DepartmentID;
Better to use aliases for both table and the subselect query table:
SELECT CONCAT(F.FName,' ',F.LName) AS Faculty,F.DepartmentID, F.Salary
FROM Faculty F,
(SELECT DISTINCT DepartmentID AS Department, MIN(Salary) AS MinSalary
FROM Faculty GROUP BY DepartmentID) AS DEP_MIN_SALARY
WHERE F.DepartmentID= DEP_MIN_SALARY.Department
AND F.Salary= DEP_MIN_SALARY.MinSalary
ORDER BY F.DepartmentID;
One observation: You are not retrieving any columns from the derived table. Hope that is intentional.
SELECT CONCAT(F.FName,' ',F.LName) AS Faculty, F.DepartmentID, F.Salary
FROM Faculty F
JOIN (
SELECT DepartmentID,
MIN(Salary) AS MinSalary
FROM Faculty
GROUP BY DepartmentID
) G
ON F.DepartmentID = G.DepartmentID AND F.Salary = G.MinSalary
ORDER BY F.DepartmentID
The problem with using single-quotes around alias names is that you fall into the trap evident in your query - your first WHERE condition is matching DepartmentID against the string literal 'Department'. This goes too for the 'MinSalary' column.
Of note also is the fact that GROUP BY does not need DISTINCT - because you are already getting distinct values of DepartmentID.
And to the first problem that brought you here: each derived table (aka sub-query) needs an alias, or something with which to identify the columns from it. In the query above, you can see that the first part of the ON clause compares two DepartmentID columns. The aliases F and G are imperative here to distinguish between the two, hence the requirement to alias each table.
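The quoting trap described in this answer can be reproduced with a short sketch. It assumes an in-memory SQLite database as a stand-in (the question is probably MySQL) and three hypothetical Faculty rows; the only difference between the two queries is whether MinSalary is written as a quoted string or as a column reference.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Faculty (FName TEXT, LName TEXT,
                          DepartmentID INTEGER, Salary INTEGER);
    INSERT INTO Faculty VALUES
        ('Ann', 'Lee', 1, 50000),
        ('Bob', 'Ray', 1, 60000),
        ('Cy',  'Orr', 2, 55000);
    """
)

base = """
SELECT F.FName
FROM Faculty F
JOIN (SELECT DepartmentID, MIN(Salary) AS MinSalary
      FROM Faculty
      GROUP BY DepartmentID) G
  ON F.DepartmentID = G.DepartmentID
WHERE F.Salary = {rhs}
ORDER BY F.FName
"""

# 'MinSalary' in single quotes is a string literal, not the derived column.
literal_rows = conn.execute(base.format(rhs="'MinSalary'")).fetchall()
# G.MinSalary refers to the column exposed through the alias G.
column_rows = conn.execute(base.format(rhs="G.MinSalary")).fetchall()

print(literal_rows)  # no salary equals the text 'MinSalary'
print(column_rows)   # lowest-paid faculty member of each department
```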
Your query needs an alias for subquery
select
concat(F.FName, ' ', F.LName) as Faculty,
F.DepartmentID,
D.MinSalary
from faculty as F
inner join
(
select T.DepartmentID, min(T.Salary) as MinSalary
from Faculty as T
group by T.DepartmentID
) as D on D.DepartmentID = F.DepartmentID and D.MinSalary = F.Salary
order by F.DepartmentID
Your derived table needs a name which is what the error message is complaining about (see YouNeedToPutATableAliasHere below):
SELECT CONCAT(FName,' ',LName) AS Faculty,DepartmentID,Salary
FROM Faculty,
(SELECT DISTINCT DepartmentID AS Department, MIN(Salary) AS MinSalary
FROM Faculty GROUP BY DepartmentID) AS YouNeedToPutATableAliasHere
WHERE Faculty.DepartmentID= 'Department' AND Salary= MinSalary
ORDER BY DepartmentID
Note I've also removed some single-quotes where they didn't make sense to me. (Perhaps you meant double-quotes, square brackets, or backticks for column names, but single quotes probably won't work - you may need to do the same for 'Department'?)
You will also be creating a Cartesian join unless you specify conditions correlating the two tables. A better query for "faculty with the minimum salary in the dept 'Department'" might be (note I have removed the redundant DISTINCT):
SELECT CONCAT(FName,' ',LName) AS Faculty,F.DepartmentID,F.Salary
FROM Faculty F
INNER JOIN (
SELECT DepartmentID,
MIN(Salary) AS MinSalary
FROM Faculty
GROUP BY DepartmentID
) AS MinSal ON MinSal.DepartmentID = F.DepartmentID
AND F.Salary = MinSal.MinSalary
WHERE F.DepartmentID= 'Department'
ORDER BY F.DepartmentID
Or, since you are only selecting the one department 'Department', you could do:
SELECT CONCAT(FName,' ',LName) AS Faculty,F.DepartmentID,F.Salary
FROM Faculty F
WHERE F.DepartmentID= 'Department'
AND F.Salary = (
SELECT MIN(Salary)
FROM Faculty
WHERE DepartmentID = F.DepartmentID
)
Or even (which would also work for multiple departments but may be slower than a join):
SELECT CONCAT(FName,' ',LName) AS Faculty,F.DepartmentID,F.Salary
FROM Faculty F
WHERE F.DepartmentID= 'Department'
AND NOT EXISTS(
SELECT 1
FROM Faculty FF
WHERE FF.DepartmentID = F.DepartmentID
AND FF.Salary < F.Salary
)
In fact, in looking at your original query and in an effort to make this question complete while reading @RichardTheKiwi's comments, I will incorporate that interpretation into this answer. This assumes you are using MySQL (from the CONCAT function) and that you meant backticks instead of forward ticks for column names. It also assumes you wanted the minimum-salaried faculty for every department, not just the one specified as 'Department'. It requires only a small change from my first suggested query, namely the removal of the WHERE clause:
SELECT CONCAT(FName,' ',LName) AS Faculty,F.DepartmentID,F.Salary
FROM Faculty F
INNER JOIN (
SELECT DepartmentID,
MIN(Salary) AS MinSalary
FROM Faculty
GROUP BY DepartmentID
) AS MinSal ON MinSal.DepartmentID = F.DepartmentID
AND F.Salary = MinSal.MinSalary
ORDER BY F.DepartmentID