SQL nested select and aliases

The Situation
I have a typical MS Access database containing information on Companies, Pay, Employees and Positions. Some of the tables are:
tbl_Report (Report_ID PK, Report_Year)
tbl_Employee (Employee_ID PK)
tbl_Pay (Pay_ID PK, Salary, Employee_ID FK, Report_ID FK)
tbl_Position (Position_ID PK, Position, Employee_ID FK, Report_ID FK)
I have a query that selects the salary for each position and year, to produce:
qry_Salary_by_Position_Year: (This query is parameterised to accept a 'Year').
Year | Salary | Position
------------------------
2014 | 100 | CEO
2013 | 200 | CEO
2014 | 300 | CFO
2014 | 200 | Chairman
2013 | 150 | CEO
etc.
I then use another query to extract the top x percent of salaries for a given position:
qry_Select_Top_25:
SELECT TOP 25 PERCENT Salary, Year, Position
FROM qry_Salary_by_Position_Year;
which gives something like:
Salary | Year | Position
------------------------
100 | 2014 | CEO
100 | 2014 | CFO
200 | 2014 | CFO
The Question
What I would like is a final table that displays the Max(25%), Max(50%), Max(75%), Max(X%) values, grouped by Position and Year, e.g.:
Year | Position | 25th | 50th | 75th
-------------------------------------
2013 | CEO | 10 | 30 | 75
2014 | CEO | 20 | 50 | 80
2014 | CFO | 15 | 30 | 90
2014 | Chairman | 20 | 25 | 30
I can do this for one percentile value using
SELECT Year, Position, Max(qry50.Salary) AS 50_Percentile
FROM (SELECT TOP 50 PERCENT qry_Salary_by_Position_Year.Salary, Year, Position
FROM qry_Salary_by_Position_Year) AS qry50
WHERE Position IN (SELECT DISTINCT Position FROM qry_Salary_by_Position_Year) AND Year IN (SELECT DISTINCT Year FROM qry_Salary_by_Position_Year)
GROUP BY Year, Position;
But I can't get my head around how to construct the query with the correct aliases etc. to add in the other percentage values as other columns. Does anyone have any suggestions/comments/questions?
Edit
I may have come up with a solution that I'm now checking:
SELECT qry.Year, qry.Position, Max(qry25.Salary) AS 25_Percentile, Max(qry50.Salary) AS 50_Percentile, Max(qry75.Salary) AS 75_Percentile, Max(qry100.Salary) AS 100_Percentile
FROM
((((qry_Salary_by_Position_Year qry
LEFT OUTER JOIN (SELECT TOP 50 PERCENT Salary, Year, Position FROM qry_Salary_by_Position_Year) AS qry50 ON qry.Year = qry50.Year AND qry.Position = qry50.Position)
LEFT OUTER JOIN (SELECT TOP 25 PERCENT Salary, Year, Position FROM qry_Salary_by_Position_Year) AS qry25 ON qry.Year = qry25.Year AND qry.Position = qry25.Position)
LEFT OUTER JOIN (SELECT TOP 75 PERCENT Salary, Year, Position FROM qry_Salary_by_Position_Year) AS qry75 ON qry.Year = qry75.Year AND qry.Position = qry75.Position)
LEFT OUTER JOIN (SELECT TOP 100 PERCENT Salary, Year, Position FROM qry_Salary_by_Position_Year) AS qry100 ON qry.Year = qry100.Year AND qry.Position = qry100.Position)
GROUP BY qry.Year, qry.Position

I think the query in the edit above is what I'm after.
I was led to this solution by this answer to another post: https://stackoverflow.com/a/7855015/4002530
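For comparison, here is a minimal sketch of percentile cut-offs computed one group at a time. It is not Access SQL: it assumes SQLite (driven from Python's sqlite3 module), because it relies on the CUME_DIST window function, and it reads "Max(25%)" as the largest salary within the lowest 25% of each Year/Position group. The table name salary_by_position_year and the sample salaries are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE salary_by_position_year (Year INTEGER, Position TEXT, Salary INTEGER);
-- Hypothetical sample data: four salaries per (Year, Position) group.
INSERT INTO salary_by_position_year VALUES
  (2014, 'CEO', 10),  (2014, 'CEO', 20),  (2014, 'CEO', 30),  (2014, 'CEO', 40),
  (2014, 'CFO', 100), (2014, 'CFO', 200), (2014, 'CFO', 300), (2014, 'CFO', 400);
""")

percentile_sql = """
WITH ranked AS (
  -- cd is the fraction of rows in the group with Salary <= this row's Salary.
  SELECT Year, Position, Salary,
         CUME_DIST() OVER (PARTITION BY Year, Position ORDER BY Salary) AS cd
  FROM salary_by_position_year
)
SELECT Year, Position,
       MAX(CASE WHEN cd <= 0.25 THEN Salary END) AS p25,
       MAX(CASE WHEN cd <= 0.50 THEN Salary END) AS p50,
       MAX(CASE WHEN cd <= 0.75 THEN Salary END) AS p75
FROM ranked
GROUP BY Year, Position
ORDER BY Year, Position
"""
percentile_rows = conn.execute(percentile_sql).fetchall()
print(percentile_rows)
```

Because the window is partitioned by Year and Position, each cut-off comes from its own group. A TOP n PERCENT subquery, by contrast, takes its percentage over the whole result set of that subquery, and without an ORDER BY the rows it returns are not guaranteed, so the joined version is worth checking against a hand calculation.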

Related

How to use GROUP BY when fetching values from More than one Table [duplicate]

We have 2 Tables Employees and Department.
We want to show the maximum salary from each department and their corresponding employee name from the employee table and the department name from the department table.
Employee Table
EmpId | EmpName | salary | DeptId
---------------------------------
101 | shubh1 | 1000 | 1
101 | shubh2 | 4000 | 1
102 | shubh3 | 3000 | 2
102 | shubh4 | 5000 | 2
103 | shubh5 | 12000 | 3
103 | shubh6 | 1000 | 3
104 | shubh7 | 1400 | 4
104 | shubh8 | 1000 | 4
Department Table
DeptId | DeptName
-----------------
1 | ComputerScience
2 | Mechanical
3 | Aeronautics
4 | Civil
I tried the following but got an error:
SELECT DeptName FROM Department where deptid IN(select MAX(salary),empname,deptid
FROM Employee
GROUP By Employee.deptid)
Error
Token error: 'Column 'Employee.EmpName' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.' on server 4e0652f832fd executing on line 1 (code: 8120, state: 1, class: 16)
Can someone please help me.
select salary
,EmpName
,DeptName
from (
select e.salary
,e.EmpName
,d.DeptName
,rank() over(partition by e.DeptId order by e.salary desc) as rnk
from Employee e join Department d on d.DeptId = e.DeptId
) t
where rnk = 1
salary | EmpName | DeptName
---------------------------
4000 | shubh2 | ComputerScience
5000 | shubh4 | Mechanical
12000 | shubh5 | Aeronautics
1400 | shubh7 | Civil
Now that I know it's MS SQL Server, we could use CROSS APPLY or OUTER APPLY. Technically, APPLY evaluates a table-valued expression for each outer row; it is not a join per se. Whether it is available depends on the version of SQL Server, and the choice between the two depends on whether you want a department returned even when it has no matching rows in the other table (OUTER APPLY) or not (CROSS APPLY).
I find this the "best" design pattern to use for this type of query.
For each record in Department, the engine runs a query against Employee, finds the employees in that department, and returns the one record with the highest salary. With TOP we could specify WITH TIES to return more than one row, but then we need to decide how to handle ties on salary: either use TOP 1 WITH TIES, or extend the ORDER BY so that you get the single "top" result you want.
Demo: dbfiddle.uk
SELECT Sub.empName, Sub.Salary, D.DeptName
FROM Department D
CROSS Apply (SELECT Top 1 *
--(SELECT TOP 1 WITH TIES * -- could use this if we want ties
FROM Employee E
WHERE E.DeptID = D.DeptID
ORDER BY Salary Desc) Sub --add additional order by if we don't want ties.
The cross apply gives us:
+---------+--------+-----------------+
| empName | Salary | DeptName |
+---------+--------+-----------------+
| shubh2 | 4000 | ComputerScience |
| shubh4 | 5000 | Mechanical |
| shubh5 | 12000 | Aeronautics |
| shubh7 | 1400 | Civil |
+---------+--------+-----------------+
Before window functions, CROSS APPLY, or LATERAL were available, we'd write an inline view.
It gets us the max salary for each department; we then join that back to our base tables to find the employee within each department with that max salary.
Demo: DbFiddle.uk
SELECT E.*, D.*
FROM Employee E
INNER JOIN Department D
on E.DeptID = D.DeptID
INNER JOIN (SELECT MAX(SALARY) maxSal , DeptID
FROM Employee
GROUP BY DeptID) Sub
on Sub.DeptID = E.DeptID
and Sub.MaxSal = E.Salary
One has to do a join to get the department info and the employee info. However, we can eliminate the join to the max-salary subquery by using EXISTS and correlation instead.
Demo DbFiddle.uk
SELECT E.*, D.*
FROM Employee E
INNER JOIN Department D
on E.DeptID = D.DeptID
WHERE EXISTS (SELECT MAX(Sub.SALARY) maxSal , Sub.DeptID
FROM Employee Sub
WHERE sub.DeptID=E.DeptID --correlation 1
GROUP BY Sub.DeptID
HAVING E.Salary = max(Sub.Salary)) --correlation 2
We could eliminate the last join too I suppose:
Demo: Dbfiddle.uk
SELECT E.*, (SELECT DeptName from Department where E.DeptID = DeptID)
FROM Employee E
WHERE EXISTS (SELECT MAX(Sub.SALARY) maxSal , Sub.DeptID
FROM Employee Sub
WHERE sub.DeptID=E.DeptID --correlation 1
GROUP BY Sub.DeptID
HAVING E.Salary = max(Sub.Salary)) --correlation 2
The top 3 give us this result:
+-----+---------+--------+--------+--------+-----------------+
| id | empName | salary | deptID | DeptID | DeptName |
+-----+---------+--------+--------+--------+-----------------+
| 101 | shubh2 | 4000 | 1 | 1 | ComputerScience |
| 102 | shubh4 | 5000 | 2 | 2 | Mechanical |
| 103 | shubh5 | 12000 | 3 | 3 | Aeronautics |
| 104 | shubh7 | 1400 | 4 | 4 | Civil |
+-----+---------+--------+--------+--------+-----------------+
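To check that the window-function approach and the inline-view approach agree, here is a small runnable sketch. It assumes SQLite via Python's sqlite3 module rather than SQL Server, so APPLY and TOP are left out; the tables and rows are the ones from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Department (DeptId INTEGER, DeptName TEXT);
CREATE TABLE Employee (EmpId INTEGER, EmpName TEXT, salary INTEGER, DeptId INTEGER);
INSERT INTO Department VALUES
  (1, 'ComputerScience'), (2, 'Mechanical'), (3, 'Aeronautics'), (4, 'Civil');
INSERT INTO Employee VALUES
  (101, 'shubh1', 1000, 1),  (101, 'shubh2', 4000, 1),
  (102, 'shubh3', 3000, 2),  (102, 'shubh4', 5000, 2),
  (103, 'shubh5', 12000, 3), (103, 'shubh6', 1000, 3),
  (104, 'shubh7', 1400, 4),  (104, 'shubh8', 1000, 4);
""")

# Window function: rank employees inside each department, keep rank 1.
window_rows = conn.execute("""
SELECT EmpName, salary, DeptName
FROM (
  SELECT e.EmpName, e.salary, d.DeptName, d.DeptId,
         RANK() OVER (PARTITION BY e.DeptId ORDER BY e.salary DESC) AS rnk
  FROM Employee e JOIN Department d ON d.DeptId = e.DeptId
) t
WHERE rnk = 1
ORDER BY DeptId
""").fetchall()

# Inline view: compute MAX(salary) per department, then join it back.
join_rows = conn.execute("""
SELECT e.EmpName, e.salary, d.DeptName
FROM Employee e
JOIN Department d ON d.DeptId = e.DeptId
JOIN (SELECT DeptId, MAX(salary) AS maxSal
      FROM Employee
      GROUP BY DeptId) sub
  ON sub.DeptId = e.DeptId AND sub.maxSal = e.salary
ORDER BY d.DeptId
""").fetchall()

print(window_rows)
print(join_rows)
```

Both queries return every employee tied for the top salary in a department; the sample data has no ties, so each department yields one row.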

PostgreSQL - How to get month/year even if there are no records within that date?

What I'm trying to do in this case is get the latest ("most future") record of a Bills table and then get all the records from the 13 months prior to that last record. What I've tried is something like this:
SELECT
users.name,
EXTRACT(month from priority_date) as month,
EXTRACT(year from priority_date) as year,
SUM("money_balance") as "money_balance"
FROM bills
JOIN users on users.id = bills.user_id
WHERE priority_date >= ( SELECT
DATE_TRUNC('month', MAX(bills.priority_date))
FROM bills
INNER JOIN users ON bills.property_id = users.id
WHERE users.company_id = 15
AND users.active = true
AND bills.paid = false ) - interval '13 month'
AND priority_date <= ( SELECT
MAX(bills.priority_date)
FROM bills
INNER JOIN users ON bills.property_id = users.id
WHERE users.community_id = 15
AND users.active = true
AND bills.paid = false )
AND users.company_id = 15
AND bills.paid = false
AND users.active = true
GROUP BY 1,2,3
ORDER BY year, month
So for instance, let's say the latest date for a created bill is December 2022; this query will give me the info from November 2021 to December 2022.
The data will give me something like
name | month | year | money_balance
-----------------------------------
Joshua.. | 11 | 2021 | 300
Joshua.. | 1 | 2022 | 111
Mark.. | 1 | 2022 | 200
... | ... | ... | ...
John | 12 | 2022 | 399
In the case of Joshua, because he had no bills to pay in December 2021, it doesn't return anything for that month/year.
Is it possible to return the months/year where there are no records for that month, for each user?
Something like
name | month | year | money_balance
-----------------------------------
Joshua.. | 11 | 2021 | 300
Joshua.. | 12 | 2021 | 0
Joshua.. | 1 | 2022 | 111
other users | .... | ... | ...
Thank you so much!
We can use a CTE to create the list of months, using the minimum and maximum dates from bills, and then cross join it onto users to get a row for every user for every month. We then left join onto bills to populate the last column.
The problem with this approach is that we can end up with a lot of rows with no value.
create table bills(user_id int,priority_date date, money_balance int);
create table users(id int, name varchar(25));
insert into users values(1,'Joshua'),(2,'Mark'),(3,'John');
insert into bills values(1,'2021-11-01',300),(1,'2022-01-01',111),(2,'2022-01-01',200),(3,'2021-12-01',399);
;with months as
(SELECT to_char(generate_series(min(priority_date), max(priority_date), '1 month'), 'Mon-YY') AS "Mon-YY"
from bills)
SELECT
u.name,
"Mon-YY",
--EXTRACT(month from "Mon-YY") as month,
--EXTRACT(year from "Mon-YY") as year,
SUM("money_balance") as "money_balance"
FROM months m
CROSS JOIN users u
LEFT JOIN bills b
ON u.id = b.user_id
AND to_char(priority_date,'Mon-YY') = m."Mon-YY"
GROUP BY
u.name,
"Mon-YY"
ORDER BY "Mon-YY", u.name
name | Mon-YY | money_balance
:----- | :----- | ------------:
John | Dec-21 | 399
Joshua | Dec-21 | null
Mark | Dec-21 | null
John | Jan-22 | null
Joshua | Jan-22 | 111
Mark | Jan-22 | 200
John | Nov-21 | null
Joshua | Nov-21 | 300
Mark | Nov-21 | null

An SQL query to pull count of employees absent under each manager on all dates

The objective of the query is to get a count of employees absent under each manager.
Attendance (Dates when employees are present)
id date
1 16/05/2020
2 16/05/2020
1 17/05/2020
2 18/05/2020
3 18/05/2020
Employee
id manager_id
1 2
2 3
3 NA
The desired output should be in this format:
Date manager_id Number_of_absent_employees
16/05/2020 NA 1
17/05/2020 3 1
17/05/2020 NA 1
18/05/2020 2 1
I have tried writing the code below but only partially understand it. The idea is to calculate the total number of employees under each manager and subtract from it the number of employees present on a given day. Please help me complete this query, many thanks!
with t1 as /* for counting total employees under each manager */
(
select employee.manager_id,count(*) as totalc
from employee as e
inner join employee on e.employee_id=employee.employee_id
group by employee.manager_id
)
,t2 as /* for counting total employees present each day */
(
select Attendence.date, employee.manager_id,count(*) as present
from employee
Left join Attendence on employee.employee_id=Attendence.employee_id
group by Attendence.date, employee.manager_id
)
select * from t2
Left join t1 on t2.manager_id=t1.manager_id
order by date
Cross join the distinct dates from Attendance to Employee and left join Attendance to filter out the matching rows.
The remaining rows are the absences so then you need to aggregate:
select d.date, e.manager_id,
count(*) Number_of_absent_employees
from (select distinct date from Attendance) d
cross join Employee e
left join Attendance a on a.date = d.date and a.id = e.id
where a.id is null
group by d.date, e.manager_id
Results:
| date | manager_id | Number_of_absent_employees |
| ---------- | ---------- | -------------------------- |
| 16/05/2020 | NA | 1 |
| 17/05/2020 | 3 | 1 |
| 17/05/2020 | NA | 1 |
| 18/05/2020 | 2 | 1 |
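The cross join plus anti-join can be reproduced end to end with the sample data. This is a sketch that assumes SQLite through Python's sqlite3 module and stores the manager shown as NA as NULL; the dates are kept as the text values from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Attendance (id INTEGER, date TEXT);
CREATE TABLE Employee (id INTEGER, manager_id INTEGER);
INSERT INTO Attendance VALUES
  (1, '16/05/2020'), (2, '16/05/2020'),
  (1, '17/05/2020'),
  (2, '18/05/2020'), (3, '18/05/2020');
-- Assumption: the 'NA' manager is stored as NULL.
INSERT INTO Employee VALUES (1, 2), (2, 3), (3, NULL);
""")

absent_rows = conn.execute("""
SELECT d.date, e.manager_id, COUNT(*) AS Number_of_absent_employees
FROM (SELECT DISTINCT date FROM Attendance) d
CROSS JOIN Employee e              -- every employee on every date
LEFT JOIN Attendance a
  ON a.date = d.date AND a.id = e.id
WHERE a.id IS NULL                 -- keep only the pairs with no attendance row
GROUP BY d.date, e.manager_id
ORDER BY d.date, e.manager_id
""").fetchall()

for row in absent_rows:
    print(row)
```

A date on which nobody was present at all never appears in Attendance, so it produces no rows here; covering such dates would need a separate calendar of working days.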
Try this query. The first CTE simplifies your counting code, and the final query calculates the absent employees.
--in this CTE just simplify counting
with t1 as /* for counting total employees under each manager */
(
select employee.manager_id,count(*) as totalc
from employee
group by manager_id
)
,t2 as
(
select Attendence.date, employee.manager_id,count(*) as present
from employee
Left join Attendence on employee.employee_id=Attendence.employee_id
group by Attendence.date, employee.manager_id
)
select t2.date,t2.manager_id, (t1.totalc-t2.present) as employees_absent from t2
Left join t1 on t2.manager_id=t1.manager_id
order by date
Select ec.manager_id, date, (total_employees - employee_attended) as employees_absent from
(Select manager_id, count(id) as total_employees
from employee
group by manager_id) ec,
(Select distinct e.manager_id, a.date, count(a.id) over (partition by e.manager_id, a.date) as employee_attended
from Employee e, attendence a
where e.id = a.id(+)) ea
where ec.manager_id = ea.manager_id (+)
I guess this should work

Select rows where every child row meets a condition

In my Oracle DB, I have two tables in a one-to-many relationship: Managers and Employees.
+------------+-------+------------+
| Manager_ID | Name | Department |
+------------+-------+------------+
| 1 | Steve | Sales |
| 2 | Ben | Sales |
| 3 | Molly | Accounts |
+------------+-------+------------+
+-------------+------------+--------+-----+
| Employee_ID | Manager_ID | Name | Age |
+-------------+------------+--------+-----+
| 1 | 1 | Kyle | 25 |
| 2 | 1 | Gary | 31 |
| 3 | 2 | Renee | 31 |
| 4 | 2 | Oliver | 32 |
+-------------+------------+--------+-----+
How do I select only those Managers where every one of his Employees is over the age of 30?
In my example data, the only Manager who meets this condition is Ben, because both of his employees are over 30.
I thought something like this would do it, but it's wrong:
SELECT m.manager_id
FROM managers m
WHERE m.manager_id IN (SELECT e.manager_id
FROM employees e
GROUP BY e.manager_id
HAVING e.age > 30)
Use NOT EXISTS:
select m.*
from Managers m
where not exists (select 1
from Employees e
where e.Manager_ID = m.Manager_ID and e.Age <= 30
) and
exists (select 1 from Employees e where e.Manager_ID = m.Manager_ID)
The only thing I don't like about Yogesh's answer (which I upvoted, since it's probably the way I'd write it) is that you have to go to the employees table a second time, to make sure the manager actually has at least one employee.
On the plus side, the NOT EXISTS that Yogesh used will allow Oracle to stop looking at a manager's employees once it finds one that is too young. So, maybe it's a toss-up.
I'll offer this alternative. It is shorter than the NOT EXISTS and does not have to go to the employees table a second time.
SELECT m.*
FROM manager m
CROSS APPLY (
SELECT min(age) min_age
FROM employee e
WHERE e.manager_id = m.manager_id ) ma
where ma.min_age > 30;
Using sub-query for counts
WITH manager(Manager_ID, Name, Department) AS (
SELECT 1, 'Steve', 'Sales' FROM dual UNION ALL
SELECT 2, 'Ben', 'Sales' FROM dual UNION ALL
SELECT 3, 'Molly', 'Accounts' FROM dual),
employee(Employee_ID, Manager_ID, Name, Age) AS (
SELECT 1 , 1, 'Kyle', 25 FROM dual UNION ALL
SELECT 2 ,1, 'Gary', 31 FROM dual UNION ALL
SELECT 3, 2, 'Renee', 31 FROM dual UNION ALL
SELECT 4, 2 , 'Oliver', 32 FROM dual)
---------------------------
--- End of data preparation
---------------------------
SELECT m.name
FROM manager m
JOIN (SELECT manager_id,
COUNT(1) total,
COUNT(CASE WHEN age > 30 THEN 1 ELSE NULL END) age_30_above
FROM employee
GROUP BY manager_id) ee
ON m.manager_id = ee.manager_id
WHERE total = age_30_above;
Output
NAME
-----
Ben
Your query will be:
SELECT m.name
FROM manager m
JOIN (SELECT manager_id,
COUNT(1) total,
COUNT(CASE WHEN age > 30 THEN 1 ELSE NULL END) age_30_above
FROM employee
GROUP BY manager_id) ee
ON m.manager_id = ee.manager_id
WHERE total = age_30_above;
SELECT manager_id
FROM employees -- managers
minus
select manager_id
from employees
where age <= 30
You can use the ALL operator like this:
SELECT m.manager_id
FROM managers m
WHERE (30 < ALL (SELECT e.age FROM employees e WHERE e.manager_id = m.manager_id));
You might want to reverse the condition and select all managers who don't have any employee aged 30 or under:
select * from managers
where manager_id not in (select manager_id
from employees
where age <= 30)
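The answers differ in how they treat a manager with no employees at all (Molly in the sample data). A small sketch makes the difference visible; it assumes SQLite via Python's sqlite3 module instead of Oracle, uses the question's rows, and reads "over the age of 30" as age > 30.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE managers (manager_id INTEGER, name TEXT, department TEXT);
CREATE TABLE employees (employee_id INTEGER, manager_id INTEGER, name TEXT, age INTEGER);
INSERT INTO managers VALUES
  (1, 'Steve', 'Sales'), (2, 'Ben', 'Sales'), (3, 'Molly', 'Accounts');
INSERT INTO employees VALUES
  (1, 1, 'Kyle', 25), (2, 1, 'Gary', 31), (3, 2, 'Renee', 31), (4, 2, 'Oliver', 32);
""")

# No employee aged 30 or under, and at least one employee.
with_employees = [r[0] for r in conn.execute("""
SELECT m.name
FROM managers m
WHERE NOT EXISTS (SELECT 1 FROM employees e
                  WHERE e.manager_id = m.manager_id AND e.age <= 30)
  AND EXISTS (SELECT 1 FROM employees e
              WHERE e.manager_id = m.manager_id)
ORDER BY m.name
""")]

# NOT IN alone: also keeps managers who have no employees at all.
not_in_only = [r[0] for r in conn.execute("""
SELECT name
FROM managers
WHERE manager_id NOT IN (SELECT manager_id FROM employees WHERE age <= 30)
ORDER BY name
""")]

print(with_employees)
print(not_in_only)
```

Whether Molly belongs in the result depends on how "every one of his employees" should apply to an empty team. Note also that NOT IN returns no rows at all if the subquery yields a NULL manager_id, which the sample data happens to avoid.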

SQL | Aggregation

My objective is to get a table such as this:
Request:
Wards where the annual cost of drugs prescribed exceeds 25
ward_no | year | cost
w2 | 2007 | 34
w4 | 2007 | 160
w5 | 2006 | 26
w5 | 2007 | 33
I would input a picture but lack reputation points.
Here is what I have done so far:
select w.ward_no,
year(pn.start_date) as year,
pn.quantity * d.price as cost
from ward w,
patient pt,
prescription pn,
drug d
where w.ward_no = pt.ward_no
and
pn.drug_code = d.drug_code
and
pn.patient_id = pt.patient_id
group by w.ward_no,
year,
cost
having cost > 25
order by w.ward_no, year
My current output is such:
ward_no|year|cost
'w2' |2006|0.28
'w2' |2007|3.20
'w2' |2007|9.50
'w2' |2007|21.60
'w3' |2006|10.08
'w3' |2007|4.80
'w4' |2006|4.41
'w4' |2007|101.00
'w4' |2007|58.80
'w5' |2006|25.20
'w5' |2006|0.56
'w5' |2007|20.16
'w5' |2007|12.60
How would I reduce each ward_no to have only a single row of year (say 2006 or 2007) instead of having x of them?
Any help would be very much appreciated I am really stuck and have no clue what to do.
You need to group by ward and year, and sum up the costs:
select w.ward_no,
year(pn.start_date) as year,
sum(pn.quantity * d.price) as cost
from ward w
inner join patient pt
on pt.ward_no = w.ward_no
inner join prescription pn
on pn.patient_id = pt.patient_id
inner join drug d
on d.drug_code = pn.drug_code
group by w.ward_no,
year(pn.start_date)
having sum(pn.quantity * d.price) > 25
order by w.ward_no, year
What flavor of SQL is this supposed to be for?
Share and enjoy.
You are grouping by cost, so each distinct cost gets its own row. Remove cost from your grouping and sum it instead:
select w.ward_no,
year(pn.start_date) as year,
sum(pn.quantity * d.price) as cost
from ward w,
patient pt,
prescription pn,
drug d
where w.ward_no = pt.ward_no
and
pn.drug_code = d.drug_code
and
pn.patient_id = pt.patient_id
group by w.ward_no,
year
having sum(pn.quantity * d.price) > 25
order by w.ward_no, year
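As a quick check that summing per ward and year gives the requested table, here is a sketch that takes the per-prescription costs from the asker's current output and aggregates them. It assumes SQLite via Python's sqlite3 module, and the single table line_costs is a hypothetical stand-in for the four-table join, with one row per line of that output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical stand-in for the joined ward/patient/prescription/drug rows.
CREATE TABLE line_costs (ward_no TEXT, year INTEGER, cost REAL);
INSERT INTO line_costs VALUES
  ('w2', 2006, 0.28),  ('w2', 2007, 3.20),   ('w2', 2007, 9.50),  ('w2', 2007, 21.60),
  ('w3', 2006, 10.08), ('w3', 2007, 4.80),
  ('w4', 2006, 4.41),  ('w4', 2007, 101.00), ('w4', 2007, 58.80),
  ('w5', 2006, 25.20), ('w5', 2006, 0.56),   ('w5', 2007, 20.16), ('w5', 2007, 12.60);
""")

ward_rows = conn.execute("""
SELECT ward_no, year,
       CAST(ROUND(SUM(cost)) AS INTEGER) AS cost   -- rounded only for display
FROM line_costs
GROUP BY ward_no, year          -- one row per ward and year
HAVING SUM(cost) > 25           -- filter on the unrounded annual total
ORDER BY ward_no, year
""").fetchall()

for row in ward_rows:
    print(row)
```

With these figures the rounded totals line up with the four rows of the requested table.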