Postgresql with Outer Join to Get Nulls Included - sql

New to databases. I have a schema as follows:
Employee(**eid**, pname, age)
Dept(**dname**, num_managers) / Dept = department
Dept_Wing(**wingno**, wing_name, dname) / Dept_Wing = which wing of department
Wing_Sub_Division(**dname**, **wingno**, **sub_div_no**) / Wing_Sub_Division = which sub-division of the wing
On_Duty(**eid**, **dname**, **wingno**, sub_div_no)
I want to get the: dname, wingno, number on duty for that wing
for every wing that has fewer than 15% of the total number of employees on duty (so this includes dept_wings with 0 employees on duty).
I tried
select w.wingo, od.wingo
from dept_wing w left outer join on_duty od on on w.wingo=od.wingno
just to get all the wings, including those with no employees on duty, but I can't even get those wings returned! Any guidance is appreciated!

It seems you are working with composite keys; a department wing is identified by dname and wingno, so you have the pair in Dept_Wing, Wing_Sub_Division, and On_Duty. This makes this query quite simple.
Here is how to get the number of employees on duty per wing:
select dname, wingno, count(*)
from on_duty
group by dname, wingno
So the whole query is
select
wing.dname, wing.wingno, coalesce(duty.cnt, 0) as employees_on_duty
from dept_wing wing
left join
(
select dname, wingno, count(*) as cnt
from on_duty
group by dname, wingno
) duty on duty.dname = wing.dname and duty.wingno = wing.wingno
where coalesce(duty.cnt, 0) < 0.15 * (select count(*) from employee);
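To check the behaviour of the left join plus coalesce on a small data set, here is a minimal sketch. It assumes SQLite (through Python's sqlite3 module) as a stand-in for PostgreSQL, and the department names, wing names, and row counts are made up for illustration.

```python
import sqlite3

# Hypothetical sample data; SQLite stands in for PostgreSQL here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (eid INTEGER PRIMARY KEY, pname TEXT, age INTEGER);
    CREATE TABLE dept_wing (wingno INTEGER, wing_name TEXT, dname TEXT);
    CREATE TABLE on_duty (eid INTEGER, dname TEXT, wingno INTEGER, sub_div_no INTEGER);
""")

# 20 employees in total, so the 15% threshold is 3.
conn.executemany("INSERT INTO employee VALUES (?, ?, ?)",
                 [(i, f"emp{i}", 30) for i in range(1, 21)])
conn.executemany("INSERT INTO dept_wing VALUES (?, ?, ?)",
                 [(1, "North", "Cardiology"),
                  (2, "South", "Cardiology"),
                  (1, "North", "Oncology")])
# Cardiology/1 has 4 on duty, Cardiology/2 has 2, Oncology/1 has none.
conn.executemany("INSERT INTO on_duty VALUES (?, ?, ?, ?)",
                 [(1, "Cardiology", 1, 1), (2, "Cardiology", 1, 1),
                  (3, "Cardiology", 1, 2), (4, "Cardiology", 1, 2),
                  (5, "Cardiology", 2, 1), (6, "Cardiology", 2, 1)])

rows = conn.execute("""
    SELECT wing.dname, wing.wingno, COALESCE(duty.cnt, 0) AS employees_on_duty
    FROM dept_wing wing
    LEFT JOIN (
        SELECT dname, wingno, COUNT(*) AS cnt
        FROM on_duty
        GROUP BY dname, wingno
    ) duty ON duty.dname = wing.dname AND duty.wingno = wing.wingno
    WHERE COALESCE(duty.cnt, 0) < 0.15 * (SELECT COUNT(*) FROM employee)
    ORDER BY wing.dname, wing.wingno
""").fetchall()

for row in rows:
    print(row)
```

With this data the wing that has nobody on duty still appears, with a count of 0, because the left join keeps the dept_wing row and coalesce turns the missing count into 0.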

Related

SQL Choose a manager in whose department the total value of sales for 1990 is the highest

I have a hard time trying to figure out the way the tables Sales_order and Employees are connected.
The question is "How can I extract data on the manager with the highest value of sales for 1990 within his department if there is no common column between these tables?"
Join indirectly via customers to get the manager_id:
select top 1 manager_id,sum(total)
from salesorders so
join customers cus on cus.customer_id = so.customer_id
join employee emp on emp.employee_id = cus.salesperson_id
where so.order_date between '1990-01-01' and '1990-12-31'
group by manager_id
order by sum(total) desc

JOIN - 2 tasks - sql developer

I have 2 tasks:
1. FIRST TASK
Show first_name, last_name (from employees), job_title, employee_id (from jobs) start_date, end_date (from job_history)
My idea:
SELECT s.employee_id
, first_name
, last_name
, job_title
, employee_id
, start_date
, end_date
FROM employees
INNER JOIN jobs hp
on s.employee_id = hp.employee_id
INNER JOIN job_history
on hp.jobs = h.jobs
I know it doesn't work. I'm receiving: "HP"."EMPLOYEE_ID": invalid identifier
What does "on s.employee_id = hp.employee_id" mean? Maybe I should write something else instead of this.
2. SECOND TASK
Show department_name (from departments), average and max salary for each department (those data are from employees) and how many employees are working in those departments (from employees). Choose only departments with more than 1 person. The result round to 2 decimal places.
I have the pieces, but I don't know how to connect them.
My idea:
SELECT department_name,average(salary),max(salary),count(employees_id)
FROM employees
INNER JOIN departments
on employees_id = departments_id
HAVING count(department) > 1
SELECT ROUND(average(salary),2) from employees
I modified your queries a bit by improving table aliasing. Hopefully, if the right columns are present in the tables as you say, it should work:
SELECT s.employee_id, s.first_name, s.last_name,
hp.job_title, hp.employee_id,
h.start_date, h.end_date
FROM employees s
INNER JOIN jobs hp
on s.employee_id = hp.employee_id
INNER JOIN job_history h
on hp.jobs = h.jobs;
When we say on s.employee_id = hp.employee_id it means that if, for example, there is an employee_id = 1234 present in both the tables employees and jobs, then SQL will bring all the columns from both the tables in the same line that corresponds to employee_id = 1234. You can now pick different columns in the SELECT clause as if they are in the same/single table(which was not the case before joining). This is the main logic behind SQL joins.
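A small sketch of that matching behaviour, assuming SQLite through Python's sqlite3 module rather than the asker's Oracle database. The two tables are cut down to a couple of columns, the rows are invented, and job_history is assumed to carry an employee_id column.

```python
import sqlite3

# Hypothetical, cut-down tables; only the join column matters here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (employee_id INTEGER, first_name TEXT);
    CREATE TABLE job_history (employee_id INTEGER, start_date TEXT);
    INSERT INTO employees VALUES (1234, 'Ana'), (5678, 'Ben');
    INSERT INTO job_history VALUES (1234, '2001-01-13'), (9999, '2003-05-01');
""")

# Only employee_id 1234 exists in both tables, so the inner join
# produces a single combined row for it.
rows = conn.execute("""
    SELECT s.employee_id, s.first_name, h.start_date
    FROM employees s
    INNER JOIN job_history h ON s.employee_id = h.employee_id
""").fetchall()

print(rows)
```

Employee 5678 has no history row and history row 9999 has no employee, so an inner join drops both; a left join would keep 5678 with a NULL start_date.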
As to your 2nd task, try the below query. I made some modifications in aggregation by introducing COUNT(DISTINCT s.employees_id). If the same employees_id is present twice for some reason, you still want to count that as one person.
SELECT d.department_name, round(avg(s.salary), 2), max(s.salary), count(distinct s.employees_id)
FROM employees s
INNER JOIN departments d
on s.employees_id = d.departments_id
GROUP BY d.department_name
HAVING COUNT(DISTINCT s.employees_id) > 1;
Let me know if there is still any issue. Hopefully, this works.

SQL Query - Unsure How to Fix Logical Error

Edit: Sorry! I am using Microsoft SQL Server.
For clarification, you can have a department named "x" with a list of jobs, a department named "y" with a different list of jobs, etc.
I also need to use >= ALL instead of TOP 1 or MAX because I need it to return more than one value if necessary (if job1 has 20 employees, job2 has 20 employees and they are both the biggest values, they should both return).
In my query I'm trying to find the most common jobTitle and the number of employees that work under this jobTitle, which is under the department 'Research and Development'. The query I've written consists of joins to be able to return the necessary data.
The problem I am having is with the WHERE statement. The HAVING COUNT(JobTitle) >= ALL is finding the biggest number of employees that work under a job, however the problem is that my WHERE statement is saying the Department must be 'Research and Development', but the job with the most amount of employees comes from a different department, and thus the output produces only the column names and nothing else.
I want to redo the query so that it returns the job with the largest amount of employees that comes from the Research and Development department.
I know this is probably pretty simple, I'm a noob :3 Thanks a lot for the help!
SELECT JobTitle, COUNT(JobTitle) AS JobTitleCount, Department
FROM HumanResources.Employee AS EMP JOIN
HumanResources.EmployeeDepartmentHistory AS HIST
ON EMP.BusinessEntityID = HIST.BusinessEntityID JOIN
HumanResources.Department AS DEPT
ON HIST.DepartmentID = DEPT.DepartmentID
WHERE Department = 'Research and Development'
GROUP BY JobTitle, Department
HAVING COUNT(JobTitle) >= ALL (
SELECT COUNT(JobTitle) FROM HumanResources.Employee
GROUP BY JobTitle
)
If you only want one row, then a typical method is:
SELECT JobTitle, COUNT(*) AS JobTitleCount
FROM HumanResources.Employee AS EMP JOIN
HumanResources.EmployeeDepartmentHistory AS HIST
ON EMP.BusinessEntityID = HIST.BusinessEntityID JOIN
HumanResources.Department AS DEPT
ON HIST.DepartmentID = DEPT.DepartmentID
WHERE Department = 'Research and Development'
GROUP BY JobTitle
ORDER BY COUNT(*) DESC
FETCH FIRST 1 ROW ONLY;
Although FETCH FIRST 1 ROW ONLY is the ANSI standard, some databases spell it LIMIT or even SELECT TOP (1).
Note that I removed DEPARTMENT both from the SELECT and the GROUP BY. It seems redundant.
And, if I had to guess, your query is going to overstate results because of the history table. If this is the case, ask another question, with sample data and desired results.
EDIT:
In SQL Server, I would recommend using window functions. To get the one top job title:
SELECT JobTitle, JobTitleCount
FROM (SELECT JobTitle, COUNT(*) AS JobTitleCount,
ROW_NUMBER() OVER (ORDER BY COUNT(*) DESC) as seqnum
FROM HumanResources.Employee AS EMP JOIN
HumanResources.EmployeeDepartmentHistory AS HIST
ON EMP.BusinessEntityID = HIST.BusinessEntityID JOIN
HumanResources.Department AS DEPT
ON HIST.DepartmentID = DEPT.DepartmentID
WHERE Department = 'Research and Development'
GROUP BY JobTitle
) j
WHERE seqnum = 1;
To get all such titles, when there are duplicates, use RANK() or DENSE_RANK() instead of ROW_NUMBER().
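The difference between ROW_NUMBER() and RANK() on ties can be seen in a minimal sketch. It assumes SQLite 3.25 or later (for window functions) through Python's sqlite3 module instead of SQL Server, and uses a hypothetical one-table staff layout in place of the AdventureWorks joins. The aggregation sits in its own subquery to keep the window step easy to read.

```python
import sqlite3

# Hypothetical data: Engineer and Designer tie for the most common title.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE staff (name TEXT, job_title TEXT);
    INSERT INTO staff VALUES
        ('a', 'Engineer'), ('b', 'Engineer'),
        ('c', 'Designer'), ('d', 'Designer'),
        ('e', 'Manager');
""")

def top_titles(ranking_function):
    """Return the titles ranked first by the given window function."""
    sql = f"""
        SELECT job_title, title_count
        FROM (SELECT job_title, title_count,
                     {ranking_function}() OVER (ORDER BY title_count DESC) AS seqnum
              FROM (SELECT job_title, COUNT(*) AS title_count
                    FROM staff
                    GROUP BY job_title) counts) j
        WHERE seqnum = 1
        ORDER BY job_title
    """
    return conn.execute(sql).fetchall()

one_row = top_titles("ROW_NUMBER")   # exactly one of the tied titles
all_ties = top_titles("RANK")        # every title sharing the top count

print(one_row)
print(all_ties)
```

ROW_NUMBER() numbers the tied rows 1 and 2 in an unspecified order, so which of the two titles it returns is not guaranteed; RANK() gives both of them 1.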
with employee_counts as (
select
hist.DepartmentID, emp.JobTitle, count(*) as cnt,
case when dept.Department = 'Research and Development' then 1 else 0 end as is_rd
from HumanResources.Employee as emp
inner join HumanResources.EmployeeDepartmentHistory as hist
on hist.BusinessEntityID = emp.BusinessEntityID
inner join HumanResources.Department as dept
on dept.DepartmentID = hist.DepartmentID
group by
hist.DepartmentID, dept.Department, emp.JobTitle
)
select * from employee_counts
where is_rd = 1 and cnt = (
select max(cnt) from employee_counts
/* where is_rd = 1 */ -- ??
);

How to solve this query in SQL?

I have the following two relations:
EMP(ENO, ENAME, JOB, DATEJOB, SAL, DNO)
DEPT(DNO, DNAME, DIR)
EMP is a relation of the employees with their number ENO, their names ENAME, their job titles JOB, the dates when they were hired DATEJOB, their salary SAL, and the number of the department they work in DNO (foreign key which references DNO in DEPT).
DEPT is a relation of the departments with the number of each department DNO, the name of the department DNAME, and the director of the department DIR (foreign key which references ENO).
My question is:
Write the following query in SQL.
Find the names of the employees that have the same job and the same director as 'Joe'.
My attempt was:
SELECT ENAME
FROM EMP, DEPT
WHERE EMP.DNO = DEPT.DNO
AND (DIR, JOB) IN (
SELECT DIR, JOB
FROM EMP, DEPT
WHERE ENAME = 'Joe'
AND EMP.DEPT = DEPT.DNO
)
AND ENO NOT IN (
SELECT ENO
FROM EMP, DEPT
WHERE ENAME = 'Joe'
AND EMP.DEPT = DEPT.DNO
)
I found a solution to this problem, but I don't agree with it.
This is what I found:
SELECT ENAME
FROM EMP, DEPT
WHERE ENAME <> 'Joe'
AND EMP.DNO = DEPT.DNO
AND (DIR, JOB) = (
SELECT DIR, JOB
FROM EMP, DEPT
WHERE ENAME = 'Joe'
AND EMP.DEPT = DEPT.DNO
)
The thing is, we have to not consider 'Joe' in the result. But which 'Joe'?
It looks like there's a potential for a "director" to head multiple departments. At least, nothing in the information model seems to restrict that (i.e. there is no unique constraint on DIR).
Presumably, we identify employee 'Joe' by finding the tuple(s) in EMP with the ENAME attribute equal to 'Joe'.
And presumably, we would identify Joe's "director" by getting the value of the DIR attribute from the DEPT relation.
If we wanted employees in the "same department" as Joe, we could just use the value of the DNO attribute,... but the requirement says "same director". So, just in case the same director heads multiple departments, we'll get all the departments headed by that director.
Then, it's a simple matter of getting all of the employees in those departments, and check for a "job" that matches Joe's "job".
SELECT e.ENAME
FROM EMP j
JOIN DEPT i
ON i.DNO = j.DNO
JOIN DEPT d
ON d.DIR = i.DIR
JOIN EMP e
ON e.DNO = d.DNO
AND e.JOB = j.JOB
WHERE j.ENAME = 'Joe'
Again, if we wanted only the employees in the "same department" as Joe, we could dispense with one of those references to DEPT. The result would be different: if Joe's director heads another department, and an employee in that other department has the same job, that employee would be excluded from this query:
SELECT e.ENAME
FROM EMP j
JOIN DEPT i
ON i.DNO = j.DNO
-- JOIN DEPT d
-- ON d.DIR = i.DIR
JOIN EMP e
-- ON e.DNO = d.DNO
ON e.DNO = i.DNO
AND e.JOB = j.JOB
WHERE j.ENAME = 'Joe'
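To see how the "same director" and "same department" variants differ, here is a sketch under stated assumptions: SQLite via Python's sqlite3 module, invented rows, and a hypothetical director 100 who heads departments 10 and 20.

```python
import sqlite3

# Hypothetical data: director 100 heads departments 10 and 20.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE DEPT (DNO INTEGER, DNAME TEXT, DIR INTEGER);
    CREATE TABLE EMP (ENO INTEGER, ENAME TEXT, JOB TEXT, DNO INTEGER);
    INSERT INTO DEPT VALUES (10, 'A', 100), (20, 'B', 100), (30, 'C', 200);
    INSERT INTO EMP VALUES
        (1, 'Joe', 'Dev', 10), (2, 'Ann', 'Dev', 10),
        (3, 'Cy',  'Dev', 20), (4, 'Dee', 'Dev', 30),
        (5, 'Eve', 'Sales', 20);
""")

# Same job, same director: follows DIR to every department that director heads.
same_director = conn.execute("""
    SELECT e.ENAME
    FROM EMP j
    JOIN DEPT i ON i.DNO = j.DNO
    JOIN DEPT d ON d.DIR = i.DIR
    JOIN EMP e ON e.DNO = d.DNO AND e.JOB = j.JOB
    WHERE j.ENAME = 'Joe'
    ORDER BY e.ENO
""").fetchall()

# Same job, same department: stays inside Joe's own department.
same_department = conn.execute("""
    SELECT e.ENAME
    FROM EMP j
    JOIN DEPT i ON i.DNO = j.DNO
    JOIN EMP e ON e.DNO = i.DNO AND e.JOB = j.JOB
    WHERE j.ENAME = 'Joe'
    ORDER BY e.ENO
""").fetchall()

print(same_director)
print(same_department)
```

Cy works in department 20, which shares Joe's director, so Cy only shows up in the first result. Joe himself is returned by both queries, since nothing excludes him yet.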
If there's a requirement to exclude Joe from the resultset, then we could add another predicate to the WHERE clause. If we don't assume that ENAME can't have a NULL value...
AND ( e.ENAME IS NULL OR e.ENAME <> 'Joe')
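Why the IS NULL branch matters can be shown with a minimal sketch, assuming SQLite via Python's sqlite3 module and a made-up three-row table: a comparison with NULL evaluates to unknown, so a plain <> silently drops the row with a NULL name.

```python
import sqlite3

# Hypothetical rows; employee 3 has no name recorded.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE emp (eno INTEGER, ename TEXT);
    INSERT INTO emp VALUES (1, 'Joe'), (2, 'Ann'), (3, NULL);
""")

# NULL <> 'Joe' is unknown, not true, so employee 3 is filtered out.
plain = conn.execute(
    "SELECT eno FROM emp WHERE ename <> 'Joe' ORDER BY eno"
).fetchall()

# The explicit IS NULL test keeps the unnamed employee.
null_safe = conn.execute(
    "SELECT eno FROM emp WHERE (ename IS NULL OR ename <> 'Joe') ORDER BY eno"
).fetchall()

print(plain)
print(null_safe)
```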
You're correct in that the second solution is wrong. If there are two 'Joe's it won't work right. That's why you should exclude based on the unique ENO instead of the non-unique name. The first query won't work for the same reason. In order to be certain, you can't select either just by names or titles or departments, because those can be duplicate. We have three Chris programmers in our department.
Also, that comma-style join syntax is outdated: with the join conditions mixed into the WHERE clause, it is easy to drop one by accident and end up with a cross join. Please see http://www.w3schools.com/sql/sql_join_inner.asp for an explanation of the current syntax.
The comma style of join you are using has been obsolete for a long time. I think the query below is what you're after. The idea is to join a table to itself. This is done by giving the table two aliases, source and twin here.
SELECT twin.ENAME
FROM EMP AS source
JOIN EMP AS twin ON twin.DNO = source.DNO AND twin.JOB = source.JOB
WHERE source.ENAME = 'Joe' AND source.ENO <> twin.ENO
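The "which Joe?" concern can be tried out with a self-join on a small table. This is a sketch assuming SQLite via Python's sqlite3 module, with invented rows that contain two different employees named Joe; picking ENO 1 as "the" Joe is purely illustrative.

```python
import sqlite3

# Hypothetical data: two different employees are both called Joe.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE EMP (ENO INTEGER PRIMARY KEY, ENAME TEXT, JOB TEXT, DNO INTEGER);
    INSERT INTO EMP VALUES
        (1, 'Joe', 'Dev', 10), (2, 'Ann', 'Dev', 10), (3, 'Bob', 'Sales', 10),
        (4, 'Joe', 'Dev', 20), (5, 'Cy',  'Dev', 20);
""")

self_join = """
    SELECT twin.ENAME
    FROM EMP AS source
    JOIN EMP AS twin ON twin.DNO = source.DNO AND twin.JOB = source.JOB
    WHERE {source_filter} AND source.ENO <> twin.ENO
    ORDER BY twin.ENO
"""

# Filtering by name picks up colleagues of both Joes.
by_name = conn.execute(
    self_join.format(source_filter="source.ENAME = 'Joe'")
).fetchall()

# Filtering by the unique ENO pins down one specific Joe.
by_number = conn.execute(
    self_join.format(source_filter="source.ENO = 1")
).fetchall()

print(by_name)
print(by_number)
```

Comparing ENO values removes each Joe from his own result without relying on the name, and selecting the source row by ENO avoids mixing the two Joes' colleagues.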

Oracle SQL Exclude row from query results

I have two tables in Oracle SQL:
PROJECT (PID, Pname, Budget, DID)
DIVISION (DID, Dname)
Bold = Primary key
Italic = Foreign key
I want to list the division that has more projects than the division marketing.
Here is my code:
select dname as "Division"
from division d, project p
where d.did = p.did
group by dname
having count(pid) >= all
(select count(p.pid)
from project p, division d
where p.did = d.did and d.dname = 'marketing')
I return the correct record but also the marketing record. How can I exclude the marketing record from the results?
Why don't you exclude the marketing record from your initial SQL by adding:
and d.dname != 'marketing'
To the first where clause.
You might gain efficiency with a common table expression (WITH clause) to aggregate the project count by division, then you can query it ...
with cte as (
select dname,
count(*) projects
from project p,
division d
where p.did = d.did
group by dname)
select dname,
projects
from cte
where projects > (select projects
from cte
where dname = 'marketing')
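A runnable version of the common table expression approach, as a sketch only: it assumes SQLite via Python's sqlite3 module in place of Oracle, uses explicit JOIN syntax, and the division names and project rows are invented.

```python
import sqlite3

# Hypothetical data: marketing has 2 projects, engineering 3, sales 1.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE division (did INTEGER PRIMARY KEY, dname TEXT);
    CREATE TABLE project (pid INTEGER PRIMARY KEY, pname TEXT, budget INTEGER, did INTEGER);
    INSERT INTO division VALUES (1, 'marketing'), (2, 'engineering'), (3, 'sales');
    INSERT INTO project VALUES
        (1, 'p1', 100, 1), (2, 'p2', 100, 1),
        (3, 'p3', 100, 2), (4, 'p4', 100, 2), (5, 'p5', 100, 2),
        (6, 'p6', 100, 3);
""")

rows = conn.execute("""
    WITH cte AS (
        SELECT d.dname, COUNT(*) AS projects
        FROM project p
        JOIN division d ON p.did = d.did
        GROUP BY d.dname
    )
    SELECT dname, projects
    FROM cte
    WHERE projects > (SELECT projects FROM cte WHERE dname = 'marketing')
    ORDER BY dname
""").fetchall()

print(rows)
```

Because the comparison is a strict greater-than, the marketing row can never satisfy it against its own count, so no separate exclusion is needed.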