subquery exercise - sql

I need to write a query, containing a subquery, that lists each department's name and the number of employees in that department whose job_title contains the word 'Representative'. The list must be ordered by department_id.
I've written this query:
SELECT d.department_name, emp.employee_id
FROM departments d, employees emp, jobs j
WHERE emp.department_id=d.department_id
AND j.job_title LIKE '%Representative%';

If there is a requirement to use a subquery, then the following will achieve what you're looking for:
select d.department_id,
d.department_name,
(select count(*)
from employees emp
join jobs j
on j.job_id = emp.employee_job -- I've made some assumptions, here!
where emp.department_id = d.department_id
and j.job_title like '%Representative%') reps
from departments d
order by d.department_id;
Personally, however, I would use a query like this:
select d.department_id,
d.department_name,
count(emp.employee_id) reps
from departments d
join employees emp
on emp.department_id = d.department_id
join jobs j
on j.job_id = emp.employee_job -- Same assumption as before!
where j.job_title like '%Representative%'
group by d.department_id,
d.department_name
order by d.department_id;
I find it easier to read/interpret, but that's ultimately up to you.
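One difference between the two queries above is worth checking before choosing: the correlated subquery keeps departments that have no matching employees (with a count of 0), while the inner join with GROUP BY drops them. The sketch below reproduces this with Python's sqlite3 module on a tiny made-up data set. The table contents and the emp.job_id join column are assumptions for illustration only, not the asker's real schema.

```python
import sqlite3

# Hypothetical miniature data; real table contents and the job join column may differ.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE departments (department_id INTEGER PRIMARY KEY, department_name TEXT);
    CREATE TABLE jobs (job_id TEXT PRIMARY KEY, job_title TEXT);
    CREATE TABLE employees (employee_id INTEGER PRIMARY KEY, department_id INTEGER, job_id TEXT);
    INSERT INTO departments VALUES (10, 'Administration'), (80, 'Sales');
    INSERT INTO jobs VALUES ('SA_REP', 'Sales Representative'),
                            ('AD_ASST', 'Administration Assistant');
    INSERT INTO employees VALUES (1, 80, 'SA_REP'), (2, 80, 'SA_REP'), (3, 10, 'AD_ASST');
""")

# Correlated subquery: one output row per department, even when the count is 0.
subquery_rows = conn.execute("""
    SELECT d.department_id, d.department_name,
           (SELECT COUNT(*)
              FROM employees emp
              JOIN jobs j ON j.job_id = emp.job_id
             WHERE emp.department_id = d.department_id
               AND j.job_title LIKE '%Representative%') AS reps
      FROM departments d
     ORDER BY d.department_id
""").fetchall()

# Inner join + GROUP BY: departments without a matching employee disappear.
join_rows = conn.execute("""
    SELECT d.department_id, d.department_name, COUNT(emp.employee_id) AS reps
      FROM departments d
      JOIN employees emp ON emp.department_id = d.department_id
      JOIN jobs j ON j.job_id = emp.job_id
     WHERE j.job_title LIKE '%Representative%'
     GROUP BY d.department_id, d.department_name
     ORDER BY d.department_id
""").fetchall()

print(subquery_rows)  # [(10, 'Administration', 0), (80, 'Sales', 2)]
print(join_rows)      # [(80, 'Sales', 2)]
```

If the exercise expects every department to be listed, the subquery form (or an outer join) is the one that matches; if only departments with representatives are wanted, the join form does that.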

You don't need a subquery for this. Use a simple JOIN.
SELECT d.department_name, COUNT(*) AS cnt
FROM employees e JOIN departments d
ON e.department_id = d.department_id
JOIN jobs j ON e.job_id = j.job_id -- assuming employees has a job_id column
WHERE j.job_title LIKE '%Representative%'
GROUP BY d.department_id, d.department_name
ORDER BY d.department_id
Or, if it is just an exercise for you, I would suggest using the following query:
SELECT d.department_name
, (SELECT COUNT(*)
FROM employees e JOIN jobs j ON e.job_id = j.job_id
WHERE e.department_id = d.department_id
AND j.job_title LIKE '%Representative%') AS cnt
FROM departments d
ORDER BY d.department_id
In the real world, however, you code for convenience, not just for exercise. Your code should be convenient to read, understand and maintain for all those who are involved in your software development process. If it is just an exercise then you can use the second query. But if you have to use the query in a live application, the approach in the first query is better for everyone around you.

Your example is missing some join logic, but something like this may work for you:
select d.department_name, count(emp.employee_id)
from departments d, employees emp, jobs j
where emp.department_id = d.department_id
and j.job_id = emp.job_id -- assumed join column; adjust to your schema
and j.job_title in (select job_title from jobs where job_title like '%Representative%')
group by d.department_name
The join columns are guesses, because the question does not show how employees relate to jobs, so the SQL may not be 100% correct, but you can see the point.
This should return every department that has at least one matching employee, together with the number of employees whose job title contains 'Representative'.
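To see why the missing join condition matters, here is a small sketch using Python's sqlite3 module. It runs the query from the question as written, then the same query with a condition linking employees to jobs. The data and the emp.job_id column are invented for the illustration.

```python
import sqlite3

# Hypothetical miniature data; the job_id column on employees is an assumption.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE departments (department_id INTEGER PRIMARY KEY, department_name TEXT);
    CREATE TABLE jobs (job_id TEXT PRIMARY KEY, job_title TEXT);
    CREATE TABLE employees (employee_id INTEGER PRIMARY KEY, department_id INTEGER, job_id TEXT);
    INSERT INTO departments VALUES (10, 'Administration'), (80, 'Sales');
    INSERT INTO jobs VALUES ('SA_REP', 'Sales Representative'),
                            ('MK_REP', 'Marketing Representative'),
                            ('AD_ASST', 'Administration Assistant');
    INSERT INTO employees VALUES (1, 80, 'SA_REP'), (2, 80, 'SA_REP'), (3, 10, 'AD_ASST');
""")

# As in the question: jobs is never joined to employees, so every employee
# is paired with every 'Representative' job (a partial Cartesian product).
without_job_join = conn.execute("""
    SELECT d.department_name, emp.employee_id
      FROM departments d, employees emp, jobs j
     WHERE emp.department_id = d.department_id
       AND j.job_title LIKE '%Representative%'
""").fetchall()

# With the join condition: only employees who actually hold such a job remain.
with_job_join = conn.execute("""
    SELECT d.department_name, emp.employee_id
      FROM departments d, employees emp, jobs j
     WHERE emp.department_id = d.department_id
       AND j.job_id = emp.job_id
       AND j.job_title LIKE '%Representative%'
""").fetchall()

print(len(without_job_join))  # 6: three employees times two matching jobs
print(len(with_job_join))     # 2: the two sales representatives
```

Note that without the join the administration assistant is also reported, twice, even though that employee is not a representative.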

Related

What are Oracle's old-syntax join equivalents of these queries?

What are the equivalents of these queries written in Oracle's old join syntax?
SELECT first_name, last_name, department_name, job_title
FROM employees e RIGHT JOIN departments d
ON(e.department_id = d.department_id)
RIGHT JOIN jobs j USING(job_id);
--> 106 rows returned
SELECT first_name, last_name, department_name, job_title
FROM employees e RIGHT JOIN jobs j
ON(e.job_id = j.job_id)
RIGHT JOIN departments d
USING(department_id);
--> 122 rows returned
I would do something like this (for the first query) - making explicit the fact that a multiple join is, by definition, an iteration of joins of two tables (or more generally "rowsets") at a time. Think of it as "using parentheses explicitly".
select first_name, last_name, department_name, job_title
from (
select first_name, last_name, job_id, department_name
from employees e, departments d
where e.department_id (+) = d.department_id
) sq
, jobs j
where sq.job_id (+) = j.job_id
;
This can be rewritten (perhaps) using a single SELECT statement with more WHERE conditions, but the query will be less readable; it won't be quite as clear what it is doing.
Respectively:
SELECT first_name,
last_name,
department_name,
job_title
FROM employees e,
jobs j,
departments d
WHERE e.job_id (+) = j.job_id
AND e.department_id = d.department_id (+);
and:
SELECT first_name,
last_name,
department_name,
job_title
FROM employees e,
departments d,
jobs j
WHERE e.department_id (+) = d.department_id
AND e.job_id = j.job_id (+);
However, please just use the ANSI join syntax. The old legacy join syntax is confusing to read, and you will get errors from putting the (+) on the wrong side of the join condition. You should be teaching people the less confusing "new" syntax (it's hard to call it new when it's been around since Oracle 9i in 2001) rather than reverting to old methods.
Just to add to Mathguy's answer, this is interesting because those innocent-looking right joins are not what they seem. My first (incorrect) attempt was this:
select e.department_id, e.job_id, e.first_name, e.last_name, d.department_name
from jobs j
, departments d
, employees e
where e.job_id(+) = j.job_id
and e.department_id(+) = d.department_id;
but as Mathguy points out it gives different results because of the departments with no employees and the cross join between departments and jobs, and a subtle join precedence effect that appears as a result of the right joins not being in one chain.
I'm not sure what the intention of the original query is. Using the Oracle HR demo schema, the results are the same as an inner join, but only because every job has at least one employee. This illustrates a pitfall in testing outer join queries, as you might run a test, get the same results, and think your rewrite was logically the same thing when it is not.
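The testing pitfall can be reproduced on a toy data set. The sketch below uses Python's sqlite3 module and a simplified two-table version of the problem (jobs and employees only, with invented rows), so it illustrates the effect rather than the exact HR schema results. The helper name rows is made up for this example.

```python
import sqlite3

# Hypothetical miniature data: every job starts out with at least one employee.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE jobs (job_id TEXT PRIMARY KEY, job_title TEXT);
    CREATE TABLE employees (employee_id INTEGER PRIMARY KEY, job_id TEXT);
    INSERT INTO jobs VALUES ('SA_REP', 'Sales Representative'), ('IT_PROG', 'Programmer');
    INSERT INTO employees VALUES (1, 'SA_REP'), (2, 'IT_PROG');
""")

def rows(join_kind):
    """Run the same query with either 'JOIN' or 'LEFT JOIN' from jobs to employees."""
    sql = f"""
        SELECT j.job_title, e.employee_id
          FROM jobs j
          {join_kind} employees e ON e.job_id = j.job_id
         ORDER BY j.job_id, e.employee_id
    """
    return conn.execute(sql).fetchall()

# While every job has an employee, the two joins are indistinguishable.
inner_before, outer_before = rows("JOIN"), rows("LEFT JOIN")
print(inner_before == outer_before)  # True

# Add a job that nobody holds: only the outer join reports it.
conn.execute("INSERT INTO jobs VALUES ('AC_MGR', 'Accounting Manager')")
inner_after, outer_after = rows("JOIN"), rows("LEFT JOIN")
print(inner_after)  # [('Programmer', 2), ('Sales Representative', 1)]
print(outer_after)  # [('Accounting Manager', None), ('Programmer', 2), ('Sales Representative', 1)]
```

A test data set for an outer join rewrite should therefore include at least one unmatched row on each preserved side; otherwise a logically different query can pass the comparison.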
If you rewrite the original right joins as left joins, it would have to become something like this:
select e.department_id, e.job_id, e.first_name, e.last_name, d.department_name
from jobs j
left join (
departments d
left join employees e on e.department_id = d.department_id
)
on e.job_id = j.job_id;
(You could also expand the departments > employees join into an inline view or with clause, or use an outer apply construction to include the job_id join.)
This is because the two right joins in the original query are driven from jobs and departments, so even though the outer join from departments to employees includes the 16 departments with no employees, once we outer join from jobs to that, we implicitly exclude rows with no job_id, because we are driving it from jobs. So the outer join to departments is filtered to become in effect an inner join, and so long as all jobs have corresponding employees then that gives the same results as an inner join too. To see the difference you would have to insert another job, which adds a row in the results with the job title but no employee details.
Therefore the old-style version needs to be either this:
select de.first_name, de.last_name, de.department_name, j.job_title
from jobs j
, lateral (
select e.department_id, e.job_id, e.first_name, e.last_name, d.department_name
from departments d
, employees e
where e.department_id(+) = d.department_id
) de
where de.job_id(+) = j.job_id;
or without lateral:
select first_name, last_name, department_name, job_title
from jobs j
, ( select e.first_name, e.last_name, e.job_id, d.department_name
from departments d, employees e
where e.department_id (+) = d.department_id ) de
where de.job_id(+) = j.job_id
The second query just switches jobs and departments:
select first_name, last_name, department_name, job_title
from departments d
, ( select e.first_name, e.last_name, e.department_id, e.job_id, j.job_title
from jobs j, employees e
where e.job_id(+) = j.job_id ) je
where je.department_id(+) = d.department_id

SQL in Oracle HR Schema

I have made a query in Oracle HR schema to see the following information:
The city where the department is located
The total number of employees in the department
However, the query fails with the error "not a GROUP BY expression".
Does anyone know what the problem is? Thanks in advance.
SELECT department_name, city, COUNT(employees.department_id)
FROM departments
JOIN employees on (departments.department_id=employees.department_id)
JOIN locations USING (location_id)
GROUP BY department_name;
You are grouping by department and want to show the department's city. You expect this to work, because each department is in exactly one city. (SQL people call this functional dependency.)
For this to work, two things would have to hold:
1. There would have to be a unique constraint on the department name, or you'd have to group by department_id instead.
2. The DBMS would have to detect and support functional dependency in aggregation queries.
Unfortunately, Oracle doesn't support functional dependency in aggregation queries. It forces us to put every such column in the GROUP BY clause or into an aggregation function.
So either extend the GROUP BY clause:
SELECT d.department_name, l.city, COUNT(e.department_id)
FROM departments d
JOIN employees e ON e.department_id = d.department_id
JOIN locations l USING (location_id)
GROUP BY d.department_name, l.city
ORDER BY d.department_name;
or use an aggregation function such as MIN or MAX on that single value:
SELECT d.department_name, MAX(l.city) AS city, COUNT(e.department_id)
FROM departments d
JOIN employees e ON e.department_id = d.department_id
JOIN locations l USING (location_id)
GROUP BY d.department_name
ORDER BY d.department_name;
What I prefer though, is to aggregate first and only then join. You want to join the departments with their employee count, so do just that:
SELECT d.department_name, l.city, COALESCE(e.cnt, 0) AS employee_count
FROM departments d
JOIN locations l USING (location_id)
LEFT JOIN
(
SELECT department_id, COUNT(*) as cnt
FROM employees
GROUP BY department_id
) e ON e.department_id = d.department_id
ORDER BY d.department_name;
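As a rough check of the "aggregate first, then join" idea, the sketch below runs it with Python's sqlite3 module next to the extended GROUP BY variant. The rows are invented, and ON is used instead of USING purely for portability. The main visible difference is that the LEFT JOIN to the pre-aggregated counts keeps a department with no employees.

```python
import sqlite3

# Hypothetical miniature data; 'Treasury' deliberately has no employees.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE locations (location_id INTEGER PRIMARY KEY, city TEXT);
    CREATE TABLE departments (department_id INTEGER PRIMARY KEY,
                              department_name TEXT, location_id INTEGER);
    CREATE TABLE employees (employee_id INTEGER PRIMARY KEY, department_id INTEGER);
    INSERT INTO locations VALUES (1700, 'Seattle'), (2500, 'Oxford');
    INSERT INTO departments VALUES (10, 'Administration', 1700),
                                   (80, 'Sales', 2500),
                                   (120, 'Treasury', 1700);
    INSERT INTO employees VALUES (1, 10), (2, 80), (3, 80);
""")

# Join everything, then group: departments without employees are lost.
grouped = conn.execute("""
    SELECT d.department_name, l.city, COUNT(e.department_id)
      FROM departments d
      JOIN employees e ON e.department_id = d.department_id
      JOIN locations l ON l.location_id = d.location_id
     GROUP BY d.department_name, l.city
     ORDER BY d.department_name
""").fetchall()

# Aggregate employees per department first, then outer join the counts.
aggregated_first = conn.execute("""
    SELECT d.department_name, l.city, COALESCE(e.cnt, 0) AS employee_count
      FROM departments d
      JOIN locations l ON l.location_id = d.location_id
      LEFT JOIN (SELECT department_id, COUNT(*) AS cnt
                   FROM employees
                  GROUP BY department_id) e
        ON e.department_id = d.department_id
     ORDER BY d.department_name
""").fetchall()

print(grouped)           # [('Administration', 'Seattle', 1), ('Sales', 'Oxford', 2)]
print(aggregated_first)  # [('Administration', 'Seattle', 1), ('Sales', 'Oxford', 2), ('Treasury', 'Seattle', 0)]
```

The pre-aggregated form also sidesteps the GROUP BY question entirely, since the outer query contains no aggregate.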
The problem is that you have both aggregated and non-aggregated columns (in your case, city) in the select list.
As I don't know the structure of the locations table, and assuming a department has only one location defined, you can use max(city):
SELECT department_name, max(city) city, COUNT(employees.department_id) no_of_employees
FROM departments
JOIN employees on (departments.department_id=employees.department_id)
JOIN locations USING (location_id)
GROUP BY department_name;
As Thorsten explained so well, you could also count with an analytic function, using COUNT with an OVER (PARTITION BY ...) clause, which removes the need for a GROUP BY clause. Because an analytic function returns one row per employee rather than one per department, add DISTINCT to collapse the duplicates:
SELECT DISTINCT d.department_name, l.city, COUNT(e.department_id) OVER (PARTITION BY e.department_id) AS emp_count
FROM departments d
JOIN employees e ON e.department_id = d.department_id
JOIN locations l USING (location_id)
ORDER BY d.department_name;

SQL-HR Schema, Retrieving the Dept.Names,managers and employees per dept

Hello guys and thank you in advance for your time and help.
So I am trying to get a list of the department names, their managers' names, and the total number of employees per department.
My code so far looks like this:
select d.department_name,e.first_name,e.last_name
from employees e, departments d
where e.department_id = d.department_id and d.manager_id=e.employee_id
group by d.department_name,e.first_name,e.last_name
order by d.department_name;
This produces the list of managers per department, but I am still missing the count of employees per department. Any ideas?
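You need to use the COUNT function, and you need to join employees twice: once to find the manager and once to count the department's staff. Try this:
select d.department_name, m.first_name, m.last_name, count(e.employee_id) as TotalNoOfEmployees
from departments d
JOIN employees m -- the department's manager
ON m.employee_id = d.manager_id
JOIN employees e -- everyone who works in the department
ON e.department_id = d.department_id
group by d.department_name, m.first_name, m.last_name
order by d.department_name;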
Also, try not to use the old way of joining tables, i.e. comma-separated joins.
After a lot of experimentation I got it. Posting it in case somebody might find it useful someday:
select distinct d.department_name,
(select e.first_name||', '||e.last_name from employees e
where d.department_id=e.department_id and
d.manager_id=e.employee_id)as "manager_name",
( select count( employee_id ) from employees e
where d.department_id=e.department_id ) as "total_no_of_employees"
from employees e
join departments d on d.department_id=e.department_id
order by d.department_name;
Try this:
select emp.manager_id, mgr.first_name, mgr.last_name, dept.department_name, count(emp.employee_id)
from hr.employees emp
join hr.employees mgr
on emp.manager_id = mgr.employee_id
join hr.departments dept
on mgr.department_id = dept.department_id
group by emp.manager_id, mgr.first_name, mgr.last_name, dept.department_name
order by department_name

SQL query to find SUM

I'm new to SQL and I've been struggling to write this query. I want to find the SUM of all salaries for employees in a given department, let's say 'M', and a given hire date, let's say '2002'. Any ideas? I'm thinking I have to JOIN the tables somehow but I'm having trouble. I've set up a schema like this.
jobs table and columns
JOBS
------------
job_id
salary
hire_date
employees table and columns
EMPLOYEES
------------
employee_id
name
job_id
department_id
department table and columns
DEPARTMENTS
------------
department_id
department_name
This is very similar to the way the HR schema does it in Oracle so I think the schema should be OK just need help with the query now.
You need a statement like this:
SELECT e.name,
d.department_name,
SUM(j.salary)
FROM employees e,
departments d,
jobs j
WHERE d.department_name = 'M'
AND TO_CHAR(j.hire_date, 'YYYY') = '2002'
AND d.department_id = e.department_id
AND e.job_id = j.job_id
GROUP BY e.name,
d.department_name;
FWIW, you shouldn't use the old ANSI-89 implicit join notation (using ,). Explicit JOIN syntax has been part of the standard since ANSI-92 (more than 20 years ago!). Comma joins are still valid SQL, but some vendors have dropped support for their proprietary old-style outer join operators (for example *= and =* in MS SQL Server; I don't know if there is a deprecation warning for (+) with Oracle?).
So, as a newcomer, you shouldn't learn bad habits from the start.
With the "modern" syntax, your query should be written:
SELECT e.name,
d.department_name,
SUM(j.salary)
FROM employees e
JOIN departments d USING(department_id)
JOIN jobs j USING(job_id)
WHERE d.department_name = 'M'
AND TO_CHAR(j.hire_date, 'YYYY') = '2002'
GROUP BY e.name, d.department_name;
With that syntax, there is a clear distinction between the JOIN relation (USING or ON) and the filter clause WHERE. This will make things easier later, when you encounter "advanced" joins such as OUTER JOIN.
Yes, you just need a simple inner JOIN between all three tables.
SELECT SUM(salary)
FROM JOBS j
JOIN EMPLOYEES e ON j.job_id = e.job_id
JOIN DEPARTMENTS d ON e.department_id = d.department_id
WHERE d.department_name = 'M'
AND EXTRACT(YEAR FROM j.hire_date) = 2002 -- hire_date is in JOBS in your schema

SQL query errors and mistakes

I have an issue with a SQL query.
The question: show all the departments in which the max salary is bigger than 10000.
I am getting an output with this but it doesn't seem right.
My code:
SELECT
Department_Name, Max_Salary
FROM
Departments
INNER JOIN
Job_History ON Departments.department_id = Job_History.department_id
INNER JOIN
Jobs ON Job_History.job_id = jobs.job_id
WHERE
Max_Salary > 10000
Output:
DEPT_NAME | MAX_SALARY
------------------------
Accounting | 16,000
Sales | 12,080
Sales | 20,080
There is only one Sales department in the database.
Any help on why this is happening would be appreciated.
Likely, there are multiple rows in job_history that are related to 'Sales' Department.
The join operation is returning all matching rows.
To get a distinct list of Department_Name, you could add GROUP BY Department_name to the end of the query. You'll also want to use an aggregate function around the Max_Salary column in the select list... e.g. MAX(Max_Salary).
Best practice is to qualify all column references in the query. For a reader not familiar with the database schema, it's not clear whether Max_Salary is from the Job_History table or the Jobs table. Also, the keyword INNER has no effect on the join operation, so it can be omitted.
--This works
SELECT d.department_name
, MAX(j.max_salary) AS max_salary
FROM Departments d
JOIN Job_History h
ON h.department_id = d.department_id
JOIN Jobs j
ON j.job_id = h.job_id
WHERE j.max_salary > 10000
GROUP BY d.department_name
I would prefer to comment, but I don't have a high enough reputation.
Can you try this and see what it returns for the ids?
Select Department_Name, Max_Salary,D.department_id,J.job_id
From Departments D
INNER JOIN Job_History J_H
ON D.department_id=J_H.department_id
INNER JOIN Jobs J
ON J_H.job_id=J.job_id
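WHERE Max_Salary > 10000
If that shows several job ids for the same department, group by department and filter on the aggregate instead:
Select d.department_name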
, MAX(j.max_salary) AS max_salary
From Departments D
INNER JOIN Job_History J_H
ON D.department_id=J_H.department_id
INNER JOIN Jobs J
ON J_H.job_id=J.job_id
GROUP BY d.department_name
having max(j.max_salary)>10000
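To illustrate both the duplicate 'Sales' rows and the GROUP BY / HAVING fix, here is a small sketch using Python's sqlite3 module. The rows are invented to mimic the output in the question (one department linked to two jobs through Job_History), and the 'Marketing' row is an extra assumption added to show the filter excluding a department.

```python
import sqlite3

# Hypothetical miniature data shaped like the question's output.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE departments (department_id INTEGER PRIMARY KEY, department_name TEXT);
    CREATE TABLE jobs (job_id TEXT PRIMARY KEY, max_salary INTEGER);
    CREATE TABLE job_history (department_id INTEGER, job_id TEXT);
    INSERT INTO departments VALUES (20, 'Marketing'), (80, 'Sales'), (110, 'Accounting');
    INSERT INTO jobs VALUES ('MK_REP', 9000), ('SA_REP', 12080),
                            ('SA_MAN', 20080), ('AC_MGR', 16000);
    INSERT INTO job_history VALUES (20, 'MK_REP'), (80, 'SA_REP'),
                                   (80, 'SA_MAN'), (110, 'AC_MGR');
""")

# Plain join with a WHERE filter: one row per matching job, so Sales repeats.
per_job = conn.execute("""
    SELECT d.department_name, j.max_salary
      FROM departments d
      JOIN job_history h ON h.department_id = d.department_id
      JOIN jobs j ON j.job_id = h.job_id
     WHERE j.max_salary > 10000
     ORDER BY d.department_name, j.max_salary
""").fetchall()

# Grouped version: one row per department, filtered on the aggregate.
per_department = conn.execute("""
    SELECT d.department_name, MAX(j.max_salary) AS max_salary
      FROM departments d
      JOIN job_history h ON h.department_id = d.department_id
      JOIN jobs j ON j.job_id = h.job_id
     GROUP BY d.department_name
    HAVING MAX(j.max_salary) > 10000
     ORDER BY d.department_name
""").fetchall()

print(per_job)         # [('Accounting', 16000), ('Sales', 12080), ('Sales', 20080)]
print(per_department)  # [('Accounting', 16000), ('Sales', 20080)]
```

For a MAX comparison like this one, filtering rows with WHERE before grouping and filtering groups with HAVING select the same departments; HAVING is the form that generalizes to other aggregates such as SUM or AVG.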