SQL query errors and mistakes - sql

I have an issue with a SQL query.
The question: show all the departments in which the max salary is bigger than 10000.
I am getting an output with this but it doesn't seem right.
My code:
SELECT
Department_Name, Max_Salary
FROM
Departments
INNER JOIN
Job_History ON Departments.department_id = Job_History.department_id
INNER JOIN
Jobs ON Job_History.job_id = jobs.job_id
WHERE
Max_Salary > 10000
Output:
DEPT_NAME | MAX_SALARY
------------------------
Accounting | 16,000
Sales | 12,080
Sales | 20,080
There is only one Sales department in the database.
Any help on why this is happening would be appreciated.

Likely, there are multiple rows in job_history that are related to 'Sales' Department.
The join operation is returning all matching rows.
To get a distinct list of Department_Name, you could add GROUP BY Department_name to the end of the query. You'll also want to use an aggregate function around the Max_Salary column in the select list... e.g. MAX(Max_Salary).
Best practice is to qualify all column references in the query. For a reader not familiar with the database schema, it's not clear whether Max_Salary is from the Job_History table, or the Job table. Also, the keyword INNER has no effect on the join operation, that keyword can be omitted.
--This works
SELECT d.department_name
, MAX(j.max_salary) AS max_salary
FROM Departments d
JOIN Job_History h
ON h.department_id = d.department_id
JOIN Jobs j
ON j.job_id = h.job_id
WHERE j.max_salary > 10000
GROUP BY d.department_name

I would prefer to comment, but I don't have a high enough reputation.
Can you try this and see what it brings for the ids.
Select Department_Name, Max_Salary,D.department_id,J.job_id
From Departments D
INNER JOIN Job_History J_H
ON D.department_id=J_H.department_id
INNER JOIN Jobs J
ON J_H.job_id=J.job_id
WHERE Max_Salary > 10000

Select d.department_name
, MAX(j.max_salary) AS max_salary
From Departments D
INNER JOIN Job_History J_H
ON D.department_id=J_H.department_id
INNER JOIN Jobs J
ON J_H.job_id=J.job_id
GROUP BY d.department_name
having max(j.max_salary)>10000

Related

Trouble understanding inner join

Im trying to understand inner join correct use but im having trouble when I have to call more than 2 tables, im using oracle HR schema for tests, so for example im trying this query:
select emps.employee_id,
emps.first_name,
dep.department_name,
dep.department_id,
jh.job_id
from employees emps
inner join departments dep
on emps.department_id = dep.department_id
inner join job_history jh
on dep.department_id = jh.department_id;
which shows the following result:
I know its wrong because first its showing duplicate rows, and second if I run this query
select * from job_history where employee_id = 100;
It shows no results which means that employee 100 (steven) shouldnt be appearing in the results of the first query, I expect first query to show me the results of data that is on employees and also on department and also on job_history which have in common the department_id
HR (human resources schema):
Can anyone help me, what im missing, why it shows duplicate rows?
The job_history table has 3 foreign keys:
employee_id
job_id
department_id
Your INNER JOIN only joins on the DEPARTMENT_ID so you are matching an employee to the job history for any jobs that occurred for their department but not necessarily jobs specific to that employee. Instead, you probably want to join on either the employee_id or the combination of employee_id and department_id.
So, if you want the jobs history for that employee for any department:
select emps.employee_id,
emps.first_name,
dep.department_name,
dep.department_id,
jh.job_id
from employees emps
inner join departments dep
on ( emps.department_id = dep.department_id )
inner join job_history jh
on ( emps.employee_id = jh.employee_id );
Or, if you want the job history for that employee within that department:
select emps.employee_id,
emps.first_name,
dep.department_name,
dep.department_id,
jh.job_id
from employees emps
inner join departments dep
on ( emps.department_id = dep.department_id )
inner join job_history jh
on ( emps.employee_id = jh.employee_id
and emps.department_id = jh.department_id );

SQLite Multiple Subqueries Logic

The problem is asking for one to write a query to find the names (first_name, last_name) of the employees who have a manager who works for a department based in the United States. Here is a link to the problem, to see the tables, https://www.w3resource.com/sqlite-exercises/sqlite-subquery-exercise-3.php.
For the subquery, I did a left join on the location id between the department and location tables then I selected 'US' for the country_id, and returned the manager_id
For the outer query, I chose the Employee table, selected manager_ids from sub-query list.
SELECT first_name, last_name
FROM Employees
WHERE manager_id IN (SELECT manager_id
FROM Departments d LEFT JOIN Locations l ON d.location_id = l.location_id
WHERE country_id = 'US')
ORDER BY first_name;
With my code, I do not get the correct answer, the same results as the result-set/output shown on the website.
There are three subqueries in total in the correct answer. I do not understand what the purpose of including the subquery involving the employees table (outermost subquery). I understand that is where I messed up but don't understand why.
SELECT first_name, last_name
FROM employees
WHERE manager_id IN
(SELECT employee_id
FROM employees
WHERE department_id IN
(SELECT department_id
FROM departments
WHERE location_id IN
(SELECT location_id
FROM locations
WHERE country_id='US')));
You need to join all the tables:
select e.first_name, e.last_name
from employees e
inner join employees m on m.employee_id = e.manager_id
inner join departments d on d.department_id = m.department_id
inner join locations l on l.location_id = d.location_id
where l.country_id='US'

Subquery using 3 Tables SQL

I'm trying to display the last name of the lowest paid employees from each city. The city column falls under a table titled LOCATIONS while employee information(salary, last name) falls under EMPLOYEES. Both of these tables are related share no common table, so I have to rely on a third table, DEPARTMENTS to connect the two as DEPARTMENTS contains a department_id that it shares with EMPLOYEES as well as a LOCATION_ID that it shares with LOCATIONS. This is what I have so far, but I'm having trouble with this as I've mostly worked with only two tables in the past.
SELECT LAST_NAME
FROM EMPLOYEES
WHERE (DEPARTMENT_ID) IN
(SELECT DEPARTMENT_ID
FROM DEPARTMENTS
WHERE LOCATION_ID IN
(SELECT LOCATION_ID
FROM LOCATIONS
GROUP BY CITY
HAVING MIN(SALARY)));
This seems to be an assignment in an intro course in SQL. So let's assume you can't use analytic functions, match_recognize clause, etc. Just joins and aggregates.
In the subquery in the WHERE clause below, we compute the min salary for each city. We need to join all three tables for this. Then in the overall query we join the three tables again, and we use the subquery for an IN condition (a semi-join). The overall query looks like this:
select e.last_name
from employees e join departments d
on e.department_id = d.department_id
join locations l
on d.location_id = l.location_id
where ( e.salary, l.city ) in
(
select min(salary), city
from employees e join departments d
on e.department_id = d.department_id
join locations l
on d.location_id = l.location_id
group by city
)
;
You should separate out the concept of table joins from WHERE clauses.
Use WHERE for filtering data, use JOIN for connecting data together.
I think this is what you are wanting. By the way, lose the ALL CAPS if you can.
SELECT
LAST_NAME
FROM
EMPLOYEES
INNER JOIN (
SELECT
DEPARTMENTS.DEPARTMENT_ID,
CITY,
MIN(SALARY) AS LOWEST_SALARY
FROM
EMPLOYEES
INNER JOIN DEPARTMENTS ON EMPLOYEES.DEPARTMENT_ID = DEPARTMENTS.DEPARTMENT_ID
INNER JOIN LOCATIONS ON DEPARTMENTS.LOCATION_ID = LOCATIONS.LOCATION_ID
GROUP BY
DEPARTMENTS.DEPARTMENT_ID,
LOCATIONS.CITY
) AS MINIMUM_SALARIES
ON EMPLOYEES.DEPARTMENT_ID = MINIMUM_SALARIES.DEPARTMENT_ID
AND EMPLOYEES.SALARY = MINIMUM_SALARIES.LOWEST_SALARY
First of all join the tables, so you see city and employee in one row. If we group by city we get the minimum salary per city.
with city_employees as
(
select l.city, e.*
from locations l
join departments d using (location_id)
join employees e using (department_id)
)
select last_name
from city_employees
where (city, salary) in
(
select city, min(salary)
from city_employees
group by l.city
);
It is easier to achieve the same, however, with window functions (min over or rank over here).
select last_name
from
(
select
e.last_name,
e.salary,
min(e.salary) over (partition by l.city) as min_salary
from locations l
join departments d using (location_id)
join employees e using (department_id)
)
where salary = min_salary;

SQL query to find SUM

I'm new too SQL and I've been struggling to write this query. I want to find the SUM of all salaries for employees in a give department, let's say 'M', and a given hire date, let's say '2002', any ideas? I'm thinking I have to JOIN the tables somehow but having trouble, I've set up a schema like this.
jobs table and columns
JOBS
------------
job_id
salary
hire_date
employees table and columns
EMPLOYEES
------------
employee_id
name
job_id
department_id
department table and columns
DEPARTMENTS
------------
department_id
department_name
This is very similar to the way the HR schema does it in Oracle so I think the schema should be OK just need help with the query now.
You need a statement like this:
SELECT e.name,
d.department_name,
SUM(j.salary)
FROM employees e,
departments d,
jobs j
WHERE d.department_name = 'M'
AND TO_CHAR(j.hire_date, 'YYYY') = '2002'
AND d.department_id = e.department_id
AND e.job_id = j.job_id
GROUP BY e.name,
d.department_name;
FWIW, you shouldn't use the old ANSI-89 implicit join notation (using ,). It is considered as deprecated since the ANSI-92 standard (more than 20 yers ago!) and some vendors start dropping its support (MS SQL Server 2008; I don't know if there is a deprecation warning for this "feature" with Oracle?).
So, as a newcomer, you shouldn't learn bad habits from the start.
With the "modern" syntax, your query should be written:
SELECT e.name,
d.department_name,
SUM(j.salary)
FROM employees e
JOIN departments d USING(department_id)
JOIN jobs j USING(job_id)
WHERE d.department_name = 'M'
AND TO_CHAR(j.hire_date, 'YYYY') = '2002'
GROUP BY e.name, d.department_name;
With that syntax, there is a clear distinction between the JOIN relation (USING or ON) and the filter clause WHERE. It will later ease things when you will encounter "advanced" joins such as OUTER JOIN.
Yes, you just need a simple inner JOIN between all three tables.
SELECT SUM(salary)
FROM JOBS j
JOIN EMPLOYEES e ON j.job_id = e.job_id
JOIN DEPARTMENTS d ON e.department_id = d.department_id
WHERE d.department_name = 'M'
AND e.hire_date = 2002

subquery exercise

I need to write a query that contains a subquery where it would list the name of departments and the number of employees per department having the word 'Representative' in their job_title and the list must be ordered by department_id.
I've written this query
SELECT d.department_name, emp.employee_id
FROM departments d, employees emp, jobs j
WHERE emp.department_id=d.department_id
AND j.job_title LIKE '%Representative%';
If there is a requirement to use a subquery, then the following will achieve what you're looking for:
select d.department_id,
d.department_name,
(select count(*)
from employees emp
join jobs j
on j.job_id = emp.employee_job -- I've made some assumptions, here!
where emp.department_id = d.department_id
and j.job_title like '%Representative%') reps
from departments d
order by d.department_id;
Personally, however, I would use a query like this:
select d.department_id,
d.department_name,
count(emp.employee_id) reps
from departments d
join employees emp
on emp.department_id = d.department_id
join jobs j
on j.job_id = emp.employee_job -- Same assumption as before!
where j.job_title like '%Representative%'
group by d.department_id,
d.department_name
order by d.department_id;
I find it easier to read/interpret, but that's ultimately up to you.
You don't need a subquery for this. Use a simple JOIN.
SELECT d.department_name, COUNT(*) AS cnt
FROM employee e JOIN department d
ON e.department_id = d.department_id
JOIN jobs j ON e.jobid = j.jobid
WHERE j.job_title LIKE '%Representative%'
GROUP BY d.department_name
Or, if it is just an exercise for you, I would suggest using the following query:
SELECT d.department_name
, (SELECT COUNT(e.*)
FROM employees e JOIN jobs j ON e.jobid = j.jobid
WHERE e.department_id = d.department_id
AND j.job_title LIKE '%Representative%') AS cnt
FROM departments d
In the real world, however, you code for convenience, not just for exercise. Your code should be convenient to read, understand and maintain for all those who are involved in your software development process. If it is just an exercise then you can use the second query. But if you have to use the query in a live application, the approach in the first query is better for everyone around you.
There is some join logic missing in your example, but something like this may work for you:
select d.department_name, count(emp.employee_id)
from departments d, employees emp, jobs j
where j.job_title in (select job_title from jobs where job_title like '%Representative%')
group by d.department_name
The SQL may not be 100% correct but you can see the point. it's hard to complete it without all of the join logic.
This should return you all of the department names and the employee count where the employee job title contains representative.