Minimizing SQL queries using join with one-to-many relationship

So let me preface this by saying that I'm not an SQL wizard by any means. What I want to do is simple as a concept, but has presented me with a small challenge when trying to minimize the number of database queries I'm performing.
Let's say I have a table of departments. Within each department is a list of employees.
What is the most efficient way of listing all the departments and which employees are in each department?
So for example if I have a department table with:
id name
1 sales
2 marketing
And a people table with:
id department_id name
1 1 Tom
2 1 Bill
3 2 Jessica
4 1 Rachel
5 2 John
What is the best way to list all departments and all employees for each department, like so:
Sales
Tom
Bill
Rachel
Marketing
Jessica
John
Pretend both tables are actually massive. (I want to avoid getting a list of departments, and then looping through the result and doing an individual query for each department). Think similarly of selecting the statuses/comments in a Facebook-like system, when statuses and comments are stored in separate tables.

You can get it all in a single query with a simple join, e.g.:
SELECT d.name AS department, p.name AS name
FROM department d
LEFT JOIN people p ON p.department_id = d.id
ORDER BY department;
This returns all the data, but it's a bit of a pain to consume, since you still have to iterate through every row and notice where one department ends and the next begins. In MySQL you can go further and let the database group the names together with GROUP_CONCAT:
SELECT d.name AS department,
       GROUP_CONCAT(p.name SEPARATOR ', ') AS name
FROM department d
LEFT JOIN people p ON p.department_id = d.id
GROUP BY d.id, d.name;
You'll get something like this as the output:
department | name
-----------|----------------
sales | Tom, Bill, Rachel
marketing | Jessica, John
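If you would rather keep the first, ungrouped query, the flat rows can be folded into per-department lists in a single pass on the client. Here is a minimal sketch of that idea, assuming Python with an in-memory SQLite database standing in for the real one; the function name departments_with_people is made up for illustration:

```python
import sqlite3
from itertools import groupby


def departments_with_people(conn):
    """Return [(department, [person, ...]), ...] using one query."""
    rows = conn.execute(
        """
        SELECT d.id, d.name, p.name
        FROM department d
        LEFT JOIN people p ON p.department_id = d.id
        ORDER BY d.id, p.id
        """
    ).fetchall()
    result = []
    # Rows arrive sorted by department id, so groupby sees each department once.
    for (_, dept_name), group in groupby(rows, key=lambda row: (row[0], row[1])):
        # An empty department yields one row whose person column is NULL (None).
        people = [person for _, _, person in group if person is not None]
        result.append((dept_name, people))
    return result


# Sample data from the question, loaded into a throwaway database.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE department (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE people (id INTEGER PRIMARY KEY, department_id INTEGER, name TEXT);
    INSERT INTO department VALUES (1, 'sales'), (2, 'marketing');
    INSERT INTO people VALUES (1, 1, 'Tom'), (2, 1, 'Bill'), (3, 2, 'Jessica'),
                              (4, 1, 'Rachel'), (5, 2, 'John');
    """
)

for dept, names in departments_with_people(conn):
    print(dept, names)
```

The ORDER BY matters here: groupby only merges adjacent rows, so the rows have to arrive sorted by department.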

SELECT d.name, p.name
FROM department d
JOIN people p ON p.department_id = d.id
I suggest also reading a SQL Join tutorial or three. This is a very common and very basic SQL concept that you should understand thoroughly.

This is normally done in a single query:
SELECT DepartmentTable.Name, People.Name
FROM DepartmentTable
INNER JOIN People
ON DepartmentTable.id = People.department_id
ORDER BY DepartmentTable.Name
This will suppress empty departments. If you want to show empty departments, change INNER to LEFT OUTER.
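To see the difference between the two join types, here is a small sketch, assuming Python's sqlite3 module with an in-memory database. The 'legal' department is a made-up row with no people in it, added only to show the effect:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE DepartmentTable (id INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE People (id INTEGER PRIMARY KEY, department_id INTEGER, Name TEXT);
    -- 'legal' is a hypothetical department with no people.
    INSERT INTO DepartmentTable VALUES (1, 'sales'), (2, 'marketing'), (3, 'legal');
    INSERT INTO People VALUES (1, 1, 'Tom'), (2, 2, 'Jessica');
    """
)

QUERY = """
    SELECT DepartmentTable.Name, People.Name
    FROM DepartmentTable
    {join_type} JOIN People ON DepartmentTable.id = People.department_id
    ORDER BY DepartmentTable.Name
"""

# INNER JOIN: only departments that have at least one matching person.
inner_rows = conn.execute(QUERY.format(join_type="INNER")).fetchall()
# LEFT OUTER JOIN: every department; the person column is None when empty.
outer_rows = conn.execute(QUERY.format(join_type="LEFT OUTER")).fetchall()

print(inner_rows)
print(outer_rows)
```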

Related

How to make sure result pairs are unique - without using distinct?

I have three tables I want to iterate over. The tables are pretty big so I will show a small snippet of the tables. First table is Students:
id | name        | address
---|-------------|---------
1  | John Smith  | New York
2  | Rebeka Jens | Miami
3  | Amira Sarty | Boston
Second one is TakingCourse. This is the course the students are taking, so student_id is the id of the one in Students.
id | student_id | course_id
---|------------|----------
20 | 1          | 26
19 | 2          | 27
18 | 3          | 28
The last table is Courses. Its id is the same as the course_id in the previous table. These are the courses the students are taking, and the table looks like this:
id | type
---|--------
26 | History
27 | Maths
28 | Science
I want to return a table with the location (address) and the type of courses that are taken there. So the results table should look like this:
address | type
The pairs should be unique, and that is what's going wrong. I tried this:
select S.address, C.type
from Students S, Courses C, TakingCourse TC
where TC.course_id = C.id
and S.id = TC.student_id
And this does work, but the pairs are not all unique. I tried select distinct and it's still the same.
Multiple students can (and will) reside at the same address. So don't expect unique results from this query.
Only an overview is needed, so that's why I don't want duplicates.
So fold duplicates. Simple way with DISTINCT:
SELECT DISTINCT s.address, c.type
FROM students s
JOIN takingcourse t ON s.id = t.student_id
JOIN courses c ON t.course_id = c.id;
Or to avoid DISTINCT (why would you for this task?) and, optionally, get counts, too:
SELECT c.type, s.address, count(*) AS ct
FROM students s
JOIN takingcourse t ON s.id = t.student_id
JOIN courses c ON t.course_id = c.id
GROUP BY c.type, s.address
ORDER BY c.type, s.address;
A missing UNIQUE constraint on takingcourse(student_id, course_id) could be an additional source of duplicates. See:
How to implement a many-to-many relationship in PostgreSQL?
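A quick way to compare the plain join, the DISTINCT version and the GROUP BY version side by side is to run all three on a tiny data set. This is only a sketch, assuming Python's sqlite3 module as a stand-in database. The fourth student is hypothetical and was added so that one (address, type) pair actually occurs twice:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT, address TEXT);
    CREATE TABLE courses (id INTEGER PRIMARY KEY, type TEXT);
    CREATE TABLE takingcourse (
        id INTEGER PRIMARY KEY,
        student_id INTEGER REFERENCES students(id),
        course_id INTEGER REFERENCES courses(id),
        UNIQUE (student_id, course_id)  -- rules out duplicate enrolments
    );
    INSERT INTO students VALUES
        (1, 'John Smith', 'New York'),
        (2, 'Rebeka Jens', 'Miami'),
        (3, 'Amira Sarty', 'Boston'),
        (4, 'Hypothetical Student', 'New York');  -- not in the question
    INSERT INTO courses VALUES (26, 'History'), (27, 'Maths'), (28, 'Science');
    INSERT INTO takingcourse VALUES (20, 1, 26), (19, 2, 27), (18, 3, 28), (17, 4, 26);
    """
)

JOINS = """
    FROM students s
    JOIN takingcourse t ON s.id = t.student_id
    JOIN courses c ON t.course_id = c.id
"""

# Plain join: one row per enrolment, so New York / History shows up twice.
plain = conn.execute(
    f"SELECT s.address, c.type {JOINS} ORDER BY s.address, c.type"
).fetchall()

# DISTINCT folds identical (address, type) pairs into one row.
distinct = conn.execute(
    f"SELECT DISTINCT s.address, c.type {JOINS} ORDER BY s.address, c.type"
).fetchall()

# GROUP BY folds them as well and can report how many rows were folded.
counted = conn.execute(
    f"SELECT c.type, s.address, count(*) AS ct {JOINS} "
    "GROUP BY c.type, s.address ORDER BY c.type, s.address"
).fetchall()

print(plain)
print(distinct)
print(counted)
```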

INNER JOIN and Count POSTGRESQL

I am learning PostgreSQL and inner joins. I have the following tables.
Employee
Id Name DepartmentId
1 John S. 1
2 Smith P. 1
3 Anil K. 2
Department
Id Name
1 HR
2 Admin
I want a query that returns the department name and the number of employees in each department.
SELECT Department.name , COUNT(Employee.id) FROM Department INNER JOIN Employee ON Department.Id = Employee.DepartmentId Group BY Employee.department_id;
I don't know what I did wrong, as I am new to database queries.
When the query involves all rows or major parts of the "many" table, it's typically faster to aggregate first and join later. That is certainly the case here, since we are after counts for "each department" and there is no WHERE clause at all.
SELECT d.name, COALESCE(e.ct, 0) AS nr_employees
FROM department d
LEFT JOIN (
SELECT department_id AS id, count(*) AS ct
FROM employee
GROUP BY department_id
) e USING (id);
I also made it a LEFT [OUTER] JOIN, to keep departments without any employees in the result, and added COALESCE to report 0 employees instead of NULL in that case.
Related, with more explanation:
Query with LEFT JOIN not returning rows for count of 0
Your original query would work too, after fixing the GROUP BY clause:
SELECT department.name, COUNT(employee.id)
FROM department
INNER JOIN employee ON department.id = employee.department_id
GROUP BY department.id;  -- changed: group by the department's primary key
That's assuming department.id is the PRIMARY KEY of the table, in which case it covers all columns of that table, including department.name. And you may want LEFT JOIN like above.
Aside: Consider legal, lower-case names exclusively in Postgres. See:
Are PostgreSQL column names case-sensitive?
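Both variants above can be tried out on the question's data. The following is a rough sketch, assuming Python's sqlite3 module as a stand-in for Postgres (the SQL used here is accepted by both). The 'Archive' department is hypothetical and has no employees, so the difference between the two queries becomes visible:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE department (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, department_id INTEGER);
    -- 'Archive' is a made-up department without employees.
    INSERT INTO department VALUES (1, 'HR'), (2, 'Admin'), (3, 'Archive');
    INSERT INTO employee VALUES (1, 'John S.', 1), (2, 'Smith P.', 1), (3, 'Anil K.', 2);
    """
)

# Aggregate first, join later: the subquery has one row per department_id.
aggregate_first = conn.execute(
    """
    SELECT d.name, COALESCE(e.ct, 0) AS nr_employees
    FROM department d
    LEFT JOIN (
        SELECT department_id AS id, count(*) AS ct
        FROM employee
        GROUP BY department_id
    ) e USING (id)
    ORDER BY d.id
    """
).fetchall()

# Join first, then group by the primary key; the INNER JOIN drops 'Archive'.
join_then_group = conn.execute(
    """
    SELECT department.name, COUNT(employee.id)
    FROM department
    INNER JOIN employee ON department.id = employee.department_id
    GROUP BY department.id
    ORDER BY department.id
    """
).fetchall()

print(aggregate_first)
print(join_then_group)
```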

Postgresql join tables to several columns

I have a list of students' names in a table called Names, and I want to find their categories from another table called Categories, shown below:
Class_A Class_B Class_C Class_D Category
Sam Adam High
Sarah Medium
James High
Emma Simon Nick Low
My idea is to do a left join, but a student's name from the first table may match any one of the four class columns, so I am not sure how to write the query. At the moment my query only matches against Class_A, while I need to check all four columns and return the category if the student's name exists in any of them.
(Note: some rows have more than one student's name)
SELECT Names.name, Categories.Category
FROM Names
LEFT JOIN Categories ON Names.name = Categories.Class_A;
Table Names looks like this:
Name
----
Emma
Nick
James
Adam
Jack
Sarah
And I am expecting an output as below:
Name Category
---- ----
Emma Low
Nick Low
James High
Adam High
Jack -
Sarah Medium
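I would be inclined to unpivot the Categories table. This looks like:
select n.name, c.category
from names n left join
     (categories c cross join lateral
      (values (c.class_a), (c.class_b), (c.class_c), (c.class_d)
      ) v(name)
     )
     on n.name = v.name;
Empty (NULL) class columns never match a name, so they need no extra filter, and the left join keeps names without any match (Jack) with a NULL category.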
Although you can also solve this using IN (or OR) in the ON clause, that may produce a much less efficient execution plan.
Try this, using OR in the ON clause:
SELECT Names.name, coalesce(Categories.Category, '-') AS category
FROM Names
LEFT JOIN Categories
       ON Names.name = Categories.Class_A
       OR Names.name = Categories.Class_B
       OR Names.name = Categories.Class_C
       OR Names.name = Categories.Class_D;
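For databases without LATERAL, the same unpivot idea can be written with UNION ALL. Here is a hedged sketch of that variant (a swapped-in technique, not the LATERAL query from the answer above), assuming Python's sqlite3 module. The question does not say which class column each name sits in, so the placement in the INSERT below is an assumption:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE names (name TEXT);
    CREATE TABLE categories (
        class_a TEXT, class_b TEXT, class_c TEXT, class_d TEXT, category TEXT
    );
    INSERT INTO names VALUES ('Emma'), ('Nick'), ('James'), ('Adam'), ('Jack'), ('Sarah');
    -- Column placement of the names is assumed; the question only shows the values.
    INSERT INTO categories VALUES
        ('Sam',   'Adam',  NULL,   NULL, 'High'),
        ('Sarah', NULL,    NULL,   NULL, 'Medium'),
        ('James', NULL,    NULL,   NULL, 'High'),
        ('Emma',  'Simon', 'Nick', NULL, 'Low');
    """
)

rows = conn.execute(
    """
    WITH unpivoted AS (
        -- One (name, category) row per class column.
        SELECT class_a AS name, category FROM categories
        UNION ALL SELECT class_b, category FROM categories
        UNION ALL SELECT class_c, category FROM categories
        UNION ALL SELECT class_d, category FROM categories
    )
    SELECT n.name, COALESCE(u.category, '-') AS category
    FROM names n
    LEFT JOIN unpivoted u ON u.name = n.name  -- NULL names never match
    ORDER BY n.rowid
    """
).fetchall()

for name, category in rows:
    print(name, category)
```

Once the class columns are stacked into a single name column, the join condition is a plain equality again, which is the point of unpivoting.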

oracle sql nested select in select

SELECT idemployee, lastname, firstname,
(SELECT namedep FROM department
WHERE numdep = 120) depname
FROM employee
WHERE numdep = 120;
What does the statement return?
How does the nested select impact the result?
This is called a scalar subquery. It is not correlated, because it does not reference any column of the outer query. The same lookup is more commonly written as an explicit join:
SELECT e.idemployee, e.lastname, e.firstname, d.namedep
FROM employee e
LEFT JOIN department d ON e.numdep = d.numdep
WHERE e.numdep = 120;
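These two formulations are not exactly the same. If department held more than one row with numdep = 120, the scalar subquery would raise an error (ORA-01427 in Oracle), while the join would return each employee once per matching row. As long as numdep is unique in department, they return the same results.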
This is a really simple query. Without examples of the tables and the data in them, we can't tell you exactly what the query would return, but it should be something like this:
idemployee | lastname | firstname | depname
-----------|----------|-----------|-----------
1          | Smith    | John      | Sanitation
2          | Doe      | Jane      | Sanitation
Breaking it down: SELECT namedep FROM department WHERE numdep = 120 gets the department name for department #120. This then shows up in the last column of each row of the main query.
SELECT idemployee, lastname, firstname FROM employee WHERE numdep = 120 seems to be simply selecting the employee id, last name and first name from the employee table for everyone in department 120.
You could write this much more simply as
SELECT e.idemployee Employee_ID, e.lastname, e.firstname, d.namedep Department_name
FROM employee e inner join department d on e.numdep = d.numdep
WHERE d.numdep = 120
This returns the same results as long as department 120 exists in the department table (with the inner join, a missing department row means no employees are returned at all). It should also be safer performance-wise, and it is much easier to change so that it runs for different departments, multiple departments or all departments.
I simply joined the two tables together on their common column, numdep, and selected what I wanted from each table. I also added aliases (e, d) so I didn't have to type out employee and department to specify which table I was referring to.
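The two formulations can be compared directly on a toy data set. This is a sketch under stated assumptions: it runs on SQLite through Python's sqlite3 module rather than on Oracle, the sample rows are invented, and employee 3 belongs to a department number (130) that deliberately has no row in department:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE department (numdep INTEGER PRIMARY KEY, namedep TEXT);
    CREATE TABLE employee (
        idemployee INTEGER PRIMARY KEY, lastname TEXT, firstname TEXT, numdep INTEGER
    );
    -- Invented sample rows; department 130 intentionally does not exist.
    INSERT INTO department VALUES (120, 'Sanitation');
    INSERT INTO employee VALUES
        (1, 'Smith', 'John', 120),
        (2, 'Doe', 'Jane', 120),
        (3, 'Roe', 'Richard', 130);
    """
)

# Scalar subquery in the SELECT list: evaluated independently of the outer row.
SUBQUERY_SQL = """
    SELECT idemployee, lastname, firstname,
           (SELECT namedep FROM department WHERE numdep = :dep) AS depname
    FROM employee
    WHERE numdep = :dep
    ORDER BY idemployee
"""

# Inner join version: needs a matching department row to return anything.
JOIN_SQL = """
    SELECT e.idemployee, e.lastname, e.firstname, d.namedep
    FROM employee e
    INNER JOIN department d ON e.numdep = d.numdep
    WHERE d.numdep = :dep
    ORDER BY e.idemployee
"""


def run(sql, dep):
    """Execute one of the queries above for the given department number."""
    return conn.execute(sql, {"dep": dep}).fetchall()


print(run(SUBQUERY_SQL, 120))
print(run(JOIN_SQL, 120))
print(run(SUBQUERY_SQL, 130))
print(run(JOIN_SQL, 130))
```

For department 120 both queries agree. For the missing department 130 the subquery version still lists the employee with an empty department name, while the inner join returns nothing.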

Combining tables in SQL/QlikView

Is it possible to combine two tables with a join or a similar construct so that all non-matching rows end up in one group? Something like this:
All employees with a department name get their real department, and all those with no department end up in the group "Other".
Department:
SectionDesc ID
Dep1 500
Dep2 501
Employee:
Name ID
Anders 500
Erik 501
root 0
Output:
Anders Dep1
Erik Dep2
root Other
Best Regards Anders Olme
What you are looking for is an outer join:
SELECT e.name, d.name
FROM employee e
LEFT OUTER JOIN departments d ON e.deptid = d.deptid
This would give you a d.name of NULL for every employee without a department. You can change this to 'Other' with something like this:
CASE WHEN d.name IS NULL THEN 'Other' ELSE d.name END
(Other, simpler versions for different DBMSs exist, but this should work for most.)
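Put together with the table and column names from the question (Department.SectionDesc, Employee.ID), the outer join plus CASE might look like the sketch below. It assumes Python's sqlite3 module as a stand-in database. The COALESCE form is one of the simpler versions mentioned above and is part of standard SQL:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Department (SectionDesc TEXT, ID INTEGER PRIMARY KEY);
    CREATE TABLE Employee (Name TEXT, ID INTEGER);
    INSERT INTO Department VALUES ('Dep1', 500), ('Dep2', 501);
    INSERT INTO Employee VALUES ('Anders', 500), ('Erik', 501), ('root', 0);
    """
)

# LEFT OUTER JOIN keeps 'root'; CASE turns its NULL department into 'Other'.
with_case = conn.execute(
    """
    SELECT e.Name,
           CASE WHEN d.SectionDesc IS NULL THEN 'Other' ELSE d.SectionDesc END
    FROM Employee e
    LEFT OUTER JOIN Department d ON e.ID = d.ID
    ORDER BY e.Name
    """
).fetchall()

# Same result with COALESCE, which returns its first non-NULL argument.
with_coalesce = conn.execute(
    """
    SELECT e.Name, COALESCE(d.SectionDesc, 'Other')
    FROM Employee e
    LEFT OUTER JOIN Department d ON e.ID = d.ID
    ORDER BY e.Name
    """
).fetchall()

print(with_case)
print(with_coalesce)
```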
QlikView is a bit tricky, as all joins in QlikView are inner joins by default. There is some discussion of the different joins in the online help; the short version is that you can create a new table based on different joins in the script that reads in your data. So you could have something like this in your script:
Emps: SELECT * FROM EMPLOYEES;
Deps: SELECT * FROM DEPARTMENTS;
/* or however else you get your data into QlikView */
EmpDep:
SELECT Emps.name, Deps.name
FROM EMPS LEFT JOIN Deps
In order for this join to work the column names for the join have to be the same in both tables. (If necessary, you can construct new columns for the join when loading the base tables.)