Self Joining between 2 same tables


For instance, I have a table called employees that consists of "Employee ID", "First Name", "Last Name" and "Manager ID". To count the subordinates of each manager, I tried self-joining the table:
SELECT e1.first_name, e1.last_name, COUNT(e1.employee_id)
FROM employee e1 INNER JOIN e2 ON e1.employee_id = e2.manager_id
GROUP BY e1.first_name, e1.last_name
Am I right?
Also, if I want to join with other tables after self-joining, is the joining statement right?
FROM ((self-joining) INNER JOIN other tables ON "common column")
Combining the first and last name:
SELECT CONCAT(e1.first_name,' ',e1.last_name) "Full Name", COUNT(e1.employee_id)
FROM employee e1 INNER JOIN e2 ON e1.employee_id = e2.manager_id
GROUP BY "Full Name"
I can't get this to run. What is wrong?

Every non-aggregated column that is referenced in the SELECT clause must be listed in the GROUP BY clause.
Also, in Oracle you can concatenate strings with the || operator; the CONCAT function accepts only two arguments in Oracle.
So your query should look like this:
SELECT
-- concatenation using ||
E1.FIRSTNAME
|| ' '
|| E1.LASTNAME AS MANAGERNAME,
COUNT(E1.EMPLOYEEID)
FROM
EMPLOYEES E1
INNER JOIN EMPLOYEES E2 ON E1.EMPLOYEEID = E2.MANAGERID
GROUP BY
-- both of the columns are used in the select clause and must be used in the GROUP BY clause
E1.FIRSTNAME,
E1.LASTNAME;
Cheers!!

An answer for your first question is just a minor tweak to your query:
SELECT e1.firstname AS managerFirstName, e1.lastname AS managerLastName, COUNT(e1.employeeid)
FROM employees e1 INNER JOIN employees e2 ON e1.employeeId = e2.managerId
GROUP BY e1.firstname, e1.lastname;
I really haven't changed much here - and you can see that it works at: http://sqlfiddle.com/#!9/187477/1
The essential change is sticking with the names coming from the same table reference (e1) and GROUPing BY the same fields. You also need to (as commented) indicate the table name before aliasing with "e2".
(Note that the aliasing of the names is just to help indicate that these are managers, it's not an essential part of the query. Also, I used slightly different field names, but the logic is the same.)
As to your second question, I'd do it using the self-join query as a sub-query, more or less as you suggest. Try something out - you're essentially at a solution.
EDIT IN RESPONSE TO QUESTION EDIT:
Adding the concatenation in (note that Oracle's CONCAT function takes exactly two arguments, so nesting CONCAT calls is one possible workaround; there's more info at this answer: Oracle SQL, concatenate multiple columns + add text):
SELECT CONCAT(CONCAT(e1.firstname, ' '), e1.lastname) AS managerName, COUNT(e1.employeeid)
FROM employees e1 INNER JOIN employees e2 ON e1.employeeId = e2.managerId
GROUP BY e1.firstname, e1.lastname;
still works: http://sqlfiddle.com/#!4/b0cbcd/4
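To try the subordinate count without an Oracle instance, here is a minimal sketch that runs the same kind of self-join in an in-memory SQLite database from Python. The sample rows are hypothetical, and SQLite is an assumption made only so the example is self-contained; it uses || for concatenation, as Oracle does. The sketch also groups by the manager's ID (an addition that is not in the answers above) so that two managers who happen to share a name are not merged into one row.

```python
import sqlite3

# In-memory SQLite database; the table layout and sample rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees ("
    "employeeid INTEGER PRIMARY KEY, firstname TEXT, lastname TEXT, managerid INTEGER)"
)
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?, ?)",
    [
        (1, "Ann", "Lee", None),  # top of the hierarchy, no manager
        (2, "Bob", "Ray", 1),
        (3, "Cid", "Fox", 1),
        (4, "Dee", "Kim", 2),
    ],
)

# e1 plays the manager role, e2 the subordinate role.
subordinate_counts = conn.execute(
    """
    SELECT e1.firstname || ' ' || e1.lastname AS managername,
           COUNT(e2.employeeid) AS subordinates
    FROM employees e1
    INNER JOIN employees e2 ON e1.employeeid = e2.managerid
    GROUP BY e1.employeeid, e1.firstname, e1.lastname
    ORDER BY e1.employeeid
    """
).fetchall()

for name, count in subordinate_counts:
    print(name, count)
```

Because this is an INNER JOIN, employees who manage nobody (Cid and Dee in the sample data) do not appear in the result at all.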

Related

SQL doesn't display rows where

The following sql query selects the employee name (from employee table), their manager's name (from manager table) and their performance (from rating table). However, if an employee's manager_id is missing, then it doesn't list that employee at all when outputting rows. Is there any way around this? Probably involving joins but not too sure. Thanks in advance :)
SELECT employee.name,
manager.name,
rating.performance
FROM employee,
manager,
rating
WHERE employee.manager_id = manager.id
AND rating.employee_id = employee.id;

Never use commas in the FROM clause. Always use proper, explicit, standard JOIN syntax. In this case, you want a LEFT JOIN:
SELECT e.name, m.name, r.performance
FROM employee e LEFT JOIN
manager m
ON e.manager_id = m.id LEFT JOIN
rating r
ON r.employee_id = e.id;
Notice that this also adds table aliases, so the query is easier to write and to read.
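The difference between the two join types can be checked with a small sketch like the following. It uses Python's sqlite3 module with an in-memory database; the three tables mirror the ones in the question, but the rows are made up for illustration.

```python
import sqlite3

# Hypothetical sample data: Bea has no manager (manager_id is NULL).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE manager  (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, manager_id INTEGER);
    CREATE TABLE rating   (employee_id INTEGER, performance TEXT);

    INSERT INTO manager  VALUES (10, 'Mia');
    INSERT INTO employee VALUES (1, 'Al', 10), (2, 'Bea', NULL);
    INSERT INTO rating   VALUES (1, 'good'), (2, 'great');
    """
)

# Inner joins drop Bea because no manager row matches a NULL manager_id.
inner_rows = conn.execute(
    """
    SELECT e.name, m.name, r.performance
    FROM employee e
    JOIN manager m ON e.manager_id = m.id
    JOIN rating r ON r.employee_id = e.id
    ORDER BY e.id
    """
).fetchall()

# Left joins keep every employee and fill the missing manager with NULL.
left_rows = conn.execute(
    """
    SELECT e.name, m.name, r.performance
    FROM employee e
    LEFT JOIN manager m ON e.manager_id = m.id
    LEFT JOIN rating r ON r.employee_id = e.id
    ORDER BY e.id
    """
).fetchall()

print(inner_rows)
print(left_rows)
```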

By using a LEFT JOIN you get all rows of the "left" table, even those that cannot be paired with any row in the joined tables.
SELECT
employee.name,
manager.name,
rating.performance
FROM
employee LEFT JOIN
manager ON employee.manager_id = manager.id LEFT JOIN
rating ON employee.id = rating.employee_id

Join a table in SQL based off two columns?

I have two tables:
Employees (columns: ID, Name)
and
employee partners (EmployeeID1, EmployeeID2, Time)
I want to output EmployeeName1, EmployeeName2, Time instead of the employee IDs.
(In other words, replace the ids with names, but in two columns at a time)
How would I do this? Would JOIN be the appropriate command?

You need to join the Employees table twice, because the employee partners table acts as a many-to-many link between employees.
The select should be:
SELECT emp1.name, emp2.name, em.time
FROM Employees emp1
JOIN employee_partners em ON emp1.id = EmployeeID1
JOIN Employees emp2 on emp2.id = EmployeeID2

Often in these situations, you want to use LEFT JOIN:
SELECT e1.name as name1, e2.name as name2, ep.time
FROM employee_partners ep LEFT JOIN
Employees e1
ON e1.id = ep.EmployeeID1 LEFT JOIN
Employees e2
ON e2.id = ep.EmployeeID2;
Notes:
The LEFT JOINs ensure that you do not lose rows if either of the employee columns is NULL.
Use tables aliases; they make the query easier to write and to read.
Qualify all columns names; that is, include the table name so you know where the column is coming from.
I also added column aliases so you can distinguish between the names.
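As a rough check of the double join, the sketch below runs it against an in-memory SQLite database from Python. The rows are hypothetical; the second partnership deliberately has a NULL in EmployeeID2 to show what the LEFT JOINs preserve.

```python
import sqlite3

# Hypothetical sample data for the two tables described in the question.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Employees (ID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE employee_partners (EmployeeID1 INTEGER, EmployeeID2 INTEGER, Time TEXT);

    INSERT INTO Employees VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO employee_partners VALUES (1, 2, '09:00'), (2, NULL, '10:00');
    """
)

# Employees is joined twice, once per ID column, under two different aliases.
partner_rows = conn.execute(
    """
    SELECT e1.Name AS name1, e2.Name AS name2, ep.Time
    FROM employee_partners ep
    LEFT JOIN Employees e1 ON e1.ID = ep.EmployeeID1
    LEFT JOIN Employees e2 ON e2.ID = ep.EmployeeID2
    ORDER BY ep.Time
    """
).fetchall()

for row in partner_rows:
    print(row)
```

With plain JOINs instead of LEFT JOINs, the second row would be dropped because its EmployeeID2 matches no employee.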

A strange way of writing query encountered

This may seem like the dumbest question ever on Stack Overflow, but I am just wondering why you would write a query like this:
Select e1.Employee_ID, e1.Departement_ID From Employee e
Inner join Employee E1 on e.employee_id= e1.Employee_ID
Inner join Departement d on d.dep_id= e1.departement_id
Why do we have to join on Employee? My obvious query would be:
select e.employee_ID, e.Departement_id from employee e
inner join Departement d on d.dep_id = e.departement_id

Referencing the PK with an inner join is redundant.
You would normally join on the same table to link with another record, for example if you have a FK-column that reference the boss of an employee.
Assuming you would have a nullable foreign-key column Boss_ID in table Employee
Select e.Employee_ID, boss.Employee_id, d.Departement_ID
From Employee e
LEFT OUTER JOIN Employee boss on boss.Employee_ID=e.Boss_ID
INNER JOIN Departement d on d.dep_id= e.departement_id
Note that I've used a LEFT OUTER JOIN so that employees who have no boss are also returned.

As far as I can see, you do not need this join:
Inner join Employee E1 on e.employee_id= e1.Employee_ID
The two queries will give the same result. I can not see the point of JOINing twice on the Employee table.

I can't see any reason to join Employee to Employee in this query. In the past I've occasionally used two subsets of the same table in the same query, but there's nothing like that going on here. To me it looks like this was done by mistake.

If employee_id is a PK then the self-join doesn't make sense, but if it is not, the two queries can return different results.
The first query will not return rows whose employee_id is NULL, and for an employee_id value that occurs N times it will return N^2 rows.

Even simpler would be this:
select e.employee_ID, d.dep_name from employee e,Departement d where d.dep_id= e.departement_id

How to use "IN" for more than one column

This question might be trivial or even silly but I was wondering if there is a way to use "IN" on more than one column on one to one matching.
For example I use
select emp_id from employee where emp_id IN (select emp_id from employee_other)
How could I achieve something like
select emp_id from employee where emp_id,emp_org IN (select emp_id,emp_org from employee_other)
I know I can't use the following, because it checks each column independently, whereas I want a selection based on matching both columns of the same record.
select emp_id from employee where emp_id IN (select emp_id from employee_other) and emp_org in (select emp_org from employee)
Please note that I am reluctant to use EXCEPT.
Thanks guys

You may want to use the EXISTS operator
select e.emp_id
from employee e
where EXISTS
(
SELECT *
FROM employee_other eo
WHERE e.emp_id = eo.emp_id
AND e.emp_org = eo.emp_org
)

IN in Microsoft SQL Server only works with a single column, i.e. you can only write X IN (...), never anything remotely like X,Y IN (...).
There are two ways to handle this, depending on your data:
Joining with a sub-query
Using EXISTS
To JOIN, do this:
select employee.emp_id
from employee
inner join (select emp_id,emp_org from employee) as x
on employee.emp_id = x.emp_id and employee.emp_org = x.emp_org
Your example is a bit lousy, however, since you're using the same table.
To use EXISTS, do this:
select emp_id
from employee
where exists (
select emp_id,emp_org from employee e2
where e2.emp_id = employee.emp_id and e2.emp_org = employee.emp_org)
This, in the same way as the join, links the main table to the "sub-query" table, but whereas the join will produce duplicate rows if the "sub-query" produces multiple hits, the EXISTS clause will not.
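The duplicate-row difference described above can be seen in a small sketch. It assumes an in-memory SQLite database driven from Python, uses the employee and employee_other tables from the question, and fills them with hypothetical rows in which employee_other contains the same (emp_id, emp_org) pair twice.

```python
import sqlite3

# Hypothetical sample data: employee_other holds the pair (1, 'A') twice.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE employee       (emp_id INTEGER, emp_org TEXT);
    CREATE TABLE employee_other (emp_id INTEGER, emp_org TEXT);

    INSERT INTO employee       VALUES (1, 'A'), (2, 'B');
    INSERT INTO employee_other VALUES (1, 'A'), (1, 'A'), (3, 'C');
    """
)

# The join returns one output row per matching row in employee_other.
join_rows = conn.execute(
    """
    SELECT e.emp_id
    FROM employee e
    INNER JOIN employee_other eo
      ON e.emp_id = eo.emp_id AND e.emp_org = eo.emp_org
    ORDER BY e.emp_id
    """
).fetchall()

# EXISTS only asks whether at least one match exists, so each employee appears once.
exists_rows = conn.execute(
    """
    SELECT e.emp_id
    FROM employee e
    WHERE EXISTS (
        SELECT 1
        FROM employee_other eo
        WHERE eo.emp_id = e.emp_id AND eo.emp_org = e.emp_org
    )
    ORDER BY e.emp_id
    """
).fetchall()

print(join_rows)
print(exists_rows)
```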

I don't understand what you are trying to accomplish with emp_org in (select emp_org from employee). Isn't that always true?
Does this work?
select emp_id from employee e
where exists (select 1 from employee_other eo
WHERE e.emp_id = eo.emp_id
AND e.emp_org = eo.emp_org)

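You had it almost completely right in your second example. You just need to add parentheses around your column names. Note that this row-value form of IN is accepted by Oracle, PostgreSQL and MySQL, but not by Microsoft SQL Server.
select emp_id
from employee
where (emp_id,emp_org) IN (select emp_id,emp_org from employee_other)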

Use Inner Join
select e1.emp_id from employee e1
inner join employee_other e2 on e1.emp_id = e2.emp_id and e1.emp_org = e2.emp_org
You may have to use Distinct in case the employee_other table causes dups.
select Distinct e1.emp_id from employee e1
inner join employee_other e2 on e1.emp_id = e2.emp_id and e1.emp_org = e2.emp_org

Explanation of self-joins

I don't understand the need for self-joins. Can someone please explain them to me?
A simple example would be very helpful.

You can think of a self-join as a join between two identical tables. In a normalized design you would not keep two copies of the same table, so you simulate the second copy by joining the table to itself under a different alias.
Suppose you have two tables:
Table emp1
Id Name Boss_id
1 ABC 3
2 DEF 1
3 XYZ 2
Table emp2
Id Name Boss_id
1 ABC 3
2 DEF 1
3 XYZ 2
Now, if you want to get the name of each employee with his or her boss' names:
select c1.Name , c2.Name As Boss
from emp1 c1
inner join emp2 c2 on c1.Boss_id = c2.Id
Which will output the following table:
Name Boss
ABC XYZ
DEF ABC
XYZ DEF

It's quite common when you have a table that references itself. Example: an employee table where every employee can have a manager, and you want to list all employees and the name of their manager.
SELECT e.name, m.name
FROM employees e LEFT OUTER JOIN employees m
ON e.manager = m.id

A self join is a join of a table with itself.
A common use case is when the table stores entities (records) which have a hierarchical relationship between them. For example a table containing person information (Name, DOB, Address...) and including a column where the ID of the Father (and/or of the mother) is included. Then with a small query like
SELECT Child.ID, Child.Name, Child.PhoneNumber, Father.Name, Father.PhoneNumber
FROM myTableOfPersons As Child
LEFT OUTER JOIN myTableOfPersons As Father ON Child.FatherId = Father.ID
WHERE Child.City = 'Chicago' -- Or some other condition or none
we can get info about both child and father (and mother, with a second self join etc. and even grand parents etc...) in the same query.

Let's say you have a table users, set up like so:
user ID
user name
user's manager's ID
In this situation, if you wanted to pull out both the user's information and the manager's information in one query, you might do this:
SELECT users.user_id, users.user_name, managers.user_id AS manager_id, managers.user_name AS manager_name FROM users INNER JOIN users AS managers ON users.manager_id = managers.user_id

Imagine a table called Employee as described below. All employees have a manager which is also an employee (maybe except for the CEO, whose manager_id would be null)
Table (Employee):
int id,
varchar name,
int manager_id
You could then use the following select to find all employees and their managers:
select e1.name, e2.name as ManagerName
from Employee e1, Employee e2
where e1.manager_id = e2.id

They are useful if your table is self-referential. For example, for a table of pages, each page may have a next and previous link. These would be the IDs of other pages in the same table. If at some point you want to get a triple of successive pages, you'd do two self-joins on the next and previous columns with the same table's id column.

Without the ability for a table to reference itself, we'd have to create one table per level of the hierarchy. But since that functionality is available, you join the table to itself and SQL treats it as two separate tables, so everything is stored nicely in one place.

Apart from the answers mentioned above (which are very well explained), I would like to add one example so that the use of Self Join can be easily shown.
Suppose you have a table named CUSTOMERS which has the following attributes:
CustomerID, CustomerName, ContactName, City, Country.
Now you want to list all those who are from the "same city" .
You will have to think of a replica of this table so that we can join them on the basis of CITY. The query below will clearly show what it means:
SELECT A.CustomerName AS CustomerName1, B.CustomerName AS CustomerName2,
A.City
FROM Customers A, Customers B
WHERE A.CustomerID <> B.CustomerID
AND A.City = B.City
ORDER BY A.City;

There are many correct answers here, but there is a variation that is equally correct. You can place your join conditions in the join statement instead of the WHERE clause.
SELECT e1.emp_id AS 'Emp_ID'
, e1.emp_name AS 'Emp_Name'
, e2.emp_id AS 'Manager_ID'
, e2.emp_name AS 'Manager_Name'
FROM Employee e1 RIGHT JOIN Employee e2 ON e1.manager_id = e2.emp_id
Keep in mind sometimes you want e1.manager_id > e2.id
The advantage to knowing both scenarios is sometimes you have a ton of WHERE or JOIN conditions and you want to place your self join conditions in the other clause to keep your code readable.
No one addressed what happens when an Employee does not have a manager. Huh? They are not included in the result set. What if you want to include employees that do not have managers but you don't want incorrect combinations returned?
Try this puppy;
SELECT e1.emp_id AS 'Emp_ID'
, e1.emp_name AS 'Emp_Name'
, e2.emp_id AS 'Manager_ID'
, e2.emp_name AS 'Manager_Name'
FROM Employee e1 LEFT JOIN Employee e2
ON e1.manager_id = e2.emp_id

A self-join is useful when you have to compare the rows of a table with other rows of the same table.
Syntax: SELECT * FROM TABLE t1, TABLE t2 WHERE t1.columnName = t2.columnName
For example, suppose we want to find the names of the employees whose initial designation equals the current designation. We can solve this with a self join in the following way:
SELECT e1.NAME FROM Employee e1, Employee e2 WHERE e1.intialDesignationId = e2.currentDesignationId

One use case is checking for duplicate records in a database.
SELECT A.Id FROM My_Bookings A, My_Bookings B
WHERE A.Name = B.Name
AND A.Date = B.Date
AND A.Id != B.Id
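A quick way to see what this duplicate check returns is the sketch below, which runs the same query in an in-memory SQLite database from Python. The bookings are hypothetical; an ORDER BY is added only to make the output order predictable.

```python
import sqlite3

# Hypothetical bookings: rows 1 and 2 share the same Name and Date.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE My_Bookings (Id INTEGER PRIMARY KEY, Name TEXT, Date TEXT);

    INSERT INTO My_Bookings VALUES
        (1, 'Ann', '2024-01-05'),
        (2, 'Ann', '2024-01-05'),
        (3, 'Bob', '2024-01-06');
    """
)

# Each row is compared with every other row that has the same Name and Date.
duplicate_ids = conn.execute(
    """
    SELECT A.Id
    FROM My_Bookings A, My_Bookings B
    WHERE A.Name = B.Name
      AND A.Date = B.Date
      AND A.Id != B.Id
    ORDER BY A.Id
    """
).fetchall()

print(duplicate_ids)
```

Both members of a duplicate pair are reported, since each one matches the other. If a booking had three or more copies, each Id would be returned once per matching partner, so SELECT DISTINCT may be worth adding in that case.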

SELF JOIN:
Joining a table to itself is called a self join.
It lets us compare rows within a single table.
A self join requires table aliases; without them the two references to the table cannot be told apart.
Each alias behaves as if it were a separate copy of the table.
We can create any number of aliases on a table, but each alias must be different.
Basic Rules of self join:
CASE-I: Comparing the values of a single column with each other within the table.
CASE-II: Comparing the values of two different columns with each other within the table.
Example:
SELECT * from TEST;
ENAME LOC
RICHARD HYDRABAD
JOHN MUMBAI
MATHEW HYDRABAD
BENNY CHENNAI
SELECT T1.ENAME, T1.LOC FROM TEST T1, TEST T2 WHERE T1.LOC = T2.LOC AND T2.ENAME = 'RICHARD';

It's the database equivalent of a linked list/tree, where a row contains a reference in some capacity to another row.

Here is an explanation of self join in layman's terms. A self join is not a different type of join. If you have understood the other types of joins (inner, outer, and cross joins), then a self join should be straightforward. In inner, outer and cross joins, you join 2 or more different tables. In a self join, however, you join a table with itself. Here, we don't have 2 different tables, but treat the same table as a different table by using table aliases. If this is still not clear, I would recommend watching the following YouTube video.
Self Join with an example
