Three-table join gives me every combination of records - sql

When I write the query like the following, it returns every combination of all the records.
What's the mistake in the query?
SELECT ven.vendor_code, add.address1
FROM vendor ven INNER JOIN employee emp
ON ven.emp_fk = emp.id
INNER JOIN address add
ON add.emp_name = emp.emp_name;

With an inner join, you have to put all the links (relations) between the two tables in the ON clause.
Assuming the relations are good, you may test the following queries to see if they really make the combination of all records:
SELECT count(*)
from vendor ven
inner join employee emp on ven.emp_fk = emp.id
inner join address add on add.emp_name = emp.emp_name;
SELECT count(*)
from vendor ven, employee emp, address add;
If both queries return the same result (which I doubt), you really have what you say.
If not, as I assume, maybe you are missing a relation or a restriction to filter the number of results.
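The comparison above can be tried on toy data. This is only a sketch using Python's built-in sqlite3 module: the rows are made up, SQLite stands in for whatever database the question uses, and the address alias is renamed to adr because ADD is a keyword in SQLite.

```python
import sqlite3

# In-memory database with made-up rows (assumption: emp_name is unique here).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (id INTEGER PRIMARY KEY, emp_name TEXT);
CREATE TABLE vendor   (vendor_code TEXT, emp_fk INTEGER);
CREATE TABLE address  (address1 TEXT, emp_name TEXT);
INSERT INTO employee VALUES (1, 'ann'), (2, 'bob');
INSERT INTO vendor   VALUES ('V1', 1), ('V2', 2);
INSERT INTO address  VALUES ('1 Main St', 'ann'), ('2 Oak Ave', 'bob');
""")

# Row count of the joined query from the question.
joined = conn.execute("""
SELECT count(*)
FROM vendor ven
INNER JOIN employee emp ON ven.emp_fk = emp.id
INNER JOIN address adr ON adr.emp_name = emp.emp_name
""").fetchone()[0]

# Row count of the full cross product (no join conditions at all).
cross = conn.execute(
    "SELECT count(*) FROM vendor ven, employee emp, address adr"
).fetchone()[0]

print(joined, cross)
```

With this data the joined count is far smaller than the cross product. If emp_name were not unique, the joined count would grow, which is one way a join can look like "all combinations".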

Related

SQL doesn't display rows where

The following sql query selects the employee name (from employee table), their manager's name (from manager table) and their performance (from rating table). However, if an employee's manager_id is missing, then it doesn't list that employee at all when outputting rows. Is there any way around this? Probably involving joins but not too sure. Thanks in advance :)
SELECT employee.name,
manager.name,
rating.performance
FROM employee,
manager,
rating
WHERE employee.manager_id = manager.id
AND rating.employee_id = employee.id;
Never use commas in the FROM clause. Always use proper, explicit, standard JOIN syntax. In this case, you want a LEFT JOIN:
SELECT e.name, m.name, r.performance
FROM employee e LEFT JOIN
manager m
ON e.manager_id = m.id LEFT JOIN
rating r
ON r.employee_id = e.id;
Notice that this also adds table aliases so the query is easier to write and to read.
By using a LEFT JOIN you get all rows of the "left" table despite not being able to "pair" with any rows in the joining tables.
SELECT
employee.name,
manager.name,
rating.performance
FROM
employee LEFT JOIN
manager ON employee.manager_id = manager.id LEFT JOIN
rating ON employee.id = rating.employee_id
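To see the difference on a small data set, here is a rough sketch with Python's sqlite3 module. The rows are invented for illustration; employee 'Bo' has no manager_id.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE manager  (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, manager_id INTEGER);
CREATE TABLE rating   (employee_id INTEGER, performance TEXT);
INSERT INTO manager  VALUES (10, 'Mia');
INSERT INTO employee VALUES (1, 'Al', 10), (2, 'Bo', NULL);  -- Bo has no manager
INSERT INTO rating   VALUES (1, 'good'), (2, 'great');
""")

# Inner joins drop any employee whose manager_id matches no manager row.
inner_rows = conn.execute("""
SELECT e.name, m.name, r.performance
FROM employee e
JOIN manager m ON e.manager_id = m.id
JOIN rating r ON r.employee_id = e.id
ORDER BY e.id
""").fetchall()

# Left joins keep every employee and fill the missing manager with NULL.
left_rows = conn.execute("""
SELECT e.name, m.name, r.performance
FROM employee e
LEFT JOIN manager m ON e.manager_id = m.id
LEFT JOIN rating r ON r.employee_id = e.id
ORDER BY e.id
""").fetchall()

print(inner_rows)
print(left_rows)
```

The LEFT JOIN version returns 'Bo' with None in the manager column, while the inner-join version omits that row.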

Where SQL column value not found in other column

I've read similar questions and I think I am doing it correct, but I just wanted to make sure my SQL is correct. (Still new to SQL)
I have 2 different tables
Students
id, name, address
Staff
id, name, address
I need to find the total number of students (who are not also staff)
So I have the following SQL:
create view nstudents as
select students.id
from students
LEFT JOIN staff ON staff.id = students.id;
Then I run the count(*) on the view.
Can someone confirm my SQL is correct or is there better way to do it?
Your LEFT JOIN doesn't eliminate students who are also staff, but it could be useful in achieving your goal. LEFT JOIN provides you with all results from the left table and matching results from the right table, or NULL results if the right table doesn't have a match. If you do this:
select count(*)
from students
LEFT JOIN staff ON staff.id = students.id
WHERE staff.id IS NULL;
I expect you'll get what you're looking for.
You might find it more natural to do something like this:
create view nstudents as
select s.id
from students s
where not exists (select 1 from staff st where st.id = s.id) ;
This should have the same performance as the left join.
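A quick way to convince yourself that both forms agree is to run them on a few rows. This is a sketch with Python's sqlite3 module and made-up data, where student 2 is also a staff member.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT, address TEXT);
CREATE TABLE staff    (id INTEGER PRIMARY KEY, name TEXT, address TEXT);
INSERT INTO students VALUES (1, 'Ann', 'A St'), (2, 'Bob', 'B St'), (3, 'Cy', 'C St');
INSERT INTO staff    VALUES (2, 'Bob', 'B St'), (9, 'Dee', 'D St');
""")

def count(sql):
    """Run a query that returns a single count and return it as an int."""
    return conn.execute(sql).fetchone()[0]

# The original view: a plain LEFT JOIN keeps every student.
plain_left = count("""
SELECT count(*) FROM students
LEFT JOIN staff ON staff.id = students.id
""")

# Anti-join: keep only students with no matching staff row.
anti_join = count("""
SELECT count(*) FROM students
LEFT JOIN staff ON staff.id = students.id
WHERE staff.id IS NULL
""")

# Same result expressed with NOT EXISTS.
not_exists = count("""
SELECT count(*) FROM students s
WHERE NOT EXISTS (SELECT 1 FROM staff st WHERE st.id = s.id)
""")

print(plain_left, anti_join, not_exists)
```

The plain LEFT JOIN still counts all three students; only the IS NULL filter or the NOT EXISTS form excludes the one who is also staff.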

Create filter for most recent date using combined columns

I have created a filtering application in Access that references four simple tables:
Employee: Emp_ID, FirstName, LastName
Skill: Skill_ID, SkillName, SkillDescription, SkillGroup
Employee_Skill: Entry_ID, Emp_ID, Skill_ID, LevelofExperience, Dateupdated
SkillGroupName: SkillGroup_ID, SkillGroupName
Basically the idea of this database is to track employee skills and how the level of experience improves (or not!) over time. The problem I am facing is that I want the application to filter by the most recently updated combination of Skill and Employee. I have found the query that will allow for me to use the two columns as a distinct entity:
SELECT DISTINCT Emp_ID, Skill_ID FROM Employee_Skill
WHERE (SELECT MAX(DateUpdated)From Employee_Skill);
And it works perfectly on its own, but I don't know how to incorporate it either into my main query, which simply joins together the necessary columns for an easier end user experience. It does not visibly show Emp_ID or Skill_ID. It also doesn't in the VBA for the application. (-1 = Include all History; 0 = Only include most updated.)
Update:
I have been able to select the distinct combination of Employee and Skill through my main query by doing this:
SELECT
Employee.FirstName,
Employee.LastName,
Max(Employee_Skill.LevelOfExperience) AS LevelOfExperience,
Skill.SkillName,
Max(Employee_Skill.DateUpdated) AS DateUpdated,
Max(SkillGroup.SkillGroupName) AS SkillGroupName
FROM
SkillGroup INNER JOIN
(Skill INNER JOIN
(Employee INNER JOIN
Employee_Skill ON
Employee.Emp_ID = Employee_Skill.Emp_ID) ON
Skill.Skill_ID = Employee_Skill.Skill_ID) ON
SkillGroup.SkillGroup_ID = Skill.SkillGroup
WHERE
Employee.Active=True
GROUP BY
Employee.FirstName,
Employee.LastName,
Skill.SkillName
ORDER BY
Max(Employee_Skill.LevelOfExperience) DESC;
However, my forms and reports built on this query are stuck with only the option of seeing the most updated version. I am really hoping to have a dynamic form that removes the constraints as desired.
Not sure what you're doing with Max(Employee_Skill.LevelOfExperience) or Max(SkillGroup.SkillGroupName) but I think you need to stick with querying for the detail rows and then include another column marking the Max(Employee_Skill.DateUpdated) filter, like:
SELECT
Employee.FirstName,
Employee.LastName,
Employee_Skill.LevelOfExperience,
Skill.SkillName,
Employee_Skill.DateUpdated,
SkillGroup.SkillGroupName,
IIf(mx.max_DateUpdated = Employee_Skill.DateUpdated, 1, 0) AS is_max_DateUpdated
FROM
(SkillGroup INNER JOIN
(Skill INNER JOIN
(Employee INNER JOIN
Employee_Skill ON
Employee.Emp_ID = Employee_Skill.Emp_ID) ON
Skill.Skill_ID = Employee_Skill.Skill_ID) ON
SkillGroup.SkillGroup_ID = Skill.SkillGroup) INNER JOIN
(SELECT
Emp_ID,
Skill_ID,
Max(DateUpdated) AS max_DateUpdated
FROM
Employee_Skill
GROUP BY
Emp_ID,
Skill_ID) AS mx ON
(Employee_Skill.Emp_ID = mx.Emp_ID) AND
(Employee_Skill.Skill_ID = mx.Skill_ID)
WHERE
Employee.Active=True
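The flag idea can be checked on a cut-down version of the schema. The sketch below uses Python's sqlite3 module rather than Access, so CASE stands in for IIf; the rows are invented, and it assumes the "most recent" row is wanted per employee-and-skill pair, as the question describes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Employee_Skill (
    Entry_ID INTEGER PRIMARY KEY, Emp_ID INTEGER, Skill_ID INTEGER,
    LevelOfExperience INTEGER, DateUpdated TEXT);
INSERT INTO Employee_Skill VALUES
    (1, 1, 100, 2, '2020-01-01'),
    (2, 1, 100, 3, '2021-01-01'),  -- newer entry for the same employee/skill
    (3, 1, 200, 1, '2020-06-01'),
    (4, 2, 100, 4, '2019-03-01');
""")

# Keep every detail row and add a 1/0 flag marking the latest row per pair.
rows = conn.execute("""
SELECT es.Entry_ID,
       CASE WHEN es.DateUpdated = mx.max_DateUpdated THEN 1 ELSE 0 END AS is_max
FROM Employee_Skill es
JOIN (SELECT Emp_ID, Skill_ID, MAX(DateUpdated) AS max_DateUpdated
      FROM Employee_Skill
      GROUP BY Emp_ID, Skill_ID) mx
  ON es.Emp_ID = mx.Emp_ID AND es.Skill_ID = mx.Skill_ID
ORDER BY es.Entry_ID
""").fetchall()

# A form or report can then filter on the flag, or ignore it to show history.
latest_only = [entry_id for entry_id, is_max in rows if is_max == 1]

print(rows)
print(latest_only)
```

Because the full history stays in the result, the "include all history" switch becomes a simple filter on the flag column instead of a different query.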

Joining two tables with specific columns

I am new to SQL. I know this is really basic, but I really do not know how to do it!
I am joining two tables. Let's say each table has 5 columns; joining them gives me 10 columns in total, which I really do not want. What I want is to select specific columns from both tables so that only those show after the join. (I want to reduce my join result to specific columns only.)
SELECT * FROM tbEmployees
JOIN tbSupervisor
ON tbEmployees.ID = tbSupervisor.SupervisorID
The syntax above will give me all columns, which I don't want. I just want EmpName, Address from the tbEmployees table and Name, Address, project from the tbSupervisor table.
I know this step:
SELECT EmpName, Address FROM tbEmployees
JOIN tbSupervisor
ON tbEmployees.ID = tbSupervisor.SupervisorID
but I am not sure about the supervisor table.
I am using SQL Server.
This is what you need:
Select e.EmpName, e.Address, s.Name, S.Address, s.Project
From tbEmployees e
JOIN tbSupervisor s ON e.ID = s.SupervisorID
You can read about this on W3Schools for more info.
You can get columns from specific tables, either by their full name or using an alias:
SELECT E.EmpName, E.Address, S.Name, S.Address, S.Project
FROM tbEmployees E
INNER JOIN tbSupervisor S ON E.ID = S.SupervisorID
You can use the table name as part of the column specification:
SELECT tbEmployees.EmpName, tbEmployees.Address, tbSupervisor.Name,
tbSupervisor.Address, tbSupervisor.project
FROM tbEmployees
JOIN tbSupervisor
ON tbEmployees.ID = tbSupervisor.SupervisorID
SELECT employees.EmpName, employees.Address AS employee_address,
supervisor.Name, supervisor.Address AS supervisor_address, supervisor.project
FROM tbEmployees
AS employees
JOIN tbSupervisor
AS supervisor
ON
employees.ID = supervisor.SupervisorID
You need to learn about aliases. They will make your queries more maintainable. Also, you should always use aliases when referencing columns, so your query is clear about what it is doing:
SELECT e.EmpName, e.Address, s.name, s.address as SupervisorAddress
FROM tbEmployees e JOIN
tbSupervisor s
ON e.ID = s.SupervisorID;
Note that I also renamed the second address so its name is unique.
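As a rough illustration of why the renamed column matters, here is a sketch with Python's sqlite3 module (not SQL Server). The rows and the extra Phone column are hypothetical; the point is only that the result carries the selected columns and that the aliased names are distinct.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Phone is a hypothetical extra column that we deliberately do not select.
CREATE TABLE tbEmployees  (ID INTEGER PRIMARY KEY, EmpName TEXT, Address TEXT, Phone TEXT);
CREATE TABLE tbSupervisor (SupervisorID INTEGER, Name TEXT, Address TEXT, Project TEXT);
INSERT INTO tbEmployees  VALUES (1, 'Ann', '1 Main St', '555-0100');
INSERT INTO tbSupervisor VALUES (1, 'Sue', '9 High St', 'Apollo');
""")

# Only the listed columns come back; both Address columns get distinct names.
cur = conn.execute("""
SELECT e.EmpName, e.Address AS EmployeeAddress,
       s.Name AS SupervisorName, s.Address AS SupervisorAddress, s.Project
FROM tbEmployees e
JOIN tbSupervisor s ON e.ID = s.SupervisorID
""")
column_names = [d[0] for d in cur.description]
rows = cur.fetchall()

print(column_names)
print(rows)
```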
Specify the table name and field name in your selection
SELECT tbEmployees.EmpName,
tbEmployees.Address,
tbSupervisor.[column name]
FROM tbEmployees
JOIN tbSupervisor ON tbEmployees.ID = tbSupervisor.SupervisorID
SELECT product_report.*,
product.pgroup
FROM `product_report`
INNER JOIN product
ON product_report.product_id = product.id
WHERE product.pgroup = '5'
ORDER BY product.id DESC

Explanation of self-joins

I don't understand the need for self-joins. Can someone please explain them to me?
A simple example would be very helpful.
You can think of a self-join as a join between two identical tables. In a normalized design you would not keep two copies of the same table, so you simulate the second copy by joining the table to itself.
Suppose you have two tables:
Table emp1
Id  Name  Boss_id
1   ABC   3
2   DEF   1
3   XYZ   2
Table emp2
Id  Name  Boss_id
1   ABC   3
2   DEF   1
3   XYZ   2
Now, if you want to get the name of each employee along with his or her boss's name:
select c1.Name , c2.Name As Boss
from emp1 c1
inner join emp2 c2 on c1.Boss_id = c2.Id
Which will output the following table:
Name  Boss
ABC   XYZ
DEF   ABC
XYZ   DEF
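The same result can be reproduced with a single table joined to itself. This is a small sketch using Python's sqlite3 module; the table name emp is an assumption standing in for emp1/emp2, and the rows are the ones shown above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE emp (Id INTEGER PRIMARY KEY, Name TEXT, Boss_id INTEGER);
INSERT INTO emp VALUES (1, 'ABC', 3), (2, 'DEF', 1), (3, 'XYZ', 2);
""")

# c1 plays the role of "employee", c2 the role of "boss"; both are the same table.
rows = conn.execute("""
SELECT c1.Name, c2.Name AS Boss
FROM emp c1
INNER JOIN emp c2 ON c1.Boss_id = c2.Id
ORDER BY c1.Id
""").fetchall()

for name, boss in rows:
    print(name, boss)
```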
It's quite common when you have a table that references itself. Example: an employee table where every employee can have a manager, and you want to list all employees and the name of their manager.
SELECT e.name, m.name
FROM employees e LEFT OUTER JOIN employees m
ON e.manager = m.id
A self join is a join of a table with itself.
A common use case is when the table stores entities (records) which have a hierarchical relationship between them. For example a table containing person information (Name, DOB, Address...) and including a column where the ID of the Father (and/or of the mother) is included. Then with a small query like
SELECT Child.ID, Child.Name, Child.PhoneNumber, Father.Name, Father.PhoneNumber
FROM myTableOfPersons As Child
LEFT OUTER JOIN myTableOfPersons As Father ON Child.FatherId = Father.ID
WHERE Child.City = 'Chicago' -- Or some other condition or none
we can get info about both child and father (and mother, with a second self join etc. and even grand parents etc...) in the same query.
Let's say you have a table users, set up like so:
user ID
user name
user's manager's ID
In this situation, if you wanted to pull out both the user's information and the manager's information in one query, you might do this:
SELECT users.user_id, users.user_name, managers.user_id AS manager_id, managers.user_name AS manager_name FROM users INNER JOIN users AS managers ON users.manager_id = managers.user_id
Imagine a table called Employee as described below. All employees have a manager which is also an employee (maybe except for the CEO, whose manager_id would be null)
Table (Employee):
int id,
varchar name,
int manager_id
You could then use the following select to find all employees and their managers:
select e1.name, e2.name as ManagerName
from Employee e1, Employee e2
where e1.manager_id = e2.id
They are useful if your table is self-referential. For example, for a table of pages, each page may have a next and previous link. These would be the IDs of other pages in the same table. If at some point you want to get a triple of successive pages, you'd do two self-joins on the next and previous columns with the same table's id column.
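The pages case might look like the following sketch, written with Python's sqlite3 module. The table layout (pages with prev_id and next_id columns) and the rows are assumptions made up to match the description.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical schema: each page points at its neighbours in the same table.
CREATE TABLE pages (id INTEGER PRIMARY KEY, title TEXT, prev_id INTEGER, next_id INTEGER);
INSERT INTO pages VALUES
    (1, 'intro', NULL, 2),
    (2, 'body',  1,    3),
    (3, 'end',   2,    NULL);
""")

# Two self-joins: one alias for the previous page, one for the next page.
triples = conn.execute("""
SELECT p.title, c.title, n.title
FROM pages c
JOIN pages p ON c.prev_id = p.id
JOIN pages n ON c.next_id = n.id
ORDER BY c.id
""").fetchall()

print(triples)
```

Only the middle page has both neighbours, so the inner joins return a single triple; LEFT JOINs would also keep the first and last pages.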
Without the ability for a table to reference itself, we'd have to create as many tables for hierarchy levels as the number of layers in the hierarchy. But since that functionality is available, you join the table to itself and sql treats it as two separate tables, so everything is stored nicely in one place.
Apart from the answers mentioned above (which are very well explained), I would like to add one example so that the use of Self Join can be easily shown.
Suppose you have a table named CUSTOMERS which has the following attributes:
CustomerID, CustomerName, ContactName, City, Country.
Now you want to list all the customers who are from the same city.
You will have to think of a replica of this table so that we can join them on the basis of CITY. The query below will clearly show what it means:
SELECT A.CustomerName AS CustomerName1, B.CustomerName AS CustomerName2,
A.City
FROM Customers A, Customers B
WHERE A.CustomerID <> B.CustomerID
AND A.City = B.City
ORDER BY A.City;
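One thing worth knowing about this query is that <> returns every pair twice, once in each order. The sketch below, using Python's sqlite3 module and invented rows with only a subset of the columns, shows that and a variant using < that lists each pair once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customers (CustomerID INTEGER PRIMARY KEY, CustomerName TEXT, City TEXT);
INSERT INTO Customers VALUES (1, 'Ann', 'Oslo'), (2, 'Bob', 'Oslo'), (3, 'Cy', 'Rome');
""")

# <> matches (Ann, Bob) and (Bob, Ann): each pair appears in both orders.
pairs_both = conn.execute("""
SELECT A.CustomerName, B.CustomerName, A.City
FROM Customers A, Customers B
WHERE A.CustomerID <> B.CustomerID
  AND A.City = B.City
ORDER BY A.City, A.CustomerID
""").fetchall()

# < keeps only one ordering of each pair.
pairs_once = conn.execute("""
SELECT A.CustomerName, B.CustomerName, A.City
FROM Customers A, Customers B
WHERE A.CustomerID < B.CustomerID
  AND A.City = B.City
ORDER BY A.City, A.CustomerID
""").fetchall()

print(pairs_both)
print(pairs_once)
```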
There are many correct answers here, but there is a variation that is equally correct. You can place your join conditions in the join statement instead of the WHERE clause.
SELECT e1.emp_id AS 'Emp_ID'
, e1.emp_name AS 'Emp_Name'
, e2.emp_id AS 'Manager_ID'
, e2.emp_name AS 'Manager_Name'
FROM Employee e1 INNER JOIN Employee e2 ON e1.manager_id = e2.emp_id
Keep in mind sometimes you want e1.manager_id > e2.id
The advantage to knowing both scenarios is sometimes you have a ton of WHERE or JOIN conditions and you want to place your self join conditions in the other clause to keep your code readable.
No one addressed what happens when an employee does not have a manager: they are not included in the result set. What if you want to include employees that do not have managers, but you don't want incorrect combinations returned?
Try this puppy:
SELECT e1.emp_id AS 'Emp_ID'
, e1.emp_name AS 'Emp_Name'
, e2.emp_id AS 'Manager_ID'
, e2.emp_name AS 'Manager_Name'
FROM Employee e1 LEFT JOIN Employee e2
ON e1.manager_id = e2.emp_id
A self-join is useful when you have to evaluate the data of a table against itself, which means it correlates rows from the same table.
Syntax: SELECT * FROM TABLE t1, TABLE t2 WHERE t1.columnName = t2.columnName
For example, suppose we want to find the names of the employees whose initial designation equals the current designation. We can solve this using a self-join in the following way.
SELECT e1.NAME FROM Employee e1, Employee e2 WHERE e1.intialDesignationId = e2.currentDesignationId
One use case is checking for duplicate records in a database.
SELECT A.Id FROM My_Bookings A, My_Bookings B
WHERE A.Name = B.Name
AND A.Date = B.Date
AND A.Id != B.Id
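Here is a sketch of that duplicate check on a few made-up rows, using Python's sqlite3 module. DISTINCT is an addition of mine: without it, an Id would repeat once for every other row it duplicates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE My_Bookings (Id INTEGER PRIMARY KEY, Name TEXT, Date TEXT);
INSERT INTO My_Bookings VALUES
    (1, 'Ann', '2024-01-01'),
    (2, 'Ann', '2024-01-01'),  -- duplicate of booking 1
    (3, 'Bob', '2024-01-02');
""")

# A row is a duplicate if another row (different Id) has the same Name and Date.
duplicate_ids = [row[0] for row in conn.execute("""
SELECT DISTINCT A.Id
FROM My_Bookings A, My_Bookings B
WHERE A.Name = B.Name
  AND A.Date = B.Date
  AND A.Id != B.Id
ORDER BY A.Id
""")]

print(duplicate_ids)
```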
SELF JOIN:
Joining a table to itself is called a self join.
It lets us perform operations on a single table.
When we use a self join, we must give the table alias names; otherwise we cannot write the self join.
When we create an alias name on a table, the system internally treats each alias as a separate virtual table.
We can create any number of alias names on a table, but each alias name must be different.
Basic Rules of self join:
CASE-I: Comparing a single column's values with each other within the table.
CASE-II: Comparing two different columns' values with each other within the table.
Example:
SELECT * from TEST;
ENAME    LOC
RICHARD  HYDRABAD
JOHN     MUMBAI
MATHEW   HYDRABAD
BENNY    CHENNAI
SELECT T1.ENAME, T1.LOC FROM TEST T1, TEST T2 WHERE T1.LOC=T2.LOC AND T2.ENAME='RICHARD';
It's the database equivalent of a linked list/tree, where a row contains a reference in some capacity to another row.
Here is an explanation of self join in layman's terms. Self join is not a different type of join. If you have understood the other types of joins (inner, outer, and cross joins), then self join should be straightforward. In inner, outer, and cross joins, you join 2 or more different tables. In a self join, however, you join the same table with itself. Here, we don't have 2 different tables, but we treat the same table as a different table using table aliases. If this is still not clear, I would recommend watching the following YouTube videos.
Self Join with an example