Query
SELECT ID, Name, Phone
FROM Table1
LEFT JOIN Table2 ON Table1.ID = Table2.ID
WHERE Table2.ID IS NULL
Problem
Finding it hard to understand why someone would left join on an ID
and then set it to NULL in the where clause?
Am I missing something here? Is there any significance to this?
Could we just omit the Table2 altogether? As in not join at all?
Any help would be much appreciated.
The query you have in the question is basically equivalent to the following query:
SELECT ID, Name, Phone
FROM Table1
WHERE NOT EXISTS
(
SELECT 1
FROM Table2
WHERE Table1.ID = Table2.ID
)
Meaning it selects all the records in Table1 that do not have a correlated record in Table2.
The execution plan for both queries will most likely be the same (Personally, I've never seen a case when they produce a different execution plan, but I don't rule that out), so both queries should be equally efficient, and it's up to you to decide whether the left join or the exists syntax is more readable to you.
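To check the equivalence yourself, here is a minimal sketch using Python's built-in sqlite3 module. The sample rows, and the assumption that Name and Phone both live in Table1, are mine and not from the question. It only compares results; execution plans depend on your engine.

```python
import sqlite3

# In-memory database; the rows below are invented sample data (an assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (ID INTEGER PRIMARY KEY, Name TEXT, Phone TEXT);
    CREATE TABLE Table2 (ID INTEGER PRIMARY KEY);
    INSERT INTO Table1 VALUES (1, 'Alice', '555-0101'),
                              (2, 'Bob',   '555-0102'),
                              (3, 'Carol', '555-0103');
    INSERT INTO Table2 VALUES (2);
""")

# Form 1: LEFT JOIN, then keep only the rows where the join found nothing.
left_join_rows = conn.execute("""
    SELECT t1.ID, t1.Name, t1.Phone
    FROM Table1 AS t1
    LEFT JOIN Table2 AS t2 ON t1.ID = t2.ID
    WHERE t2.ID IS NULL
    ORDER BY t1.ID
""").fetchall()

# Form 2: NOT EXISTS with a correlated subquery.
not_exists_rows = conn.execute("""
    SELECT t1.ID, t1.Name, t1.Phone
    FROM Table1 AS t1
    WHERE NOT EXISTS (SELECT 1 FROM Table2 AS t2 WHERE t2.ID = t1.ID)
    ORDER BY t1.ID
""").fetchall()

print(left_join_rows)   # rows of Table1 whose ID is absent from Table2
print(not_exists_rows)  # same rows
```

Both queries return Alice and Carol here, because only Bob's ID appears in Table2.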
I think you should give your tables aliases and specify which table each column comes from.
Assuming Name is from table one, Phone is from table two, and ID is common to both, the left join above can be used to get all users that do not have phone numbers.
Table 1
Id Name
1 John Smith
2 Jane Doe
Table 2
Id Phone
2 071 555 0863
Left Join without the where clause
ID Name Phone
1 John Smith NULL
2 Jane Doe 071 555 0863
Left Join with the where clause
ID Name Phone
1 John Smith NULL
This is one of the ways to implement the relational database operation of antijoin, called anti semi join in SQL Server's terminology. This is essentially "bring rows from one table that are not in another table".
The ways I can think of doing this are:
select cols from t1 left join t2 on t1.key=t2.key where t2.key is null
select cols from t1 where key not in (select key from t2)
select cols from t1 where not exists (select 1 from t2 where t1.key=t2.key)
and even
select * from t1 where key in (select key from t1 except select key from t2)
There are some differences between these methods (most notably, the danger of null handling in the case of not in), but they generally do the same.
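To make the NOT IN caveat concrete, here is a small sketch with Python's sqlite3 module. The data is invented (an assumption), with one NULL key deliberately placed in t2. The four variables correspond to the four queries listed above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (key INTEGER);
    CREATE TABLE t2 (key INTEGER);
    INSERT INTO t1 VALUES (1), (2), (3);
    INSERT INTO t2 VALUES (2), (NULL);   -- note the NULL key in t2
""")

def keys(sql):
    """Run a query and return its first column as a sorted list."""
    return sorted(row[0] for row in conn.execute(sql))

via_left_join = keys(
    "SELECT t1.key FROM t1 LEFT JOIN t2 ON t1.key = t2.key WHERE t2.key IS NULL")
via_not_in = keys(
    "SELECT key FROM t1 WHERE key NOT IN (SELECT key FROM t2)")
via_not_exists = keys(
    "SELECT key FROM t1 WHERE NOT EXISTS (SELECT 1 FROM t2 WHERE t1.key = t2.key)")
via_except = keys(
    "SELECT key FROM t1 WHERE key IN (SELECT key FROM t1 EXCEPT SELECT key FROM t2)")

print(via_left_join, via_not_in, via_not_exists, via_except)
```

With the NULL present, the NOT IN version returns no rows at all, because `key NOT IN (2, NULL)` can never evaluate to true, while the other three return keys 1 and 3.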
To address your points:
Finding it hard to understand why someone would left join on an ID and
then set it to NULL in the where clause?
As mentioned, in order to exclude results from t1 that are present in t2
Could we just omit the Table2 altogether? As in not join at all?
If you don't use the join (or any of its equivalent alternatives), you will get more results, as the rows in table1 that share an id with any row in table2 will be returned, too.
As I understand it, if the join column (ID specifically) contains NULL values, that is bad database design.
As for your query below, here are the possible scenarios in which the WHERE clause makes sense.
I am assuming that Name and Phone come from Table2 and that you are trying to find the names and phone numbers whose ID is NULL.
If Name and Phone come from Table1, and Table2 is joined only on ID with nothing selected from it, then the WHERE clause is a total waste.
SELECT
ID,
Name,
Phone
FROM
Table1
LEFT JOIN
Table2
ON
Table1.ID = Table2.ID
WHERE
Table2.ID IS NULL
Essentially, in the common business scenario above, developers add a WHERE filter to a left join when rows that have a match on the right side are not relevant and should not be part of the dataset, so they are filtered out.
So perhaps the data framework is flawed from the start, but I need to do an outer join on two tables, and I need to do it based on a concatenation of 2 columns in the second table.
For instance, table one
title | key
-------+-------
foo | Bar1
table two
subcat | pt1 | pt2
--------+-----+-----
kitty | Bar | 1
I basically need to use pt1+pt2 combined as the foreign key.
This is largely academic as I can add a column to the dataset (not my original creation) that is the concatenation; however, I wanted to know if this was possible.
Postgres version 8.4.8
cheers.bo
A join condition can be pretty much any expression; in particular, you can include string concatenation:
select ...
from t1 left outer join t2 on t1.key = t2.pt1 || t2.pt2
where ...
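Here is a runnable sketch of that idea, using Python's sqlite3 module rather than Postgres (an assumption made for convenience; SQLite also uses || for string concatenation). The table names table_one and table_two and the extra unmatched row are hypothetical, and pt2 is stored as text to keep the example simple.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_one (title TEXT, key TEXT);
    CREATE TABLE table_two (subcat TEXT, pt1 TEXT, pt2 TEXT);
    INSERT INTO table_one VALUES ('foo', 'Bar1'), ('baz', 'Qux9');  -- 'baz' has no match
    INSERT INTO table_two VALUES ('kitty', 'Bar', '1');
""")

# The ON clause compares t1.key with the concatenation pt1 || pt2.
rows = conn.execute("""
    SELECT t1.title, t1.key, t2.subcat
    FROM table_one AS t1
    LEFT OUTER JOIN table_two AS t2 ON t1.key = t2.pt1 || t2.pt2
    ORDER BY t1.title
""").fetchall()

for row in rows:
    print(row)
```

The 'foo' row picks up 'kitty', and the unmatched 'baz' row is kept with a NULL subcat because the join is an outer join.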
You can always create a sub query and perform the join against the sub query:
SELECT t1.title, t1.key, t3.subcat FROM table1 AS t1
JOIN (SELECT t2.pt1 || t2.pt2 AS ptjoined, t2.subcat
FROM tabletwo AS t2) AS t3
ON t3.ptjoined = t1.key
I have tables like this:
Table1 Table2
name1 | link_id name2 | link_id
text 1 text 2
text 2 text 4
And I want to have this result:
name1 name2 link_id
text text 1
text text 2
text text 4
How can I do this?
ADD:
Sorry, my English is not good. I have device, device_model and device_type tables with a duplicated field counter_set_id. I want to select fields from counter_set for all values of counter_set_id. I need to fetch values only from the counter_set_id fields.
Now I have this query:
SELECT `dev`.`counter_set_id`, `mod`.`counter_set_id`, `type`.`counter_set_id`
FROM `device` AS `dev`
LEFT JOIN `device_model` AS `mod` ON `dev`.`device_model_id` = `mod`.`id`
LEFT JOIN `device_type` AS `type` ON `mod`.`device_type_id` = `type`.`id`
WHERE `dev`.`id` = 4;
This returns 3 columns, but I need all values in one column.
This is the final variant, I think:
SELECT `dev`.`counter_set_id`
FROM `device` AS `dev` LEFT OUTER JOIN
`device_model` AS `mod` ON `dev`.`device_model_id` = `mod`.`id`
WHERE `dev`.`id` = 4 AND
`dev`.`counter_set_id` IS NOT NULL
UNION
SELECT `mod`.`counter_set_id`
FROM `device_model` AS `mod` LEFT OUTER JOIN
`device` AS `dev` ON `mod`.`id` = `dev`.`device_model_id`
WHERE `mod`.`counter_set_id` IS NOT NULL;
Based on the sample tables and desired output you provided, it sounds like you might want a FULL OUTER JOIN. Not all vendors implement this, but you can simulate it with a LEFT OUTER join and a UNION to an EXCEPTION join with the tables reversed like this:
Select name1, name2, A.link_id
From table1 A Left Outer Join
table2 B on A.link_id = B.link_id
Union
Select name1, name2, C.link_id
From table2 C Exception Join
table1 D on C.link_id = D.link_id
then your output would be like this:
NAME1 NAME2 LINK_ID
===== ===== =======
text <NULL> 1
text text 2
<NULL> text 4
At first I thought a join would work but now I'm not sure... I'm thinking a union of some type.
In all honesty this is a bad design imo.
select * from table1
union
select * from table2
I'm assuming that you're joining on the link_id field. A normal join would result in four rows being returned (two of which are identical). What you're really asking for isn't combining fields, but getting only the distinct rows.
Just use the distinct keyword:
select distinct t1.name1, t2.name2, t1.link_id
from Table1 as t1
inner join Table2 as t2
on t1.link_id = t2.link_id
What value does link_id 4 have for name1?
What value does link_id 1 have for name2?
Try looking up the different JOIN-Types ..
http://dev.mysql.com/doc/refman/5.0/en/join.html
Maybe I misunderstand the question at hand - sorry then.
cheers
If the three columns have the same value, just
SELECT `type`.`counter_set_id`
FROM ...
Or, better yet, if they have the same value, you can do:
SELECT `dev`.`counter_set_id`
FROM `device` AS `dev`
WHERE `dev`.`id` = 4;
If you want to concatenate them (put them all into the same field), use this:
SELECT CONCAT(
CAST(`dev`.`counter_set_id` AS CHAR),
',',
CAST(`mod`.`counter_set_id` AS CHAR),
',',
CAST(`type`.`counter_set_id` AS CHAR))
FROM ...
If you do not want to repeat a field result, use GROUP BY link_id at the end of your query.
If you want to show only link_id field then:
SELECT DISTINCT ta.link_id
FROM tblA AS ta
INNER JOIN tblB AS tb
ON ta.link_id = tb.link_id
Also look up CONCAT, CAST and other useful functions in the MySQL manual.
I hope this helps you.
Select all the IDs into a temp table (however that works in MySQL - a CTE in SQL Server), and use that as your joining table.
I am learning SQL and am trying to learn JOINs this week.
I have gotten to the level where I can do three table joins, similar to a lot of examples I've seen. I'm still trying to figure out the tiny details of how things work. All the examples I've seen of three table joins use INNER JOINS only. What about LEFT and RIGHT JOINs? Do you ever use these in three table joins? What would it mean?
SELECT ~some columns~ FROM ~table name~
LEFT JOIN ~table 2~ ON ~criteria~
INNER JOIN ~table 3~ ON ~criteria~
or
SELECT ~some columns~ FROM ~table name~
INNER JOIN ~table 2~ ON ~criteria~
LEFT JOIN ~table 3~ ON ~criteria~
or
SELECT ~some columns~ FROM ~table name~
LEFT JOIN ~table 2~ ON ~criteria~
LEFT JOIN ~table 3~ ON ~criteria~
or
???
Just trying to explore the space as much as possible
Yes, I do use all three of those JOINs, although I tend to stick to using just LEFT (OUTER) JOINs instead of inter-mixing LEFT and RIGHT JOINs. I also use FULL OUTER JOINs and CROSS JOINs.
In summary, an INNER JOIN restricts the resultset only to those records satisfied by the JOIN condition. Consider the following tables
EDIT: I've renamed the tables and prefixed them with # so that anyone reading this answer can create them as temporary tables and experiment.
If you'd also like to experiment with this in the browser, I've set this all up on SQL Fiddle too.
#Table1
id | name
---------
1 | One
2 | Two
3 | Three
4 | Four
#Table2
id | name
---------
1 | Partridge
2 | Turtle Doves
3 | French Hens
5 | Gold Rings
SQL code
CREATE TABLE #Table1 (id INT PRIMARY KEY CLUSTERED, [name] VARCHAR(25));
INSERT INTO #Table1 VALUES(1, 'One');
INSERT INTO #Table1 VALUES(2, 'Two');
INSERT INTO #Table1 VALUES(3, 'Three');
INSERT INTO #Table1 VALUES(4, 'Four');
CREATE TABLE #Table2 (id INT PRIMARY KEY CLUSTERED, [name] VARCHAR(25));
INSERT INTO #Table2 VALUES(1, 'Partridge');
INSERT INTO #Table2 VALUES(2, 'Turtle Doves');
INSERT INTO #Table2 VALUES(3, 'French Hens');
INSERT INTO #Table2 VALUES(5, 'Gold Rings');
An INNER JOIN SQL Statement, joined on the id field
SELECT
t1.id,
t1.name,
t2.name
FROM
#Table1 t1
INNER JOIN
#Table2 t2
ON
t1.id = t2.id
Results in
id | name | name
----------------
1 | One | Partridge
2 | Two | Turtle Doves
3 | Three| French Hens
A LEFT JOIN will return a resultset with all records from the table on the left hand side of the join (if you were to write out the statement as a one liner, the table that appears first) and fields from the table on the right side of the join that match the join expression and are included in the SELECT clause. Missing details will be populated with NULL
SELECT
t1.id,
t1.name,
t2.name
FROM
#Table1 t1
LEFT JOIN
#Table2 t2
ON
t1.id = t2.id
Results in
id | name | name
----------------
1 | One | Partridge
2 | Two | Turtle Doves
3 | Three| French Hens
4 | Four | NULL
A RIGHT JOIN is the same logic as a LEFT JOIN but will return all records from the right-hand side of the join and fields from the left side that match the join expression and are included in the SELECT clause.
SELECT
t1.id,
t1.name,
t2.name
FROM
#Table1 t1
RIGHT JOIN
#Table2 t2
ON
t1.id = t2.id
Results in
id | name | name
----------------
1 | One | Partridge
2 | Two | Turtle Doves
3 | Three| French Hens
NULL| NULL| Gold Rings
Of course, there is also the FULL OUTER JOIN, which includes records from both joined tables and populates any missing details with NULL.
SELECT
t1.id,
t1.name,
t2.name
FROM
#Table1 t1
FULL OUTER JOIN
#Table2 t2
ON
t1.id = t2.id
Results in
id | name | name
----------------
1 | One | Partridge
2 | Two | Turtle Doves
3 | Three| French Hens
4 | Four | NULL
NULL| NULL| Gold Rings
And a CROSS JOIN (also known as a CARTESIAN PRODUCT), which is simply the product of cross applying fields in the SELECT statement from one table with the fields in the SELECT statement from the other table. Notice that there is no join expression in a CROSS JOIN
SELECT
t1.id,
t1.name,
t2.name
FROM
#Table1 t1
CROSS JOIN
#Table2 t2
Results in
id | name | name
------------------
1 | One | Partridge
2 | Two | Partridge
3 | Three | Partridge
4 | Four | Partridge
1 | One | Turtle Doves
2 | Two | Turtle Doves
3 | Three | Turtle Doves
4 | Four | Turtle Doves
1 | One | French Hens
2 | Two | French Hens
3 | Three | French Hens
4 | Four | French Hens
1 | One | Gold Rings
2 | Two | Gold Rings
3 | Three | Gold Rings
4 | Four | Gold Rings
EDIT:
Imagine there is now a Table3
#Table3
id | name
---------
2 | Prime 1
3 | Prime 2
5 | Prime 3
The SQL code
CREATE TABLE #Table3 (id INT PRIMARY KEY CLUSTERED, [name] VARCHAR(25));
INSERT INTO #Table3 VALUES(2, 'Prime 1');
INSERT INTO #Table3 VALUES(3, 'Prime 2');
INSERT INTO #Table3 VALUES(5, 'Prime 3');
Now all three tables joined with INNER JOINS
SELECT
t1.id,
t1.name,
t2.name,
t3.name
FROM
#Table1 t1
INNER JOIN
#Table2 t2
ON
t1.id = t2.id
INNER JOIN
#Table3 t3
ON
t1.id = t3.id
Results in
id | name | name | name
-------------------------------
2 | Two | Turtle Doves | Prime 1
3 | Three| French Hens | Prime 2
It might help to understand this result by thinking that records with id 2 and 3 are the only ones common to all 3 tables and are also the field we are joining each table on.
Now all three with LEFT JOINS
SELECT
t1.id,
t1.name,
t2.name,
t3.name
FROM
#Table1 t1
LEFT JOIN
#Table2 t2
ON
t1.id = t2.id
LEFT JOIN
#Table3 t3
ON
t1.id = t3.id
Results in
id | name | name | name
-------------------------------
1 | One | Partridge | NULL
2 | Two | Turtle Doves | Prime 1
3 | Three| French Hens | Prime 2
4 | Four | NULL | NULL
Joel's answer gives a good explanation of this resultset (Table1 is the base/origin table).
Now with an INNER JOIN and a LEFT JOIN
SELECT
t1.id,
t1.name,
t2.name,
t3.name
FROM
#Table1 t1
INNER JOIN
#Table2 t2
ON
t1.id = t2.id
LEFT JOIN
#Table3 t3
ON
t1.id = t3.id
Results in
id | name | name | name
-------------------------------
1 | One | Partridge | NULL
2 | Two | Turtle Doves | Prime 1
3 | Three| French Hens | Prime 2
Although we do not know the order in which the query optimiser will perform the operations, we will look at this query from top to bottom to understand the resultset. The INNER JOIN on ids between Table1 and Table2 will restrict the resultset to only those records satisfied by the join condition, i.e. the three rows that we saw in the very first example. This temporary resultset will then be LEFT JOINed to Table3 on ids between Table1 and Table3. There are records in Table3 with ids 2 and 3, but not id 1, so the t3.name field will be populated for ids 2 and 3 but not for 1.
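If you want to replay the mixed-join case without SQL Server, here is a sketch using Python's sqlite3 module (an assumption; the tables are created without the # prefix). It loads the same three tables and runs the query twice: once as INNER then LEFT, as above, and once with the two join types swapped, which is one of the combinations the question asks about.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE Table2 (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE Table3 (id INTEGER PRIMARY KEY, name TEXT);
    INSERT INTO Table1 VALUES (1, 'One'), (2, 'Two'), (3, 'Three'), (4, 'Four');
    INSERT INTO Table2 VALUES (1, 'Partridge'), (2, 'Turtle Doves'),
                              (3, 'French Hens'), (5, 'Gold Rings');
    INSERT INTO Table3 VALUES (2, 'Prime 1'), (3, 'Prime 2'), (5, 'Prime 3');
""")

# Same query shape as in the answer; only the two join types vary.
QUERY = """
    SELECT t1.id, t1.name, t2.name, t3.name
    FROM Table1 AS t1
    {first} JOIN Table2 AS t2 ON t1.id = t2.id
    {second} JOIN Table3 AS t3 ON t1.id = t3.id
    ORDER BY t1.id
"""

inner_then_left = conn.execute(QUERY.format(first="INNER", second="LEFT")).fetchall()
left_then_inner = conn.execute(QUERY.format(first="LEFT", second="INNER")).fetchall()

print(inner_then_left)  # ids 1, 2, 3; id 1 has NULL for the Table3 name
print(left_then_inner)  # ids 2, 3 only; the INNER JOIN to Table3 drops 1 and 4
```

Swapping the join types changes the result because the INNER JOIN to Table3 requires a matching id in Table3, no matter how Table2 was joined.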
Joins are just ways of combining tables. Joining three tables is no different than joining 2... or 200. You can mix and match INNER, [LEFT/RIGHT/FULL] OUTER, and even CROSS joins as much as you want. The only difference is which results are kept: INNER joins only keep rows where both sides match the expression. OUTER joins pick an "origin" table depending on the LEFT/RIGHT/FULL specification, always keep all rows from the origin table, and supply NULL values for rows from the other side that don't match the expression. CROSS joins return all possible combinations of both sides.
The trick is that because you're working with declarative code rather than more-familiar iterative, the temptation is to try to think of it as if everything happens at once. When you do that, you try to wrap your head around the entire query and it can get confusing.
Instead, you want to think of it as if the joins happen in order, from the first table listed to the last. This actually is not how it works, because the query optimizer can re-order things to make them run faster. But it makes building the query easier for the developer.
So with three tables, you start with your base table, then join in the values you need from the next table, and the next, and so on, just like adding lines of code to a function to produce the required output.
As for using the different join types, I've used all the different types I listed here: INNER, LEFT OUTER, RIGHT OUTER, FULL OUTER, and CROSS. But most of those you only need to use occasionally. INNER JOIN and LEFT JOIN will cover probably 95% or more of what you want to do.
Now let's talk about performance. Often times the order you list tables is dictated to you: you start from TableA and you need to list TableB first in order to have access to columns required to join in TableC. But sometimes both TableB and TableC only depend on TableA, and you could list them in either order. When that happens the query optimizer will usually pick the best order for you, but sometimes it doesn't know how. Even if it did, it helps to have a good system for listing tables so you can always look at a query and know that it's "right".
With that in mind, you should think of a query in terms of the working set currently in memory as the query builds. When you start with TableA, the database looks at all the columns from TableA in the select list or anywhere else (like WHERE or ORDER BY clauses, or potential indexes) in the query, factors in relevant conditions from the WHERE clause, and loads the smallest portion of that table into memory that it can get away with. It does this for each table in turn, always loading as little as possible. And that's the key: you want to keep this working set as small as possible for as long as possible.
So, going back to our three-table join, we want to list the tables in the order that will keep the working set smaller for longer. This means listing the smaller table above the larger one. Another good rule of thumb is that INNER joins tend to shrink result sets, while OUTER joins tend to grow result sets, and so you want to list your INNER joins first. However, this is not a requirement for a query to work, nor is it always true; sometimes the reverse can happen as well.
Finally, I want to point out again that this isn't how it really works. The query optimizer and execution plans are a very complex topic, and there are lots of tricks the database can take that break this model from time to time. It's just one model that you as a developer can use to help understand what the server is doing and help you write better queries.
Selecting from three tables is no different from selecting from only two (or as many as a hundred, though that would be a fairly ugly query to read).
For EACH join you write, having INNER indicates that you only want rows that successfully join those two tables together. If other tables were joined earlier in the query, those results are now completely irrelevant, except to the extent your own join conditions call on them.
For example:
SELECT person.*
FROM person
LEFT JOIN vehicle ON (person.person_id = vehicle.owner_id)
LEFT JOIN house ON (person.person_id = house.owner_id)
Here I want a list of all people, and (if available) all the vehicles and houses they own.
Alternatively:
SELECT person.*
FROM person
INNER JOIN vehicle ON (person.person_id = vehicle.owner_id)
LEFT JOIN house ON (person.person_id = house.owner_id)
Here I want all people who own vehicles (they must own a vehicle to get results in my query), and (if available) all the houses they own.
Each join is completely separate here.
Of course, by varying what you put in the ON clause, you can make joins interrelate tables any way you want.
This really depends on what you are doing. I've written many 3+ table queries that will have an outer join in them. It just depends on the data you are querying and what you are trying to follow.
The same general logic applies when selecting the join type when you have multiples as with single join queries.
For the sake of this example, let's say we have a table "employees" with ID, NAME and MANAGER_ID fields.
Here is a simple query:
SELECT E.ID, E.NAME, M.NAME AS MANAGER
FROM EMPLOYEES E
JOIN EMPLOYEES M ON E.MANAGER_ID = M.ID
This will return all of the employees that have a manager, along with the manager's name. But what happens for the boss, who has no manager? A database NULL would actually prevent that row from being returned, as no matching record could be found to join on. Thus you would use an OUTER join (left or right depending on how you write the query).
The same logic would hold for writing a query with 2+n joins. If you are possibly going to have rows that don't have matches in your join clause, and want those rows to come back (albeit with nulls), then use an outer join and you are golden.
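A quick sketch of the boss case, run through Python's sqlite3 module. The three employees are invented sample data (an assumption); Ann is the boss and has a NULL manager_id.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, manager_id INTEGER);
    INSERT INTO employees VALUES (1, 'Ann', NULL),  -- the boss: no manager
                                 (2, 'Bob', 1),
                                 (3, 'Cy',  1);
""")

# Plain (inner) self-join: the boss has no matching manager row and is dropped.
inner_rows = conn.execute("""
    SELECT e.id, e.name, m.name AS manager
    FROM employees AS e
    JOIN employees AS m ON e.manager_id = m.id
    ORDER BY e.id
""").fetchall()

# LEFT JOIN: every employee is kept, with NULL where there is no manager.
left_rows = conn.execute("""
    SELECT e.id, e.name, m.name AS manager
    FROM employees AS e
    LEFT JOIN employees AS m ON e.manager_id = m.id
    ORDER BY e.id
""").fetchall()

print(inner_rows)
print(left_rows)
```

Only the LEFT JOIN version includes Ann, with None (NULL) in the manager column.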
Read this great article on outer joins by a well known expert Terry Purcell
also a great write up by Plamen Ratchev
On some SQL engines there's an issue when you're joining using a left join.
If you join A->B->C and the row in B doesn't exist then the join column from B is NULL.
A few I've used require that the join from B->C must be a left join if the join from A->B is a left join.
This is ok
select a.*, b.*, c.*
from a
left join b on b.id = a.id
left join c on c.id = b.id
this is not
select a.*, b.*, c.*
from a
left join b on b.id = a.id
inner join c on c.id = b.id
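I can't reproduce an engine that rejects the second query, but even where it is accepted it usually isn't what you meant. Here is a sketch in SQLite via Python's sqlite3 module (my choice of engine and my sample rows, both assumptions): the inner join to c compares against a NULL b.id for unmatched rows, so those rows disappear and the first left join is effectively undone.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER);
    CREATE TABLE b (id INTEGER);
    CREATE TABLE c (id INTEGER);
    INSERT INTO a VALUES (1), (2);   -- a.id = 2 has no row in b
    INSERT INTO b VALUES (1);
    INSERT INTO c VALUES (1);
""")

# A -> B -> C with LEFT JOINs all the way: every row of a survives.
left_left = conn.execute("""
    SELECT a.id, b.id, c.id
    FROM a
    LEFT JOIN b ON b.id = a.id
    LEFT JOIN c ON c.id = b.id
    ORDER BY a.id
""").fetchall()

# Same chain, but B -> C is an INNER JOIN: rows where b.id is NULL are lost.
left_inner = conn.execute("""
    SELECT a.id, b.id, c.id
    FROM a
    LEFT JOIN b ON b.id = a.id
    INNER JOIN c ON c.id = b.id
    ORDER BY a.id
""").fetchall()

print(left_left)
print(left_inner)
```

The first query keeps the row for a.id = 2 padded with NULLs; the second returns only the row for a.id = 1.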
For the sake of completeness and standard evangelics, I'll chime in with the ansi-92 nested join syntax:
select t1.*
,t2.*
,t3.*
from table1 t1
left outer join (
table2 t2 left outer join table3 t3 on (t2.b = t3.b)
) on (t1.a = t2.a)
Your SQL engine of choice may optimize for them.
SQLite only has INNER and LEFT JOIN.
Is there a way to do a FULL OUTER JOIN with SQLite?
Yes, see the example on Wikipedia.
SELECT employee.*, department.*
FROM employee
LEFT JOIN department
ON employee.DepartmentID = department.DepartmentID
UNION ALL
SELECT employee.*, department.*
FROM department
LEFT JOIN employee
ON employee.DepartmentID = department.DepartmentID
WHERE employee.DepartmentID IS NULL
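For anyone who wants to try this, here is a sketch that runs the same UNION ALL pattern through Python's sqlite3 module. The column names LastName and DepartmentName and the sample rows are assumptions of mine, chosen so that there is one employee without a department and one department without employees.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (LastName TEXT, DepartmentID INTEGER);
    CREATE TABLE department (DepartmentID INTEGER, DepartmentName TEXT);
    INSERT INTO employee VALUES ('Rafferty', 31), ('Jones', 33), ('Williams', NULL);
    INSERT INTO department VALUES (31, 'Sales'), (33, 'Engineering'), (35, 'Marketing');
""")

# First half: every employee, with department data where it exists.
# Second half: only the departments that matched no employee at all.
rows = conn.execute("""
    SELECT employee.LastName, department.DepartmentName
    FROM employee
    LEFT JOIN department
        ON employee.DepartmentID = department.DepartmentID
    UNION ALL
    SELECT employee.LastName, department.DepartmentName
    FROM department
    LEFT JOIN employee
        ON employee.DepartmentID = department.DepartmentID
    WHERE employee.DepartmentID IS NULL
""").fetchall()

for row in rows:
    print(row)
```

The result has four rows: the two matched pairs, Williams with no department, and Marketing with no employee. The WHERE clause in the second half is what stops the matched pairs from appearing twice.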
FULL OUTER JOIN is natively supported starting from SQLite 3.39.0:
2.1. Determination of input data (FROM clause processing)
A "FULL JOIN" or "FULL OUTER JOIN" is a combination of a "LEFT JOIN" and a "RIGHT JOIN". Extra rows of output are added for each row in left dataset that matches no rows in the right, and for each row in the right dataset that matches no rows in the left. Unmatched columns are filled in with NULL.
Demo:
CREATE TABLE t1 AS
SELECT 1 AS id, 'A' AS col UNION
SELECT 2 AS id, 'B' AS col;
CREATE TABLE t2 AS
SELECT 1 AS id, 999 AS val UNION
SELECT 3 AS id, 100 AS val;
Query:
SELECT *
FROM t1
FULL JOIN t2
ON t1.id = t2.id;
db<>fiddle demo
Following Jonathan Leffler's comment in Mark Byers' answer, here's an alternative answer which uses UNION instead of UNION ALL:
SELECT * FROM table_name_1 LEFT OUTER JOIN table_name_2 ON id_1 = id_2
UNION
SELECT * FROM table_name_2 LEFT OUTER JOIN table_name_1 ON id_1 = id_2
Edit: The original source for the SQLite example above and from where further SQLite examples could be found was http://sqlite.awardspace.info/syntax/sqlitepg06.htm but it seems as though that site is now returning a 404 Not Found error.
For people searching for a way to emulate a distinct full outer join:
Because SQLite (at least the version tested below) supports neither a Full Outer Join nor a Right Join, I had to emulate a distinct full outer join / an inverted inner join (whatever you want to call it).
The following Venn diagram shows the expected output:
To get this expected output, I combined two Left Join clauses (the example refers to two identically structured tables with partially differing data; I wanted to output only the data that appears either in table A or in table B, but not in both).
SELECT A.flightNumber, A.offblockTime, A.airspaceCount, A.departure, A.arrival FROM D2flights A
LEFT JOIN D1flights B
ON A.flightNumber = B.flightNumber
WHERE B.flightNumber IS NULL
UNION
SELECT A.flightNumber, A.offblockTime, A.airspaceCount, A.departure, A.arrival FROM D1flights A
LEFT JOIN D2flights B
ON A.flightNumber = B.flightNumber
WHERE B.flightNumber IS NULL
The SQLite statement above returns the expected result in one query. It appears that the UNION clause also orders the output by the flightNumber column.
The code has been tested with SQLite version 3.32.2
I will belatedly pitch in my 2 cents. Consider the 2 simple tables people1 and people2 below:
id name age
0 1 teo 59
1 2 niko 57
2 3 maria 54
id name weight
0 1 teo 186
1 2 maria 125
2 3 evi 108
First, we create a temporary view, v_all, where we combine the two opposite LEFT JOINs with UNION, as below:
CREATE TEMP VIEW v_all AS
SELECT p1.name AS name1, p1.age,
p2.name AS name2, p2.weight
FROM people1 p1
LEFT JOIN people2 AS p2
USING (name)
UNION
SELECT p1.name AS name1, p1.age,
p2.name AS name2, p2.weight
FROM people2 AS p2
LEFT JOIN people1 AS p1
USING (name);
However, we end up with 2 name columns, name1 and name2, each of which may be NULL or equal to the other. What we want is to combine name1 and name2 into a single column, name. We can do that with a CASE expression as below:
SELECT age,weight,
CASE
WHEN name1 IS NULL
THEN name2
WHEN name2 IS NULL
THEN name1
WHEN name1=name2
THEN name1
END name
FROM v_all
And we finally end up with:
name weight age
0 evi 108 None
1 maria 125 54
2 niko None 57
3 teo 186 59
Of course you could combine the two in a single query, without having to create a temp view. I avoided doing so in order to highlight the insufficiency of just 2 left joins and a union, which is what I have seen recommended so far.
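For reference, a single-query version might look like the sketch below, run through Python's sqlite3 module. Note that COALESCE is swapped in here for the CASE expression (my substitution, not part of the answer above), and the joins use ON instead of USING; the tables are rebuilt from the people1 and people2 data shown earlier.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE people1 (id INTEGER, name TEXT, age INTEGER);
    CREATE TABLE people2 (id INTEGER, name TEXT, weight INTEGER);
    INSERT INTO people1 VALUES (1, 'teo', 59), (2, 'niko', 57), (3, 'maria', 54);
    INSERT INTO people2 VALUES (1, 'teo', 186), (2, 'maria', 125), (3, 'evi', 108);
""")

# COALESCE picks whichever side's name is not NULL, so the two name
# columns collapse into one before UNION removes the duplicate rows.
rows = conn.execute("""
    SELECT COALESCE(p1.name, p2.name) AS name, p1.age, p2.weight
    FROM people1 AS p1
    LEFT JOIN people2 AS p2 ON p1.name = p2.name
    UNION
    SELECT COALESCE(p1.name, p2.name) AS name, p1.age, p2.weight
    FROM people2 AS p2
    LEFT JOIN people1 AS p1 ON p1.name = p2.name
    ORDER BY 1
""").fetchall()

for row in rows:
    print(row)
```

This gives one row per person: evi has no age, niko has no weight, and maria and teo have both.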