Shouldn't a full join get both sides even if one side has a null of both tables - sql

This is not the full query (it is a subquery in a larger query). However, I know this is the part that is causing the problem, because I have taken it out on its own and traced the problem to it. I am trying to get both sides of two tables that don't always have matching composite keys. In particular, the GKey is not always the same.
I am currently using a full join, but I still get gaps on both sides. There is an example of how the data should essentially come out. You can see that there are nulls on both sides of the two different tables.
Input
| A | B |
| 2 | NULL |
| NULL | 5 |
| 3 | 3 |
| NULL | 6 |
Outcome
| A | B |
| 3 | 3 |
SELECT DISTINCT BudgetUnit / (WorkDaysInMonth * 8) AS B
,Unit / (WorkDaysInMonth * 8) AS A
FROM BFact AS BMF
FULL JOIN GFact AS GF ON
BMF.GLKey = GF.GKey
AND
BMF.DKey = GF.DKey
AND GF.AKey = BMF.BKey
AND GF.PKey = BMF.PKey
INNER JOIN DimDate AS DA ON GF.AKey = DA.DateKey

The FULL JOIN does produce rows that exist only in BFact, but in those rows every GFact column, including GF.AKey, is NULL. The INNER JOIN to DimDate that follows requires GF.AKey = DA.DateKey, which a NULL can never satisfy, so those rows are discarded and the FULL JOIN effectively stops behaving like an outer join on that side.
You could change that last INNER JOIN into a LEFT OUTER JOIN to get the desired results.
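To see the effect in isolation, here is a minimal sketch using Python's built-in sqlite3 module. The three tiny tables and their values are made up for illustration (only the table names mirror the question), and a LEFT JOIN stands in for the FULL JOIN because older SQLite builds lack FULL JOIN; the way the unmatched rows get dropped is the same.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE BFact (BKey INTEGER, BudgetUnit INTEGER);
CREATE TABLE GFact (AKey INTEGER, Unit INTEGER);
CREATE TABLE DimDate (DateKey INTEGER);
INSERT INTO BFact VALUES (1, 5), (2, 3);  -- key 1 has no matching GFact row
INSERT INTO GFact VALUES (2, 3);
INSERT INTO DimDate VALUES (1), (2);
""")

# Outer join followed by an INNER JOIN on a column of the outer-joined table:
# the unmatched BFact row has GF.AKey = NULL and is filtered out again.
outer_then_inner = conn.execute("""
    SELECT BMF.BudgetUnit, GF.Unit
    FROM BFact AS BMF
    LEFT JOIN GFact AS GF ON GF.AKey = BMF.BKey
    INNER JOIN DimDate AS DA ON GF.AKey = DA.DateKey
    ORDER BY BMF.BKey
""").fetchall()

# Same query with the last join changed to LEFT JOIN: the unmatched row survives.
outer_then_outer = conn.execute("""
    SELECT BMF.BudgetUnit, GF.Unit
    FROM BFact AS BMF
    LEFT JOIN GFact AS GF ON GF.AKey = BMF.BKey
    LEFT JOIN DimDate AS DA ON GF.AKey = DA.DateKey
    ORDER BY BMF.BKey
""").fetchall()

print(outer_then_inner)
print(outer_then_outer)
```

The first query keeps only the matched row; the second also keeps the BFact row whose GFact side is NULL.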

Related

Access Query Combine Records from Many-to-Many into one row

This should be easy but I'm having a hard time.
I have a many-to-many relationship where, for example, many cars can have many components.
So I want a query to return all cars and the subsequent components used. If no component is used it should just return a NULL value.
Car | Engine | Tyre
----------------------
1 | Engine3 |
2 | Engine4 | Tyre3
3 | Engine1 | Tyre1
But with the following SQL:
SELECT Car.idCar, Engine.idEngine, Tyre.idTyre
FROM ((Component
RIGHT JOIN (Car
LEFT JOIN Car_Component ON Car.idCar = Car_Component.idCar) ON Component.idComponent = Car_Component.idComponent)
LEFT JOIN Engine ON Component.idComponent = Engine.idComponent)
LEFT JOIN Tyre ON Component.idComponent = Tyre.idComponent;
I get:
Car | Engine | Tyre
----------------------
1 | Engine3 |
2 | Engine4 |
2 | | Tyre3
3 | Engine1 |
3 | | Tyre1
I've been searching for a solution for quite some time now and I'm pretty sure I need to use subqueries, but my knowledge of subqueries is limited and I don't know how to start.
Here is the problem in SQL Fiddle.
Does the following query work for you? (SQL Fiddle)
SELECT DISTINCT Car.idCar, Engine.idEngine, Tyre.idTyre
FROM (((Car
INNER JOIN Car_Component ON Car.idCar = Car_Component.idCar)
INNER JOIN Component ON Car_Component.idComponent = Component.idComponent)
LEFT JOIN Engine ON Component.idComponent = Engine.idComponent)
LEFT JOIN Tyre ON Component.idComponent = Tyre.idComponent;
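If the joined result still has one row per component, one possible way to collapse it to one row per car is to group by car and take MAX of each component column, since MAX ignores NULLs. This is only a sketch under assumptions: it runs on SQLite through Python's sqlite3 module rather than on Access, the component ids are invented, the Component table is left out because the link table already carries idComponent, and it assumes each car has at most one engine and one tyre.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Car (idCar INTEGER);
CREATE TABLE Car_Component (idCar INTEGER, idComponent INTEGER);
CREATE TABLE Engine (idComponent INTEGER, idEngine TEXT);
CREATE TABLE Tyre (idComponent INTEGER, idTyre TEXT);
INSERT INTO Car VALUES (1), (2), (3);
-- component ids 10..14 are hypothetical
INSERT INTO Engine VALUES (10, 'Engine3'), (11, 'Engine4'), (13, 'Engine1');
INSERT INTO Tyre VALUES (12, 'Tyre3'), (14, 'Tyre1');
INSERT INTO Car_Component VALUES (1, 10), (2, 11), (2, 12), (3, 13), (3, 14);
""")

# One row per car: MAX() skips the NULLs produced by the LEFT JOINs.
rows = conn.execute("""
    SELECT Car.idCar,
           MAX(Engine.idEngine) AS Engine,
           MAX(Tyre.idTyre)     AS Tyre
    FROM Car
    LEFT JOIN Car_Component ON Car.idCar = Car_Component.idCar
    LEFT JOIN Engine ON Car_Component.idComponent = Engine.idComponent
    LEFT JOIN Tyre ON Car_Component.idComponent = Tyre.idComponent
    GROUP BY Car.idCar
    ORDER BY Car.idCar
""").fetchall()

for row in rows:
    print(row)
```

Access would need its own nested-parentheses join syntax, but the GROUP BY and MAX idea carries over.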

Difference between natural join and inner join

What is the difference between a natural join and an inner join?
One significant difference between INNER JOIN and NATURAL JOIN is the number of columns returned.
Consider:
TableA TableB
+------------+----------+ +--------------------+
|Column1 | Column2 | |Column1 | Column3 |
+-----------------------+ +--------------------+
| 1 | 2 | | 1 | 3 |
+------------+----------+ +---------+----------+
The INNER JOIN of TableA and TableB on Column1 will return:
SELECT * FROM TableA AS a INNER JOIN TableB AS b ON a.Column1 = b.Column1;
-- (With USING (Column1) instead of ON, standard SQL lists Column1 only once under SELECT *.)
+------------+-----------+---------------------+
| a.Column1 | a.Column2 | b.Column1| b.Column3|
+------------------------+---------------------+
| 1 | 2 | 1 | 3 |
+------------+-----------+----------+----------+
The NATURAL JOIN of TableA and TableB on Column1 will return:
SELECT * FROM TableA NATURAL JOIN TableB
+------------+----------+----------+
|Column1 | Column2 | Column3 |
+-----------------------+----------+
| 1 | 2 | 3 |
+------------+----------+----------+
The repeated column is avoided.
(AFAICT from the standard grammar, you can't specify the joining columns in a natural join; the join is strictly name-based. See also Wikipedia.)
(There's a cheat in the inner join output; the a. and b. parts would not be in the column names; you'd just have column1, column2, column1, column3 as the headings.)
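The column-count difference is easy to check for yourself. Below is a small sketch using Python's built-in sqlite3 module (SQLite is just an assumption here; any DBMS with NATURAL JOIN should behave alike) with the same TableA and TableB as above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE TableA (Column1 INTEGER, Column2 INTEGER);
CREATE TABLE TableB (Column1 INTEGER, Column3 INTEGER);
INSERT INTO TableA VALUES (1, 2);
INSERT INTO TableB VALUES (1, 3);
""")

# INNER JOIN ... ON keeps both copies of the join column.
inner = conn.execute(
    "SELECT * FROM TableA AS a INNER JOIN TableB AS b ON a.Column1 = b.Column1"
)
inner_columns = [d[0] for d in inner.description]
inner_rows = inner.fetchall()

# NATURAL JOIN returns the shared column only once.
natural = conn.execute("SELECT * FROM TableA NATURAL JOIN TableB")
natural_columns = [d[0] for d in natural.description]
natural_rows = natural.fetchall()

print(len(inner_columns), inner_rows)
print(len(natural_columns), natural_rows)
```

The inner join yields four columns and the natural join three, for the same single matching row.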
An inner join is one where the matching row in the joined table is required for a row from the first table to be returned
An outer join is one where the matching row in the joined table is not required for a row from the first table to be returned
A natural join is a join (you can have either natural left or natural right) that assumes the join criteria to be where same-named columns in both tables match
I would avoid using natural joins like the plague, because natural joins are:
not supported by every DBMS (SQL Server, for example, has no NATURAL JOIN) and therefore not portable, not particularly readable (by most SQL coders) and possibly not supported by various tools/libraries
not informative; you can't tell what columns are being joined on without referring to the schema
your join conditions are invisibly vulnerable to schema changes - if there are multiple natural join columns and one such column is removed from a table, the query will still execute, but probably not correctly and this change in behaviour will be silent
hardly worth the effort; you're only saving about 10 seconds of typing
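The silent-breakage point can be reproduced in a few lines. This is a sketch using Python's sqlite3 module; the room and booking tables and the later-added floor column are hypothetical, chosen only to show a NATURAL JOIN changing its join condition after a schema change.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE room (room_number INTEGER, floor INTEGER);
CREATE TABLE booking (room_number INTEGER, guest TEXT);
INSERT INTO room VALUES (101, 1), (202, 2);
INSERT INTO booking VALUES (101, 'Ann'), (202, 'Bob');
""")

query = """
    SELECT room_number, guest
    FROM room NATURAL JOIN booking
    ORDER BY room_number
"""

# Only room_number is shared, so the join matches on it alone.
before = conn.execute(query).fetchall()

# Someone later adds a same-named column with an unrelated meaning.
conn.execute("ALTER TABLE booking ADD COLUMN floor INTEGER DEFAULT 0")

# The unchanged query now also requires room.floor = booking.floor.
after = conn.execute(query).fetchall()

print(before)
print(after)
```

The query text never changed and no error is raised, yet the second run returns no rows.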
A natural join is just a shortcut to avoid typing, with a presumption that the join is simple and matches fields of the same name.
SELECT
*
FROM
table1
NATURAL JOIN
table2
-- implicitly uses `room_number` to join
Is the same as...
SELECT
*
FROM
table1
INNER JOIN
table2
ON table1.room_number = table2.room_number
What you can't do with the shortcut format, however, is more complex joins...
SELECT
*
FROM
table1
INNER JOIN
table2
ON (table1.room_number = table2.room_number)
OR (table1.room_number IS NULL AND table2.room_number IS NULL)
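To make the difference concrete, here is a sketch with Python's sqlite3 module. The guest and rate columns and all row values are invented for illustration; only table1, table2 and room_number come from the queries above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (room_number INTEGER, guest TEXT);
CREATE TABLE table2 (room_number INTEGER, rate INTEGER);
INSERT INTO table1 VALUES (101, 'Ann'), (NULL, 'Bob');
INSERT INTO table2 VALUES (101, 80), (NULL, 50);
""")

# NATURAL JOIN compares room_number with "=", so NULL never matches NULL.
natural_rows = conn.execute("""
    SELECT guest, rate
    FROM table1 NATURAL JOIN table2
    ORDER BY guest
""").fetchall()

# The explicit ON clause can treat two NULLs as a match.
null_aware_rows = conn.execute("""
    SELECT table1.guest, table2.rate
    FROM table1
    INNER JOIN table2
        ON (table1.room_number = table2.room_number)
        OR (table1.room_number IS NULL AND table2.room_number IS NULL)
    ORDER BY table1.guest
""").fetchall()

print(natural_rows)
print(null_aware_rows)
```

Only the explicit join pairs up the two rows whose room_number is NULL.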
SQL is not faithful to the relational model in many ways. The result of a SQL query is not a relation because it may have columns with duplicate names, 'anonymous' (unnamed) columns, duplicate rows, nulls, etc. SQL doesn't treat tables as relations because it relies on column ordering etc.
The idea behind NATURAL JOIN in SQL is to make it easier to be more faithful to the relational model. The result of the NATURAL JOIN of two tables will have columns de-duplicated by name, hence no anonymous columns. Similarly, UNION CORRESPONDING and EXCEPT CORRESPONDING are provided to address SQL's dependence on column ordering in the legacy UNION syntax.
However, as with all programming techniques it requires discipline to be useful. One requirement for a successful NATURAL JOIN is consistently named columns, because joins are implied on columns with the same names (it is a shame that the syntax for renaming columns in SQL is verbose but the side effect is to encourage discipline when naming columns in base tables and VIEWs :)
Note a SQL NATURAL JOIN is an equi-join**, however this is no bar to usefulness. Consider that if NATURAL JOIN was the only join type supported in SQL it would still be relationally complete.
While it is indeed true that any NATURAL JOIN may be written using INNER JOIN and projection (SELECT), it is also true that any INNER JOIN may be written using product (CROSS JOIN) and restriction (WHERE); further note that a NATURAL JOIN between tables with no column names in common will give the same result as CROSS JOIN. So if you are only interested in results that are relations (and why ever not?!) then NATURAL JOIN is the only join type you need. Sure, it is true that from a language design perspective shorthands such as INNER JOIN and CROSS JOIN have their value, but also consider that almost any SQL query can be written in 10 syntactically different, but semantically equivalent, ways and this is what makes SQL optimizers so very hard to develop.
Here are some example queries (using the usual parts and suppliers database) that are semantically equivalent:
SELECT *
FROM S NATURAL JOIN SP;
-- Must disambiguate and 'project away' duplicate SNO attribute
SELECT S.SNO, SNAME, STATUS, CITY, PNO, QTY
FROM S INNER JOIN SP
USING (SNO);
-- Alternative projection
SELECT S.*, PNO, QTY
FROM S INNER JOIN SP
ON S.SNO = SP.SNO;
-- Same columns, different order == equivalent?!
SELECT SP.*, S.SNAME, S.STATUS, S.CITY
FROM S INNER JOIN SP
ON S.SNO = SP.SNO;
-- 'Old school'
SELECT S.*, PNO, QTY
FROM S, SP
WHERE S.SNO = SP.SNO;
** Relational natural join is not an equijoin, it is a projection of one. – philipxy
A NATURAL join is just short syntax for a specific INNER join -- or "equi-join" -- and, once the syntax is unwrapped, both represent the same Relational Algebra operation. It's not a "different kind" of join, as with the case of OUTER (LEFT/RIGHT) or CROSS joins.
See the equi-join section on Wikipedia:
A natural join offers a further specialization of equi-joins. The join predicate arises implicitly by comparing all columns in both tables that have the same column-names in the joined tables. The resulting joined table contains only one column for each pair of equally-named columns.
Most experts agree that NATURAL JOINs are dangerous and therefore strongly discourage their use. The danger comes from inadvertently adding a new column, named the same as another column ...
That is, all NATURAL joins may be written as INNER joins (but the converse is not true). To do so, just create the predicate explicitly -- e.g. USING or ON -- and, as Jonathan Leffler pointed out, select the desired result-set columns to avoid "duplicates" if desired.
Happy coding.
(The NATURAL keyword can also be applied to LEFT and RIGHT joins, and the same applies. A NATURAL LEFT/RIGHT join is just a short syntax for a specific LEFT/RIGHT join.)
Natural Join: It is combination or combined result of all the columns in the two tables.
It will return all rows of the first table with respect to the second table.
Inner Join: This join will work unless any of the column names are the same in the two tables
A Natural Join is where 2 tables are joined on the basis of all common columns.
common column : is a column which has same name in both tables + has compatible datatypes in both the tables.
You can use only = operator
An Inner Join is where 2 tables are joined on the basis of common columns mentioned in the ON clause.
common column : is a column which has compatible datatypes in both the tables but need not have the same name.
You can use any comparison operator, such as =, <=, >=, <, >, <>
Natural Join : A SQL Join clause combines fields from 2 or more tables in a relational database. A natural join is based on all columns in two tables that have the same name and selected rows from the two tables that have equal values in all matched columns.
--- The names and data types of both columns must be the same.
Using Clause : In a natural join, if the tables have columns with the same names but different data types, the join causes an error. To avoid this situation, the join clause can be modified with a USING clause. The USING clause specifies the columns that should be used for the join.
The difference between an inner (equi/default) join and a natural join is that in the natural join the common column is displayed only once, whereas in an inner/equi/default/simple join the common column is displayed twice.
Inner join and natural join are almost the same, but there is a slight difference between them. The difference is that a natural join needs no condition, whereas an inner join expects one. If we do not specify the condition in an inner join, the resulting table is like a cartesian product.
mysql> SELECT * FROM tb1 ;
+----+------+
| id | num |
+----+------+
| 6 | 60 |
| 7 | 70 |
| 8 | 80 |
| 1 | 1 |
| 2 | 2 |
| 3 | 3 |
+----+------+
6 rows in set (0.00 sec)
mysql> SELECT * FROM tb2 ;
+----+------+
| id | num |
+----+------+
| 4 | 40 |
| 5 | 50 |
| 9 | 90 |
| 1 | 1 |
| 2 | 2 |
| 3 | 3 |
+----+------+
6 rows in set (0.00 sec)
INNER JOIN :
mysql> SELECT * FROM tb1 JOIN tb2 ;
+----+------+----+------+
| id | num | id | num |
+----+------+----+------+
| 6 | 60 | 4 | 40 |
| 7 | 70 | 4 | 40 |
| 8 | 80 | 4 | 40 |
| 1 | 1 | 4 | 40 |
| 2 | 2 | 4 | 40 |
| 3 | 3 | 4 | 40 |
| 6 | 60 | 5 | 50 |
| 7 | 70 | 5 | 50 |
| 8 | 80 | 5 | 50 |
.......more......
return 36 rows in set (0.01 sec)
AND NATURAL JOIN :
mysql> SELECT * FROM tb1 NATURAL JOIN tb2 ;
+----+------+
| id | num |
+----+------+
| 1 | 1 |
| 2 | 2 |
| 3 | 3 |
+----+------+
3 rows in set (0.01 sec)
An inner join joins two tables where the column names are the same.
A natural join joins two tables where the column names and data types are the same.

join on three tables? Error in phpMyAdmin

I'm trying to use a join on three tables query I found in another post (post #5 here). When I try to use this in the SQL tab of one of my tables in phpMyAdmin, it gives me an error:
#1066 - Not unique table/alias: 'm'
The exact query I'm trying to use is:
select r.*,m.SkuAbbr, v.VoucherNbr from arrc_RedeemActivity r, arrc_Merchant m, arrc_Voucher v
LEFT OUTER JOIN arrc_Merchant m ON (r.MerchantID = m.MerchantID)
LEFT OUTER JOIN arrc_Voucher v ON (r.VoucherID = v.VoucherID)
I'm not entirely certain it will do what I need it to do or that I'm using the right kind of join (my grasp of SQL is pretty limited at this point), but I was hoping to at least see what it produced.
(What I'm trying to do, if anyone cares to assist, is get all columns from arrc_RedeemActivity, plus SkuAbbr from arrc_Merchant where the merchant IDs match in those two tables, plus VoucherNbr from arrc_Voucher where VoucherIDs match in those two tables.)
Edited to add table samples
Table arrc_RedeemActivity
RedeemID | VoucherID | MerchantID | RedeemAmt
----------------------------------------------
1 | 2 | 3 | 25
2 | 6 | 5 | 50
Table arrc_Merchant
MerchantID | SkuAbbr
---------------------
3 | abc
5 | def
Table arrc_Voucher
VoucherID | VoucherNbr
-----------------------
2 | 12345
6 | 23456
So ideally, what I'd like to get back would be:
RedeemID | VoucherID | MerchantID | RedeemAmt | SkuAbbr | VoucherNbr
-----------------------------------------------------------------------
1 | 2 | 3 | 25 | abc | 12345
2 | 6 | 5 | 50 | def | 23456
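The problem is that each table is referenced twice, once in the comma-separated FROM list and again in a JOIN, and both references use the same alias (m and v). Duplicate table references would be fine on their own, but the duplicated aliases cause the "Not unique table/alias" error.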
If you want to only see rows where there are supporting records in both tables, use:
SELECT r.*,
m.SkuAbbr,
v.VoucherNbr
FROM arrc_RedeemActivity r
JOIN arrc_Merchant m ON m.merchantid = r.merchantid
JOIN arrc_Voucher v ON v.voucherid = r.voucherid
This will show NULL for the m and v references that don't have a match based on the JOIN criteria:
SELECT r.*,
m.SkuAbbr,
v.VoucherNbr
FROM arrc_RedeemActivity r
LEFT JOIN arrc_Merchant m ON m.merchantid = r.merchantid
LEFT JOIN arrc_Voucher v ON v.voucherid = r.voucherid
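Here is a quick way to try both variants against the sample tables from the question, sketched with Python's sqlite3 module (SQLite rather than MySQL is an assumption made for convenience). The third RedeemActivity row, which has no matching merchant or voucher, is not in the question; it is added only to make the INNER versus LEFT difference visible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE arrc_RedeemActivity (RedeemID INTEGER, VoucherID INTEGER,
                                  MerchantID INTEGER, RedeemAmt INTEGER);
CREATE TABLE arrc_Merchant (MerchantID INTEGER, SkuAbbr TEXT);
CREATE TABLE arrc_Voucher (VoucherID INTEGER, VoucherNbr INTEGER);
INSERT INTO arrc_RedeemActivity VALUES (1, 2, 3, 25), (2, 6, 5, 50),
                                       (3, 9, 7, 10);  -- hypothetical orphan row
INSERT INTO arrc_Merchant VALUES (3, 'abc'), (5, 'def');
INSERT INTO arrc_Voucher VALUES (2, 12345), (6, 23456);
""")

select_list = "SELECT r.*, m.SkuAbbr, v.VoucherNbr FROM arrc_RedeemActivity r "

# Inner joins: only rows with a merchant and a voucher.
inner_rows = conn.execute(select_list + """
    JOIN arrc_Merchant m ON m.MerchantID = r.MerchantID
    JOIN arrc_Voucher v ON v.VoucherID = r.VoucherID
    ORDER BY r.RedeemID
""").fetchall()

# Left joins: every redeem row, with NULLs where nothing matches.
left_rows = conn.execute(select_list + """
    LEFT JOIN arrc_Merchant m ON m.MerchantID = r.MerchantID
    LEFT JOIN arrc_Voucher v ON v.VoucherID = r.VoucherID
    ORDER BY r.RedeemID
""").fetchall()

print(inner_rows)
print(left_rows)
```

Each table appears exactly once with its alias, which is what avoids error #1066.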

SQL Server optional joins

I have several tables (7, to be exact) that I cross join with one another. This part gives me some problems:
Table "Actions"
-----------------------------------------
| ID | Package ID | Action Type | Message |
-----------------------------------------
| 40 | 100340 | 0 | OK |
| 41 | 100340 | 12 | Error |
| 42 | 100340 | 2 | OK |
| 43 | 100341 | 4 | OK |
| 44 | 100341 | 0 | Error |
| 45 | 100341 | 12 | OK |
-----------------------------------------
Table "Packages"
----------------------
| ID | Name |
----------------------
| 100340 | Testpackage |
| 100341 | Package xy |
----------------------
I accomplished cross joining them, but when there is no Package with the ID specified in Actions, all actions on that package are completely missing, rather than just leaving Name blank, which is what I'm trying to get.
So, if a reference is missing, just leave the corresponding joined column blank or as an empty string...:
----------------------------------------------------------------------
| Package ID | Name | Action 0 | Action 2 | Action 4 | Action 12 |
----------------------------------------------------------------------
| 100340 | Testpackage | OK | OK | | Error |
| 100341 | Package xy | Error | | OK | OK |
----------------------------------------------------------------------
How is that possible?
Edit
Sorry, I just saw my example was completely wrong. I updated it to show how it should look in the end.
My current query looks something like this (as said above, just an extract as the actual one is about three times as long including even more tables)
SELECT
PackageTable.ID AS PackageID,
PackageTable.Name,
Action0Table.Message AS Action0,
Action2Table.Message AS Action2,
Action4Table.Message AS Action4,
Action12Table.Message AS Action12
FROM
Packages AS PackageTable LEFT OUTER JOIN
Actions AS Action0Table ON PackageTable.ID = Action0Table.PackageID LEFT OUTER JOIN
Actions AS Action2Table ON PackageTable.ID = Action2Table.PackageID LEFT OUTER JOIN
Actions AS Action4Table ON PackageTable.ID = Action4Table.PackageID LEFT OUTER JOIN
Actions AS Action12Table ON PackageTable.ID = Action12Table.PackageID
WHERE
Action0Table.ActionType = 0 AND
Action2Table.ActionType = 2 AND
Action4Table.ActionType = 4 AND
Action12Table.ActionType = 12
So why can't you just do an outer JOIN, such as:
SELECT `Package ID`, Name, `Action Type`, `Action Date`
FROM Actions
LEFT OUTER JOIN Packages
ON Actions.`Package ID` = Packages.`Package ID`
?
Do you have a 'where' clause which is excluding the missing action records? Even if you use an outer join, your where clause will still be applied and the action records will not be included if they don't match.
EDIT: The problem is your ActionType filters. If there is no matching action record, then ActionType is null, which does not match any of your filters. So, you could add 'or ActionType is null' to your where clause. I don't know your business requirement, but this may include more records than you want.
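An alternative to adding 'or ActionType is null', not spelled out above, is to move each ActionType filter from the WHERE clause into the ON clause of its own join, so the filter decides what gets joined instead of which rows survive. The sketch below uses Python's sqlite3 module (an assumption; the question is about SQL Server) with the sample data from the question and column names taken from the asker's query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Packages (ID INTEGER, Name TEXT);
CREATE TABLE Actions (ID INTEGER, PackageID INTEGER, ActionType INTEGER, Message TEXT);
INSERT INTO Packages VALUES (100340, 'Testpackage'), (100341, 'Package xy');
INSERT INTO Actions VALUES
    (40, 100340, 0, 'OK'), (41, 100340, 12, 'Error'), (42, 100340, 2, 'OK'),
    (43, 100341, 4, 'OK'), (44, 100341, 0, 'Error'), (45, 100341, 12, 'OK');
""")

columns = "SELECT p.ID, p.Name, a0.Message, a2.Message, a4.Message, a12.Message "

# Filters in WHERE: a package lacking any one action type is dropped entirely.
where_rows = conn.execute(columns + """
    FROM Packages p
    LEFT JOIN Actions a0  ON p.ID = a0.PackageID
    LEFT JOIN Actions a2  ON p.ID = a2.PackageID
    LEFT JOIN Actions a4  ON p.ID = a4.PackageID
    LEFT JOIN Actions a12 ON p.ID = a12.PackageID
    WHERE a0.ActionType = 0 AND a2.ActionType = 2
      AND a4.ActionType = 4 AND a12.ActionType = 12
    ORDER BY p.ID
""").fetchall()

# Filters in ON: a missing action type just leaves its column NULL.
on_rows = conn.execute(columns + """
    FROM Packages p
    LEFT JOIN Actions a0  ON p.ID = a0.PackageID  AND a0.ActionType = 0
    LEFT JOIN Actions a2  ON p.ID = a2.PackageID  AND a2.ActionType = 2
    LEFT JOIN Actions a4  ON p.ID = a4.PackageID  AND a4.ActionType = 4
    LEFT JOIN Actions a12 ON p.ID = a12.PackageID AND a12.ActionType = 12
    ORDER BY p.ID
""").fetchall()

print(where_rows)
print(on_rows)
```

With the sample data the WHERE version returns nothing, because neither package has all four action types, while the ON version returns one row per package with gaps left as NULL.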
As the others have said, I have the same answer.
Just showing you the different results:
-- table variables in T-SQL are named with @ (a # prefix is for temp tables)
declare @Actions table(id int, packageid int, actiontype int, dt date)
declare @Packages table(id int, name varchar(50))
insert into @Actions
select 40,100340,0,'2009/01/01 3:00pm' union all
select 41,100340,12,'2009/01/01 5:00pm' union all
select 42,100340,2,'2009/01/01 5:30pm' union all
select 43,100341, 4,'2009/01/02 8:00am'
insert into @Packages
select 100340,'Testpackage'
Left outer join query
select a.packageid,p.name,a.actiontype,a.dt
from @Actions a
left join @Packages p
on a.packageid = p.id
Full join (in this case you will get the same result)
select a.packageid,p.name,a.actiontype,a.dt
from @Actions a
full join @Packages p
on a.packageid = p.id
Output:
packageid name actiontype dt
100340 Testpackage 0 2009-01-01
100340 Testpackage 12 2009-01-01
100340 Testpackage 2 2009-01-01
100341 NULL 4 2009-01-02
Inner join query (which you don't want)
select a.packageid,p.name,a.actiontype,a.dt
from @Actions a
join @Packages p
on a.packageid = p.id
Output:
packageid name actiontype dt
100340 Testpackage 0 2009-01-01
100340 Testpackage 12 2009-01-01
100340 Testpackage 2 2009-01-01
Output:
packageid name actiontype dt
100340 Testpackage 0 2009-01-01
100340 Testpackage 12 2009-01-01
100340 Testpackage 2 2009-01-01
You are left (outer) joining on the packages table. This means that if the record is not in the packages table (the table on the left side of the join condition) then it is not included in the final result.
You could right (outer) join on the action table in which case you will get all of the action records whether or not they have a match in the package table.
You could do a full (outer) join, in other words, give me all of the records in both tables.
And finally you can do an inner join, or the records which are present in both tables.
You may find it helpful to picture a Venn diagram here with the left table as the left circle and the right table as the right circle. The inner join thus represents the overlapping region of the diagram.
So, to answer your question, you are going to need to tweak your joins to be full outer joins or right joins, depending on whether you want to see packages without actions. You are also going to need to adjust your where clause to include null actions, as has been suggested by many other posters, though you may want to add an additional clause to that where expression which excludes records where all of the actions are null. Plus, as your example is written you are only going to see packages where the actions on that package are 0, 2, 4 and 12, which sounds incorrect given the information you've provided.
SELECT
PackageTable.ID AS PackageID,
PackageTable.Name,
Action0Table.Message AS Action0,
Action2Table.Message AS Action2,
Action4Table.Message AS Action4,
Action12Table.Message AS Action12
FROM
Packages AS PackageTable
RIGHT OUTER JOIN Actions AS Action0Table ON
PackageTable.ID = Action0Table.PackageID
RIGHT OUTER JOIN Actions AS Action2Table ON
PackageTable.ID = Action2Table.PackageID
RIGHT OUTER JOIN Actions AS Action4Table ON
PackageTable.ID = Action4Table.PackageID
RIGHT OUTER JOIN Actions AS Action12Table ON
PackageTable.ID = Action12Table.PackageID
WHERE
(Action0Table.ActionType = 0 OR Action0Table.ActionType IS NULL) AND
(Action2Table.ActionType = 2 OR Action2Table.ActionType IS NULL) AND
(Action4Table.ActionType = 4 OR Action4Table.ActionType IS NULL) AND
(Action12Table.ActionType = 12 OR Action12Table.ActionType IS NULL) AND
NOT (Action0Table.ActionType IS NULL AND Action2Table.ActionType IS NULL AND
Action4Table.ActionType IS NULL AND Action12Table.ActionType IS NULL)
You will need to remove that final NOT clause if you want to see packages without any of those actions. Also, depending on the quality of the data you may begin receiving duplicate records once you start including null values; in which case your problem has become a lot harder to solve and you will need to get back to us.
Read up on 'outer joins'.
Instead of INNER JOIN use LEFT JOIN. That will do it.

SQL INNER JOIN syntax

The two bits of SQL below get the same result:
SELECT c.name, o.product
FROM customer c, order o
WHERE c.id = o.cust_id
AND o.value = 150
SELECT c.name, o.product
FROM customer c
INNER JOIN order o on c.id = o.cust_id
WHERE o.value = 150
I've seen both styles used as standard at different companies. From what I've seen, the 2nd one is what most people recommend online. Is there any real reason for this other than style? Does using an Inner Join sometimes have better performance?
I've noticed Ingres and Oracle developers tend to use the first style, whereas Microsoft SQL Server users have tended to use the second, but that might just be a coincidence.
Thanks for any insight, I've wondered about this for a while.
Edit: I've changed the title from 'SQL Inner Join versus Cartesian Product' as I was using the incorrect terminology. Thanks for all the responses so far.
Both queries are inner joins and are equivalent. The first is the older method of doing things, whereas the use of the JOIN syntax only became common after the introduction of the SQL-92 standard (I believe it's in the older definitions, just wasn't particularly widely used before then).
The use of the JOIN syntax is strongly preferred as it separates the join logic from the filtering logic in the WHERE clause. Whilst the JOIN syntax is really syntactic sugar for inner joins, its strength lies with outer joins, where the old *= and =* syntax can produce situations where it is impossible to unambiguously describe the join and the interpretation is implementation-dependent. The [LEFT | RIGHT] JOIN syntax avoids these pitfalls, and hence for consistency the use of the JOIN clause is preferable in all circumstances.
Note that neither of these two examples is a Cartesian product. For that you'd use either
SELECT c.name, o.product
FROM customer c, order o
WHERE o.value = 150
or
SELECT c.name, o.product
FROM customer c CROSS JOIN order o
WHERE o.value = 150
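For anyone who wants to check this, here is a small sketch with Python's sqlite3 module. The table is called orders rather than order (ORDER is a reserved word), and the rows are made up; both are assumptions for the sake of a runnable example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (id INTEGER, name TEXT);
CREATE TABLE orders (cust_id INTEGER, product TEXT, value INTEGER);
INSERT INTO customer VALUES (1, 'Alice'), (2, 'Bob');
INSERT INTO orders VALUES (1, 'Widget', 150), (2, 'Gadget', 150), (2, 'Gizmo', 99);
""")

# Old style: join condition in the WHERE clause.
implicit_rows = conn.execute("""
    SELECT c.name, o.product
    FROM customer c, orders o
    WHERE c.id = o.cust_id AND o.value = 150
    ORDER BY c.name, o.product
""").fetchall()

# JOIN ... ON style: same inner join, condition in the ON clause.
explicit_rows = conn.execute("""
    SELECT c.name, o.product
    FROM customer c
    INNER JOIN orders o ON c.id = o.cust_id
    WHERE o.value = 150
    ORDER BY c.name, o.product
""").fetchall()

# No join condition at all: every customer paired with every matching order.
cross_rows = conn.execute("""
    SELECT c.name, o.product
    FROM customer c CROSS JOIN orders o
    WHERE o.value = 150
""").fetchall()

print(implicit_rows)
print(explicit_rows)
print(len(cross_rows))
```

The first two queries return identical rows, while the cross join returns each of the two customers paired with both 150-value orders.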
To answer part of your question, I think early bugs in the JOIN ... ON syntax in Oracle discouraged Oracle users away from that syntax. I don't think there are any particular problems now.
They are equivalent and should be parsed into the same internal representation for optimization.
Actually these examples are equivalent and neither is a cartesian product. A cartesian product is returned when you join two tables without specifying a join condition, such as in
select *
from t1,t2
There is a good discussion of this on Wikipedia.
Oracle was late in supporting the JOIN ... ON (ANSI) syntax (not until Oracle 9), that's why Oracle developers often don't use it.
Personally, I prefer using ANSI syntax when it is logically clear that one table is driving the query and the others are lookup tables. When tables are "equal", I tend to use the cartesian syntax.
The performance should not differ at all.
The JOIN... ON... syntax is a more recent addition to ANSI and ISO specs for SQL. The JOIN... ON... syntax is generally preferred because it 1) moves the join criteria out of the WHERE clause making the WHERE clause just for filtering and 2) makes it more obvious if you are creating a dreaded Cartesian product since each JOIN must be accompanied by at least one ON clause. If all the join criteria are just ANDed in the WHERE clause, it's not as obvious when one or more is missing.
Both queries are performing an inner join, just different syntax.
TL;DR
An INNER JOIN statement can be rewritten as a CROSS JOIN with a WHERE clause matching the same condition you used in the ON clause of the INNER JOIN query.
Table relationship
Considering we have the following post and post_comment tables:
The post has the following records:
| id | title |
|----|-----------|
| 1 | Java |
| 2 | Hibernate |
| 3 | JPA |
and the post_comment has the following three rows:
| id | review | post_id |
|----|-----------|---------|
| 1 | Good | 1 |
| 2 | Excellent | 1 |
| 3 | Awesome | 2 |
SQL INNER JOIN
The SQL JOIN clause allows you to associate rows that belong to different tables. For instance, a CROSS JOIN will create a Cartesian Product containing all possible combinations of rows between the two joining tables.
While the CROSS JOIN is useful in certain scenarios, most of the time, you want to join tables based on a specific condition. And, that's where INNER JOIN comes into play.
The SQL INNER JOIN allows us to filter the Cartesian Product of joining two tables based on a condition that is specified via the ON clause.
SQL INNER JOIN - ON "always true" condition
If you provide an "always true" condition, the INNER JOIN will not filter the joined records, and the result set will contain the Cartesian Product of the two joining tables.
For instance, if we execute the following SQL INNER JOIN query:
SELECT
p.id AS "p.id",
pc.id AS "pc.id"
FROM post p
INNER JOIN post_comment pc ON 1 = 1
We will get all combinations of post and post_comment records:
| p.id | pc.id |
|---------|------------|
| 1 | 1 |
| 1 | 2 |
| 1 | 3 |
| 2 | 1 |
| 2 | 2 |
| 2 | 3 |
| 3 | 1 |
| 3 | 2 |
| 3 | 3 |
So, if the ON clause condition is "always true", the INNER JOIN is simply equivalent to a CROSS JOIN query:
SELECT
p.id AS "p.id",
pc.id AS "pc.id"
FROM post p
CROSS JOIN post_comment pc
WHERE 1 = 1
ORDER BY p.id, pc.id
SQL INNER JOIN - ON "always false" condition
On the other hand, if the ON clause condition is "always false", then all the joined records are going to be filtered out and the result set will be empty.
So, if we execute the following SQL INNER JOIN query:
SELECT
p.id AS "p.id",
pc.id AS "pc.id"
FROM post p
INNER JOIN post_comment pc ON 1 = 0
ORDER BY p.id, pc.id
We won't get any result back:
| p.id | pc.id |
|---------|------------|
That's because the query above is equivalent to the following CROSS JOIN query:
SELECT
p.id AS "p.id",
pc.id AS "pc.id"
FROM post p
CROSS JOIN post_comment pc
WHERE 1 = 0
ORDER BY p.id, pc.id
SQL INNER JOIN - ON clause using the Foreign Key and Primary Key columns
The most common ON clause condition is the one that matches the Foreign Key column in the child table with the Primary Key column in the parent table, as illustrated by the following query:
SELECT
p.id AS "p.id",
pc.post_id AS "pc.post_id",
pc.id AS "pc.id",
p.title AS "p.title",
pc.review AS "pc.review"
FROM post p
INNER JOIN post_comment pc ON pc.post_id = p.id
ORDER BY p.id, pc.id
When executing the above SQL INNER JOIN query, we get the following result set:
| p.id | pc.post_id | pc.id | p.title | pc.review |
|---------|------------|------------|------------|-----------|
| 1 | 1 | 1 | Java | Good |
| 1 | 1 | 2 | Java | Excellent |
| 2 | 2 | 3 | Hibernate | Awesome |
So, only the records that match the ON clause condition are included in the query result set. In our case, the result set contains all the post along with their post_comment records. The post rows that have no associated post_comment are excluded since they can not satisfy the ON Clause condition.
Again, the above SQL INNER JOIN query is equivalent to the following CROSS JOIN query:
SELECT
p.id AS "p.id",
pc.post_id AS "pc.post_id",
pc.id AS "pc.id",
p.title AS "p.title",
pc.review AS "pc.review"
FROM post p, post_comment pc
WHERE pc.post_id = p.id
In the Cartesian product below, only the rows where pc.post_id equals p.id (the first, second, and sixth rows) satisfy the WHERE clause, and only these records are going to be included in the result set. That's the best way to visualize how the INNER JOIN clause works.
| p.id | pc.post_id | pc.id | p.title | pc.review |
|------|------------|-------|-----------|-----------|
| 1 | 1 | 1 | Java | Good |
| 1 | 1 | 2 | Java | Excellent |
| 1 | 2 | 3 | Java | Awesome |
| 2 | 1 | 1 | Hibernate | Good |
| 2 | 1 | 2 | Hibernate | Excellent |
| 2 | 2 | 3 | Hibernate | Awesome |
| 3 | 1 | 1 | JPA | Good |
| 3 | 1 | 2 | JPA | Excellent |
| 3 | 2 | 3 | JPA | Awesome |
Note that this only applies to INNER JOIN, not to OUTER JOIN.
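The equivalence, and the way it stops holding for an outer join, can be checked with a short sketch using Python's sqlite3 module and the post and post_comment rows shown earlier. Running it on SQLite is an assumption; the tables and data are the ones from this answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE post (id INTEGER, title TEXT);
CREATE TABLE post_comment (id INTEGER, review TEXT, post_id INTEGER);
INSERT INTO post VALUES (1, 'Java'), (2, 'Hibernate'), (3, 'JPA');
INSERT INTO post_comment VALUES (1, 'Good', 1), (2, 'Excellent', 1), (3, 'Awesome', 2);
""")

inner_rows = conn.execute("""
    SELECT p.id, pc.id
    FROM post p
    INNER JOIN post_comment pc ON pc.post_id = p.id
    ORDER BY p.id, pc.id
""").fetchall()

# Cartesian product filtered by the same condition in WHERE.
cross_where_rows = conn.execute("""
    SELECT p.id, pc.id
    FROM post p
    CROSS JOIN post_comment pc
    WHERE pc.post_id = p.id
    ORDER BY p.id, pc.id
""").fetchall()

# LEFT JOIN also keeps the post without comments, which no WHERE filter
# over the Cartesian product can produce.
left_rows = conn.execute("""
    SELECT p.id, pc.id
    FROM post p
    LEFT JOIN post_comment pc ON pc.post_id = p.id
    ORDER BY p.id, pc.id
""").fetchall()

print(inner_rows)
print(cross_where_rows)
print(left_rows)
```

The inner join and the filtered cross join agree, while the left join adds a row for the JPA post with a NULL comment id.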