Differences between forms of LEFT JOIN - sql

What is the difference between this query
SELECT
A.AnyField
FROM
A
LEFT JOIN B ON B.AId = A.Id
LEFT JOIN C ON B.CId = C.Id
and this other query?
SELECT
A.AnyField
FROM
A
LEFT JOIN
(
B
JOIN C ON B.CId = C.Id
) ON B.AId = A.Id

Original answer:
They are not the same.
For example, a left join b left join c will return a rows, plus b rows even if there are no matching c rows.
a left join (b join c) will never return b rows if there are no matching c rows.
Added later:
SQL>create table a (id int);
SQL>create table b (id int);
SQL>create table c (id int);
SQL>insert into a values (1);
SQL>insert into a values (2);
SQL>insert into b values (1);
SQL>insert into b values (1);
SQL>insert into c values (2);
SQL>insert into c values (2);
SQL>insert into c values (2);
SQL>insert into c values (2);
SQL>select a.id from a left join b on a.id = b.id left join c on b.id = c.id;
id
===========
1
1
2
3 rows found
SQL>select a.id from a left join (b join c on b.id = c.id) on a.id = b.id;
id
===========
1
2
2 rows found

The first query takes ALL records from table a, plus only those records from table b where b.AId is equal to a.Id. It then takes the records from table c whose Id matches the CId of those b records.
The second query first JOINs b and c. Because that is an INNER JOIN, records only make it into its result set when b.CId and c.Id are the same.
The result of b INNER JOIN c is then LEFT JOINed to table a. That is, the DB takes all records from a, plus only those records from the result of b INNER JOIN c where b.AId is equal to a.Id.
The difference is that you may end up with more data from b in your first query since the DB isn't dropping records from your result set just because b.cid <> c.id.
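The difference described above can be reproduced with a minimal sketch using Python's built-in sqlite3 module. The table contents are made-up sample data (an assumption, not taken from the question): one A row has two B rows, and only one of those B rows points at an existing C row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (Id INTEGER, AnyField TEXT);
    CREATE TABLE B (Id INTEGER, AId INTEGER, CId INTEGER);
    CREATE TABLE C (Id INTEGER);
    INSERT INTO A VALUES (1, 'a1'), (2, 'a2');
    -- Both B rows belong to A.Id = 1; only the first references an existing C row.
    INSERT INTO B VALUES (10, 1, 100), (11, 1, 999);
    INSERT INTO C VALUES (100);
""")

# Form 1: two chained LEFT JOINs. B rows survive even without a C match.
chained = conn.execute("""
    SELECT A.AnyField, B.Id, C.Id
    FROM A
    LEFT JOIN B ON B.AId = A.Id
    LEFT JOIN C ON B.CId = C.Id
    ORDER BY A.Id, B.Id
""").fetchall()

# Form 2: LEFT JOIN to the result of (B INNER JOIN C).
# B rows without a C match are dropped before A is joined.
nested = conn.execute("""
    SELECT A.AnyField, B.Id, C.Id
    FROM A
    LEFT JOIN (B JOIN C ON B.CId = C.Id) ON B.AId = A.Id
    ORDER BY A.Id, B.Id
""").fetchall()

print(chained)  # [('a1', 10, 100), ('a1', 11, None), ('a2', None, None)]
print(nested)   # [('a1', 10, 100), ('a2', None, None)]
```

Every A row appears in both results; the chained form additionally keeps B row 11, which has no partner in C.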
For a visual, the following Venn diagram shows which records are available

SELECT
A.AnyField
FROM
A
LEFT JOIN B ON B.AId = A.Id
LEFT JOIN C ON B.CId = C.Id
In this query you are LEFT JOINing C with B which will give you all records possible with B whether or not there is a match to any records in C.
SELECT
A.AnyField
FROM
A
LEFT JOIN
(
B
JOIN C ON B.CId = C.Id
) ON B.AId = A.Id
In this query you are INNER JOINing C with B which will result in matching B and C records.
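Both queries return the same set of distinct A values, because you are only selecting columns from A and the LEFT JOIN preserves every A row, so you will not see which records had matches in B and C and which did not. The number of rows can still differ: in the first query an A row is repeated once for every matching B row, even when that B row has no match in C.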

Related

Left join inside left join

I have problem getting values from tables.
I need something like this
A.Id a1
B.Id b1
C.Id c1
B.Id b2
C.Id c2
C.Id c3
C.Id c4
Table A and B are joined together and also table B and C.
Table A can have one/zero or more values from table B. Same situation is for values from table C.
I need to perform left join on table A over table B and inside that left join on table B over table C.
I tried with left join from table A and B, but don't know how to perform left join inside that left join.
Is that possible? What would syntax for that look like?
edit:
Data would look like this
ZZN1 P1 NULL
ZZN1 P2 NAB1
ZZN2 P3 NAB2
ZZN2 P3 NAB3
No need to nest the left joins; you can simply flatten them and let your RDBMS handle the logic.
Sample schema:
a(id)
b(id, aid) -- aid is a foreign key to a(id)
c(id, bid) -- bid is a foreign key to b(id)
Query:
select a.id, b.id, c.id
from a
left join b on b.aid = a.id
left join c on c.bid = b.id
If the first left join finds no match, then the second one cannot match either, since the joining column b.id will be null. On the other hand, if the first left join finds a match, then the second one may or may not, depending on whether a row with the relevant bid exists in c.
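The flattened query can be tried out with a small sketch in Python's sqlite3 module. The rows mirror the sample data from the question; the extra row ZZN3 (an a row with no b rows) is an assumption added to show the NULL propagating through both joins.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id TEXT);
    CREATE TABLE b (id TEXT, aid TEXT);  -- aid references a(id)
    CREATE TABLE c (id TEXT, bid TEXT);  -- bid references b(id)
    INSERT INTO a VALUES ('ZZN1'), ('ZZN2'), ('ZZN3');
    INSERT INTO b VALUES ('P1', 'ZZN1'), ('P2', 'ZZN1'), ('P3', 'ZZN2');
    INSERT INTO c VALUES ('NAB1', 'P2'), ('NAB2', 'P3'), ('NAB3', 'P3');
""")

rows = conn.execute("""
    SELECT a.id, b.id, c.id
    FROM a
    LEFT JOIN b ON b.aid = a.id
    LEFT JOIN c ON c.bid = b.id
    ORDER BY a.id, b.id, c.id
""").fetchall()

for row in rows:
    print(row)
# ('ZZN1', 'P1', None)
# ('ZZN1', 'P2', 'NAB1')
# ('ZZN2', 'P3', 'NAB2')
# ('ZZN2', 'P3', 'NAB3')
# ('ZZN3', None, None)
```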
SELECT A.Name, B.Name , C.Name
FROM A
LEFT JOIN B ON A.id = B.id
LEFT JOIN C ON B.id = C.id

How do JOINs distribute over OUTER JOINs?

Are there any cases where these two are not equivalent?
A OUTER JOIN (B JOIN C)
A OUTER JOIN C OUTER JOIN B
Yes:
In your first example (A OUTER JOIN (B JOIN C)), if either B or C does not have a matching record, both B and C are omitted.
In your second example (A OUTER JOIN C OUTER JOIN B), C can be returned even if B does not have a matching record.
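A minimal sketch of that difference, using Python's sqlite3 module with LEFT OUTER JOIN as the outer join. The join columns (c.aid, b.cid) and the sample rows are assumptions, since the question does not give a schema: C matches A, but B has no row matching C.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER);
    CREATE TABLE c (id INTEGER, aid INTEGER);
    CREATE TABLE b (id INTEGER, cid INTEGER);
    INSERT INTO a VALUES (1);
    INSERT INTO c VALUES (10, 1);   -- matches a.id = 1
    INSERT INTO b VALUES (20, 99);  -- matches no row in c
""")

# A OUTER JOIN (C JOIN B): the inner join is empty, so C is dropped as well.
nested = conn.execute("""
    SELECT a.id, c.id, b.id
    FROM a
    LEFT JOIN (c JOIN b ON b.cid = c.id) ON c.aid = a.id
""").fetchall()

# A OUTER JOIN C OUTER JOIN B: C is kept even though B has no match.
chained = conn.execute("""
    SELECT a.id, c.id, b.id
    FROM a
    LEFT JOIN c ON c.aid = a.id
    LEFT JOIN b ON b.cid = c.id
""").fetchall()

print(nested)   # [(1, None, None)]
print(chained)  # [(1, 10, None)]
```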
Still same as my comment. If B join C produces an empty result set, then A outer join "empty result set" is the same as just A.
A outer join B outer join C is something different (at least if one of B and C is not empty).
Since A seems to connect to C, it would be clearer to write the first option as:
A OUTER JOIN (C JOIN B)
The key difference between your two options is whether data from C is returned when there is a match between A and C. Just looking at this aspect the two options could be viewed as the sets:
intersect(A,intersect(C,B))
intersect(A,C)
Clearly, the two are different since the first form can eliminate rows from C before it is intersected with A.
Which fields you join on can make a difference, as tables are not guaranteed to have only one possible field to join to another table on. In this case, does table b have a field to join to either table c or table a? That makes a difference. Which fields you want returned makes a difference to the result set and to whether two queries return the same results. The state of the data makes a difference, as some queries will appear to be equivalent until the data changes, so understanding that these are not equivalent queries helps you avoid these mistakes. Whether you use a full, left or right outer join makes a difference as well. And finally, the where clauses you add can make a difference in whether the queries appear to be equivalent.
Check out these examples using temp tables (SQL Server syntax)
create table #a (aid int, sometext varchar(50))
create table #b (bid int, sometext2 varchar(50), cid int, aid int)
create table #c (cid int, sometext3 varchar(50), aid int)
insert into #a
values(1, 'test') , (2, 'test2'), (3, 'test3')
insert into #b
values(1, 'test', 1, 2) , (2, 'test2', 2, 1), (3, 'test3', 2, 2)
insert into #c
values(1, 'test', 1) , (2, 'test2', 2), (3, 'test3', 1)
select *
from #a a
left outer join #c c on a.aid = c.aid
left outer join #b b on a.aid = b.aid
select *
from #a a
left outer join #c c on a.aid = c.aid
left outer join #b b on c.cid = b.cid
select *
from #a a
left outer join #b b
join #c c on b.cid = c.cid
on a.aid = b.aid
select *
from #a a
right outer join #c c on a.aid = c.aid
right outer join #b b on a.aid = b.aid
select *
from #a a
right outer join #c c on a.aid = c.aid
right outer join #b b on c.cid = b.cid
select *
from #a a
right outer join #b b
join #c c on b.cid = c.cid
on a.aid = b.aid
select *
from #a a
full outer join #c c on a.aid = c.aid
full outer join #b b on a.aid = b.aid
select *
from #a a
full outer join #c c on a.aid = c.aid
full outer join #b b on c.cid = b.cid
select *
from #a a
full outer join #b b
join #c c on b.cid = c.cid
on a.aid = b.aid

Stuck on multiple table filtering query

Ok so I will try to simplify the problem that I have.
I have 4 tables:
TableA:
OneID
TableB:
OneID (FK to TableA)
TwoID (FK to TableC)
TableC:
TwoID
ThreeID (FK to TableD)
TableD:
ThreeID
I need a query to retrieve data from all 4 of these tables.
The query criteria is:
want to inner join tables A, B, C
want to join above result with table D with the following conditions:
if a record is in Table D but not in Table C, then it must always be present in the results
otherwise if a record is in Table D and Table C, then it should only be present if it is in the result of the A,B,C join
The scenario you have described is not really possible (or at least it is not really logical):
if a record is in Table D but not in Table C, then it must always be present in the results
The only way a record could be "in Table D and not in Table C" is if the foreign key is null in table D. With no link from table D to tables A or B, there is no other way you could define a record as being present in D and not in C:
otherwise if a record is in Table D and Table C, then it should only be present if it is in the result of the A,B,C join
Again, the only way this could happen is with NULLABLE foreign keys. Regardless I think any of the below will get you the results you require:
SELECT A.OneID, B.TwoID, c.ThreeID, D.FourID
FROM D
LEFT JOIN (C
INNER JOIN B
ON B.TwoID = C.TwoID
INNER JOIN A
ON A.OneID = B.OneID)
ON C.ThreeID = D.ThreeID;
Or
SELECT A.OneID, B.TwoID, c.ThreeID, D.FourID
FROM A
INNER JOIN B
ON B.OneID = A.OneID
INNER JOIN C
ON C.TwoID = B.TwoID
RIGHT JOIN D
ON D.ThreeID = C.ThreeID
Or
SELECT A.OneID, B.TwoID, c.ThreeID, D.FourID
FROM A
INNER JOIN B
ON B.OneID = A.OneID
INNER JOIN C
ON C.TwoID = B.TwoID
INNER JOIN D
ON D.ThreeID = C.ThreeID
UNION ALL
SELECT NULL, NULL, NULL, FourID
FROM D
WHERE ThreeID IS NULL;
Examples on SQL Fiddle
The first two have the same execution plan, so it is just a matter of preference. I personally dislike using RIGHT JOIN because it makes queries feel like they are in the wrong order, i.e. bottom to top, but this is purely my preference. The last query may perform better depending on the cardinality of your data and any indexes you have.
EDIT
With your revised criteria I think the easiest way to implement this is with a UNION ALL:
SELECT A.OneID, B.TwoID, c.ThreeID, d3 = D.ThreeID
FROM A
INNER JOIN B
ON B.OneID = A.OneID
INNER JOIN C
ON C.TwoID = B.TwoID
INNER JOIN D
ON D.ThreeID = C.ThreeID
UNION ALL
SELECT NULL, NULL, NULL, ThreeID
FROM D
WHERE NOT EXISTS (SELECT 1 FROM C WHERE C.ThreeID = D.ThreeID);
Example on SQL Fiddle
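The UNION ALL approach can be checked with a small sketch in Python's sqlite3 module. The sample rows are assumptions chosen to hit all three cases, and the T-SQL alias form `d3 = D.ThreeID` is written as `D.ThreeID AS d3`, because SQLite would read the former as a comparison.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (OneID INTEGER);
    CREATE TABLE B (OneID INTEGER, TwoID INTEGER);
    CREATE TABLE C (TwoID INTEGER, ThreeID INTEGER);
    CREATE TABLE D (ThreeID INTEGER);
    INSERT INTO A VALUES (1);
    INSERT INTO B VALUES (1, 1);
    INSERT INTO C VALUES (1, 1), (2, 2);  -- C row (2, 2) has no B row
    INSERT INTO D VALUES (1), (2), (3);   -- D row 3 has no C row
""")

rows = conn.execute("""
    SELECT A.OneID, B.TwoID, C.ThreeID, D.ThreeID AS d3
    FROM A
    INNER JOIN B ON B.OneID = A.OneID
    INNER JOIN C ON C.TwoID = B.TwoID
    INNER JOIN D ON D.ThreeID = C.ThreeID
    UNION ALL
    SELECT NULL, NULL, NULL, ThreeID
    FROM D
    WHERE NOT EXISTS (SELECT 1 FROM C WHERE C.ThreeID = D.ThreeID)
    ORDER BY 4
""").fetchall()

print(rows)  # [(1, 1, 1, 1), (None, None, None, 3)]
```

D row 1 appears through the A, B, C join, D row 3 appears because it is absent from C, and D row 2 is excluded because it is in C but not in the result of the A, B, C join.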
I am not 100% sure I understood you correctly but I will try to help anyway. It seems that you need to do FULL OUTER JOIN on table D:
SELECT
*
FROM
TableA AS A INNER JOIN
TableB AS B ON B.A_Id = A.Id INNER JOIN
TableC AS C ON C.B_Id = B.Id FULL OUTER JOIN
TableD AS D ON D.C_Id = C.Id
If I have misunderstood your requirements and you need more complicated criteria, you could just do FULL OUTER JOIN on all the tables and put extra conditions in WHERE part:
SELECT
*
FROM
TableA AS A FULL OUTER JOIN
TableB AS B ON B.A_Id = A.Id FULL OUTER JOIN
TableC AS C ON C.B_Id = B.Id FULL OUTER JOIN
TableD AS D ON D.C_Id = C.Id
WHERE
--if a record is in Table D but not in Table C, then it must always be present in the results
(D.Id IS NOT NULL AND C.Id IS NULL) OR
(
--otherwise if a record is in Table D and Table C, then it should only be present if it is in the result of the A,B,C join
(D.Id IS NOT NULL AND C.Id IS NOT NULL) AND
--want to inner join tables A, B, C
(A.Id IS NOT NULL AND B.Id IS NOT NULL AND C.Id IS NOT NULL)
)

SQL multiple table join

I have three tables A, B, and C:
A
---------
a_pk | id
B
----------------------
b_pk | id | link | foo
C
----------------------
c_pk | id | link | bar
All records in B have a matching record in A; all records in C have a matching record in A, but records in B and C do not necessarily have to match each other. I want to get the set of results where A has a match in either B or C. Individually, the queries would be:
SELECT A.id, B.foo FROM A INNER JOIN B ON A.id = B.id
SELECT A.id, C.bar FROM A INNER JOIN C ON A.id = C.id
SELECT B.foo, C.bar FROM B FULL JOIN C ON B.id = C.id AND B.link = C.link
What would I need to fill in to make this work?
SELECT A.id, B.foo, C.bar FROM <join A, B, C>
I'm using Oracle if it makes a difference, and I would prefer to avoid using a subquery if possible.
[edit - for clarification]
I want only the records from A that have a match in either B or C.
Sounds like you need a left join. (I'm not very familiar with Oracle but I assume this should work.)
SELECT A.id, B.foo, C.bar
FROM A
LEFT OUTER JOIN B on A.id = B.id
LEFT OUTER JOIN C on A.id = C.id
WHERE B.id IS NOT NULL OR C.id IS NOT NULL
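A quick way to see what the filter does is a sketch in Python's sqlite3 module rather than Oracle. The tables are trimmed to the columns the query uses and the rows are made-up sample data (both assumptions): id 1 is only in B, id 2 is in both, id 3 is only in C, and id 4 is in neither.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (id INTEGER);
    CREATE TABLE B (id INTEGER, foo TEXT);
    CREATE TABLE C (id INTEGER, bar TEXT);
    INSERT INTO A VALUES (1), (2), (3), (4);
    INSERT INTO B VALUES (1, 'f1'), (2, 'f2');
    INSERT INTO C VALUES (2, 'b2'), (3, 'b3');
""")

rows = conn.execute("""
    SELECT A.id, B.foo, C.bar
    FROM A
    LEFT OUTER JOIN B ON A.id = B.id
    LEFT OUTER JOIN C ON A.id = C.id
    WHERE B.id IS NOT NULL OR C.id IS NOT NULL
    ORDER BY A.id
""").fetchall()

print(rows)  # [(1, 'f1', None), (2, 'f2', 'b2'), (3, None, 'b3')]
```

The row with id 4 is removed by the WHERE clause because it matched neither B nor C.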

How to update with inner join in Oracle

Could someone please verify whether inner join is valid with an UPDATE statement in PL/SQL?
e.g.
Update table t
set t.value='value'
from tableb b inner join
on t.id=b.id
inner join tablec c on
c.id=b.id
inner join tabled d on
d.id=c.id
where d.key=1
This syntax won't work in Oracle SQL.
In Oracle you can update a join if the tables are "key-preserved", i.e.:
-- table_a and table_b are placeholder table names
UPDATE (SELECT a.val_a, b.val_b
FROM table_a a
JOIN table_b b ON a.b_pk = b.b_pk)
SET val_a = val_b
Assuming that b_pk is the primary key of b, here the join is updateable because for each row of A there is at most one row from B, therefore the update is deterministic.
In your case, since the updated value doesn't depend on another table, you could use a simple update with an EXISTS condition, something like this:
UPDATE mytable t
SET t.VALUE = 'value'
WHERE EXISTS
(SELECT NULL
FROM tableb b
INNER JOIN tablec c ON c.id = b.id
INNER JOIN tabled d ON d.id = c.id
WHERE t.id = b.id
AND d.key = 1)
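The EXISTS form is not Oracle-specific, so its behavior can be sketched with Python's sqlite3 module. The sample rows are assumptions; the table alias and the qualified `t.VALUE` in the SET clause are dropped and `key` is quoted, since this is SQLite rather than Oracle.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE mytable (id INTEGER, value TEXT);
    CREATE TABLE tableb (id INTEGER);
    CREATE TABLE tablec (id INTEGER);
    CREATE TABLE tabled (id INTEGER, "key" INTEGER);
    INSERT INTO mytable VALUES (1, 'old'), (2, 'old');
    INSERT INTO tableb VALUES (1), (2);
    INSERT INTO tablec VALUES (1), (2);
    INSERT INTO tabled VALUES (1, 1), (2, 0);  -- only id 1 has key = 1
""")

# Update only the rows for which the joined chain b -> c -> d has key = 1.
conn.execute("""
    UPDATE mytable
    SET value = 'value'
    WHERE EXISTS (
        SELECT NULL
        FROM tableb b
        INNER JOIN tablec c ON c.id = b.id
        INNER JOIN tabled d ON d.id = c.id
        WHERE mytable.id = b.id
          AND d."key" = 1
    )
""")

rows = conn.execute("SELECT id, value FROM mytable ORDER BY id").fetchall()
print(rows)  # [(1, 'value'), (2, 'old')]
```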
update t T
set T.value = 'value'
where T.id in (select T2.id from t T2, b B, c C, d D
where T2.id=B.id and B.id=C.id and C.id=D.id and D.key=1)
-- t is the table name, T is the alias used to refer to this table