SQL Joining On Multiple Tables - sql

If I were to have a SQL query such as:
SELECT * FROM tableA a
INNER JOIN TABLEB b on a.item = b.item
INNER JOIN TABLEC c on a.item = c.item
LEFT JOIN TABLED d on d.item = c.item
Am I right in assuming the following:
Firstly, Table A is joined with Table B
Table C is joined with Table A independently of statement 1
Table D is left joined with Table C independently of statements 1 and 2
The results of statements 1, 2 and 3 are listed all together by the SELECT

No, you are not correct. SQL is a declarative language, not a procedural language. A SQL query describes the result set but does not specify how the query is actually executed.
In fact, almost all databases execute queries using directed acyclic graphs (DAGs). The operators in the graphs have names, but they do not correspond to the clauses in SQL.
In other words, the query is compiled to a language that you would not recognize. Along the way, it is optimized. A key part of the optimization is determining the order of the join operations.
In terms of interpreting what the query means, the joins are grouped from left to right (in the absence of parentheses):
SELECT *
FROM ((tableA a INNER JOIN
TABLEB b
on a.item = b.item
) INNER JOIN
TABLEC c
on a.item = c.item
) LEFT JOIN
TABLED d
on d.item = c.item
I believe this is basically what you have described.
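To make the left-to-right grouping concrete, here is a minimal sketch with made-up data. Only the table names and the item column come from the question; the rows are hypothetical, and the multi-row INSERT syntax assumes an engine that supports it (run it in an empty scratch schema).

```sql
-- Hypothetical sample data; only the item column comes from the question.
CREATE TABLE tableA (item INT);
CREATE TABLE tableB (item INT);
CREATE TABLE tableC (item INT);
CREATE TABLE tableD (item INT);

INSERT INTO tableA VALUES (1), (2), (3);
INSERT INTO tableB VALUES (1), (2);
INSERT INTO tableC VALUES (2), (3);
INSERT INTO tableD VALUES (3);

SELECT a.item AS a_item, b.item AS b_item, c.item AS c_item, d.item AS d_item
FROM tableA a
INNER JOIN tableB b ON a.item = b.item   -- keeps items 1 and 2
INNER JOIN tableC c ON a.item = c.item   -- of those, only item 2 is in tableC
LEFT JOIN tableD d ON d.item = c.item;   -- item 2 is not in tableD, so d_item is NULL

-- Result: a single row (2, 2, 2, NULL)
```

The output is one combined result set, with each join narrowing or extending the rows produced by the joins before it, rather than three separate join results listed together.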

Related

how do I properly create an sql query that joins four tables with a common key

I have four tables that have a key in common.
Three of the four tables are small subsets of the fourth (the master).
I want to join the tables so that the output contains only the records from the master table that appear in any of the other three:
as an example:
My end result should look like this:
My problem is that the joins I'm using give me only the records that are common to all tables, or records that are common to only one of the tables and the master.
Any help on formulating the correct join would be awesome!
Three left joins will produce the result you want. For example:
select a.*, b.color, c.size, d.weight
from a
left join b on b.id = a.id
left join c on c.id = a.id
left join d on d.id = a.id
where b.id is not null or c.id is not null or d.id is not null
EDIT: Added WHERE clause above as requested.
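As a rough check of the three-left-joins approach, the sketch below uses hypothetical tables and rows (the column names name, color, size and weight are assumptions based on the select list above, and it assumes an empty scratch schema). It also assumes each id appears at most once in b, c and d; duplicates there would multiply the output rows.

```sql
-- a is the master table; b, c and d are small subsets keyed by the same id.
CREATE TABLE a (id INT PRIMARY KEY, name VARCHAR(10));
CREATE TABLE b (id INT PRIMARY KEY, color VARCHAR(10));
CREATE TABLE c (id INT PRIMARY KEY, size VARCHAR(10));
CREATE TABLE d (id INT PRIMARY KEY, weight INT);

INSERT INTO a VALUES (1, 'one'), (2, 'two'), (3, 'three'), (4, 'four');
INSERT INTO b VALUES (1, 'red');
INSERT INTO c VALUES (1, 'L'), (2, 'M');
INSERT INTO d VALUES (3, 10);

SELECT a.*, b.color, c.size, d.weight
FROM a
LEFT JOIN b ON b.id = a.id
LEFT JOIN c ON c.id = a.id
LEFT JOIN d ON d.id = a.id
WHERE b.id IS NOT NULL OR c.id IS NOT NULL OR d.id IS NOT NULL
ORDER BY a.id;

-- Result:
-- (1, 'one',   'red', 'L',  NULL)
-- (2, 'two',   NULL,  'M',  NULL)
-- (3, 'three', NULL,  NULL, 10)
-- id 4 is dropped because it appears in none of b, c or d.
```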

What is a good way to make multiple full outer join?

I'd like to know if anyone knows an elegant and scalable method to full outer join multiple tables, given that I might want to regularly add new tables to the join.
For now, my method consists of full joining table A with table B, storing the result as a cte, then full joining the cte to table C, storing the result as cte2, full joining cte2 to table D... you get the idea.
Creating a new cte every time I want to add another table to the join is not very practical, but every other solution I have found so far has the same issue: there's always some kind of endless nesting, either of ctes or of selects (like SELECT blabla FROM (SELECT blabla2 FROM..)).
Is there a way I don't know about to perform this multiple full join without falling into an endless chain of nested ctes?
Thanks
EDIT: Sorry, it seems it wasn't clear enough.
When I perform a multiple full join in one query like:
SELECT
a.*, b.*, c.*
FROM
tableA a
FULL JOIN
tableB b
ON
a.id = b.id
FULL JOIN
tableC c
ON
a.id = c.id
If the id is present in tableB and tableC but not tableA, my result will contain two lines where there should be one, because I joined b to a and c to a but not b to c. That's why I need to full join the result of the full join of a and b to c.
So if I have, let's say, five tables instead of three, I need to full join the result of the full join of the result of the full join of the result of the full join... x)
This fiddle illustrates the problem.
If you want the rows from tables B and C to join, you need to accommodate the fact that the data may come from table B and not A. The easiest way is probably to use COALESCE.
Your join should therefore look like:
SELECT a.*, b.*, c.*
FROM tableA a
FULL JOIN tableB b ON a.id = b.id
FULL JOIN tableC c ON COALESCE(a.id, b.id) = c.id
-- FULL JOIN tableD d ON COALESCE(a.id, b.id, c.id) = d.id
-- FULL JOIN tableE e ON COALESCE(a.id, b.id, c.id, d.id) = e.id
Most databases that support FULL JOIN also support USING, which is the simplest way to do what you want:
SELECT *
FROM tableA a FULL JOIN
tableB b
USING (id) FULL JOIN
tableC c
USING (id);
The semantics of USING mean that the merged id column takes a non-NULL value whenever one of the joined tables supplies one.
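The difference between the three formulations can be seen with a tiny data set. This is a sketch assuming PostgreSQL (or another engine with FULL JOIN and USING) and an empty scratch schema; the rows are made up so that id 2 exists in tableB and tableC but not in tableA, which is the case described in the question.

```sql
CREATE TABLE tableA (id INT);
CREATE TABLE tableB (id INT);
CREATE TABLE tableC (id INT);

INSERT INTO tableA VALUES (1);
INSERT INTO tableB VALUES (2);
INSERT INTO tableC VALUES (2);

-- Joining c to a only: id 2 ends up on two separate lines.
SELECT a.id AS a_id, b.id AS b_id, c.id AS c_id
FROM tableA a
FULL JOIN tableB b ON a.id = b.id
FULL JOIN tableC c ON a.id = c.id;
-- Rows (in no guaranteed order): (1, NULL, NULL), (NULL, 2, NULL), (NULL, NULL, 2)

-- Joining c to whichever earlier table supplied the id: one line per id.
SELECT a.id AS a_id, b.id AS b_id, c.id AS c_id
FROM tableA a
FULL JOIN tableB b ON a.id = b.id
FULL JOIN tableC c ON COALESCE(a.id, b.id) = c.id;
-- Rows (in no guaranteed order): (1, NULL, NULL), (NULL, 2, 2)

-- USING merges the key into a single id column.
SELECT id
FROM tableA a
FULL JOIN tableB b USING (id)
FULL JOIN tableC c USING (id)
ORDER BY id;
-- Rows: 1, 2
```

Adding a further table then only needs one more FULL JOIN line, with either one more argument to COALESCE or one more USING (id).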

Does inner join order and WHERE have an impact on performance? [duplicate]

Are these two queries equivalent in performance?
select a.*
from a
inner join b
on a.bid = b.id
inner join c
on b.cid = c.id
where c.id = 'x'
and
select a.*
from c
inner join b
on b.cid = c.id
join a
on a.bid = b.id
where c.id = 'x'
Does it join all the table first then filter the condition, or is the condition applied first to reduce the join ?
(I am using sql server)
The Query Optimizer will almost always filter table c first before joining c to the other two tables. You can verify this by looking at the execution plan and checking how many rows SQL Server takes from table c to participate in the join.
About join order: the Query Optimizer will pick a join order that it thinks will work best for your query. It could be a JOIN b JOIN (filtered c) or (filtered c) JOIN a JOIN b.
If you want to force a certain order, include a hint:
SELECT *
FROM a
INNER JOIN b ON ...
INNER JOIN c ON ...
WHERE c.id = 'x'
OPTION (FORCE ORDER)
This will force SQL Server to do a join b join (filtered c). Standard warning: unless you see massive performance gain, most times it's better to leave the join order to the Query Optimizer.
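One way to compare the plans yourself is SET SHOWPLAN_TEXT, which makes SQL Server return the estimated plan instead of running the statement. The sketch below assumes SQL Server with a client such as SSMS or sqlcmd (GO is the client's batch separator, not T-SQL), and the table definitions are hypothetical, guessed from the column names in the question.

```sql
-- Hypothetical schema matching the column names used in the question.
CREATE TABLE c (id VARCHAR(10) PRIMARY KEY);
CREATE TABLE b (id INT PRIMARY KEY, cid VARCHAR(10));
CREATE TABLE a (id INT PRIMARY KEY, bid INT);
GO

-- SET SHOWPLAN_TEXT must be the only statement in its batch.
SET SHOWPLAN_TEXT ON;
GO

-- Plan chosen freely by the optimizer.
SELECT a.*
FROM a
INNER JOIN b ON a.bid = b.id
INNER JOIN c ON b.cid = c.id
WHERE c.id = 'x';
GO

-- Same query with the join order pinned to the order written.
SELECT a.*
FROM a
INNER JOIN b ON a.bid = b.id
INNER JOIN c ON b.cid = c.id
WHERE c.id = 'x'
OPTION (FORCE ORDER);
GO

SET SHOWPLAN_TEXT OFF;
GO
```

With empty or tiny tables the two plans may well look alike; the comparison is more telling against realistic data volumes and indexes.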
See http://www.bennadel.com/blog/70-sql-query-order-of-operations.htm
The execution order is FROM, then WHERE. In this case, as in any other, I don't think the WHERE clause is executed before the JOINs.
select a.*
from (select * from c where c.id = 'x') c
inner join b
on b.cid = c.id
inner join a
on a.bid = b.id
This can make a difference in execution.

PL/SQL Using multiple left join

SELECT * FROM Table A LEFT JOIN TABLE B LEFT JOIN TABLE C
From the snippet above, TABLE C will left join into (TABLE B) or (data from TABLE A LEFT JOIN TABLE B) or (TABLE A)?
TABLE C will left join into 1. (TABLE B) or 2. (data from TABLE A LEFT JOIN
TABLE B) or 3. (TABLE A)?
The second. But the join condition will help you understand more.
You can write:
SELECT *
FROM Table A
LEFT JOIN TABLE B ON (A.id = B.id)
LEFT JOIN TABLE C ON (A.ID = C.ID)
But you can also write:
SELECT *
FROM Table A
LEFT JOIN TABLE B ON (A.id = B.id)
LEFT JOIN TABLE C ON (A.id = C.id and B.code = C.code)
So, you can join on every field from the previous tables, and you join on "the result" of the previous joins (though the engine may choose its own way to get that result).
Think of LEFT JOIN as a non-commutative operation (A left join B is not the same as B left join A). So the order is important, and C will be left joined to the previously joined tables.
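A quick way to see that the order of a left join matters is to swap the two sides on a small data set. The tables and rows below are hypothetical, and the sketch assumes an empty scratch schema.

```sql
CREATE TABLE A (id INT);
CREATE TABLE B (id INT);

INSERT INTO A VALUES (1), (2);
INSERT INTO B VALUES (2), (3);

-- Every row of A is kept; B fills in where it matches.
SELECT A.id AS a_id, B.id AS b_id
FROM A
LEFT JOIN B ON A.id = B.id
ORDER BY A.id;
-- Rows: (1, NULL), (2, 2)

-- Every row of B is kept; A fills in where it matches.
SELECT A.id AS a_id, B.id AS b_id
FROM B
LEFT JOIN A ON A.id = B.id
ORDER BY B.id;
-- Rows: (2, 2), (NULL, 3)
```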
The Oracle documentation is quite specific about how the joins are processed:
To execute a join of three or more tables, Oracle first joins two of
the tables based on the join conditions comparing their columns and
then joins the result to another table based on join conditions
containing columns of the joined tables and the new table. Oracle
continues this process until all tables are joined into the result.
This is the logical approach to handling the joins and is consistent with the ANSI standard (in other words, all database engines process the joins in order).
However, when the query is actually executed, the optimizer may choose to run the joins in a different order. The result needs to be logically the same as processing the joins in the order given in the query.
Also, the join conditions may cause some unexpected conditions to arise. So if you have:
from A left outer join
B
on A.id = B.id left outer join
C
on B.id = C.id
Then, you might have the condition where A and C each have a row with a particular id, but B does not. With this formulation, you will not see the row in C because it is joining to NULL. So, be careful with join conditions on left outer join, particularly when joining to a table other than the first table in the chain.
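The pitfall is easy to reproduce. In this sketch (hypothetical rows, empty scratch schema assumed), id 1 exists in A and C but not in B, so joining C through B loses the C row, while joining C back to A keeps it.

```sql
CREATE TABLE A (id INT);
CREATE TABLE B (id INT);
CREATE TABLE C (id INT);

INSERT INTO A VALUES (1), (2);
INSERT INTO B VALUES (2);
INSERT INTO C VALUES (1), (2);

-- C is joined to B: for id 1, B.id is NULL, so nothing in C can match.
SELECT A.id AS a_id, B.id AS b_id, C.id AS c_id
FROM A
LEFT OUTER JOIN B ON A.id = B.id
LEFT OUTER JOIN C ON B.id = C.id
ORDER BY A.id;
-- Rows: (1, NULL, NULL), (2, 2, 2)

-- C is joined to A: the C row for id 1 is found.
SELECT A.id AS a_id, B.id AS b_id, C.id AS c_id
FROM A
LEFT OUTER JOIN B ON A.id = B.id
LEFT OUTER JOIN C ON A.id = C.id
ORDER BY A.id;
-- Rows: (1, NULL, 1), (2, 2, 2)
```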
You need to specify the column names properly in order to run the query. Let's say you are using:
SELECT *
FROM Table A
LEFT JOIN TABLE B ON (A.id = B.id)
LEFT JOIN TABLE C ON (A.id = C.id and B.code = C.code)
Then you may get the following error:
ORA-00933: SQL command not properly ended
So to avoid it you can try:
SELECT A.id as "Id_from_A", B.code as "Code_from_B"
FROM Table A
LEFT JOIN TABLE B ON (A.id = B.id)
LEFT JOIN TABLE C ON (A.id = C.id and B.code = C.code)
Thanks

Joining in SQL on more than 2 tables Using ORACLE

Suppose I have three tables, A, B and C.
I joined table A and table B, and now I want to join the result AB with table C.
Do I need to create a view and then join, or do I need to use a nested query?
You don't say which DB you're using, so the syntax could be wrong, but multi-table joins aren't any different, really, than joining two tables:
SELECT ...
FROM a
JOIN b ON ...
JOIN c ON ...
JOIN d ON ...
No, you would do it as follows:
SELECT *
FROM A [INNER/LEFT/RIGHT/FULL] JOIN
B ON [a/b].IDCols = [a/b].IDCols [INNER/LEFT/RIGHT/FULL] JOIN
C ON [a/b/c].IDCols = [a/b/c].IDCols
The specific joins (INNER/LEFT/RIGHT/FULL) will depend on what your requirements are.
Have a look at Introduction to JOINs – Basic of JOINs for an overview
The criteria for the JOIN ONs will also depend on how the tables relate to one another.
You can either use something like this:
select *
from A, B, C
where A.id = B.id
and A.id = C.id
or you can use something like this:
select *
from A INNER JOIN B ON (A.id = B.id)
INNER JOIN C ON (A.id = C.id)
How you join will of course depend on how the tables relate to each other, so what's the primary key and foreign key of A,B,C.
You might also use OUTER JOIN instead of INNER JOIN, depending on your data.