SQL joining three tables, join precedence - sql

I have three tables: R, S and P.
Table R Joins with S through a foreign key; there should be at least one record in S, so I can JOIN:
SELECT
*
FROM
R
JOIN S ON (S.id = R.fks)
If there's no record in S then I get no rows, that's fine.
Then table S joins with P, where records is P may or may not be present and joined with S.
So I do
SELECT
*
FROM
R
JOIN S ON (S.id = R.fks)
LEFT JOIN P ON (P.id = S.fkp)
What if I wanted the second JOIN to be tied to S not to R, like if I could use parentheses:
SELECT
*
FROM
R
JOIN (S ON (S.id = R.fks) JOIN P ON (P.id = S.fkp))
Or is that already a natural behaviour of the cartesian product between R, S and P?

All kinds of outer and normal joins are in the same precedence class and operators take effect left-to-right at a given nesting level of the query. You can put the join expression on the right side in parentheses to cause it to take effect first. Remember that you will have to move the ON clauses around so that they stay with their joins—the join in parentheses takes its ON clause with it into the parentheses, so it now comes textually before the other ON clause which will be after the parentheses in the outer join statement.
(PostgreSQL example)
In
SELECT * FROM a LEFT JOIN b ON (a.id = b.id) JOIN c ON (b.ref = c.id);
the a-b join takes effect first, but we can force the b-c join to take effect first by putting it in parentheses, which looks like:
SELECT * FROM a LEFT JOIN (b JOIN c ON (b.ref = c.id)) ON (a.id = b.id);
Often you can express the same thing without extra parentheses by moving the joins around and changing the direction of the outer joins, e.g.
SELECT * FROM b JOIN c ON (b.ref = c.id) RIGHT JOIN a ON (a.id = b.id);

When you join the third table, your first query
SELECT
*
FROM
R
JOIN S ON (S.id = R.fks)
is like a derived table to which you're joining the third table. So if R JOIN S produces no rows, then joining P will never yield any rows (because you're trying to join to an empty table).
So, if you're looking for precedence rules then in this case it's just set by using LEFT JOIN as opposed to JOIN.
However, I may be misunderstanding your question, because if I were writing the query, I would swap S and R around. eg.
SELECT
*
FROM
S
JOIN R ON (S.id = R.fks)

The second join is tied to S as you explicity state JOIN P ON (P.id = S.fkp) - no column from R is referenced in the join.

with a as (select 1 as test union select 2)
select * from a left join
a as b on a.test=b.test and b.test=1 inner join
a as c on b.test=c.test
go
with a as (select 1 as test union select 2)
select * from a inner join
a as b on a.test=b.test right join
a as c on b.test=c.test and b.test=1
Ideally, we would hope that the above two queries are the same. However, they are not - so anybody that says a right join can be replaced with a left join in all cases is wrong. Only by using the right join can we get the required result.

Related

use Inner Join in SQL

I want to join two tables and then I want to join this result with another table
but it doesn't work
select * from
(
(select SeId,FLName,Company from Sellers) s
inner join
(select SeId,BIId from BuyInvoices) b
on s.SeId=b.SeId
) Y
inner join
(select * from BuyPayments) X
on Y.BIId=X.BIId
thanks
In most databases, your syntax isn't going to work. Although parentheses are allowed in the FROM clause, they don't get their own table aliases.
You can simplify the JOIN. This is a simpler way to write the logic:
select s.SeId, s.FLName, s.Company, bp.*
from Sellers s inner join
BuyInvoices b
on s.SeId = b.SeId inner join
BuyPayments bp
on bp.BIId = b.BIId;

SQL - Consecutive "ON" Statements

As I was cleaning up some issues in an old view in our database I came across this "strange" join condition:
from
tblEmails [e]
join tblPersonEmails [pe]
on (e.EmailID = pe.EmailID)
right outer join tblUserAccounts [ua]
join People [p]
on (ua.PersonID = p.Id)
join tblChainEmployees [ce]
on (ua.PersonID = ce.PersonID)
on (pe.PersonID = p.Id)
Table tblUserAccounts is referenced as a right outer join, but the on condition for it is not declared until after tblChainEmployees is referenced; then there are two consecutive on statements in a row.
I couldn't find a relevant answer anywhere on the Internet, because I didn't know what this kind of join is called.
So the questions:
Does this kind of "deferred conditional" join have a name?
How can this be rewritten to produce the same result set where the on statements are not consecutive?
Maybe this is a "clever" solution when there has always been a simpler/clearer way?
(1) This is just syntax and I've never heard of some special name. If you read carefully this MSDN article you'll see that (LEFT|RIGHT) JOIN has to be paired with ON statement. If it's not, expression inside is parsed as <table_source>. You can put parentheses to make it more readable:
from
tblEmails [e]
join tblPersonEmails [pe]
on (e.EmailID = pe.EmailID)
right outer join
(
tblUserAccounts [ua]
join People [p]
on (ua.PersonID = p.Id)
join tblChainEmployees [ce]
on (ua.PersonID = ce.PersonID)
) on (pe.PersonID = p.Id)
(2) I would prefer LEFT syntax, with explicit parentheses (I know, it's a matter of taste). This produces the same execution plan:
FROM tblUserAccounts ua
JOIN People p ON ua.PersonID = p.Id
JOIN tblChainEmployees ce ON ua.PersonID = ce.PersonID
LEFT JOIN
(
tblEmails e
JOIN tblPersonEmails pe ON e.EmailID = pe.EmailID
) ON pe.PersonID = p.Id
(3) Yes, it's clever, just like some C++ expressions (i.e. (i++)*(*t)[0]<<p->a) on interviews. Language is flexible. Expressions and queries can be tricky, but some 'arrangements' lead to readability degradation and errors.
Looks to me like you have tblEmail and tblPerson with their own independent IDs, emailID and ID (person), a linking table tblPersonEmail with the valid pairs of emailID/IDs, and then the person table may have a 1-1 relationship with UserAccount, which may then have a 1-1 relationship with chainEmployee, so to get rid of the RIGHT OUTER JOIN in favor of LEFT, I'd use:
FROM
((tblPerson AS p INNER JOIN
(tblEmail AS e INNER JOIN
tblPersonEmail AS pe ON
e.emailID = pe.emailID) ON
p.ID = pe.personID) LEFT JOIN
tblUserAccount AS ua ON
p.ID = ua.personID) LEFT JOIN
tblChainEmployee AS ce ON
ua.personID = ce.personID
I can't think of a great practical example of this off the top of my head so I'll give you a generic example that hopefully makes sense. Unfortunately I'm not aware of a generic name for this either.
Many people will start off with a query like this:
select ...
from
A a left outer join
B b on b.id = a.id left outer join
C c on c.id2 = b.id2;
The look at the results and realize that they really need to eliminate the rows in B that don't have a corresponding C but if you tried to say where b.id2 is not null and c.id2 is not null you've defeated the whole purpose of the left join from A.
So next you try to do this but it doesn't take long to figure out it's not going to work. The inner join at the tail end of the chain has basically converted both the joins to inner joins.
select ...
from
A a left outer join
B b on b.id = a.id inner join
C c on c.id2 = b.id2;
The problem seems simple yet it doesn't work right. Essentially after you ponder for a while you discover that you need to control the join order and do the inner join first. So the three queries below are equivalent ways to accomplish that. The first one is probably the one you're more familiar with:
select ...
from
A a left outer join
(select * from B b inner join C c on c.id2 = b.id2) bc
on bc.id = a.id
select ...
from
A a left outer join
B b inner join
C c on c.id2 = b.id2
on b.id = a.id
select ...
from
B b inner join
C c on c.id2 = b.id2 right outer join -- now they can be done in order
A a on a.id = b.id
You query is a little more complicated but ultimately the same issues came into play which is where the odd stuff came from. SQL has evolved and you have to remember that platforms didn't always have the fancy things like derived tables, scalar subqueries, CTEs so sometimes people had to write things this way. And then there were graphical query builders with a lot of limitations in older versions of tools like Crystal Report that didn't allow for complex join conditions...

SQL: Proper JOIN Protocol

I have the following tables with the following attributes:
Op(OpNo, OpName, Date)
OpConvert(OpNo, M_OpNo, Source_ID, Date)
Source(Source_ID, Source_Name, Date)
Fleet(OpNo, S_No, Date)
I have the current multiple JOIN query which gives me the results that I want:
SELECT O.OpNo AS Op_NO, O.OpName, O.Date AS Date_Entered, C.*
FROM Op O
LEFT OUTER JOIN OpConvert C
ON O.OpNo = C.OpNo
LEFT OUTER JOIN Source D
ON C.Source_ID = D.Source_ID
WHERE C.OpNo IS NOT NULL
The problem is this. I need to join the Fleet table on the previous multiple JOIN statement to attach the relevant S_No to the multiple JOIN table. Would I still be able to accomplish this using a LEFT OUTER JOIN or would I have to use a different JOIN statement? Also, which table would I JOIN on?
Please note that I am only familiar with LEFT OUTER JOINS.
Thanks.
I guess in your case you could use INNER JOIN or LEFT JOIN (which is the same thing as LEFT OUTER JOIN in SQL Server.
INNER JOIN means that it will only return records from other tables only if there are corresponding records (based on the join condition) in the Fleet table.
LEFT JOIN means that it will return records from other tables even if there are no corresponding records (based on the join condition) in the Fleet table. All columns from Fleet will return NULL in this case.
As for which table to join, you should really join the table that makes more logical sense based on your data structure.
Yes, you can use all tables mentioned before in your join conditions. Actually, JOINS (no matter of INNER, LEFT OUTER, RIGHT OUTER, CROSS, FULL OUTER or whatever) are left- associative, i. e. they are implicitly evaluated as if they would have been included in parentheses from the left as follows:
FROM ( ( ( Op O
LEFT OUTER JOIN OpConvert C
ON O.OpNo = C.OpNo
)
LEFT OUTER JOIN Source D
ON C.Source_ID = D.Source_ID
)
LEFT OUTER JOIN Fleet
ON ...
)
This is similar to how + or - would implicitly use parentheses, i. e.
2 + 3 - 4 - 5
is evaluated as
(((2 + 3) - 4) - 5)
By the way: If you use C.OpNo IS NOT NULL, then the LEFT OUTER JOIN Source D is treated as if it were an INNER JOIN, as you are explicitly removing all the "OUTER" rows.

How to using Left Outer Join or Right Outer Join in Oracle 11g

I have a query using "=" in where clause, but it is long time to execute when many datas. How to use the Left Outer Join or Right Outer Join or something like that to increase performance
This is query:
select sum(op.quantity * op.unit_amount) into paid_money
from tableA op , tableB ssl, tableC ss, tableD pl, tableE p
where (op.id = ssl.id and ssL.id = ss.id and ss.type='A')
or
(op.id = pl.id and pl.id = p.id and p.type='B');
Your problem is not left or right joins. It is cross joins. You are doing many unnecessary cartesian products. I'm guessing this query will never finish. If it did, you'd get the wrong answer anyway.
Split this into two separate joins and then bring the results together. Only use the tables you need for each set of joins:
select SUM(val) into paid_money
from (select sum(op.quantity * op.unit_amount) as val
from tableA op , tableB ssl
where (op.id = ssl.id and ssL.id = ss.id and ss.type='A')
union all
select sum(op.quantity * op.unit_amount) as val
from tableA op , tableD pl, tableD p
where (op.id = pl.id and pl.id = p.id and p.type='B')
) t
I haven't fixed your join syntax. But, you should learn to use the join keyword and to put the join conditions in an on clause rather than the where clause.
Are you sure that this query is returning the required data? To me it looks like it will be returning the cartesian product of op, ssl & ss for each op, pl, p match and vice versa.
I would advise that you split it into two seperate queries, union them together, and then sum over the top.

Sybase 12 LEFT JOIN Performance issue for CrystalReports

I am trying to optimize a Crystal Report that is used very frequently here. I succeeded to optimize lots of queries but I still have one last bottleneck: This is the main query, generated from the report.
SELECT
A.*,
B.*,
C.*,
D.*,
E."N",
F."N",
G."N"
FROM
A
LEFT OUTER JOIN B ON
A."PK" = B."FK"
LEFT OUTER JOIN C ON
A."PK" = C."FK"
LEFT OUTER JOIN D ON
A."FK" = D."PK"
LEFT OUTER JOIN E ON
A."PK" = E."FK"
LEFT OUTER JOIN F ON
A."PK" = F."FK"
LEFT OUTER JOIN G ON
A."PK" = G."FK"
WHERE A.PK = ####
A,B,C and D are tables. E,F,G are simple views.
As you see, the report generated multiple LEFT JOINS. This query takes 2.28 seconds to complete (From the Plan Viewer stats). I identified three joins that seem problematic. If I remove E,F,G from the query, it becomes almost instant (0.0009s from the same stats)
SELECT
A.*,
B.*,
C.*,
D.*
FROM
A
LEFT OUTER JOIN B ON
A."PK" = B."FK"
LEFT OUTER JOIN C ON
A."PK" = C."FK"
LEFT OUTER JOIN D ON
A."FK" = D."PK"
WHERE A.PK = ####
I tought it might be the views that are slow, but if I do for example ...
SELECT *
FROM E
WHERE E.FK = ####
... it is also almost instant (0.0009s)
Tables all have indexes on PKs-FKs.
Views E,F,G all return one or no row with [FK|N] as columns, so the resulting column is NULL or a number.
Do you know how I could make this query fast?
PS: If I replace LEFT OUTER JOINS by INNER JOINS the main query becomes fast... :-/
Or trying to split this query into multiple queries on the report would be a better solution?
Thank you!
I would create functions for the lookup against E, F and G instead of joining them.
That way there is little chance the optimiser gets confused and tries to do stupid things.
SELECT
A.*,
B.*,
C.*,
D.*,
GET_E(A."PK"),
GET_F(A."PK"),
GET_G(A."PK")
FROM
A
LEFT OUTER JOIN B ON
A."PK" = B."FK"
LEFT OUTER JOIN C ON
A."PK" = C."FK"
LEFT OUTER JOIN D ON
A."FK" = D."PK"
WHERE A.PK = ####
The problem is probably because you are creating a huge cartesian product of 5 tables all joined to A in some way (A and D will only contribute one record to the product). Having such a big cartesian product will consume quite a bit of memory internally in Sybase. It is likely that your query is just wrong.