What's the difference between just using multiple FROMs and joins?

Say I have this query:
SELECT bugs.id, bug_color.name FROM bugs, bug_color
WHERE bugs.id = 1 AND bugs.id = bug_color.id
Why would I use a join? And what would it look like?

For an inner join like this one, the JOIN syntax is syntactic sugar: it does the same thing but is easier to read.
Your query would look like this with a join:
SELECT bugs.id, bug_color.name
FROM bugs
INNER JOIN bug_color ON bugs.id = bug_color.id
WHERE bugs.id = 1
With more than two tables, joins help make a query more readable by keeping the conditions related to each table in one place.
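If you want to check the equivalence yourself, here is a minimal sketch using Python's built-in sqlite3 module. The table contents are made up for illustration; only the table and column names come from the question.

```python
import sqlite3

# In-memory database with made-up sample rows (assumption: ids and colors are invented).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE bugs (id INTEGER PRIMARY KEY);
    CREATE TABLE bug_color (id INTEGER PRIMARY KEY, name TEXT);
    INSERT INTO bugs VALUES (1), (2), (3);
    INSERT INTO bug_color VALUES (1, 'red'), (2, 'green');
""")

# Old style: comma-separated tables, join condition in WHERE.
comma_style = conn.execute("""
    SELECT bugs.id, bug_color.name
    FROM bugs, bug_color
    WHERE bugs.id = 1 AND bugs.id = bug_color.id
""").fetchall()

# Explicit style: join condition in ON, filter in WHERE.
join_style = conn.execute("""
    SELECT bugs.id, bug_color.name
    FROM bugs
    INNER JOIN bug_color ON bugs.id = bug_color.id
    WHERE bugs.id = 1
""").fetchall()

print(comma_style)
print(join_style)
```

Both queries return the same single row for this data, which is the point of the answer: for inner joins the choice is about readability, not results.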

The join keyword is the new way of joining tables.
When I learned SQL it did not yet exist, so joining was done the way that you show in your question.
Nowadays we have things like joins and aliases to make the queries more readable:
select
b.id, c.name
from
bugs as b
inner join
bug_color as c on c.id = b.id
where
b.id = 1
There are also other variations of joins, such as left outer join, right outer join and full join, which are harder to accomplish with the old syntax.

Join syntax allows for outer joins, so you can go:
SELECT bugs.id, bug_color.name
FROM bugs
LEFT OUTER JOIN bug_color ON bugs.id = bug_color.id
WHERE bugs.id = 1
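As a rough illustration of what the outer join buys you, here is a small sketch with Python's sqlite3 module. The sample rows are invented, and the WHERE bugs.id = 1 filter is left out so that the difference between the two join types is visible.

```python
import sqlite3

# Made-up data: three bugs, but only bug 1 has a color row.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE bugs (id INTEGER PRIMARY KEY);
    CREATE TABLE bug_color (id INTEGER PRIMARY KEY, name TEXT);
    INSERT INTO bugs VALUES (1), (2), (3);
    INSERT INTO bug_color VALUES (1, 'red');
""")

# INNER JOIN drops bugs that have no matching color.
inner_rows = conn.execute("""
    SELECT bugs.id, bug_color.name
    FROM bugs
    INNER JOIN bug_color ON bugs.id = bug_color.id
    ORDER BY bugs.id
""").fetchall()

# LEFT OUTER JOIN keeps every bug and fills the missing color with NULL.
left_rows = conn.execute("""
    SELECT bugs.id, bug_color.name
    FROM bugs
    LEFT OUTER JOIN bug_color ON bugs.id = bug_color.id
    ORDER BY bugs.id
""").fetchall()

print(inner_rows)
print(left_rows)
```

With this data the inner join returns one row, while the left outer join returns all three bugs, two of them with None (NULL) as the color name.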

Related

SQL - Consecutive "ON" Statements

As I was cleaning up some issues in an old view in our database I came across this "strange" join condition:
from
tblEmails [e]
join tblPersonEmails [pe]
on (e.EmailID = pe.EmailID)
right outer join tblUserAccounts [ua]
join People [p]
on (ua.PersonID = p.Id)
join tblChainEmployees [ce]
on (ua.PersonID = ce.PersonID)
on (pe.PersonID = p.Id)
Table tblUserAccounts is referenced as a right outer join, but the on condition for it is not declared until after tblChainEmployees is referenced; then there are two consecutive on statements in a row.
I couldn't find a relevant answer anywhere on the Internet, because I didn't know what this kind of join is called.
So the questions:
Does this kind of "deferred conditional" join have a name?
How can this be rewritten to produce the same result set where the on statements are not consecutive?
Maybe this is a "clever" solution when there has always been a simpler/clearer way?
(1) This is just syntax, and I've never heard of a special name for it. If you read this MSDN article carefully, you'll see that a (LEFT|RIGHT) JOIN has to be paired with an ON clause. Until that ON appears, whatever follows the JOIN is parsed as a <table_source>. You can add parentheses to make it more readable:
from
tblEmails [e]
join tblPersonEmails [pe]
on (e.EmailID = pe.EmailID)
right outer join
(
tblUserAccounts [ua]
join People [p]
on (ua.PersonID = p.Id)
join tblChainEmployees [ce]
on (ua.PersonID = ce.PersonID)
) on (pe.PersonID = p.Id)
(2) I would prefer LEFT syntax, with explicit parentheses (I know, it's a matter of taste). This produces the same execution plan:
FROM tblUserAccounts ua
JOIN People p ON ua.PersonID = p.Id
JOIN tblChainEmployees ce ON ua.PersonID = ce.PersonID
LEFT JOIN
(
tblEmails e
JOIN tblPersonEmails pe ON e.EmailID = pe.EmailID
) ON pe.PersonID = p.Id
(3) Yes, it's clever, just like some C++ expressions (e.g. (i++)*(*t)[0]<<p->a) in interviews. The language is flexible. Expressions and queries can be tricky, but some 'arrangements' hurt readability and invite errors.
Looks to me like you have tblEmail and tblPerson with their own independent IDs, emailID and ID (person), a linking table tblPersonEmail with the valid pairs of emailID/IDs, and then the person table may have a 1-1 relationship with UserAccount, which may then have a 1-1 relationship with chainEmployee, so to get rid of the RIGHT OUTER JOIN in favor of LEFT, I'd use:
FROM
((tblPerson AS p INNER JOIN
(tblEmail AS e INNER JOIN
tblPersonEmail AS pe ON
e.emailID = pe.emailID) ON
p.ID = pe.personID) LEFT JOIN
tblUserAccount AS ua ON
p.ID = ua.personID) LEFT JOIN
tblChainEmployee AS ce ON
ua.personID = ce.personID
I can't think of a great practical example of this off the top of my head so I'll give you a generic example that hopefully makes sense. Unfortunately I'm not aware of a generic name for this either.
Many people will start off with a query like this:
select ...
from
A a left outer join
B b on b.id = a.id left outer join
C c on c.id2 = b.id2;
Then they look at the results and realize that they really need to eliminate the rows in B that don't have a corresponding C, but if you tried to say where b.id2 is not null and c.id2 is not null you'd defeat the whole purpose of the left join from A.
So next you try to do this but it doesn't take long to figure out it's not going to work. The inner join at the tail end of the chain has basically converted both the joins to inner joins.
select ...
from
A a left outer join
B b on b.id = a.id inner join
C c on c.id2 = b.id2;
The problem seems simple yet it doesn't work right. Essentially after you ponder for a while you discover that you need to control the join order and do the inner join first. So the three queries below are equivalent ways to accomplish that. The first one is probably the one you're more familiar with:
select ...
from
A a left outer join
(select * from B b inner join C c on c.id2 = b.id2) bc
on bc.id = a.id
select ...
from
A a left outer join
B b inner join
C c on c.id2 = b.id2
on b.id = a.id
select ...
from
B b inner join
C c on c.id2 = b.id2 right outer join -- now they can be done in order
A a on a.id = b.id
Your query is a little more complicated, but ultimately the same issues came into play, which is where the odd stuff came from. SQL has evolved, and you have to remember that platforms didn't always have fancy things like derived tables, scalar subqueries and CTEs, so sometimes people had to write things this way. And then there were graphical query builders with a lot of limitations in older versions of tools like Crystal Reports that didn't allow for complex join conditions...
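The generic A/B/C example can be tried out with a small sketch like the one below, using Python's sqlite3 module. The rows are made up. Only the derived-table form is shown for the "inner join first" case, with its columns listed explicitly instead of select * so the derived table has no duplicate column names; the nested-ON and right-join forms are left out because older SQLite builds don't accept RIGHT JOIN.

```python
import sqlite3

# Made-up data: A has 3 rows, B matches two of them, C matches only one B row.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (id INTEGER);
    CREATE TABLE B (id INTEGER, id2 INTEGER);
    CREATE TABLE C (id2 INTEGER);
    INSERT INTO A VALUES (1), (2), (3);
    INSERT INTO B VALUES (1, 10), (2, 20);
    INSERT INTO C VALUES (10);
""")

# Two chained left joins: the B row without a C row still shows up.
chained_left = conn.execute("""
    SELECT a.id, b.id, c.id2
    FROM A a
    LEFT OUTER JOIN B b ON b.id = a.id
    LEFT OUTER JOIN C c ON c.id2 = b.id2
    ORDER BY a.id
""").fetchall()

# Trailing inner join: rows of A without a B/C match disappear entirely.
trailing_inner = conn.execute("""
    SELECT a.id, b.id, c.id2
    FROM A a
    LEFT OUTER JOIN B b ON b.id = a.id
    INNER JOIN C c ON c.id2 = b.id2
    ORDER BY a.id
""").fetchall()

# Inner join done first (as a derived table), then left-joined to A.
inner_first = conn.execute("""
    SELECT a.id, bc.id, bc.id2
    FROM A a
    LEFT OUTER JOIN (
        SELECT b.id AS id, b.id2 AS id2
        FROM B b
        INNER JOIN C c ON c.id2 = b.id2
    ) bc ON bc.id = a.id
    ORDER BY a.id
""").fetchall()

print(chained_left)
print(trailing_inner)
print(inner_first)
```

Only the last query keeps every row of A while hiding the B row that has no C, which is the result the nested joins are after.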

What actually is a self-join?

I have several questions about self joins; could anyone help answer them?
Is there a strict format for a self join? There are samples like this:
SELECT a.column_name, b.column_name...
FROM table1 a, table1 b
WHERE a.common_field = b.common_field;
But there are also samples like this:
SELECT a.ID, b.NAME, a.SALARY
FROM CUSTOMERS a, CUSTOMERS b
WHERE a.SALARY < b.SALARY;
I wonder whether the connection (a.common_field = b.common_field) is necessary, since both formats are self joins.
How will the self join be optimized? Will it be treated as an INNER JOIN or a CROSS JOIN? In particular, is the second format a self CROSS JOIN? Are they treated the same way in SQLite and PostgreSQL?
My actual problem is that I want to extract a structure from a bunch of graph-like data, and my query looks like this:
SELECT A.colum, B.colum,....N.colum
FROM
table1 as A, table1 as B, table1 as C .... table2 as M, table2 as N ....
where
A.colum1<B.colum1 and
C.colum1=D.colum1 and
....
In the query, table1, table2... are single-column tables; they are the components of the final structure. Is this kind of self-join format the best fit for my problem? I find it very slow in PostgreSQL but fast in SQLite, which confuses me.
A self join is no different from any other join as far as structure/behavior goes, but self joins are typically used in different ways.
You should ditch the old-style comma-separated list of tables and use explicit ANSI JOIN syntax:
SELECT a.column_name, b.column_name...
FROM table1 a
JOIN table1 b
ON a.common_field = b.common_field;
You can specify what type of JOIN you want it to be (JOIN, LEFT JOIN, RIGHT JOIN, CROSS JOIN...), and how you want to relate the tables to each other, just like any other join. An equality condition is not required, as you've noted in your a.Salary < b.Salary example.
No, there's no strict format.
A self join is just the special case of joining a table with itself. Think of it as joining two instances of the same thing (in fact not two instances, but two references to the same table).
In general you will use an inner self join, but you can also cross join or outer join a table with itself.
Example:
select * from tbPeople p0
join tbPeople p1 on p1.id = p0.parentId
where p0.id = you
that returns you and your parents
select * from tbPeople p0
left join tbPeople p1 on p1.parentId = p0.id
where p0.id = you
that returns your kids, or just you in case you don't have offspring yet
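To make the "is the join condition necessary" part concrete, here is a small sketch using Python's sqlite3 module and the CUSTOMERS example from the question. The three rows are invented for illustration.

```python
import sqlite3

# Made-up CUSTOMERS rows; column names follow the question's example.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE CUSTOMERS (ID INTEGER PRIMARY KEY, NAME TEXT, SALARY INTEGER);
    INSERT INTO CUSTOMERS VALUES (1, 'Ann', 1000), (2, 'Bob', 2000), (3, 'Cy', 3000);
""")

# No condition at all: every row paired with every row (a self cross join).
cross_count = conn.execute("""
    SELECT COUNT(*) FROM CUSTOMERS a CROSS JOIN CUSTOMERS b
""").fetchone()[0]

# Comma syntax with a non-equality condition in WHERE.
comma_rows = conn.execute("""
    SELECT a.ID, b.NAME, a.SALARY
    FROM CUSTOMERS a, CUSTOMERS b
    WHERE a.SALARY < b.SALARY
    ORDER BY a.ID, b.ID
""").fetchall()

# Same query with explicit JOIN ... ON.
join_rows = conn.execute("""
    SELECT a.ID, b.NAME, a.SALARY
    FROM CUSTOMERS a
    JOIN CUSTOMERS b ON a.SALARY < b.SALARY
    ORDER BY a.ID, b.ID
""").fetchall()

print(cross_count)
print(comma_rows)
print(join_rows)
```

With three customers the unconditioned self join yields 9 pairs, and the salary condition keeps 3 of them. The comma form and the JOIN form return identical rows; the condition only decides which pairs survive.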

How to use Left Outer Join or Right Outer Join in Oracle 11g

I have a query using "=" in the where clause, but it takes a long time to execute when there is a lot of data. How can I use a Left Outer Join or Right Outer Join or something similar to improve performance?
This is the query:
select sum(op.quantity * op.unit_amount) into paid_money
from tableA op , tableB ssl, tableC ss, tableD pl, tableE p
where (op.id = ssl.id and ssL.id = ss.id and ss.type='A')
or
(op.id = pl.id and pl.id = p.id and p.type='B');
Your problem is not left or right joins. It is cross joins. You are doing many unnecessary cartesian products. I'm guessing this query will never finish. If it did, you'd get the wrong answer anyway.
Split this into two separate joins and then bring the results together. Only use the tables you need for each set of joins:
select SUM(val) into paid_money
from (select sum(op.quantity * op.unit_amount) as val
from tableA op , tableB ssl, tableC ss
where (op.id = ssl.id and ssL.id = ss.id and ss.type='A')
union all
select sum(op.quantity * op.unit_amount) as val
from tableA op , tableD pl, tableE p
where (op.id = pl.id and pl.id = p.id and p.type='B')
) t
I haven't fixed your join syntax. But, you should learn to use the join keyword and to put the join conditions in an on clause rather than the where clause.
Are you sure that this query is returning the required data? To me it looks like it will return the cartesian product of op, ssl & ss for each op, pl, p match and vice versa.
I would advise that you split it into two separate queries, union them together, and then sum over the top.
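To see the cartesian-product problem in numbers, here is a rough sketch using Python's sqlite3 module rather than Oracle, so the PL/SQL "into paid_money" part is dropped. The column layout and all rows are assumptions made up for the demo; only the table names, aliases and conditions come from the question.

```python
import sqlite3

# Assumed schema and invented rows; the expected total is 2*10 + 1*5 = 25.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tableA (id INTEGER, quantity INTEGER, unit_amount INTEGER);
    CREATE TABLE tableB (id INTEGER);
    CREATE TABLE tableC (id INTEGER, type TEXT);
    CREATE TABLE tableD (id INTEGER);
    CREATE TABLE tableE (id INTEGER, type TEXT);
    INSERT INTO tableA VALUES (1, 2, 10), (2, 1, 5);
    INSERT INTO tableB VALUES (1);
    INSERT INTO tableC VALUES (1, 'A');
    INSERT INTO tableD VALUES (2), (3);
    INSERT INTO tableE VALUES (2, 'B'), (3, 'B');
""")

# Original shape: five tables cross-joined, two alternatives combined with OR.
or_total = conn.execute("""
    SELECT SUM(op.quantity * op.unit_amount)
    FROM tableA op, tableB ssl, tableC ss, tableD pl, tableE p
    WHERE (op.id = ssl.id AND ssl.id = ss.id AND ss.type = 'A')
       OR (op.id = pl.id AND pl.id = p.id AND p.type = 'B')
""").fetchone()[0]

# Split shape: each branch joins only the tables it needs, then the sums are added.
split_total = conn.execute("""
    SELECT SUM(val) FROM (
        SELECT SUM(op.quantity * op.unit_amount) AS val
        FROM tableA op
        JOIN tableB ssl ON op.id = ssl.id
        JOIN tableC ss ON ssl.id = ss.id
        WHERE ss.type = 'A'
        UNION ALL
        SELECT SUM(op.quantity * op.unit_amount) AS val
        FROM tableA op
        JOIN tableD pl ON op.id = pl.id
        JOIN tableE p ON pl.id = p.id
        WHERE p.type = 'B'
    ) t
""").fetchone()[0]

print(or_total)
print(split_total)
```

With this data the OR version counts the first order once for every (pl, p) combination, so its total is inflated, while the split version returns the expected sum.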

Adding more conditions in the join or in the WHERE clause: which is better?

SELECT C.*
FROM Content C
INNER JOIN ContentPack CP ON C.ContentPackId = CP.ContentPackId
AND CP.DomainId = #DomainId
...and:
SELECT C.*
FROM Content C
INNER JOIN ContentPack CP ON C.ContentPackId = CP.ContentPackId
WHERE CP.DomainId = #DomainId
Is there any performance difference between these two queries?
Because both queries use an INNER JOIN, there is no difference -- they're equivalent.
That wouldn't be the case with an OUTER JOIN: criteria in the ON clause are applied before the join, while criteria in the WHERE clause are applied after it.
But your query would likely run better as:
SELECT c.*
FROM CONTENT c
WHERE EXISTS (SELECT NULL
FROM CONTENTPACK cp
WHERE cp.contentpackid = c.contentpackid
AND cp.domainid = #DomainId)
Using a JOIN risks duplicates if there's more than one CONTENTPACK record related to a CONTENT record. And it's pointless to JOIN if your query is not using columns from the table being JOINed to... JOINs are not always the fastest way.
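The duplicate risk is easy to reproduce. The sketch below uses Python's sqlite3 module with an assumed two-column layout for each table and invented rows, where two ContentPack rows relate to the same Content row. The domain id is passed as a bound parameter instead of the placeholder used in the question.

```python
import sqlite3

# Assumed columns and made-up rows: ContentPackId 10 appears twice in ContentPack.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Content (ContentId INTEGER, ContentPackId INTEGER);
    CREATE TABLE ContentPack (ContentPackId INTEGER, DomainId INTEGER);
    INSERT INTO Content VALUES (1, 10), (2, 20);
    INSERT INTO ContentPack VALUES (10, 1), (10, 1), (20, 2);
""")
domain_id = 1

# JOIN: the Content row is repeated once per matching ContentPack row.
join_rows = conn.execute("""
    SELECT c.ContentId
    FROM Content c
    INNER JOIN ContentPack cp ON c.ContentPackId = cp.ContentPackId
    WHERE cp.DomainId = ?
""", (domain_id,)).fetchall()

# EXISTS: each Content row appears at most once, however many matches exist.
exists_rows = conn.execute("""
    SELECT c.ContentId
    FROM Content c
    WHERE EXISTS (SELECT NULL
                  FROM ContentPack cp
                  WHERE cp.ContentPackId = c.ContentPackId
                    AND cp.DomainId = ?)
""", (domain_id,)).fetchall()

print(join_rows)
print(exists_rows)
```

Here the JOIN returns the same content row twice and EXISTS returns it once. Whether EXISTS is also faster depends on the database and its indexes, which this sketch does not measure.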
There's no performance difference, but I would prefer the inner join because I think it makes it very clear what you are joining the two tables on.
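On the OUTER JOIN caveat mentioned above, a small sketch may help. It uses Python's sqlite3 module, an assumed two-column layout for each table, and invented rows, and it swaps the INNER JOIN from the question for a LEFT JOIN to show where the two placements of the condition stop being equivalent.

```python
import sqlite3

# Assumed columns and made-up rows: one pack per content row, in different domains.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Content (ContentId INTEGER, ContentPackId INTEGER);
    CREATE TABLE ContentPack (ContentPackId INTEGER, DomainId INTEGER);
    INSERT INTO Content VALUES (1, 10), (2, 20);
    INSERT INTO ContentPack VALUES (10, 1), (20, 2);
""")
domain_id = 1

# Condition in ON: it limits which packs match, but every Content row is kept.
on_rows = conn.execute("""
    SELECT c.ContentId, cp.DomainId
    FROM Content c
    LEFT JOIN ContentPack cp ON c.ContentPackId = cp.ContentPackId
                            AND cp.DomainId = ?
    ORDER BY c.ContentId
""", (domain_id,)).fetchall()

# Condition in WHERE: it filters the joined result, dropping the NULL-filled rows.
where_rows = conn.execute("""
    SELECT c.ContentId, cp.DomainId
    FROM Content c
    LEFT JOIN ContentPack cp ON c.ContentPackId = cp.ContentPackId
    WHERE cp.DomainId = ?
    ORDER BY c.ContentId
""", (domain_id,)).fetchall()

print(on_rows)
print(where_rows)
```

With the condition in ON, the second content row survives with a NULL domain. With the condition in WHERE, it is filtered out, so the LEFT JOIN effectively behaves like an INNER JOIN.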

Multiple joins in MySQL

I am trying to do a multiple join like this
SELECT * FROM (((Customer FULL JOIN Booking ON Customer.ID = Booking.CustID)
FULL JOIN Flight ON Booking.FlightID = Flight.ID)
FULL JOIN FlightRoute ON Flight.RouteID = FlightRoute.ID)
But it is syntactically incorrect according to MySQL. Please help.
There is no FULL JOIN in MySQL. It's convoluted but a FULL JOIN is equivalent to a UNION ALL between a LEFT JOIN and a RIGHT JOIN, using a condition to remove duplicates. It's late in the day and the thought of your 3 FULL JOINs in that statement is hurting my head.
You do say in Conrad Frix's answer that removing the FULL makes it work, if it does then you have misunderstood how FULL JOINs and INNER JOINs work.
For the first FULL JOIN it would look like:
SELECT * FROM Customer c
LEFT JOIN Booking b ON c.ID = b.CustId
UNION ALL
SELECT * FROM Customer c
RIGHT JOIN Booking b ON c.ID = b.CustId
WHERE c.ID IS NULL
Use that basis to form the rest of your statements.
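Here is a minimal sketch of the same emulation that you can run with Python's sqlite3 module. The rows and the Name column are made up. Because older SQLite builds lack RIGHT JOIN, the second branch is written as a LEFT JOIN with the tables swapped, which selects the same rows (bookings with no customer). Columns are listed explicitly so both branches line up.

```python
import sqlite3

# Made-up data: customer 2 has no booking, booking 101 points at a missing customer.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customer (ID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Booking (ID INTEGER PRIMARY KEY, CustID INTEGER);
    INSERT INTO Customer VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO Booking VALUES (100, 1), (101, 99);
""")

rows = conn.execute("""
    -- Branch 1: every customer, with booking data where it exists.
    SELECT c.ID AS cust_id, b.ID AS booking_id
    FROM Customer c
    LEFT JOIN Booking b ON c.ID = b.CustID
    UNION ALL
    -- Branch 2: only the bookings that matched no customer (anti-join).
    SELECT c.ID AS cust_id, b.ID AS booking_id
    FROM Booking b
    LEFT JOIN Customer c ON c.ID = b.CustID
    WHERE c.ID IS NULL
""").fetchall()

for row in rows:
    print(row)
```

The result has three rows: the matched pair, the customer without a booking, and the booking without a customer. The IS NULL filter on the second branch is what keeps the matched pair from appearing twice.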
While looking into MySQL and FULL JOIN I found this article, which explains a number of ways to emulate FULL JOIN in MySQL. I think it could be useful to you.
How to simulate FULL OUTER JOIN in MySQL
MySQL isn't my area of expertise, but I don't think MySQL supports full joins. I found a simple article that details some alternatives.
http://www.xaprb.com/blog/2006/05/26/how-to-write-full-outer-join-in-mysql/