how to best organize the Inner Joins in (select) statement - sql

let's say i have three tables, each one relates to another,
when i need to get a column from each table, does it make difference how to organize
the (inner joins)??
Select table1.column1,table2.column2,table3.column2
From table1
Inner Join table2 on ..... etc
Inner Join table3 on .....
in another words, can i put (table2) after (From )??
Select table1.column1,table2.column2,table3.column2
From table2
Inner Join table1 on ..... etc
Inner Join table3 on .....

For most queries, order does not matter.
An INNER JOIN is both associative and commutative so table order does not matter
SQL is declarative. That is, how you define the query is not how the optimiser works it out. It does not do it line by line as you wrote it.
That said...
OUTER JOINs are neither associative nor commutative
For complex queries, the optimiser will "best guess" rather than go through all possibilities which "costs" too much. Table order may matter here

The inner join operation has left to right associativity. It doesn't matter much in which order you write the tables, as long as you don't refer to any tables in the ON condition before they have been joined.
When the statement is executed the database engine will determine the best order to execute the join and this may be different from the order they appear in the SQL query.

Related

There are a few questions about the select statement

I have a few questions about the select statement.
First of all, I have normalized 15 tables for this select query.
The problem is invisible because there is not much data right now.
However, since I try to process many tables in one select query, it seems to cause problems later.
So I want to add a few more select statements to divide the tables to search, but I want to know how different it is from doing it at once.
Secondly, if I use join, I will use outer join. If I join multiple tables with outer join, I'm not sure how to use left outer join and right outer join.
The currently created select query refers to 8 tables and one join is linked.
That is, the remaining rest of the tables have obtained data in subqueries and the remaining eight tables are likely to use join.
I would appreciate it if you could let me know the direction of the multiple outer joins.
Let me briefly show you some of the current select queries.
select
a.cal1,a.cal2,a.cal3,...,
(select b.cal1 from b
where a.cal4=b.cal2)
as "bcals",
(select c.cal1 from c
where a.cal5=c.cal2)
as "ccals",
....,
(select e.cal1 from e
where a.caln=e.cal2)
as "ecals",
(select sum(extract(year from age(f.endday,f.startday))
from f
where e.cal1=a.cal1)
as "fcals",
g.cal1,g.cal2,g.cal3,...,
(select h.cal1 from h
where g.cal4=h.cal2)
as "hcals"
from a left outer join g on a.cal1=g.cal5
where a.cal1=?;
Result:
a.cal1|a.cal2|a.cal3|...|hcals
var1 |var2 |var3 |...|varn
After this, I wonder how to join the rest of the tables.
To sum up
If there are many tables that need to be included in a select query statement, what is the difference between performance and performance when this complex query is divided into multiple select statements?
If we write inside a select statement, how should outer join be?
Is there a problem with the query?
Actually your code is correct, but it looks very complex. People will find it difficult to understand it. Using joins you can minimize the lines of code and also make it more readable.
SELECT
TBL1.AMOUNT T1,
TBL2.AMOUNT T2,
TBL3.AMOUNT T3
FROM TBL1
LEFT JOIN TBL2 ON TBL2.ID = TBL1.ID
LEFT JOIN TBL3 ON TBL3.ID = TBL1.ID
In the above code , there are three tables, and two joins. One can easily understand and debug/make changes. Please try this for your code.

Is the order of joining tables indifferent as long as we chose proper join types?

Can we achieve desired results of joining tables by executing joins in whatever order? Suppose we want to left join two tables A and B (order AB). We can get the same results with right join of B and A (BA).
What about 3 tables ABC. Can we get whatever results by only changing order and joins types? For example A left join B inner join C. Can we get it with BAC order? What about if we have 4 or more tables?
Update.
The question Does the join order matter in SQL? is about inner join type. Agreed that then the order of join doesn't matter. The answer provided in that question does not answer my question whether it is possible to get desired results of joining tables with whatever original join types (join types here) by choosing whatever order of tables we like, and achieve this goal only by manipulating with join types.
In an inner join, the ordering of the tables in the join doesn't matter - the same rows will make up the result set regardless of the order they are in the join statement.
In either a left or right outer join, the order DOES matter. In A left join B, your result set will contain one row for every record in table A, irrespective of whether there is a matching row in table B. If there are non matching rows, this is likely to be a different result set to B left join A.
In a full outer join, the order again doesn't matter - rows will be produced for each row in each joined table no matter what their order.
Regarding A left join B vs B right join A - these will produce the same results. In simple cases with 2 tables, swapping the tables and changing the direction of the outer join will result in the same result set.
This will also apply to 3 or more tables if all of the outer joins are in the same direction - A left join B left join C will give the same set of results as C right join B right join A.
If you start mixing left and right joins, then you will need to start being more careful. There will almost always be a way to make an equivalent query with re-ordered tables, but at that point sub-queries or bracketing off expressions might be the best way to clarify what you are doing.
As another commenter states, using whatever makes your purpose most clear is usually the best option. The ordering of the tables in your query should make little or no difference performance wise, as the query optimiser should work this out (although the only way to be sure of this would be to check the execution plans for each option with your own queries and data).

Successive LEFT SQL joins back to original

I need to redo sql statement in legacy Foxpro application and don't understand whether it is meaningful at all. Syntax is a bit specific - it extracts data from temporary table into the same temporary table ( overwriting) with some joins.
SELECT aa.*,b.spa_date FROM (ALIAS()) aa INNER JOIN jobs ON aa.seq=jobs.seq ;
LEFT JOIN job2 ON jobs.job_no=job2.rucjob;
left join jobs b on b.job_no=job2.job_no;
WHERE jobs.qty1<>0 INTO CURSOR (ALIAS())
Since only one field is added from joined tables ( spa_date ) is there any point in 2 left joins or I am missing something. Isn't it equivalent to
SELECT aa.*,jobs.spa_date FROM (ALIAS()) aa INNER JOIN jobs ON aa.seq=jobs.seq ;
WHERE jobs.qty1<>0 INTO CURSOR (ALIAS())
They are different because b.spa_date come from the second left join. You may be missing filtered rows without both left joins.
You would need to know the intent of the original query and perhaps rewrite it to make more sense but I'd say the two queries are different.

When to use SQL natural join instead of join .. on?

I'm studying SQL for a database exam and the way I've seen SQL is they way it looks on this page:
http://en.wikipedia.org/wiki/Star_schema
IE join written the way Join <table name> On <table attribute> and then the join condition for the selection. My course book and my exercises given to me from the academic institution however, use only natural join in their examples. So when is it right to use natural join? Should natural join be used if the query can also be written using JOIN .. ON ?
Thanks for any answer or comment
A natural join will find columns with the same name in both tables and add one column in the result for each pair found. The inner join lets you specify the comparison you want to make using any column.
IMO, the JOIN ON syntax is much more readable and maintainable than the natural join syntax. Natural joins is a leftover of some old standards, and I try to avoid it like the plague.
A natural join will find columns with the same name in both tables and add one column in the result for each pair found. The inner join lets you specify the comparison you want to make using any column.
The JOIN keyword is used in an SQL statement to query data from two or more tables, based on a relationship between certain columns in these tables.
Different Joins
* JOIN: Return rows when there is at least one match in both tables
* LEFT JOIN: Return all rows from the left table, even if there are no matches in the right table
* RIGHT JOIN: Return all rows from the right table, even if there are no matches in the left table
* FULL JOIN: Return rows when there is a match in one of the tables
INNER JOIN
http://www.w3schools.com/sql/sql_join_inner.asp
FULL JOIN
http://www.w3schools.com/sql/sql_join_full.asp
A natural join is said to be an abomination because it does not allow qualifying key columns, which makes it confusing. Because you never know which "common" columns are being used to join two tables simply by looking at the sql statement.
A NATURAL JOIN matches on any shared column names between the tables, whereas an INNER JOIN only matches on the given ON condition.
The joins often interchangeable and usually produce the same results. However, there are some important considerations to make:
If a NATURAL JOIN finds no matching columns, it returns the cross
product. This could produce disastrous results if the schema is
modified. On the other hand, an INNER JOIN will return a 'column does
not exist' error. This is much more fault tolerant.
An INNER JOIN self-documents with its ON clause, resulting in a
clearer query that describes the table schema to the reader.
An INNER JOIN results in a maintainable and reusable query in
which the column names can be swapped in and out with changes in the
use case or table schema.
The programmer can notice column name mis-matches (e.g. item_ID vs itemID) sooner if they are forced to define the ON predicate.
Otherwise, a NATURAL JOIN is still a good choice for a quick, ad-hoc query.

TABLE1 T1, TABLE2 T2 WHERE T1.Blah = T2.Blah - VS - INNER JOIN

Provided that the tables could essentially be inner joined, since the where clause excludes all records that don't match, just exactly how bad is it to use the first of the following 2 query statement syntax styles:
SELECT {COLUMN LIST}
FROM TABLE1 t1, TABLE2 t2, TABLE3 t3, TABLE4 t4 (etc)
WHERE t1.uid = t2.foreignid
AND t2.uid = t3.foreignid
AND t3.uid = t4.foreignid
etc
instead of
SELECT {COLUMN LIST}
FROM TABLE1 t1
INNER JOIN TABLE2 t2 ON t1.uid = t2.foreignid
INNER JOIN TABLE3 t3 ON t2.uid = t3.foreignid
INNER JOIN TABLE4 t4 ON t3.uid = t4.foreignid
I'm not sure if this is limited to microsoft SQL, or even a particular version, but my understanding is that the first scenario does a full outer join to make all possible correlations accessible.
I've used the first approach in the past to optimise queries that access two significantly large stores of data that each have peripheral table joined to them, with the product of those joins coming together late in the query. By allowing each of the "larger" table to join to their respective lookup tables, and only combining a specific subset of each of the larger tables, I found that there were notable speed improvements over introducing the large tables to each other prior to specific filtering.
Under normal (simple joins) circumstance, would it not be far better to use the second scenario? I find it to be more easily readable and it seems like it'll be much faster.
INNER JOIN ON vs WHERE clause
Maybe the best way to answer this is to take a look at how the database handles the query internally. If you're on SQL Server, use Profiler to see how many reads etc. each query takes and the query plan to see what route is being taken through the data. Statistics, skewing etc. will also most likely play a role.
The first query doesn't produce a full OUTER join (which is the union of both LEFT and RIGHT joins). Essentially unless there are some [internal] SQL parser - specific optimizations, both queries are equal.
Personally I would never use the first syntax. It may be the same performancewise but it is harder to maintain and far more subject to accidental cross joins when things get complex. If you miss an ON condition, it will fail the syntax check , if you miss one of the WHERE conditions that is the equivalent of an ON condition, it will happily do a cross join. It is also a syntax that is 17 years out of date for goodness sakes!
Further, the left and right join syntax in the old syntax are broken in SQL Server and do NOT always return the correct results (it can sometimes interpet the results as a corss join instead of an outerjoin) and they have been deprecated and will not be useable at all in the next version. If you need to change one of the queries to use an outer join, then you can be looikng at a major rewrite as it is especially bad to try to mix the two kinds of syntax.