Difference between inner join and where in select join SQL statement [duplicate] - sql

This question already has answers here:
INNER JOIN ON vs WHERE clause
(12 answers)
Closed 9 years ago.
I have two select join SQL statements:
select a.id from table_a as a, table_b as b where a.id=b.id;
select a.id from table_a as a inner join table_b as b on a.id=b.id;
Obviously, they are the same in result. But is there any difference between them , such as performance, portability.

One difference is that the first option hides the intent by expressing the join condition in the where clause.
The second option, where the join condition is written out is more clear for the user reading the query. It shows the exact intent of the query.
As far as performance or any other difference, there shouldn't be any. Both queries should return the exact same result and perform the same under most RDBMS.

The inner join syntax was added to SQL sometime in the 1990s. It's possible, but unlikely, that the optimizer can do better with it than with the old syntax that used the where clause for the join condition.
They should both be highly portable as things are now.
The inner join syntax is preferable because it is easier on the reader, as others have already remarked.

Both are standard SQL. Different DB systems may optimize them differently, but because they are so simple, I would be a little surprised if they do. But that is the nature of SQL: it is declarative, which gives the implementation a great deal of leeway in how to execute your query. There is no guarantee that these perform the same, or if they are different, which is faster.

They are exactly the same in SQL server. There is no performance difference.

Related

What is the difference between JOIN and SELECT? [duplicate]

This question already has answers here:
INNER JOIN ON vs WHERE clause
(12 answers)
Closed 9 years ago.
I have two select join SQL statements:
select a.id from table_a as a, table_b as b where a.id=b.id;
select a.id from table_a as a inner join table_b as b on a.id=b.id;
Obviously, they are the same in result. But is there any difference between them , such as performance, portability.
One difference is that the first option hides the intent by expressing the join condition in the where clause.
The second option, where the join condition is written out is more clear for the user reading the query. It shows the exact intent of the query.
As far as performance or any other difference, there shouldn't be any. Both queries should return the exact same result and perform the same under most RDBMS.
The inner join syntax was added to SQL sometime in the 1990s. It's possible, but unlikely, that the optimizer can do better with it than with the old syntax that used the where clause for the join condition.
They should both be highly portable as things are now.
The inner join syntax is preferable because it is easier on the reader, as others have already remarked.
Both are standard SQL. Different DB systems may optimize them differently, but because they are so simple, I would be a little surprised if they do. But that is the nature of SQL: it is declarative, which gives the implementation a great deal of leeway in how to execute your query. There is no guarantee that these perform the same, or if they are different, which is faster.
They are exactly the same in SQL server. There is no performance difference.

What is the difference between these two SQL? [duplicate]

This question already has answers here:
Explicit vs implicit SQL joins
(12 answers)
Closed 8 years ago.
select a.name from student a, student b
where a.id = b.id and a.id = 1;
vs
select a.name from student a
inner join student b
on a.id = b.id and a.id = 1;
Are they actually the same?
They are the same as far as the query engine is concerned.
The first type, commonly called a comma join, is an implicit inner join in most (all?) RDBMSs. The syntax is from ANSI SQL 89 and earlier.
The syntax of the second join, called an explicit inner join, was introduced in ANSI SQL 92. It is considered improved syntax because even with complex queries containing dozens of tables it is easy to see the difference between join conditions and filters in the where clause.
The ANSI 92 syntax also allows the query engine to potentially optimize better. If you use a something in the join condition, then it happens before (or as) the join is done. If the field is indexed, you can get some benefit since the query engine will know to not bother with certain rows in the table, whereas if you put it in the WHERE clause, the query engine will need to join the tables completely and then filter out results. Usually the RDBMS will treat them identically -- probably 999 cases out of 1000 -- but not always.
See also:
Why isn't SQL ANSI-92 standard better adopted over ANSI-89?
Obviously they are not the same syntactically, but they are semantically (they return the same). However, I'd write the second one as:
select a.name from student a
inner join student b on a.id = b.id
where a.id = 1;
The join you're doing is pointless because you're giving the same table different alias and joining on the same field i.e. id.
Your select statement should simply be as follows:
SELECT name FROM stud WHERE id = 1
the latter is per ansi standard and the formeris conventionally supported..they are functionally equivalent..if you use the latter, there is greater chance of your SQL being portable across vendors

Filter table before inner join condition

There's a similar question here, but my doubt is slight different:
select *
from process a inner join subprocess b on a.id=b.id and a.field=true
and b.field=true
So, when using inner join, which operation comes first: the join or the a.field=true condition?
As the two tables are very big, my goal is to filter table process first and after that join only the rows filtered with table subprocess.
Which is the best approach?
First things first:
which operation comes first: the join or the a.field=true condition?
Your INNER JOIN includes this (a.field=true) as part of the condition for the join. So it will prevent rows from being added during the JOIN process.
A part of an RDBMS is the "query optimizer" which will typically find the most efficient way to execute the query - there is no guarantee on the order of evaluation for the INNER JOIN conditions.
Lastly, I would recommend rewriting your query this way:
SELECT *
FROM process AS a
INNER JOIN subprocess AS b ON a.id = b.id
WHERE a.field = true AND b.field = true
This will effectively do the same thing as your original query, but it is widely seen as much more readable by SQL programmers. The optimizer can rearrange INNER JOIN and WHERE predicates as it sees fit to do so.
You are thinking about SQL in terms of a procedural language which it is not. SQL is a declarative language, and the engine is free to pick the execution plan that works best for a given situation. So, there is no way to predict if a join or a where will be executed first.
A better way to think about SQL is in terms of optimizing queries. Things like assuring that your joins and wheres are covered by indexes. Also, at least in MS Sql Server, you can preview an estimated or actual execution plan. There is nothing stopping you from doing that and seeing for yourself.

What is the difference between join in FROM clause and WHERE clause?

We have a Oracle 10g and most of our applications are running Oracle Forms 6i. I found that all of the queries written in views/packages/procedures/functions are JOINING tables at WHERE clause level. Example
SELECT * FROM
TABLE_A A,
TABLE_B B,
TABLE_C C,
TABLE_D D
WHERE
A.ID=B.ID(+)
AND B.NO=C.NO(+)
AND C.STATUS=D.ID
AND C.STATUS NOT LIKE 'PENDING%'
This query applies only to ORACLE since it has the (+) join qualifier which is not acceptable in other SQL platforms. The above query is equivalent to:
SELECT * FROM
TABLE_A A LEFT JOIN TABLE_B B ON A.ID=B.ID
LEFT JOIN TABLE_C C ON B.NO=C.NO
JOIN TABLE_D D ON C.STATUS=D.ID
WHERE
C.STATUS NOT LIKE 'PENDING%'
Non of the queries I have seen is written with join taking place in the FROM clause.
My question can be divided into three parts:
Q: Assuming that I have the same Oracle environment, which query is better in terms of performance, cache, CPU load, etc. The first one (joining at WHERE) or the second (joining at FROM)
Q: Is there any other implementation of SQL that accepts the (+) join qualifier other than oracle? if yes, which?
Q: Maybe having the join written at WHERE clause makes the query more readable but compromises the ability to LEFT/RIGHT join, that's why the (+) was for. Where can I read more about the origin of this (+) and why it was invented specifically to Oracle?
Q1. No difference. You can check it using profiling and compare execution plan.
Q2. As I know, only oracle support it. But it is not recommended to use in latest version of Oracle RDBMS:
Oracle recommends that you use the FROM clause OUTER JOIN syntax
rather than the Oracle join operator. Outer join queries that use the
Oracle join operator (+) are subject to the following rules and
restrictions, which do not apply to the FROM clause OUTER JOIN syntax:
Q3. Oracle "invent" (+) before outer join was specified in SQL ANSI.
There should be no performance difference. Assuming you're on a vaguely recent version of Oracle, Oracle will implicitly convert the SQL 99 syntax to the equivalent Oracle-specific syntax. Of course, there are bugs in all software so it is possible that one or the other will perform differently because of some bug. The more recent the version of Oracle, the less likely you'll see a difference.
The (+) operator (and a variety of other outer join operators in other databases) were created because the SQL standard didn't have a standard way of expressing an outer join until the SQL 99 standard. Prior to then, every vendor created their own extensions. It took Oracle a few years beyond that to support the new syntax. Between the fact that bugs were more common in the initial releases of SQL 99 support (not common but more common than they are now), the fact that products needed to continue to support older database versions that didn't support the new syntax, and people being generally content with the old syntax, there is still plenty of code being written today that uses the old Oracle syntax.
A1:
As far as I know they vary in-terms of syntax not in performance. So there is no difference between joining at 'where' clause and joining at 'from' clause.
A2:
To answer this in better way, 'Joining at FROM' clause is standard across all the platforms. So forget about (+) symbols
A3
I have worked in Oracle for sometimes. People use (+) symbols for left/right join because it's easy to write. Some ppl use join at (FROM) clause because it's more readable and understandable.
Hope these points helps you. Please let me know incase am wrong with anything.
One difference between Oracle syntax and ANSI syntax is:
In Oracle syntax you cannot make a FULL OUTER JOIN, there you have to use ANSI syntax.
Oracle introduced ANSI syntax in Oracle 9i - including several bugs. In the meantime since Oracle 11 or 12 it works quite well, but you may discover some obstacles in Oracle 9/10.
Another advandage of ANSI Join syntax is you cannot forget any join condition.
"SELECT * FROM TABLE_A, TABLE_B" performs implicitly a Cross-Join. "SELECT * FROM TABLE_A JOIN TABLE_B" raise an error, you are forced to provide the join condition. If you want to Cross-Join you have to specify it, i.e. "SELECT * FROM TABLE_A CROSS JOIN TABLE_B"

Cross Join difference question [duplicate]

This question already has answers here:
Closed 11 years ago.
Possible Duplicates:
WHERE clause better execute before IN and JOIN or after
INNER JOIN versus WHERE clause — any difference?
What, if any, are the differences between the following?
Select
col1,
col2
from TableA A, TableB B
where A.ID = B.ID
and
Select
col1,
col2
From TableA A
Inner Join TableB B on A.ID = B.ID
They seem to have the same behaviors in SQL,
They will likely be optimized to the same thing by the RDBMS. They both JOIN the tables on the A.ID = B.ID criteria.
However, the JOIN syntax is explicit and considered correct.
The former is ANSI-89 syntax, and the latter is ANSI-92 syntax. The latter should almost always be used due to the fact that it's much clearer when you start to use outer joins when expressed in ANSI-92 syntax.
The first syntax is (as you pointed out) a cross join or Cartesian product of the two tables. In a system with no optimizer (or a poor optimizer) this will produce a combination of every record in the first table combined with every record in the second table, then filter them down to just those matching the WHERE clause.
The output from both statements will be the same, and if the system you are using has a good optimizer than the performance will be the same as well.
Two comments I would offer:
1) I find it better to be explicit about your intent when writing statements. If you intended to perform an INNER JOIN then use the INNER JOIN syntax. Future you 6 months form now will be thankful.
2) The optimizer in SQL Server will perform an INNER JOIN in this situation (at least recent versions, can't guarantee all versions), but how well it guesses that path is going to depend on the version of the SQL Server engine and is not guaranteed to remain the same in the future (I doubt it will change in this situation, but is the cost of a few more characters of typing really that high?)
#ypercube correctly pointed out your question is about two different INNER JOIN syntaxes. You don't have any outer join syntax. As #Matt Whitfield pointed out, the first syntax is ANSI-92 and the second one is ANSI-89 style. I agree with matt entirely that in more complicated queries the ANSI-92 syntax is way way more readable.
Furthermore, depending on your version of SQL Server THE ANSI-89 syntax is DEPRECATED and can give you problems. See SR0010: Avoid using deprecated syntax when you join tables or views In fact, in the next version SQL 2011, or Denali, or whatever we're calling it, the ANSI-89 syntax will not be supported. See: Features Not Supported in the Next Version of SQL Server
(search for the word "join").