I read in this answer that the following join conditions
Invoices.CustomerID=Customers.CustomerID
and
Customers.CustomerID=Invoices.CustomerID
are equivalent because they produce the same result set.
Now, my problem is about commutativity of inner join. I have tried both of the following approaches and they produce the same result set (except for the column order).
Customers table first
use MMABooks
select *
from Customers
inner join Invoices
on Invoices.CustomerID=Customers.CustomerID
where Customers.CustomerID=10
Invoices table first
use MMABooks
select *
from Invoices
inner join Customers
on Invoices.CustomerID=Customers.CustomerID
where Invoices.CustomerID=10
Questions
Is inner join commutative by design?
Is there a best practice that suggests or prefers one approach over the other? I mean, which approach should I use?
It would be really weird if they didn't produce the same result. Did you expect a difference?
A best practice is to start with the table from which you select most of the columns.
You do have to worry about the order when you work with LEFT or RIGHT JOINS.
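To make the point about order concrete, here is a minimal sketch using Python's built-in sqlite3 module rather than SQL Server. The sample rows and the `result_set` helper are made up for illustration; only the table and column names come from the question. It suggests that swapping the tables leaves an INNER JOIN's rows unchanged but changes a LEFT JOIN's.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical sample data: Bob (11) has no invoices, invoice 3 has no customer.
conn.executescript("""
    CREATE TABLE Customers (CustomerID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Invoices (InvoiceID INTEGER PRIMARY KEY, CustomerID INTEGER);
    INSERT INTO Customers VALUES (10, 'Ann'), (11, 'Bob');
    INSERT INTO Invoices VALUES (1, 10), (2, 10), (3, 99);
""")

def result_set(from_clause):
    """Return the (InvoiceID, Name) pairs produced by the given FROM clause."""
    sql = "SELECT i.InvoiceID, c.Name " + from_clause
    return set(conn.execute(sql).fetchall())

on = "ON i.CustomerID = c.CustomerID"
inner_customers_first = result_set(f"FROM Customers c INNER JOIN Invoices i {on}")
inner_invoices_first = result_set(f"FROM Invoices i INNER JOIN Customers c {on}")
left_customers_first = result_set(f"FROM Customers c LEFT JOIN Invoices i {on}")
left_invoices_first = result_set(f"FROM Invoices i LEFT JOIN Customers c {on}")

print(inner_customers_first == inner_invoices_first)  # same rows either way
print(left_customers_first == left_invoices_first)    # rows differ
```

With a LEFT JOIN, the table named first is the one whose unmatched rows are kept, which is why the order matters there.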
In http://www.w3schools.com/sql/sql_alias.asp, it mentions using alias to do the following query,
SELECT
Orders.OrderID, Orders.OrderDate, Customers.CustomerName
FROM
Customers, Orders
WHERE
Customers.CustomerName = "Around the Horn"
AND Customers.CustomerID = Orders.CustomerID;
This confuses me with the usage of JOIN. Isn't this kind of query joining the columns from two tables? What are the differences between this kind of query and JOIN?
JOIN and alias are two different concepts. An alias is a substitute name (usually shorter) that makes an object easier to reference and the query easier to read. You can alias a column name or a table name, e.g.:
select a.col1
from my_table as a
a is an alias for the table my_table
or
select a.col1 as c1
from my_table as a
where c1 is an alias for col1
JOINs build relations between tables.
A join can be implicit or explicit.
In your code you are using an implicit join: the condition that relates the tables is in the WHERE clause.
But you could write it more expressively with an explicit join:
SELECT Orders.OrderID, Orders.OrderDate, Customers.CustomerName
FROM Customers
INNER JOIN Orders ON Customers.CustomerID=Orders.CustomerID
WHERE Customers.CustomerName="Around the Horn";
Both the same, there is no difference.
There are differences only in readability.
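If you want to convince yourself, here is a small sketch that runs both forms against the same data. It uses Python's sqlite3 module as a stand-in database, and the sample rows are invented for the example. The explicit version also uses table aliases (`c`, `o`) to show that aliasing is independent of which join style you pick.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical sample rows; only the table and column names come from the question.
conn.executescript("""
    CREATE TABLE Customers (CustomerID INTEGER PRIMARY KEY, CustomerName TEXT);
    CREATE TABLE Orders (OrderID INTEGER PRIMARY KEY, CustomerID INTEGER, OrderDate TEXT);
    INSERT INTO Customers VALUES (1, 'Around the Horn'), (2, 'Island Trading');
    INSERT INTO Orders VALUES (100, 1, '2024-01-05'), (101, 2, '2024-01-06'),
                              (102, 1, '2024-02-01');
""")

# Implicit join: the join condition sits in the WHERE clause.
implicit_sql = """
    SELECT Orders.OrderID, Orders.OrderDate, Customers.CustomerName
    FROM Customers, Orders
    WHERE Customers.CustomerName = 'Around the Horn'
      AND Customers.CustomerID = Orders.CustomerID
    ORDER BY Orders.OrderID
"""

# Explicit join with table aliases: the join condition sits in the ON clause.
explicit_sql = """
    SELECT o.OrderID, o.OrderDate, c.CustomerName
    FROM Customers AS c
    INNER JOIN Orders AS o ON c.CustomerID = o.CustomerID
    WHERE c.CustomerName = 'Around the Horn'
    ORDER BY o.OrderID
"""

implicit_rows = conn.execute(implicit_sql).fetchall()
explicit_rows = conn.execute(explicit_sql).fetchall()
print(implicit_rows == explicit_rows)
```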
In my opinion, PL/SQL (Oracle) developers tend to choose the alias style when writing queries, while T-SQL (SQL Server) developers tend to choose JOIN.
The answer to your question is that there is no difference between your query and an inner join. But many times you write queries between tables where the relationships are not explicitly defined, or where a table may not require a relationship. In those cases, you would use a left join to return data from the first table and zero to many items from the table on the right. Your format makes that a lot more difficult to write and read. As for table aliases, you will need them when writing self joins, for example, so understanding them is essential.
When you have two tables, and want to exclude rows from the second one, there are a multitude of options including EXISTS, NOT IN, LEFT JOIN and EXCEPT.
I've always used left join:
select N.ProductID from NewProducts N
left join Products P on P.ProductID = N.ProductID
where P.ProductID is null
Now I'm thinking it's cleaner to use EXCEPT:
select ProductID from NewProducts
except
select ProductID from Products
Are there performance issues of using EXCEPT?
You can check execution plan and SQL profiler to choose the suitable query.
But, for me, NOT EXISTS is good. Reference here
The answer depends on how large your data is.
You can use any of them (EXISTS, NOT IN, LEFT JOIN, or EXCEPT), depending on your requirements.
You said that you always use LEFT JOIN, and that is fine: joining the two tables often keeps the execution time low, especially when you are holding a large amount of data.
A JOIN is a reasonable default, but the choice is always up to you.
You can compare the execution times using the SQL execution plan.
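Before comparing timings, it may be worth checking that the forms return the same rows for your data. EXCEPT returns distinct rows, while the LEFT JOIN and NOT EXISTS forms keep duplicates from NewProducts. The sketch below shows this with Python's sqlite3 module standing in for your database. The sample rows are invented, with ProductID 3 deliberately inserted twice.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical sample data: ProductID 3 appears twice in NewProducts.
conn.executescript("""
    CREATE TABLE Products (ProductID INTEGER);
    CREATE TABLE NewProducts (ProductID INTEGER);
    INSERT INTO Products VALUES (1), (2);
    INSERT INTO NewProducts VALUES (2), (3), (3), (4);
""")

def ids(sql):
    """Run a single-column query and return its values as a list."""
    return [row[0] for row in conn.execute(sql)]

left_join_ids = ids("""
    SELECT N.ProductID FROM NewProducts N
    LEFT JOIN Products P ON P.ProductID = N.ProductID
    WHERE P.ProductID IS NULL
    ORDER BY N.ProductID
""")

not_exists_ids = ids("""
    SELECT N.ProductID FROM NewProducts N
    WHERE NOT EXISTS (SELECT 1 FROM Products P WHERE P.ProductID = N.ProductID)
    ORDER BY N.ProductID
""")

except_ids = ids("""
    SELECT ProductID FROM NewProducts
    EXCEPT
    SELECT ProductID FROM Products
    ORDER BY ProductID
""")

print(left_join_ids)   # duplicates kept
print(not_exists_ids)  # duplicates kept
print(except_ids)      # duplicates removed
```

If ProductID is unique in NewProducts, the three forms should agree, and the choice comes down to readability and whatever the execution plan shows.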
Hello :) I've got a question on MySQL queries.
Which one's faster and why?
Is there any difference at all?
select tab1.something, tab2.smthelse
from tab1 inner join tab2 on tab1.pk=tab2.fk
WHERE tab2.somevalue = 'value'
Or this one:
select tab1.something, tab2.smthelse
from tab1 inner join tab2 on tab1.pk=tab2.fk
AND tab2.somevalue = 'value'
As Simon noted, the difference in performance should be negligible. The main concern would be ensuring your query correctly expresses your intent, and (especially) you get the expected results.
Generally, you want to add filters to the JOIN clause only if the filter is a condition of the join. In most (not all) cases, a filter should be applied to the WHERE clause, as it is a filter of the overall query, not of the join itself.
AFAIK, the only instance where this really affects the outcome of the query is when using an OUTER JOIN.
Consider the following queries:
SELECT *
FROM Customer c
LEFT JOIN Orders o ON c.CustomerId = o.CustomerId
WHERE o.OrderType = "InternetOrder"
vs.
SELECT *
FROM Customer c
LEFT JOIN Orders o ON c.CustomerId = o.CustomerId AND o.OrderType = "InternetOrder"
The first will return one row for each customer order that has an order type of "InternetOrder". In effect, your left join has become an inner join because of the filter that was applied to the whole query (i.e. customers who do not have an "InternetOrder" will not be returned at all).
The second will return at least one row for each customer. If the customer has no orders of order type "InternetOrder", it will return null values for all order table fields. Otherwise it will return one row for each customer order of type "InternetOrder".
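Here is a minimal sketch of the two queries above, run through Python's sqlite3 module instead of MySQL. The three customers and their orders are made-up sample data, and the string literals use single quotes for portability.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical sample data: Ann has an internet order and a phone order,
# Bob has only a phone order, Cid has no orders at all.
conn.executescript("""
    CREATE TABLE Customer (CustomerId INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Orders (OrderId INTEGER PRIMARY KEY, CustomerId INTEGER, OrderType TEXT);
    INSERT INTO Customer VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cid');
    INSERT INTO Orders VALUES (10, 1, 'InternetOrder'), (11, 1, 'PhoneOrder'),
                              (12, 2, 'PhoneOrder');
""")

# Filter in WHERE: applied after the join, so NULL-extended rows are dropped.
filter_in_where = conn.execute("""
    SELECT c.Name, o.OrderId
    FROM Customer c
    LEFT JOIN Orders o ON c.CustomerId = o.CustomerId
    WHERE o.OrderType = 'InternetOrder'
    ORDER BY c.CustomerId
""").fetchall()

# Filter in ON: part of the match condition, so every customer is kept.
filter_in_on = conn.execute("""
    SELECT c.Name, o.OrderId
    FROM Customer c
    LEFT JOIN Orders o ON c.CustomerId = o.CustomerId
                      AND o.OrderType = 'InternetOrder'
    ORDER BY c.CustomerId
""").fetchall()

print(filter_in_where)  # only Ann's internet order
print(filter_in_on)     # Ann's internet order, plus Bob and Cid with None
```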
If the constraint is based off the joined table (as yours is) then it makes sense to specify the constraint when you join.
This way MySQL may be able to reduce the rows from the joined table at the time it joins; otherwise it has to select all the rows that fulfil the basic JOIN criteria before applying the WHERE logic.
In reality you'll see little difference in performance until you get to more complex queries or larger datasets. Still, limiting the data at each JOIN will be more efficient overall if done well, especially if there are good indexes on the joined table.
I'm studying SQL for a database exam, and the way I've seen SQL written is the way it looks on this page:
http://en.wikipedia.org/wiki/Star_schema
I.e., the join is written as JOIN <table name> ON <table attribute>, followed by the join condition for the selection. However, my course book and the exercises given to me by my academic institution use only natural join in their examples. So when is it right to use a natural join? Should a natural join be used if the query can also be written using JOIN .. ON?
Thanks for any answer or comment
A natural join will find columns with the same name in both tables and add one column in the result for each pair found. The inner join lets you specify the comparison you want to make using any column.
IMO, the JOIN ON syntax is much more readable and maintainable than the natural join syntax. Natural joins are a leftover of some old standards, and I try to avoid them like the plague.
The JOIN keyword is used in an SQL statement to query data from two or more tables, based on a relationship between certain columns in these tables.
Different Joins
* JOIN: Return rows when there is at least one match in both tables
* LEFT JOIN: Return all rows from the left table, even if there are no matches in the right table
* RIGHT JOIN: Return all rows from the right table, even if there are no matches in the left table
* FULL JOIN: Return all rows from both tables, matched where possible and filled with NULLs where there is no match
INNER JOIN
http://www.w3schools.com/sql/sql_join_inner.asp
FULL JOIN
http://www.w3schools.com/sql/sql_join_full.asp
A natural join is said to be an abomination because it does not allow qualifying key columns, which makes it confusing: you never know which "common" columns are being used to join two tables simply by looking at the SQL statement.
A NATURAL JOIN matches on any shared column names between the tables, whereas an INNER JOIN only matches on the given ON condition.
The joins are often interchangeable and usually produce the same results. However, there are some important considerations to make:
* If a NATURAL JOIN finds no matching columns, it returns the cross product. This could produce disastrous results if the schema is modified. On the other hand, an INNER JOIN will return a 'column does not exist' error. Failing loudly like this is much safer.
* An INNER JOIN self-documents with its ON clause, resulting in a clearer query that describes the table schema to the reader.
* An INNER JOIN results in a maintainable and reusable query in which the column names can be swapped in and out with changes in the use case or table schema.
* The programmer can notice column name mismatches (e.g. item_ID vs itemID) sooner if they are forced to define the ON predicate.
Otherwise, a NATURAL JOIN is still a good choice for a quick, ad-hoc query.
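As a small illustration of why the implicit column matching can surprise you, here is a sketch using Python's sqlite3 module. The `department` and `employee` tables and their rows are hypothetical. Both tables happen to have a `name` column, so the NATURAL JOIN matches on `dept_id` and `name` together, not on `dept_id` alone.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: both tables share dept_id AND, by accident, name.
conn.executescript("""
    CREATE TABLE department (dept_id INTEGER, name TEXT);
    CREATE TABLE employee (emp_id INTEGER, dept_id INTEGER, name TEXT);
    INSERT INTO department VALUES (1, 'Sales'), (2, 'Support');
    INSERT INTO employee VALUES (100, 1, 'Ann'), (101, 2, 'Bob'), (102, 1, 'Sales');
""")

# NATURAL JOIN: matches on every shared column name (dept_id and name).
natural_ids = [row[0] for row in conn.execute("""
    SELECT emp_id
    FROM employee NATURAL JOIN department
    ORDER BY emp_id
""")]

# INNER JOIN ... ON: matches only on the column the author names.
explicit_ids = [row[0] for row in conn.execute("""
    SELECT e.emp_id
    FROM employee e
    INNER JOIN department d ON e.dept_id = d.dept_id
    ORDER BY e.emp_id
""")]

print(natural_ids)   # only the employee whose name equals the department name
print(explicit_ids)  # every employee with a matching department
```

Nothing in the NATURAL JOIN statement itself tells the reader that `name` is part of the match, which is the readability problem described above.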
I was thinking about the syntax of inner joins in Oracle's SQL implementation and here is something that seems a bit inconsistent:
Let's say you have two relations loan(loan_number, branch_name, amount) and borrower(customer_name, loan_number). loan_number is the attribute common to both tables. Now, Oracle gives you two ways to express an inner join:
select *
from loan, borrower
where loan.loan_number = borrower.loan_number;
The above statement is equivalent to:
select *
from loan
inner join borrower
on loan.loan_number = borrower.loan_number;
However, when expressing a cross join there is only one way to express it:
select *
from loan, borrower;
The following statement is syntactically incorrect:
select *
from loan
inner join borrower;
This is invalid; Oracle expects the ON... part of the clause
Given that an inner join is just a cross join with a filter condition, do you guys think that this is an inconsistency in Oracle's SQL implementation? Am I missing something?
I'd be interested in hearing some other opinions. Thanks.
As David pointed out in his answer the syntax is:
select *
from loan cross join borrower;
Even though I was not aware of the above syntax I still think it's inconsistent. Having the cross join keyword in addition to allowing inner join without a join condition would be fine. A cross join is in fact an inner join without a join condition, why not express it as an inner join without the join condition?
I would agree that it is not consistent.
But I would argue that the Oracle implementation is a good thing:
When you do a join, you almost always want to include a filter condition, therefore the ON part is mandatory.
If you really, really don't want to have a filter condition (are you really sure?), you have to tell Oracle explicitly with the CROSS JOIN syntax.
It makes a lot of sense to me not to be 100% consistent: it helps you avoid mistakes.
SELECT *
FROM Loan
CROSS JOIN Borrower
No inconsistency.
Oracle also supports the natural join syntax, which joins two tables on the basis of shared column name(s). This would work in your case because both tables have a column called LOAN_NUMBER.
SELECT *
FROM Loan
NATURAL JOIN Borrower
Now, your same argument could be made in this case, that the use of the keyword natural is strictly unnecessary. But if we follow the logic we end up with a situation in which this statement could be either a cross join or a natural join, depending on the column names:
SELECT *
FROM Loan
JOIN Borrower
This is clearly undesirable, if only because renaming LOAN.LOAN_NUMBER to LOAN_ID would change the result set.
So, there's your answer: disambiguation.
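The renaming hazard can be sketched as follows. This uses Python's sqlite3 module, not Oracle, so treat it as an illustration of the NATURAL JOIN semantics rather than of Oracle's behaviour specifically. The helper `natural_join_count` and the sample rows are made up for the example.

```python
import sqlite3

def natural_join_count(loan_key):
    """Build loan/borrower with the given loan key column name and
    return how many rows 'loan NATURAL JOIN borrower' produces."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical sample data: 2 loans, 3 borrowers.
    conn.executescript(f"""
        CREATE TABLE loan ({loan_key} TEXT, branch_name TEXT, amount INTEGER);
        CREATE TABLE borrower (customer_name TEXT, loan_number TEXT);
        INSERT INTO loan VALUES ('L-1', 'Downtown', 1000), ('L-2', 'Uptown', 2000);
        INSERT INTO borrower VALUES ('Jones', 'L-1'), ('Smith', 'L-2'), ('Hayes', 'L-2');
    """)
    count = conn.execute("SELECT COUNT(*) FROM loan NATURAL JOIN borrower").fetchone()[0]
    conn.close()
    return count

# Shared column name: rows are matched on loan_number.
matched = natural_join_count("loan_number")

# After renaming the column there is no shared name left, so the same
# statement silently degrades to a cross product (2 loans x 3 borrowers).
after_rename = natural_join_count("loan_id")

print(matched, after_rename)
```

The query text is identical in both calls; only the schema changed, and so did the result.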
This way of expressing inner joins:
select * from loan, borrower where loan.loan_number = borrower.loan_number;
has not been recommended for almost 20 years. It was kept because it is simply a valid expression that happens to convey an inner join. I would concentrate on using the version closer to the current standard, minimizing the chances of misunderstandings and flat-out errors.