Which approach should I use?
This
Select * from table1,table2 where table1.id=table2.id;
or
Select * from table1 inner join table2 on table1.id=table2.id;
Note: id is a foreign key.
In most modern RDBMSs both would yield the same execution plan, but
the second one is the recommended form, since it states the join conditions right after you declare the join.
If your query gets big, as they do, the second style is usually regarded as easier to read and comprehend because the JOIN and the WHERE parts of the query are separated.
Select * from table1
INNER JOIN table2 on table1.id=table2.id
INNER JOIN table3 on table1.id=table3.id
WHERE table2.something = 1
Indeed, both styles should have the same execution plan under the hood.
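A quick way to convince yourself that the two forms return the same rows is to run both against a throwaway database. The sketch below uses Python's built-in sqlite3 module; the table contents and the extra columns (name, something) are made up for illustration, and it only checks the result sets, not the execution plans of any particular RDBMS.

```python
import sqlite3

# In-memory database with hypothetical sample data (not from the question).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE table2 (id INTEGER REFERENCES table1(id), something INTEGER);
    INSERT INTO table1 VALUES (1, 'a'), (2, 'b'), (3, 'c');
    INSERT INTO table2 VALUES (1, 10), (1, 11), (3, 30);
""")

# Old style: comma in FROM, join condition in WHERE.
implicit = conn.execute(
    "SELECT * FROM table1, table2 WHERE table1.id = table2.id "
    "ORDER BY table1.id, table2.something"
).fetchall()

# Explicit style: join condition in ON.
explicit = conn.execute(
    "SELECT * FROM table1 INNER JOIN table2 ON table1.id = table2.id "
    "ORDER BY table1.id, table2.something"
).fetchall()

print(implicit == explicit)  # True
print(len(explicit))         # 3
```

Row 2 of table1 has no match in table2, so it is absent from both results.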
I know different joins, but I wanted to know which of them is being used when we run queries like this:
select * from table1 t1, table2 t2
Is it a full outer join or a natural join, for example?
Also, does it have the same meaning across different databases, or do they differ?
UPDATE: What if we add a WHERE clause? Will it always be an inner join?
The comma in the from clause -- by itself -- is equivalent to cross join in almost all databases. So:
from table1 t1, table2 t2
is functionally equivalent to:
from table1 t1 cross join table2 t2
They are not exactly equivalent, because the scoping rules within the from clause are slightly different. So:
from table1 t1, table2 t2 join
table3 t3
on t1.x = t3.x
generates an error, whereas the equivalent query with cross join works.
In general, conditions in the WHERE clause will always result in the INNER JOIN. However, some databases have extended the syntax to support outer joins in the WHERE clause.
I can think of one exception where the comma does not mean CROSS JOIN. Google's BigQuery originally used the comma for UNION ALL. However, that is only in Legacy SQL and they have removed that in Standard SQL.
Commas in the FROM clause have been out of fashion since the 1900s. They are the "original" form of joining tables in SQL, but explicit JOIN syntax is much better.
To me, they also mean someone who learned SQL decades ago and refused to learn about outer joins, or someone who has learned SQL from ancient materials -- and doesn't know a lot of other things that SQL does.
demo: db<>fiddle
This is a CROSS JOIN (Cartesian product). So both of the following queries are equivalent:
SELECT * FROM table1, table2 -- implicit CROSS JOIN
SELECT * FROM table1 CROSS JOIN table2 -- explicit CROSS JOIN
Concerning the UPDATE:
A WHERE clause that relates the two tables turns the general CROSS JOIN into an INNER JOIN. An INNER JOIN can be written in three ways:
SELECT * FROM table1, table2 WHERE table1.id = table2.id -- implicit CROSS JOIN notation
SELECT * FROM table1 CROSS JOIN table2 WHERE table1.id = table2.id -- really unusual!: explicit CROSS JOIN notation
SELECT * FROM table1 INNER JOIN table2 ON (table1.id = table2.id) -- explicit INNER JOIN NOTATION
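As a rough check of the claims above, the following sketch runs all five queries in SQLite through Python's sqlite3 module. The row values are hypothetical, and other databases may differ in details such as scoping, so treat it as an illustration rather than proof.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER);
    CREATE TABLE table2 (id INTEGER);
    INSERT INTO table1 VALUES (1), (2), (3);
    INSERT INTO table2 VALUES (2), (3), (4), (5);
""")

def rows(sql):
    """Run a query and return its rows sorted, so results can be compared."""
    return sorted(conn.execute(sql).fetchall())

# Without a WHERE clause: every row of table1 paired with every row of table2.
comma = rows("SELECT * FROM table1, table2")
cross = rows("SELECT * FROM table1 CROSS JOIN table2")
print(len(comma), comma == cross)  # 12 True  (3 rows * 4 rows)

# With a condition relating the tables: all three notations give the inner join.
filtered_comma = rows("SELECT * FROM table1, table2 WHERE table1.id = table2.id")
filtered_cross = rows(
    "SELECT * FROM table1 CROSS JOIN table2 WHERE table1.id = table2.id"
)
inner = rows("SELECT * FROM table1 INNER JOIN table2 ON (table1.id = table2.id)")
print(inner)  # [(2, 2), (3, 3)]
```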
Further reading (wikipedia)
I have 4 tables.
1 with just sitenames.
3 tables, which contain sitenames and the number of hits on them for different user types.
I need to make a report on the number of hits on those sites for each user type, like this:
Site Usertype1hits Usertype2hits Usertype3hits
So in the select part I need to crosscheck with a table called noanswer
like
select *
from table
where site in (select site from noanswer)
So from what I understand I need to use a join, and in this case a right join?
How do I write this query with a join?
You would use left join:
select sn.*, t1.cnt, t2.cnt, t3.cnt
from sitenames sn left join
table1 t1
on t1.name = sn.name left join
table2 t2
on t2.name = sn.name left join
table3 t3
on t3.name = sn.name;
Your question is vague on the field names.
A left join keeps all rows in the first table and matches rows in the subsequent tables. It is much more commonly used than right join, probably because it is easier to read the logic thinking "I'll keep all of these rows". A series of right joins actually keeps all rows in the last table, so you have to wait to see which rows stay.
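Here is a small sketch of that behaviour, run in SQLite via Python's sqlite3 module. The column names (name, cnt) follow the guess made in the query above, and the site names and hit counts are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sitenames (name TEXT);
    CREATE TABLE table1 (name TEXT, cnt INTEGER);
    CREATE TABLE table2 (name TEXT, cnt INTEGER);
    CREATE TABLE table3 (name TEXT, cnt INTEGER);
    INSERT INTO sitenames VALUES ('alpha'), ('beta'), ('gamma');
    INSERT INTO table1 VALUES ('alpha', 5), ('beta', 7);
    INSERT INTO table2 VALUES ('alpha', 2);
    INSERT INTO table3 VALUES ('gamma', 9);
""")

# Every site is kept; a missing match in a hits table shows up as NULL (None).
report = conn.execute("""
    SELECT sn.name, t1.cnt, t2.cnt, t3.cnt
    FROM sitenames sn
    LEFT JOIN table1 t1 ON t1.name = sn.name
    LEFT JOIN table2 t2 ON t2.name = sn.name
    LEFT JOIN table3 t3 ON t3.name = sn.name
    ORDER BY sn.name
""").fetchall()

for row in report:
    print(row)
# ('alpha', 5, 2, None)
# ('beta', 7, None, None)
# ('gamma', None, None, 9)
```

If zeros are preferred over NULLs in the report, wrapping each count in COALESCE(t1.cnt, 0) is one common option.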
I did it like this. I googled some more, as I could not get it to work even with the answer here; joins are totally new to me. The names internalusers, external and so on are fake, as I can't post the real names here.
select s.Site,s2.Total_hits as 'Internalusers',s3.Total_Hits as 'ExternalUsers',
s4.Total_hits as 'Company2users' from Database.Table as S
left outer join HITAnalyze.dbo.Internal as s2 on s2.site = s.Site
left outer join HITAnalyze.dbo.External1 as s3 on s3.site = s.Site
left outer join HITAnalyze.dbo.Company2 as s4 on s4.site = s.Site
How does one implement SQL joins without using the JOIN keyword?
This is not really necessary, but I thought that by doing this I could better understand what joins actually do.
The basic INNER JOIN is easy to implement.
The following:
SELECT L.XCol, R.YCol
FROM LeftTable AS L
INNER JOIN RightTable AS R
ON L.IDCol=R.IDCol;
is equivalent to:
SELECT L.XCol, R.YCol
FROM LeftTable AS L, RightTable AS R
WHERE L.IDCol=R.IDCol;
In order to extend this to a LEFT/RIGHT/FULL OUTER JOIN, you only need to UNION the rows with no match, along with NULL in the correct columns, to the previous INNER JOIN.
For a LEFT OUTER JOIN, add:
UNION ALL
SELECT L.XCol, NULL /* cast the NULL as needed */
FROM LeftTable AS L
WHERE NOT EXISTS (
SELECT * FROM RightTable AS R
WHERE L.IDCol=R.IDCol)
For a RIGHT OUTER JOIN, add:
UNION ALL
SELECT NULL, R.YCol /* cast the NULL as needed */
FROM RightTable AS R
WHERE NOT EXISTS (
SELECT * FROM LeftTable AS L
WHERE L.IDCol=R.IDCol)
For a FULL OUTER JOIN, add both of the above.
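To check the LEFT OUTER JOIN recipe, the sketch below compares the native join with the comma-join-plus-UNION-ALL version in SQLite (through Python's sqlite3 module). The sample rows are hypothetical; the table and column names are the ones used above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE LeftTable (IDCol INTEGER, XCol TEXT);
    CREATE TABLE RightTable (IDCol INTEGER, YCol TEXT);
    INSERT INTO LeftTable VALUES (1, 'x1'), (2, 'x2'), (3, 'x3');
    INSERT INTO RightTable VALUES (2, 'y2'), (3, 'y3'), (4, 'y4');
""")

# Native LEFT OUTER JOIN for reference.
native = conn.execute("""
    SELECT L.XCol, R.YCol
    FROM LeftTable AS L
    LEFT OUTER JOIN RightTable AS R
      ON L.IDCol = R.IDCol
    ORDER BY 1
""").fetchall()

# Same result without the JOIN keyword: matched rows from the comma join,
# plus unmatched left rows padded with NULL.
emulated = conn.execute("""
    SELECT L.XCol, R.YCol
    FROM LeftTable AS L, RightTable AS R
    WHERE L.IDCol = R.IDCol
    UNION ALL
    SELECT L.XCol, NULL
    FROM LeftTable AS L
    WHERE NOT EXISTS (
        SELECT * FROM RightTable AS R
        WHERE L.IDCol = R.IDCol)
    ORDER BY 1
""").fetchall()

print(native)              # [('x1', None), ('x2', 'y2'), ('x3', 'y3')]
print(native == emulated)  # True
```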
There is an older, deprecated SQL syntax that allows you to join without using the JOIN keyword, but I personally find it more confusing than any permutation of the JOIN operator I've ever seen. Here's an example:
SELECT A.CustomerName, B.Address1, B.City, B.State, B.Zip
FROM dbo.Customers A, dbo.Addresses B
WHERE A.CustomerId = B.CustomerId
In the older way of doing it, you join by separating the tables with a comma and specifying the JOIN conditions in the WHERE clause. Personally, I would prefer the JOIN syntax:
SELECT A.CustomerName, B.Address1, B.City, B.State, B.Zip
FROM dbo.Customers A
JOIN dbo.Addresses B
ON A.CustomerId = B.CustomerId
The reason you should shy away from this old style of join is clarity and readability. When you are simply joining one table to another, it's pretty easy to figure out what's going on. When you're combining multiple types of joins across a half dozen (or more) tables, this older syntax becomes very challenging to manage.
The best way to get a handle on the JOIN operator is working with it. Here's a decent visual example of what the different JOINs do:
http://blog.codinghorror.com/a-visual-explanation-of-sql-joins/
Some more info:
https://sqlblog.org/2009/10/08/bad-habits-to-kick-using-old-style-joins
http://www.sqlservercentral.com/blogs/brian_kelley/2009/09/30/the-old-inner-join-syntax-vs-the-new-inner-join-syntax/
When SQL was an infant we didn't have "inner join" "left outer join" etc. All we did was list the tables like this:
FROM table1, table2, table3, .... tablen
Then we had a where clause that was like a novel in length. Some of the conditions were for filtering the data, and many of the conditions were to join tables, like this:
FROM table1, table2, table3, .... tablen
WHERE table1.code = 'x' and table1.id = table3.fk and table2.name like 'a%' and table2.id = table1.fk and tablen.fk = table3.id and table2.dt >= '2014-01-01'
From this we hoped like heck we had all the tables nicely related and we crossed our fingers. The worst case scenario - which happened a lot - was that we forgot to include a table at all in the where clause. This was not nice because what we get when we do that is a "Cartesian product" (basically a multiplication of all rows by the number of rows in the table we missed).
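The forgotten-condition problem is easy to reproduce. This sketch (SQLite via Python's sqlite3 module, with made-up tables and rows) counts the result rows with and without the condition that ties the third table in.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER, fk INTEGER);
    CREATE TABLE table2 (id INTEGER);
    CREATE TABLE table3 (id INTEGER, fk INTEGER);
    INSERT INTO table1 VALUES (1, 10), (2, 20);
    INSERT INTO table2 VALUES (10), (20);
    INSERT INTO table3 VALUES (100, 1), (101, 2), (102, 2), (103, 99);
""")

# All three tables related in the WHERE clause.
complete = conn.execute("""
    SELECT COUNT(*)
    FROM table1, table2, table3
    WHERE table2.id = table1.fk
      AND table3.fk = table1.id
""").fetchone()[0]

# The condition for table3 is forgotten: each of the 2 joined rows is
# repeated once for every one of the 4 rows in table3.
forgotten = conn.execute("""
    SELECT COUNT(*)
    FROM table1, table2, table3
    WHERE table2.id = table1.fk
""").fetchone()[0]

print(complete, forgotten)  # 3 8
```

With explicit JOIN ... ON syntax the same mistake is harder to make in most databases, because each joined table normally has to state its own condition.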
Then came ANSI standard join syntax, and life was better. We now place the join conditions on the join - not in the where clause - and as a bonus the where clause is easier to understand.
I don't think you will find it easier to understand this ancient syntax, for example an outer join was join = bizarre(+) or maybe it was (+)bizarre = join (I try not to remember).
Try http://www.codeproject.com/Articles/33052/Visual-Representation-of-SQL-Joins
A simple inner join
select * from table1 a
inner join
dbo.table2 b
on a.inventory_id = b.inventory_id
Wouldn't it be nice and intuitive to put it like this?
select * from table1 a
inner join
dbo.table2 b
on inventory_id
Are there any comparable succinct approaches?
Thanks!
If you are using PostgreSQL, MySQL or Oracle, you can use a natural join:
select *
from table1 a
natural join table2 b
Not sure why the question title includes "left", but you can do a natural left join as well.
Unfortunately, I'm sure you're using SQL Server due to the dbo., so no, the ON condition is required.
How about a natural join:
select *
from table1 a
natural join dbo.table2 b
However, your RDBMS may not support it, and I would recommend always specifying the join type and conditions in your queries. It's much more maintainable in the long run.
I'm guessing from the dbo. that you're using SQL Server though, and it's not supported there. See here for more info.
Edit:
There's another possibility that's again not supported by SQL Server but is worth noting. This could actually be worth using, as your join condition is clearly specified. More info here.
select *
from table1
inner join dbo.table2 using (inventory_id)
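SQLite happens to accept both NATURAL JOIN and USING, so the two shorthand forms can be tried out with Python's sqlite3 module. The extra columns (item, qty) and the rows are assumptions made for this sketch, and the dbo. schema prefix is dropped because SQLite has no such schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (inventory_id INTEGER, item TEXT);
    CREATE TABLE table2 (inventory_id INTEGER, qty INTEGER);
    INSERT INTO table1 VALUES (1, 'bolt'), (2, 'nut');
    INSERT INTO table2 VALUES (1, 40), (3, 7);
""")

# USING names the shared column explicitly.
using = conn.execute(
    "SELECT * FROM table1 INNER JOIN table2 USING (inventory_id)"
).fetchall()

# NATURAL JOIN matches on every column name the two tables share.
natural = conn.execute("SELECT * FROM table1 NATURAL JOIN table2").fetchall()

# The long form with ON, selecting the same three columns by hand.
with_on = conn.execute("""
    SELECT a.inventory_id, a.item, b.qty
    FROM table1 a
    INNER JOIN table2 b ON a.inventory_id = b.inventory_id
""").fetchall()

print(using)                        # [(1, 'bolt', 40)]
print(using == natural == with_on)  # True
```

Note that with USING or NATURAL JOIN, SELECT * returns the shared column only once, and that NATURAL JOIN silently changes meaning if a second same-named column is later added to both tables.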
If you don't want to use ANSI standard JOINs, then use implicit syntax:
select * from table1 a, table2 b
where a.inventory_id = b.inventory_id
If you are going to join multiple tables in a SQL query, where do you think is a better place to put the join statement: in the FROM clause or the WHERE clause?
If you are going to do it in the FROM clause, how do you format it so that it is clear and readable? (I'm talking about indents, newlines, whitespace in general.)
Are there any advantages/disadvantages to each?
I tend to use the FROM clause, or rather the JOIN clause itself, indenting like this (and using aliases):
SELECT t1.field1, t2.field2, t3.field3
FROM table1 t1
INNER JOIN table2 t2
ON t1.id1 = t2.id1
INNER JOIN table3 t3
ON t1.id1 = t3.id3
This keeps the join condition close to where the join is made. I find it easier to understand this way than trying to look through the WHERE clause to figure out what exactly is joined how.
When making OUTER JOINs (ANSI-89 or ANSI-92), filtration location matters because criteria specified in the ON clause are applied before the JOIN is made. Criteria against an OUTER JOINed table provided in the WHERE clause are applied after the JOIN is made. This can produce very different result sets.
In comparison, it doesn't matter for INNER JOINs if the criteria are provided in the ON or WHERE clauses -- the result will be the same. That said, I strive to keep the WHERE clause clean -- anything related to JOINed tables will be in their respective ON clause. Saves hunting through the WHERE clause, which is why ANSI-92 syntax is more readable.
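The difference is easiest to see side by side. The sketch below runs in SQLite via Python's sqlite3 module; the customers and orders tables and their rows are hypothetical and exist only to illustrate the point.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER, name TEXT);
    CREATE TABLE orders (customer_id INTEGER, status TEXT);
    INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO orders VALUES (1, 'open'), (2, 'closed');
""")

def run(join_type, on_extra="", where=""):
    """Build and run one variant of the query (helper for this sketch only)."""
    sql = f"""
        SELECT c.name, o.status
        FROM customers c
        {join_type} JOIN orders o ON o.customer_id = c.id {on_extra}
        {where}
        ORDER BY c.name
    """
    return conn.execute(sql).fetchall()

# OUTER JOIN: the filter in ON only limits which orders can match,
# so every customer is still returned.
left_on = run("LEFT", on_extra="AND o.status = 'open'")
# OUTER JOIN: the filter in WHERE runs after the join and discards
# the NULL-padded rows as well.
left_where = run("LEFT", where="WHERE o.status = 'open'")

# INNER JOIN: both placements give the same rows.
inner_on = run("INNER", on_extra="AND o.status = 'open'")
inner_where = run("INNER", where="WHERE o.status = 'open'")

print(left_on)     # [('Ann', 'open'), ('Bob', None), ('Cy', None)]
print(left_where)  # [('Ann', 'open')]
print(inner_on == inner_where)  # True
```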
I prefer the FROM clause, if for no other reason than that it distinguishes conditions that merely express foreign key relationships (filtering a Cartesian product down to related rows) from conditions that are logical restrictions. For example:
SELECT * FROM Products P JOIN ProductPricing PP ON P.Id = PP.ProductId
WHERE PP.Price > 10
As opposed to
SELECT * FROM Products P, ProductPricing PP
WHERE P.Id = PP.ProductID AND Price > 10
I can look at the first one and instantly know that the only logical restriction I'm placing is the price, as opposed to the implicit machinery of joining tables together on the relationship key.
I almost always use the ANSI 92 joins because it makes it clear that these conditions are for JOINING.
Typically I write it this way
FROM
foo f
INNER JOIN bar b
ON f.id = b.id
Sometimes I write it this way when it is trivial
FROM
foo f
INNER JOIN bar b ON f.id = b.id
INNER JOIN baz b2 ON b.id = b2.id
When it's not trivial I do it the first way
e.g.
FROM
foo f
INNER JOIN bar b
ON f.id = b.id
and b.type = 1
or
FROM
foo f
INNER JOIN (
SELECT max(date) date, id
FROM foo
GROUP BY
id) lastF
ON f.id = lastF.id
and f.date = lastF.Date
Or the really weird case (not sure if I got the parens right, but it's supposed to be a LEFT join to table bar, where bar needs an inner join to baz)
FROM
foo f
LEFT JOIN (bar b
INNER JOIN baz b2
ON b.id = b2.id
)ON f.id = b.id
You should put joins in Join clauses which means the From clause. A different question could be had about where to put filtering statements.
With respect to indenting, there are many styles. My preference is to indent related joins and keep main clauses like Select, From, Where, Group By, Having and Order By indented at the same level. In addition, I put each of these main attributes and the first line of an On clause on its own line.
Select ..
From Table1
Join Table2
On Table2.FK = Table1.PK
And Table2.OtherCol = '12345'
And Table2.OtherCol2 = 9876
Left Join (Table3
Join Table4
On Table4.FK = Table3.PK)
On Table3.FK = Table2.PK
Where ...
Group By ...
Having ...
Order By ...
Use the FROM clause to be compliant with ANSI-92 standards.
This:
select *
from a
inner join b
on a.id = b.id
where a.SomeColumn = 'x'
Not this:
select *
from a, b
where a.id = b.id
and a.SomeColumn = 'x'
I definitely always do my JOINS (of whatever type) in my FROM clause.
The way I indent them is this:
SELECT fields
FROM table1 t1
INNER JOIN table2 t2 ON t1.id = t2.t1_id
INNER JOIN table3 t3 ON t1.id = t3.t1_id
AND
t2.id = t3.t2_id
In fact, I'll generally go a step farther and move as much of my constraining logic from the WHERE clause to the FROM clause, because this (at least in MS SQL) front-loads the constraint, meaning that it reduces the size of the recordset sooner in the query construction (I've seen documentation that contradicts this, but my execution plans are invariably more efficient when I do it this way).
For example, if I wanted to select only things in the above query where t3.id = 3, I could put that in the WHERE clause, or I could do it this way:
SELECT fields
FROM table1 t1
INNER JOIN table2 t2 ON t1.id = t2.t1_id
INNER JOIN table3 t3 ON t1.id = t3.t1_id
AND
t2.id = t3.t2_id
AND
t3.id = 3
I personally find queries laid out in this way to be very readable and maintainable, but this is certainly a matter of personal preference, so YMMV.
Regardless, I hope this helps.
ANSI joins. I omit any optional keywords from the SQL as they only add noise to the equation. There's no such thing as a left inner join, is there? And by default, a simple join is an inner join, so there's no particular point to saying 'inner join'.
Then I column align things as much as possible.
The point being that a large, complex SQL query can be very difficult to comprehend, so the more order that is imposed on it to make it more readable, the better. Anybody looking at the query to fix, modify or tune it needs to be able to answer a few things right off the bat:
what tables/views are involved in the query?
what are the criteria for each join? What's the cardinality of each join?
what/how many columns are returned by the query?
I like to write my queries so they look something like this:
select PatientID              = rpt.ipatientid               ,
       EventDate              = d.dEvent                     ,
       Side                   = d.cSide                      ,
       OutsideHistoryDate     = convert(nchar, d.devent,112) ,
       Outcome                = p.cOvrClass                  ,
       ProcedureType          = cat.ctype                    ,
       ProcedureCategoryMajor = cat.cmajor                   ,
       ProcedureCategoryMinor = cat.cminor
from      dbo.procrpt rpt
join      dbo.procd   d   on d.iprocrptid   = rpt.iprocrptid
join      dbo.proclu  lu  on lu.iprocluid   = d.iprocluid
join      dbo.pathlgy p   on p.iProcID      = d.iprocid
left join dbo.proccat cat on cat.iproccatid = lu.iproccatid
where     rpt.ipatientid = @iPatientID