Use of more then one join and left join - sql-server-2012

If I got more then one join and in the second join I use left join.
by using this clause its going to take all the data from the two first tables or only from the second table.
Thanks

Join is just a method to connect different tables. Theoretically (not computationally), there's no limit on the number of joins you used on a query.
Keep in mind that the order of joins are important once you started to use something other than inner joins. For example, a LEFT JOIN b is not equivalent to b LEFT JOIN a.
With that being said, when you have more than one join, the result should be interpreted carefully.
Consider
SELECT a.id,b.name,c.department
FROM
a INNER JOIN b on a.id = b.id
LEFT JOIN c on a.id = c.id
The resulting table would consist of all id that is present in both a and b, and return NULL if a department is not present for those id.
So to answer your question, joins consider all your data in the query. However, the output table depends on the joins you used. If there are still confusion, you can refer to this question which addressed a similar thing.

Related

difference in with/without "left join" and matching in "where" or "on"?

Is there any performance difference between two different SQL-codes as below? The first one is without left jon and matching with where, the other is with left join and matching with on.
Because I get exactly the same result/output from those sql's, but I will be working with bigger tables soon (like couple of billions rows), so I don't want to have any performance issues. Thanks in advance ...
select a.customer_id
from table a, table b
where a.customer_id = b.customer_id
select a.customer_id
from table a
left join table b
on a.customer_id = b.customer_id
The two do different things and yes, there is a performance impact.
Your first example is a cross join with a filter which reduces it to an inner join (virtually all planners are smart enough to reduce this to an inner join but it is semantically a cross join and filter).
Your second is a left join which means that where the filter is not met, you will still get all records from table a.
This means that the planner has to assume all records from table a are relevant, and that correlating records from table b are relevant in your second example, but in your first example it knows that only correlated records are relevant (and therefore has more freedom in planning).
In a very small data set you will see no difference but you may get different results. In a large data set, your left join can never perform better than your inner join and may perform worse.

Is the order of joining tables indifferent as long as we chose proper join types?

Can we achieve desired results of joining tables by executing joins in whatever order? Suppose we want to left join two tables A and B (order AB). We can get the same results with right join of B and A (BA).
What about 3 tables ABC. Can we get whatever results by only changing order and joins types? For example A left join B inner join C. Can we get it with BAC order? What about if we have 4 or more tables?
Update.
The question Does the join order matter in SQL? is about inner join type. Agreed that then the order of join doesn't matter. The answer provided in that question does not answer my question whether it is possible to get desired results of joining tables with whatever original join types (join types here) by choosing whatever order of tables we like, and achieve this goal only by manipulating with join types.
In an inner join, the ordering of the tables in the join doesn't matter - the same rows will make up the result set regardless of the order they are in the join statement.
In either a left or right outer join, the order DOES matter. In A left join B, your result set will contain one row for every record in table A, irrespective of whether there is a matching row in table B. If there are non matching rows, this is likely to be a different result set to B left join A.
In a full outer join, the order again doesn't matter - rows will be produced for each row in each joined table no matter what their order.
Regarding A left join B vs B right join A - these will produce the same results. In simple cases with 2 tables, swapping the tables and changing the direction of the outer join will result in the same result set.
This will also apply to 3 or more tables if all of the outer joins are in the same direction - A left join B left join C will give the same set of results as C right join B right join A.
If you start mixing left and right joins, then you will need to start being more careful. There will almost always be a way to make an equivalent query with re-ordered tables, but at that point sub-queries or bracketing off expressions might be the best way to clarify what you are doing.
As another commenter states, using whatever makes your purpose most clear is usually the best option. The ordering of the tables in your query should make little or no difference performance wise, as the query optimiser should work this out (although the only way to be sure of this would be to check the execution plans for each option with your own queries and data).

Why are the two queries different (left join on ... and ... as opposed to using where clause)

I'm wondering why the following two queries produce different results (the first query has more rows than second).
SELECT * FROM A
JOIN ...
JOIN ...
JOIN C ON ...
LEFT JOIN B ON B.id = A.id AND B.otherId = C.otherId
As opposed to:
SELECT * FROM A
JOIN ...
JOIN ...
JOIN C ON ...
LEFT JOIN B ON B.id = A.id
WHERE B.otherId = C.otherId
Please help me understand. In the second query, the left join has only 1 condition so shouldn't it include all the results from the first query and more (where the extra rows have unmatched otherId). Then the WHERE clause should ensure that the otherId matches, like in the first query. Why are they different?
The WHERE is performed first by the Query engine before performing the JOIN.
The reasoning being why do the expensive JOIN, if we are going to filter some rows later.
The query engines are pretty good at optimizing the query you write.
Also you will see this effect only in OUTER JOINs. In inner joins both WHERE and JOIN conditions behave the same.
The second query returns less rows because your where clause was filtering the records out, and this is essentially changing the query from a left outer join to an inner join. So, you need to be careful where you place your filters in, but this will not matter if you were to do an inner join.
You've received correct answers, but allow me to delve a little deeper into the difference between join criteria and filtering criteria. Take a simple query with a left join:
select a.Key, a.NonKey1, b.NonKey2
from a
left join b on b.Key = a.Key;
This lists out all of NonKey1 values from table a and any NonKey2 fields from table b with matching key values or NULL where there is no match. A common variant is to look at only those rows in a that do not have a match in b:
select a.Key, a.NonKey1, b.NonKey2
from a
left join b on b.Key = a.Key
where b.Key is null;
Careful! If you accidentally write where b.Key is not null you've just changed your outer join into a regular inner join. Do that sometime and see if QA can catch it. On second thought, don't. (Also, having b.NonKey2 in the selection list is meaningless as it can only ever be NULL, but let's leave it there for the moment.) The join is based on the key fields of both tables matching. After the joining is complete, all rows with a successful join are discarded and only the results without a match remain. That means b.Key in the join criteria cannot be NULL and in the filtering criteria must be NULL for a row to be added to the result set. Fine, that's what we wanted. But consider what would happen if we moved the check to become part of the join criteria.
select a.Key, a.NonKey1, b.NonKey2
from a
left join b on b.Key = a.Key and b.Key is null;
The result is everything from a with nothing at all from b. Probably not what we wanted. If you think about it, you will see we could just as well have written on 0 = 1 and gotten the same result. What we've done is move a value from one context where NULL means one thing (success) to a context where NULL means something entirely different (failure).
So, in computer languages as in human languages, be careful of context. It can completely change the meaning of what you're trying to say.

SQL Server JOINS: Are 'JOIN' Statements 'LEFT OUTER' Associated by Default in SQL Server? [duplicate]

This question already has answers here:
What is the difference between "INNER JOIN" and "OUTER JOIN"?
(28 answers)
Closed 8 years ago.
I have about 6 months novice experience in SQL, TSQL, SSIS, ETL. As I find myself using JOIN statements more and more in my intern project I have been experimenting with the different JOIN statements. I wanted to confirm my findings. Are the following statements accurate pertaining to the conclusion of JOIN statements in SQL Server?:
1)I did a LEFT OUTER JOIN query and did the same query using JOIN which yielded the same results; are all JOIN statements LEFT OUTER associated in SQL Server?
2)I did a LEFT OUTER JOIN WHERE 2nd table PK (joined to) IS NOT NULL and did the same query using an INNER JOIN which yielded the same results; is it safe to say the the INNER JOIN statement will yield only matched records? and is the same as LEFT OUTER JOIN where joined records IS NOT NULL ?
The reason I'm asking is because I have been only using LEFT OUTER JOINS because that is what I was comfortable with. However, I want to eliminate as much code as possible when writing queries to be more efficient. I just wanted to make sure my observations are correct.
Also, are there any tips that you could provide on easily figuring out which JOIN statement is appropriate for specific queries? For instance, what JOIN would you use if you wanted to yield non-matching records?
Thanks.
A join or inner join (same thing) between table A and table B on, for instance, field1, would narrow in on all rows of table A and B sharing the same field1 value.
A left outer join between A and B, on field1, would show all rows of table A, and only those rows of table B that have a field1 existing in table A.
Where the rows of field1 on table A have a field1 value that doesn't exist in table B, the table B value would show null for field1, but the row of table A would be retained because it is an outer join. These are rows that wouldn't show up in a join which is an implied inner join.
If you get the same results doing a join between table A and table B as you do a left outer join between table A and B, then whatever fields you're joining on have values that exist in both tables. No value for any of the joined fields in A or B exist exclusively in A or B, they all exist in both A and B.
It is also possible you're putting criteria into the where clause that belongs in the on clause of the outer join, which may be causing your confusion. In my example above of tables A and B, where A is being left outer joined with B, you would put any criteria related to table B in the on clause, not the where clause, otherwise you would essentially be turning the outer join into an inner join. For example if you had b.field4 = 12 in the WHERE clause, and table B didn't have a match with A, it would be null and that criteria would fail, and it'd no longer come back even though you used a left outer join. That may be what you are referring to.
JOIN's are mapped to 'INNER JOIN' by default

Join clause joining 3 tables in same criteria

I've saw a join just like this:
Select <blablabla>
from
TableA TA
Inner join TableB TB on Ta.Id = Tb.Id
Inner join TableC TC on Tc.Id = Tb.Id and Ta.OtheriD = Tc.OtherColumn
But what's the point (end effect) of that second join clause?
What the implications when an outer join clause is used?
And, more important, what is the best to rewrite it in a way that is easy
to understand what it's trying to join?
And, more important, what is the best way to rewrite it to get rid of the construction
and mantain the correctness of the query.
I don't specify the RDBMS, because it's a more generic question, but for those
curious (since people always ask): it's SQL Server 2005.
EDIT: It's just a made up example (since I would have to dig the original source - which I don't have access anymore). I found the original join clause on a 10 join SELECT command.
It simply means you have an extra restriction on the intersection between tablea and tablec.
Because we know Ta.Id = Tb.Id, Tc.Id = Tb.Id is the same as Tc.Id = Ta.Id. Inner joins are associative. So it makes more sense like this so each join is between 2 tables only
Select <blablabla>
from
TableB TB
Inner join
TableA TA on Tb.Id = Ta.Id --a and b intersection
Inner join
TableC TC on Ta.Id = Tc.Id and Ta.OtheriD = Tc.Column --a and c intersection
Your Q : But what's the point (end effect) of that second join clause?
Effectively filters rows...you could move the second half of the on statement into the where clause if you really want, only really effects readability. gbn's answer looks good for this 3 table example,but to expand on it...sometimes a rewrite like this isn't possible. I have seen an occasion where 2 different systems (one oracle 8i and one SQL server 2000) had their databases joined together. A 3 part key was identified as being required to make the records unique in both systems, but each component of the 3 part key was held in different tables...the final result had a few joins like that.
Functionally...I'm not sure if there's a difference really. Unless I'm completely off, readability seems to be the biggest difference.
Your Second Q: What the implications when an outer join clause is used?
You'll potentially get a bunch of nulls (pending how you setup the outer join) while the inner join would have dropped them. Be careful though...inner joins is associative...as gbn put it: An OUTER JOIN is different and order does matter
The user may want to furthur filter the set of rows which are included in the Join set...
The point of the second join is to further limit your result set based on the contents of TableC. The first join gives you ONLY records that exist in TA and TB. The second join gives you ONLY results from the first join that also exist in TC.