A bit of a novice question: I am running a query with left joins and want to know whether the place where you specify a filter makes a difference to performance. In the example below, the first query filters in the ON clause of the first join, and the second query does all the joins and then filters in the ON clause of the last join:
Select t1.*,t2.* from t1 t1
left join t2 t2
on t1.key = t2.key
and t1.date < today
left join t3 t3
on t2.key2 = t3.key
vs
Select t1.*,t2.* from t1 t1
left join t2 t2
on t1.key = t2.key
left join t3 t3
on t2.key2 = t3.key
and t1.date < today
Learn what LEFT JOIN ON returns: INNER JOIN ON rows UNION ALL unmatched left table rows extended by NULLs. Always know what INNER JOIN you want as part of an OUTER JOIN.
In general, your two queries produce different inner-join rows and different NULL-extended rows at the first left join, and then differ further because of the additional join. Unless certain constraints hold, the two queries compute different functions of their inputs, so comparing their performance is moot.
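To see that the two placements really return different rows, here is a minimal sketch using Python's built-in sqlite3 module. The sample data, the shortened column names (k, k2, d) and the fixed date literal standing in for "today" are my own assumptions, not taken from the question.

```python
import sqlite3

# Hypothetical sample data; k/k2/d stand in for key/key2/date from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (k INTEGER, d TEXT);
    CREATE TABLE t2 (k INTEGER, k2 INTEGER);
    CREATE TABLE t3 (k INTEGER);
    INSERT INTO t1 VALUES (1, '2020-01-01'), (2, '2999-01-01');
    INSERT INTO t2 VALUES (1, 10), (2, 20);
    INSERT INTO t3 VALUES (10), (20);
""")

# Date filter in the ON clause of the first left join.
filter_in_first_on = """
    SELECT t1.k, t2.k, t3.k
    FROM t1
    LEFT JOIN t2 ON t1.k = t2.k AND t1.d < '2024-01-01'
    LEFT JOIN t3 ON t2.k2 = t3.k
    ORDER BY t1.k
"""

# Same filter, but in the ON clause of the last left join.
filter_in_last_on = """
    SELECT t1.k, t2.k, t3.k
    FROM t1
    LEFT JOIN t2 ON t1.k = t2.k
    LEFT JOIN t3 ON t2.k2 = t3.k AND t1.d < '2024-01-01'
    ORDER BY t1.k
"""

rows_first = conn.execute(filter_in_first_on).fetchall()
rows_last = conn.execute(filter_in_last_on).fetchall()
print(rows_first)
print(rows_last)
```

With this data the row for k = 2 differs: the first query NULL-extends both t2 and t3, while the second still returns the matching t2 row and only NULL-extends t3.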
I am currently trying to join a few tables together (and maybe join 2 more if possible), but with how my query is written right now, I can't even see the results with 3 tables
select t1.x,
t1.y,
t1.z,
t4.a,
t4.b,
t4.c,
t4.d
from t1
left join t2 on t1.id=t2.id
left join t3 on t2.id=t3.id
left join t4 on t1.id2=t4.id
where t1.date between 'x' and'x'
and t1.city not in ('x')
and t3.column = x;
Is there a way to optimize this code to run faster and perhaps make it able to add more tables to it?
Thank you in advance!
Your query has some logic issues; fixing them might also help with the speed.
t2 is joined to t1 when they have the same id value.
t3 is then pulled in if and only if there is a row in t2 whose id matches both t1 and t3.
Finally, in your where clause, t3.column has to be x, otherwise the row is filtered out.
This means a row in t3 has to exist. Every t1 record that doesn't have both a t2 record and a t3 record will be filtered out by that where clause. Thus you don't need a left join, you need an INNER join.
select t1.x,
t1.y,
t1.z,
t4.a,
t4.b,
t4.c,
t4.d
from t1
inner join t2 on t1.id=t2.id
inner join t3 on t2.id=t3.id
left join t4 on t1.id2=t4.id
where t1.date between 'x' and'x'
and t1.city not in ('x')
and t3.column = x;
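A small sketch of that point, using Python's sqlite3 module with made-up data (the column name col and the values 'keep'/'drop' are assumptions for illustration): a WHERE condition on a t3 column throws away the NULL-extended rows, so the left joins behave like inner joins.

```python
import sqlite3

# Hypothetical data: t1.id = 3 has no t2/t3 rows, t1.id = 2 has a non-matching t3 value.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (id INTEGER);
    CREATE TABLE t2 (id INTEGER);
    CREATE TABLE t3 (id INTEGER, col TEXT);
    INSERT INTO t1 VALUES (1), (2), (3);
    INSERT INTO t2 VALUES (1), (2);
    INSERT INTO t3 VALUES (1, 'keep'), (2, 'drop');
""")

query_template = """
    SELECT t1.id
    FROM t1
    {join} JOIN t2 ON t1.id = t2.id
    {join} JOIN t3 ON t2.id = t3.id
    WHERE t3.col = 'keep'
    ORDER BY t1.id
"""

# Same query, once with LEFT joins and once with INNER joins.
left_rows = conn.execute(query_template.format(join="LEFT")).fetchall()
inner_rows = conn.execute(query_template.format(join="INNER")).fetchall()
print(left_rows, inner_rows)
```

Both variants return the same rows here, because NULL = 'keep' is never true and the unmatched rows are removed by the WHERE clause either way.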
In some DBMSs you can move the t3.column condition into the join's ON clause, which can help filter out rows earlier in the plan.
select t1.x,
t1.y,
t1.z,
t4.a,
t4.b,
t4.c,
t4.d
from t1
inner join t2 on t1.id=t2.id
inner join t3 on t2.id=t3.id and t3.column = x
left join t4 on t1.id2=t4.id
where t1.date between 'x' and'x'
and t1.city not in ('x');
My final advice is to take a close look at t2 to see if you really need it. Ask yourself: is there a reason a row has to exist in t2 in order for me to get the right results? If not, then because t1.id = t2.id and t2.id = t3.id imply t1.id = t3.id, you can join t3 directly to t1 and eliminate the t2 table completely.
I work with databases and SQL almost daily, and the more I work with SQL, the more I am of the opinion that there is no reason to use a right join or a full outer join.
Let's assume we have two tables: table1 and table2. If I want additional information for the rows in table1, I can use an inner join on table2, and if I want to keep the original rows when there is no match, I use a left join instead.
If I need to add information to table2 instead, I can do the same and left join table1 to table2. So I do not see a reason why I should ever use a right join. Is there any use case where a right join cannot be replaced by a left join?
I also wondered if I would ever need a full outer join. Why would you join two tables and keep the rows that do not match of BOTH tables? You could also achieve this by using two left joins.
Why would you join two tables and keep the rows that do not match of BOTH tables?
The full join has cases where it is useful. One of them is comparing two tables for differences, like an XOR between tables:
SELECT *
FROM t1
FULL JOIN t2
ON t1.id = t2.id
WHERE t1.id IS NULL
OR t2.id IS NULL;
Example:
t1.id ... t2.id
1 NULL
NULL 2
you could also achieve this by using two left joins.
Yes you could:
SELECT t1.*, t2.*
FROM t1
LEFT JOIN t2
ON t1.id = t2.id
WHERE t2.id IS NULL
UNION ALL
SELECT t1.*, t2.*
FROM t2
LEFT JOIN t1
ON t1.id = t2.id
WHERE t1.id IS NULL;
Some SQL dialects do not support FULL OUTER JOIN, and we emulate it that way.
Related: How to do a FULL OUTER JOIN in MySQL?
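For anyone who wants to try the two-left-joins emulation, here is a minimal sketch with Python's sqlite3 module. The sample ids are made up; the SQL is the UNION ALL query from above, so it does not rely on the engine supporting FULL JOIN.

```python
import sqlite3

# Hypothetical data: id 3 exists in both tables, 1 only in t1, 2 only in t2.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (id INTEGER);
    CREATE TABLE t2 (id INTEGER);
    INSERT INTO t1 VALUES (1), (3);
    INSERT INTO t2 VALUES (2), (3);
""")

# Rows only in t1, plus rows only in t2 (the "XOR" of the two tables).
xor_sql = """
    SELECT t1.id, t2.id
    FROM t1
    LEFT JOIN t2 ON t1.id = t2.id
    WHERE t2.id IS NULL
    UNION ALL
    SELECT t1.id, t2.id
    FROM t2
    LEFT JOIN t1 ON t1.id = t2.id
    WHERE t1.id IS NULL
"""

unmatched = conn.execute(xor_sql).fetchall()
print(unmatched)
```

The matching id 3 is left out, and each unmatched id shows up once with NULL on the other side. The order of the two rows is not guaranteed without an ORDER BY.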
On the other hand RIGHT JOIN is useful when you have to join more than 2 tables:
SELECT *
FROM t1
JOIN t2
...
RIGHT JOIN t3
...
Of course you could argue that you could rewrite it to a corresponding form, either by changing the join order or by using subqueries (inline views). From a developer's perspective it is always good to have tools available, even if you don't have to use them.
I'm new to SQL. I need help with a left join on a select.
The part I'm interested in:
Select...
from table t1
left join table t2
on t1.id=t2.id,
left join (select * from table 3 where ...) t3
on t1.id=t3.id
where t1.id='something'
I also tried moving t1.id(+)=t3.id into the where clause, but it didn't work.
I think the logic you want is:
Select...
from t1 left join
t2
on t1.id = t2.id left join
t3
on t1.id = t3.id and
<t3 conditions go here>
where t1.id = 'something'
You have a superfluous comma -- and commas should never be in the FROM clause.
You also have a superfluous subquery. You can get the same functionality by just including the condition in the ON clause.
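A quick way to convince yourself of that equivalence is the sketch below, using Python's sqlite3 module. The flag column and its values are hypothetical stand-ins for whatever the t3 conditions are.

```python
import sqlite3

# Hypothetical data; "flag = 'y'" plays the role of the t3 conditions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (id INTEGER);
    CREATE TABLE t3 (id INTEGER, flag TEXT);
    INSERT INTO t1 VALUES (1), (2), (3);
    INSERT INTO t3 VALUES (1, 'y'), (2, 'n');
""")

# Filter applied inside a subquery, as in the question.
subquery_sql = """
    SELECT t1.id, t3.flag
    FROM t1
    LEFT JOIN (SELECT * FROM t3 WHERE flag = 'y') t3 ON t1.id = t3.id
    ORDER BY t1.id
"""

# Same filter moved into the ON clause.
on_clause_sql = """
    SELECT t1.id, t3.flag
    FROM t1
    LEFT JOIN t3 ON t1.id = t3.id AND t3.flag = 'y'
    ORDER BY t1.id
"""

# For contrast: the filter in WHERE drops the unmatched t1 rows.
where_sql = """
    SELECT t1.id, t3.flag
    FROM t1
    LEFT JOIN t3 ON t1.id = t3.id
    WHERE t3.flag = 'y'
    ORDER BY t1.id
"""

subquery_rows = conn.execute(subquery_sql).fetchall()
on_clause_rows = conn.execute(on_clause_sql).fetchall()
where_rows = conn.execute(where_sql).fetchall()
print(subquery_rows, on_clause_rows, where_rows)
```

The subquery and ON-clause versions keep every t1 row, while putting the t3 condition in WHERE would not.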
This question might seem quite trivial, but being new to SQL programming I'm having some trouble understanding left joins.
To illustrate, I have the following scenario -
I have to perform left joins on the following tables -
from T1.id to T2.id
from T2.Oi to T3.Oi
from T1.Pi to T4.Pi
from t4.Si to T5.Si
from T6.Ki to T7.Ki
I'm trying the following approach, but I'm not sure if it's correct and, if it is, whether it's efficient
select /*(whatever I want)*/
from
T1 left join T2 on T1.id = T2.id
left join T4 on T1.Pi = T4.Pi
left join T5 on T4.Si = T5.Si
left join T3 on T2.Oi = T3.Oi
(Getting stuck on joining T6 and T7)
Can someone help me understand whether my approach above is right, and how to join T6 and T7?
Cheers!
Joining tables T1..T5 should look like this:
SELECT *
FROM T1
LEFT JOIN T2
ON T2.ID=T1.ID
LEFT JOIN T3
ON T3.OI=T2.OI
LEFT JOIN T4
ON T4.PI=T1.PI
LEFT JOIN T5
ON T5.SI=T4.SI
I don't know what you have in those tables, so watch out for a cartesian product: if the join keys are not unique, rows get multiplied (of course, that can be the desired result in some cases).
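To illustrate the cartesian-product concern, here is a small sketch with Python's sqlite3 module. The data is invented: one T1 row whose ID appears twice in T2 and whose PI appears three times in T4.

```python
import sqlite3

# Hypothetical data with non-unique join keys in T2 and T4.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE T1 (ID INTEGER, PI INTEGER);
    CREATE TABLE T2 (ID INTEGER, note TEXT);
    CREATE TABLE T4 (PI INTEGER, note TEXT);
    INSERT INTO T1 VALUES (1, 7);
    INSERT INTO T2 VALUES (1, 'a'), (1, 'b');
    INSERT INTO T4 VALUES (7, 'x'), (7, 'y'), (7, 'z');
""")

# One T1 row, two matches in T2, three matches in T4.
row_count = conn.execute("""
    SELECT COUNT(*)
    FROM T1
    LEFT JOIN T2 ON T2.ID = T1.ID
    LEFT JOIN T4 ON T4.PI = T1.PI
""").fetchone()[0]
print(row_count)
```

The single T1 row comes back 2 x 3 = 6 times, because every T2 match is combined with every T4 match.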
I don't know how tables T6 and T7 relate to the others. If their records have the same form (the same number of columns with compatible types), you may want to use UNION (or UNION ALL, which keeps duplicate rows):
SELECT *
FROM T1
LEFT JOIN T2
ON T2.ID=T1.ID
LEFT JOIN T3
ON T3.OI=T2.OI
LEFT JOIN T4
ON T4.PI=T1.PI
LEFT JOIN T5
ON T5.SI=T4.SI
UNION
SELECT *
FROM T6
LEFT JOIN T7
ON T7.KI=T6.KI
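The difference between UNION and UNION ALL can be checked with a tiny sketch in Python's sqlite3 module. The tables a and b are hypothetical and much simpler than the joins above; they only serve to show the duplicate handling.

```python
import sqlite3

# Hypothetical tables; the value 2 appears in both.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER);
    CREATE TABLE b (id INTEGER);
    INSERT INTO a VALUES (1), (2);
    INSERT INTO b VALUES (2), (3);
""")

# UNION removes duplicate rows from the combined result.
union_rows = conn.execute(
    "SELECT id FROM a UNION SELECT id FROM b ORDER BY id"
).fetchall()

# UNION ALL keeps every row from both sides.
union_all_rows = conn.execute(
    "SELECT id FROM a UNION ALL SELECT id FROM b ORDER BY id"
).fetchall()
print(union_rows, union_all_rows)
```

UNION has to de-duplicate, which usually costs extra work, so UNION ALL is the better choice when duplicates are impossible or wanted.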
I should probably know this by now, but what, if any is the difference between the two statements below?
The nested join:
SELECT
t1.*
FROM
table1 t1
INNER JOIN table2 t2
LEFT JOIN table3 t3 ON t3.table3_ID = t2.table2_ID
ON t2.table2_ID = t1.table1_ID
The more traditional join:
SELECT
t1.*
FROM
table1 t1
INNER JOIN table2 t2 ON t2.table2_ID = t1.table1_ID
LEFT JOIN table3 t3 ON t3.table3_ID = t2.table2_ID
Well, it's the order of operations.
SELECT
t1.*
FROM
table1 t1
INNER JOIN table2 t2
LEFT JOIN table3 t3 ON t3.table3_ID = t2.table2_ID
ON t2.table2_ID = t1.table1_ID
could be rewritten as:
SELECT
t1.*
FROM
table1 t1 -- inner join t1
INNER JOIN
(table2 t2 LEFT JOIN table3 t3 ON t3.table3_ID = t2.table2_ID) -- with this
ON t2.table2_ID = t1.table1_ID -- on this condition
So basically, first you LEFT JOIN t2 with t3, based on the join condition: table3_ID = table2_ID, then you INNER JOIN t1 with t2 on table2_ID = table1_ID.
In your second example you first INNER JOIN t1 with t2, and then LEFT JOIN the resulting inner join with table t3 on the condition table3_ID = table2_ID.
SELECT
t1.*
FROM
table1 t1
INNER JOIN table2 t2 ON t2.table2_ID = t1.table1_ID
LEFT JOIN table3 t3 ON t3.table3_ID = t2.table2_ID
could be rewritten as:
SELECT
t1.*
FROM
(table1 t1 INNER JOIN table2 t2 ON t2.table2_ID = t1.table1_ID) -- first inner join
LEFT JOIN -- then left join
table3 t3 ON t3.table3_ID = t2.table2_ID -- the result with this
EDIT
I apologize. My first remark was wrong. The two queries will produce the same results, but there may be a difference in performance: the first query may be slower than the second in some instances (when table1 contains only a subset of the elements in table2), as the LEFT JOIN will be executed first and only then intersected with table1, whereas the second query allows the query optimizer to do its job.
For your specific example, I don't think there should be any difference in the query plans generated, but there's definitely a difference in readability. Your 2nd example is MUCH easier to follow.
If you were to reverse the types of joins in the example, you could end up with much different results.
SELECT t1.*
FROM table1 t1
LEFT JOIN table2 t2 ON t2.table2_ID = t1.table1_ID
INNER JOIN table3 t3 ON t3.table3_ID = t2.table2_ID
-- may not produce the same results as...
SELECT t1.*
FROM table1 t1
LEFT JOIN table2 t2
INNER JOIN table3 t3 ON t3.table3_ID = t2.table2_ID
ON t2.table2_ID = t1.table1_ID
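Here is a minimal sketch of that difference using Python's sqlite3 module. The data is made up, and the nested join is written as a derived table with a hypothetical alias j, which I assume is equivalent here and avoids depending on how a given engine parses the nested ON syntax.

```python
import sqlite3

# Hypothetical data: table3 has no row for id 2.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (table1_ID INTEGER);
    CREATE TABLE table2 (table2_ID INTEGER);
    CREATE TABLE table3 (table3_ID INTEGER);
    INSERT INTO table1 VALUES (1), (2);
    INSERT INTO table2 VALUES (1), (2);
    INSERT INTO table3 VALUES (1);
""")

# LEFT JOIN first, then INNER JOIN: the inner join removes the row for id 2.
sequential_sql = """
    SELECT t1.table1_ID
    FROM table1 t1
    LEFT JOIN table2 t2 ON t2.table2_ID = t1.table1_ID
    INNER JOIN table3 t3 ON t3.table3_ID = t2.table2_ID
    ORDER BY t1.table1_ID
"""

# INNER JOIN of table2 and table3 first (as derived table j), then LEFT JOIN:
# every table1 row survives.
nested_sql = """
    SELECT t1.table1_ID
    FROM table1 t1
    LEFT JOIN (
        SELECT t2.table2_ID
        FROM table2 t2
        INNER JOIN table3 t3 ON t3.table3_ID = t2.table2_ID
    ) j ON j.table2_ID = t1.table1_ID
    ORDER BY t1.table1_ID
"""

sequential_rows = conn.execute(sequential_sql).fetchall()
nested_rows = conn.execute(nested_sql).fetchall()
print(sequential_rows, nested_rows)
```

With this data the sequential form loses the table1 row with id 2, while the nested form keeps it.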
Because the order of the joins DOES matter in many cases, careful thought should go into how you're writing your join syntax. If you find that the 2nd example is what you're really trying to accomplish, I'd consider rewriting the query so that you can put more emphasis on the order of your joins...
SELECT t1.*
FROM table2 t2
INNER JOIN table3 t3 ON t3.table3_ID = t2.table2_ID
RIGHT JOIN table1 t1 ON t2.table2_ID = t1.table1_ID
The best way to see what is different in these two queries is to compare the Query Plan for both these queries.
There is no difference in the result sets for these IF there are always rows in table3 for a given row in table2.
I tried it on my database and the difference in the query plans was that
1. For the first query, the optimizer chose to do the join on table2 and table3 first.
2. For the second query, the optimizer chose to join table1 and table2 first.
You should see no difference at all between the two queries, provided your DBMS' optimizer is up to scratch. That, however, even for big-iron, high-cost platforms, is not an assumption I'd be confident in making, so I'd be fairly unsurprised to discover that query plans (and consequently execution times) varied.