Explanation about nested loop - sql

Nested Loop Join
In this kind of join operation it process each row from outer input and loop through all rows of inner input to search for matching row based on join column.
Nested loops joins perform a search on the inner table for each row of the outer table, typically using an index.
example:
Select T1.Col2
From Table1 T1
Inner Join Table2 T2 ON T1.Col1 = T2.Col1 AND T1.Col1 between 1 AND 36
can you please explain which is outer input and inner input. Here we have two condition that is T1.Col1 = T2.Col1 AND T1.Col1 between 1 AND 36 table is first filtered by which condition

I would rather write the query in this way:
SELECT T1.Col2
FROM Table1 T1
INNER JOIN Table2 T2 ON T1.Col1 = T2.Col1
WHERE T1.Col1 BETWEEN 1 AND 36
The second condition is not a join condition, but a where condition (Table2 is not involved in solving that condition).
The optimizer of your database should be able to decide if filtering first Table1 is faster than join Table2 and then filter, I imagine that the later can be true if Table2 is quite small. Also indexes can change the query plan.
Anyway if you want to be sure about how your database is executing your query just check the query plan.

SELECT T1.Col2
FROM Table1 T1
INNER JOIN Table2 T2 ON T1.Col1 = T2.Col1
WHERE T1.Col1 >=1 and T1.Col1<36
you'll find better explaination to join follow the link
http://www.codinghorror.com/blog/2007/10/a-visual-explanation-of-sql-joins.html

Related

Exactly when to user inner join or an alternative query

SELECT .... FROM TABLE1 T1, TABLE2 T2, TABLE3 T3
WHERE T1.NAME = 'ABC' AND T1.ID = T2.COL_ID AND T2.COL1 = T3.COL2
vs
SELECT .... FROM TABLE1 T1
WHERE T1.NAME = 'ABC'
INNER JOIN TABLE2 T2 ON T1.ID = T2.COL_ID
INNER JOIN TABLE3 T3 ON T2.COL1 = T3.COL2
Two questions
In terms of performance, which will perform better and why?
If Option 2 has the better performance, when should be using Option 1? (vice versa question if Option 1 has better performance)
The second query is not correct. It should be:
SELECT .... FROM TABLE1 T1
INNER JOIN TABLE2 T2 ON T1.ID = T2.COL_ID
INNER JOIN TABLE3 T3 ON T2.COL1 = T3.COL2
WHERE T1.NAME = 'ABC'
This is the right way to write your join condition. The 1st one is accepted, but technically creates a cartesian product. All modern database deals perfectly with both 1st and 2nd queries and interprets them the same way, therefore, performance should be the same. But still, you should use the second one because it is more readable and allows you to have only one way to write join weither it is a inner, left or full outer.
The answer is easy: Don't use comma-separated joins (first query). We used these in the 1980s for the lack of something better, but then in 1992 the new syntax (second query) was introduced1, because the old syntax was error-prone (it was easier to forget to apply join criteria) and harder to maintain (was missing join criteria intended or not in a query?) and there was no standard syntax for outer joins.
1 Oracle was a little late though featuring the new syntax. They introduced the new ANSI joins in Oracle 9i in 2001.
In terms of performance: There should be no difference in speed, because DBMS optimizers see that this is essentially the same query.
Your second query is syntactically incorrect by the way. The query's WHERE clause belongs after the complete FROM clause, i.e. after all the joins:
SELECT ....
FROM table1 t1
INNER JOIN table2 t2 ON t1.id = t2.col_id
INNER JOIN table3 t3 ON t2.col1 = t3.col2
WHERE t1.name = 'ABC';

Convert OUTER APPLY to LEFT JOIN

We have query which is slow in production(for some internal reason),
SELECT T2.Date
FROM Table1 T1
OUTER APPLY
(
SELECT TOP 1 T2.[DATE]
FROM Table2 T2
WHERE T1.Col1 = T2.Col1
AND T1.Col2 = T2.Col2
ORDER BY T2.[Date] DESC
) T2
But when I convert to LEFT JOIN it become fast,
SELECT Max(T2.[Date])
FROM Table1 T1
LEFT JOIN Table2 T2
ON T1.Col1 = T2.Col1
AND T1.Col2 = T2.Col2
GROUP BY T1.Col1, T1.Col2
Can we say that both queries are equal? If not then how to convert it properly.
The queries are not exactly the same. It is important to understand the differences.
If t1.col1/t1.col2 are duplicated, then the first query returns a separate row for each duplication. The second combines them into a single row.
If either t1.col1 or t1.col2 are NULL, then the first query will return NULL for the maximum date. The second will return a row and the appropriate maximum.
That said, the two queries should have similar performance, particularly if you have an index on table2(col1, col2, date). I should note that under some circumstances the apply method is faster than joins, so relative performance depends on circumstances.

SQL Server Query having multiple left outer joins hangs

I have two tables say for ex: table1 and table2 as below
Table1(id, desc )
Table2(id, col1, col2.. col10.....)
col1 to col10 in table 2 could be linked with id field in table1.
I write a query which has 10 instances of table1 (each one to link col1 to col10 of table2)
select t2.id, t1_1.desc, t1_2.desc,.....t1_10.desc from table2 t2
left outer join table1 t1_1 on t1_1.id = t2.col1
left outer join table1 t1_2 on t1_2.id = t2.col2
left outer join table1 t1_3 on t1_3.id = t2.col3
.
.
.
left outer join table1 t1_10 on t1_10.id = t2.col10
where t2.id ='111'
This query is inside the Sp and when i try to execute the Sp in SSMS, it works without any problems.
However When my web application runs, the query works for few where clause value and hangs for few.
I have checked the cost of the query, and created one nonclusteredindex with this 10 columns in table2. The cost found to be reduced to 0 on joins. However, I am still seeing the query hangs
The table 1 has 500 rows and table 2 has 700 rows in it.
Can any one help.
First of all, why are you rejoining to the table 10 times rather than one join with 10 predicates?
left outer join table1 t1_1 on t1_1.id = t2.col1
left outer join table1 t1_2 on t1_2.id = t2.col2
left outer join table1 t1_3 on t1_3.id = t2.col3
.
.
.
left outer join table1 t1_10 on t1_10.id = t2.col10
vs.
left outer join table1 t1 on t1.col1 = t2.col1
and t1.col2 = t2.col2
and t1.col3 = t2.col3
just wanted to bring that up because its very unusual to rejoin to the same table like that 10 times.
As far as your query plan goes, sql server sniffs the first parameter used in the query and caches that query plan for use in future queries. This query plan can be a good plan for certain where clause values and a bad plan for other where clause values which is why sometimes it is performing well and other times it is not. If you have skews in your table columns (some where clause values have a high number of recurring values) then you could consider using OPTION(RECOMPILE) in your query to force it to develop a new execution plan each time it is called. This has pros and cons, see this answer for a discussion OPTION (RECOMPILE) is Always Faster; Why?

How can i have INNER and LEFT JOIN in the same query?

I am trying to do an INNER JOIN and LEFT JOIN in the same query in MS ACCESS and here is my query
SELECT T2.Col1, T2.Col2, T2.Col3, TB.Col1
FROM (T2
INNER JOIN TB ON
TB.Col1 = T2.Col1 AND TB.Col2 = T2.Col2)
LEFT JOIN T1
ON (T1.Col1 = TB.Col1) AND (T1.Col2 = T2.Col2)
WHERE T1.Col1 IS NULL OR T1.Col2 IS NULL
But at (T1.Col1 = TB.Col1)` it says JOIN Expression not supported. Can some one help me with this.
I don't want to create an inner query and then create another left query with that seperately
An earlier answer, since deleted, recommended you remove all the parentheses from your query. However Access requires parentheses in the FROM clause when it includes more than one join.
Since the problem is with the joins, start with a simpler query which focuses on them only. See whether this query runs without error.
SELECT *
FROM
(T2
INNER JOIN TB
ON T2.Col1 = TB.Col1 AND T2.Col2 = TB.Col2)
LEFT JOIN T1
ON T2.Col1 = T1.Col1 AND T2.Col2 = T1.Col2
Once you get the joins correct, replace * with your field names and add your WHERE clause.

Rewrite SQL code SELECT block to simplify logic

I am trying to rewrite this block with simpler logic if this can be done. I am using it within a larger SELECT statement and I think IF I can simplify this block, I might be able to improve performance of my query.
proj_catg_type_id, proj_catg_id and proj_id are all PKs in their tables.
select t1.proj_catg_name
from table1 t1, table2 t2, table3 t3
where t2.proj_catg_type_id = t1.proj_catg_type_id
and t2.proj_catg_type_id = 213
and t3.proj_id = t2.proj_id
Without knowing the referential integrety rules and the logic behind the tables it is difficult to give a 100% correct answer. But just by looking to this statement the most simplified logic would be
select t1.proj_catg_name
from table1 t1
where t1.proj_catg_type_id = 213;
select t1.proj_catg_name
from table1 t1 inner join table2 t2
on t2.proj_catg_type_id=t1.proj_catg_type_id
where t2.proj_catg_type_id=213
and t3.proj_id=t2.proj_i
maybe? is t3 used outside this subselect?
If t3 is a table outside the selct you showed, then this is a correlated subquery which you should not be using at all, ever! That turns your query into a row-by agonizing row cursor.
Use derived tables or joins to get the results.
You don't give me enough code to write a specific solution for your problem, but let me give you an example:
SELECT
field1
, field2
, (SELECT t3.field3
FROM table2 t2
JOIN table3 t3 ON t2.id = t3.id
WHERE t4.somefield = t2.somefield)
FROM table1 t1
JOIn table4 t4 ON t1.id = t4.id
SELECT
field1
, field2
, t3.field3
FROM table1 t1
JOIn table4 t4
ON t1.id = t4.id
join (SELECT field3
FROM table2 t2
JOIN table3 t3 ON t2.id = t3.id) a
ON t4.somefield = t2.somefield
The first query runs one row at a time which is extremely slow. The second should give the same results but runs in a set-based fashion which is much faster. It is important to make sure the derived table has an a alias. You could also use a CTE.