Convert OUTER APPLY to LEFT JOIN - sql

We have a query that is slow in production (for some internal reason):
SELECT T2.Date
FROM Table1 T1
OUTER APPLY
(
SELECT TOP 1 T2.[DATE]
FROM Table2 T2
WHERE T1.Col1 = T2.Col1
AND T1.Col2 = T2.Col2
ORDER BY T2.[Date] DESC
) T2
But when I convert it to a LEFT JOIN, it becomes fast:
SELECT Max(T2.[Date])
FROM Table1 T1
LEFT JOIN Table2 T2
ON T1.Col1 = T2.Col1
AND T1.Col2 = T2.Col2
GROUP BY T1.Col1, T1.Col2
Can we say that both queries are equivalent? If not, how can I convert it properly?

The queries are not exactly the same. It is important to understand the differences.
If t1.col1/t1.col2 are duplicated, then the first query returns a separate row for each duplication. The second combines them into a single row.
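If either t1.col1 or t1.col2 is NULL, the equality comparison never matches in either query, so both return NULL for the date. The difference is again in the row count: the first query returns one row for each such row of Table1, while the second collapses all rows with the same key (GROUP BY treats NULLs as one group) into a single row.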
That said, the two queries should have similar performance, particularly if you have an index on table2(col1, col2, date). I should note that under some circumstances the apply method is faster than joins, so relative performance depends on circumstances.
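The row-count difference can be checked on a toy dataset. The sketch below is only an illustration: it uses Python's sqlite3 module, the sample rows are made up, and since SQLite has no OUTER APPLY, a correlated scalar subquery with LIMIT 1 stands in for the OUTER APPLY ... TOP 1 form.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (Col1 INTEGER, Col2 INTEGER);
    CREATE TABLE Table2 (Col1 INTEGER, Col2 INTEGER, [Date] TEXT);
    -- (1, 1) is duplicated in Table1; (NULL, 2) can never match on "=".
    INSERT INTO Table1 VALUES (1, 1), (1, 1), (NULL, 2), (NULL, 2);
    INSERT INTO Table2 VALUES (1, 1, '2020-01-01'), (1, 1, '2021-06-30'),
                              (NULL, 2, '2019-05-05');
""")

# Stand-in for OUTER APPLY ... TOP 1 (assumption: SQLite has no APPLY).
# It produces exactly one output row per row of Table1.
apply_like = conn.execute("""
    SELECT (SELECT T2.[Date]
            FROM Table2 T2
            WHERE T1.Col1 = T2.Col1 AND T1.Col2 = T2.Col2
            ORDER BY T2.[Date] DESC
            LIMIT 1)
    FROM Table1 T1
""").fetchall()

# The LEFT JOIN + GROUP BY rewrite: one output row per distinct key.
grouped = conn.execute("""
    SELECT MAX(T2.[Date])
    FROM Table1 T1
    LEFT JOIN Table2 T2
      ON T1.Col1 = T2.Col1 AND T1.Col2 = T2.Col2
    GROUP BY T1.Col1, T1.Col2
""").fetchall()

print("apply-style rows:", len(apply_like))
print("grouped rows:    ", len(grouped))
```

On this data the apply-style query returns four rows (one per Table1 row) and the grouped query returns two (one per distinct key). To get the grouped query closer to the original, Table1 would need a unique key to group by; whether such a key exists is not stated in the question.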

Related

Why full outer join returns less results than join?

I have two SQL queries in Netezza:
1.
SELECT T1.Col1, T2.Col2 FROM TableA T1 JOIN TableB T2 ON T1.Col3 = T2.Col3
2.
SELECT T1.Col1, T2.Col2 FROM TableA T1 FULL OUTER JOIN TableB T2 ON T1.Col3 = T2.Col3
I assumed that query 2 should return at least as many rows as query 1. However, query 1 returns more rows than query 2.
Could someone help me understand why?

SQL - Compare all column's data from two tables efficiently

I have two tables, say Table1 and Table2, with the same columns and the same schema structure.
Now I want to select the rows from Table1 that are also present in Table2. However, when comparing data, I want to compare all the columns present in both tables; that is, the entire record in Table1 should be present in Table2. What is the fastest and most efficient way to achieve this in SQL Server 2008? Both tables contain around 1000-2000 records and will be accessed very frequently.
The intersect operator does just that:
SELECT *
FROM table1
INTERSECT
SELECT *
FROM table2
With EXISTS, you have another solution:
select * from Table1 t1 where exists (select 1 from Table2 t2
where t1.col1 = t2.col1
and t1.col2 = t2.col2
and t1.col3 = t2.col3
and ... -- check all columns here ...
)
However, this solution has a small problem with NULL values, which can only be tested via IS NULL or IS NOT NULL (NULL = NULL does not evaluate to true). Hence the complementary solution:
select * from Table1 t1 where exists (select 1 from Table2 t2
where (t1.col1 = t2.col1 or (t1.col1 IS NULL and t2.col1 IS NULL))
and (t1.col2 = t2.col2 or (t1.col2 IS NULL and t2.col2 IS NULL))
and (t1.col3 = t2.col3 or (t1.col3 IS NULL and t2.col3 IS NULL))
and ... -- check all columns here ...
)
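The three variants above can be compared side by side on rows that contain NULLs. This is a minimal sketch using Python's sqlite3 module rather than SQL Server 2008; the three-column schema and sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (col1 INTEGER, col2 TEXT, col3 TEXT);
    CREATE TABLE Table2 (col1 INTEGER, col2 TEXT, col3 TEXT);
    INSERT INTO Table1 VALUES (1, 'a', 'x'), (2, 'b', NULL), (3, 'c', 'z');
    INSERT INTO Table2 VALUES (1, 'a', 'x'), (2, 'b', NULL), (4, 'd', 'w');
""")

def run(sql):
    """Run a query and return its rows sorted by the first column."""
    return sorted(conn.execute(sql).fetchall(), key=lambda row: row[0])

# INTERSECT treats two NULLs in the same column as equal.
intersect_rows = run("SELECT * FROM Table1 INTERSECT SELECT * FROM Table2")

# Plain EXISTS with "=": the row containing NULL is lost.
plain_exists = run("""
    SELECT * FROM Table1 t1 WHERE EXISTS (
        SELECT 1 FROM Table2 t2
        WHERE t1.col1 = t2.col1 AND t1.col2 = t2.col2 AND t1.col3 = t2.col3)
""")

# NULL-aware EXISTS: each column is compared with an extra IS NULL branch.
null_safe_exists = run("""
    SELECT * FROM Table1 t1 WHERE EXISTS (
        SELECT 1 FROM Table2 t2
        WHERE (t1.col1 = t2.col1 OR (t1.col1 IS NULL AND t2.col1 IS NULL))
          AND (t1.col2 = t2.col2 OR (t1.col2 IS NULL AND t2.col2 IS NULL))
          AND (t1.col3 = t2.col3 OR (t1.col3 IS NULL AND t2.col3 IS NULL)))
""")

print(intersect_rows)
print(plain_exists)
print(null_safe_exists)
```

INTERSECT and the NULL-aware EXISTS both keep the row (2, 'b', NULL), while the plain EXISTS drops it. Note that INTERSECT also removes duplicate rows from its result, which EXISTS does not.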

Explanation about nested loop

Nested Loop Join
In this kind of join, each row from the outer input is processed in turn, and the rows of the inner input are searched for matches on the join column.
Nested loops joins perform a search on the inner table for each row of the outer table, typically using an index.
example:
Select T1.Col2
From Table1 T1
Inner Join Table2 T2 ON T1.Col1 = T2.Col1 AND T1.Col1 between 1 AND 36
Can you please explain which is the outer input and which is the inner input? Here we have two conditions, T1.Col1 = T2.Col1 and T1.Col1 BETWEEN 1 AND 36. Which condition is used to filter the table first?
I would rather write the query in this way:
SELECT T1.Col2
FROM Table1 T1
INNER JOIN Table2 T2 ON T1.Col1 = T2.Col1
WHERE T1.Col1 BETWEEN 1 AND 36
The second condition is not a join condition, but a where condition (Table2 is not involved in solving that condition).
The optimizer of your database should be able to decide whether filtering Table1 first is faster than joining Table2 and then filtering. I imagine the latter can be true if Table2 is quite small. Indexes can also change the query plan.
Anyway, if you want to be sure how your database executes your query, just check the query plan.
SELECT T1.Col2
FROM Table1 T1
INNER JOIN Table2 T2 ON T1.Col1 = T2.Col1
WHERE T1.Col1 >= 1 AND T1.Col1 <= 36
You'll find a better explanation of joins at the following link:
http://www.codinghorror.com/blog/2007/10/a-visual-explanation-of-sql-joins.html
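To make the outer/inner terminology concrete, here is a rough sketch of a nested loop join in plain Python. It is not how any particular database implements it: the function name nested_loop_join, the choice of Table1 as the outer input, and the sample rows are all assumptions for illustration. A real optimizer picks the outer input itself and usually searches the inner input through an index rather than a full scan.

```python
def nested_loop_join(outer_rows, inner_rows, key):
    """Yield (outer, inner) pairs whose `key` values are equal.

    inner_rows must be re-iterable (e.g. a list), because it is
    scanned once for every row of the outer input.
    """
    for outer in outer_rows:        # outer input: read once
        for inner in inner_rows:    # inner input: searched per outer row
            if outer[key] == inner[key]:
                yield outer, inner

# Hypothetical sample data standing in for Table1 and Table2.
table1 = [{"Col1": c, "Col2": f"t1-{c}"} for c in (1, 20, 36, 37, 50)]
table2 = [{"Col1": c} for c in (20, 36, 37, 99)]

# Plan A: filter Table1 first (BETWEEN 1 AND 36 is inclusive), then join.
filtered = [row for row in table1 if 1 <= row["Col1"] <= 36]
filter_then_join = [o["Col2"] for o, _ in
                    nested_loop_join(filtered, table2, "Col1")]

# Plan B: join everything, then apply the filter to the joined rows.
join_then_filter = [o["Col2"] for o, _ in
                    nested_loop_join(table1, table2, "Col1")
                    if 1 <= o["Col1"] <= 36]

print(filter_then_join)
print(join_then_filter)
```

Both plans return the same rows; they differ only in how many inner scans they perform (three outer rows in plan A against five in plan B here), which is the kind of trade-off the optimizer weighs.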

How can i have INNER and LEFT JOIN in the same query?

I am trying to use an INNER JOIN and a LEFT JOIN in the same query in MS Access. Here is my query:
SELECT T2.Col1, T2.Col2, T2.Col3, TB.Col1
FROM (T2
INNER JOIN TB ON
TB.Col1 = T2.Col1 AND TB.Col2 = T2.Col2)
LEFT JOIN T1
ON (T1.Col1 = TB.Col1) AND (T1.Col2 = T2.Col2)
WHERE T1.Col1 IS NULL OR T1.Col2 IS NULL
But at (T1.Col1 = TB.Col1) it says "JOIN expression not supported". Can someone help me with this?
I don't want to create an inner query and then a separate left-join query on top of it.
An earlier answer, since deleted, recommended you remove all the parentheses from your query. However, Access requires parentheses in the FROM clause when it includes more than one join.
Since the problem is with the joins, start with a simpler query which focuses on them only. See whether this query runs without error.
SELECT *
FROM
(T2
INNER JOIN TB
ON T2.Col1 = TB.Col1 AND T2.Col2 = TB.Col2)
LEFT JOIN T1
ON T2.Col1 = T1.Col1 AND T2.Col2 = T1.Col2
Once you get the joins correct, replace * with your field names and add your WHERE clause.
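The logic of the finished query (inner join, then left join, then keep only the rows with no match in T1) can be tried outside Access. The sketch below uses Python's sqlite3 module with made-up tables and rows. SQLite does not need the parentheses that Access requires, so they are omitted here, and Access itself is not exercised.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE T2 (Col1 INTEGER, Col2 INTEGER, Col3 TEXT);
    CREATE TABLE TB (Col1 INTEGER, Col2 INTEGER);
    CREATE TABLE T1 (Col1 INTEGER, Col2 INTEGER);
    INSERT INTO T2 VALUES (1, 10, 'a'), (2, 20, 'b'), (3, 30, 'c');
    INSERT INTO TB VALUES (1, 10), (2, 20);   -- (3, 30) has no TB match
    INSERT INTO T1 VALUES (1, 10);            -- only (1, 10) exists in T1
""")

# Same join conditions as the simplified query, with the field list and
# WHERE clause from the question added back.
rows = conn.execute("""
    SELECT T2.Col1, T2.Col2, T2.Col3, TB.Col1
    FROM T2
    INNER JOIN TB
      ON T2.Col1 = TB.Col1 AND T2.Col2 = TB.Col2
    LEFT JOIN T1
      ON T2.Col1 = T1.Col1 AND T2.Col2 = T1.Col2
    WHERE T1.Col1 IS NULL OR T1.Col2 IS NULL
""").fetchall()

print(rows)
```

Only the T2 row that matches TB but has no counterpart in T1 survives the WHERE clause.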

Optimal query writing

I have 3 tables t1,t2,t3 each having 35K records.
select t1.col1,t2.col2,t3.col3
from table1 t1,table2 t2,table3 t3
where t1.col1 = t2.col1
and t1.col1 = 100
and t3.col3 = t2.col3
and t3.col4 = 101
and t1.col2 = 102;
It takes a long time (15 secs) to return the result. I have proper indexes.
What is the optimal way of rewriting it?
It's probably best to run your query with EXPLAIN EXTENDED placed in front of it. That will give you a good idea of which indexes it is or isn't using. Include the output in your question if you need help parsing the results.
If you have an index on t1.Col1 or t1.Col2, put those conditions first in your WHERE clause. The STRAIGHT_JOIN keyword then tells MySQL to join the tables in the order they are listed in the FROM clause. Yes, this is the older comma-join syntax, which is still completely valid (you used it originally too), but it should come back quickly with a response. The first two conditions of the WHERE clause immediately restrict the dataset, while the rest complete the joins to the other tables:
select STRAIGHT_JOIN
t1.Col1,
t2.Col2,
t3.Col3
from
table1 t1,
table2 t2,
table3 t3
where
t1.Col1 = 100
and t1.Col2 = 102
and t1.col1 = t2.col1
and t2.col3 = t3.col3
and t3.Col4 = 101
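For readers who prefer explicit JOIN syntax, the comma-join form above can be written with INNER JOIN ... ON and checked for identical results. This is a small sketch using Python's sqlite3 module with invented sample rows. STRAIGHT_JOIN is MySQL-specific and is not used here, so the sketch only shows that the two spellings return the same rows, not anything about MySQL's join order or timing.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (col1 INTEGER, col2 INTEGER);
    CREATE TABLE table2 (col1 INTEGER, col2 TEXT, col3 INTEGER);
    CREATE TABLE table3 (col3 INTEGER, col4 INTEGER);
    INSERT INTO table1 VALUES (100, 102), (100, 5), (7, 102);
    INSERT INTO table2 VALUES (100, 'p', 1), (100, 'q', 2), (7, 'r', 1);
    INSERT INTO table3 VALUES (1, 101), (2, 0);
""")

# Comma-join form, as in the question and the answer.
comma_join = conn.execute("""
    SELECT t1.col1, t2.col2, t3.col3
    FROM table1 t1, table2 t2, table3 t3
    WHERE t1.col1 = 100
      AND t1.col2 = 102
      AND t1.col1 = t2.col1
      AND t2.col3 = t3.col3
      AND t3.col4 = 101
""").fetchall()

# Explicit JOIN form: join conditions in ON, filters in WHERE.
explicit_join = conn.execute("""
    SELECT t1.col1, t2.col2, t3.col3
    FROM table1 t1
    INNER JOIN table2 t2 ON t2.col1 = t1.col1
    INNER JOIN table3 t3 ON t3.col3 = t2.col3
    WHERE t1.col1 = 100
      AND t1.col2 = 102
      AND t3.col4 = 101
""").fetchall()

print(comma_join)
print(explicit_join)
```

Separating join conditions from filters does not change the result, but it makes it easier to see which columns need indexes when reading the EXPLAIN output.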