Semantic difference between join queries - sql

I have two queries that I thought meant the same thing, but I keep getting different results and I was hoping someone could explain how these are different:
1.
select *
from table1 a
left join table2 b on a.Id = b.Id and a.val = 0
where b.Id is null
2.
select *
from table1 a
left join table2 b on a.Id = b.Id
where b.Id is null
and a.val = 0
The point of the query is to find the rows in table1 with val = 0 that have no matching row in table2.
I'm using SQL Server 2008, but I doubt that matters.

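When considering left joins, think of them as having 3 conceptual stages:
1. The join filter (the ON clause) is applied.
2. The left rows that matched nothing are added back in, with NULL in every right-hand column.
3. The where clause is applied.
You will then see why you get different results. In query 1, a.val = 0 is part of the join filter, so a table1 row with val <> 0 matches nothing in stage 1, is added back in stage 2, and then passes the b.Id is null test in stage 3. In query 2, the same row is removed by a.val = 0 in stage 3.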
That also explains why this returns results
select o.*
from sys.objects o
left join sys.objects o2 on o.object_id=o2.object_id and 1=0
And this doesn't.
select o.*
from sys.objects o
left join sys.objects o2 on o.object_id=o2.object_id
where 1=0
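
The stages can be checked on a small data set. This is only a minimal sketch: it uses Python's sqlite3 module rather than SQL Server, and the sample rows in table1 and table2 are made up for illustration. It runs both queries from the question side by side.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (Id INTEGER, val INTEGER);
    CREATE TABLE table2 (Id INTEGER);
    -- Hypothetical sample rows: Ids 1 and 3 exist in table2, 2 and 4 do not.
    INSERT INTO table1 VALUES (1, 0), (2, 0), (3, 1), (4, 1);
    INSERT INTO table2 VALUES (1), (3);
""")

# Query 1: a.val = 0 is part of the join filter (stage 1).
q1 = conn.execute("""
    SELECT a.Id FROM table1 a
    LEFT JOIN table2 b ON a.Id = b.Id AND a.val = 0
    WHERE b.Id IS NULL
    ORDER BY a.Id
""").fetchall()

# Query 2: a.val = 0 is applied in the where clause (stage 3).
q2 = conn.execute("""
    SELECT a.Id FROM table1 a
    LEFT JOIN table2 b ON a.Id = b.Id
    WHERE b.Id IS NULL AND a.val = 0
    ORDER BY a.Id
""").fetchall()

print(q1)  # [(2,), (3,), (4,)]
print(q2)  # [(2,)]
```

Rows 3 and 4 have val = 1, so in query 1 they can never match in the join filter; they come back padded with NULLs and survive the b.Id IS NULL test. Only query 2 answers the question as stated.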

SELECT *
FROM table1 t1
WHERE t1.val = 0
AND NOT EXISTS (SELECT 1 FROM table2 t2 WHERE t1.Id = t2.Id)

If you remove the WHERE clause entirely, using a LEFT OUTER JOIN means that all the rows from the table on the left hand side will appear, even if they don't satisfy the JOIN criteria. For example, no rows satisfy the expression 1 = 0, yet this:
SELECT *
FROM table1 AS a
LEFT OUTER JOIN table2 AS b
ON a.Id = b.Id
AND 1 = 0;
still returns every row in table1, with NULL in each table2 column, because no pair of rows can satisfy the join condition. Simply put, that's the way OUTER JOINs work.
The WHERE clause is applied after the JOIN, therefore this
SELECT *
FROM table1 AS a
LEFT OUTER JOIN table2 AS b
ON a.Id = b.Id
WHERE 1 = 0;
will return no rows.
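
A quick way to see both behaviours is to run the two statements against throwaway tables. The sketch below assumes SQLite (via Python's sqlite3 module) and three invented rows; the table contents are not from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (Id INTEGER);
    CREATE TABLE table2 (Id INTEGER);
    -- Hypothetical sample rows.
    INSERT INTO table1 VALUES (1), (2), (3);
    INSERT INTO table2 VALUES (1), (2);
""")

# 1 = 0 in the ON clause: nothing matches, so every left row is kept with NULLs.
on_rows = conn.execute("""
    SELECT a.Id, b.Id FROM table1 AS a
    LEFT OUTER JOIN table2 AS b ON a.Id = b.Id AND 1 = 0
    ORDER BY a.Id
""").fetchall()

# 1 = 0 in the WHERE clause: applied after the join, so every row is removed.
where_rows = conn.execute("""
    SELECT a.Id, b.Id FROM table1 AS a
    LEFT OUTER JOIN table2 AS b ON a.Id = b.Id
    WHERE 1 = 0
""").fetchall()

print(on_rows)     # [(1, None), (2, None), (3, None)]
print(where_rows)  # []
```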

Related

When I do left join, I am still not able to get data from left table?

I know the right table does not have matching records for every row of the left table,
but I am still not getting the left-table rows back with NULLs for the right table:
select a.sales, b.profit
from T1 a
left join T2 b on a.id = b.id
where b.category = 'office'
and b.code = '245'
With the where condition on the right table, the unmatched left rows disappear;
without the where condition I get them.
My question is: will the left table's rows be affected by a where condition on the right table, even though I'm using a left join to retain the left table's records?
Your WHERE clause only keeps rows in which b.category and b.code match the required values, so both must be non-NULL. The rows that the LEFT JOIN padded with NULLs are discarded, which effectively turns your LEFT JOIN into an INNER JOIN.
To keep them, put the filters in the join condition:
select a.sales,
b.profit
from T1 a
left join T2 b
on ( a.id = b.id
AND b.category = 'office'
AND b.code = '245')
Or to pre-filter T2 in a sub-query:
select a.sales,
b.profit
from T1 a
left join (
SELECT *
FROM T2
WHERE category = 'office'
AND code = '245'
) b
on a.id = b.id
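
To compare the three forms, here is a small sketch using Python's sqlite3 module. The column types and the sample rows for T1 and T2 are assumptions made up for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE T1 (id INTEGER, sales INTEGER);
    CREATE TABLE T2 (id INTEGER, profit INTEGER, category TEXT, code TEXT);
    -- Hypothetical sample rows: id 2 is in the wrong category, id 3 has no T2 row.
    INSERT INTO T1 VALUES (1, 100), (2, 200), (3, 300);
    INSERT INTO T2 VALUES (1, 10, 'office', '245'), (2, 20, 'home', '245');
""")

# Filters in WHERE: NULL-padded rows fail the test, so it acts like an inner join.
in_where = conn.execute("""
    SELECT a.sales, b.profit FROM T1 a
    LEFT JOIN T2 b ON a.id = b.id
    WHERE b.category = 'office' AND b.code = '245'
    ORDER BY a.id
""").fetchall()

# Filters in ON: every T1 row is kept.
in_on = conn.execute("""
    SELECT a.sales, b.profit FROM T1 a
    LEFT JOIN T2 b ON a.id = b.id AND b.category = 'office' AND b.code = '245'
    ORDER BY a.id
""").fetchall()

# Pre-filtering T2 in a sub-query gives the same rows as filtering in ON.
in_subquery = conn.execute("""
    SELECT a.sales, b.profit FROM T1 a
    LEFT JOIN (SELECT * FROM T2 WHERE category = 'office' AND code = '245') b
      ON a.id = b.id
    ORDER BY a.id
""").fetchall()

print(in_where)     # [(100, 10)]
print(in_on)        # [(100, 10), (200, None), (300, None)]
print(in_subquery)  # [(100, 10), (200, None), (300, None)]
```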

Joining on one column and if no match join on another

I am attempting to use multiple columns in my join like this:
FROM Table1 t
INNER JOIN Table2 s ON t.number = s.number OR t.letter = s.letter
Both of these tables have several hundred thousand rows of data, and the query never finishes.
Any ideas?
You mean something like:
FROM Table1 t
INNER JOIN Table2 s ON case
when t.number = s.number then 1
when t.letter = s.letter then 1
else 0 end = 1
The first matching condition wins.
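Another possibility is to use two left joins, where the second only applies when the first found no match, and then fix the rest of the query to read from whichever alias matched: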
FROM Table1 t LEFT JOIN
Table2 sn
ON t.number = sn.number LEFT JOIN
Table2 sl
ON t.letter = sl.letter and sn.number is null
For performance, you want indexes on Table2(number) and Table2(letter).
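ORs in join conditions usually produce bad performance. I would go for a UNION (which also removes the duplicate you would otherwise get when a pair of rows matches on both columns):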
SELECT *
FROM Table1 t
INNER JOIN Table2 s ON t.number = s.number
UNION
SELECT *
FROM Table1 t
INNER JOIN Table2 s ON t.letter = s.letter
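
Whether the UNION rewrite really returns the same rows as the OR join can be checked on toy data. The following is a sketch under assumptions: SQLite through Python's sqlite3 module, invented rows, and no duplicate rows inside either table (UNION would collapse those, the OR join would not).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (number INTEGER, letter TEXT);
    CREATE TABLE Table2 (number INTEGER, letter TEXT);
    -- Hypothetical rows: one match on number, one on letter, one on both.
    INSERT INTO Table1 VALUES (1, 'a'), (2, 'b'), (3, 'c');
    INSERT INTO Table2 VALUES (1, 'x'), (9, 'b'), (3, 'c'), (8, 'z');
""")

cols = "t.number, t.letter, s.number, s.letter"

or_rows = conn.execute(f"""
    SELECT {cols} FROM Table1 t
    INNER JOIN Table2 s ON t.number = s.number OR t.letter = s.letter
""").fetchall()

union_rows = conn.execute(f"""
    SELECT {cols} FROM Table1 t INNER JOIN Table2 s ON t.number = s.number
    UNION
    SELECT {cols} FROM Table1 t INNER JOIN Table2 s ON t.letter = s.letter
""").fetchall()

# UNION ALL keeps the pair that matches on both columns twice.
union_all_rows = conn.execute(f"""
    SELECT {cols} FROM Table1 t INNER JOIN Table2 s ON t.number = s.number
    UNION ALL
    SELECT {cols} FROM Table1 t INNER JOIN Table2 s ON t.letter = s.letter
""").fetchall()

print(sorted(or_rows) == sorted(union_rows))  # True
print(len(union_rows), len(union_all_rows))   # 3 4
```

Each branch of the UNION is a plain equality join, so it can use an index on Table2(number) or Table2(letter) respectively.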

Syntax for multiple joins in sql

Working on Oracle: I am attempting to do an inner self join, with a where clause, then take that result and do a left outer join on it:
(select * from table1 A
inner join
select * from table1 B
on A.id = B.id
where
A.id is not null and B.id is not null) C
left outer join
select * from table2 D
on C.id = D.id
Somehow I am syntactically challenged and can't make this work. Can't seem to find the right syntax anywhere.
Just put the where clause at the end. The database will get it right:
select *
from table1 A
inner join table1 B on A.id = B.id
left join table2 D on D.id = A.id
where A.id is not null
In this case, we can take advantage of the logical transitive property for your id column joins and where clause.
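Your second join needs a query on its left-hand side, so add a select * from at the beginning. Each nested select used as a join operand also has to be wrapped in parentheses, and the inner query should select A.* rather than * so that C does not end up with two columns named id:
select * from (select A.* from table1 A
inner join
(select * from table1) B
on A.id = B.id
where
A.id is not null and B.id is not null) C
left outer join
(select * from table2) D
on C.id = D.id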
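
The flat and the nested form can be compared on sample data. This sketch runs in SQLite through Python's sqlite3 module rather than Oracle, with invented rows (including a NULL id), so treat it as an illustration of the logic only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER);
    CREATE TABLE table2 (id INTEGER);
    -- Hypothetical rows; table1 deliberately contains a NULL id.
    INSERT INTO table1 VALUES (1), (2), (NULL);
    INSERT INTO table2 VALUES (1);
""")

# Flat form: all joins first, where clause at the end.
flat = conn.execute("""
    SELECT A.id, D.id FROM table1 A
    INNER JOIN table1 B ON A.id = B.id
    LEFT JOIN table2 D ON D.id = A.id
    WHERE A.id IS NOT NULL
    ORDER BY A.id
""").fetchall()

# Nested form: the self join is an inline view that is then outer joined.
nested = conn.execute("""
    SELECT C.id, D.id
    FROM (SELECT A.id FROM table1 A
          INNER JOIN table1 B ON A.id = B.id
          WHERE A.id IS NOT NULL AND B.id IS NOT NULL) C
    LEFT OUTER JOIN table2 D ON C.id = D.id
    ORDER BY C.id
""").fetchall()

# Without any where clause: NULL = NULL is not true, so the inner join
# already drops the NULL id.
no_where = conn.execute("""
    SELECT A.id, D.id FROM table1 A
    INNER JOIN table1 B ON A.id = B.id
    LEFT JOIN table2 D ON D.id = A.id
    ORDER BY A.id
""").fetchall()

print(flat)  # [(1, 1), (2, None)]
print(flat == nested == no_where)  # True
```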

how to do a join using a conditional ON

I would like to do as follows:
Select * from
table1 a
inner join table2 b
on
a.id = b.id
if (some condition is met) // Now it gets interesting!
begin
and a.name = b.name
end
Obviously, this doesn't work.
How can this best be accomplished?
Thanks Stackers!
Why can't you just put the condition in the WHERE-clause?
Generally, you would make a conditional join something like this:
Select *
from table1 a
inner join table2 b
on (a.conditional_field = 1 and a.id = b.id)
or (a.conditional_field = 2 and a.id2 = b.id2)
The important thing to note here is that this makes the join condition conditional, not the join itself. If you're looking to make the join itself optional, that's what outer joins are for:
Select *
from table1 a
left outer join table2 b
on a.id = b.id
The first query returns every pair of rows for which either condition is true. The second query unconditionally returns all rows from table1, together with the table2 rows where the join condition is true and NULLs where it is not.
I would use something like this:
SELECT * FROM table1 a
JOIN table2 b ON (a.id = b.id)
WHERE NOT ( == your condition here == ) OR a.name = b.name
If you really want to put it in the join condition, you could do something like this:
SELECT * FROM table1 a
JOIN table2 b ON (a.id = b.id AND (NOT ( == your condition here == ) OR a.name = b.name))
but I think the first form is more clear.
EDIT: as #James Curtis noted in the comments:
it is important to note that the option to put the condition in the
WHERE clause is only valid for an INNER JOIN, for an outer join you
may eliminate rows.
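
As an illustration of the NOT (condition) OR a.name = b.name pattern, the sketch below stands in a bound parameter for "your condition". The parameter name :strict, the helper function, and the sample rows are all hypothetical, and it runs on SQLite through Python's sqlite3 module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER, name TEXT);
    CREATE TABLE table2 (id INTEGER, name TEXT);
    -- Hypothetical rows: id 1 agrees on name, id 2 does not.
    INSERT INTO table1 VALUES (1, 'ann'), (2, 'bob');
    INSERT INTO table2 VALUES (1, 'ann'), (2, 'rob');
""")

SQL = """
    SELECT a.id, a.name, b.name
    FROM table1 a
    JOIN table2 b ON a.id = b.id
    WHERE (NOT :strict) OR a.name = b.name
    ORDER BY a.id
"""

def matches(strict):
    # :strict plays the role of "some condition is met" (1 = met, 0 = not met).
    return conn.execute(SQL, {"strict": int(strict)}).fetchall()

print(matches(False))  # [(1, 'ann', 'ann'), (2, 'bob', 'rob')]
print(matches(True))   # [(1, 'ann', 'ann')]
```

When the condition is not met, the NOT branch is true for every row and the name comparison is ignored; when it is met, only rows with equal names pass.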

sql, outer join

I have two tables, linked with an outer join. The relationship between the primary and secondary table is a 1 to [0..n]. The secondary table includes a timestamp column indicating when the record was added. I only want to retrieve the most recent record of the secondary table for each row in the primary. I have to use a group by on the primary table due to other tables also part of the SELECT. There's no way to use a 'having' clause though since this secondary table is not part of the group.
How can I do this without doing multiple queries?
For performance, try to touch each table as few times as possible.
Option 1, OUTER APPLY
SELECT *
FROM
table1 a
OUTER APPLY
(SELECT TOP 1 TimeStamp FROM table2 b
WHERE a.somekey = b.somekey ORDER BY TimeStamp DESC) x
Option 2, Aggregate
SELECT *
FROM
table1 a
LEFT JOIN
(SELECT MAX(TimeStamp) AS maxTs, somekey FROM table2
GROUP BY somekey) x ON a.somekey = x.somekey
Note: each table is mentioned only once, and option 2 needs no correlated subquery.
Something like:
SELECT a.id, b.*
FROM table1 a
INNER JOIN table2 b ON b.parentid = a.id
WHERE b.timestamp = (SELECT MAX(timestamp) FROM table2 c WHERE c.parentid = a.id)
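If you want to show rows for IDs in table1 without any matches in table2, use LEFT JOIN instead of INNER JOIN and move the timestamp condition from the WHERE clause into the ON clause; left in the WHERE clause, it would filter the unmatched rows out again.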
select *
from table1 left outer join table2 a on
table1.id = a.table1_id
where
not exists (select 1 from table2 b where a.table1_id = b.table1_id and b.timestamp > a.timestamp)
The quickest way I know of is this:
SELECT
A.*,
B.SomeField
FROM
Table1 A
INNER JOIN (
SELECT
B1.A_ID,
B1.SomeField
FROM
Table2 B1
LEFT JOIN Table2 B2 ON (B1.A_ID=B2.A_ID) AND (B1.TimeStmp < B2.TimeStmp)
WHERE
B2.A_ID IS NULL
) B ON B.A_ID = A.ID
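
Two of the approaches above, the MAX aggregate and the self anti-join, can be compared on sample data. This is a sketch only: it uses SQLite through Python's sqlite3 module, and the column names parentid, ts and note as well as the rows are invented for the example. Both queries use left joins so that primary rows without any secondary record are kept.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER);
    CREATE TABLE table2 (parentid INTEGER, ts TEXT, note TEXT);
    -- Hypothetical rows: id 1 has two records, id 2 has one, id 3 has none.
    INSERT INTO table1 VALUES (1), (2), (3);
    INSERT INTO table2 VALUES
        (1, '2020-01-01', 'old'),
        (1, '2021-01-01', 'new'),
        (2, '2019-05-05', 'only');
""")

# Aggregate: find MAX(ts) per parent, then join back to fetch the full row.
by_max = conn.execute("""
    SELECT a.id, b.note
    FROM table1 a
    LEFT JOIN (SELECT parentid, MAX(ts) AS max_ts
               FROM table2 GROUP BY parentid) m ON m.parentid = a.id
    LEFT JOIN table2 b ON b.parentid = m.parentid AND b.ts = m.max_ts
    ORDER BY a.id
""").fetchall()

# Anti-join: keep the b1 row for which no later b2 row exists.
by_anti_join = conn.execute("""
    SELECT a.id, b1.note
    FROM table1 a
    LEFT JOIN table2 b1 ON b1.parentid = a.id
    LEFT JOIN table2 b2 ON b2.parentid = b1.parentid AND b1.ts < b2.ts
    WHERE b2.parentid IS NULL
    ORDER BY a.id
""").fetchall()

print(by_max)        # [(1, 'new'), (2, 'only'), (3, None)]
print(by_anti_join)  # [(1, 'new'), (2, 'only'), (3, None)]
```

If two secondary records share the same latest timestamp for one parent, both queries return both of them, so a tie-breaker column is needed when exactly one row per parent is required.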