Left Outer Join of Same Table (self-join; not first table) Not Returning Null Values - sql

My LEFT OUTER JOIN clause on a table (not the first table) on itself (self-join too) is not returning null values, which skews my SELECT statements. The query is written (table names inconsequential):
Select
SUM(CASE WHEN table2.date = 'day' and table4.columnX = '5' then table3.value1 END),
SUM(CASE WHEN table2.date = 'day' and table4.columnX IS NULL then table3.value2 END)
FROM table1
INNER JOIN table2 on ...
INNER JOIN table3 on ...
LEFT OUTER JOIN table4 on table4.columnX = '5'
In which the first SUM(CASE WHEN) statement uses only one value in table4.columnX - when it is equal to '5' - and the second SUM(CASE WHEN) statement uses all other values - whenever it is not equal to '5'.
As it stands, the query is only returning results where table4.columnX='5', and not where table4.columnX is equal to everything else. As such, it appears the LEFT OUTER JOIN is not returning all null values for table4.columnX<>'5'. I think this may be because the join is written incorrectly. As a note, there are no fields in table4 that can be joined to fields in other tables, so it has to be a self-join of some sort (I believe). Help is appreciated - thank you!

What a LEFT OUTER JOIN will do is take all the rows from the "left table" and conditionally join rows from the "right table". If for a specific row in the "left table", there is no matching row in the "right table", then your "right data values" for the joined row will be NULL.
Your join condition is strange, though, since it doesn't reference another table. If that's really what you want, then you should change it to a CROSS JOIN:
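Old: LEFT OUTER JOIN table4 on table4.columnX = '5'
New: CROSS JOIN table4 ... WHERE table4.columnX = '5'
(Standard SQL does not allow an ON clause on a CROSS JOIN, although some databases such as MySQL accept one, so the filter moves to the WHERE clause.)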
This will join all rows in table4 where columnX = 5 to each row in your preceding query above. Not sure if that's what you're looking for...
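To see why the original LEFT OUTER JOIN never produces NULLs for the other values, here is a minimal sketch using Python's built-in sqlite3 module. The table left_rows is a hypothetical stand-in for the joined table1..table3, and the sample values are made up. Because the ON condition mentions only table4, every left row matches the '5' row; NULLs appear only when no table4 row satisfies the condition at all.

```python
import sqlite3

# Hypothetical stand-ins: left_rows plays the role of table1..table3 joined together.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE left_rows (id INTEGER);
    CREATE TABLE table4 (columnX TEXT);
    INSERT INTO left_rows VALUES (1), (2);
    INSERT INTO table4 VALUES ('5'), ('7'), ('9');
""")

query = """
    SELECT l.id, t.columnX
    FROM left_rows l
    LEFT OUTER JOIN table4 t ON t.columnX = '5'
    ORDER BY l.id
"""

# Every left row pairs with the '5' row; the '7' and '9' rows never show up.
rows_with_5 = conn.execute(query).fetchall()
print(rows_with_5)      # [(1, '5'), (2, '5')]

# Only when nothing in table4 matches does the outer join fill in NULLs.
conn.execute("DELETE FROM table4 WHERE columnX = '5'")
rows_without_5 = conn.execute(query).fetchall()
print(rows_without_5)   # [(1, None), (2, None)]
```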

Related

How to use UPPER keyword with the INNER JOIN in SQL?

I have a SQL query in which I want to use INNER JOIN with the UPPER keyword.
=> CASE A with INNER JOIN: I am doing INNER JOIN with the other table. Notice I have not written the complete query so you won't find INNER JOIN keyword below.
=> CASE B without INNER JOIN: I am not doing INNER JOIN with the other table. Notice I have not written the complete query.
1st query:
Case A with INNER JOIN: UPPER(hello_world.column_name1) AS COLUMN_NAME1,
CASE B without INNER JOIN: UPPER(column_name1) AS COLUMN_NAME1,
2nd query:
CASE A with INNER JOIN: UPPER(CASE WHEN hello_world.COLUMN_NAME2 IS NULL THEN 'Open' ELSE hello_world.COLUMN_NAME2 END) AS COLUMN_NAME2,
CASE B without INNER JOIN: UPPER(CASE WHEN COLUMN_NAME2 IS NULL THEN 'Open' ELSE COLUMN_NAME2 END) AS COLUMN_NAME2,
Problem Statement:
I am wondering if it's the right way to use INNER JOIN with the UPPER keyword (CASE A in both queries).
If it yields the results you expect, then yes. INNER JOIN is the act of joining two tables: based on a criterion, records from table1 are matched with records from table2, and each pair that meets the criterion yields a record of a third table, i.e. the result.
Your criterion needs to be boolean, that is, it evaluates to either true or false. In the formula of your criterion you may use the UPPER function, like
...
FROM A
JOIN B
ON UPPER(A.X) = UPPER(B.Y)
...
Yet, your UPPER(hello_world.column_name1) AS COLUMN_NAME1 strongly suggests that you are asking about the SELECT clause of a query which, based on your description, contains a join. UPPER can be used wherever it makes sense and is syntactically correct; qualifying the column with its table name, as in CASE A, avoids ambiguous column names once a second table is joined.
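As a rough illustration of both uses, the sketch below applies UPPER in the ON condition and in the SELECT list. It runs through Python's built-in sqlite3 module; the table other_table, its name column, and the sample rows are assumptions made up for the example, while hello_world and its columns come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE hello_world (column_name1 TEXT, column_name2 TEXT);
    CREATE TABLE other_table (name TEXT);  -- hypothetical second table
    INSERT INTO hello_world VALUES ('abc', NULL), ('def', 'closed');
    INSERT INTO other_table VALUES ('ABC'), ('DEF');
""")

# UPPER in the ON clause makes the match case-insensitive;
# UPPER in the SELECT list only changes how the output is displayed.
upper_rows = conn.execute("""
    SELECT UPPER(hello_world.column_name1) AS COLUMN_NAME1,
           UPPER(CASE WHEN hello_world.column_name2 IS NULL
                      THEN 'Open'
                      ELSE hello_world.column_name2 END) AS COLUMN_NAME2
    FROM hello_world
    INNER JOIN other_table
            ON UPPER(hello_world.column_name1) = UPPER(other_table.name)
    ORDER BY COLUMN_NAME1
""").fetchall()

print(upper_rows)  # [('ABC', 'OPEN'), ('DEF', 'CLOSED')]
```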

SQL Difference between Left join on... and Left Join on..where

I have two SQL queries that join two tables together:
select top 100 a.XXX
,a.YYY
,a.ZZZ
,b.GGG
,b.JJJ
from table_01 a
left join table_02 b
on a.XXX = b.GGG
and b.JJJ = 'abc'
and a.YYY between '01/08/2009 13:18:00' and '12/08/2009 13:18:00'
select top 100 a.XXX
,a.YYY
,a.ZZZ
,b.GGG
,b.JJJ
from table_01 a
left join table_02 b
on a.XXX = b.GGG
where b.JJJ = 'abc'
and a.YYY between '01/08/2009 13:18:00' and '12/08/2009 13:18:00'
The two queries return different results, but I don't understand why.
I would be grateful if I can get some help here.
Whenever you are using LEFT JOIN, all the conditions about the content of the right table should be in the ON clause, otherwise you are effectively converting your LEFT JOIN to an INNER JOIN.
The reason for that is that when a LEFT JOIN is used, all the rows from the left table will be returned. If they are matched by the right table, the values of the matching row(s) will be returned as well, but if they are not matched with any row on the right table, then the right table will return a row of null values.
Since comparing anything to NULL (even another NULL) never evaluates to true, a WHERE condition on a right-table column rejects every unmatched row, so you are basically telling your database to return only rows that are matched in both tables.
However, when the condition is in the ON clause, your database knows to treat it as a part of the join condition.
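The difference can be reproduced with a small sketch, run here through Python's built-in sqlite3 module. The table and column names follow the question, but the sample rows are invented and the date filter is left out to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_01 (XXX INTEGER);
    CREATE TABLE table_02 (GGG INTEGER, JJJ TEXT);
    INSERT INTO table_01 VALUES (1), (2), (3);
    INSERT INTO table_02 VALUES (1, 'abc'), (2, 'zzz');
""")

# Condition in ON: every table_01 row survives; non-matching rows get NULLs.
on_rows = conn.execute("""
    SELECT a.XXX, b.JJJ
    FROM table_01 a
    LEFT JOIN table_02 b ON a.XXX = b.GGG AND b.JJJ = 'abc'
    ORDER BY a.XXX
""").fetchall()

# Condition in WHERE: rows whose b.JJJ is NULL or differs are filtered out
# after the join, so the result looks like an INNER JOIN.
where_rows = conn.execute("""
    SELECT a.XXX, b.JJJ
    FROM table_01 a
    LEFT JOIN table_02 b ON a.XXX = b.GGG
    WHERE b.JJJ = 'abc'
    ORDER BY a.XXX
""").fetchall()

print(on_rows)     # [(1, 'abc'), (2, None), (3, None)]
print(where_rows)  # [(1, 'abc')]
```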

In SQL outer joins, what part of the query puts a table on the "left" or "right"?

I am reading books on outer joins and they reference the "position" of a table in determining whether all of its records will be displayed.
I am confused about how exactly the position (left/right) is determined?
If we consider the standard SQL join
Select * FROM
Table_A left outer join Table_B
on Table_A.ID = Table_B.Product_ID
What part of this query is determining the position of each Table?
Is it the join part:
Table_A left outer join Table_B
Where Table_A is on the "left" because it is left of the join word?
Or is it the "=" part:
on Table_A.ID = Table_B.Product_ID
Where Table_A is on the "left" because it is left of the "=" sign?
The join part makes the difference: the table named on the left side of LEFT OUTER JOIN is the leading one.
Table_A left outer join Table_B
As per the MSDN, left outer joins include all of the records from the first (left) of two tables, even if there are no matching values for records in the second (right) table.
It's the relative position of the table name in the SQL.
table1 left join table2
means table1 is the principal table, whereas in a right join table2 would be the principal table.
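A quick way to check this is to swap each part separately. The sketch below uses Python's built-in sqlite3 module with made-up rows: reversing the operands of "=" changes nothing, while reversing the tables around the JOIN keyword changes which table keeps all of its rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table_A (ID INTEGER);
    CREATE TABLE Table_B (Product_ID INTEGER);
    INSERT INTO Table_A VALUES (1), (2);
    INSERT INTO Table_B VALUES (2), (3);
""")

def run(sql):
    return conn.execute(sql).fetchall()

# Original form: Table_A is left of the JOIN keyword.
a_left = run("""
    SELECT Table_A.ID, Table_B.Product_ID
    FROM Table_A LEFT OUTER JOIN Table_B
      ON Table_A.ID = Table_B.Product_ID
    ORDER BY Table_A.ID
""")

# Operands of "=" swapped: same result, so "=" does not decide the side.
a_left_swapped_eq = run("""
    SELECT Table_A.ID, Table_B.Product_ID
    FROM Table_A LEFT OUTER JOIN Table_B
      ON Table_B.Product_ID = Table_A.ID
    ORDER BY Table_A.ID
""")

# Tables swapped around JOIN: now all Table_B rows are kept instead.
b_left = run("""
    SELECT Table_A.ID, Table_B.Product_ID
    FROM Table_B LEFT OUTER JOIN Table_A
      ON Table_A.ID = Table_B.Product_ID
    ORDER BY Table_B.Product_ID
""")

print(a_left)             # [(1, None), (2, 2)]
print(a_left_swapped_eq)  # [(1, None), (2, 2)]
print(b_left)             # [(2, 2), (None, 3)]
```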

What does "table A left outer join table B ON TRUE" mean?

I know conditions are used when joining tables. But I ran into a specific situation where the SQL code is written like "Table A join table B ON TRUE".
What will happen based on the "ON TRUE" condition? Is that just a total cross join without any condition selection?
Actually, the original expression is like:
Table A LEFT outer join table B on TRUE
Let's say A has m rows and B has n rows. Is there any conflict between "left outer join" and "on true"? Because it seems "on true" results a cross join.
From what I guess, the result will be m*n rows. So, it has no need to write "left outer join", just a "join" will give the same output, right?
Yes. That's the same thing as a CROSS JOIN.
In MySQL, we can omit the [optional] CROSS keyword. We can also omit the ON clause.
The condition in the ON clause is evaluated as a boolean, so we could also have written something like ON 1=1.
UPDATE:
(The question was edited, to add another question about a LEFT [OUTER] JOIN b which is different than the original construct: a JOIN b)
The "LEFT [OUTER] JOIN" is slightly different, in that rows from the table on the left side will be returned even when there are no matching rows found in the table on the right side.
As noted, a CROSS JOIN between tables a (containing m rows) and table b containing n rows, absent any other predicates, will produce a resultset of m x n rows.
The LEFT [OUTER] JOIN will produce a different resultset in the special case where table b contains 0 rows.
CREATE TABLE a (i INT);
CREATE TABLE b (i INT);
INSERT INTO a VALUES (1),(2),(3);
SELECT a.i, b.i FROM a LEFT JOIN b ON TRUE ;
Note that the LEFT JOIN will return rows from table a (a total of m rows) even when table b contains 0 rows.
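The same tables can be tried out with Python's built-in sqlite3 module, as a minimal sketch. It writes the always-true condition as ON 1=1 (the equivalent form mentioned above) and adds a CROSS JOIN for comparison; the two extra rows inserted into b are made up for the second half of the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (i INT);
    CREATE TABLE b (i INT);
    INSERT INTO a VALUES (1), (2), (3);
""")

left_sql = "SELECT a.i, b.i FROM a LEFT JOIN b ON 1=1 ORDER BY a.i, b.i"
cross_sql = "SELECT a.i, b.i FROM a CROSS JOIN b ORDER BY a.i, b.i"

# b is empty: the LEFT JOIN still returns the m rows of a, padded with NULLs,
# while the CROSS JOIN returns m x 0 = 0 rows.
left_empty = conn.execute(left_sql).fetchall()
cross_empty = conn.execute(cross_sql).fetchall()
print(left_empty)   # [(1, None), (2, None), (3, None)]
print(cross_empty)  # []

# Once b has rows, both forms return the same m x n combinations.
conn.execute("INSERT INTO b VALUES (10), (20)")
left_filled = conn.execute(left_sql).fetchall()
cross_filled = conn.execute(cross_sql).fetchall()
print(len(left_filled), left_filled == cross_filled)  # 6 True
```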
A cross join produces a cartesian product between the two tables, returning all possible combinations of all rows. It has no on clause because you're just joining everything to everything.
A cross join does not match rows up: if you have 100 rows in each table with a 1 to 1 match, you get 10,000 results, whereas an inner join will only return 100 rows in the same situation.
These 2 examples will return the same result:
Cross join
select * from table1 cross join table2 where table1.id = table2.fk_id
Inner join
select * from table1 join table2 on table1.id = table2.fk_id
Use the last method
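The 100-row scenario above can be checked with a short sketch using Python's built-in sqlite3 module. The table and column names come from the two example queries; the generated ids 1 to 100 are an assumption for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (id INTEGER)")
conn.execute("CREATE TABLE table2 (fk_id INTEGER)")

# 100 rows in each table with a 1 to 1 match on id = fk_id.
ids = [(i,) for i in range(1, 101)]
conn.executemany("INSERT INTO table1 VALUES (?)", ids)
conn.executemany("INSERT INTO table2 VALUES (?)", ids)

# Unfiltered cross join: every row paired with every row.
cross_count = conn.execute(
    "SELECT COUNT(*) FROM table1 CROSS JOIN table2"
).fetchone()[0]

# Cross join filtered in WHERE versus inner join with the condition in ON.
cross_filtered = conn.execute("""
    SELECT table1.id, table2.fk_id
    FROM table1 CROSS JOIN table2
    WHERE table1.id = table2.fk_id
    ORDER BY table1.id
""").fetchall()
inner_rows = conn.execute("""
    SELECT table1.id, table2.fk_id
    FROM table1 JOIN table2 ON table1.id = table2.fk_id
    ORDER BY table1.id
""").fetchall()

print(cross_count)                   # 10000
print(len(inner_rows))               # 100
print(cross_filtered == inner_rows)  # True
```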
The join syntax's general form:
SELECT *
FROM table_a
JOIN table_b ON condition
The condition is used to tell the database how to match rows from table_a to table_b, and would usually look like table_a.some_id = table_b.some_id.
If you just specify true, you will match every row from table_a with every row of table_b, so if table_a contains n rows and table_b contains m rows the result would have m*n rows.
Most(?) modern databases have a cleaner syntax for this, though:
SELECT *
FROM table_a
CROSS JOIN table_b
The difference between the pure cross join and the left join (where the condition is forced to be always true, as when using ON TRUE) shows up only when the right table is empty: the cross join then returns no rows, while the left join still returns each of the left table's rows next to NULLs where the right table's columns would have been.

Left Outer join and an additional where clause

I have a join on two tables defined as a left outer join so that all records are returned from the left hand table even if they don't have a record in the right hand table. However I also need to include a where clause on a field from the right-hand table, but.... I still want a row from the left-hand table to be returned for each record in the left-hand table even if the condition in the where clause isn't met. Is there a way of doing this?
Yes, put the condition (called a predicate) in the join conditions:
Select [stuff]
From TableA a
Left Join TableB b
On b.Pk = a.Pk
-- [Put your condition here, like this]
And b.Column = somevalue
The reason this works is that the query processor applies conditions in a WHERE clause after all joins are completed and the final result set has been constructed. So, at that point, any row from the outer side of the join that has NULL in a column you have established a predicate on will be excluded.
Predicates in a join clause are applied before the two result sets are "joined". At this point all the rows on both sides of the join are still there, so the predicate is effective.
You just need to put the predicate into the JOIN condition. Putting it into the WHERE clause would effectively convert your query to an inner join.
For Example:
...
From a
Left Join b on a.id = b.id and b.condition = 'x'
You can use
WHERE (right_table.column=value OR right_table.column IS NULL)
This will return rows from table 1 only where table 1 does not have a corresponding row in table 2 or the corresponding row in table 2 matches your criteria. A row of table 1 whose corresponding rows in table 2 all fail your criteria is dropped.
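The OR ... IS NULL form is not equivalent to putting the predicate in the ON clause, which the following sketch tries to show. It uses Python's built-in sqlite3 module; the table names left_table and right_table, the column names, and the sample rows are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE left_table (id INTEGER);
    CREATE TABLE right_table (id INTEGER, col TEXT);
    INSERT INTO left_table VALUES (1), (2), (3);
    INSERT INTO right_table VALUES (1, 'x'), (2, 'y');
""")

# Predicate in ON: all three left rows are kept.
on_pred_rows = conn.execute("""
    SELECT l.id, r.col
    FROM left_table l
    LEFT JOIN right_table r ON r.id = l.id AND r.col = 'x'
    ORDER BY l.id
""").fetchall()

# Predicate in WHERE with OR IS NULL: id 2 has a match ('y') that fails the
# test and is not NULL, so that left row disappears.
or_null_rows = conn.execute("""
    SELECT l.id, r.col
    FROM left_table l
    LEFT JOIN right_table r ON r.id = l.id
    WHERE (r.col = 'x' OR r.col IS NULL)
    ORDER BY l.id
""").fetchall()

print(on_pred_rows)  # [(1, 'x'), (2, None), (3, None)]
print(or_null_rows)  # [(1, 'x'), (3, None)]
```

Which form is right depends on whether a left row with only non-matching partners should still appear.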
SELECT x.fieldA, y.fieldB
FROM x
LEFT OUTER JOIN (select fieldb, fieldc from Y where condition = some_condition) y
ON x.fieldc = y.fieldc
select *
from table1 t1
left outer join table2 t2 on t1.id = t2.id
where t1.some_field = nvl(t2.some_field, t1.some_field)
UPD: errr... no. this way:
select *
from table1 t1
left outer join table2 t2 on t1.id = t2.id
where some_required_value = nvl(t2.some_field, some_required_value)
nvl is an Oracle function that returns its second argument when the first is null (which is common for outer joins). You can use ifnull or coalesce for other databases.
Thus, you compare t2.some_field with your search criteria if the join predicate was met; if it was not, you just return the row from table1, because some_required_value compared to itself will always be true (unless it is null, since null = null yields null, neither true nor false).
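A minimal sketch of this last query, using COALESCE in place of Oracle's nvl so that it runs under Python's built-in sqlite3 module. The sample rows and the find helper are made up for illustration; some_required_value is passed in as a bound parameter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER);
    CREATE TABLE table2 (id INTEGER, some_field TEXT);
    INSERT INTO table1 VALUES (1), (2), (3);
    INSERT INTO table2 VALUES (1, 'x'), (2, 'y');
""")

query = """
    SELECT t1.id, t2.some_field
    FROM table1 t1
    LEFT OUTER JOIN table2 t2 ON t1.id = t2.id
    WHERE ? = COALESCE(t2.some_field, ?)
    ORDER BY t1.id
"""

def find(some_required_value):
    # Hypothetical helper: binds the required value to both placeholders.
    return conn.execute(query, (some_required_value, some_required_value)).fetchall()

# id 1 matches 'x'; id 3 has no table2 row, so COALESCE falls back to 'x'.
# id 2 is dropped because its some_field is 'y'.
found_x = find("x")
print(found_x)     # [(1, 'x'), (3, None)]

# The caveat from the text: a NULL required value compares as NULL, not true.
found_null = find(None)
print(found_null)  # []
```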