I am just trying to understand the concept behind joining of 2 tables with an OR condition.
My requirement is: I need to join 2 tables Table1 [colA, colB] and Table2 [colX, colY] on columns Table1.colA = Table2.colB but if colA is NULL the condition should be Table1.colB = Table2.colY.
Do I need to do join them separately and then do union? Or is there a way I can do it in one join? Note that I have millions of records in both tables and its a left join and the tables reside in HIVE. I don't have a reproducible example, just trying to understand the concept.
While I'm not familiar with HiveQL, in SQL server this could be accomplished as follows:
SELECT *
FROM table1 t1
JOIN table2 t2
ON COALESCE(t1.cola, t1.colb) = CASE
WHEN t1.cola IS NULL THEN t2.coly
ELSE t2.colx
END
The logic should be fairly readable.
Translate your conditions directly:
SELECT *
FROM table1 t1 JOIN
table2 t2
ON (t1.cola = t2.colb) or
(t1.cola is null and t1.colb = t2.coly)
Usually, or is a performance killer in joins. This wold often be expressed using two separate left joins:
SELECT . . . , COALESCE(t2a.col, t2b.col) as col
FROM table1 t1 LEFT JOIN
table2 t2a
ON (t1.cola = t2.colb) LEFT JOIN
table2 t2b
ON t1.cola is null and t1.colb = t2.coly;
Related
I have three tables and I have to write one query to update table 1 row from table 3 and the only matching columns I have is in table 2.
Table 1 which has incorrect data:
Table 3 has the correct data:
I did try to write a query and execute it but it gives me an error saying there are too many rows too select which is true I do have many rows to correct but it still wouldn't correct. What do you think I should do. This is my query so far.
UPDATE Table1
SET Table1.Number = (SELECT Table3.Number
FROM Table2
FULL OUTER JOIN Table1 ON Table1.ID = Table2.ID
FULL OUTER JOIN Table3 ON Table3.Signin = Table2.Signin
WHERE (Table2.ID = Table1.ID)
AND (Table1.Number = 'xxx'))
WHERE (Tale1.Number = 'xxx')
In Where clause of JOIN query need to modify as multiple records are generating by inappropriate condition.Try to use Table3 components instead of using Table1 in joining query where clause.
UPDATE Table1
SET Table1.NUMBER = (SELECT table3.NUMBER FROM Table1 FULL OUTER JOIN Table2
ON Table1.ID = Table2.ID
FULL OUTER JOIN Table3
ON Table2.SIGNIN = Table3.SIGNIN
WHERE Table3.SIGNIN = 100) // This is the point where you need to modify your code
WHERE Table1.ID = 1;
ONLINE DEMO HERE
It actually worked after I removed this line from my query.
FULL OUTER JOIN Table1 ON table1.ID = Table2.ID
Thanks for the help.
You are fairly close. When doing the update though unless you are wanting to clear value for t1.number when a record is not matched in t3, you will want to use INNER JOIN. FULL OUTER JOIN would mean you are trying to update rows in t1 that don't exist but a LEFT JOIN you would update t1.number to NULL if a record in t3 doesn't exist.
UPDATE t1
SET t1.Number = t3.Number
FROM
Table1 t1
INNER JOIN Table2 t2
ON t1.Id = t2.Id
INNER JOIN Table3 t3
ON t2.Signin = t.3.Signin
WHERE
t1.number <> t3.number
--Or if you have nulls something like
--ISNULL(t1.number,'xxx') <> ISNULL(t3.number,'xxx')
-- if you only want to update when t1.number = 'xxx' then
--t1.number = 'xxx'
t1,t2,t3 are table aliases that I created by adding the alias after table name. By using join syntax rather than a sub select you simplify your were conditions. In sql-sever if more than 1 record in t2 & t3 match it will select one row randomly in the case of a one to many relationship. If you want a specific record when not one to one relation you can use window functions and common table expressions (cte) to limit t3 to the exact record you want to use.
This is somewhat of a followon to SQL JOIN where to place the WHERE condition?
I would prefer to use the USING clause on the join, but am not yet able to put the field value condition with the join.
SELECT 1
FROM table1 t1
JOIN table2 t2 USING(id) AND t2.field = 0;
ORA-00933: SQL command not properly ended
Is it possible to have USING and another condition as part of the JOIN clause?
You can use:
SELECT 1
FROM table1 t1
JOIN table2 t2 USING(id)
WHERE t2.field = 0;
Using USING(id) is like using ON t1.id = t2.id except that in the JOIN result instead of two columns t1.id & t2.id there is only one id column.
For INNER JOIN USING with a condition followed by an OUTER JOIN you need a subquery to keep the WHERE with the USING:
SELECT ...
FROM (SELECT id, ...
FROM table1 t1
JOIN table2 t2 USING(id)
WHERE t2.field = 0) s
LEFT JOIN ...;
For an OUTER JOIN USING with a condition you need a subselect:
SELECT ...
FROM table1 t1
LEFT JOIN (SELECT *
FROM table2 t2
WHERE t2.field = 0) t2
USING (id);
See this re ON/WHERE with JOIN. See this re ON/WHERE when mixing INNER & OUTER JOINs.
I tried these queries in our application. Each returned different result sets for me.
Query Set 1
SELECT *
FROM TABLE1 T1
LEFT OUTER JOIN TABLE2 T2 ON (T1.ID = T2.ID
AND T1.STATUS = 'A'
AND T2.STATUS = 'A')
INNER JOIN TABLE3 T3 ON (T2.ID = T3.ID)
WHERE T3.STATUS = 'A'
Query Set 2
SELECT *
FROM TABLE1 T1
LEFT OUTER JOIN TABLE2 T2 ON (T1.ID = T2.ID
AND T2.STATUS = 'A')
INNER JOIN TABLE3 T3 ON (T2.ID = T3.ID)
WHERE T3.STATUS = 'A'
AND T1.STATUS = 'A'
I couldn't find out why each query returns different outputs. Also please guide me about which approach is best when we use multiple joins (left, right, Inner) with Filtering clauses.
Thanks for any help
On the first
SELECT *
FROM TABLE1 T1
LEFT OUTER JOIN TABLE2 T2 ON (T1.ID = T2.ID
AND T1.STATUS = 'A'
AND T2.STATUS = 'A')
INNER JOIN TABLE3 T3 ON (T2.ID = T3.ID)
WHERE T3.STATUS = 'A'
AND T1.STATUS = 'A' has zero effect
It is a left join - you are gong to get all of T1 period
When you move AND T1.STATUS = 'A' to the where then it is applied
The difference in your code is the T1.STATUS = 'A' location.
Query 1:
You join the T1 and T2 tables on all the common IDS and only if both T1.STATUS = 'A' = T2.STATUS.
Query 2:
You join the T1 and T2 tables on all the common IDS and only if T2.STATUS = 'A'
Basically, query 1: you filter T1's data first and then you join.
query 2: you join the tables first and then you filter the data on the already joined table.
Also regarding the joins, usually the left and inner joins return the same results. Have a look here and here I found both links really useful.
Finally, my personal preference is to use inner joins unless I definitely need all the rows from either left or right table. I believe it makes my queries simpler to read and maintain.
I hope that helps.
You are putting a filter on the right table of a left join, creating an inner join. Your first query will return less results, whilst your second will have NULLS against non matching rows on the right table. This link helped me to understand joins much better, as I really struggled for a long time to grasp the concept. HERE
If my answer is not clear please ask me for a revision.
How can I give conditions for left join in NHibernate 2.0 HQL query.
Eg in SQL.
select t1.* from table1 t1
left join table t2 on t2.id = t1.id and t2.column2 = t1.column2
I tried the below HQL query, but got an exception "unexpected token: with"
select t1 from Table1 t1
left join t1.Table2 t2 with t2.column 2 = t1.column2
There is no need to use ON statement on (LEFT or any other) JOIN. That condition will be injected by mapping. There is the HQL doc:
14.3. Associations and joins
So, this HQL will work:
SELECT t1 from Entity1 AS t1
LEFT JOIN t1.ReferencePropertyToEntity2 t2
and the generated sql will be like this:
SELECT t1 from Table1 AS t1
LEFT JOIN Table2 t2
ON t1.ReferenceColumnToTable2 = t2.Table2_ID
But in case, that we want to do more restrictions, we can extend ON clause with more conditions - but they will be all applied with AND
SELECT t1 from Entity1 AS t1
LEFT JOIN t1.ReferencePropertyToEntity2 t2
WITH t2.IsActive = 1
OR t1.IsDeleted = 0
will result in
SELECT t1 from Table1 AS t1
LEFT JOIN Table2 t2
ON t1.ReferenceColumnToTable2 = t2.Table2_ID
AND (
t2.IsActive = 1 OR t1.IsDeleted = 0
)
So, in case that we want to use WITH to totally replace ON generated by mapping, we have to go different way - with CROSS JOIN:
NHibernate HQL Inner Join (SQL Server,Visual C#)
How to join Two tables of two non relashinship defined columns using Nhibernate QueryOver
I have three tables and two seperate SQL queries which are working correctly and I am having correct results.
If I try to join these three tables I am having null as result.
First query:
select T1.ID,T3.COMPANY
from T1,T3
where (T1.status!='CLOSED') and (T1.PRIORITY)>5 and T1.CLASSID=T3.CLASSID
Second query:
SELECT T1.ID, T2.DESCRIPTION
FROM T1
LEFT OUTER JOIN T2
ON T1.ID=T2.KEY
WHERE T1.status!='CLOSED'
AND (T2.CREATEDATE= (SELECT MAX(CREATEDATE)
FROM T2
WHERE T2.KEY=T1.ID))
I tried to join them but as result I am having null:
select T1.ID,T3.COMPANY,T2.DESCRIPTION
from T1
INNER JOIN T3 ON T1.CLASSID=T3.CLASSID
LEFT OUTER JOIN T2
ON T1.ID=T2.KEY
where (T1.status!='CLOSED') AND (T1.PRIORITY)>5
AND (T2.CREATEDATE= (SELECT MAX(CREATEDATE)
FROM T2
WHERE T2.KEY=T1.ID))
like it does not recognized last part for taking MAX value from T2 table.
What am I doing wrong? Thanks for help
Firstly, use an alias for the subquery on table T2.
T2.CREATEDATE =
(SELECT MAX(T2Alias.CREATEDATE)
FROM T2 AS T2Alias
WHERE T2Alias.KEY = T1.ID)
Secondly, consider moving this condition into the ON clause of the LEFT JOIN to table T2.
The first thing that jumps out at me is the new dependency on both T1.Priority > 5 and T2.CreateDate value being equal to the result of the inline query:
( AND (T1.PRIORITY) > 5
AND (T2.CREATEDATE =
(SELECT MAX(CREATEDATE) FROM T2 WHERE T2.KEY = T1.ID) )
Without the data it's difficult to check however this may be the issue