select a columns not exist in another tables multiple columns in sql - hive

I have two tables Tables1 with ID, name, and Table2 has ID1, ID2, and ID3, name1, name2, and name3.
I want to select table1.ID not exists in tables2: ID1, ID2, and ID3
select T1.ID,t1.name
from table1 t1
where not exists (
SELECT *
FROM table2 t2
where t1.ID=t2.ID1 or t1.ID=t2.ID2 or or t1.ID=t2.ID3 )
I get error message for this query

After a little research, I could find this. Basically, you can't have multiple columns for a subquery in a IN or a NOT IN condition in the WHERE clause. This is why your query is currently failing : Your subquery get all the columns from Table2.
From my understanding of your question, you want a select where the results would be the elements that are not existing in Table2.
To do this, you can simply use a LEFT OUTER JOIN. In SQL, I would left join on all three columns, but it seems like Hive does not support multiple conditions in JOIN statements, so you can use the following alternative :
SELECT T1.ID, T1.name
FROM Table1 T1
LEFT JOIN Table2 T2_1 ON T2_1.ID1 = T1.ID
LEFT JOIN Table2 T2_2 ON T2_2.ID2 = T1.ID
WHERE (T2_1.id IS NULL) AND -- the id of Table2 - T2_1
(T2_2.id IS NULL) -- the id of Table2 - T2_2
Just add as many LEFT JOIN and conditions in the WHERE clause as you have columns to check.
Here's a fiddle with this query concept (the data are not the same, but the concept is).

Join first by ID1, then resulted dataset join by ID2, then resulted dataset join by ID3:
select p2.ID, p2.name --pass3
from
(select p1.ID, p1.name --pass2
from
(SELECT T1.ID, T1.name --pass1
FROM Table1 T1
LEFT JOIN Table2 T2 ON T2.ID1 = T1.ID
where T2.ID1 is null --not in ID1
) p1 LEFT JOIN Table2 T2 ON T2.ID2 = p1.ID
where T2.ID1 is null --also not in ID2
) p2 LEFT JOIN Table2 T2 ON T2.ID3 = p2.ID
where T2.ID1 is null --also not in ID3
Joins on 2 and 3 steps will receive already reduced dataset from T1 and this solution may be good for big tables.

SELECT DISTINCT ID,NAME
FROM
(SELECT T1.ID, T1.name
FROM Table1 T1
LEFT OUTER JOIN Table2 T2 ON T2.ID1 = T1.ID
where T2.ID1 is null
union
SELECT T1.ID, T1.name
FROM Table1 T1
LEFT OUTER JOIN Table2 T2 ON T2.ID2 = T1.ID
where T2.ID2 is null
union
SELECT T1.ID, T1.name
FROM Table1 T1
LEFT OUTER JOIN Table2 T2 ON T2.ID3 = T1.ID
where T2.ID3 is null)JO

Related

SAS SQL inner join two columns able to match one column

How can I do something like select * from T1 inner join T2 on (T1.ID=T2.ID OR T1.ID2=T2.ID)
When I execute this code, it seems to fall in a infinity loop so I guess I'm wrong.
In other words, how can I match one of two columns from T1 to one column from T2
T1
ID ID2
1 10
2 20
T2
ID value
1 dummy10
20 dummy20
Result
ID ID2 value
1 10 dummy10
2 20 dummy20
Try to do like this:
select *
from T1, T2
where T1.ID = T2.ID or T1.ID2 = T2.ID
you can use 2 select statements with union, like this:
select
t1.ID,
t1.ID2,
t2.value
from Table1 as t1
inner join Table2 as t2 on t1.ID = t2.ID
UNION
select
t1.ID,
t1.ID2,
t2.value
from Table1 as t1
inner join Table2 as t2 on t1.ID2 = t2.ID
/* this will exclude values selected by other statement */
where t1.ID2 not in (select ID2 from Table1 inner join Table2 on Table1.ID = Table2.ID)
The only issue I can see with the code you provide is that you have not specified from which table you want the common column ID to be selected:
proc sql;
select
t1.*
,t2.value
from t1
inner join t2
on t1.id = t2.id or t1.id2 = t2.id;
quit;
Otherwise, your code should work. Perhaps the size of the data being joined is the problem?

How do I look for non-matching values across 3 SQL tables?

I'm looking to do what I believe is a double-nested check across three tables, but have no idea how to do so.
I have Table1, Table2, and Table3.
All are tied by an ID and a "Longform" and "Shortform" in Table1:
I'm trying to find:
Entries whose IDs appear in Table2 that have the same Longform as those in Table3, but don't share the same Shortform.
This is about as far as I've gotten:
SELECT T2.Longform,T2.Shortform FROM(
SELECT Table1.Longform,Table1.Shortform,Table1.ID FROM OuterTable1.Table1
LEFT JOIN OuterTable2.Table2 on Table1.ID = Table2.ID)
WHERE Table2.ID IS NOT NULL) T2
;
I know I'm probably going to have to do another nested select, or a join, on Outertable3.Table3 but I'm not sure which... Or where...
Any help appreciated as always.
Try the following:
Select *
(
Select T1.*
from T2
inner join T1
on T1.ID = T2.ID
) as Tab
inner join
(
Select T1.*
from T3
inner join T1
on T1.ID = T3.ID
) as Tab2
on Tab.id = Tab2.id
and Tab.Longform = Tab2.Longform
and Tab.Shortform <> Tab2.Shortform
To get the longform join table1 to table2 or table3. Then use EXISTS to check in a subquery if the IDs of table1 are different but the longform is equal.
SELECT *
FROM table2 t21
INNER JOIN table1 t11
ON t11.id = t21.id
WHERE EXISTS (SELECT *
FROM table3 t32
INNER JOIN table1 t12
ON t12.id = t32.id
WHERE t12.id <> t11.id
AND t12.longform = t11.longform);
Assuming ID is unique in all three tables
Select t2.id,t2.shortform, t1.shortform AS shortformTab1, t2.longform
FROM table2 t2
JOIN table3 t3
ON t2.id = t3.id AND t2.longform = t3.longform
JOIN table1 t1
ON t2.id = t1.id AND t2.shortform != t1.shortform

How to join these tables safely?

I have a table Table1. My application code reads from Table1 like this:
Select id, table2_id, table3_id from Table1
I would like it to also return values name from tables Table2 and Table3 by changing my query like this:
select t1.id, t1.table2_id, t1.table3_id, t2.name, t3.name
from table1 t1
left outer join table2 t2 on t1.table2_id = t2.id
left outer join table3 t3 on t1.table3_id = t3.id
However, I don't want to change the behavior of the original query, which returns 1 result per row in Table1.
I believe my changes to the query are safe because they t2.id and t3.id are unique columns, so Table2 and Table3 will contain at most 1 record for each Table3 record. If I was to user inner join, this changes would not be safe, because my query would return no results if Table2 or Table3 happen to not contain the expected record.
For this scenario, are my changes safe and correct? Or is it necessary to write some subqueries to join on?
Your query looks safe:
select t1.id, t1.table2_id, t1.table3_id, t2.name, t3.name
from table1 t1 left join
table2 t2
on t1.table2_id = t2.id left join
table3 t3
on t1.table3_id = t3.id;
You can also phrase this using correlated subqueries (or a lateral join):
select t1.*,
(select t2.name from table2 t2 where t1.table2_id = t2.id) as t2_name,
(select t2.name from table3 t3 where t1.table2_id = t3.id) as t3_name,
from table1 t1;
This is even more of a guarantee that there are no duplicates. If there were, the query would return an error.

Convert to join query

select t.* from table1 t where t.id NOT IN(
select Id from t2 where usrId in
(select usrId from t3 where sId=value));
I the result i need is like if there are matching id's in t1 and t2 then those id's should be omitted and only the remaining rows should be given to me. I tried converting into join but it is giving me the result i wanted. Below is my join query.
SELECT t.* FROM table1 t JOIN table2 t2 ON t.Id <> t2.Id
JOIN table3 t3 ON t3.Id=t2.Id WHERE t3.sId= :value
This doesn't feth me the correct result. it was returning all the rows, but i want to restrict the result based on the matching id's in table t1 and table t2. Matching id's should be ommited from the result.I will be passing the value for sId.
I believe this to be an accurate refactor of your query using joins. I don't know if we can do away with the subquery, but in any case the logic appears to be the same.
select t1.*
from table1 t1
left join
(
select t2.Id
from table2 t2
inner join table3 t3
on t2.usrId = t3.usrId
where t3.sId = <value>
) t2
on t1.Id = t2.Id
where t2.Id is null
Let's break down and solve problem step by step.
So your query
select t.* from table1 t where t.id NOT IN(
select Id from t2 where usrId in
(select usrId from t3 where sId=value));
on converting the inner query to JOIN will yield
select t.* from table1 t where t.id NOT IN
(SELECT T2.ID FROM T2 JOIN T3 on T2.UsrID =T3.UsrID and T3.sID=value)
which on further converting to JOIN with outer table will be
select t.* from table1 t LEFT JOIN
(SELECT T2.ID FROM T2 JOIN T3 on T2.UsrID =T3.UsrID and T3.sID=value)t4
ON t.id =T4.ID
WHERE t4.ID is NULL
In case you completely want to remove sub-query you can try like this
SELECT t.*
FROM table1 t
LEFT JOIN T2
ON T.ID=T2.ID
LEFT JOIN T3
ON T3.UsrId=T2.UsrID AND T3.sId=value
WHERE T3.UsrID IS NULL

Applying joins conditionally in SQL Server

I have some set of records, but now i have to select only those records from this set which have theeir Id in either of the two tables.
Suppose I have table1 which contains
Id Name
----------
1 Name1
2 Name2
Now I need to select only those records from table one
which have either their id in table2 or in table3
I was trying to apply or operator witin inner join like:
select *
from table1
inner join table2 on table2.id = table1.id or
inner join table3 on table3.id = table1.id.
Is it possible? What is the best method to approach this? Actually I am also not able to use
if exist(select 1 from table2 where id=table1.id) then select from table1
Could someone help me to get over this?
Use left join and then check if at least one of the joins has found a relation
select t1.*
from table1 t1
left join table2 t2 on t2.id = t1.id
left join table3 t3 on t3.id = t1.id
where t2.id is not null
or t3.is is not null
I would be inclined to use exists:
select t1.*
from table1 t1
where exists (select 1 from table2 t2 where t2.id = t1.id) or
exists (select 1 from table3 t3 where t3.id = t1.id) ;
The advantage to using exists (or in) over a join involves duplicate rows. If table2 or table3 have multiple rows for a given id, then a version using join will produce multiple rows in the result set.
I think the most efficient way is to use UNION on table2 and table3 and join to it :
SELECT t1.*
FROM table1 t1
INNER JOIN(SELECT id FROM Table2
UNION
SELECT id FROM Table3) s
ON(t.id = s.id)
Alternatively, you can use below SQL as well:
SELECT *
FROM dbo.Table1
WHERE id Table1.IN ( SELECT table2.id
FROM dbo.table2 )
OR Table1.id IN ( SELECT table3.id
FROM Table3 )