Checking to see if this is possible in Hive:
Select a.col1,b.col1
from tableA a join tableB b on a.col1 = b.col1
lateral view explode(numcred) tableA as creds
where creds.id = 9;
I cannot find the answer in the docs. In short:
I want to JOIN two tables AND apply LATERAL VIEW EXPLODE to tableA.
It seems simple enough, but it throws a syntax error.
select a.col1
,b.col1
from (Select a.col1
from tableA a
lateral view explode(numcred) e as creds
where e.creds.id = 9
) a
join tableB b
on a.col1 = b.col1
Not at my computer now, so no way to test this, but my guess is you'll have to write an inner query. Something like this:
SELECT
a.col1,
b.col1
FROM (
SELECT
dummy.col1
FROM tableA dummy
LATERAL VIEW EXPLODE(numcred) tableA as creds
WHERE
creds.id = 9
) a
JOIN tableB b
ON
a.col1 = b.col1
I have a query like this:
Query 1)
select A.col1, B.col2
from A
left join B on A.id= B.id and B.col3 = 'Hello';
I want to rewrite it to use a temp table for performance reasons (I need the result to be exactly the same):
Query 2.1 and 2.2
Select B.id, B.col2
into #temp
from B
where B.col3 ='Hello';
select A.col1, t.col2
from A
left join #temp AS t on A.id= t.id;
But my result is not the same (the temp table version has some NULLs in B.col2 where the first version does not).
For me, both queries return the same result.
Your first query uses a LEFT JOIN, which means all the rows from table A are returned. Try an inner join instead.
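One way to check whether the two versions really differ is to run both against a small data set. The sketch below uses Python's built-in sqlite3 module as a stand-in engine (an assumption: the question uses SQL Server's `SELECT ... INTO #temp`, which is replaced here by `CREATE TEMP TABLE ... AS`), with invented sample rows and a hypothetical temp table name `temp_b`.

```python
import sqlite3

def compare_left_join_with_temp_table():
    """Run Query 1 and the temp-table rewrite on the same sample data."""
    conn = sqlite3.connect(":memory:")
    # Invented sample data: id 2 has both a 'Bye' and a 'Hello' row in B,
    # id 3 has no row in B at all.
    conn.executescript("""
        CREATE TABLE A (id INTEGER, col1 TEXT);
        CREATE TABLE B (id INTEGER, col2 TEXT, col3 TEXT);
        INSERT INTO A VALUES (1, 'a1'), (2, 'a2'), (3, 'a3');
        INSERT INTO B VALUES (1, 'b1', 'Hello'), (2, 'b2', 'Bye'), (2, 'b2h', 'Hello');
    """)
    # Query 1: the filter on B sits inside the ON clause.
    query_1 = conn.execute("""
        SELECT A.col1, B.col2
        FROM A
        LEFT JOIN B ON A.id = B.id AND B.col3 = 'Hello'
        ORDER BY A.col1, B.col2
    """).fetchall()
    # Query 2.1: pre-filter B into a temp table (SQLite syntax).
    conn.execute(
        "CREATE TEMP TABLE temp_b AS SELECT id, col2 FROM B WHERE col3 = 'Hello'"
    )
    # Query 2.2: left join against the pre-filtered rows.
    query_2 = conn.execute("""
        SELECT A.col1, t.col2
        FROM A
        LEFT JOIN temp_b AS t ON A.id = t.id
        ORDER BY A.col1, t.col2
    """).fetchall()
    conn.close()
    return query_1, query_2

query_1_rows, query_2_rows = compare_left_join_with_temp_table()
print(query_1_rows == query_2_rows)
```

On this sample the two row lists are identical, which matches the comment that both queries give the same result. If they differ on real data, the cause is probably something outside the snippets shown in the question.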
Suppose you have two tables A and B and you are trying to write a JOIN query, is the following possible:
SELECT A.col1, B.col1
FROM A JOIN B on (A.col2 = B.col2 AND B.col3 = 'hello')
Will this return a table of col1 from table A and col1 from table B where there is a match in the second column across the tables and the third column of table B is 'hello'?
I.e., will it only return rows that match on col2, further reduced to the cases where col3 in table B is 'hello'?
Yes, you can.
The query below joins only the rows of B where col3 = 'hello' with A:
SELECT A.col1, B.col1
FROM A JOIN B on (A.col2 = B.col2 AND B.col3 = 'hello')
The query below joins all rows of B with A, then applies the WHERE filter to the joined result:
SELECT A.col1, B.col1
FROM A JOIN B on A.col2 = B.col2
WHERE B.col3 = 'hello'
Both give the same result when no other tables are joined.
Yes you can.
You can specify any kind of boolean condition in the ON clause.
The condition does not have to involve a column from each table (or any column at all), so all of the following are valid:
SELECT A.col1, B.col1 FROM A JOIN B on 1=1
SELECT A.col1, B.col1 FROM A JOIN B on B.col3 = 'hello'
SELECT A.col1, B.col1 FROM A JOIN B on (A.col2 = B.col2 AND B.col3 = 'hello')
SELECT A.col1, B.col1 FROM A JOIN B on (A.col2 = B.col2 AND B.col3 = A.col3)
SELECT A.col1, B.col1 FROM A LEFT JOIN B on (A.col3 = 'bye')
But note that if you limit the condition to key (indexed) fields, the optimizer can usually produce a much faster plan.
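To see what a condition that ignores one or both tables does, here is a minimal sketch using Python's built-in sqlite3 module (an assumption: SQLite stands in for whatever engine the question targets, and the sample rows are invented). `ON 1=1` pairs every row of A with every row of B, and `ON B.col3 = 'hello'` pairs every row of A with each matching row of B.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Invented sample data: 2 rows in A, 3 rows in B (2 of them 'hello').
conn.executescript("""
    CREATE TABLE A (col1 TEXT, col2 INTEGER);
    CREATE TABLE B (col1 TEXT, col2 INTEGER, col3 TEXT);
    INSERT INTO A VALUES ('a1', 1), ('a2', 2);
    INSERT INTO B VALUES ('b1', 1, 'hello'), ('b2', 2, 'bye'), ('b3', 3, 'hello');
""")

# ON 1=1 is always true, so the join degenerates into a cross product.
cross_rows = conn.execute(
    "SELECT A.col1, B.col1 FROM A JOIN B ON 1=1"
).fetchall()

# A condition on B alone: every A row is paired with every 'hello' row of B.
hello_rows = conn.execute(
    "SELECT A.col1, B.col1 FROM A JOIN B ON B.col3 = 'hello'"
).fetchall()

# A condition that ties the two tables together on col2.
keyed_rows = conn.execute(
    "SELECT A.col1, B.col1 FROM A JOIN B "
    "ON (A.col2 = B.col2 AND B.col3 = 'hello')"
).fetchall()
conn.close()

print(len(cross_rows), len(hello_rows), keyed_rows)
```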
For an inner join, these two statements are equivalent:
SELECT A.col1, B.col1
FROM A JOIN
B
ON A.col2 = B.col2 AND B.col3 = 'hello';
and:
SELECT A.col1, B.col1
FROM A JOIN
B
ON A.col2 = B.col2
WHERE B.col3 = 'hello';
Both should have the same execution plans as well.
Some people prefer putting filtering conditions in the WHERE clause, so the query is more clear about "conditions between tables" versus "filters on the result set". I tend to agree with this sentiment, although I'm not dogmatic about it.
OUTER JOINs are different. For an outer join, it makes a big difference where the conditions go. In that case, you generally do not have a choice, so you use ON or WHERE to get the logic that you want.
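The contrast between inner and outer joins can be checked on a small example. The sketch below is illustrative only: it uses Python's built-in sqlite3 module with invented sample rows, and runs the same query four ways (JOIN or LEFT JOIN, with the `B.col3 = 'hello'` filter placed in ON or in WHERE).

```python
import sqlite3

def run_join_variants():
    """Return the rows for each (join type, filter placement) combination."""
    conn = sqlite3.connect(":memory:")
    # Invented sample data: a1 matches a 'hello' row, a2 matches a 'bye' row.
    conn.executescript("""
        CREATE TABLE A (col1 TEXT, col2 INTEGER);
        CREATE TABLE B (col1 TEXT, col2 INTEGER, col3 TEXT);
        INSERT INTO A VALUES ('a1', 1), ('a2', 2);
        INSERT INTO B VALUES ('b1', 1, 'hello'), ('b2', 2, 'bye');
    """)
    template = """
        SELECT A.col1, B.col1
        FROM A {join_type} B ON A.col2 = B.col2 {placement} B.col3 = 'hello'
        ORDER BY A.col1
    """
    results = {}
    for join_type in ("JOIN", "LEFT JOIN"):
        # "AND" keeps the filter in the ON clause; "WHERE" moves it out.
        for placement in ("AND", "WHERE"):
            sql = template.format(join_type=join_type, placement=placement)
            results[(join_type, placement)] = conn.execute(sql).fetchall()
    conn.close()
    return results

variants = run_join_variants()
for key, rows in variants.items():
    print(key, rows)
```

With the inner join both placements return the same rows. With the left join, the ON version keeps `a2` with a NULL on the B side, while the WHERE version drops it, because the filter is applied after the unmatched row has been padded with NULLs.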
Is there a way to join 2 tables together on one and only one of the possible conditions? Joining on condition "a" or "b" could duplicate rows, but I'm looking to only join once. I came up with a potential solution, but I'm wondering if there is a more slick way to do it.
For example:
SELECT *
FROM TableA a
LEFT JOIN TableB b
ON a.col1 = b.col1
OR (a.col1 != b.col1 AND a.col2 = b.col2)
This would join the tables on col1 OR col2 BUT NOT BOTH. Is there a cleaner way of doing this?
Not more efficient, but I think it is clearer:
SELECT *
FROM TableA a
LEFT JOIN TableB b
ON (a.col1 = b.col1 or a.col2 = b.col2)
AND NOT (a.col1 = b.col1 and a.col2 = b.col2)
Your method works. If you only want one (or a handful) of columns from b, I would suggest:
SELECT a.*, COALESCE(b.col3, b2.col3)
FROM TableA a LEFT JOIN
TableB b
ON a.col1 = b.col1 LEFT JOIN
TableB b2
ON a.col1 <> b2.col1 AND a.col2 = b2.col2;
Removing the OR from the JOIN conditions allows the optimizer to generate a better execution plan.
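As a rough check that the two-join rewrite returns the same value as the OR version, here is a sketch using Python's built-in sqlite3 module. The sample rows and the contents of `col3` are invented, and SQLite is only a stand-in engine, so this says nothing about the execution plan on your database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Invented sample data: A row 1 matches B on col1, A row 2 matches B on
# col2 only, and A row 3 matches nothing.
conn.executescript("""
    CREATE TABLE TableA (col1 INTEGER, col2 INTEGER);
    CREATE TABLE TableB (col1 INTEGER, col2 INTEGER, col3 TEXT);
    INSERT INTO TableA VALUES (1, 10), (2, 20), (3, 30);
    INSERT INTO TableB VALUES (1, 99, 'by col1'), (7, 20, 'by col2'), (8, 88, 'no match');
""")

# Original form: a single LEFT JOIN with an OR in the condition.
or_rows = conn.execute("""
    SELECT a.col1, b.col3
    FROM TableA a
    LEFT JOIN TableB b
      ON a.col1 = b.col1
      OR (a.col1 != b.col1 AND a.col2 = b.col2)
    ORDER BY a.col1
""").fetchall()

# Rewrite: one LEFT JOIN per condition, then take the first non-NULL value.
coalesce_rows = conn.execute("""
    SELECT a.col1, COALESCE(b.col3, b2.col3)
    FROM TableA a
    LEFT JOIN TableB b ON a.col1 = b.col1
    LEFT JOIN TableB b2 ON a.col1 <> b2.col1 AND a.col2 = b2.col2
    ORDER BY a.col1
""").fetchall()
conn.close()

print(or_rows == coalesce_rows)
```

The two forms agree here because each row of TableA matches at most one row of TableB. If one row of TableA matched one TableB row on col1 and a different one on col2, the OR version would return two rows and the COALESCE version one, so it is worth checking which behavior you want.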
I need to select a set of data from a table (TableA), but only if it's not in another table (TableB).
SELECT thisData FROM dbo.TableA WHERE thisData is not existing in dbo.TableB
I'm not really well versed in SQL.
You can do EXCEPT:
SELECT thisData FROM dbo.TableA
except
SELECT thisData FROM dbo.TableB
Or, a more general solution, NOT EXISTS:
select * from dbo.TableA ta
where not exists (select 1 from dbo.TableB tb
where tb.thiscolumn1 = ta.thiscolumn1
[ and tb.thiscolumn2 = ta.thiscolumn2 etc]
)
You can use a not exists condition:
SELECT thisData
FROM dbo.TableA a
WHERE NOT EXISTS
(
SELECT *
FROM TableB b
WHERE a.thisData = b.thisData
)
Use a left join and a NULL check. This returns only the rows that don't exist in table B, and on some engines it can perform better than NOT EXISTS or multiple subselects (compare the execution plans to be sure).
SELECT a.thisData FROM dbo.TableA a
LEFT JOIN dbo.TableB b ON b.thisData = a.thisData
WHERE b.thisData IS NULL
Edit: if you need to compare multiple columns, you can achieve that as well
SELECT a.col1, a.col2, ... FROM dbo.TableA a
LEFT JOIN dbo.TableB b ON b.col1 = a.col1 AND b.col2 = a.col2 AND ...
WHERE b.col1 IS NULL
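The three approaches suggested above (EXCEPT, NOT EXISTS, and LEFT JOIN with a NULL check) can be compared side by side. This is a minimal sketch using Python's built-in sqlite3 module with invented sample values. The `dbo.` schema prefix is dropped because SQLite has no such schema, which is an assumption of the example rather than part of the original answers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Invented sample data: 2 and 4 exist in both tables.
conn.executescript("""
    CREATE TABLE TableA (thisData INTEGER);
    CREATE TABLE TableB (thisData INTEGER);
    INSERT INTO TableA VALUES (1), (2), (3), (4);
    INSERT INTO TableB VALUES (2), (4), (5);
""")

# Set difference. Note that EXCEPT also removes duplicate rows.
except_rows = conn.execute("""
    SELECT thisData FROM TableA
    EXCEPT
    SELECT thisData FROM TableB
    ORDER BY thisData
""").fetchall()

# Correlated subquery: keep A rows for which no matching B row exists.
not_exists_rows = conn.execute("""
    SELECT a.thisData
    FROM TableA a
    WHERE NOT EXISTS (SELECT 1 FROM TableB b WHERE b.thisData = a.thisData)
    ORDER BY a.thisData
""").fetchall()

# Anti-join: unmatched A rows have NULL on the B side of the LEFT JOIN.
left_join_rows = conn.execute("""
    SELECT a.thisData
    FROM TableA a
    LEFT JOIN TableB b ON b.thisData = a.thisData
    WHERE b.thisData IS NULL
    ORDER BY a.thisData
""").fetchall()
conn.close()

print(except_rows, not_exists_rows, left_join_rows)
```

All three return the same rows on this data. They can diverge when TableA contains duplicates (EXCEPT collapses them) or when the compared column contains NULLs.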
I have two tables, say A and B. I wish to compare three or more columns in both tables and to return any rows in table B that don't match all of the compared columns.
I've looked at doing a left join function from recommendations, but can't quite figure it out.
Please help!
You can use left join or not exists for this. Here is one method:
select b.*
from tableb as b
where not exists (select 1
from tablea as a
where a.col1 = b.col1 and a.col2 = b.col2 and a.col3 = b.col3
);
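To illustrate what the NOT EXISTS query returns, here is a small sketch using Python's built-in sqlite3 module. The sample rows are invented and the column types are assumptions; the query itself is the one shown above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Invented sample data: the first tableb row matches tablea on all three
# columns, the second differs only in col3, the third matches nothing.
conn.executescript("""
    CREATE TABLE tablea (col1 INTEGER, col2 TEXT, col3 TEXT);
    CREATE TABLE tableb (col1 INTEGER, col2 TEXT, col3 TEXT);
    INSERT INTO tablea VALUES (1, 'x', 'p'), (2, 'y', 'q');
    INSERT INTO tableb VALUES (1, 'x', 'p'), (2, 'y', 'DIFF'), (3, 'z', 'r');
""")

# Rows of tableb that have no tablea row equal on all three columns.
unmatched_rows = conn.execute("""
    SELECT b.*
    FROM tableb AS b
    WHERE NOT EXISTS (SELECT 1
                      FROM tablea AS a
                      WHERE a.col1 = b.col1 AND a.col2 = b.col2 AND a.col3 = b.col3)
    ORDER BY b.col1
""").fetchall()
conn.close()

print(unmatched_rows)
```

The partially matching row (different only in col3) is returned along with the row that matches nothing, which is the "don't match all of the compared columns" behavior asked for.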
How about something like this:
Select b.col1, b.col2, b.col3
from tableb b
left outer join tablea a
on ( b.col1 = a.col1 and b.col2 = a.col2 and b.col3 = a.col3 )
where a.col1 is null