Hive: Checking if a string from table 1 is present in a list of strings from table 2 while joining two tables

I am trying to join on whether a string (a column from table 1) is present in a list of strings (a column from table 2) in HiveQL. Can anyone please help me with the syntax?
SELECT
A.id
FROM tab1 A
inner join tab2 B
ON (
(array_contains(B.purchase_items, A.item_id) = true )
)
The above SQL does not work.

First, unless HiveQL is backwards, check the order of table name and alias. A query written as
SELECT A.ID FROM A tab1
fails because it declares a table named "A" with the alias "tab1", so A.ID no longer refers to anything. Either reverse the alias or correct the alias reference (I assume tab1 is the table name, so go with option 1):
SELECT A.ID from tab1 A
--OR
SELECT tab1.id from A tab1
Second, the ON clause is not an extra filter applied on top of a join; it is the condition that defines which row pairs the join produces.
For example:
SELECT A.ID
FROM tab1 A
INNER JOIN tab2 B
ON A.item_id = B.purchase_item
is equivalent to a cross join filtered by a WHERE condition:
SELECT A.ID
FROM tab1 A, tab2 B --better to use it straight as "FROM tab1 A cross join tab2 B"
WHERE a.item_id = b.purchase_item

You can use LEFT SEMI JOIN, which returns the rows of the left-hand table that have at least one match in the right-hand table.
SELECT A.id FROM tab1 A
LEFT SEMI JOIN tab2 B
ON A.col1 = B.col1 AND <any-other-join-cond>;
Note that the SELECT and WHERE clauses can’t reference columns from the right hand table.
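A further option for the array case: as far as I know, older Hive releases accept only equality conditions in ON, which is one plausible reason the array_contains join is rejected. A common workaround is to flatten the array with LATERAL VIEW explode and then join on equality. The sketch below keeps that HiveQL in a string (it is not executed here, and the aliases t and item are made up for illustration) and emulates the same logic in plain Python on invented sample rows.

```python
# HiveQL sketch (not executed here): flatten the array, then equi-join.
# The aliases "t" and "item" are illustrative names, not from the question.
HIVE_QUERY = """
SELECT DISTINCT A.id
FROM tab1 A
JOIN (
    SELECT item
    FROM tab2
    LATERAL VIEW explode(purchase_items) t AS item
) B
ON A.item_id = B.item
"""

# Plain-Python emulation of the same logic on made-up rows.
tab1 = [(1, "apple"), (2, "kiwi"), (3, "pear")]   # (id, item_id)
tab2 = [["apple", "plum"], ["pear"], ["apple"]]   # purchase_items arrays


def explode(arrays):
    """Yield one element per array entry, like LATERAL VIEW explode."""
    for items in arrays:
        for item in items:
            yield item


def matching_ids(left, right_arrays):
    """Return the distinct ids whose item_id appears in any right-side array."""
    exploded = set(explode(right_arrays))
    return sorted({row_id for row_id, item_id in left if item_id in exploded})


result = matching_ids(tab1, tab2)
print(result)
```

DISTINCT collapses repeated matches into one row per id; drop it if you want one output row per matching tab2 row, as the original inner join would produce.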

Related

H2 does not allow to execute select with join inside set

I want to fill a column in one table based on columns from a select with a left join over two other tables:
update TAB1 as P
set P.COL1 = (
select CODE from (
select * from TAB2 as A left outer join TAB3 as T on A.TAGID = T.ID
) as O
where P.ACTID = O.ACTID
);
It works properly on Oracle, but when I execute it on H2 I get this error:
Duplicate column name "ID"; SQL statement
I don't know where the problem is, and I couldn't find any solution for it.
Thanks for any answers.
This statement is your problem:
(select * from TAB2 as A left outer join TAB3 as T on A.TAGID = T.ID)
Presumably, you have an ID in both tables, so the SELECT * returns two columns named ID. I'm surprised this works in Oracle -- but perhaps Oracle optimizes the code because the IDs are not needed.
Just return the value you want:
(select ?.CODE from TAB2 as A left outer join TAB3 as T on A.TAGID = T.ID)
The question mark is either A or T, depending on which table the value comes from.
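To make the fix concrete, here is a minimal sketch that runs the rewritten UPDATE against an in-memory SQLite database through Python's sqlite3 module. SQLite stands in for H2 only to show the shape of the statement. The sample rows are invented, and the sketch assumes CODE lives in TAB3 (the answer leaves open which table it comes from). SQLite does not accept a qualified column in SET, so the target column is written unqualified.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tab1 (actid INTEGER, col1 TEXT);
CREATE TABLE tab2 (id INTEGER, actid INTEGER, tagid INTEGER);
CREATE TABLE tab3 (id INTEGER, code TEXT);
INSERT INTO tab1 VALUES (10, NULL), (20, NULL);
INSERT INTO tab2 VALUES (1, 10, 100), (2, 20, 999);
INSERT INTO tab3 VALUES (100, 'X');
""")

# The derived table lists only the columns it needs, so the two ID
# columns never appear side by side and no name clash can occur.
conn.execute("""
UPDATE tab1
SET col1 = (
    SELECT o.code
    FROM (
        SELECT a.actid AS actid, t.code AS code
        FROM tab2 AS a
        LEFT OUTER JOIN tab3 AS t ON a.tagid = t.id
    ) AS o
    WHERE o.actid = tab1.actid
)
""")

rows = conn.execute("SELECT actid, col1 FROM tab1 ORDER BY actid").fetchall()
print(rows)
```

The row with ACTID 20 has no matching tag, so the left join yields NULL for its code and COL1 stays NULL.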

Join two tables where one field contains another field

How can I join two tables if one field contains another field? Example:
On table A I have a field with data '000;111;222' and on table B I have a field with data '111'.
I want to join like this:
select * from A join B on A.field contains B.field
You could do:
select a.*, b.*
from a
inner join b on b.field = any(string_to_array(a.field, ';'))
The join condition turns a.field to an array, then checks if it contains b.field.
Perhaps you are giving string_to_array the wrong parameters. As an alternative, you can use the POSITION function to check for a substring match.
with table_a (acol) as ( values('000;111;222'),('000;xxx;222') )
, table_b (bcol) as ( values ('111'),('xxx'),('000'),('123') )
select *
from table_a
join table_b on POSITION(bcol in acol) > 0;
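One caveat with plain substring matching is that '111' is also found inside '1113'. A common guard is to wrap both sides in the delimiter before searching. The sketch below shows the difference on an in-memory SQLite database via Python's sqlite3 module; SQLite's instr(haystack, needle) plays the role of POSITION here, and the second row of table_a is invented to trigger the false positive.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table_a (acol TEXT);
CREATE TABLE table_b (bcol TEXT);
INSERT INTO table_a VALUES ('000;111;222'), ('000;1113;222');
INSERT INTO table_b VALUES ('111');
""")

# Plain substring search: also matches '1113'.
naive = conn.execute("""
SELECT acol
FROM table_a
JOIN table_b ON instr(acol, bcol) > 0
""").fetchall()

# Delimiter-wrapped search: only whole list entries match.
exact = conn.execute("""
SELECT acol
FROM table_a
JOIN table_b ON instr(';' || acol || ';', ';' || bcol || ';') > 0
""").fetchall()

print(len(naive), exact)
```

The string_to_array approach compares whole entries already, so it does not need this guard.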

Impala: duplicate table alias when trying joining on multiple columns

I want to left outer join table A and table B on multiple columns. Below is my code:
select * from table_A
left outer join table_B
on (table_A.a1 = table_B.b1)
left outer join table_B
on (table_A.a2 = table_B.b2)
But then I got error:
HiveServer2Error: AnalysisException: Duplicate table alias: 'table_B'
Does anyone know what I did wrong here? Thanks!
Use different table aliases as you are joining the same table twice.
select * -- use column names here instead of *
from table_A ta
left outer join table_B tb1 on (ta.a1 = tb1.b1)
left outer join table_B tb2 on (ta.a2 = tb2.b2)
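Note that joining table_B twice under two aliases is not the same as one join on both columns. If the goal is a single match on (a1, a2), a single ON with AND may be what you want. The sketch below contrasts the two forms on an in-memory SQLite database via Python's sqlite3 module; the label column and the sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table_A (a1 INTEGER, a2 INTEGER);
CREATE TABLE table_B (b1 INTEGER, b2 INTEGER, label TEXT);
INSERT INTO table_A VALUES (1, 2);
INSERT INTO table_B VALUES (1, 2, 'both'), (1, 9, 'b1 only'), (7, 2, 'b2 only');
""")

# Two aliases: each join matches independently, so the row counts multiply.
two_aliases = conn.execute("""
SELECT tb1.label, tb2.label
FROM table_A ta
LEFT OUTER JOIN table_B tb1 ON ta.a1 = tb1.b1
LEFT OUTER JOIN table_B tb2 ON ta.a2 = tb2.b2
""").fetchall()

# One join on both columns: only rows of table_B matching a1 AND a2.
single_join = conn.execute("""
SELECT tb.label
FROM table_A ta
LEFT OUTER JOIN table_B tb ON ta.a1 = tb.b1 AND ta.a2 = tb.b2
""").fetchall()

print(len(two_aliases), single_join)
```

With this data the two-alias form returns four rows (two matches on b1 times two matches on b2), while the single join returns only the row that matches on both columns.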

Separate JOIN from the main INNER JOINs

I want to INNER JOIN some tables and then add a condition where the entries of one table depend on another table (one that was not joined with the others).
Something like this:
SELECT * FROM TABLE_A AS a
INNER JOIN TABLE_B AS b ON b.id_b=a.id_a
INNER JOIN TABLE_C AS c ON c.id_c=b.id_b
Now I want to add a condition (possibly a "WHERE" clause) that only selects the values in a field in TABLE_C that match another condition, the existence of a value in a field in TABLE_D
Possible statement:
WHERE c.code=d.another_code AND d.reg_number LIKE '999%'
How do I declare TABLE_D in the query, since I do not want to join it with the others?
In other words, I want to intersect 3 sets (A,B,C) and the other one (set D) is intersected only with set C
The title of the question Run-time error '13': ... doesn't seem to match the content so I'll just answer the SQL part.
Maybe this is what you want?
SELECT * FROM TABLE_A AS a
INNER JOIN TABLE_B AS b ON b.id_b=a.id_a
INNER JOIN TABLE_C AS c ON c.id_c=b.id_b
WHERE c.code = -- or possibly IN instead of =
(SELECT another_code FROM TABLE_D WHERE reg_number LIKE '999%')
If the subquery can return multiple rows, you need to use WHERE c.code IN instead of WHERE c.code =.
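For reference, here is a small runnable sketch of the subquery approach on an in-memory SQLite database via Python's sqlite3 module. It assumes, as in the question, that the filter applies to reg_number in TABLE_D; the sample rows are invented. It shows both an IN subquery and an equivalent correlated EXISTS, neither of which adds TABLE_D to the join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table_a (id_a INTEGER);
CREATE TABLE table_b (id_b INTEGER);
CREATE TABLE table_c (id_c INTEGER, code TEXT);
CREATE TABLE table_d (another_code TEXT, reg_number TEXT);
INSERT INTO table_a VALUES (1), (2);
INSERT INTO table_b VALUES (1), (2);
INSERT INTO table_c VALUES (1, 'K1'), (2, 'K2');
INSERT INTO table_d VALUES ('K1', '99901'), ('K1', '99902'), ('K2', '12345');
""")

BASE = """
SELECT a.id_a, c.code
FROM table_a AS a
INNER JOIN table_b AS b ON b.id_b = a.id_a
INNER JOIN table_c AS c ON c.id_c = b.id_b
"""

# IN: safe even though two rows of table_d carry the code 'K1'.
in_rows = conn.execute(BASE + """
WHERE c.code IN (
    SELECT another_code FROM table_d WHERE reg_number LIKE '999%'
)
""").fetchall()

# EXISTS: the same filter written as a correlated subquery.
exists_rows = conn.execute(BASE + """
WHERE EXISTS (
    SELECT 1 FROM table_d AS d
    WHERE d.another_code = c.code AND d.reg_number LIKE '999%'
)
""").fetchall()

print(in_rows, exists_rows)
```

Because TABLE_D is only consulted inside the subquery, its two matching rows do not duplicate the result row, which an ordinary join on TABLE_D would do.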

Oracle SQL WITH clause select joined column

SQL:
WITH joined AS (
SELECT *
FROM table_a a
JOIN table_b b ON (a.a_id = b.a_id)
)
SELECT a_id
FROM joined
returns invalid identifier.
How can I select the joined column when using a WITH clause? I have tried aliases and prefixing, and nothing worked. I know I can use:
WITH joined AS (
SELECT a.a_id
FROM table_a a
JOIN table_b b ON (a.a_id = b.a_id)
)
SELECT a_id
FROM joined
but I need this alias to cover all fields.
The only way I managed to meet this condition is:
WITH joined AS (
SELECT a.a_id a_id_alias, a.*, b.*
FROM table_a a
JOIN table_b b ON (a.a_id = b.a_id)
)
SELECT a_id_alias
FROM joined
but it is not a perfect solution...
You can use the effect of the USING clause when joining the tables.
When you join tables where the join columns have the same name (as is the case in your example), the USING clause returns the join column only once, so the following works:
with joined as (
select *
from table_a a
join table_b b using (a_id)
)
select a_id
from joined;
SQLFiddle example: http://sqlfiddle.com/#!4/e7e099/2
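If you want to try the USING behavior without an Oracle instance, the sketch below runs the same query shape on an in-memory SQLite database via Python's sqlite3 module. SQLite also drops the right-hand copy of a USING column from SELECT *, so it illustrates the idea, but it is not a test of Oracle itself. The a_val and b_val columns and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table_a (a_id INTEGER, a_val TEXT);
CREATE TABLE table_b (a_id INTEGER, b_val TEXT);
INSERT INTO table_a VALUES (1, 'a1'), (2, 'a2');
INSERT INTO table_b VALUES (1, 'b1');
""")

# With USING, SELECT * exposes a_id once, so the outer query can name it.
rows = conn.execute("""
WITH joined AS (
    SELECT *
    FROM table_a a
    JOIN table_b b USING (a_id)
)
SELECT a_id, a_val, b_val
FROM joined
""").fetchall()

# Inspect the column names produced by the join itself.
cursor = conn.execute("SELECT * FROM table_a a JOIN table_b b USING (a_id)")
columns = [d[0] for d in cursor.description]

print(rows, columns)
```

This only helps for the join columns; any other column name shared by both tables would still appear twice and need an alias.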
I don't think you can do this without aliases. The result of the "joined" query has two fields, both named a_id. Unless you alias one (or both), as you did in your final query, the outer query has no idea which a_id you are referring to.
Why is your final query not a "perfect" solution?
You can probably use alias as below:
WITH JOINED AS (
SELECT A.A_ID A_A_ID, B.A_ID B_A_ID,
A.FIELDNAME1 A_FIELDNAME1, A.FIELDNAME2 A_FIELDNAME2, A.FIELDNAME_N A_FIELDNAME_N,
B.FIELDNAME1 B_FIELDNAME1, B.FIELDNAME2 B_FIELDNAME2, B.FIELDNAME_N B_FIELDNAME_N
FROM TABLE_A A
JOIN TABLE_B B ON (A.A_ID = B.A_ID)
)
SELECT A_A_ID, B_A_ID
FROM JOINED
It is always good practice to avoid using SELECT *.