NOT IN subquery with HiveQL returning NullPointerException null - hive

I'm trying to run a query in Hive that inlines two arrays from the same table and effectively takes their difference using the NOT IN operator:
select c1 from t1
lateral view inline(m1) m1
where m1.key = 'x'
AND t1.c1 NOT IN
(
select c1 from t1
lateral view inline(m2) m2
where m2.key = 'y'
);
The above query returns
FAILED: NullPointerException null

First collect every c1 that has the key 'y' in a CTE, then exclude those values in the main query:
with temp as
(select distinct c1 from t1
lateral view inline(m2) m2
where m2.key = 'y')
select c1 from t1
lateral view inline(m1) m1
where m1.key = 'x'
and c1 not in (select c1 from temp);

Related

Can we use correlated sub-query in the group clause?

I have a table t2 with a field Col whose records are a, b, c, d, e. I'm trying to get output like:
r1 0 1
1 b a
2 d c
4 e
When I use the query below, I get an error: Syntax error in the expression (((Select Count(b.Col)+1 from t2 as b where a.col>b.col)+1)\2
Transform
first(col) as col1
Select ((Select Count(b.Col)+1 from t2 as b where a.col>b.col)+1)\2 as r1
From t2 as a
Group by (((Select Count(b.Col)+1 from t2 as b where a.col>b.col)+1)\2)
Pivot
(Select Count(b.Col)+1 from t2 as b where a.col>b.col) MOD 2
I don't think so. Just use a subquery:
select r1
from (select a.*,
((Select Count(b.Col)+1 from t2 as b where a.col>b.col)+1)\2 as r1
from t2 as a
) as a1
group by r1;
Or, because you are only selecting distinct values, use select distinct rather than group by in the original query.

Oracle - inner and left join

create table t1 (v varchar2(500), n number);
Insert into T1 (V,N) values ('a',1);
Insert into T1 (V,N) values ('bb',2);
Insert into T1 (V,N) values ('c',3);
Insert into T1 (V,N) values ('d',4);
create table t2 (v varchar2(500), n number);
Insert into T2 (V,N) values ('a',1);
Insert into T2 (V,N) values ('bb',2);
select * from t1 join t2 on t1.v = t2.v
union all
select * from t1 LEFT join t2 on t1.v = t2.v ;
Output:
a 1 a 1
bb 2 bb 2
a 1 a 1
bb 2 bb 2
d 4 (null) (null)
c 3 (null) (null)
Can we get the same output from a single scan of T1 and T2, i.e. from a single query without UNION ALL? I want to rewrite the SELECT above so that it scans T1 and T2 only once and gives the same result. Note the LEFT join.
The output can't be changed because it is passed on to the application; the duplicate rows are required.
"Want to re-write the above Select query so that it scans the tables T1 and T2 only once"
You could use subquery factoring. The WITH clauses read each table once and the UNION-ed queries read from them:
with cte1 as ( select * from t1 )
, cte2 as ( select * from t2 )
select * from cte1 join cte2 on cte1.v = cte2.v
union all
select * from cte1 LEFT join cte2 on cte1.v = cte2.v ;
Here is a SQL Fiddle demo.
You can avoid excess joins and unions by doubling the rows:
select t1.*,t2.* from t1
left join t2 on t1.v=t2.v
cross join (select 1 as dbl from dual
union select 2 as dbl from dual) dbl
where dbl=1 or t2.v is not null
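To check that the row-doubling query really returns the same rows as the UNION ALL version, here is a minimal sketch using Python's built-in sqlite3 module instead of Oracle. It assumes SQLite's join semantics are close enough for this comparison. SQLite has no dual table, so the two-row helper is written as a plain SELECT ... UNION ALL, and the names build_db and row_counts are hypothetical helpers introduced only for this illustration.

```python
import sqlite3
from collections import Counter

def build_db():
    """Create an in-memory database holding the sample rows from the question."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        create table t1 (v text, n integer);
        insert into t1 values ('a',1),('bb',2),('c',3),('d',4);
        create table t2 (v text, n integer);
        insert into t2 values ('a',1),('bb',2);
    """)
    return conn

# The original query: inner join plus left join, glued with UNION ALL.
UNION_QUERY = """
    select * from t1 join t2 on t1.v = t2.v
    union all
    select * from t1 left join t2 on t1.v = t2.v
"""

# The doubling trick: every left-join row is produced twice, and the
# second copy is kept only when the row actually matched t2.
DOUBLED_QUERY = """
    select t1.*, t2.*
    from t1
    left join t2 on t1.v = t2.v
    cross join (select 1 as dbl union all select 2 as dbl) d
    where d.dbl = 1 or t2.v is not null
"""

def row_counts(conn, sql):
    """Return a multiset of result rows so duplicates are compared too."""
    return Counter(conn.execute(sql).fetchall())

conn = build_db()
print(row_counts(conn, UNION_QUERY) == row_counts(conn, DOUBLED_QUERY))
```

Comparing multisets rather than lists matters here because the duplicates are part of the required output while the row order is not.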

Filter values if even one row contains any value from another table

I have table1
c1 c2
1 a
1 b
1 c
2 a
3 b
and table2
c3
a
h
y
I need to filter out every c1 if even one of its c2 values matches any c3 from table2
result should be
c1
3
So far I tried
with cte as(
select c1, collect_set(c2) as c2
from table1
group by c1
)
but I can't join it with table2 in a way that lets me filter out the rows I don't need. For example, with
select c1
from cte
cross join table2
I could filter rows like
1 (a, b, c) a
but not
1 (a, b, c) x
and in the end I would even get
2 (a) x
which I don't need at all.
I also thought about concatenating
select c1, concat_ws(',', c2)
and using like '%c3%', but c3 is a column with many values and not some string.
NOT EXISTS wouldn't work either.
Is there a way to do it?
I think you want something like this:
select t1.c1
from table1 t1 left join
table2 t2
on t1.c2 = t2.c3
group by t1.c1
having count(t2.c3) = 0;
Please check out Gordon's answer for a more appropriate and better solution.
SELECT [c1]
FROM #table1
WHERE [c1] NOT IN ( SELECT [c1]
FROM #table1
INNER JOIN #table2 ON c2 = c3 );
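The LEFT JOIN ... HAVING COUNT(...) = 0 approach can be tried on the sample data from the question with a small sketch in Python's built-in sqlite3 module. The original thread targets Hive and SQL Server, so treating SQLite as a stand-in is an assumption, and the function name c1_without_matches is hypothetical.

```python
import sqlite3

def c1_without_matches(rows, blocked):
    """Return every c1 for which none of its c2 values appears in blocked."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table table1 (c1 integer, c2 text)")
    conn.execute("create table table2 (c3 text)")
    conn.executemany("insert into table1 values (?, ?)", rows)
    conn.executemany("insert into table2 values (?)", [(b,) for b in blocked])
    # count(t2.c3) ignores the NULLs produced by unmatched left-join rows,
    # so a count of 0 means no c2 of this c1 matched anything in table2.
    sql = """
        select t1.c1
        from table1 t1
        left join table2 t2 on t1.c2 = t2.c3
        group by t1.c1
        having count(t2.c3) = 0
        order by t1.c1
    """
    return [r[0] for r in conn.execute(sql)]

sample = [(1, "a"), (1, "b"), (1, "c"), (2, "a"), (3, "b")]
print(c1_without_matches(sample, ["a", "h", "y"]))  # prints [3]
```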

SQL recursive CTE

I have a table in which records are chained through a column, and I need to find the last value linked to each original record.
select rec1, val1 from table1:
rec1 val1
a1 t1
t1 t2
t2 null
a2 t7
t7 null
There are essentially 2 original records in this table (a1, a2). I need to associate t2 with a1 in my sql query since the link is based on val1 column (a1 -> t1 -> t2) until val1 is null. The record a2 is linked to t7 only since there is no further linkage for t7 (a2 -> t7).
I hope there is a 'simple' way to accomplish this. I have tried but am unable to make much progress.
Thanks
Here is a recursive CTE formulation. This version assumes no loops and that you don't have more than 100 links in the chain:
with cte as (
select rec1, val1, 1 as lev
from table1 t1
where not exists (select 1 from table1 tt1 where tt1.val1 = t1.rec1)
union all
select cte.rec1, t1.val1, cte.lev + 1 as lev
from cte join
table1 t1
on t1.rec1 = cte.val1
where t1.val1 is not null
)
select *
from (select cte.*, max(lev) over (partition by rec1) as maxlev
from cte
) cte
where maxlev = lev;
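For readers who want to run the chain-following idea, here is a minimal sketch using Python's built-in sqlite3 module. It is an adaptation, not the answer's exact query: it assumes SQLite's WITH RECURSIVE syntax, joins each step on rec1 = the previous val1, stops before a null val1, and picks the deepest level with a correlated subquery instead of a window function. The function name chain_ends is hypothetical.

```python
import sqlite3

def chain_ends(links):
    """Return (start record, last non-null value) for every chain in links."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table table1 (rec1 text, val1 text)")
    conn.executemany("insert into table1 values (?, ?)", links)
    sql = """
        with recursive cte(rec1, val1, lev) as (
            -- anchor: records that no other record points to
            select rec1, val1, 1
            from table1 t1
            where not exists (select 1 from table1 p where p.val1 = t1.rec1)
            union all
            -- step: follow val1 to the next record, stopping before a null
            select cte.rec1, t1.val1, cte.lev + 1
            from cte
            join table1 t1 on t1.rec1 = cte.val1
            where t1.val1 is not null
        )
        select c.rec1, c.val1
        from cte c
        where c.lev = (select max(c2.lev) from cte c2 where c2.rec1 = c.rec1)
        order by c.rec1
    """
    return conn.execute(sql).fetchall()

sample = [("a1", "t1"), ("t1", "t2"), ("t2", None), ("a2", "t7"), ("t7", None)]
print(chain_ends(sample))  # prints [('a1', 't2'), ('a2', 't7')]
```

Like the answer, this sketch assumes the data contains no loops; a cycle would make the recursion run without end.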

SQL : Filtering with multiple columns in a subquery

I want to select all the rows from one table that are not present in the ID column of another table.
For example my Table1 has below structure :
C1 C2 C3
-- -- --
1 A Z
2 B Y
3 C X
My Table2 Looks like :
D1 D2
-- --
1 A
2 Y
3 X
My working query looks something like :
select * from Table1
where C2 NOT IN (Select D2 from Table2);
This works fine, but if I want to filter on the combination of both columns (i.e. D1 and D2), I can't write the query as:
select * from Table1
where (C1,C2) NOT IN (Select (D1,D2) from Table2);
Can anyone help me rectify the above query?
Use NOT EXISTS:
SELECT t.* from Table1 t
WHERE NOT EXISTS
(
SELECT 1 FROM Table2 t2
WHERE t.C1 = t2.D1
AND t.C2 = t2.D2
)
Result:
C1 C2 C3
2 B Y
3 C X
Here's a Demo: http://sqlfiddle.com/#!3/81fdd/4/0
NOT EXISTS has fewer issues than NOT IN anyway:
Should I use NOT IN, OUTER APPLY, LEFT OUTER JOIN, EXCEPT, or NOT EXISTS?
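One well-known issue is how NOT IN behaves when the subquery returns a NULL. The sketch below, written with Python's built-in sqlite3 module as a stand-in for the original database, runs the single-column NOT IN query and an equivalent NOT EXISTS query side by side. The function name compare_not_in_and_not_exists and the extra row (4, NULL) are assumptions added for illustration; they are not part of the question's data.

```python
import sqlite3

def compare_not_in_and_not_exists(table2_rows):
    """Return the C1 values kept by NOT IN and by NOT EXISTS, in that order."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table Table1 (C1 integer, C2 text, C3 text)")
    conn.execute("create table Table2 (D1 integer, D2 text)")
    conn.executemany("insert into Table1 values (?, ?, ?)",
                     [(1, "A", "Z"), (2, "B", "Y"), (3, "C", "X")])
    conn.executemany("insert into Table2 values (?, ?)", table2_rows)
    not_in = conn.execute(
        "select C1 from Table1 "
        "where C2 not in (select D2 from Table2) order by C1"
    ).fetchall()
    not_exists = conn.execute(
        "select C1 from Table1 t where not exists "
        "(select 1 from Table2 t2 where t2.D2 = t.C2) order by C1"
    ).fetchall()
    return [r[0] for r in not_in], [r[0] for r in not_exists]

clean = [(1, "A"), (2, "Y"), (3, "X")]
print(compare_not_in_and_not_exists(clean))                # prints ([2, 3], [2, 3])
# A single NULL in D2 makes every NOT IN comparison unknown, so no row passes.
print(compare_not_in_and_not_exists(clean + [(4, None)]))  # prints ([], [2, 3])
```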
SELECT T1.*
FROM Table1 AS T1
LEFT JOIN Table2 AS T2
ON T2.D1 = T1.C1
AND T2.D2 = T1.C2
WHERE T2.D1 IS NULL