Incorrect result when using the same alias name in HiveQL - hive

We have encountered a strange problem: Hive returns an incorrect result when the same alias name is reused in a subquery.
The following 3 queries return "A, C":
SELECT * FROM
(
SELECT T1.C1, T2.C1 C2
FROM (SELECT 'A' C1) T1
LEFT JOIN (SELECT 'C' C1) T2 ON 1 = 1
WHERE T1.C1 = 'C') T1
SELECT * FROM (SELECT T1.C1, T2.C1 C2 FROM (SELECT 'A' C1) T1 LEFT JOIN (SELECT 'C' C1) T2 ON 1 = 1 WHERE T1.C1 = 'C') T2
SELECT * FROM (SELECT T1.C1, T2.C1 C2 FROM (SELECT 'A' C1) T1 LEFT JOIN (SELECT 'C' C1) T2 ON 1 = 1 WHERE T2.C1 = 'C') T1
The following query returns "C, C":
SELECT * FROM (SELECT T1.C1, T2.C1 C2 FROM (SELECT 'A' C1) T1 LEFT JOIN (SELECT 'C' C1) T2 ON 1 = 1 WHERE T2.C1 = 'C') T2
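One way to check whether the alias clash is what triggers the difference is to give the outer subquery a name that is not used inside it. This is only a sketch: the alias T3 is a hypothetical name, and the original post does not say what Hive returns for it. Under standard SQL semantics the filter T2.C1 = 'C' keeps the single joined row, so "A, C" is the expected result (and the variant with T1.C1 = 'C' would be expected to return no rows at all).

```sql
-- T3 is a hypothetical outer alias that does not clash with the inner T1/T2.
SELECT *
FROM (
    SELECT T1.C1, T2.C1 C2
    FROM (SELECT 'A' C1) T1
    LEFT JOIN (SELECT 'C' C1) T2 ON 1 = 1
    WHERE T2.C1 = 'C'
) T3;
```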

Related

Pairwise intersection counts of multiple columns in SQL

Say I have a SQL db with multiple tables, and each of these tables has a column code. What I want to produce is a table showing the number of codes shared between each pair of code columns. I.e. a pairwise intersection count plot:
        table1  table2  table3  table4
table1  10      2       3       5
table2  2       10      4       1
table3  3       4       10      2
table4  5       1       2       10
I can get each value individually using e.g.
WITH abc AS
(
SELECT code FROM table1
INTERSECT
SELECT code FROM table2
)
SELECT COUNT(*) AS table1Xtable2 FROM abc
But is there a query that will generate the entire output as desired as a table?
The following gets all combinations among the tables:
select t1, t2, t3, t4, count(*)
from (select code, max(t1) as t1, max(t2) as t2, max(t3) as t3, max(t4) as t4
      from ((select code, 1 as t1, 0 as t2, 0 as t3, 0 as t4
             from table1
            ) union all
            (select code, 0 as t1, 1 as t2, 0 as t3, 0 as t4
             from table2
            ) union all
            (select code, 0 as t1, 0 as t2, 1 as t3, 0 as t4
             from table3
            ) union all
            (select code, 0 as t1, 0 as t2, 0 as t3, 1 as t4
             from table4
            )
           ) t
      group by code
     ) t
group by t1, t2, t3, t4;
For your particular problem, you can use:
with t as (
select code, 'table1' as tablename from table1 union all
select code, 'table2' as tablename from table2 union all
select code, 'table3' as tablename from table3 union all
select code, 'table4' as tablename from table4
)
select t1.tablename,
sum(case when t2.tablename = 'table1' then 1 else 0 end) as t1,
sum(case when t2.tablename = 'table2' then 1 else 0 end) as t2,
sum(case when t2.tablename = 'table3' then 1 else 0 end) as t3,
sum(case when t2.tablename = 'table4' then 1 else 0 end) as t4
from t t1 join
t t2
on t1.code = t2.code
group by t1.tablename
Note that the above assumes that code is unique in the tables. If it is duplicated, you can replace union all with union.
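For the case where a code can repeat within a table, the union variant mentioned in the note might look like the sketch below. UNION removes duplicate (code, tablename) pairs before the self-join, so each shared code is counted once per pair of tables and the diagonal holds the number of distinct codes in each table. Table and column names are the ones from the question.

```sql
with t as (
    -- union (not union all) keeps one row per distinct (code, tablename) pair
    select code, 'table1' as tablename from table1 union
    select code, 'table2' as tablename from table2 union
    select code, 'table3' as tablename from table3 union
    select code, 'table4' as tablename from table4
)
select t1.tablename,
       sum(case when t2.tablename = 'table1' then 1 else 0 end) as t1,
       sum(case when t2.tablename = 'table2' then 1 else 0 end) as t2,
       sum(case when t2.tablename = 'table3' then 1 else 0 end) as t3,
       sum(case when t2.tablename = 'table4' then 1 else 0 end) as t4
from t t1
join t t2 on t1.code = t2.code
group by t1.tablename;
```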

SQL update query on join table

UPDATE
    t1
SET
    t1.c3 = t2.c3,
    t1.c4 = t2.c4
FROM
    t1
    LEFT JOIN t2 ON t1.c1 = t2.c1 AND t1.c2 = t2.c2
WHERE
    t1.c5 = 'In Progress'
I want to update t1 with the values from the top matching row of t2.
For example, t2 has 3 rows that match the criteria above, and only the top row's values should be written to t1 (here, the values of the row with id 3).
t2 table:
id  c1   c2   c3   c4
-----------------------------
1   ABC  XYZ  280  300
2   ABC  XYZ  290  400
3   ABC  XYZ  310  500
4   PQR  STR  210  400
t1 table:
id  c1   c2   c3  c4  c5
----------------------------------
1   ABC  XYZ          In Progress
5   ABC  XYZ          In Progress
8   ABC  XYZ          In Progress
15  PQR  STR          IN Progress
Postgres’ update/join syntax goes like:
update t1
set c3 = t2.c3, c4 = t2.c4
from t2
where
t1.c1 = t2.c1
and t1.c2 = t2.c2
and t1.c5 = 'In Progress'
If there are several matching rows in t2 and you want the one with the smallest id, then you can use row_number() in a subquery:
update t1
set c3 = t2.c3, c4 = t2.c4
from (select t2.*, row_number() over(partition by c1, c2 order by id) rn from t2) t2
where
t1.c1 = t2.c1
and t1.c2 = t2.c2
and t1.c5 = 'In Progress'
and t2.rn = 1
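The question mentions the row with id 3, which is the largest id among the matching t2 rows rather than the smallest. If that is the row that counts as the "top" one, a Postgres-specific sketch using DISTINCT ON could look like this. It assumes that c4 should also be copied from t2 and that "top" means the highest id per (c1, c2); the alias s is a hypothetical name.

```sql
update t1
set c3 = s.c3,
    c4 = s.c4
from (
    -- distinct on keeps the first row per (c1, c2) in the given order,
    -- which is the row with the highest id because of "id desc"
    select distinct on (c1, c2) c1, c2, c3, c4
    from t2
    order by c1, c2, id desc
) s
where t1.c1 = s.c1
  and t1.c2 = s.c2
  and t1.c5 = 'In Progress';
```

With the sample t2 data, the subquery picks id 3 (c3 = 310, c4 = 500) for the pair (ABC, XYZ). The same effect can be had in the row_number() version by ordering by id desc.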

How to write this SQL query

I have two tables:
T1
C1 C2 (Columns)
A 1
B 2
T2
C1 C2 (Columns)
A Null
C Null
The result I expect is:
C1 C2 (Columns)
A 1
B 2
C NULL
Please help; I am trying to find the cleanest solution. Thanks.
I do it like this:
Select case when T1.C1 is Null then T2.C1 else T1.C1 end As C1,
T1.C2
from T1 FULL OUTER JOIN T2 on T1.C1 = T2.C1
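The CASE expression above returns the first non-NULL value of T1.C1 and T2.C1, which is what the standard COALESCE function does. A shorter sketch of the same query, assuming a database that supports FULL OUTER JOIN as the query above already does:

```sql
-- COALESCE returns its first non-NULL argument
SELECT COALESCE(T1.C1, T2.C1) AS C1,
       T1.C2
FROM T1
FULL OUTER JOIN T2 ON T1.C1 = T2.C1;
```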

Union two select statements while keeping distinct for not null column

My Oracle statement has two parts:
Select statement 1 is returning rows as:
a b c NULL
a x y NULL
Select statement 2 is returning rows as:
a b c d
e f g h
I want to union both selects, with one condition: when a row from select 1 has the same column values as a row from select 2 (apart from the NULL column), only the row without the NULL should be returned.
Output:
a b c d
a x y NULL
e f g h
CHANGED REQUIREMENTS:
The requirements have changed a bit, and I now have a case like:
Select statement 1 as:
a b c e NULL
a x y s NULL
Select statement 2 as:
a b c d text
e f g h text
Output:
a b c d text
a x y s NULL
e f g h text
I.e. when the last column is NULL, I need to fetch the matching row from "Select statement 2".
Considering that the first three columns are not nullable, you can use FULL OUTER JOIN:
with t1 as (
select 'a' c1, 'b' c2, 'c' c3, null c4 from dual
union all
select 'a', 'x', 'y', null from dual),
t2 as (
select 'a' c1, 'b' c2, 'c' c3, 'd' c4 from dual
union all
select 'e', 'f', 'g', 'h' from dual)
select c1, c2, c3, coalesce(t1.c4, t2.c4) c4
from t1 full outer join t2 using(c1, c2, c3);
C1 C2 C3 C4
-- -- -- --
a b c d
e f g h
a x y (NULL)
According to updated requirements:
with t1(c1, c2, c3, c4, c5) as (
select 'a', 'b', 'c', 'e', null from dual
union all
select 'a', 'x', 'y', 's', null from dual),
t2(c1, c2, c3, c4, c5) as (
select 'a', 'b', 'c', 'd', 'qwerty' from dual
union all
select 'e', 'f', 'g', 'h', 'asdfgh' from dual)
select c1,
c2,
c3,
nvl(nvl2(t1.c5, t1.c4, t2.c4), t1.c4) c4,
coalesce(t1.c5, t2.c5) c5
from t1
full outer join t2
using (c1, c2, c3);
C1 C2 C3 C4 C5
-- -- -- -- ------
a b c d qwerty
e f g h asdfgh
a x y s (NULL)
This is a quick and dirty hack. Although it works on the sample data you provided, it might behave unpredictably on your full dataset. It is only meant to give you an idea of how to accomplish your goal. I strongly recommend testing it thoroughly before use.
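The nested nvl/nvl2 expression can be hard to read. As a sketch, the same logic can be spelled out with CASE and COALESCE: keep t1.c4 when t1.c5 is present, otherwise prefer t2.c4 and fall back to t1.c4. The sample rows are the ones used above, and this rewrite is meant to be equivalent, not a different technique.

```sql
with t1(c1, c2, c3, c4, c5) as (
    select 'a', 'b', 'c', 'e', null from dual
    union all
    select 'a', 'x', 'y', 's', null from dual),
t2(c1, c2, c3, c4, c5) as (
    select 'a', 'b', 'c', 'd', 'qwerty' from dual
    union all
    select 'e', 'f', 'g', 'h', 'asdfgh' from dual)
select c1,
       c2,
       c3,
       -- same decision as nvl(nvl2(t1.c5, t1.c4, t2.c4), t1.c4)
       case when t1.c5 is not null then t1.c4
            else coalesce(t2.c4, t1.c4)
       end c4,
       coalesce(t1.c5, t2.c5) c5
from t1
full outer join t2
using (c1, c2, c3);
```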
Suppose we have TABLE1 and TABLE2. In this query, TAB1 returns the rows that match between the two tables, with the NULL columns filled in from the other table; TAB2 returns all rows of TABLE1 except those that match a row in TABLE2; and TAB3 is the same as TAB2 with the tables swapped. The final result is the UNION ALL of TAB1, TAB2, and TAB3:
SELECT * FROM
(SELECT
CASE WHEN T1.COL1 IS NULL THEN T2.COL1 ELSE T1.COL1 END COL1,
CASE WHEN T1.COL2 IS NULL THEN T2.COL2 ELSE T1.COL2 END COL2,
CASE WHEN T1.COL3 IS NULL THEN T2.COL3 ELSE T1.COL3 END COL3,
CASE WHEN T1.COL4 IS NULL THEN T2.COL4 ELSE T1.COL4 END COL4
FROM
TABLE2 T2 INNER JOIN TABLE1 T1 ON
(T1.COL1 = T2.COL1 OR T1.COL1 IS NULL OR T2.COL1 IS NULL) AND
(T1.COL2 = T2.COL2 OR T1.COL2 IS NULL OR T2.COL2 IS NULL) AND
(T1.COL3 = T2.COL3 OR T1.COL3 IS NULL OR T2.COL3 IS NULL) AND
(T1.COL4 = T2.COL4 OR T1.COL4 IS NULL OR T2.COL4 IS NULL))TAB1
UNION ALL
SELECT * FROM
(SELECT * FROM TABLE1
MINUS
SELECT T1.*
FROM
TABLE2 T2 INNER JOIN TABLE1 T1 ON
(T1.COL1 = T2.COL1 OR T1.COL1 IS NULL OR T2.COL1 IS NULL) AND
(T1.COL2 = T2.COL2 OR T1.COL2 IS NULL OR T2.COL2 IS NULL) AND
(T1.COL3 = T2.COL3 OR T1.COL3 IS NULL OR T2.COL3 IS NULL) AND
(T1.COL4 = T2.COL4 OR T1.COL4 IS NULL OR T2.COL4 IS NULL))TAB2
UNION ALL
SELECT * FROM
(SELECT * FROM TABLE2
MINUS
SELECT T2.*
FROM
TABLE2 T2 INNER JOIN TABLE1 T1 ON
(T1.COL1 = T2.COL1 OR T1.COL1 IS NULL OR T2.COL1 IS NULL) AND
(T1.COL2 = T2.COL2 OR T1.COL2 IS NULL OR T2.COL2 IS NULL) AND
(T1.COL3 = T2.COL3 OR T1.COL3 IS NULL OR T2.COL3 IS NULL) AND
(T1.COL4 = T2.COL4 OR T1.COL4 IS NULL OR T2.COL4 IS NULL))TAB3;

select columns matching

Newbie question. I want to do something like:
SELECT c1,c2,c3 FROM TABLE t1
UNION
SELECT c1,c2,c3 FROM TABLE t2 WHERE t1.c1 IS NOT NULL AND t1.c2 IS NULL;
so if I have t1:
c1|c2|c3
1 | a|v1
2 | b|v2
and t2:
c1|c2|c3
1 | a|v3
2 | b|v4
2 | c|v5
3 | d|v6
I would get:
c1|c2|c3
1 | a|v1
2 | b|v2
2 | c|v5
Does anybody know how to do this?
A JOIN is often faster than a subquery, although this depends on the database and its optimizer. Try this:
SELECT c1, c2, c3
FROM t1
UNION
SELECT b.c1, b.c2, b.c3
FROM t1 a INNER JOIN t2 b ON
(a.c1 = b.c1)
WHERE (b.c1,b.c2) NOT IN (SELECT c1,c2 FROM t1)
For a demonstration, see: http://sqlfiddle.com/#!2/50a4e/23
The query should look something like the one below. However, I am not sure how efficient it is.
SELECT
T1.c1,
T1.c2,
T1.c3
FROM T1
UNION
SELECT
T2.c1,
T2.c2,
T2.c3
FROM T2
WHERE ((T2.c1,T2.c2) NOT IN (SELECT t1.c1,t1.c2 FROM t1)) AND
(T2.c1 IN (SELECT t1.c1 FROM t1))
select t2.c1,t2.c2,t2.c3
from t1,t2
where t1.c1=t2.c2;
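The row-value NOT IN used in the first two answers is not accepted by every database, and NOT IN can give surprising results when the subquery returns NULLs. A sketch of the same filter with EXISTS and NOT EXISTS, assuming the goal is "rows of t2 whose c1 appears in t1 but whose (c1, c2) pair does not":

```sql
SELECT c1, c2, c3
FROM t1
UNION
SELECT b.c1, b.c2, b.c3
FROM t2 b
-- c1 must appear somewhere in t1 ...
WHERE EXISTS (SELECT 1 FROM t1 a WHERE a.c1 = b.c1)
-- ... but the (c1, c2) pair must not
  AND NOT EXISTS (SELECT 1 FROM t1 a WHERE a.c1 = b.c1 AND a.c2 = b.c2);
```

On the sample data this returns the three rows from the question's expected output (1|a|v1, 2|b|v2, 2|c|v5); the row order is not guaranteed without an ORDER BY.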