Union two select statements while keeping distinct for not null column - sql

My Oracle statement has two parts:
Select statement 1 is returning rows as:
a b c NULL
a x y NULL
Select statement 2 is returning rows as:
a b c d
e f g h
I want to union both selects so that, when a row from select 1 has the same values as a row from select 2 in every column except the NULL one, only the row without the NULL is returned.
Output:
a b c d
a x y NULL
e f g h
CHANGED REQUIREMENTS:
The requirements have changed slightly, and I now have a case like this:
Select statement 1 as:
a b c e NULL
a x y s NULL
Select statement 2 as:
a b c d text
e f g h text
Output:
a b c d text
a x y s NULL
e f g h text
That is, when the last column is NULL and a matching row exists in "Select statement 2", I need to fetch that row instead.

Assuming the first three columns are not nullable, you can use a FULL OUTER JOIN:
with t1 as (
select 'a' c1, 'b' c2, 'c' c3, null c4 from dual
union all
select 'a', 'x', 'y', null from dual),
t2 as (
select 'a' c1, 'b' c2, 'c' c3, 'd' c4 from dual
union all
select 'e', 'f', 'g', 'h' from dual)
select c1, c2, c3, coalesce(t1.c4, t2.c4) c4
from t1 full outer join t2 using(c1, c2, c3);
C1 C2 C3 C4
-- -- -- --
a b c d
e f g h
a x y (NULL)
According to updated requirements:
with t1(c1, c2, c3, c4, c5) as (
select 'a', 'b', 'c', 'e', null from dual
union all
select 'a', 'x', 'y', 's', null from dual),
t2(c1, c2, c3, c4, c5) as (
select 'a', 'b', 'c', 'd', 'qwerty' from dual
union all
select 'e', 'f', 'g', 'h', 'asdfgh' from dual)
select c1,
c2,
c3,
nvl(nvl2(t1.c5, t1.c4, t2.c4), t1.c4) c4,
coalesce(t1.c5, t2.c5) c5
from t1
full outer join t2
using (c1, c2, c3);
C1 C2 C3 C4 C5
-- -- -- -- ------
a b c d qwerty
e f g h asdfgh
a x y s (NULL)
This is a quick and dirty hack. Although it works on the sample data you provided, it might behave unpredictably on your full dataset. It is only meant to give you an idea of how to reach your goal, so I strongly recommend testing it thoroughly before use.
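For readers who want to try the updated-requirements logic without an Oracle instance, here is a minimal sketch using Python's built-in sqlite3 module. It is not the query above: FULL OUTER JOIN is emulated with a LEFT JOIN plus a UNION ALL of the rows that exist only in t2 (an assumption made so it runs on older SQLite versions), and nvl/nvl2 are rewritten as COALESCE and CASE. The function name merge_prefer_t2 is hypothetical.

```python
import sqlite3


def merge_prefer_t2(rows1, rows2):
    """Merge two 5-column row sets that share the key (c1, c2, c3)."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE t1 (c1, c2, c3, c4, c5)")
    con.execute("CREATE TABLE t2 (c1, c2, c3, c4, c5)")
    con.executemany("INSERT INTO t1 VALUES (?, ?, ?, ?, ?)", rows1)
    con.executemany("INSERT INTO t2 VALUES (?, ?, ?, ?, ?)", rows2)
    sql = """
        -- rows of t1, enriched from t2 when the key matches
        SELECT t1.c1, t1.c2, t1.c3,
               COALESCE(CASE WHEN t1.c5 IS NOT NULL THEN t1.c4 ELSE t2.c4 END,
                        t1.c4) AS c4,
               COALESCE(t1.c5, t2.c5) AS c5
        FROM t1
        LEFT JOIN t2
          ON t2.c1 = t1.c1 AND t2.c2 = t1.c2 AND t2.c3 = t1.c3
        UNION ALL
        -- rows that exist only in t2
        SELECT t2.c1, t2.c2, t2.c3, t2.c4, t2.c5
        FROM t2
        WHERE NOT EXISTS (SELECT 1 FROM t1
                          WHERE t1.c1 = t2.c1
                            AND t1.c2 = t2.c2
                            AND t1.c3 = t2.c3)
        ORDER BY 1, 2, 3
    """
    try:
        return con.execute(sql).fetchall()
    finally:
        con.close()


# Sample data from the answer above.
sample_t1 = [("a", "b", "c", "e", None), ("a", "x", "y", "s", None)]
sample_t2 = [("a", "b", "c", "d", "qwerty"), ("e", "f", "g", "h", "asdfgh")]

for row in merge_prefer_t2(sample_t1, sample_t2):
    print(row)
```

As in the original, the sketch assumes the three key columns are never NULL, since an equality join would not match NULL keys.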

Suppose we have TABLE1 and TABLE2. In this query, TAB1 returns the rows that match between the two tables, with each NULL column filled in from the other table. TAB2 returns the rows of TABLE1 that have no match in TABLE2, and TAB3 does the same with the tables swapped. Finally, TAB1, TAB2 and TAB3 are combined with UNION ALL:
SELECT * FROM
(SELECT
CASE WHEN T1.COL1 IS NULL THEN T2.COL1 ELSE T1.COL1 END COL1,
CASE WHEN T1.COL2 IS NULL THEN T2.COL2 ELSE T1.COL2 END COL2,
CASE WHEN T1.COL3 IS NULL THEN T2.COL3 ELSE T1.COL3 END COL3,
CASE WHEN T1.COL4 IS NULL THEN T2.COL4 ELSE T1.COL4 END COL4
FROM
TABLE2 T2 INNER JOIN TABLE1 T1 ON
(T1.COL1 = T2.COL1 OR T1.COL1 IS NULL OR T2.COL1 IS NULL) AND
(T1.COL2 = T2.COL2 OR T1.COL2 IS NULL OR T2.COL2 IS NULL) AND
(T1.COL3 = T2.COL3 OR T1.COL3 IS NULL OR T2.COL3 IS NULL) AND
(T1.COL4 = T2.COL4 OR T1.COL4 IS NULL OR T2.COL4 IS NULL))TAB1
UNION ALL
SELECT * FROM
(SELECT * FROM TABLE1
MINUS
SELECT T1.*
FROM
TABLE2 T2 INNER JOIN TABLE1 T1 ON
(T1.COL1 = T2.COL1 OR T1.COL1 IS NULL OR T2.COL1 IS NULL) AND
(T1.COL2 = T2.COL2 OR T1.COL2 IS NULL OR T2.COL2 IS NULL) AND
(T1.COL3 = T2.COL3 OR T1.COL3 IS NULL OR T2.COL3 IS NULL) AND
(T1.COL4 = T2.COL4 OR T1.COL4 IS NULL OR T2.COL4 IS NULL))TAB2
UNION ALL
SELECT * FROM
(SELECT * FROM TABLE2
MINUS
SELECT T2.*
FROM
TABLE2 T2 INNER JOIN TABLE1 T1 ON
(T1.COL1 = T2.COL1 OR T1.COL1 IS NULL OR T2.COL1 IS NULL) AND
(T1.COL2 = T2.COL2 OR T1.COL2 IS NULL OR T2.COL2 IS NULL) AND
(T1.COL3 = T2.COL3 OR T1.COL3 IS NULL OR T2.COL3 IS NULL) AND
(T1.COL4 = T2.COL4 OR T1.COL4 IS NULL OR T2.COL4 IS NULL))TAB3;
SQL Fiddle1
SQL Fiddle2
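The matching rule used in that query (two values match when they are equal or when either one is NULL) can be easier to follow outside SQL. Below is a minimal plain-Python sketch of the same three parts, with None standing in for NULL. The function name merge_null_tolerant and its helpers are hypothetical, and unlike MINUS the sketch does not remove duplicate rows.

```python
def merge_null_tolerant(table1, table2):
    """Combine two lists of equal-width tuples, treating None as a wildcard."""

    def matches(r1, r2):
        # Same test as the ON clause: equal, or NULL on either side.
        return all(a is None or b is None or a == b for a, b in zip(r1, r2))

    def fill(r1, r2):
        # Same as the CASE expressions: take t1's value unless it is NULL.
        return tuple(b if a is None else a for a, b in zip(r1, r2))

    # TAB1: matching pairs, with NULLs filled from the other row.
    tab1 = [fill(r1, r2) for r2 in table2 for r1 in table1 if matches(r1, r2)]
    # TAB2: rows of table1 with no match in table2.
    tab2 = [r1 for r1 in table1 if not any(matches(r1, r2) for r2 in table2)]
    # TAB3: rows of table2 with no match in table1.
    tab3 = [r2 for r2 in table2 if not any(matches(r1, r2) for r1 in table1)]
    return tab1 + tab2 + tab3


sample1 = [("a", "b", "c", None), ("a", "x", "y", None)]
sample2 = [("a", "b", "c", "d"), ("e", "f", "g", "h")]
for row in merge_null_tolerant(sample1, sample2):
    print(row)
```

Because a NULL matches any value, a row with several NULLs can pair with more than one row from the other table, so the merged part may contain more rows than either input.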

Related

How to get duplicate records in one table which are not in other table?

I have table1 and table2 as follows:
Table1:
col1 col2
-------------
a1 b1
a2 b1
a3 b2
a4 b3
a5 b3
a5 b4
a5 b2
Table2:
col2 col3
----------
b1 c1
b4 c2
To get all duplicate entries for col2 in table1, I have written following query:
SELECT x.col1,x.col2
FROM table1 x
JOIN (SELECT t.col2
FROM table1 t
GROUP BY t.col2
HAVING COUNT(t.col2) > 1) y ON y.col2 = x.col2
Now I want to remove from the above result the entries whose col2 value appears in table2.
Expected output:
col1 col2
----------
a3 b2
a4 b3
a5 b3
a5 b2
Query I wrote:
SELECT x.col1,x.col2
FROM table1 x
JOIN (SELECT t.col2
FROM table1 t
GROUP BY t.col2
HAVING COUNT(t.col2) > 1) y ON y.col2 = x.col2 where x.col2 not in (select col2 from table2)
I see the expected results using the above query. Is there a more efficient way of achieving the same result? Are there any cases that I could be missing?
Thanks
This script leaves b4 from Table2 because b4 in col2 is not a duplicate in col2 from Table1.
DROP TABLE IF EXISTS Table1
DROP TABLE IF EXISTS Table2
CREATE TABLE Table1
(
col1 VARCHAR(10),
col2 VARCHAR(10)
)
GO
CREATE TABLE Table2
(
col2 VARCHAR(10),
col3 VARCHAR(10)
)
GO
INSERT INTO Table1
VALUES
('a1', 'b1'),
('a2', 'b1'),
('a3', 'b2'),
('a4', 'b3'),
('a5', 'b3'),
('a5', 'b4'),
('a5', 'b2')
INSERT INTO Table2
VALUES
('b1', 'c1'),
('b4', 'c2')
SELECT T2.*
FROM Table2 T2
LEFT JOIN
(
SELECT col2
FROM Table1
GROUP BY col2
HAVING COUNT(*) > 1
) T1 ON T1.col2 = T2.col2
WHERE T1.col2 IS NULL
Depending on which DBMS you're actually using, you could use window functions and an EXISTS expression.
SELECT
*
FROM
(
SELECT
*,
COUNT(*) OVER (PARTITION BY col2) AS occurrences
FROM
table1
)
t1
WHERE
occurrences > 1
AND
NOT EXISTS (
SELECT *
FROM table2
WHERE col2 = t1.col2
)
If your DBMS supports common table expressions (CTEs), you may try the following:
with cte as (
select col1,col2, count(*) over (partition by col2) as cn from Table1
)
select T.col1,T.col2 from cte T
left join Table2 D on T.col2=D.col2
where T.cn>1 and D.col2 is null
See a demo from here.
This works on MySQL 8.0 and above and PostgreSQL.
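On the question of missed cases: one well-known pitfall is that NOT IN returns no rows at all when the subquery yields a NULL, while NOT EXISTS is unaffected. The sketch below shows this with Python's built-in sqlite3 module and the sample data from the question. The function name duplicates_not_in_table2 and the use_not_in flag are hypothetical, and the NULL row added to table2 is an assumption for illustration, not part of the original data.

```python
import sqlite3

TABLE1 = [("a1", "b1"), ("a2", "b1"), ("a3", "b2"), ("a4", "b3"),
          ("a5", "b3"), ("a5", "b4"), ("a5", "b2")]
TABLE2 = [("b1", "c1"), ("b4", "c2")]


def duplicates_not_in_table2(table1, table2, use_not_in=False):
    """Rows of table1 whose col2 is duplicated and absent from table2."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE table1 (col1 TEXT, col2 TEXT)")
    con.execute("CREATE TABLE table2 (col2 TEXT, col3 TEXT)")
    con.executemany("INSERT INTO table1 VALUES (?, ?)", table1)
    con.executemany("INSERT INTO table2 VALUES (?, ?)", table2)
    if use_not_in:
        condition = "x.col2 NOT IN (SELECT col2 FROM table2)"
    else:
        condition = ("NOT EXISTS (SELECT 1 FROM table2 t2 "
                     "WHERE t2.col2 = x.col2)")
    sql = f"""
        SELECT x.col1, x.col2
        FROM table1 x
        JOIN (SELECT col2 FROM table1
              GROUP BY col2
              HAVING COUNT(*) > 1) y ON y.col2 = x.col2
        WHERE {condition}
        ORDER BY x.col1, x.col2
    """
    try:
        return con.execute(sql).fetchall()
    finally:
        con.close()


# Hypothetical extra row: table2 with a NULL in col2.
with_null = TABLE2 + [(None, "c3")]
print(duplicates_not_in_table2(TABLE1, with_null, use_not_in=True))
print(duplicates_not_in_table2(TABLE1, with_null, use_not_in=False))
```

With the NULL row present, the NOT IN variant prints an empty list and the NOT EXISTS variant still prints the four expected rows, so NOT EXISTS (or the LEFT JOIN ... IS NULL form) is the safer choice if table2.col2 is nullable.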

Incorrect result when using the same alias name in HiveQL

We have encountered a weird problem: Hive returns an incorrect result when the same alias name is reused in a subquery.
The following three queries return "A, C":
SELECT * FROM
(
SELECT T1.C1, T2.C1 C2
FROM (SELECT 'A' C1) T1
LEFT JOIN (SELECT 'C' C1) T2 ON 1 = 1
WHERE T1.C1 = 'C') T1
SELECT * FROM (SELECT T1.C1, T2.C1 C2 FROM (SELECT 'A' C1) T1 LEFT JOIN (SELECT 'C' C1) T2 ON 1 = 1 WHERE T1.C1 = 'C') T2
SELECT * FROM (SELECT T1.C1, T2.C1 C2 FROM (SELECT 'A' C1) T1 LEFT JOIN (SELECT 'C' C1) T2 ON 1 = 1 WHERE T2.C1 = 'C') T1
The following query returns "C, C":
SELECT * FROM (SELECT T1.C1, T2.C1 C2 FROM (SELECT 'A' C1) T1 LEFT JOIN (SELECT 'C' C1) T2 ON 1 = 1 WHERE T2.C1 = 'C') T2

If any value exists in any row then return it, if not then return the row with the null value

I have a table
Col1 Col2
300 Null
300 A
300 B
400 NULL
For each Col1, I need the output to return the rows that have a value in Col2 if any exist; if not, return the row with the NULL value.
Output:
Col1 Col2
300 A
300 B
400 Null
Return a row if Col2 has a non-null value, or if the same Col1 never has a non-null value for Col2.
select t1.*
from tablename t1
where t1.Col2 is not null
or not exists (select 1 from tablename t2
where t2.Col2 is not null
and t2.Col1 = t1.Col1)
You can do this as:
select t.*
from t
where t.col2 is not null
union all
select t.*
from t
where t.col2 is null and
not exists (select 1 from t t2 where t2.col1 = t.col1 and t2.col2 is not null);
CREATE TABLE #Tbl(Col1 INT, Col2 VARCHAR(100))
INSERT INTO #Tbl(Col1, Col2)
SELECT 300, NULL UNION ALL
SELECT 300, 'A' UNION ALL
SELECT 300, 'B' UNION ALL
SELECT 400, NULL
SELECT Col1 , Col2
FROM #Tbl T1
WHERE ISNULL(Col2,'') <> ''
UNION ALL
SELECT Col1 , Col2
FROM #Tbl T1
WHERE NOT EXISTS(SELECT 1 FROM #Tbl T2 WHERE T1.Col1 = T2.Col1 AND ISNULL(T2.Col2,'') <> '')
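To check the NOT EXISTS approach against the sample data without a database server, here is a minimal sketch using Python's built-in sqlite3 module. It runs the first answer's query, with an ORDER BY added only to make the output deterministic. The function name keep_values_or_lone_null is hypothetical.

```python
import sqlite3


def keep_values_or_lone_null(rows):
    """Keep non-NULL Col2 rows, plus NULL rows whose Col1 has no value."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE tablename (Col1 INTEGER, Col2 TEXT)")
    con.executemany("INSERT INTO tablename VALUES (?, ?)", rows)
    sql = """
        SELECT t1.Col1, t1.Col2
        FROM tablename t1
        WHERE t1.Col2 IS NOT NULL
           OR NOT EXISTS (SELECT 1 FROM tablename t2
                          WHERE t2.Col2 IS NOT NULL
                            AND t2.Col1 = t1.Col1)
        ORDER BY t1.Col1, t1.Col2
    """
    try:
        return con.execute(sql).fetchall()
    finally:
        con.close()


sample = [(300, None), (300, "A"), (300, "B"), (400, None)]
for row in keep_values_or_lone_null(sample):
    print(row)
```

Note that the ISNULL(Col2, '') <> '' variant above also treats empty strings as missing, which this sketch does not.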

Select Group data with one matching condition

Table:
Col1 Col2
1 2
1 3
1 4
2 2
2 3
First I need to find all rows with col2 = 4.
Then I need to select all rows that share the col1 value of those rows.
The result should be:
1 2
1 3
1 4
Off the top of my head
SELECT A.* FROM MyTable A JOIN MyTable B ON A.Col1 = B.Col1 WHERE B.Col2 = 4
I think you want this:
select t.*
from t
where t.col1 in (select t2.col1 from t t2 where t2.col2 = 4);
As I understand your description, this query checks both columns: col2 = 4 and col1 = 1.
SELECT t1.col1, t1.col2 FROM MyTable t1
WHERE t1.col2 = 4
UNION
SELECT t2.col1, t2.col2 FROM MyTable t2
WHERE t2.col1 = 1
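The self-join and the IN subquery give the same rows on the sample data, but they differ once a group contains more than one row with col2 = 4. The sketch below compares them with Python's built-in sqlite3 module. The function name select_groups and the use_join flag are hypothetical, and the duplicated (1, 4) row is an assumption added for illustration.

```python
import sqlite3

ROWS = [(1, 2), (1, 3), (1, 4), (2, 2), (2, 3)]


def select_groups(rows, use_join=False):
    """Return all rows whose col1 group contains a row with col2 = 4."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE t (col1 INTEGER, col2 INTEGER)")
    con.executemany("INSERT INTO t VALUES (?, ?)", rows)
    if use_join:
        # Self-join: one output row per (row, matching col2 = 4 row) pair.
        sql = """
            SELECT a.col1, a.col2
            FROM t a JOIN t b ON a.col1 = b.col1
            WHERE b.col2 = 4
            ORDER BY a.col1, a.col2
        """
    else:
        # IN subquery: each row of t appears at most once.
        sql = """
            SELECT t.col1, t.col2
            FROM t
            WHERE t.col1 IN (SELECT t2.col1 FROM t t2 WHERE t2.col2 = 4)
            ORDER BY t.col1, t.col2
        """
    try:
        return con.execute(sql).fetchall()
    finally:
        con.close()


print(select_groups(ROWS))                 # sample data, IN subquery
extra = ROWS + [(1, 4)]                    # hypothetical duplicate row
print(len(select_groups(extra)))           # IN subquery
print(len(select_groups(extra, use_join=True)))  # self-join
```

If duplicates of the matching row are possible, the IN (or an EXISTS) form avoids the multiplied rows that the plain join produces.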

SQL Query - Indirect joining of two tables

I have two tables like the following
Table1
COL1 COL2 COL3
A 10 ABC
A 11 ABC
A 1 DEF
A 2 DEF
B 10 ABC
B 11 ABC
B 1 DEF
C 3 DEF
C 12 ABC
C 21 GHI
Table2
COL1 GHI ABC DEF
A1 21 10 1
A2 21 12 1
A3 21 10 1
A4 23 10 1
A5 25 11 3
A6 21 14 3
A7 25 11 1
A8 23 10 1
A9 29 10 2
A10 21 12 3
I have created another temporary table that holds all the distinct values of table1.col1.
The values of col3 in table1 are column names in table2, and those columns are populated with values.
For each distinct value of table1.col1 (A, B, C in this case), I need to return every combination of table2.col1 and table1.col1 such that
the ABC value of the table2 row matches any of the ABC values of the "group" from table1,
AND the DEF value of the table2 row matches any of the DEF values of the "group" from table1,
AND, IF THE GROUP CONTAINS GHI VALUES, the GHI value of the table2 row matches any of the GHI values of the "group" from table1.
So, I would need something like the following
Output Table
Table2.COL1 Table1.Col1
A1 A
A3 A
A4 A
A7 A
A8 A
A9 A
A1 B
A3 B
A4 B
A7 B
A8 B
A10 C
I tried something like this, but I'm not sure if this is the right approach:
select table2.col1, temp_distinct_table.col1
from table2, temp_distinct_table
where table2.def IN (SELECT col2
FROM table1
WHERE table1.col1 = temp_distinct_table.col1
AND table1.col3 = 'DEF')
AND table2.abc IN (SELECT col2
FROM table1
WHERE table1.col1 = temp_distinct_table.col1
AND table1.col3 = 'ABC')
AND (
table2.ghi IN (SELECT col2
FROM table1
WHERE table1.col1 = temp_distinct_table.col1
AND table1.col3 = 'GHI')
OR NOT EXISTS (SELECT col2
FROM table1
WHERE table1.col1 = temp_distinct_table.col1
AND table1.col3 = 'GHI')
)
Here temp_distinct_table contains all the distinct values from table1.col1.
Could someone guide me on this?
Another approach is to join all the possible matches and then count how many there are for each t1.col1/t2.col1 combination:
select distinct t2_col1, t1_col1
from (
select t2.col1 as t2_col1, t1.col1 as t1_col1, t1.ghi_count as t1_ghi_count,
count(case when t1.col3 = 'ABC' then 1 end)
over (partition by t1.col1, t2.col1) as abc_matches,
count(case when t1.col3 = 'DEF' then 1 end)
over (partition by t1.col1, t2.col1) as def_matches,
count(case when t1.col3 = 'GHI' then 1 end)
over (partition by t1.col1, t2.col1) as ghi_matches
from (
select t1.*,
count(case when t1.col3 = 'GHI' then 1 end)
over (partition by t1.col1) as ghi_count
from table1 t1
) t1
join table2 t2
on (t1.col3 = 'ABC' and t2.abc = t1.col2)
or (t1.col3 = 'DEF' and t2.def = t1.col2)
or (t1.col3 = 'GHI' and t2.ghi = t1.col2)
)
where abc_matches > 0
and def_matches > 0
and (t1_ghi_count = 0 or ghi_matches > 0)
order by t1_col1, t2_col1;
Which with your sample data gets:
T2_COL T1_COL
------ ------
A1 A
A3 A
A4 A
A7 A
A8 A
A9 A
A1 B
A3 B
A4 B
A7 B
A8 B
A10 C
I'm not sure whether the efficiency of that will be significantly different from MTO's cross join with your real data.
This becomes quite simple when you use collections (and you only need to do one table scan for each table):
Oracle Setup:
CREATE TYPE intlist AS TABLE OF INT;
/
Query:
SELECT t2.col1 AS t2_col1,
t1.col1 AS t1_col1
FROM (
SELECT col1,
CAST( COLLECT( CASE col3 WHEN 'ABC' THEN col2 END ) AS INTLIST ) AS abc,
CAST( COLLECT( CASE col3 WHEN 'DEF' THEN col2 END ) AS INTLIST ) AS def,
CAST( COLLECT( CASE col3 WHEN 'GHI' THEN col2 END ) AS INTLIST ) AS ghi
FROM table1
GROUP BY col1
) t1
INNER JOIN table2 t2
ON ( t2.abc MEMBER OF t1.abc
AND t2.def MEMBER OF t1.def
AND ( t2.ghi MEMBER OF t1.ghi OR t1.ghi IS EMPTY ) );
Output:
t2_col1 t1_col1
------- -------
A1 A
A3 A
A4 A
A7 A
A8 A
A9 A
A1 B
A3 B
A4 B
A7 B
A8 B
A10 C
Update
An alternative query without using collections (it is going to be more efficient than your query but probably less efficient than collections):
SELECT t2.col1,
t1.col1
FROM table1 t1
CROSS JOIN
table2 t2
GROUP BY t1.col1, t2.col1
HAVING COUNT( CASE WHEN t1.col2 = t2.abc AND t1.col3 = 'ABC' THEN 1 END ) > 0
AND COUNT( CASE WHEN t1.col2 = t2.def AND t1.col3 = 'DEF' THEN 1 END ) > 0
AND ( COUNT( CASE WHEN t1.col2 = t2.ghi AND t1.col3 = 'GHI' THEN 1 END ) > 0
OR COUNT( CASE t1.col3 WHEN 'GHI' THEN 1 END ) = 0 )
ORDER BY t1.col1, t2.col1;
Update 2:
Changed from CROSS JOIN to INNER JOIN:
SELECT t2.col1 AS t2_col1,
t1.col1 AS t1_col1
FROM (
SELECT t1.*,
COUNT( CASE col3 WHEN 'GHI' THEN 1 END )
OVER ( PARTITION BY col1 ) AS has_ghi
FROM table1 t1
) t1
INNER JOIN table2 t2
ON ( t1.col3 = 'ABC' AND t2.abc = t1.col2 )
OR ( t1.col3 = 'DEF' AND t2.def = t1.col2 )
OR ( t1.col3 = 'GHI' AND t2.ghi = t1.col2 )
GROUP BY t1.col1, t2.col1, t1.has_ghi
HAVING COUNT( CASE t1.col3 WHEN 'ABC' THEN 1 END ) > 0
AND COUNT( CASE t1.col3 WHEN 'DEF' THEN 1 END ) > 0
AND ( COUNT( CASE t1.col3 WHEN 'GHI' THEN 1 END ) > 0 OR has_ghi = 0 )
ORDER BY t1.col1, t2.col1;
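The idea behind the collection-based query (gather each group's allowed values per category, then test membership) can be sketched in plain Python with sets. This is only an illustration of the logic, not a replacement for the SQL; the function name match_groups is hypothetical, and table2 rows are assumed to be tuples in the order (col1, ghi, abc, def).

```python
from collections import defaultdict

TABLE1 = [("A", 10, "ABC"), ("A", 11, "ABC"), ("A", 1, "DEF"), ("A", 2, "DEF"),
          ("B", 10, "ABC"), ("B", 11, "ABC"), ("B", 1, "DEF"),
          ("C", 3, "DEF"), ("C", 12, "ABC"), ("C", 21, "GHI")]

# (col1, ghi, abc, def)
TABLE2 = [("A1", 21, 10, 1), ("A2", 21, 12, 1), ("A3", 21, 10, 1),
          ("A4", 23, 10, 1), ("A5", 25, 11, 3), ("A6", 21, 14, 3),
          ("A7", 25, 11, 1), ("A8", 23, 10, 1), ("A9", 29, 10, 2),
          ("A10", 21, 12, 3)]


def match_groups(table1, table2):
    """Return (table2.col1, table1.col1) pairs that satisfy the group rules."""
    # groups["A"]["ABC"] is the set of ABC values allowed for group A.
    groups = defaultdict(lambda: defaultdict(set))
    for col1, col2, col3 in table1:
        groups[col1][col3].add(col2)

    result = []
    for group in sorted(groups):
        allowed = groups[group]
        abc_values = allowed.get("ABC", set())
        def_values = allowed.get("DEF", set())
        ghi_values = allowed.get("GHI", set())
        for t2_col1, ghi, abc, def_ in table2:
            if abc not in abc_values or def_ not in def_values:
                continue
            # GHI is only checked when the group has GHI values at all.
            if ghi_values and ghi not in ghi_values:
                continue
            result.append((t2_col1, group))
    return result


for pair in match_groups(TABLE1, TABLE2):
    print(pair)
```

Each table is read once, which mirrors the single scan per table that the collection-based query aims for.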