Oracle advanced union - sql

Is there any advanced Oracle SQL methods to solve this kind of situation?
Simplified:
Two queries returns primary_key_value and other_value.
Both queries always return primary_key_value but other_value might be null.
So how I can union those two queries so that it returns always those rows which has other_value, but if both queries are having other_value = null with same primary key, then only one row should be returned.
I know this is so stupid case. But specifications were like this :)
Example:
First query:
A | B
=======
1 | X
2 |
3 |
4 | Z
Second query:
A | B
=======
1 | Y
2 |
3 | Z
4 |
So result need to be like this:
A | B
=======
1 | X
1 | Y
2 |
3 | Z
4 | Z

You could use analytics:
SQL> WITH q1 AS (
2 SELECT 1 a, 'X' b FROM DUAL UNION ALL
3 SELECT 2 a, '' b FROM DUAL UNION ALL
4 SELECT 3 a, '' b FROM DUAL UNION ALL
5 SELECT 4 a, 'Z' b FROM DUAL
6 ), q2 AS (
7 SELECT 1 a, 'Y' b FROM DUAL UNION ALL
8 SELECT 2 a, '' b FROM DUAL UNION ALL
9 SELECT 3 a, 'Z' b FROM DUAL UNION ALL
10 SELECT 4 a, '' b FROM DUAL
11 )
12 SELECT a, b
13 FROM (SELECT a, b,
14 rank() over(PARTITION BY a
15 ORDER BY decode(b, NULL, 2, 1)) rnk
16 FROM (SELECT * FROM q1
17 UNION
18 SELECT * FROM q2))
19 WHERE rnk = 1;
A B
---------- -
1 X
1 Y
2
3 Z
4 Z

If you want use something really advanced, use model clause http://rwijk.blogspot.com/2007/10/sql-model-clause-tutorial-part-one.html
But, in real life, using such things usually means bad-designed data model

Another way to look at is that you want all possible values from the union of column A then left outer outer join these with the non-null values from column B, thus only showing null in B when there is no non-null value to display.
roughly:
WITH q1 as (whatever),
q2 as (whatever)
SELECT All_A.A, BVals.B
FROM (SELECT DISTINCT A FROM (SELECT A FROM q1 UNION SELECT A FROM q2)) All_A,
(SELECT A,B FROM q1 WHERE B IS NOT NULL
UNION
SELECT A,B FROM q2 WHERE B IS NOT NULL) BVals
WHERE All_A.A = BVals.A (+)
Also pruning the unwanted nulls explicitly could do the same job:
WITH q3 AS (q1_SELECT UNION q2_SELECT)
SELECT A,B
FROM q3 main
WHERE NOT ( B IS NULL AND
EXISTS (SELECT 1 FROM q3 x WHERE main.A = x.A and x.B IS NOT NULL) )

Related

How to combine two tables in Oracle SQL on a quantitative base

beacause of a really old db design I need some help. This might be quite simple I'm just not seeing the wood for the trees at the moment.
TABLE A:
ID
1
2
3
4
5
TABLE B:
ID
VALUE B
1
10
1
20
2
10
2
20
3
10
3
20
3
30
4
10
TABLE C:
ID
VALUE C
1
11
1
21
2
11
2
21
2
31
3
11
5
11
Expected result:
where ID = 1
ID
VALUE B
VALUE C
1
10
11
1
20
21
where ID = 2
ID
VALUE B
VALUE C
2
10
11
2
20
21
2
null
31
where ID = 3
ID
VALUE B
VALUE C
3
10
11
3
20
null
3
30
null
where ID = 4
ID
VALUE B
VALUE C
4
10
null
where ID = 5
ID
VALUE B
VALUE C
5
null
11
The entries in table B and C are optional and could be unlimited, the ID from table A is the connection.
B and C are not directly connected. I need a quantitative comparision to find gaps in the database. The number of entries of table B and C should be the same (but not the value), usually entries are missing in either B or C.
I tried it with outer joins but I'm getting two much rows, because I need B or C join only one time per single row.
I hope anybody understand my problem and can help me.
It looks like, for each distinct ID, you want the nth row (ordered by VALUE) from TABLE_A to match with the nth row from TABLE_B. And if one table - A or B - has more values, you want those to match to null.
Your solution will have two parts. First, use row_number() over ( partition by id order by value) to order the rows in both tables. Then, use FULL OUTER JOIN to join on (id, rownumber).
Here is a full example:
-- WITH clauses are just test data...+
with table_a (id) as (
SELECT 1 FROM DUAL UNION ALL
SELECT 2 FROM DUAL UNION ALL
SELECT 3 FROM DUAL UNION ALL
SELECT 4 FROM DUAL UNION ALL
SELECT 5 FROM DUAL ),
table_b (id, value) as (
SELECT 1,10 FROM DUAL UNION ALL
SELECT 1,20 FROM DUAL UNION ALL
SELECT 2,10 FROM DUAL UNION ALL
SELECT 2,20 FROM DUAL UNION ALL
SELECT 3,10 FROM DUAL UNION ALL
SELECT 3,20 FROM DUAL UNION ALL
SELECT 3,30 FROM DUAL UNION ALL
SELECT 4,10 FROM DUAL ),
table_c (id, value) as (
SELECT 1,11 FROM DUAL UNION ALL
SELECT 1,21 FROM DUAL UNION ALL
SELECT 2,11 FROM DUAL UNION ALL
SELECT 2,21 FROM DUAL UNION ALL
SELECT 2,31 FROM DUAL UNION ALL
SELECT 3,11 FROM DUAL UNION ALL
SELECT 5,11 FROM DUAL )
-- Solution begins here
SELECT id, b.value b_value, c.value c_value
FROM ( SELECT b.*,
row_number() OVER ( PARTITION BY b.id ORDER BY b.value ) rn
FROM table_b b ) b
FULL OUTER JOIN ( SELECT c.*,
row_number() OVER ( PARTITION BY c.id ORDER BY c.value ) rn
FROM table_c c ) c USING (id, rn)
ORDER BY id, b_value, c_value;
+----+---------+---------+
| ID | B_VALUE | C_VALUE |
+----+---------+---------+
| 1 | 10 | 11 |
| 1 | 20 | 21 |
| 2 | 10 | 11 |
| 2 | 20 | 21 |
| 2 | | 31 |
| 3 | 10 | 11 |
| 3 | 20 | |
| 3 | 30 | |
| 4 | 10 | |
| 5 | | 11 |
+----+---------+---------+

Gradually aggregating a string column in Oracle SQL

I would like to gradually aggregate a string column in Oracle sql.
From this table:
col_1 | col_2
-------------
1 | A
1 | B
1 | C
2 | C
2 | D
to:
col_1 | col_2
-------------
1 | A
1 | A,B
1 | A,B,C
2 | C
2 | C,D
I tried LISTAGG but it won't return all rows due to group by. I have about 2 million rows in the table.
Oracle doesn't support accumulating string concatenation with a single listagg() expression. However, you can use a subquery.
Just one note: SQL tables represent unordered sets. You seem to have an ordering in mind. The following code adds an ordering column:
with t as (
select 1 as id, 1 as x, 'A' as y from dual union all
select 2, 1 as x, 'B' as y from dual union all
select 3, 1 as x, 'C' as y from dual union all
select 4, 2 as x, 'C' as y from dual union all
select 5, 2 as x, 'D' as y from dual
)
select t.*,
(select listagg(t2.y, ',') within group (order by t2.id)
from t t2
where t2.x = t.x and t2.id <= t.id
)
from t;
A hierarchical query option looks like this:
SQL> with t as (
2 select 1 as id, 1 as x, 'A' as y from dual union all
3 select 2, 1 as x, 'B' as y from dual union all
4 select 3, 1 as x, 'C' as y from dual union all
5 select 4, 2 as x, 'C' as y from dual union all
6 select 5, 2 as x, 'D' as y from dual
7 )
8 select x,
9 ltrim(sys_connect_by_path(y, ','), ',') result
10 from (select x,
11 y,
12 row_number() over (partition by x order by y) rn
13 from t
14 )
15 start with rn = 1
16 connect by prior rn = rn - 1 and prior x = x;
X RESULT
---------- --------------------
1 A
1 A,B
1 A,B,C
2 C
2 C,D
SQL>

BigQuery: select the nth smallest value in window, ordered by another value

My table has two integer columns: a and b. For each row, I want to select the nth smallest value of b among the rows with smaller a values. Here's a sample input/output, with n=2.
Input:
a | b
-------
1 | 4
2 | 2
3 | 5
4 | 3
5 | 9
6 | 1
7 | 7
8 | 6
9 | 0
Output:
a | 2th min b
-------------
1 | null ← only 1 element in [4], no 2nd min
2 | 4 ← 2nd min between [4,2]
3 | 4 ← 2nd min between [4,2,5]
4 | 3 ← 2nd min between [4,2,5,3]
5 | 3 ← etc.
6 | 2
7 | 2
8 | 2
9 | 1
I used n=2 here to keep it simple, but in practice, I want the 2000th smallest value (or some other large-ish constant). The column a can be assumed to contain distinct integers (and even 1, 2, 3, … if that's easier).
The problem is that if I use ORDER BY b in my window clause and NTH_VALUE, it just computes the answer on the wrong set of values:
WITH data AS (
SELECT 1 AS a, 4 AS b
UNION ALL SELECT 2 AS a, 2 AS b
UNION ALL SELECT 3 AS a, 5 AS b
UNION ALL SELECT 4 AS a, 3 AS b
UNION ALL SELECT 5 AS a, 9 AS b
UNION ALL SELECT 6 AS a, 1 AS b
)
SELECT nth_value(b, 2) over (order by a)
from data
returns [null, 2, 2, 2, 2, 2]: the values are ordered by a (so in the same order than they appear), so the value b=2 is always the one in second place. I want to order by a and then take the nth smallest value of b. Any idea how to write this in BigQuery (preferably Standard SQL)?
Below is for BigQuery Standard SQL and produces correct result for given example.
#standardSQL
WITH `project.dataset.table` AS (
SELECT 1 a, 4 b UNION ALL
SELECT 2, 2 UNION ALL
SELECT 3, 5 UNION ALL
SELECT 4, 3 UNION ALL
SELECT 5, 9 UNION ALL
SELECT 6, 1 UNION ALL
SELECT 7, 7 UNION ALL
SELECT 8, 6 UNION ALL
SELECT 9, 0
)
SELECT
a,
(SELECT b FROM
(SELECT b FROM UNNEST(c) b ORDER BY b LIMIT 2)
ORDER BY b DESC LIMIT 1
) b2
FROM (
SELECT a, IF(ARRAY_LENGTH(c) > 1, c, [NULL]) c
FROM (
SELECT a, ARRAY_AGG(b) OVER (ORDER BY a) c
FROM `project.dataset.table`
)
)
-- ORDER BY a
with expected result as below
Row a b2
1 1 null
2 2 4
3 3 4
4 4 3
5 5 3
6 6 2
7 7 2
8 8 2
9 9 1
Note: to make it work for 2000th element you might change 2 to 2000 in LIMIT 2
meantime, i can admit it looks a little ugly/messy to me and not sure about scalability but you can give it a shot
Quick Update
Below is a little less ugly looking version (same output of course)
#standardSQL
WITH `project.dataset.table` AS (
SELECT 1 a, 4 b UNION ALL
SELECT 2, 2 UNION ALL
SELECT 3, 5 UNION ALL
SELECT 4, 3 UNION ALL
SELECT 5, 9 UNION ALL
SELECT 6, 1 UNION ALL
SELECT 7, 7 UNION ALL
SELECT 8, 6 UNION ALL
SELECT 9, 0
)
SELECT a, c[SAFE_ORDINAL(2)] b2 FROM (
SELECT x.a, ARRAY_AGG(y.b ORDER BY y.b LIMIT 2) c
FROM `project.dataset.table` x
CROSS JOIN `project.dataset.table` y
WHERE y.a <= x.a
GROUP BY x.a
)
-- ORDER BY a
For 2000th element replace 2 to 2000 in LIMIT 2 and SAFE_ORDINAL(2)
Still potentially same issue with scalability because of (now) explicit CROSS JOIN

SQL pad query result for missing groups

Assume the following table:
TableA:
ID GroupName SomeValue
1 C 1
2 C 1
2 B 1
2 A 1
I need to construct a query that selects the following result:
ID GroupName SomeValue
1 C 1
1 B 0
1 A 0
2 C 1
2 B 1
2 A 1
The GroupName is actually derived from TableA column's CASE expression and can take only 3 values: A, B, C.
Are the analytic functions the way to go?
EDIT
Sorry, for not mentioning it, but the ID could consist of multiple columns. Consider this example:
ID1 ID2 GroupName SomeValue
1 1 C 1
1 2 C 1
2 2 C 1
2 2 B 1
2 2 A 1
I need to pad SomeValue with 0 for each unique combination ID1+ID2. So the result should be like this:
ID1 ID2 GroupName SomeValue
1 1 C 1
1 1 B 0
1 1 A 0
1 2 C 1
1 2 B 0
1 2 A 0
2 2 C 1
2 2 B 1
2 2 A 1
EDIT2
Seems like solution, proposed by #Laurence should work even for multiple-column 'ID'. I couldn't rewrite the query proposed by #Nicholas Krasnov to conform to this requirement. But could somebody compare these solutions performance-wise? Will the analytic function work faster than 'cross join + left outer join'?
To fill in gaps, you could write a similar query using partition by clause of outer join:
SQL> with t1(ID,GroupName,SomeValue) as
2 (
3 select 1, 'C', 1 from dual union all
4 select 2, 'C', 1 from dual union all
5 select 2, 'B', 1 from dual union all
6 select 2, 'A', 1 from dual
7 ),
8 groups(group_name) as(
9 select 'A' from dual union all
10 select 'B' from dual union all
11 select 'C' from dual
12 )
13 select t1.ID
14 , g.group_name
15 , nvl(SomeValue, 0) SomeValue
16 from t1
17 partition by (t1.Id)
18 right outer join groups g
19 on (t1.GroupName = g.group_name)
20 order by t1.ID asc, g.group_name desc
21 ;
ID GROUP_NAME SOMEVALUE
---------- ---------- ----------
1 C 1
1 B 0
1 A 0
2 C 1
2 B 1
2 A 1
6 rows selected
UPDATE: Response to the comment.
Specify ID2 column in the partition by clause as well:
SQL> with t1(ID1, ID2, GroupName,SomeValue) as
2 (
3 select 1, 1, 'C', 1 from dual union all
4 select 1, 2, 'C', 1 from dual union all
5 select 2, 2, 'C', 1 from dual union all
6 select 2, 2, 'B', 1 from dual union all
7 select 2, 2, 'A', 1 from dual
8 ),
9 groups(group_name) as(
10 select 'A' from dual union all
11 select 'B' from dual union all
12 select 'C' from dual
13 )
14 select t1.ID1
15 , t1.ID2
16 , g.group_name
17 , nvl(SomeValue, 0) SomeValue
18 from t1
19 partition by (t1.Id1, t1.Id2)
20 right outer join groups g
21 on (t1.GroupName = g.group_name)
22 order by t1.ID1, t1.ID2 asc , g.group_name desc
23 ;
ID1 ID2 GROUP_NAME SOMEVALUE
---------- ---------- ---------- ----------
1 1 C 1
1 1 B 0
1 1 A 0
1 2 C 1
1 2 B 0
1 2 A 0
2 2 C 1
2 2 B 1
2 2 A 1
9 rows selected
Select
i.Id1,
i.Id2,
g.GroupName,
Coalesce(a.SomeValue, 0) As SomeValue
From
(select distinct ID1, ID2 from TableA) as i
cross join
(select distinct GroupName from TableA) as g
left outer join
tableA a
on i.ID = a.ID and g.GroupName = a.GroupName
Order By
1,
2,
3 Desc

Oracle SQL -- Combining two tables, but taking duplicates from one?

I have these tables:
Table A
Num Letter
1 A
2 B
3 C
Table B
Num Letter
2 C
3 D
4 E
I want to union these two tables, but I only want each number to appear once. If the same number appears in both tables, I want it from Table B instead of table A.
Result
Num Letter
1 A
2 C
3 D
4 E
How could I accomplish this? A union will keep duplicates and an intersect would only catch the same rows -- I consider a row a duplicate when it has the same number, regardless of the letter.
Try this: http://www.sqlfiddle.com/#!4/0b796/1
with a as
(
select Num, 'A' as src, Letter
from tblA
union
select Num, 'B' as src, Letter
from tblB
)
select
Num
,case when count(*) > 1 then
min(case when src = 'B' then Letter end)
else
min(Letter)
end as Letter
from a
group by Num
order by Num;
Output:
| NUM | LETTER |
----------------
| 1 | A |
| 2 | C |
| 3 | D |
| 4 | E |
And another one:
SELECT COALESCE(b.num, a.num) num, COALESCE(b.letter, a.letter) letter
FROM a FULL JOIN b ON a.num = b.num
ORDER BY 1;
With your data:
WITH a AS
(SELECT 1 num, 'A' letter FROM dual
UNION ALL SELECT 2, 'B' FROM dual
UNION ALL SELECT 3, 'C' FROM dual),
b AS
(SELECT 2 num, 'C' letter FROM dual
UNION ALL SELECT 3, 'D' FROM dual
UNION ALL SELECT 4, 'E' FROM dual)
SELECT COALESCE(b.num, a.num) num, COALESCE(b.letter, a.letter) letter
FROM a FULL JOIN b ON a.num = b.num
ORDER BY 1;
NUM L
---------- -
1 A
2 C
3 D
4 E
The efficiency might be lacking, but it produces the correct answer.
select nums.num, coalesce(b.letter, a.letter)
from
(select num from b
union
select num from a) nums
left outer join b
on (b.num = nums.num)
left outer join a
on (a.num = nums.num);
Or you can use Oracle-specific technique to make the code shorter: http://www.sqlfiddle.com/#!4/0b796/11
with a as
(
select Num, 'A' as src, Letter
from tblA
union
select Num, 'B' as src, Letter
from tblB
)
select Num, min(Letter) keep(dense_rank first order by src desc) as Letter
from a
group by Num
order by Num;
Output:
| NUM | LETTER |
----------------
| 1 | A |
| 2 | C |
| 3 | D |
| 4 | E |
The code works regardless of min(letter) or max(letter), it has the same output, it gives the same output. Important is you use keep dense_rank. Another important thing is, the order matter, we use order by src desc to give priority to source table B when keeping a row.
And to really make it shorter, use keep dense_rank last, and omit the desc on order by, asc is the default anyway http://www.sqlfiddle.com/#!4/0b796/12
with a as
(
select Num, 'A' as src, Letter
from tblA
union
select Num, 'B' as src, Letter
from tblB
)
select Num, min(Letter) keep(dense_rank last order by src) as Letter
from a
group by Num
order by Num;
Again, using min or max on Letter doesn't matter, as long as your keep dense_rank get the prioritized/preferred row
Another option is to combine the UNION and MINUS commands as follows:
SELECT
NUM, LETTER
FROM
TABLE B
UNION
( SELECT
NUM, LETTER
FROM
TABLE A
WHERE
NUM IN (SELECT
NUM
FROM
TABLE A
MINUS
SELECT
NUM
FROM
TABLE B ))
SELECT A.*
FROM A
WHERE A.NUM NOT IN
(SELECT A.NUM
FROM B
WHERE A.NUM=B.NUM
AND B.NUM IS NOT NULL
AND A.NUM IS NOT NULL
)
UNION
SELECT * FROM B;