Union all two tables instead of join - sql

I have several tables that I cannot join, because the join gets really complicated and BigQuery is not able to process it. So I am trying to union all the tables and then group by. I have an issue during this process. I have two tables called t1 and t2 with the columns below; they don't contain null values:
t1: a, b, c, d
t2: a, b, c, e
To union them and group the result, I have the code below:
WITH
all_tables_unioned AS (
SELECT
*,
NULL e
FROM
`t1`
UNION ALL
SELECT
*,
NULL d
FROM
`t2` )
SELECT
a,
b,
c,
MAX(d) AS d,
MAX(e) AS e
FROM
all_tables_unioned
GROUP BY
a,
b,
c
Unfortunately, when I run this I get a table with columns a, b, c, d, e in which the e column is all null!
I ran the query for each table before the union all to make sure the columns are not null. I do not really know what is wrong with my query.

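union all does not go by column names. In your query, the fourth column of t2 (e) lines up with d from t1, and both padded NULL columns land in the fifth position, so e is always NULL. Just list all the columns explicitly: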
WITH all_tables_unioned AS (
SELECT a, b, c, d, NULL as e
FROM `t1`
UNION ALL
SELECT a, b, c, NULL as d, e
FROM `t2`
)
Regardless of the names you assign, the union all uses positions for matching columns.
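To see the positional matching in action, here is a minimal sketch that runs both forms of the query. It uses Python's built-in sqlite3 as a stand-in for BigQuery (an assumption; the positional behaviour of union all is the same), and the sample rows are made up for illustration.

```python
import sqlite3

# SQLite stands in for BigQuery here; the sample rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (a, b, c, d);
    CREATE TABLE t2 (a, b, c, e);
    INSERT INTO t1 VALUES (1, 1, 1, 'd-value');
    INSERT INTO t2 VALUES (1, 1, 1, 'e-value');
""")

GROUPED = """
    SELECT a, b, c, MAX(d) AS d, MAX(e) AS e
    FROM ({unioned})
    GROUP BY a, b, c
"""

# Original form: t2.e sits in the 4th position, which the first SELECT calls d.
by_star = "SELECT *, NULL AS e FROM t1 UNION ALL SELECT *, NULL AS d FROM t2"
# Fixed form: every column is listed, so d and e each keep their own position.
by_name = ("SELECT a, b, c, d, NULL AS e FROM t1 "
           "UNION ALL SELECT a, b, c, NULL AS d, e FROM t2")

broken = conn.execute(GROUPED.format(unioned=by_star)).fetchall()
fixed = conn.execute(GROUPED.format(unioned=by_name)).fetchall()

print(broken)  # e is NULL and t2's value has leaked into d
print(fixed)   # d and e are both populated
```

In the broken result the value from t2.e shows up under d, which is why e looked empty even though neither source table contains nulls.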

Related

using if condition in snowflake

I have two tables as shown below with columns:
Table A
a, b, c, id
Table B
d, e, f, g, h, id
Now I need to perform a query: I will get an id from the user, and I need to check whether that id is present in table A or table B. The record will be present in only one of the tables.
SELECT * FROM tableA WHERE id = 123
OR
SELECT * FROM tableB WHERE id = 123
So the response will be either the columns of tableA or the columns of tableB. But I can't perform a union, since the number of columns must be equal between the two tables.
So it's basically an if condition. How can I get the desired output in Snowflake?
And is using if the best-optimized approach, or is there another way?
You can use union all -- assuming the types are compatible. Just pad the smaller table:
select a, b, c, null as g, null as h, id
from a
where id = 123
union all
select d, e, f, g, h, id
from b
where id = 123;
If you want the columns separated, then a full join accomplishes that:
select *
from a full join
b
using (id)
where a.id = 123 or b.id = 123;
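As a rough illustration of the padded union all lookup, the sketch below runs it against Python's built-in sqlite3 rather than Snowflake (an assumption made so it can run anywhere). The helper find_by_id and the sample rows are hypothetical.

```python
import sqlite3

# SQLite stands in for Snowflake; table contents are invented for the demo.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tableA (a, b, c, id);
    CREATE TABLE tableB (d, e, f, g, h, id);
    INSERT INTO tableA VALUES ('a1', 'b1', 'c1', 123);
    INSERT INTO tableB VALUES ('d1', 'e1', 'f1', 'g1', 'h1', 456);
""")

# tableA is padded with two NULL columns so both branches return six columns.
LOOKUP = """
    SELECT a, b, c, NULL AS g, NULL AS h, id FROM tableA WHERE id = :id
    UNION ALL
    SELECT d, e, f, g, h, id FROM tableB WHERE id = :id
"""

def find_by_id(record_id):
    """Hypothetical helper: return the matching row from whichever table has it."""
    return conn.execute(LOOKUP, {"id": record_id}).fetchall()

from_a = find_by_id(123)   # row comes from tableA, padded with NULLs
from_b = find_by_id(456)   # row comes from tableB
missing = find_by_id(789)  # id is in neither table

print(from_a, from_b, missing)
```

Only the branch whose table contains the id contributes a row, so no explicit if is needed.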

Redshift Join VS. Union with Group By

Let's say I would like to pull the fields dim, a, b, c, d from 2 tables, one of which contains a, b and the other c, d.
I'm wondering if there's a preferred way (between the following) to do it, performance-wise:
1:
select t1.dim,a,b,c,d
from
(select dim,sum(a) as a,sum(b)as b from t1 group by dim)t1
join
(select dim,sum(c) as c,sum(d) as d from t2 group by dim)t2
on t1.dim=t2.dim;
2:
select dim,sum(a) as a,sum(b) as b,sum(c) as c,sum(d) as d
from
(
select dim,a,b,null as c, null as d from t1
union
select dim,null as a, null as b, c, d from t2
)a
group by dim
This is, of course, for a large amount of data (5-30M records in the final query).
Thanks!
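The first method would filter out any dim values that are not in both tables. union is inefficient because it removes duplicates, and here it would also collapse identical rows before they are summed. So, neither is appealing.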
I would go for:
select dim, sum(a) as a, sum(b) as b, sum(c) as c, sum(d) as d
from (select dim, a, b, null as c, null as d from t1
union all
select dim, null as a, null as b, c, d from t2
) a
group by dim;
You could also pre-aggregate the values in each subquery. Or use full outer join for the first method.
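The difference between the three variants can be checked on a tiny dataset. This is only a sketch: it uses Python's built-in sqlite3 instead of Redshift (an assumption), with invented rows where t1 contains a duplicated row and t2 contains a dim value that t1 lacks.

```python
import sqlite3

# SQLite stands in for Redshift; rows are invented to expose the differences.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (dim, a, b);
    CREATE TABLE t2 (dim, c, d);
    INSERT INTO t1 VALUES ('x', 1, 2), ('x', 1, 2);    -- two identical rows
    INSERT INTO t2 VALUES ('x', 10, 20), ('y', 5, 5);  -- 'y' exists only in t2
""")

TEMPLATE = """
    SELECT dim, SUM(a) AS a, SUM(b) AS b, SUM(c) AS c, SUM(d) AS d
    FROM (SELECT dim, a, b, NULL AS c, NULL AS d FROM t1
          {op}
          SELECT dim, NULL AS a, NULL AS b, c, d FROM t2) u
    GROUP BY dim
    ORDER BY dim
"""

# UNION drops one of the identical t1 rows before summing.
with_union = conn.execute(TEMPLATE.format(op="UNION")).fetchall()
# UNION ALL keeps every row, so the sums are correct.
with_union_all = conn.execute(TEMPLATE.format(op="UNION ALL")).fetchall()

# The inner join of pre-aggregated subqueries loses dim 'y'.
joined = conn.execute("""
    SELECT t1.dim, a, b, c, d
    FROM (SELECT dim, SUM(a) AS a, SUM(b) AS b FROM t1 GROUP BY dim) t1
    JOIN (SELECT dim, SUM(c) AS c, SUM(d) AS d FROM t2 GROUP BY dim) t2
      ON t1.dim = t2.dim
""").fetchall()

print(with_union)
print(with_union_all)
print(joined)
```

With this data, union under-counts a and b for dim x, and the inner join has no row for dim y, while union all returns both dims with full sums.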

UNION without comparing one of the columns

I have two queries
select A, B, C, D from T1, T2
select A, B, C, D from T2, T3
I want to do a UNION of the two queries (no duplicates) but without comparing column D; that is, if columns A, B and C are the same, then the rows are considered duplicates regardless of D. I do not want to select from joined tables T1, T2, and T3. Is this possible in a single statement?
(this is Oracle)
Use UNION and GROUP BY to do this, like the following:
select A, B, C
from(
select A, B, C, D from T1, T2
union
select A, B, C, D from T2, T3
)t
group by A, B, C
You also have to specify which D value you want when A, B, C are the same. Here I assume you want max(D), like this:
select A, B, C, max(D) as D
from(
select A, B, C, D from T1, T2
union
select A, B, C, D from T2, T3
)t
group by A, B, C
No matter which value you want to keep, when you use group by in Oracle you can only select columns that appear in the group by, or other columns wrapped in aggregate functions.
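A small sketch of the effect, using Python's built-in sqlite3 in place of Oracle (an assumption). The tables q1 and q2 are hypothetical stand-ins for the results of the two original queries, and the rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- q1 and q2 are hypothetical tables standing in for the two query results.
    CREATE TABLE q1 (A, B, C, D);
    CREATE TABLE q2 (A, B, C, D);
    INSERT INTO q1 VALUES (1, 1, 1, 10), (2, 2, 2, 20);
    INSERT INTO q2 VALUES (1, 1, 1, 99), (3, 3, 3, 30);
""")

# A plain UNION compares all four columns, so (1, 1, 1) appears twice.
plain_union = conn.execute("""
    SELECT A, B, C, D FROM q1
    UNION
    SELECT A, B, C, D FROM q2
    ORDER BY A, D
""").fetchall()

# Grouping by A, B, C and aggregating D leaves one row per (A, B, C).
deduplicated = conn.execute("""
    SELECT A, B, C, MAX(D) AS D
    FROM (SELECT A, B, C, D FROM q1
          UNION
          SELECT A, B, C, D FROM q2) t
    GROUP BY A, B, C
    ORDER BY A
""").fetchall()

print(plain_union)
print(deduplicated)
```

Swapping max(D) for min(D) or another aggregate changes which D survives; the grouping itself is what removes the duplicates on A, B, C.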

Reuse subquery results in multiple SELECT in Oracle (Can't create table)

I have multiple SELECT queries using the same subset of data. I would like to reuse it, so that I don't have to repeat the subquery or WITH clause. However, I can't CREATE TABLE or VIEW because of insufficient privileges. So is there a workaround?
I'm using TOAD Oracle.
For example,
WITH LOCAL_RESULTS
AS (SELECT a, b, c, d...
FROM SURVEY )
SELECT A, B
FROM LOCAL_RESULTS
where condition=1
WITH LOCAL_RESULTS
AS (SELECT a, b, c, d...
FROM SURVEY )
SELECT A, C
FROM LOCAL_RESULTS
where condition=2
WITH LOCAL_RESULTS
AS (SELECT a, b, c, d...
FROM SURVEY )
SELECT B, D, A...
FROM LOCAL_RESULTS
where condition=3
Thanks for any help.
A union query might work.
with local_results as
(subquery goes here)
select a, b, c, 1 condition
from local_results
where whatever
union
select a, b, null c, 2 condition
from local_results
etc

select a field from the minus query

select a, b, c from tab1
minus
select d, e, f from tab2
Above is what my query looks like. How do I reformat my query to display a, b, c and f?
I tried the below, but I keep getting an invalid identifier error.
select t.a, t.b, t.c, t.f from
(select a, b, c from tab1
minus
select d, e, f from tab2)t
thanks
First, recall that operator minus takes the results of the first query (i.e. rows with columns a, b, c from tab1) and removes all rows such that there is a corresponding row in tab2, such that the combination of its columns d, e, f matches a combination of a, b, c from tab1.
Now it should be clear that what you are trying to do does not make sense: all rows from the second select, i.e.
select d, e, f from tab2
are excluded from the rows returned by the first select. In other words, there would not be a single row in the result with a value that came from column f (or for that matter, from any of the columns of tab2).
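The point can be checked with a short sketch. It uses Python's built-in sqlite3, where the operator is spelled EXCEPT rather than Oracle's minus (an assumption made so the example is runnable); the rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tab1 (a, b, c);
    CREATE TABLE tab2 (d, e, f);
    INSERT INTO tab1 VALUES (1, 1, 'same'), (2, 2, 'only in tab1');
    INSERT INTO tab2 VALUES (1, 1, 'same'), (3, 3, 'only in tab2');
""")

# EXCEPT is SQLite's spelling of Oracle's MINUS.
cursor = conn.execute("""
    SELECT a, b, c FROM tab1
    EXCEPT
    SELECT d, e, f FROM tab2
""")

# The result columns are named after the first SELECT; there is no f.
column_names = [col[0] for col in cursor.description]
rows = cursor.fetchall()

print(column_names)
print(rows)
```

The result only has columns a, b, c, and every surviving row comes from tab1, so selecting t.f from it cannot work.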
The following query will display rows that are present in both the tab1 and tab2 tables, but with different values for tab1.c and tab2.f. If you want to display rows that are present in tab1, but whose a and b values are not present in tab2, then the WHERE clause can be modified accordingly.
select tab1.a, tab1.b, tab1.c, tab2.f
from tab1 FULL OUTER JOIN tab2
ON tab1.a = tab2.d AND tab1.b = tab2.e
WHERE
-- (tab1.a IS NOT NULL AND tab1.b IS NOT NULL) AND
-- (tab1.c IS NULL OR tab2.f IS NULL) OR
(NVL(tab1.c, 0) <> NVL(tab2.f, 0));
Here is a SQL Fiddle demo.