UNION without comparing one of the columns - sql

I have two queries
select A, B, C, D from T1, T2
select A, B, C, D from T2, T3
I want to do a UNION of the two queries (no duplicates) but not comparing column D, that is if columns A B and C are the same then they are considered duplicates regardless of D. I do not want to select from joined tables T1, T2, and T3. Is this possible on a single statement?
(this is Oracle)

Use UNION and GROUP BY to do this, like the following:
select A, B, C
from(
select A, B, C, D from T1, T2
union
select A, B, C, D from T2, T3
)t
group by A, B, C
You also have to specify which D value you want when A, B, C are the same. Here I assume you want max(D):
select A, B, C, max(D) as D
from(
select A, B, C, D from T1, T2
union
select A, B, C, D from T2, T3
)t
group by A, B, C
Whichever value you want to keep, when you use GROUP BY in Oracle you can only select columns that appear in the GROUP BY clause, or other columns wrapped in aggregate functions.
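To see the effect on concrete data, here is a minimal sketch that runs the same query shape through Python's built-in sqlite3 module as a stand-in for Oracle. The tables u1 and u2 and their rows are made up for illustration; they stand in for the two joined sources (T1, T2 and T2, T3) in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- u1 and u2 are hypothetical stand-ins for the two source queries
    CREATE TABLE u1 (A INTEGER, B INTEGER, C INTEGER, D INTEGER);
    CREATE TABLE u2 (A INTEGER, B INTEGER, C INTEGER, D INTEGER);
    INSERT INTO u1 VALUES (1, 1, 1, 10), (2, 2, 2, 20);
    INSERT INTO u2 VALUES (1, 1, 1, 99), (3, 3, 3, 30);
""")

# A plain UNION compares all four columns, so (1, 1, 1) appears twice.
plain_union = conn.execute("""
    SELECT A, B, C, D FROM u1
    UNION
    SELECT A, B, C, D FROM u2
    ORDER BY A, D
""").fetchall()

# Grouping by A, B, C collapses those rows; MAX(D) picks which D survives.
dedup_on_abc = conn.execute("""
    SELECT A, B, C, MAX(D) AS D
    FROM (
        SELECT A, B, C, D FROM u1
        UNION
        SELECT A, B, C, D FROM u2
    ) t
    GROUP BY A, B, C
    ORDER BY A
""").fetchall()

print(plain_union)   # [(1, 1, 1, 10), (1, 1, 1, 99), (2, 2, 2, 20), (3, 3, 3, 30)]
print(dedup_on_abc)  # [(1, 1, 1, 99), (2, 2, 2, 20), (3, 3, 3, 30)]
```

Swapping MAX for MIN would keep the smaller D instead; the grouping itself is what makes A, B, C the only columns compared.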

Related

union all two table instead of join

I have several tables which I cannot join, as the join gets really complicated and BigQuery is not able to process it. So I am trying to UNION ALL the tables and then GROUP BY. I have an issue during this process. I have two tables called t1 and t2 with the headers below; they don't have null values:
t1: a, b, c, d
t2: a, b, c, e
So in order to union all and group them I have the code below:
WITH
all_tables_unioned AS (
SELECT
*,
NULL e
FROM
`t1`
UNION ALL
SELECT
*,
NULL d
FROM
`t2` )
SELECT
a,
b,
c,
MAX(d) AS d,
MAX(e) AS e
FROM
all_tables_unioned
GROUP BY
a,
b,
c
Unfortunately, when I run this I get a table with columns a, b, c, d, e in which the e column is all NULL!
I ran the query for each table before the UNION ALL to make sure the values are not null. I do not really know what is wrong with my query.
UNION ALL does not match columns by name. Just list all the columns explicitly:
WITH all_tables_unioned AS (
SELECT a, b, c, d, NULL as e
FROM `t1`
UNION ALL
SELECT a, b, c, NULL as d, e
FROM `t2`
)
Regardless of the names you assign, UNION ALL matches columns by position.
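The positional matching can be reproduced on a tiny data set. This is a sketch only: it uses Python's built-in sqlite3 module instead of BigQuery (SQLite also matches set-operation columns by position), and the single row in each table is invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (a INTEGER, b INTEGER, c INTEGER, d INTEGER);
    CREATE TABLE t2 (a INTEGER, b INTEGER, c INTEGER, e INTEGER);
    INSERT INTO t1 VALUES (1, 1, 1, 10);
    INSERT INTO t2 VALUES (1, 1, 1, 50);
""")

# Outer query shared by both variants; only the unioned body changes.
GROUPED = """
    SELECT a, b, c, MAX(d) AS d, MAX(e) AS e
    FROM ({union_body}) u
    GROUP BY a, b, c
"""

# SELECT * puts t2.e in the 4th position, which lines up with d, not e.
by_star = conn.execute(GROUPED.format(union_body="""
    SELECT *, NULL AS e FROM t1
    UNION ALL
    SELECT *, NULL AS d FROM t2
""")).fetchall()

# Listing the columns explicitly puts every value in the intended slot.
explicit = conn.execute(GROUPED.format(union_body="""
    SELECT a, b, c, d, NULL AS e FROM t1
    UNION ALL
    SELECT a, b, c, NULL AS d, e FROM t2
""")).fetchall()

print(by_star)   # [(1, 1, 1, 50, None)]
print(explicit)  # [(1, 1, 1, 10, 50)]
```

In the first result the value 50 from t2.e has landed in d and e is NULL for every row, which is the symptom described in the question.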

Redundant use of distinct in group by?

I'm reviewing some SQL queries in SAS and I encountered the following query structure:
SELECT distinct A, B, Sum(C) FROM Table1 GROUP BY A, B;
I would like to know if it's strictly equivalent to:
SELECT A, B, Sum(C) FROM Table1 GROUP BY A, B;
Or am I missing a nuance, in the output or in the way the computation is handled?
The two queries are equivalent.
Generally,
SELECT DISTINCT a, b, c
FROM <something>
is equivalent to
SELECT a, b, c
FROM <something>
GROUP BY a, b, c
In your case, <something> happens to be the result of a GROUP BY query, whose columns A and B are already distinct. This is enough to ensure that the triples A, B, SUM(C) are unique as well.
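A quick way to check this on sample data is sketched below, using Python's built-in sqlite3 module rather than SAS (an assumption made only so the example is runnable); the rows in Table1 are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (A TEXT, B TEXT, C INTEGER);
    INSERT INTO Table1 VALUES
        ('x', 'p', 1), ('x', 'p', 2), ('x', 'q', 3), ('y', 'p', 4);
""")

with_distinct = conn.execute("""
    SELECT DISTINCT A, B, SUM(C) FROM Table1 GROUP BY A, B ORDER BY A, B
""").fetchall()

without_distinct = conn.execute("""
    SELECT A, B, SUM(C) FROM Table1 GROUP BY A, B ORDER BY A, B
""").fetchall()

# GROUP BY already yields one row per (A, B), so DISTINCT removes nothing.
print(with_distinct == without_distinct)  # True
print(without_distinct)  # [('x', 'p', 3), ('x', 'q', 3), ('y', 'p', 4)]
```

Note that ('x', 'p') and ('x', 'q') share the same sum and still remain separate rows, because A and B are part of the compared tuple.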

Redshift Join VS. Union with Group By

Let's say I would like to pull the fields dim, a, b, c, d from 2 tables, one of which contains a, b and the other c, d.
I'm wondering whether one of the following is preferable, performance-wise:
1:
select t1.dim,a,b,c,d
from
(select dim,sum(a) as a,sum(b)as b from t1 group by dim)t1
join
(select dim,sum(c) as c,sum(d) as d from t2 group by dim)t2
on t1.dim=t2.dim;
2:
select dim,sum(a) as a,sum(b) as b,sum(c) as c,sum(d) as d
from
(
select dim,a,b,null as c, null as d from t1
union
select dim,null as a, null as b, c, d from t2
)a
group by dim
This is for a large amount of data, of course (5-30M records in the final query).
Thanks!
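The first method filters out any dim values that are not in both tables. UNION is inefficient because it has to remove duplicate rows, and removing them also changes the sums whenever a table contains identical rows. So, neither is appealing.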
I would go for:
select dim, sum(a) as a, sum(b) as b, sum(c) as c, sum(d) as d
from (select dim, a, b, null as c, null as d from t1
union all
select dim, null as a, null as b, c, d from t2
) a
group by dim;
You could also pre-aggregate the values in each subquery. Or use full outer join for the first method.
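The differences between the three variants show up on a small data set. The sketch below uses Python's built-in sqlite3 module in place of Redshift (an assumption for runnability), with invented rows: 'x' has two identical rows in t1, 'y' exists only in t1 and 'z' only in t2.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (dim TEXT, a INTEGER, b INTEGER);
    CREATE TABLE t2 (dim TEXT, c INTEGER, d INTEGER);
    INSERT INTO t1 VALUES ('x', 1, 1), ('x', 1, 1), ('y', 2, 2);
    INSERT INTO t2 VALUES ('x', 5, 5), ('z', 7, 7);
""")

def grouped(set_operator):
    """Run the union-then-group query with either UNION or UNION ALL."""
    return conn.execute(f"""
        SELECT dim, SUM(a) AS a, SUM(b) AS b, SUM(c) AS c, SUM(d) AS d
        FROM (SELECT dim, a, b, NULL AS c, NULL AS d FROM t1
              {set_operator}
              SELECT dim, NULL AS a, NULL AS b, c, d FROM t2) u
        GROUP BY dim
        ORDER BY dim
    """).fetchall()

with_union_all = grouped("UNION ALL")
with_union = grouped("UNION")

# Method 1 from the question: aggregate each table, then inner join.
inner_join = conn.execute("""
    SELECT t1.dim, a, b, c, d
    FROM (SELECT dim, SUM(a) AS a, SUM(b) AS b FROM t1 GROUP BY dim) t1
    JOIN (SELECT dim, SUM(c) AS c, SUM(d) AS d FROM t2 GROUP BY dim) t2
      ON t1.dim = t2.dim
    ORDER BY t1.dim
""").fetchall()

print(with_union_all)  # [('x', 2, 2, 5, 5), ('y', 2, 2, None, None), ('z', None, None, 7, 7)]
print(with_union)      # [('x', 1, 1, 5, 5), ('y', 2, 2, None, None), ('z', None, None, 7, 7)]
print(inner_join)      # [('x', 2, 2, 5, 5)]
```

UNION collapses the two identical 'x' rows and so undercounts a and b, while the inner join loses 'y' and 'z' entirely; UNION ALL keeps every row and every dim.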

Reuse subquery results in multiple SELECT in Oracle (Can't create table)

I have multiple SELECT queries using same subset of data. I would like to reuse it so no repeated subqueries or WITH clause. However, I can't CREATE TABLE or VIEW because of insufficient privileges. So is there a workaround?
I'm using TOAD Oracle.
For example,
WITH LOCAL_RESULTS
AS (SELECT a, b, c, d...
FROM SURVEY )
SELECT A, B
FROM LOCAL_RESULTS
where condition=1
WITH LOCAL_RESULTS
AS (SELECT a, b, c, d...
FROM SURVEY )
SELECT A, C
FROM LOCAL_RESULTS
where condition=2
WITH LOCAL_RESULTS
AS (SELECT a, b, c, d...
FROM SURVEY )
SELECT B, D, A...
FROM LOCAL_RESULTS
where condition=3
Thanks for any help.
A union query might work.
with local_results as
(subquery goes here)
select a, b, c, 1 condition
from local_results
where whatever
union
select a, b, null c, 2 condition
from local_results
etc
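As a rough illustration of that idea, the sketch below runs one statement with a single WITH clause and two tagged branches. It uses Python's built-in sqlite3 module instead of Oracle, and the survey columns, the filter column cond, the tag column branch and all rows are hypothetical, since the question elides the real column list.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical survey table; cond is the column being filtered on
    CREATE TABLE survey (a INTEGER, b TEXT, c TEXT, cond INTEGER);
    INSERT INTO survey VALUES
        (1, 'b1', 'c1', 1), (2, 'b2', 'c2', 2), (3, 'b3', 'c3', 1);
""")

# One CTE, referenced by every branch. Each branch pads the columns it
# does not need with NULL and adds a literal tag so rows can be told apart.
rows = conn.execute("""
    WITH local_results AS (
        SELECT a, b, c, cond FROM survey
    )
    SELECT a, b, NULL AS c, 1 AS branch FROM local_results WHERE cond = 1
    UNION
    SELECT a, NULL AS b, c, 2 AS branch FROM local_results WHERE cond = 2
    ORDER BY branch, a
""").fetchall()

print(rows)  # [(1, 'b1', None, 1), (3, 'b3', None, 1), (2, None, 'c2', 2)]
```

The client can then split the result on the branch column. Every branch must return the same number of columns with compatible types, which is why the unused ones are padded with NULL.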

Where one or another column exists in a sub select

I'm looking to do something like this:
SELECT a, b, c, d FROM someTable
WHERE a in (SELECT testA FROM otherTable);
Only I want to be able to test whether either of 2 columns exists in a sub select of 2 columns, something like:
SELECT a, b, c, d FROM someTable
WHERE a OR b in (SELECT testA, testB FROM otherTable);
We are using MS SQL Server 2012
Try this
SELECT a, b, c, d
FROM someTable
WHERE a IN (SELECT testA FROM otherTable)
OR b IN (SELECT testB FROM otherTable)
or
SELECT a, b, c, d
FROM someTable
WHERE EXISTS
(SELECT NULL
FROM otherTable
WHERE testA = a OR testB = a
OR testA = b OR testB = b)
UPDATE:
You may need to add an index on the testB column if performance is bad.
Another option on MS SQL Server is CROSS APPLY:
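SELECT a, b, c, d
FROM someTable ST
CROSS APPLY (
SELECT TOP (1) 1 AS matched -- TOP (1) keeps an ST row from repeating when several OT rows match
FROM otherTable OT
WHERE OT.testA = ST.a OR OT.testB = ST.b
) AS m -- the derived table needs an alias and a named column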
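If none of this works, try using UNION. UNION often performs better than OR, because each branch can use its own index, although the outcome depends on the indexes available and on the optimizer: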
SELECT a, b, c, d
FROM someTable
WHERE a IN (SELECT testA FROM otherTable)
UNION
SELECT a, b, c, d
FROM someTable
WHERE b IN (SELECT testB FROM otherTable)
UPDATE 2:
For further reading on OR and UNION differences
Why is UNION faster than an OR statement
Try this..
SELECT a, b, c, d
FROM someTable
WHERE Exists
(
SELECT 1
FROM otherTable
Where a = testA OR b = testB
)
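The IN ... OR IN, EXISTS and UNION forms suggested in this thread can be compared side by side. The sketch below is an illustration only: it runs on Python's built-in sqlite3 module rather than SQL Server 2012, and the rows in someTable and otherTable are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE someTable (a INTEGER, b INTEGER, c TEXT, d TEXT);
    CREATE TABLE otherTable (testA INTEGER, testB INTEGER);
    INSERT INTO someTable VALUES
        (1, 10, 'c1', 'd1'), (2, 20, 'c2', 'd2'),
        (3, 30, 'c3', 'd3'), (4, 40, 'c4', 'd4');
    -- row 1 matches on a, row 2 on b, row 3 on both, row 4 on neither
    INSERT INTO otherTable VALUES (1, 99), (98, 20), (3, 30);
""")

in_or = conn.execute("""
    SELECT a, b, c, d FROM someTable
    WHERE a IN (SELECT testA FROM otherTable)
       OR b IN (SELECT testB FROM otherTable)
    ORDER BY a
""").fetchall()

exists = conn.execute("""
    SELECT a, b, c, d FROM someTable
    WHERE EXISTS (SELECT 1 FROM otherTable WHERE a = testA OR b = testB)
    ORDER BY a
""").fetchall()

union = conn.execute("""
    SELECT a, b, c, d FROM someTable
    WHERE a IN (SELECT testA FROM otherTable)
    UNION
    SELECT a, b, c, d FROM someTable
    WHERE b IN (SELECT testB FROM otherTable)
    ORDER BY a
""").fetchall()

print(in_or == exists == union)  # True
print(in_or)  # [(1, 10, 'c1', 'd1'), (2, 20, 'c2', 'd2'), (3, 30, 'c3', 'd3')]
```

One caveat: UNION removes duplicate rows, so if someTable itself contained two identical rows, the UNION form would return one of them while the other two forms would return both.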
If I'm understanding your question correctly, LEFT JOIN is probably the way to go here:
SELECT ta.a, ta.b, ta.c, ta.d -- qualified, since a and b exist in both tables
FROM TableA ta
LEFT JOIN TableB tb
ON ta.a = tb.a
AND ta.b = tb.b
WHERE tb.a IS NOT NULL
AND tb.c IS NOT NULL
You could also use UNION and INNER JOIN:
SELECT a, b, c, d
FROM someTable
INNER JOIN OtherTable OT on someTable.B = OT.testB
UNION
SELECT a, b, c, d
FROM someTable
INNER JOIN OtherTable OT ON someTable.A= OT.testA
Note that the JOIN approach should be orders of magnitude faster if you have an index on the joined columns.
Joins seem to be one option. Have you thought about using them with a UNION?
SELECT a, b, c, d
FROM someTable
INNER JOIN OtherTable OT on someTable.B = OT.testB
UNION
SELECT a, b, c, d
FROM someTable
INNER JOIN OtherTable OT ON someTable.A= OT.testA