select a field from the minus query - sql

select a, b, c from tab1
minus
select d, e, f from tab2
Above is what my query looks like. How do I rewrite it to display a, b, c and f?
I tried the query below, but I keep getting an "invalid identifier" error.
select t.a, t.b, t.c, t.f from
(select a, b, c from tab1
minus
select d, e, f from tab2)t
thanks

First, recall that the minus operator takes the result of the first query (i.e. rows with columns a, b, c from tab1) and removes every row for which tab2 has a row whose d, e, f values match its a, b, c values.
Now it should be clear that what you are trying to do does not make sense: all rows from the second select, i.e.
select d, e, f from tab2
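are excluded from the rows returned by the first select. In other words, there would not be a single row in the result with a value that came from column f (or for that matter, from any of the columns of tab2). That is also why t.f is an invalid identifier: the result of a minus takes its column names from the first select, so the inline view t only has columns a, b and c.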
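The behaviour described above can be reproduced with a small sketch. It is not Oracle: it assumes Python's built-in sqlite3 module, where the operator is spelled EXCEPT instead of MINUS, and the sample rows are invented for illustration.

```python
import sqlite3

# Assumption: SQLite stands in for Oracle; EXCEPT is SQLite's spelling of MINUS.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tab1 (a INTEGER, b INTEGER, c INTEGER);
    CREATE TABLE tab2 (d INTEGER, e INTEGER, f INTEGER);
    -- Invented sample data: only (2, 2, 2) exists in both tables.
    INSERT INTO tab1 VALUES (1, 1, 1), (2, 2, 2), (3, 3, 3);
    INSERT INTO tab2 VALUES (2, 2, 2), (3, 3, 9);
""")

cur = conn.execute("""
    SELECT a, b, c FROM tab1
    EXCEPT
    SELECT d, e, f FROM tab2
    ORDER BY a
""")
# The result set is named after the first select, so there is no column f.
minus_columns = [col[0] for col in cur.description]
minus_rows = cur.fetchall()

try:
    conn.execute("""
        SELECT t.f FROM (
            SELECT a, b, c FROM tab1
            EXCEPT
            SELECT d, e, f FROM tab2
        ) t
    """)
    f_is_selectable = True
except sqlite3.OperationalError:
    # SQLite's counterpart of Oracle's "invalid identifier" error.
    f_is_selectable = False

print(minus_columns, minus_rows, f_is_selectable)
```

Only rows that came from tab1 survive, and the derived table exposes a, b and c only, which is why selecting t.f fails.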

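The following query displays rows that are present in both tab1 and tab2 (matched on a = d and b = e) but with different values for tab1.c and tab2.f. Because it is a full outer join, a row that exists in only one table also appears whenever its c or f value is neither null nor 0; the NVL(..., 0) comparison also assumes c and f are numeric. If you want to display rows that are present in tab1 but whose a and b values are not present in tab2, the WHERE clause can be modified accordingly.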
select tab1.a, tab1.b, tab1.c, tab2.f
from tab1 FULL OUTER JOIN tab2
ON tab1.a = tab2.d AND tab1.b = tab2.e
WHERE
-- (tab1.a IS NOT NULL AND tab1.b IS NOT NULL) AND
-- (tab1.c IS NULL OR tab2.f IS NULL) OR
(NVL(tab1.c, 0) <> NVL(tab2.f, 0));
Here is a SQL Fiddle demo.

Related

union all two table instead of join

I have several tables that I cannot join, because the query gets really complicated and BigQuery is not able to process it. So I am trying to union all the tables and then group by. I have an issue with this process. I have two tables called t1 and t2 with the columns below; they contain no null values:
t1: a, b, c, d
t2: a, b, c, e
To union and group them I have the code below:
WITH
all_tables_unioned AS (
SELECT
*,
NULL e
FROM
`t1`
UNION ALL
SELECT
*,
NULL d
FROM
`t2` )
SELECT
a,
b,
c,
MAX(d) AS d,
MAX(e) AS e
FROM
all_tables_unioned
GROUP BY
a,
b,
c
Unfortunately, when I run this I get a table with columns a, b, c, d, e in which the e column is all null!
I ran the query against each table before the union all to make sure the values are not null. I do not really know what is wrong with my query.
union all does not go by column names. Just list all the columns explicitly:
WITH all_tables_unioned AS (
SELECT a, b, c, d, NULL as e
FROM `t1`
UNION ALL
SELECT a, b, c, NULL as d, e
FROM `t2`
)
Regardless of the names you assign, the union all uses positions for matching columns.
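To see the positional matching in action, here is a minimal sketch. It assumes Python's built-in sqlite3 module rather than BigQuery, and one invented row per table; the set operator follows the same positional rule in both.

```python
import sqlite3

# Assumption: SQLite stands in for BigQuery; the rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (a, b, c, d);
    CREATE TABLE t2 (a, b, c, e);
    INSERT INTO t1 VALUES (1, 1, 1, 'd1');
    INSERT INTO t2 VALUES (1, 1, 1, 'e1');
""")

# Original shape: t2 expands to (a, b, c, e, NULL), so its e values land in
# the 4th position (named d by the first select) and the 5th position is NULL.
by_position = conn.execute("""
    SELECT a, b, c, MAX(d) AS d, MAX(e) AS e
    FROM (
        SELECT *, NULL AS e FROM t1
        UNION ALL
        SELECT *, NULL AS d FROM t2
    )
    GROUP BY a, b, c
""").fetchall()

# Explicit column lists put every value in the intended position.
explicit = conn.execute("""
    SELECT a, b, c, MAX(d) AS d, MAX(e) AS e
    FROM (
        SELECT a, b, c, d, NULL AS e FROM t1
        UNION ALL
        SELECT a, b, c, NULL AS d, e FROM t2
    )
    GROUP BY a, b, c
""").fetchall()

print(by_position)
print(explicit)
```

In the first result the e column is null and the d column has silently absorbed the value from t2.e; the second result has both values where they belong.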

I want to display an extra column in my select query results, and that column's value comes from another select statement

select A,B,C, (select id from tbl2) as D from Tbl1
The D value might be the same for each row.
result will be
The D column value is dynamic (it changes every time); I don't want to pass the value manually.
Usually this is done with a join, as below:
select a.A, a.B, a.C, b.id as D
from Tbl1 a
left join tbl2 b
on a.some_field_1 = b.some_field_2
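As a rough comparison of the two approaches, the sketch below runs both the scalar subquery from the question and the left join from the answer. It assumes Python's built-in sqlite3 module; the join columns some_field_1 and some_field_2 are the placeholders used above, and the rows are invented.

```python
import sqlite3

# Assumption: SQLite as a stand-in database, with invented rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Tbl1 (A, B, C, some_field_1);
    CREATE TABLE tbl2 (id, some_field_2);
    INSERT INTO Tbl1 VALUES ('a1', 'b1', 'c1', 10), ('a2', 'b2', 'c2', 20);
    INSERT INTO tbl2 VALUES (100, 10);
""")

# Scalar subquery: the same single value is repeated on every row.
scalar = conn.execute("""
    SELECT A, B, C, (SELECT id FROM tbl2) AS D
    FROM Tbl1
    ORDER BY A
""").fetchall()

# Left join: D depends on the matching row; unmatched rows get NULL.
joined = conn.execute("""
    SELECT a.A, a.B, a.C, b.id AS D
    FROM Tbl1 a
    LEFT JOIN tbl2 b
        ON a.some_field_1 = b.some_field_2
    ORDER BY a.A
""").fetchall()

print(scalar)
print(joined)
```

The scalar subquery form is only safe when the inner select returns a single row. SQLite quietly takes the first row, while Oracle, for one, raises an error when a scalar subquery returns more than one row.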

Redshift Join VS. Union with Group By

Let's say I would like to pull the fields dim, a, b, c, d from two tables, one of which contains a, b and the other c, d.
I'm wondering whether one of the following is preferable, performance-wise:
1:
select t1.dim,a,b,c,d
from
(select dim,sum(a) as a,sum(b)as b from t1 group by dim)t1
join
(select dim,sum(c) as c,sum(d) as d from t2 group by dim)t2
on t1.dim=t2.dim;
2:
select dim,sum(a) as a,sum(b) as b,sum(c) as c,sum(d) as d
from
(
select dim,a,b,null as c, null as d from t1
union
select dim,null as a, null as b, c, d from t2
)a
group by dim
This is for a large amount of data, of course (5-30M records in the final query).
Thanks!
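The first method filters out any dim values that are not in both tables. The second uses union, which is inefficient because it has to remove duplicate rows (and removing duplicates can also change the sums). So, neither is appealing.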
I would go for:
select dim, sum(a) as a, sum(b) as b, sum(c) as c, sum(d) as d
from (select dim, a, b, null as c, null as d from t1
union all
select dim, null as a, null as b, c, d from t2
) a
group by dim;
You could also pre-aggregate the values in each subquery. Or use full outer join for the first method.
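The differences between the three variants show up on even a tiny data set. The sketch below assumes Python's built-in sqlite3 module instead of Redshift and uses invented rows, so it illustrates the results only, not the performance.

```python
import sqlite3

# Assumption: SQLite stands in for Redshift; the rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (dim TEXT, a INTEGER, b INTEGER);
    CREATE TABLE t2 (dim TEXT, c INTEGER, d INTEGER);
    -- 'x' has two identical rows in t1; 'y' and 'z' exist in one table only.
    INSERT INTO t1 VALUES ('x', 1, 1), ('x', 1, 1), ('y', 2, 2);
    INSERT INTO t2 VALUES ('x', 5, 5), ('z', 7, 7);
""")

GROUPED = """
    SELECT dim, SUM(a) AS a, SUM(b) AS b, SUM(c) AS c, SUM(d) AS d
    FROM (
        SELECT dim, a, b, NULL AS c, NULL AS d FROM t1
        {set_op}
        SELECT dim, NULL AS a, NULL AS b, c, d FROM t2
    ) u
    GROUP BY dim
    ORDER BY dim
"""
with_union_all = conn.execute(GROUPED.format(set_op="UNION ALL")).fetchall()
# UNION collapses the two identical 'x' rows before they are summed.
with_union = conn.execute(GROUPED.format(set_op="UNION")).fetchall()

# Method 1: the inner join keeps only dims present in both tables.
with_join = conn.execute("""
    SELECT s1.dim, a, b, c, d
    FROM (SELECT dim, SUM(a) AS a, SUM(b) AS b FROM t1 GROUP BY dim) s1
    JOIN (SELECT dim, SUM(c) AS c, SUM(d) AS d FROM t2 GROUP BY dim) s2
        ON s1.dim = s2.dim
""").fetchall()

print(with_union_all)
print(with_union)
print(with_join)
```

Only the union all version keeps all three dim values and the correct sums for 'x'.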

UNION without comparing one of the columns

I have two queries
select A, B, C, D from T1, T2
select A, B, C, D from T2, T3
I want to do a UNION of the two queries (no duplicates) but without comparing column D; that is, if columns A, B and C are the same then the rows are considered duplicates regardless of D. I do not want to select from joined tables T1, T2, and T3. Is this possible in a single statement?
(this is Oracle)
Use UNION and GROUP BY to do this, like the following:
select A, B, C
from(
select A, B, C, D from T1, T2
union
select A, B, C, D from T2, T3
)t
group by A, B, C
You also have to specify which D value you want when A, B, C are the same. Here I assume you want max(D), like this:
select A, B, C, max(D) as D
from(
select A, B, C, D from T1, T2
union
select A, B, C, D from T2, T3
)t
group by A, B, C
No matter which value you want to keep, when you use group by in Oracle you can only select columns that appear in the group by clause, or other columns wrapped in aggregate functions.
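A small runnable sketch of the max(D) variant follows. It assumes Python's built-in sqlite3 module rather than Oracle, and two hypothetical tables q1 and q2 that stand in for the results of the two original queries; the rows are invented.

```python
import sqlite3

# Assumption: q1 and q2 are hypothetical stand-ins for the two query results.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE q1 (A, B, C, D);
    CREATE TABLE q2 (A, B, C, D);
    -- (1, 1, 1) appears in both inputs with different D values.
    INSERT INTO q1 VALUES (1, 1, 1, 'x'), (2, 2, 2, 'y');
    INSERT INTO q2 VALUES (1, 1, 1, 'z'), (3, 3, 3, 'w');
""")

deduplicated = conn.execute("""
    SELECT A, B, C, MAX(D) AS D
    FROM (
        SELECT A, B, C, D FROM q1
        UNION
        SELECT A, B, C, D FROM q2
    ) t
    GROUP BY A, B, C
    ORDER BY A
""").fetchall()

print(deduplicated)
```

Each (A, B, C) combination appears once, and MAX(D) decides which D survives for the duplicated key.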

Problem: Group BY clause showing results previously filtered out by where clause

I have something like this
select A, B, C
from tableA
where A = '123'
group by B
and the results include entries whose A is not '123'. Why are the results not what I expected?
thanks
database has 16k entries
actual result (7k entries): a mixture of entries with A='123' and A='other'
expected results (5k entries): all entries with A='123'
Your query will not work, as A and C are not in the group by clause. For C you have to use an aggregate function (Min, Max, Avg, Count, ...), while for A you can use either an aggregate function or the value of A directly, something like:
Select Max(A) as A, B, Max(C) as C
From tableA
Where A='123'
Group by B
Or
Select '123' as A, B, Max(C) as C
From tableA
Where A='123'
Group by B
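The first form can be checked with a short sketch, assuming Python's built-in sqlite3 module and a few invented rows in tableA. The where clause is applied before grouping, so no row with another A value reaches the groups.

```python
import sqlite3

# Assumption: SQLite as a stand-in database, with invented rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tableA (A TEXT, B TEXT, C INTEGER);
    INSERT INTO tableA VALUES
        ('123', 'b1', 1), ('123', 'b1', 5), ('other', 'b1', 9),
        ('123', 'b2', 2), ('other', 'b3', 4);
""")

grouped = conn.execute("""
    SELECT MAX(A) AS A, B, MAX(C) AS C
    FROM tableA
    WHERE A = '123'
    GROUP BY B
    ORDER BY B
""").fetchall()

print(grouped)
```

The group for 'b1' reports C = 5, not 9, because the 'other' row was filtered out first, and 'b3' does not appear at all.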