Finding if a value does not exist in SQL - sql

My data is structured with 5 columns (let's call them a, b, c, d, e) and 1,000,000+ rows. Each value in b has the potential for ~50 possibilities in e - so there could be up to 50 lines for each unique b value. Every b should have a '-27' among their e values. I would like to query all UNIQUE b where it doesn't have the -27 e value, ignoring all other possibilities for e.
Code so far:
select a, b, c, d, e
from TestDB
where not exists (select count(distinct b) from TestDB where e = '-27')
Would this code be sufficient? In initial tests I've done it appears to be either a) working or b) returning nothing. I'm new to SQL so I appreciate any help or being pointed in the right direction!
**edited to make it clearer I was looking for unique 'b' values.

I would like to query all b where it doesn't have the -27 e value, ignoring all other possibilities for e.
If you just want the b values, then aggregation should work:
select b
from testdb t
group by b
having sum(case when e = '-27' then 1 else 0 end) = 0;
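To try the aggregation approach on a small scale, here is a minimal sketch using Python's built-in sqlite3 module. The table contents are made up for illustration, and SQLite stands in for whatever database the question actually uses.

```python
import sqlite3

# Hypothetical miniature of TestDB; the rows are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TestDB (a INTEGER, b TEXT, c TEXT, d TEXT, e TEXT)")
conn.executemany(
    "INSERT INTO TestDB VALUES (?, ?, ?, ?, ?)",
    [
        (1, "x", "c1", "d1", "-27"),
        (2, "x", "c2", "d2", "10"),
        (3, "y", "c3", "d3", "10"),
        (4, "y", "c4", "d4", "11"),
        (5, "z", "c5", "d5", "-27"),
    ],
)

# One output row per b; keep only groups in which no row has e = '-27'.
missing_b = [
    row[0]
    for row in conn.execute(
        """
        SELECT b
        FROM TestDB
        GROUP BY b
        HAVING SUM(CASE WHEN e = '-27' THEN 1 ELSE 0 END) = 0
        ORDER BY b
        """
    )
]
print(missing_b)  # ['y']
```

Only y is reported, because x and z each have at least one row with e = '-27'.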

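You need a correlated subquery with NOT EXISTS, i.e. a reference to the main query table. Your subquery is an aggregate without a GROUP BY, so it always returns exactly one row; NOT EXISTS is therefore always false and the query returns nothing.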
select a, b, c, d, e
from TestDB t1
where not exists (select 1 from TestDB t2 where t2.e = '-27' and t1.b = t2.b)
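The difference between the question's subquery and the correlated one can be checked with a small sketch. It uses Python's sqlite3 module and invented rows; the behaviour shown is SQLite's, which should match other engines here since it follows from standard aggregate semantics.

```python
import sqlite3

# Hypothetical sample data: only b = 'y' lacks an e = '-27' row.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TestDB (a INTEGER, b TEXT, c TEXT, d TEXT, e TEXT)")
conn.executemany(
    "INSERT INTO TestDB VALUES (?, ?, ?, ?, ?)",
    [
        (1, "x", "c1", "d1", "-27"),
        (2, "x", "c2", "d2", "10"),
        (3, "y", "c3", "d3", "10"),
        (4, "y", "c4", "d4", "11"),
    ],
)

# Uncorrelated aggregate subquery: COUNT always yields one row,
# so NOT EXISTS is false for every outer row.
uncorrelated = conn.execute(
    """
    SELECT a, b, c, d, e
    FROM TestDB
    WHERE NOT EXISTS (SELECT COUNT(DISTINCT b) FROM TestDB WHERE e = '-27')
    """
).fetchall()

# Correlated subquery: evaluated per outer row via t1.b = t2.b.
correlated = conn.execute(
    """
    SELECT a, b, c, d, e
    FROM TestDB t1
    WHERE NOT EXISTS (
        SELECT 1 FROM TestDB t2 WHERE t2.e = '-27' AND t1.b = t2.b
    )
    ORDER BY a
    """
).fetchall()

print(uncorrelated)  # []
print(correlated)    # [(3, 'y', 'c3', 'd3', '10'), (4, 'y', 'c4', 'd4', '11')]
```

If only the distinct b values are wanted, selecting DISTINCT b in the outer query of the correlated version would be one way to get them.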

This query:
select b from TestDB where e = '-27'
returns all the bs that you want filtered out.
Use it with the operator NOT IN:
select a, b, c, d, e
from TestDB
where b not in (select b from TestDB where e = '-27')
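One thing worth checking before relying on NOT IN: if the subquery can return a NULL, NOT IN matches no rows at all. The sketch below assumes b is nullable, which the question does not say, and uses Python's sqlite3 module with invented rows to show the effect and one possible guard.

```python
import sqlite3

# Hypothetical data; the third row has a NULL b together with e = '-27'.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TestDB (a INTEGER, b TEXT, c TEXT, d TEXT, e TEXT)")
conn.executemany(
    "INSERT INTO TestDB VALUES (?, ?, ?, ?, ?)",
    [
        (1, "x", "c1", "d1", "-27"),
        (2, "y", "c2", "d2", "10"),
        (3, None, "c3", "d3", "-27"),
    ],
)

# The subquery returns ('x', NULL); 'y' NOT IN ('x', NULL) evaluates to NULL,
# so the row for y is filtered out as well.
plain = conn.execute(
    """
    SELECT b FROM TestDB
    WHERE b NOT IN (SELECT b FROM TestDB WHERE e = '-27')
    """
).fetchall()

# Guarded version: NULLs are excluded from the subquery result.
guarded = conn.execute(
    """
    SELECT b FROM TestDB
    WHERE b NOT IN (
        SELECT b FROM TestDB WHERE e = '-27' AND b IS NOT NULL
    )
    """
).fetchall()

print(plain)    # []
print(guarded)  # [('y',)]
```

If b is declared NOT NULL, the plain query is fine as written.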

Related

union all two table instead of join

I have several tables which I cannot join, as the query gets really complicated and BigQuery is not able to process it. So I am trying to union all the tables and then group by, and I have run into an issue. I have two tables called t1 and t2 with the headers below; they don't have null values:
t1: a, b, c, d
t2: a, b, c, e
so in order to union all and group them I have below code:
WITH
all_tables_unioned AS (
SELECT
*,
NULL e
FROM
`t1`
UNION ALL
SELECT
*,
NULL d
FROM
`t2` )
SELECT
a,
b,
c,
MAX(d) AS d,
MAX(e) AS e
FROM
all_tables_unioned
GROUP BY
a,
b,
c
Unfortunately, when I run this I get a table with columns a, b, c, d, e in which the e column is all null!
I ran a query against each table before the union all to make sure the values are not null. I do not really know what is wrong with my query.
union all does not go by column names. Just list all the columns explicitly:
WITH all_tables_unioned AS (
SELECT a, b, c, d, NULL as e
FROM `t1`
UNION ALL
SELECT a, b, c, NULL as d, e
FROM `t2`
)
Regardless of the names you assign, the union all uses positions for matching columns.
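The positional matching can be reproduced with a small sketch. It uses Python's sqlite3 module rather than BigQuery, with one invented row per table; the positional rule for UNION ALL is the same in both.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t1 (a INTEGER, b INTEGER, c INTEGER, d TEXT)")
conn.execute("CREATE TABLE t2 (a INTEGER, b INTEGER, c INTEGER, e TEXT)")
conn.execute("INSERT INTO t1 VALUES (1, 2, 3, 'd1')")
conn.execute("INSERT INTO t2 VALUES (1, 2, 3, 'e1')")

# Question's version: t2.e sits in the 4th position, so it lands in column d,
# and the 5th column is NULL for both branches.
by_star = conn.execute(
    """
    SELECT a, b, c, MAX(d) AS d, MAX(e) AS e
    FROM (
        SELECT *, NULL AS e FROM t1
        UNION ALL
        SELECT *, NULL AS d FROM t2
    ) AS u
    GROUP BY a, b, c
    """
).fetchall()

# Explicit column lists keep d and e in matching positions.
explicit = conn.execute(
    """
    SELECT a, b, c, MAX(d) AS d, MAX(e) AS e
    FROM (
        SELECT a, b, c, d, NULL AS e FROM t1
        UNION ALL
        SELECT a, b, c, NULL AS d, e FROM t2
    ) AS u
    GROUP BY a, b, c
    """
).fetchall()

print(by_star)   # [(1, 2, 3, 'e1', None)]
print(explicit)  # [(1, 2, 3, 'd1', 'e1')]
```

In the first result the value 'e1' shows up under d, which is why the e column looked empty.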

using if condition in snowflake

I have two tables as shown below with columns:
Table A
a, b, c, id
Table B
d, e, f, g, h, id
Now I need to perform a query: I will get an id from the user, and I need to check whether that id is present in table A or table B. The record will be present in only one of the two tables.
SELECT * FROM tableA WHERE id = 123
OR
SELECT * FROM tableB WHERE id = 123
So the response will be either the columns of tableA or the columns of tableB. But I can't perform a union, since that requires the same number of columns in both tables.
So it's basically an if condition. How can I get the desired output in Snowflake?
And is using if the best-optimized approach, or is there another way?
You can use union all -- assuming the types are compatible. Just pad the smaller table:
select a, b, c, null as g, null as h, id
from a
where id = 123
union all
select d, e, f, g, h, id
from b
where id = 123;
If you want the columns separated, then a full join accomplishes that:
select *
from a full join
b
using (id)
where a.id = 123 or b.id = 123;
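The padded union all approach above can be tried out locally with the sketch below. It runs on SQLite through Python's sqlite3 module, not on Snowflake, and the helper name find_by_id, the table names tableA and tableB, and the sample rows are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tableA (a TEXT, b TEXT, c TEXT, id INTEGER)")
conn.execute(
    "CREATE TABLE tableB (d TEXT, e TEXT, f TEXT, g TEXT, h TEXT, id INTEGER)"
)
conn.execute("INSERT INTO tableA VALUES ('a1', 'b1', 'c1', 1)")
conn.execute("INSERT INTO tableB VALUES ('d1', 'e1', 'f1', 'g1', 'h1', 123)")


def find_by_id(connection, record_id):
    """Return the matching row from either table, padded to six columns."""
    return connection.execute(
        """
        SELECT a, b, c, NULL AS g, NULL AS h, id FROM tableA WHERE id = ?
        UNION ALL
        SELECT d, e, f, g, h, id FROM tableB WHERE id = ?
        """,
        (record_id, record_id),
    ).fetchall()


in_b = find_by_id(conn, 123)
in_a = find_by_id(conn, 1)
print(in_b)  # [('d1', 'e1', 'f1', 'g1', 'h1', 123)]
print(in_a)  # [('a1', 'b1', 'c1', None, None, 1)]
```

Since the id lives in only one table, one branch of the union returns nothing and the other supplies the row; no if logic is needed.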

Syntax error for WHERE clause when using proc sql

I am new to SAS and I need to recreate a query I had running in R.
The syntax rules may be different in SAS, but I don't see where I am going wrong here.
Table "Old" columns: A, B, C, D, E
Table "New" columns: A, B, C, D, E
PROC SQL;
create table delta as
SELECT *
FROM New
WHERE
(A, B, C)
IN(
SELECT (A, B, C)
FROM New
EXCEPT
SELECT A, B, C
FROM Old);
QUIT;
My code should find delta rows based on A, B, C variables.
Error Message on comma
WHERE(A, B, C): ERROR 79-322: Expecting a (.
I don't use SAS, but it could be that this database doesn't allow tuples in a WHERE ... IN clause.
In that case you could try refactoring your query as an inner join:
SELECT *
FROM New N
INNER JOIN (
SELECT A, B, C
FROM New
EXCEPT
SELECT A, B, C
FROM Old
) T ON T.A = N.A
AND T.B = N.B
AND T.C = N.C
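The join rewrite can be sanity-checked outside SAS. The sketch below runs the same shape of query on SQLite via Python's sqlite3 module; the sample rows are invented, and whether PROC SQL accepts the query unchanged is not verified here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Table names are quoted to keep the question's names as plain identifiers.
conn.execute('CREATE TABLE "Old" (A INTEGER, B INTEGER, C INTEGER, D TEXT, E TEXT)')
conn.execute('CREATE TABLE "New" (A INTEGER, B INTEGER, C INTEGER, D TEXT, E TEXT)')
conn.executemany(
    'INSERT INTO "Old" VALUES (?, ?, ?, ?, ?)',
    [(1, 1, 1, "d", "e"), (2, 2, 2, "d", "e")],
)
conn.executemany(
    'INSERT INTO "New" VALUES (?, ?, ?, ?, ?)',
    [(1, 1, 1, "d", "e2"), (2, 2, 2, "d", "e"), (3, 3, 3, "d", "e")],
)

# The derived table T holds the (A, B, C) keys present in New but not in Old;
# joining back to New recovers the full rows for those keys.
delta = conn.execute(
    """
    SELECT N.*
    FROM "New" N
    INNER JOIN (
        SELECT A, B, C FROM "New"
        EXCEPT
        SELECT A, B, C FROM "Old"
    ) T ON T.A = N.A AND T.B = N.B AND T.C = N.C
    """
).fetchall()

print(delta)  # [(3, 3, 3, 'd', 'e')]
```

The row (1, 1, 1) is not reported even though its E value changed, because the comparison only looks at A, B and C.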

Redshift Join VS. Union with Group By

Let's say I would like to pull the fields dim, a, b, c, d from 2 tables, one of which contains a, b and the other c, d.
I'm wondering if there's a preferred way (between the following) to do it, performance-wise:
1:
select t1.dim,a,b,c,d
from
(select dim,sum(a) as a,sum(b)as b from t1 group by dim)t1
join
(select dim,sum(c) as c,sum(d) as d from t2 group by dim)t2
on t1.dim=t2.dim;
2:
select dim,sum(a) as a,sum(b) as b,sum(c) as c,sum(d) as d
from
(
select dim,a,b,null as c, null as d from t1
union
select dim,null as a, null as b, c, d from t2
)a
group by dim
This is for a large amount of data, of course (5-30M records in the final query).
Thanks!
The first method filters out any dim values that are not in both tables. union is inefficient because it has to remove duplicates. So, neither is appealing.
I would go for:
select dim, sum(a) as a, sum(b) as b, sum(c) as c, sum(d) as d
from (select dim, a, b, null as c, null as d from t1
union all
select dim, null as a, null as b, c, d from t2
) a
group by dim;
You could also pre-aggregate the values in each subquery. Or use full outer join for the first method.
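To see how the three variants differ in their results (not their speed), here is a small sketch on SQLite via Python's sqlite3 module instead of Redshift. The rows are invented: t1 contains a duplicate row, and dim 'y' exists only in t2.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t1 (dim TEXT, a INTEGER, b INTEGER)")
conn.execute("CREATE TABLE t2 (dim TEXT, c INTEGER, d INTEGER)")
conn.executemany("INSERT INTO t1 VALUES (?, ?, ?)", [("x", 1, 1), ("x", 1, 1)])
conn.executemany("INSERT INTO t2 VALUES (?, ?, ?)", [("x", 5, 5), ("y", 2, 2)])

TEMPLATE = """
    SELECT dim, SUM(a) AS a, SUM(b) AS b, SUM(c) AS c, SUM(d) AS d
    FROM (
        SELECT dim, a, b, NULL AS c, NULL AS d FROM t1
        {op}
        SELECT dim, NULL AS a, NULL AS b, c, d FROM t2
    ) AS u
    GROUP BY dim
    ORDER BY dim
"""

# UNION ALL keeps every row, so the duplicate t1 row is counted twice.
with_union_all = conn.execute(TEMPLATE.format(op="UNION ALL")).fetchall()
# UNION removes duplicates first, so the two identical t1 rows collapse.
with_union = conn.execute(TEMPLATE.format(op="UNION")).fetchall()
# Method 1: the inner join drops dim 'y', which is missing from t1.
with_join = conn.execute(
    """
    SELECT t1.dim, a, b, c, d
    FROM (SELECT dim, SUM(a) AS a, SUM(b) AS b FROM t1 GROUP BY dim) t1
    JOIN (SELECT dim, SUM(c) AS c, SUM(d) AS d FROM t2 GROUP BY dim) t2
      ON t1.dim = t2.dim
    """
).fetchall()

print(with_union_all)  # [('x', 2, 2, 5, 5), ('y', None, None, 2, 2)]
print(with_union)      # [('x', 1, 1, 5, 5), ('y', None, None, 2, 2)]
print(with_join)       # [('x', 2, 2, 5, 5)]
```

So besides the cost of deduplication, union can change the sums when a table holds identical rows. Actual performance on Redshift would have to be measured there.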

select a field from the minus query

select a, b, c from tab1
minus
select d, e, f from tab2
Above is how my query looks. How do I reformat my query to display a, b, c and f?
I tried the below but I keep getting invalid identifier.
select t.a, t.b, t.c, t.f from
(select a, b, c from tab1
minus
select d, e, f from tab2)t
thanks
First, recall that the minus operator takes the results of the first query (i.e. rows with columns a, b, c from tab1) and removes every row for which tab2 contains a row whose d, e, f values match its a, b, c values.
Now it should be clear that what you are trying to do does not make sense: all rows from the second select, i.e.
select d, e, f from tab2
are excluded from the rows returned by the first select. In other words, there would not be a single row in the result with a value that came from column f (or for that matter, from any of the columns of tab2).
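This can be checked with a small sketch. It uses Python's sqlite3 module, where the operator is spelled EXCEPT rather than Oracle's MINUS, and invented rows. The second query is only one possible way to bring f in: it swaps in a LEFT JOIN and COALESCE, which is an assumption about what the asker wants.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tab1 (a INTEGER, b INTEGER, c TEXT)")
conn.execute("CREATE TABLE tab2 (d INTEGER, e INTEGER, f TEXT)")
conn.executemany("INSERT INTO tab1 VALUES (?, ?, ?)", [(1, 1, "p"), (2, 2, "q")])
conn.executemany("INSERT INTO tab2 VALUES (?, ?, ?)", [(1, 1, "p"), (2, 2, "z")])

# EXCEPT (MINUS in Oracle): rows of the first select with no identical row
# in the second. The output columns are those of the first select only.
cursor = conn.execute(
    "SELECT a, b, c FROM tab1 EXCEPT SELECT d, e, f FROM tab2"
)
except_columns = [col[0] for col in cursor.description]
except_rows = cursor.fetchall()

# To show f next to a, b, c, tab2 has to be joined in explicitly.
joined_rows = conn.execute(
    """
    SELECT tab1.a, tab1.b, tab1.c, tab2.f
    FROM tab1
    LEFT JOIN tab2 ON tab1.a = tab2.d AND tab1.b = tab2.e
    WHERE COALESCE(tab1.c, '') <> COALESCE(tab2.f, '')
    """
).fetchall()

print(except_columns)  # ['a', 'b', 'c']
print(except_rows)     # [(2, 2, 'q')]
print(joined_rows)     # [(2, 2, 'q', 'z')]
```

There is no f column in the EXCEPT result, which is why t.f raises an invalid identifier error in the question's query.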
The following query will display rows that are present in both tab1 and tab2, but with different values for tab1.c and tab2.f. If you want to display rows that are present in tab1 whose a and b values are not present in tab2 at all, then the WHERE clause can be modified accordingly.
select tab1.a, tab1.b, tab1.c, tab2.f
from tab1 FULL OUTER JOIN tab2
ON tab1.a = tab2.d AND tab1.b = tab2.e
WHERE
-- (tab1.a IS NOT NULL AND tab1.b IS NOT NULL) AND
-- (tab1.c IS NULL OR tab2.f IS NULL) OR
(NVL(tab1.c, 0) <> NVL(tab2.f, 0));