Snowflake, SQL where clause - sql

I need to write a query with a where clause:
where
pl.ods_site_id in (select id from table1 where ...)
But if the subquery on table1 returns no rows, the where condition should not be applied at all (as if it evaluated to TRUE).
How can I do this? (I am using the Snowflake SQL dialect.)

You could include a second condition:
where pl.ods_site_id in (select id from table1 where ...) or
not exists (select id from table1 where ...)
This explicitly checks for the subquery returning no rows.
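Here is a minimal sketch of that pattern that can be run locally. It assumes SQLite (through Python's sqlite3 module) as a stand-in for Snowflake, and the helper name filter_sites and the sample values are made up for illustration. The inner "where ..." filter on table1 is left out.

```python
import sqlite3

def filter_sites(site_ids, table1_ids):
    """Return the ods_site_id values kept by the IN ... OR NOT EXISTS filter."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE pl (ods_site_id INTEGER)")
    conn.execute("CREATE TABLE table1 (id INTEGER)")
    conn.executemany("INSERT INTO pl VALUES (?)", [(s,) for s in site_ids])
    conn.executemany("INSERT INTO table1 VALUES (?)", [(i,) for i in table1_ids])
    rows = conn.execute(
        """
        SELECT ods_site_id
        FROM pl
        WHERE ods_site_id IN (SELECT id FROM table1)
           OR NOT EXISTS (SELECT 1 FROM table1)
        ORDER BY ods_site_id
        """
    ).fetchall()
    conn.close()
    return [r[0] for r in rows]

print(filter_sites([1, 5], [5]))  # table1 has rows: only matching sites are kept
print(filter_sites([1, 5], []))   # table1 is empty: the filter is effectively skipped
```

Note that when table1 has rows but none of them match, this pattern returns nothing, because the NOT EXISTS branch only fires for a completely empty subquery.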

If you are willing to use a join instead, Snowflake supports the QUALIFY clause, which might come in handy here. You can run this on Snowflake to see how it works.
with
pl (ods_site_id) as (select 1 union all select 5),
table1 (id) as (select 5) --change this to 7 to test if it returns ALL on no match
select a.*
from pl a
left join table1 b on a.ods_site_id = b.id -- and other conditions you want to add
qualify b.id = a.ods_site_id --either match the join condition
or count(b.id) over () = 0; --or make sure there is 0 match from table1
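For engines without QUALIFY, the same logic can be sketched with a derived table and a filter on the window count. This is only an illustration under assumptions: it uses SQLite (3.25 or later, for window functions) through Python's sqlite3 module, and the helper name qualify_like and the column aliases matched_id and total_matches are hypothetical.

```python
import sqlite3

def qualify_like(site_ids, table1_ids):
    """Emulate the QUALIFY filter with a derived table and a window count."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE pl (ods_site_id INTEGER)")
    conn.execute("CREATE TABLE table1 (id INTEGER)")
    conn.executemany("INSERT INTO pl VALUES (?)", [(s,) for s in site_ids])
    conn.executemany("INSERT INTO table1 VALUES (?)", [(i,) for i in table1_ids])
    rows = conn.execute(
        """
        SELECT ods_site_id
        FROM (
            SELECT a.ods_site_id,
                   b.id AS matched_id,
                   COUNT(b.id) OVER () AS total_matches  -- matches across all rows
            FROM pl a
            LEFT JOIN table1 b ON a.ods_site_id = b.id
        ) j
        WHERE matched_id = ods_site_id  -- either the row matched the join
           OR total_matches = 0         -- or nothing matched at all
        ORDER BY ods_site_id
        """
    ).fetchall()
    conn.close()
    return [r[0] for r in rows]

print(qualify_like([1, 5], [5]))  # one match: only that row
print(qualify_like([1, 5], [7]))  # no match: all rows
```

This version returns all rows whenever no row matches, which covers both an empty table1 and a table1 whose ids simply do not appear in pl.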

sql - ignore duplicates while joining

I have two tables.
Table1 is 1591 rows. Table2 is 270 rows.
I want to fetch a specific column from Table2 based on a condition between the two tables, while ignoring duplicates in Table2. That is, I want to join the tables but get only one value from Table2 even if the condition matches more than once. The result should have exactly 1591 rows.
I tried left, right, and inner joins, but the result has either more or fewer than 1591 rows.
Example
Table1
type,address,name
40,blabla,Adam
20,blablabla,Joe
Table2
type,currency
40,usd
40,gbp
40,omr
Joining on 'type'
Result
type,address,name,currency
40,blabla,Adam,usd
20,blablabla,Joe,null
Try this. It uses a left join so that rows of Table1 without a match in Table2 are kept:
select *
from
Table1 h
left join
(select type,currency,ROW_NUMBER()over (partition by type order by
currency) as rn
from
Table2
) sr on
sr.type=h.type
and rn=1
Try this. Apart from the bracket quoting of [type], which is SQL Server syntax, it is standard SQL, so it should work on most RDBMSs.
select * from Table1 AS t
LEFT OUTER JOIN Table2 AS y ON t.[type] = y.[type] and y.currency = (SELECT MAX(z.currency) FROM Table2 AS z WHERE z.[type] = y.[type])
If you want to control which currency is joined, consider altering Table2 by adding a new column active/non active and modifying accordingly the JOIN clause.
You can use outer apply if your database supports it (SQL Server does, for example).
select a.type, a.address, a.name, b.currency
from Table1 a
outer apply (
select top 1 currency
from Table2
where Table2.type = a.type
) b
A typical way to do this uses a correlated subquery. This guarantees that all rows in the first table are kept. A scalar subquery generates an error if it returns more than one row, which is why the one below is limited to a single row.
So:
select t1.*,
(select t2.currency
from table2 t2
where t2.type = t1.type
fetch first 1 row only
) as currency
from table1 t1;
You don't specify what database you are using, so this uses standard syntax for returning one row. Some databases use limit or top instead.
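As a quick check of the correlated-subquery approach, here is a small sketch against the sample data from the question. It assumes SQLite through Python's sqlite3 module, which uses LIMIT 1 in place of fetch first 1 row only. The ORDER BY inside the subquery is an addition to make the pick deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Table1 (type INTEGER, address TEXT, name TEXT);
    CREATE TABLE Table2 (type INTEGER, currency TEXT);
    INSERT INTO Table1 VALUES (40, 'blabla', 'Adam'), (20, 'blablabla', 'Joe');
    INSERT INTO Table2 VALUES (40, 'usd'), (40, 'gbp'), (40, 'omr');
    """
)

rows = conn.execute(
    """
    SELECT t1.type, t1.address, t1.name,
           (SELECT t2.currency
            FROM Table2 t2
            WHERE t2.type = t1.type
            ORDER BY t2.currency   -- decides which duplicate wins
            LIMIT 1) AS currency
    FROM Table1 t1
    ORDER BY t1.type DESC
    """
).fetchall()

for row in rows:
    print(row)
```

Every row of Table1 appears exactly once. With this ordering the currency picked for type 40 is the alphabetically first one (gbp), so change the ORDER BY if a different one should win.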

Standard SQL: LEFT JOIN by two conditions using BETWEEN

I have the following query in BigQuery:
#Standard SQL
SELECT *
FROM `Table_1`
LEFT JOIN `Table_2` ON (timestamp BETWEEN TimeStampStart AND TimeStampEnd)
But I get the following Error:
Error: LEFT OUTER JOIN cannot be used without a condition that is an equality of fields from both sides of the join.
If I use JOIN instead of LEFT JOIN, it works, but I want to keep all the rows from Table_1 (so also the ones which aren't matched to Table_2)
How to achieve this?
This is absolutely stupid... but the same query will work if you add a condition that matches a column from table1 with a column from table2:
WITH Table_1 AS (
SELECT CAST('2018-08-15' AS DATE) AS Timestamp, 'Foo' AS Foo
UNION ALL
SELECT CAST('2018-09-15' AS DATE), 'Foo'
), Table_2 AS (
SELECT CAST('2018-08-14' AS DATE) AS TimeStampStart, CAST('2018-08-16' AS DATE) AS TimeStampEnd, 'Foo' AS Bar
)
SELECT *
FROM Table_1
LEFT JOIN Table_2 ON Table_1.Foo = Table_2.Bar AND Table_1.Timestamp BETWEEN Table_2.TimeStampStart AND Table_2.TimeStampEnd
See if you have additional matching criteria that you can use (like another column that links table1 and table2 on equality).
A LEFT JOIN is always equivalent to the UNION of:
the INNER JOIN between the same two arguments on the same join predicate, and
the set of rows from the first argument for which no matching row is found (and properly extended with null values for all columns retained from the second argument)
That latter portion can be written as
SELECT T1.*, null as T2_C1, null as T2_C2, ...
FROM T1
WHERE NOT EXISTS (SELECT * FROM T2 WHERE /* the join predicate */)
So if you spell out the UNION you should be able to get there.
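To make the decomposition concrete, here is a sketch that compares a range-based LEFT JOIN with its spelled-out form. It assumes SQLite through Python's sqlite3 module (which accepts the non-equality LEFT JOIN directly), and the table and column names T1, T2, ts, ts_start, and ts_end are made up. UNION ALL is used rather than UNION so that duplicate rows are not collapsed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE T1 (ts TEXT);
    CREATE TABLE T2 (ts_start TEXT, ts_end TEXT);
    INSERT INTO T1 VALUES ('2018-08-15'), ('2018-09-15');
    INSERT INTO T2 VALUES ('2018-08-14', '2018-08-16');
    """
)

# Reference result: a plain LEFT JOIN on a BETWEEN condition.
left_join = conn.execute(
    """
    SELECT T1.ts, T2.ts_start, T2.ts_end
    FROM T1
    LEFT JOIN T2 ON T1.ts BETWEEN T2.ts_start AND T2.ts_end
    ORDER BY 1
    """
).fetchall()

# Same result as INNER JOIN plus the unmatched T1 rows padded with NULLs.
union_form = conn.execute(
    """
    SELECT T1.ts, T2.ts_start, T2.ts_end
    FROM T1
    JOIN T2 ON T1.ts BETWEEN T2.ts_start AND T2.ts_end
    UNION ALL
    SELECT T1.ts, NULL, NULL
    FROM T1
    WHERE NOT EXISTS (
        SELECT 1 FROM T2 WHERE T1.ts BETWEEN T2.ts_start AND T2.ts_end
    )
    ORDER BY 1
    """
).fetchall()

print(left_join)
print(union_form)
```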
Interesting. This works for me in standard SQL:
select *
from (select 1 as x) a left join
(select 2 as a, 3 as b) b
on a.x between b.a and b.b
I suspect you are using legacy SQL. If so, switch to standard SQL. (And drop the parentheses around the between condition.)
The problem is:
#(Standard SQL)#
This doesn't do anything. Use:
#StandardSQL
According to the documentation, "(" has a special meaning, so try the query without the parentheses:
SELECT * FROM Table_1
LEFT JOIN Table_2 ON Table_1.timestamp >= Table_2.TimeStampStart AND Table_1.timestamp <= Table_2.TimeStampEnd

Filter by count from another table

This query works fine only without WHERE, otherwise there is an error:
column "cnt" does not exist
SELECT
*,
(SELECT count(*)
FROM B
WHERE A.id = B.id) AS cnt
FROM A
WHERE cnt > 0
Use a subquery:
SELECT a.*
FROM (SELECT A.*,
(SELECT count(*)
FROM B
WHERE A.id = B.id
) AS cnt
FROM A
) a
WHERE cnt > 0;
Column aliases defined in the SELECT cannot be used by the WHERE (or other clauses) for that SELECT.
Or, if the id on a is unique, you can more simply do:
SELECT a.*, COUNT(B.id)
FROM A LEFT JOIN
B
ON A.id = B.id
GROUP BY A.id
HAVING COUNT(B.id) > 0;
Or, if you don't really need the count, then:
select a.*
from a
where exists (select 1 from b where b.id = a.id);
Assumptions:
You need all columns from A in the result, plus the count from B. That's what your demonstrated query does.
You only want rows with cnt > 0. That's what started your question after all.
Most or all B.id exist in A. That's the typical case and certainly true if a FK constraint on B.id references A.id.
Solution
Faster, shorter, correct:
SELECT * -- !
FROM (SELECT id, count(*) AS cnt FROM B GROUP BY id) B
JOIN A USING (id) -- !
-- WHERE cnt > 0 -- this predicate is implicit now!
Major points
Aggregate before the join, that's typically (substantially) faster when processing the whole table or major parts of it. It also defends against problems if you join to more than one n-table. See:
Aggregate functions on multiple joined tables
You don't need to add the predicate WHERE cnt > 0 any more, that's implicit with the [INNER] JOIN.
You can simply write SELECT *, since the join only adds the column cnt to A.* when done with the USING clause - only one instance of the joining column(s) (id in the example) is added to the out columns. See:
How to drop one join key when joining two tables
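The aggregate-before-join idea can be tried out with a tiny example. This sketch assumes SQLite through Python's sqlite3 module rather than Postgres, and the name column and sample rows are invented for illustration. The columns are listed explicitly instead of using SELECT * so the output order is unambiguous.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE A (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE B (id INTEGER);
    INSERT INTO A VALUES (1, 'a'), (2, 'b'), (3, 'c');
    INSERT INTO B VALUES (1), (1), (3);
    """
)

rows = conn.execute(
    """
    SELECT id, A.name, cnt
    FROM (SELECT id, count(*) AS cnt FROM B GROUP BY id) bc  -- aggregate first
    JOIN A USING (id)                                        -- then join
    ORDER BY id
    """
).fetchall()

print(rows)
```

The row with id 2 has no counterpart in B, so the inner join drops it without any explicit cnt > 0 predicate.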
Your added question in the comment
postgres really allows to have outside aggregate function attributes that are not behind group by?
That's only true if the PK column(s) is listed in the GROUP BY clause - which covers the whole row. Not the case for a UNIQUE or EXCLUSION constraint. See:
Return a grouped list with occurrences using Rails and PostgreSQL
SQL Fiddle demo (extended version of Gordon's demo).

Sum multiple columns using a subquery

I'm trying to play with Oracle's DB.
I'm trying to sum two columns from the same row and output a total on the fly.
However, I can't seem to get it to work. Here's the code I have so far.
SELECT a.name , SUM(b.sequence + b.length) as total
FROM (
SELECT a.name, a.sequence, b.length
FROM tbl1 a, tbl2 b
WHERE b.sequence = a.sequence
AND a.loc <> -1
AND a.id='10201'
ORDER BY a.location
)
The inner query works, but I can't seem to make the new query and the subquery work together.
Here's a sample table I'm using:
...[name][sequence][length]...
...['aa']['100000']['2000']...
...
...['za']['200000']['3001']...
And here's the output I'd like:
[name][ total ]
['aa']['102000']
...
['za']['203001']
Help much appreciated, thanks!
SUM() sums numbers across rows. Replace it with sequence + length.
...or if there is the possibility of NULL values occurring in either the sequence or length columns, use: COALESCE(sequence, 0) + COALESCE(length, 0).
Or, if your intention was indeed to produce a total per name (i.e. aggregating the sum of all the sequences and lengths for each name), add a GROUP BY a.name after the end of the subquery.
BTW: you shouldn't be referencing the internal aliases used inside a subquery from outside of that subquery. Some DB servers allow it (and I don't have convenient access to an Oracle server right now, so I can't test it), but it's not really good practice.
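Here is a small sketch of the difference between the plain row-level sum and the COALESCE version. It assumes SQLite through Python's sqlite3 module instead of Oracle, puts both columns in a single hypothetical table t to keep things short, and adds a made-up row 'nb' with a NULL length.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE t (name TEXT, sequence INTEGER, length INTEGER);
    INSERT INTO t VALUES ('aa', 100000, 2000), ('za', 200000, 3001), ('nb', 5, NULL);
    """
)

# Row-level addition: any NULL operand makes the whole total NULL.
plain = conn.execute(
    "SELECT name, sequence + length AS total FROM t ORDER BY name"
).fetchall()

# COALESCE substitutes 0 for NULL before adding.
safe = conn.execute(
    """
    SELECT name, COALESCE(sequence, 0) + COALESCE(length, 0) AS total
    FROM t
    ORDER BY name
    """
).fetchall()

print(plain)
print(safe)
```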
I think what you are after is something like:
SELECT a.name,
SUM(B.sequence + B.length) AS total
FROM Tbl1 A
INNER JOIN Tbl2 B
ON B.sequence = A.sequence
WHERE A.loc <> -1
AND A.id = 10201
GROUP BY a.name
ORDER BY MIN(A.location)
Your query with the subquery fails for several reasons:
You use the table alias a, but it is not defined.
You use the table alias b, but it is not defined.
You have a sum() in the select clause with unaggregated columns, but no group by.
In addition, you have an order by in the subquery which is allowed syntactically, but ignored.
Here is a better way to write the query without a subquery:
SELECT t1.name, (t1.sequence + t2.length) as total
FROM tbl1 t1 join
tbl2 t2
on t1.sequence = t2.sequence
where t1.loc <> -1 AND t1.id = '10201'
ORDER BY t1.location;
Note the use of proper join syntax, the use of aliases that make sense, and the simple calculation at this level.
Here is a version with a subquery:
select name, (sequence + length) as total
from (SELECT t1.name, t1.sequence, t2.length, t1.location
FROM tbl1 t1 join
tbl2 t2
on t1.sequence = t2.sequence
where t1.loc <> -1 AND t1.id = '10201'
) t
ORDER BY location;
Note that the order by is going at the outer level. And, I gave the subquery an alias. This is not strictly required, but typically a good idea.

Outer Join with Where returning Nulls

Hi, I have 2 tables. I want to list:
all records in table1 which are present in table2
all records in table2 which are not present in table1, with a where condition
Null rows will be returned for table1 in the second case, but I am unable to get the query working correctly. It only returns null rows.
SELECT
A.CLMSRNO,A.CLMPLANO,A.GENCURRCODE,A.CLMNETLOSSAMT,
A.CLMLOSSAMT,A.CLMCLAIMPRCLLOSSSHARE
FROM
PAKRE.CLMCLMENTRY A
RIGHT OUTER JOIN (
SELECT
B.CLMSRNO,B.UWADVICETYPE,B.UWADVICENO,B.UWADVPREMCURRCODE,
B.GENSUBBUSICLASS,B.UWADVICENET,B.UWADVICEKIND,B.UWADVYEAR,
B.UWADVQTR,B.ISMANUAL,B.UWCLMNOREFNO
FROM
PAKRE.UWADVICE B
WHERE
B.ISMANUAL=1
) r
ON a.CLMSRNO=r.CLMSRNO
ORDER BY
A.CLMSRNO DESC;
Which OS are you using?
Table aliases are case-sensitive on some platforms, which may be why your join condition ON a.CLMSRNO=r.CLMSRNO fails.
Try A.CLMSRNO=r.CLMSRNO and see if that works.
I'm not understanding your first attempt, but here's basically what you need, I think:
SELECT *
FROM TABLE1
INNER JOIN TABLE2
ON joincondition
UNION ALL
SELECT *
FROM TABLE2
LEFT JOIN TABLE1
ON joincondition
AND TABLE1.wherecondition
WHERE TABLE1.somejoincolumn IS NULL
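A runnable sketch of this shape, under several assumptions: it uses SQLite through Python's sqlite3 module, the tables are cut down to a clmsrno key plus an ismanual flag with invented values, and the where condition is applied to the unmatched table2 rows (the ISMANUAL = 1 filter from the question) rather than to table1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE table1 (clmsrno INTEGER);
    CREATE TABLE table2 (clmsrno INTEGER, ismanual INTEGER);
    INSERT INTO table1 VALUES (1), (2);
    INSERT INTO table2 VALUES (2, 1), (3, 1), (4, 0);
    """
)

rows = conn.execute(
    """
    -- Part 1: records present in both tables.
    SELECT t1.clmsrno, t2.clmsrno
    FROM table1 t1
    JOIN table2 t2 ON t1.clmsrno = t2.clmsrno
    UNION ALL
    -- Part 2: table2 records with no table1 match, filtered by the condition.
    SELECT t1.clmsrno, t2.clmsrno
    FROM table2 t2
    LEFT JOIN table1 t1 ON t1.clmsrno = t2.clmsrno
    WHERE t1.clmsrno IS NULL
      AND t2.ismanual = 1
    ORDER BY 2
    """
).fetchall()

print(rows)
```

The table1 side is NULL only for the unmatched table2 row, and the table2 row with ismanual = 0 is left out.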
I think you may want to remove the subquery and put its columns into the main query e.g.
SELECT A.CLMSRNO, A.CLMPLANO, A.GENCURRCODE, A.CLMNETLOSSAMT,
A.CLMLOSSAMT, A.CLMCLAIMPRCLLOSSSHARE,
B.CLMSRNO, B.UWADVICETYPE, B.UWADVICENO, B.UWADVPREMCURRCODE,
B.GENSUBBUSICLASS, B.UWADVICENET, B.UWADVICEKIND, B.UWADVYEAR,
B.UWADVQTR, B.ISMANUAL, B.UWCLMNOREFNO
FROM PAKRE.CLMCLMENTRY A
RIGHT OUTER JOIN PAKRE.UWADVICE B
ON A.CLMSRNO = B.CLMSRNO
WHERE B.ISMANUAL = 1
ORDER
BY A.CLMSRNO DESC;