Snowflake: Decimal or null input to function results in "Unsupported subquery type" - sql

Given the following function:
CREATE
OR REPLACE FUNCTION myfunction(a float, b float, c float)
RETURNS float AS
$$
select sum(1/(1+exp(-(series - c)/4)))
from (
select (a + ((row_number()) over(order by 0))*1) series
from table(generator(rowcount => 10000)) x
qualify series <= b
)
$$;
I get all the expected results when executing the following queries:
select
myfunction(1, 10, 1);
select
myfunction(1, 100, 1);
select
myfunction(1, 10, 1.1);
select
myfunction(0, 1, 89.87);
select
myfunction(0, 1, null);
However when I run the following query:
select
myfunction(a, b, c)
from
(
select
1 as a,
10 as b,
1.1 as c
union
select
0 as a,
1 as b,
null as c
);
I get an error:
"Unsupported subquery type cannot be evaluated".
While this query does work:
select
a, b, myfunction(a, b, c)
from
(
select
1 as a,
10 as b,
1 as c
union
select
1 as a,
100 as b,
1 as c
);
Why can't Snowflake handle null or decimal numbers in the 'c' column when I input multiple rows while individual rows weren't a problem?
And how can this function be rewritten to be able to handle these cases?

SQL UDFs are converted to subqueries (for now), and if Snowflake cannot determine the data type returned from these subqueries, you get the "Unsupported subquery" error. The issue is not about decimals or nulls. The issue is that the A and C arguments (which are used inside SUM()) contain different values across the input rows. For example, the following queries work:
select
myfunction(a, b, c )
from
(
select
1 as a,
1 as b,
1.1 as c
union
select
1 as a,
100 as b,
1.1 as c
);
select
myfunction(a, b, c )
from
(
select
1 as a,
1 as b,
null as c
union
select
1 as a,
100 as b,
null as c
);
You may hit these kinds of errors when you write complex functions as SQL UDFs. Sometimes rewriting them helps, but I don't see a way to do that for this one. As a workaround, you can rewrite it in JavaScript, because JS UDFs are not converted to subqueries:
CREATE
OR REPLACE FUNCTION myfunction(a float, b float, c float)
RETURNS float
language javascript AS
$$
var res = 0.0;
for (let series = A + 1; series <= B; series++) {
res += (1/(1+Math.exp(-(series - C)/4)));
}
return res;
$$;
According to my tests, the above UDF returns the same result as the SQL version, and it doesn't hit the "Unsupported subquery" error.

Weird one. Can you try selecting from the subquery and running it through a cast?
Like this:
select a, b, c
from
(select cast(a as float) as a, cast(b as float) as b, cast(c as float) as c from
(
select
1 as a,
10 as b,
1 as c
union
select
1 as a,
100 as b,
null as c
) as t) as x

In the end, implementing it as a Python function also allowed me to handle all the edge cases:
CREATE
OR REPLACE FUNCTION myfunction(a float, b float, c float)
returns float
language python
runtime_version=3.8
handler='compute'
as
$$
def compute(a, b, c):
    import math
    if b < a:
        return None
    if c is None:
        return None
    res = []
    step_size = 1
    it = a
    while it < b:
        res.append(it)
        it += step_size
    res = sum([1/(1+math.exp(-1*(i-c)/4)) for i in res])
    return res
$$;
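A quick way to check a handler like this is to run the same logic in plain Python, outside Snowflake. The sketch below is only an illustration: `sql_style` is a hypothetical re-statement of the SQL UDF from the question (series runs from a + 1 up to and including b, capped at 10000 generated rows), and `udf_style` copies the loop of the handler above. Comparing the two suggests they do not cover the same range: the handler above starts at a and stops before b, while the SQL version starts at a + 1 and includes b.

```python
import math


def logistic_term(series, c):
    # One term of the sum: 1 / (1 + exp(-(series - c) / 4))
    return 1 / (1 + math.exp(-(series - c) / 4))


def sql_style(a, b, c, rowcount=10000):
    # Hypothetical re-statement of the SQL UDF: series = a + row_number(),
    # kept while series <= b. SUM over no rows (or only NULLs) yields NULL.
    if a is None or b is None or c is None:
        return None
    terms = [logistic_term(a + k, c) for k in range(1, rowcount + 1) if a + k <= b]
    return sum(terms) if terms else None


def udf_style(a, b, c):
    # Same loop as the Python handler above: starts at a, stops before b.
    if b < a:
        return None
    if c is None:
        return None
    total = 0.0
    it = a
    while it < b:
        total += logistic_term(it, c)
        it += 1
    return total


# Compare both variants on one of the inputs from the question.
print(sql_style(1, 10, 1.1), udf_style(1, 10, 1.1))
```

If the goal is to match the SQL version exactly, starting the loop at `a + 1` and using `<=` for the upper bound (as the JavaScript UDF does) would be the corresponding change.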

Related

Is there something like Spark's unionByName in BigQuery?

I'd like to concatenate tables with different schemas, filling unknown values with null.
Simply using UNION ALL of course does not work like this:
WITH
x AS (SELECT 1 AS a, 2 AS b ),
y AS (SELECT 3 AS b, 4 AS c )
SELECT * FROM x
UNION ALL
SELECT * FROM y
a b
1 2
3 4
(unwanted result)
In Spark, I'd use unionByName to get the following result:
a b c
1 2
3 4
(wanted result)
Of course, I can manually create the needed query (adding NULLs) in BigQuery like so:
SELECT a, b, NULL c FROM x
UNION ALL
SELECT NULL a, b, c FROM y
But I'd prefer to have a generic solution, not requiring me to generate something like that.
So, is there something like unionByName in BigQuery? Or can one come up with a generic SQL function for this?
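For reference, the manual NULL-padding query above can be generated mechanically once the column names of each table are known. The following is a minimal Python sketch under that assumption; `union_by_name_sql` is a hypothetical helper written for this illustration, not a BigQuery feature, and it does no quoting or type handling.

```python
def union_by_name_sql(tables):
    """Build a UNION ALL query that pads missing columns with NULL.

    tables: dict mapping a table name to its list of column names
    (assumed to be known up front; this sketch does not query any schema).
    """
    # Collect every column name once, in first-seen order.
    all_cols = []
    for cols in tables.values():
        for col in cols:
            if col not in all_cols:
                all_cols.append(col)

    selects = []
    for name, cols in tables.items():
        exprs = [col if col in cols else f"NULL AS {col}" for col in all_cols]
        selects.append(f"SELECT {', '.join(exprs)} FROM {name}")
    return "\nUNION ALL\n".join(selects)


print(union_by_name_sql({"x": ["a", "b"], "y": ["b", "c"]}))
```

For the two sample tables this prints the same shape of query as the hand-written one: each SELECT lists a, b, c in one order, with NULL standing in for the column a table lacks.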
Consider the approach below (I think it is as generic as one can get):
create temp function json_extract_keys(input string) returns array<string> language js as """
return Object.keys(JSON.parse(input));
""";
create temp function json_extract_values(input string) returns array<string> language js as """
return Object.values(JSON.parse(input));
""";
create temp table temp_table as (
select json, key, value
from (
select to_json_string(t) json from table_x as t
union all
select to_json_string(t) from table_y as t
) t, unnest(json_extract_keys(json)) key with offset
join unnest(json_extract_values(json)) value with offset
using(offset)
order by key
);
execute immediate(select '''
select * except(json) from temp_table
pivot (any_value(value) for key in ("''' || string_agg(distinct key, '","') || '"))'
from temp_table
)
If applied to the sample data in your question, the output is: [output image not preserved]

Record type comparison with different numbers of columns isn't failing

Why does the following query not trigger a "cannot compare record types with different numbers of columns" error in PostgreSQL 11.6?
with
s AS (SELECT 1)
, main AS (
SELECT (a) = (b) , (a) = (a), (b) = (b), a, b -- I expect (a) = (b) fails
FROM s
, LATERAL (select 1 as x, 2 as y) AS a
, LATERAL (select 5 as x) AS b
)
select * from main;
While this one does:
with
x AS (SELECT 1)
, y AS (select 1, 2)
select (x) = (y) from x, y;
See the note in the docs on row comparison:
Errors related to the number or types of elements might not occur if the comparison is resolved using earlier columns.
In this case, because a.x=1 and b.x=5, it returns false without ever noticing that the number of columns doesn't match. Change them to match, and you will get the same exception (which is also why the 2nd query does have that exception).
testdb=# with
s AS (SELECT 1)
, main AS (
SELECT a = b , (a) = (a), (b) = (b), a, b -- I expect (a) = (b) fails
FROM s
, LATERAL (select 5 as x, 2 as y) AS a
, LATERAL (select 5 as x) AS b
)
select * from main;
ERROR: cannot compare record types with different numbers of columns
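The behaviour described above can be modelled in a few lines of Python. This is a simplified sketch of the idea (compare column by column, stop at the first difference, and only complain about the column count if no earlier column decided the result), not PostgreSQL's actual implementation; `record_eq` is a hypothetical name chosen for the illustration and NULL handling is left out.

```python
def record_eq(left, right):
    # Compare the columns both records share, left to right.
    for l_val, r_val in zip(left, right):
        if l_val != r_val:
            # An earlier column already decides the result, so the
            # mismatch in column count is never noticed.
            return False
    # All shared columns were equal: now the column counts matter.
    if len(left) != len(right):
        raise ValueError(
            "cannot compare record types with different numbers of columns"
        )
    return True


print(record_eq((1, 2), (5,)))  # first columns differ, so no error is raised

try:
    record_eq((5, 2), (5,))     # first columns match, so the count mismatch surfaces
except ValueError as exc:
    print("ERROR:", exc)
```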

INSERT INTO ... RETURNING multiple columns (PostgreSQL)

I've searched around for an answer and it seems definitive but I figured I would double check with the Stack Overflow community:
Here's what I'm trying to do:
INSERT INTO my_table VALUES (a, b, c)
RETURNING (SELECT x, y, z FROM x_table, y_table, z_table
WHERE xid = a AND yid = b AND zid = c)
I get an error telling me I can't return more than one column.
It works if I tell it SELECT x FROM x_table WHERE xid = a.
Is this at all possible in a single query as opposed to creating a separate SELECT query?
I'm using PostgreSQL 8.3.
Try this.
with aaa as (
INSERT INTO my_table VALUES(a, b, c)
RETURNING a, b, c)
SELECT x, y, z FROM x_table, y_table, z_table
WHERE xid = (select a from aaa)
AND yid = (select b from aaa)
AND zid = (select c from aaa);
In 9.3 similar query works.
@corvinusz's answer was wrong for 8.3 but gave me a great idea that worked, so thanks!
INSERT INTO my_table VALUES (a, b, c)
RETURNING (SELECT x FROM x_table WHERE xid = a),
(SELECT y FROM y_table WHERE yid = b),
(SELECT z FROM z_table WHERE zid = c)
I have no idea why the way it's stated in the question is invalid but at least this works.
I found this approach (within a function!)
DO $$
DECLARE
returner_ID int;
returner_Name text;
returner_Age int;
BEGIN
INSERT INTO schema.table
("ID", "Name", "Age")
VALUES
('1', 'Steven Grant', '30')
RETURNING
"ID",
"Name",
"Age"
INTO
returner_ID,
returner_Name,
returner_Age;
END; $$

Can I add multiple columns to Totals

Using MS SQL 2012
I want to do something like
select a, b, c, a+b+c d
However, a, b, c are complex computed columns; let's take a simple example:
select case when x > 4 then 4 else x end a,
( select count(*) somethingElse) b,
a + b c
order by c
I hope that makes sense
You can use a nested query or a common table expression (CTE) for that. The CTE syntax is slightly cleaner - here it is:
WITH CTE (a, b)
AS
(
select
case when x > 4 then 4 else x end a,
(select count(*) somethingElse) b
from my_table
)
SELECT
a, b, (a+b) as c
FROM CTE
ORDER BY c
I would probably do this:
SELECT
sub.a,
sub.b,
(sub.a + sub.b) as c
FROM
(
select
case when x > 4 then 4 else x end a,
(select count(*) somethingElse) b
FROM MyTable
) sub
ORDER BY c
The easiest way is to do this:
select a,b,c,a+b+c d
from (select <whatever your calcs are for a,b,c>) x
order by c
That just creates a derived table consisting of your calculations for a, b, and c, and allows you to easily reference and sum them up!
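The derived-table pattern can be tried end to end with Python's built-in `sqlite3` module. This is a sketch using SQLite rather than SQL Server, and the table `t` and its rows are hypothetical stand-ins for the real calculations; the point is only that the outer query can reference the aliases `a` and `b` computed in the inner one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (x INTEGER)")  # hypothetical table and data
conn.executemany("INSERT INTO t (x) VALUES (?)", [(2,), (7,), (3,)])

query = """
SELECT a, b, a + b AS c
FROM (
    -- Inner query: compute the complex columns once and name them.
    SELECT CASE WHEN x > 4 THEN 4 ELSE x END AS a,
           (SELECT COUNT(*) FROM t) AS b
    FROM t
) AS sub
ORDER BY c
"""
rows = conn.execute(query).fetchall()
print(rows)
conn.close()
```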

Using IN with convert in sql

I would like to use the IN clause, but with the convert function.
Basically, I have a table (A) with the column of type int.
But in the other table (B) I Have values which are of type varchar.
Essentially, what I am looking for something like this
select *
from B
where myB_Column IN (select myA_Columng from A)
However, I am not sure if the int from table A, would map / convert / evaluate properly for the varchar in B.
I am using SQL Server 2008.
You can use a CASE expression in the WHERE clause like this, and CAST only if the value is numeric;
otherwise return 0 or NULL, depending on your requirements.
SELECT *
FROM B
WHERE CASE ISNUMERIC(myB_Column) WHEN 1 THEN CAST(myB_Column AS INT) ELSE 0 END
IN (SELECT myA_Columng FROM A)
ISNUMERIC also returns 1 (true) for decimal values, so ideally you should implement your own IsInteger UDF. To do that, look at this question:
T-sql - determine if value is integer
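To make the integer check concrete, here is a small Python model of the filter. It is a sketch of the logic only, not T-SQL semantics: `is_integer_text`, `match_as_int`, and `match_as_text` are hypothetical helpers, and the sample values are made up. It also shows that the direction of the conversion matters: converting the text to an integer matches "007" against 7, while converting the integer to text does not.

```python
def is_integer_text(value):
    # True only for an optional sign followed by ASCII digits, e.g. "42" or "-7".
    body = value[1:] if value[:1] in ("+", "-") else value
    return body.isascii() and body.isdigit()


def match_as_int(b_values, a_values):
    # Convert the text side to int, but only when it really is an integer.
    a_set = set(a_values)
    return [v for v in b_values if is_integer_text(v) and int(v) in a_set]


def match_as_text(b_values, a_values):
    # Convert the integer side to text and compare strings instead.
    a_text = {str(a) for a in a_values}
    return [v for v in b_values if v in a_text]


b_column = ["7", "007", "7.0", "abc"]  # varchar values (made up)
a_column = [7, 12]                     # int values (made up)

print(match_as_int(b_column, a_column))
print(match_as_text(b_column, a_column))
```

Here "7.0" is rejected by the integer check even though it looks numeric, which is the kind of value the ISNUMERIC remark above is about.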
Option #1
Select * from B where myB_Column IN
(
Select Cast(myA_Columng As Int) from A Where ISNUMERIC(myA_Columng) = 1
)
Option #2
Select B.* from B
Inner Join
(
Select Cast(myA_Columng As Int) As myA_Columng from A
Where ISNUMERIC(myA_Columng) = 1
) T
On T.myA_Columng = B.myB_Column
Option #3
Select B.* from B
Left Join
(
Select Cast(myA_Columng As Int) As myA_Columng from A
Where ISNUMERIC(myA_Columng) = 1
) T
On T.myA_Columng = B.myB_Column
I would opt for the third one, for the reason given below.
Disadvantages of IN Predicate
Suppose I have two list objects.
List 1 List 2
1 12
2 7
3 8
4 98
5 9
6 10
7 6
Using Contains, it will search for each List-1 item in List-2, which means the iteration will happen 49 times!
You can also use an EXISTS clause:
select *
from B
where EXISTS (select 1 from A WHERE CAST(myA_Columng AS VARCHAR) = myB_Column)
You can use the query below:
select B.*
from B
inner join (Select distinct MyA_Columng from A) AS X ON B.MyB_Column = CAST(x.MyA_Columng as NVARCHAR(50))
Try it by using CAST()
SELECT *
FROM B
WHERE CAST(myB_Column AS INT) IN (
SELECT myA_Columng
FROM A
)