Snowflake UDF with variable number of inputs - sql

I want to pass a variable number of inputs to the following udf in Snowflake.
CREATE FUNCTION concat_ws_athena(s1 string, s2 string)
returns string
as
$$
array_to_string(array_construct_compact(s1, s2), '')
$$
;
How do you declare a variable number of inputs?
Simply using an array does not work:
CREATE FUNCTION concat_ws_athena(s array)
returns string
as
$$
array_to_string(array_construct_compact(s), '')
$$
;
SELECT concat_ws_athena('a', 'b')

If you want to simulate exactly the output of this statement:
select array_to_string(array_construct_compact('a', 'b', 'c'), ',');
then your function should look like this:
CREATE OR REPLACE FUNCTION concat_ws_athena(s array)
returns string
as
$$
array_to_string(s, ',')
$$
;
and you would call it like this:
SELECT concat_ws_athena(['a', 'b', 'c']);
That is, instead of passing separate arguments, you pass a single array containing all of them.

Right now you cannot define a UDF with a variable number of input parameters. You can, however, overload UDFs, so you can approximate a variable parameter list by creating one overload per parameter count. You have to pick some reasonable limit where you cut off the overloads. For example, the overloads here allow 2, 3, or 4 parameters. The number could go much higher.
CREATE or replace FUNCTION concat_ws_athena(s1 string, s2 string)
returns string
called on null input
as
$$
array_to_string(array_construct_compact(s1, s2), '')
$$
;
CREATE or replace FUNCTION concat_ws_athena(s1 string, s2 string, s3 string)
returns string
called on null input
as
$$
array_to_string(array_construct_compact(s1, s2, s3), '')
$$
;
CREATE or replace FUNCTION concat_ws_athena(s1 string, s2 string, s3 string, s4 string)
returns string
called on null input
as
$$
array_to_string(array_construct_compact(s1, s2, s3, s4), '')
$$
;
select concat_ws_athena('one','two',null,'three');
Also, most, but not all, Snowflake functions, including UDFs, immediately return null if any input parameter is null. To override that behavior in a UDF, specify called on null input in its definition.
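Since the overloads differ only in their parameter lists, you could generate them instead of typing each one. This is a minimal sketch in Python, not part of the original answer: the helper names overload_ddl and all_overloads are hypothetical, and it only builds the DDL text, which you would still have to run in Snowflake yourself.

```python
def overload_ddl(name: str, arity: int, separator: str = "") -> str:
    """Build one CREATE FUNCTION statement that takes `arity` string arguments.

    Assumption: `separator` contains no single quote, since it is pasted
    into the SQL text without escaping.
    """
    params = [f"s{i}" for i in range(1, arity + 1)]
    signature = ", ".join(f"{p} string" for p in params)
    arguments = ", ".join(params)
    return (
        f"CREATE OR REPLACE FUNCTION {name}({signature})\n"
        "returns string\n"
        "called on null input\n"
        "as\n"
        "$$\n"
        f"array_to_string(array_construct_compact({arguments}), '{separator}')\n"
        "$$\n"
        ";"
    )


def all_overloads(name: str, min_arity: int, max_arity: int) -> str:
    """Concatenate the statements for every arity in the inclusive range."""
    return "\n".join(
        overload_ddl(name, arity) for arity in range(min_arity, max_arity + 1)
    )


if __name__ == "__main__":
    # Prints the same three statements as above (2, 3 and 4 parameters).
    print(all_overloads("concat_ws_athena", 2, 4))
```

Raising max_arity is then the only change needed to support longer argument lists.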

Related

how to transform a number in binary representation to a Snowflake number

I have a number in binary (base-2) representation:
"10100110"
How can I transform it to a number in Snowflake?
Snowflake does not provide a number-to-binary (or integer-to-binary) function out of the box; however, these UDFs can be used instead.
I also overloaded the UDF in the event a string gets passed.
CREATE OR REPLACE FUNCTION int_to_binary(NUM VARIANT)
RETURNS string
LANGUAGE JAVASCRIPT
AS $$
return (NUM >>> 0).toString(2);
$$;
CREATE OR REPLACE FUNCTION int_to_binary(NUM STRING)
RETURNS string
LANGUAGE JAVASCRIPT
AS $$
return (NUM >>> 0).toString(2);
$$;
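To see what NUM >>> 0 does before toString(2), here is a rough Python equivalent for integer input. It is only a sketch of the logic, not Snowflake code: the unsigned shift reinterprets the value as an unsigned 32-bit integer, which the mask below imitates.

```python
def int_to_binary(num: int) -> str:
    """Binary digits of `num` viewed as an unsigned 32-bit integer."""
    # Masking with 0xFFFFFFFF plays the role of JavaScript's `num >>> 0`:
    # negative values wrap around and anything above 32 bits is dropped.
    return format(num & 0xFFFFFFFF, "b")


if __name__ == "__main__":
    print(int_to_binary(166))  # 10100110
    print(int_to_binary(-1))   # thirty-two 1s
```

So negative inputs come back as their 32-bit two's complement pattern, and values that do not fit in 32 bits lose their high bits.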
I tried a pure SQL UDF first. It worked when called with a literal, but not when applied to the rows of a table.
So I had to create a JavaScript UDF:
create or replace function bin_str_to_number(a string)
returns float
language javascript
as
$$
return parseInt(A, 2)
$$
;
select bin_str_to_number('110');
For the record, this is the error I got when attempting a pure SQL UDF for the same:
SQL compilation error: Unsupported subquery type cannot be evaluated
The UDF:
create or replace function bin_str_to_number(a string)
returns number
as
$$
(select sum(value::number*pow(2,index))::number
from table(flatten(input=>split_string_to_char(reverse(a)))))
$$
That was a fun challenge for this morning! If you want to do it in pure SQL:
with binary_numbers as (
select column1 as binary_string
from (values('10100110'), ('101101101'), ('1010011010'), ('1011110')) tab
)
select
binary_string,
sum(to_number(tab.value) * pow(2, (tab.index - 1))) decimal_number
from
binary_numbers,
table(split_to_table(trim(replace(replace(reverse(binary_numbers.binary_string), '1', '1,'), '0', '0,' ), ','), ',')) tab
group by binary_string
Produces:
BINARY_STRING | DECIMAL_NUMBER
--------------+---------------
10100110      | 166
101101101     | 365
1010011010    | 666
1011110       | 94
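The arithmetic behind that query is a positional sum: reverse the string, then add digit * 2^position for each character. A small Python sketch of the same idea (an illustration of the logic only; in plain Python int(s, 2) does the same job):

```python
def bin_str_to_number(binary_string: str) -> int:
    """Sum digit * 2**index over the reversed string, like the SQL above."""
    total = 0
    # enumerate() starts at 0, so the rightmost digit gets weight 2**0.
    # split_to_table numbers rows from 1, hence the `index - 1` in the SQL.
    for index, digit in enumerate(reversed(binary_string)):
        total += int(digit) * 2 ** index
    return total


if __name__ == "__main__":
    for s in ("10100110", "101101101", "1010011010", "1011110"):
        print(s, bin_str_to_number(s))
```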

How can I split a string into characters in Snowflake?

I need to split a string like "abc" into individual records, like "a", "b", "c".
This should be easy in Snowflake: SPLIT(str, delimiter)
But if the delimiter is null or an empty string, I get the full str back, not the individual characters I expected.
Update: SQL UDF
create or replace function split_string_to_char(a string)
returns array
as $$
split(regexp_replace(a, '.', ',\\0', 2), ',')
$$
;
select split_string_to_char('hello');
I found this problem while working on Advent of Code 2020.
Instead of just splitting the string, a working solution is to add commas between all the characters, and then split the result on the commas:
select split(regexp_replace('abc', '.', ',\\0', 2), ',')
If you want to create a table out of it:
select *
from table(split_to_table(regexp_replace('abc', '.', ',\\0', 2), ',')) y
As seen on https://github.com/fhoffa/AdventOfCodeSQL/blob/main/2020/6.sql
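To make the trick easier to follow, here is the same logic sketched in Python (an illustration only, not Snowflake code). The start position 2 in regexp_replace leaves the first character alone, so no leading comma is produced. One caveat follows directly from the approach: a string that already contains commas will produce extra empty elements.

```python
import re
from typing import List


def split_string_to_char(text: str) -> List[str]:
    """Insert a comma before every character except the first, then split."""
    # The first character is kept as-is, mirroring the start position 2
    # passed to regexp_replace in the SQL version.
    head, tail = text[:1], text[1:]
    with_commas = head + re.sub(r".", r",\g<0>", tail)
    return with_commas.split(",")


if __name__ == "__main__":
    print(split_string_to_char("abc"))  # ['a', 'b', 'c']
    print(split_string_to_char("a,b"))  # ['a', '', '', 'b'] (comma caveat)
```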
In addition to Felipe's approach, you could also use a JavaScript UDF:
create function TO_CHAR_ARRAY(STR string)
returns array
language javascript
as
$$
return STR.split('');
$$;
select to_char_array('hello world');

Postgres convert PATH type to ARRAY

Is there any way to convert the Postgres PATH type to an ARRAY in order to have index access to its points?
There is no way to do that with PostgreSQL alone - you'd have to write your own C function.
With the PostGIS extension, you can cast the path to geometry and perform the operation there:
SELECT array_agg(CAST(geom AS point))
FROM st_dumppoints(CAST(some_path AS geometry));
Try a variant of this:
CREATE OR REPLACE FUNCTION YADAMU.YADAMU_make_closed(point[])
returns point[]
STABLE RETURNS NULL ON NULL INPUT
as
$$
select case
when $1[1]::varchar = $1[array_length($1,1)]::varchar then
$1
else
array_append($1,$1[1])
end
$$
LANGUAGE SQL;
--
CREATE OR REPLACE FUNCTION YADAMU.YADAMU_AsPointArray(path)
returns point[]
STABLE RETURNS NULL ON NULL INPUT
as
$$
--
-- Array of Points from Path
--
select case
when isClosed($1) then
YADAMU.YADAMU_make_closed(array_agg(point(v)))
else
array_agg(point(v))
end
from unnest(string_to_array(left(right($1::VARCHAR,-2),-2),'),(')) v
$$
LANGUAGE SQL;
--
For example:
yadamu=# select * from unnest(YADAMU.YADAMU_asPointArray(Path '((0,1),(1,0),(4,0))'));
unnest
--------
(0,1)
(1,0)
(4,0)
(3 rows)
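The string handling in YADAMU_AsPointArray is the core of the approach: drop two characters from each end of the path's text form, then split on '),('. Below is a Python sketch of just that parsing step. It is an illustration only, it deliberately leaves out the closing-point logic of YADAMU_make_closed, and the function name path_text_to_points is made up for the example.

```python
from typing import List, Tuple


def path_text_to_points(path_text: str) -> List[Tuple[float, float]]:
    """Parse the text form of a path into a list of (x, y) tuples.

    Works for both '((x,y),...)' (closed) and '[(x,y),...]' (open),
    because either way two characters are dropped from each end.
    """
    inner = path_text[2:-2]            # like left(right(text, -2), -2)
    points = []
    for pair in inner.split("),("):    # like string_to_array(..., '),(')
        x, y = pair.split(",")
        points.append((float(x), float(y)))
    return points


if __name__ == "__main__":
    print(path_text_to_points("((0,1),(1,0),(4,0))"))
```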

Simple word replacement in output from SQL

I'm running PostgreSQL 9.4.
Is there a replace string function which can take an array of words, or other similar function?
Ex.
SELECT REPLACE(my_column, ['blue', 'red'], ['ColorBlue', 'ColorRed']);
So blue becomes ColorBlue, and red becomes ColorRed?
It's not only such simple replacements, but for the example I'm using this.
One way is to create your own function:
create or replace function rep_arr(str text, src text[], rep text[])
returns text as $$
begin
for i in 1..array_length(src, 1) loop
str := replace(str, src[i], rep[i]);
end loop;
return str;
end; $$ language plpgsql;
Call:
select rep_arr('bla bla blue bla red bla', '{blue,red}' , '{ColorBlue,ColorRed}');
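One thing to keep in mind with this loop is that the replacements are applied one after another, so a later pair can rewrite text produced by an earlier one. A Python sketch of the same loop (an illustration, not PL/pgSQL) makes that easy to try out:

```python
from typing import Sequence


def rep_arr(text: str, src: Sequence[str], rep: Sequence[str]) -> str:
    """Apply each (src[i], rep[i]) replacement in order."""
    # Note: zip() stops at the shorter sequence. That is a choice made
    # for this sketch, not the behaviour of the PL/pgSQL function.
    for old, new in zip(src, rep):
        text = text.replace(old, new)
    return text


if __name__ == "__main__":
    print(rep_arr("bla bla blue bla red bla",
                  ["blue", "red"], ["ColorBlue", "ColorRed"]))
    # Order matters: 'a' becomes 'b', which the second pair then turns into 'c'.
    print(rep_arr("a", ["a", "b"], ["b", "c"]))
```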
I agree with @OtoShavadze that you can write your own function.
Here is my solution:
I use generate_subscripts(array anyarray, dim int) function as suggested in Searching in Arrays documentation.
CREATE OR REPLACE FUNCTION translate(string text, from_array text[], to_array text[])
RETURNS text AS
$BODY$
DECLARE
output text;
BEGIN
SELECT INTO output
to_array[idx]
FROM
generate_subscripts(from_array, 1) AS idx
WHERE
from_array[idx] = string; -- here you can change the search condition
IF FOUND THEN
RETURN output;
ELSE
RETURN string;
END IF;
END;
$BODY$
LANGUAGE plpgsql VOLATILE;
As written, it replaces the input only when the whole string equals one of the array elements, but you can change the search condition (the line marked in the code) to match a substring, compare case-insensitively, etc.
You should also add parameter checking: the arrays must not be null or multidimensional, and they must be the same size:
IF from_array IS NULL OR to_array IS NULL THEN
RAISE EXCEPTION 'NULL parameters';
END IF;
IF array_ndims(from_array) != 1 OR array_ndims(to_array) != 1 THEN
RAISE EXCEPTION 'Multidimensional parameters';
END IF;
IF array_length(from_array, 1) != array_length(to_array, 1) THEN
RAISE EXCEPTION 'Parameters size differ';
END IF;
SELECT translate('red', ARRAY['blue', 'red'], ARRAY['ColorBlue', 'ColorRed']);
returns
ColorRed

How to cut varchar/text before n'th occurrence of delimiter? PostgreSQL

I have strings (saved in the database as varchar) and I have to cut them just before the n'th occurrence of a delimiter.
Example input:
String: 'My-Example-Awesome-String'
Delimiter: '-'
Occurrence: 2
Output:
My-Example
I implemented this function as a quick prototype:
CREATE OR REPLACE FUNCTION find_position_delimiter(fulltext varchar, delimiter varchar, occurence integer)
RETURNS varchar AS
$BODY$
DECLARE
result varchar = '';
arr text[] = regexp_split_to_array( fulltext, delimiter);
word text;
counter integer := 0;
BEGIN
FOREACH word IN ARRAY arr LOOP
EXIT WHEN ( counter = occurence );
IF (counter > 0) THEN result := result || delimiter;
END IF;
result := result || word;
counter := counter + 1;
END LOOP;
RETURN result;
END;
$BODY$
LANGUAGE 'plpgsql' IMMUTABLE;
SELECT find_position_delimiter('My-Example-Awesome-String', '-', 2);
For now it assumes that the string is not empty (guaranteed by the query that calls the function) and that it contains at least one occurrence of the delimiter.
But now I need something better for a performance test. If possible, I would love to see the most universal solution, because not every user of my system works on a PostgreSQL database (a few of them prefer Oracle, MySQL or SQLite), but that is not the most important thing. Performance is, because for a specific search the function can be called a few hundred times.
I didn't find anything about a fast and easy way to treat a varchar as a table of chars and check for occurrences of the delimiter (I could remember the positions of the occurrences and then take the substring from the first char to the position of the n'th delimiter minus 1). Any ideas? Are there smarter solutions?
EDIT: Yes, I know that the function will be a bit different in every database, but the body of the function can be very similar or the same. Generality is not the main goal :) And sorry for the bad working name of the function, I just noticed that it does not describe what it does.
you can try doing something based on this:
select
varcharColumnName,
INSTR(varcharColumnName,'-',1,2),
case when INSTR(varcharColumnName,'-',1,2) <> 0
THEN SUBSTR(varcharColumnName, 1, INSTR(varcharColumnName,'-',1,2) - 1)
else '...'
end
from tableName;
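Of course, you have to handle the "else" branch the way you want. I tested it on Postgres and Oracle. Note that the four-argument INSTR is an Oracle function rather than standard SQL, and stock PostgreSQL does not ship it, so on PostgreSQL and other DBMSs it needs a compatibility implementation (for example the instr sample functions from the PL/pgSQL porting documentation, or the orafce extension).
Edit: the same thing as a function, although this way it is rather hard to make it cross-DBMS: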
CREATE OR REPLACE FUNCTION find_position_delimiter(fulltext varchar, delimiter varchar, occurence integer)
RETURNS varchar as
$BODY$
DECLARE
result varchar := '';
delimiterPos integer := 0;
BEGIN
delimiterPos := INSTR(fulltext,delimiter,1,occurence);
result := SUBSTR(fulltext, 1, delimiterPos - 1);
RETURN result;
END;
$BODY$
LANGUAGE 'plpgsql' IMMUTABLE;
SELECT find_position_delimiter('My-Example-Awesome-String', '-', 2);
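The idea in both versions is the one sketched in the question: find the position of the n'th delimiter, then take everything before it. Here is a minimal Python sketch of that logic (an illustration only; the name cut_before_nth is made up). Returning the input unchanged when there are fewer than n delimiters is an assumption of this sketch, since the SQL versions above leave that case to the caller.

```python
def cut_before_nth(text: str, delimiter: str, occurrence: int) -> str:
    """Return the part of `text` before the n'th occurrence of `delimiter`."""
    if occurrence < 1 or not delimiter:
        raise ValueError("occurrence must be >= 1 and delimiter non-empty")
    position = -1
    for _ in range(occurrence):
        # Continue searching just after the previous hit.
        position = text.find(delimiter, position + 1)
        if position == -1:
            # Assumption: fewer than `occurrence` delimiters means "no cut".
            return text
    return text[:position]


if __name__ == "__main__":
    print(cut_before_nth("My-Example-Awesome-String", "-", 2))  # My-Example
```

This scans the string once and never builds an array, which is the property that makes the INSTR/SUBSTR approach attractive when performance matters.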
create or replace function trunc(string text, delimiter char, occurence int) returns text as $$
return delimiter.join(string.split(delimiter)[:occurence])
$$ language plpythonu;
# select trunc('My-Example-Awesome-String', '-', 2);
trunc
------------
My-Example
(1 row)