I need to split a string like "abc" into individual records, like "a", "b", "c".
This should be easy in Snowflake: SPLIT(str, delimiter)
But if the delimiter is null or an empty string, I get the full string back instead of the individual characters I expected.
Update: SQL UDF
create or replace function split_string_to_char(a string)
returns array
as $$
split(regexp_replace(a, '.', ',\\0', 2), ',')
$$
;
select split_string_to_char('hello');
I found this problem while working on Advent of Code 2020.
Instead of splitting the string directly, a working solution is to insert a comma before every character except the first, and then split on the commas (this assumes the string itself contains no commas):
select split(regexp_replace('abc', '.', ',\\0', 2), ',')
If you want to create a table out of it:
select *
from table(split_to_table(regexp_replace('abc', '.', ',\\0', 2), ',')) y
As seen on https://github.com/fhoffa/AdventOfCodeSQL/blob/main/2020/6.sql
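To see what the comma trick does step by step, here is a minimal sketch of the same idea in plain JavaScript. The function name splitViaCommas is made up for illustration, and the sketch only mirrors the SQL logic; it is not a Snowflake API. It also shows the caveat that a comma already present in the input produces empty items.

```javascript
// Plain-JavaScript analogue of: split(regexp_replace(str, '.', ',\\0', 2), ',')
// splitViaCommas is a hypothetical helper name used only for this sketch.
function splitViaCommas(str) {
  // Leave the first character alone (the SQL starts matching at position 2),
  // then prefix every remaining character with a comma.
  const withCommas = str.slice(0, 1) + str.slice(1).replace(/./g, ',$&');
  return withCommas.split(',');
}

console.log(splitViaCommas('abc')); // [ 'a', 'b', 'c' ]
console.log(splitViaCommas('a,b')); // [ 'a', '', '', 'b' ] (commas in the input break the trick)
```

If the data can contain commas, picking a delimiter that cannot occur in the input is probably the simplest workaround.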
In addition to Felipe's approach, you could also use a JavaScript UDF:
create function TO_CHAR_ARRAY(STR string)
returns array
language javascript
as
$$
return STR.split('');
$$;
select to_char_array('hello world');
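One thing to keep in mind with split(''): in JavaScript it splits on UTF-16 code units, so characters outside the Basic Multilingual Plane (many emoji, for example) are cut into two halves. A small sketch in plain JavaScript of a code-point-aware variant follows; toCharArray is a hypothetical name, and whether Array.from behaves the same inside the Snowflake JavaScript runtime is an assumption I have not verified.

```javascript
// split('') works on UTF-16 code units; Array.from iterates by code point.
// toCharArray is a hypothetical helper name used only for this sketch.
function toCharArray(str) {
  return Array.from(str);
}

const sample = 'a\u{1F600}b'; // "a", a grinning-face emoji, "b"

console.log('hello'.split(''));        // [ 'h', 'e', 'l', 'l', 'o' ]
console.log(sample.split('').length);  // 4 (the emoji is split into two surrogates)
console.log(toCharArray(sample).length); // 3
```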
Related
I want to pass a variable number of inputs to the following udf in Snowflake.
CREATE FUNCTION concat_ws_athena(s1 string, s2 string)
returns string
as
$$
array_to_string(array_construct_compact(s1, s2), '')
$$
;
How do you declare variable number of inputs?
Simply using an array does not work:
CREATE FUNCTION concat_ws_athena(s array)
returns string
as
$$
array_to_string(array_construct_compact(s), '')
$$
;
SELECT concat_ws_athena('a', 'b')
If you want to simulate exactly the output of this statement:
select array_to_string(array_construct_compact('a', 'b', 'c'), ',');
then your function should look like this:
CREATE OR REPLACE FUNCTION concat_ws_athena(s array)
returns string
as
$$
array_to_string(s, ',')
$$
;
and you would call it like this:
SELECT concat_ws_athena(['a', 'b', 'c']);
Note that this passes one array containing all the values instead of separate arguments.
Right now you cannot define a UDF with a variable number of input parameters. You can, however, overload UDFs, so you could support a variable number of input parameters that way. There would have to be some reasonable limit where you cut off the overloads. For example, the overloads here allow 2, 3, or 4 parameters. The number could go much higher.
CREATE or replace FUNCTION concat_ws_athena(s1 string, s2 string)
returns string
called on null input
as
$$
array_to_string(array_construct_compact(s1, s2), '')
$$
;
CREATE or replace FUNCTION concat_ws_athena(s1 string, s2 string, s3 string)
returns string
called on null input
as
$$
array_to_string(array_construct_compact(s1, s2, s3), '')
$$
;
CREATE or replace FUNCTION concat_ws_athena(s1 string, s2 string, s3 string, s4 string)
returns string
called on null input
as
$$
array_to_string(array_construct_compact(s1, s2, s3, s4), '')
$$
;
select concat_ws_athena('one','two',null,'three');
Also, most but not all Snowflake functions including UDFs will immediately return null if any input parameter is null. To override that behavior on UDFs, you can specify called on null input in the definition.
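For reference, the combination array_to_string(array_construct_compact(...), '') amounts to "drop the nulls, then concatenate". A minimal sketch of that logic in plain JavaScript is below; concatCompact is a hypothetical name, and the rest parameter is a JavaScript feature used here only to illustrate the expected result, not something a SQL UDF can declare.

```javascript
// Mimics array_to_string(array_construct_compact(...), ''):
// remove null values, then join what is left with an empty separator.
// concatCompact is a hypothetical helper name used only for this sketch.
function concatCompact(...values) {
  return values.filter((v) => v !== null && v !== undefined).join('');
}

console.log(concatCompact('one', 'two', null, 'three')); // onetwothree
```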
I want to replace the first letter after each comma (,) with its uppercase form in Snowflake. Below is what I tried, but it did not work.
eg:
Apple,ball,cat --> Apple,Ball,Cat
Bulb,LED,tube --> Bulb,LED,Tube
SELECT REGEXP_REPLACE('Apple,ball,cat',',(\\\w)',UPPER('\\\1'));
,(\\\w) captures letters after the comma, but UPPER('\\\1') does not convert it to uppercase.
I am not sure if you can use functions inside REGEXP_REPLACE at all.
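You can use the built-in INITCAP function with a comma as the delimiter. Be aware that INITCAP also lowercases the remaining letters of each word, so LED in the second example would come back as Led.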
SELECT INITCAP('Apple,ball,cat', ',');
Reference: INITCAP
Or maybe like this (WITHIN GROUP keeps the items in their original order):
SELECT LISTAGG(UPPER(LEFT(VALUE, 1)) || SUBSTRING(VALUE, 2), ',') WITHIN GROUP (ORDER BY INDEX)
FROM TABLE(SPLIT_TO_TABLE('Apple,ball,cat', ','));
Not "regex", but if you're interested in a JavaScript UDF to do what you need:
CREATE OR REPLACE FUNCTION fx_replaceInitOnly(input varchar)
returns varchar
language javascript
as '
  // logic from https://www.freecodecamp.org/news/how-to-capitalize-words-in-javascript/
  // Snowflake exposes the argument to JavaScript in uppercase (INPUT).
  var words = INPUT.split(",");
  for (let i = 0; i < words.length; i++) {
    // charAt(0) returns "" for an empty item, so inputs like "a,,b" do not throw
    words[i] = words[i].charAt(0).toUpperCase() + words[i].slice(1);
  }
  return words.join(",");
';
SELECT
'Apple,ball,cat,Bulb,LED,Tube' as str,
fx_replaceInitOnly(str) as new,
case WHEN str <> new THEN 'Changed' ELSE 'Same' END as test;
--STR NEW TEST
--Apple,ball,cat,Bulb,LED,Tube Apple,Ball,Cat,Bulb,LED,Tube Changed
REGEXP_REPLACE cannot uppercase the captured characters for you, so you may combine SPLIT_TO_TABLE and INITCAP instead:
SELECT LISTAGG( INITCAP(VALUE) ,',' )
FROM TABLE(SPLIT_TO_TABLE('Apple,ball,cat',','));
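As for why the original attempt has no effect: most likely UPPER is applied to the replacement text itself (the backreference, which contains no letters) before REGEXP_REPLACE ever runs, so the captured letter is never passed through UPPER. In a language whose regex replace accepts a callback, the capture can be transformed per match. A minimal sketch in plain JavaScript, with upperAfterComma as a made-up name; wrapping it in a Snowflake JavaScript UDF is an untested assumption.

```javascript
// Uppercase the first word character that follows each comma.
// upperAfterComma is a hypothetical helper name used only for this sketch.
function upperAfterComma(str) {
  // The callback receives the whole match and the captured character,
  // so toUpperCase() runs on the actual letter, once per match.
  return str.replace(/,(\w)/g, (match, ch) => ',' + ch.toUpperCase());
}

console.log(upperAfterComma('Apple,ball,cat')); // Apple,Ball,Cat
console.log(upperAfterComma('Bulb,LED,tube'));  // Bulb,LED,Tube
```

Unlike INITCAP, this leaves the rest of each item untouched, so LED stays LED.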
I have a number in binary (base-2) representation:
"10100110"
How can I transform it to a number in Snowflake?
Snowflake does not provide a number-to-binary (integer to base-2 string) function out of the box; however, the following UDFs can be used for that conversion, which is the reverse of what the question asks.
I also overloaded the UDF in case a string gets passed.
CREATE OR REPLACE FUNCTION int_to_binary(NUM VARIANT)
RETURNS string
LANGUAGE JAVASCRIPT
AS $$
return (NUM >>> 0).toString(2);
$$;
CREATE OR REPLACE FUNCTION int_to_binary(NUM STRING)
RETURNS string
LANGUAGE JAVASCRIPT
AS $$
return (NUM >>> 0).toString(2);
$$;
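A note on the >>> 0 in those UDF bodies: in JavaScript it coerces the value to an unsigned 32-bit integer, which is what makes negative numbers come out as their 32-bit two's-complement pattern, and also means anything beyond 32 bits wraps around. A small sketch in plain JavaScript that shows the behavior (intToBinary is just a local name for the same expression):

```javascript
// Same expression as in the UDF bodies, as a plain function for experimenting.
function intToBinary(num) {
  // >>> 0 converts to an unsigned 32-bit integer before formatting in base 2.
  return (num >>> 0).toString(2);
}

console.log(intToBinary(166));         // 10100110
console.log(intToBinary('166'));       // 10100110 (numeric strings are coerced)
console.log(intToBinary(-1).length);   // 32 (thirty-two 1 bits)
console.log(intToBinary(2 ** 32 + 5)); // 101 (bits above 32 are discarded)
```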
I tried a pure SQL UDF first. It worked in isolation, but not when applied to data from a table.
So I had to create a Javascript UDF:
create or replace function bin_str_to_number(a string)
returns float
language javascript
as
$$
return parseInt(A, 2)
$$
;
select bin_str_to_number('110');
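One caveat with parseInt(A, 2): it stops at the first character that is not a valid binary digit and returns whatever it parsed so far, so malformed input can silently give a wrong number. A hedged sketch in plain JavaScript of a stricter variant is below; binStrToNumber and the choice to return null for bad input are my assumptions, not part of the UDF above.

```javascript
// Stricter binary parsing: reject anything that is not purely 0s and 1s.
// binStrToNumber and the null-on-invalid behavior are assumptions for this sketch.
function binStrToNumber(s) {
  if (!/^[01]+$/.test(s)) {
    return null;
  }
  return parseInt(s, 2);
}

console.log(parseInt('1012', 2));        // 5 (only "101" is parsed)
console.log(binStrToNumber('1012'));     // null
console.log(binStrToNumber('110'));      // 6
console.log(binStrToNumber('10100110')); // 166
```

JavaScript numbers are doubles, so results are only exact up to 2^53; very long binary strings would need a different approach.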
For the record, this is the error I got when attempting a pure SQL UDF for the same:
SQL compilation error: Unsupported subquery type cannot be evaluated
The UDF:
create or replace function bin_str_to_number(a string)
returns number
as
$$
(select sum(value::number*pow(2,index))::number
from table(flatten(input=>split_string_to_char(reverse(a)))))
$$
That was a fun challenge for this morning! If you want to do it in pure SQL:
with binary_numbers as (
select column1 as binary_string
from (values('10100110'), ('101101101'), ('1010011010'), ('1011110')) tab
)
select
binary_string,
sum(to_number(tab.value) * pow(2, (tab.index - 1))) decimal_number
from
binary_numbers,
table(split_to_table(trim(replace(replace(reverse(binary_numbers.binary_string), '1', '1,'), '0', '0,' ), ','), ',')) tab
group by binary_string
Produces:
BINARY_STRING | DECIMAL_NUMBER
10100110      | 166
101101101     | 365
1010011010    | 666
1011110       | 94
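The query above is the usual positional sum: reverse the string so the least significant bit comes first, then add digit × 2^position. Here is the same arithmetic as a minimal plain-JavaScript sketch (binaryToDecimal is a made-up name). JavaScript array indexes start at 0, which is why there is no "- 1" here, whereas the SQL subtracts 1 from the 1-based index of SPLIT_TO_TABLE.

```javascript
// Positional sum over the reversed digits: sum(digit * 2^position).
// binaryToDecimal is a hypothetical helper name used only for this sketch.
function binaryToDecimal(binaryString) {
  return binaryString
    .split('')
    .reverse() // least significant bit first
    .reduce((sum, digit, index) => sum + Number(digit) * 2 ** index, 0);
}

for (const s of ['10100110', '101101101', '1010011010', '1011110']) {
  console.log(s, binaryToDecimal(s));
}
// 10100110 166
// 101101101 365
// 1010011010 666
// 1011110 94
```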
Suppose the function takes two inputs, 4 and '3:66,8:54,4:23'. It should search for the key 4 among the key:value pairs in '3:66,8:54,4:23'; if the key is found, it should return the corresponding value (here 23), otherwise it should return an empty result. How do I write such a function in SQL?
You didn't tell us the database product you are using.
In Postgres you can do that using string_to_array() combined with split_part():
select split_part(x.kv, ':', 2)
from unnest(string_to_array('3:66,8:54,4:23', ',')) as x(kv)
where split_part(x.kv, ':', 1) = '4';
You can put that into a SQL function:
create function get_key_value(p_input text, p_key text)
returns text
as
$$
select split_part(x.kv, ':', 2)
from unnest(string_to_array(p_input, ',')) as x(kv)
where split_part(x.kv, ':', 1) = p_key;
$$
language sql;
Then use:
select get_key_value('3:66,8:54,4:23', '4');
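For comparison, here is the same lookup logic as a minimal plain-JavaScript sketch: split on commas, split each pair on the colon, return the value of the first matching key. The name getKeyValue mirrors the SQL function but is otherwise an illustration only. It returns an empty string when the key is missing, as the question asks; the SQL function above would return NULL in that case because the query yields no row.

```javascript
// Look up a key in a "k:v,k:v,..." string; returns '' when the key is absent.
// getKeyValue is an illustrative helper for this sketch.
function getKeyValue(input, key) {
  for (const pair of input.split(',')) {
    const [k, v] = pair.split(':');
    if (k === key) {
      return v;
    }
  }
  return '';
}

console.log(getKeyValue('3:66,8:54,4:23', '4')); // 23
console.log(getKeyValue('3:66,8:54,4:23', '5')); // prints an empty line
```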
I need to create an Oracle DB function that takes a string as parameter. The string contains letters and numbers. I need to extract all the numbers from this string. For example, if I have a string like RO1234, I need to be able to use a function, say extract_number('RO1234'), and the result would be 1234.
To be even more precise, this is the kind of SQL query which this function would be used in.
SELECT DISTINCT column_name, extract_number(column_name)
FROM table_name
WHERE extract_number(column_name) = 1234;
QUESTION: How do I add a function like that to my Oracle database, in order to be able to use it like in the example above, using any of Oracle SQL Developer or SQLTools client applications?
You'd use REGEXP_REPLACE in order to remove all non-digit characters from a string:
select regexp_replace(column_name, '[^0-9]', '')
from mytable;
or
select regexp_replace(column_name, '[^[:digit:]]', '')
from mytable;
Of course you can write a function extract_number. It seems a bit like overkill, though, to write a function that consists of only a single function call.
create function extract_number(in_number varchar2) return varchar2 is
begin
return regexp_replace(in_number, '[^[:digit:]]', '');
end;
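If it helps to see the behavior outside the database, the same "remove everything that is not a digit" idea looks like this as a plain-JavaScript sketch. extractNumber here is only an illustration, and returning null when no digits are found is my choice for the sketch (in Oracle the empty result is NULL as well).

```javascript
// Strip every non-digit character, then convert what is left to a number.
// Returning null for "no digits" is an assumption made for this sketch.
function extractNumber(s) {
  const digits = s.replace(/[^0-9]/g, '');
  return digits === '' ? null : Number(digits);
}

console.log(extractNumber('RO1234'));                // 1234
console.log(extractNumber('stack12345overflow569')); // 12345569
console.log(extractNumber('abc'));                   // null
```

Note that all digits are concatenated, so two separate numbers in one string end up merged into one.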
You can use regular expressions to extract the number from a string. Let's check it. Suppose this is the string mixing text and numbers: 'stack12345overflow569'. This one should work:
select regexp_replace('stack12345overflow569', '[[:alpha:]]|_') as numbers from dual;
which will return "12345569".
You can also use this one:
select regexp_replace('stack12345overflow569', '[^0-9]', '') as numbers,
regexp_replace('Stack12345OverFlow569', '[^a-zA-Z]', '') as characters
from dual;
which will return "12345569" for numbers and "StackOverFlow" for characters.
This works for me; I only need the first number in the string:
TO_NUMBER(regexp_substr(h.HIST_OBSE, '\.*[[:digit:]]+\.*[[:digit:]]*'))
the field had the following string: "(43 Paginas) REGLAS DE PARTICIPACION".
result field: 43
If you are looking for the first number in the string, including its decimal places, you may try the regexp_substr function like this:
regexp_substr('stack12.345overflow', '\.*[[:digit:]]+\.*[[:digit:]]*')
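To experiment with "first number, optionally with decimals" outside Oracle, here is a minimal plain-JavaScript sketch. It uses a slightly stricter pattern than the one above (digits, then optionally a dot followed by digits), so it is a variant of the idea and not a one-to-one translation; firstNumber is a made-up name.

```javascript
// Return the first number in the string (integer or decimal), or null if none.
// firstNumber is a hypothetical helper; the pattern is stricter than the Oracle one above.
function firstNumber(s) {
  const m = s.match(/\d+(\.\d+)?/);
  return m ? Number(m[0]) : null;
}

console.log(firstNumber('(43 Paginas) REGLAS DE PARTICIPACION')); // 43
console.log(firstNumber('stack12.345overflow'));                  // 12.345
console.log(firstNumber('no digits here'));                       // null
```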
To extract only the alphabetic characters from a string:
SELECT REGEXP_REPLACE(column_name,'[^[:alpha:]]') alpha FROM table_name
In order to extract the month and the year from a string such as 'A0807X', I did the following in PL/SQL:
DECLARE
lv_promo_code VARCHAR2(10) := 'A0807X';
lv_promo_num VARCHAR2(5);
lv_promo_month NUMBER(4);
lv_promo_year NUMBER(4);
BEGIN
lv_promo_num := REGEXP_SUBSTR(lv_promo_code, '(\d)(\d)(\d)(\d)');
lv_promo_month := EXTRACT(month from to_date(lv_promo_num, 'MMYY'));
DBMS_OUTPUT.PUT_LINE(lv_promo_month);
lv_promo_year := EXTRACT(year from to_date(lv_promo_num, 'MMYY'));
DBMS_OUTPUT.PUT_LINE(lv_promo_year);
END;
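The same steps (grab four consecutive digits, read them as MMYY) can be sketched in plain JavaScript for quick testing. parsePromoCode is a hypothetical name, and adding 2000 to the two-digit year is an assumption that stands in for how the YY format mask resolves the century at the time of writing.

```javascript
// Extract the first run of four digits and interpret it as MMYY.
// parsePromoCode is a hypothetical helper; "2000 +" is an assumed century.
function parsePromoCode(code) {
  const m = code.match(/\d{4}/);
  if (!m) {
    return null; // no four-digit run found
  }
  const month = Number(m[0].slice(0, 2));
  const year = 2000 + Number(m[0].slice(2));
  if (month < 1 || month > 12) {
    return null; // not a valid month
  }
  return { month, year };
}

console.log(parsePromoCode('A0807X')); // { month: 8, year: 2007 }
console.log(parsePromoCode('A1307X')); // null
```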