If you want to replace multiple strings in one go, you can of course nest the REPLACE function, e.g. like this:
SELECT REPLACE(REPLACE(REPLACE(foo, 'apple', 'fruit'), 'banana', 'fruit'), 'lettuce', 'vegetable')
FROM bar
If you have to do a lot of replacing, your code becomes ugly and hard to read. Is there such a thing as a multi-replace function, perhaps one that takes two arrays as arguments? To be sure, I'm familiar with the TRANSLATE function, but as my example indicates, I want to replace whole words, not just single characters.
I would implement such a function in the following way:
CREATE OR REPLACE FUNCTION multi_replace(_string text, variadic _param_args text[])
RETURNS TEXT
AS
$BODY$
DECLARE
_index integer;
BEGIN
FOR _index IN 1 .. cardinality(_param_args) - 1 by 2 loop
_string := replace(_string, _param_args[_index], _param_args[_index+1]);
end loop;
RETURN _string;
END;
$BODY$
LANGUAGE plpgsql VOLATILE;
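cardinality() returns the number of elements in the parameter array, and BY 2 advances the loop index by 2 on every iteration, so _index and _index + 1 always address a search string and the replacement that belongs to it. Because the upper bound is cardinality(_param_args) - 1, a trailing argument without a partner is simply ignored.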
Online example: https://rextester.com/LITG61720
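To make the pairing logic easier to follow outside the database, here is a minimal Python sketch of the same loop. It is only an illustration, not part of the answer above: the name multi_replace is reused for readability, and Python indexes from 0 where PL/pgSQL arrays start at 1.

```python
def multi_replace(text, *pairs):
    """Apply search/replacement pairs in order, like nested REPLACE calls."""
    # Visit indexes 0, 2, 4, ... and stop before an unpaired trailing
    # argument, mirroring "1 .. cardinality(_param_args) - 1 BY 2".
    for i in range(0, len(pairs) - 1, 2):
        text = text.replace(pairs[i], pairs[i + 1])
    return text


print(multi_replace("I love to drink milk in the morning",
                    "milk", "beer", "morning", "evening", "I", "we"))
```

The replacements run one after another, so a later pair also sees text produced by an earlier pair; order the arguments accordingly.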
I was looking for cleanliness, not performance, so I went with your suggestion 404.
This is the function I wrote:
CREATE OR REPLACE FUNCTION multi_replace(
_string TEXT,
VARIADIC _param_args TEXT[]
)
RETURNS TEXT AS
$BODY$
DECLARE
_old RECORD;
_new TEXT;
BEGIN
FOR _old in (
SELECT
_param_args[i] AS tekst,
i AS indexnummer
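-- step through the odd subscripts (1, 3, 5, ...) of however many arguments were passed
FROM generate_series(1, array_length(_param_args, 1), 2) i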
)
LOOP
_new := COALESCE(_param_args[_old.indexnummer+1], '');
_string := REPLACE(_string, _old.tekst, _new);
END LOOP;
RETURN _string;
END;
$BODY$
LANGUAGE plpgsql VOLATILE;
And this is a sample call:
select multi_replace('I love to drink milk in the morning', 'milk', 'beer', 'morning', 'evening', 'I', 'we');
Odd that this isn't a standard function, maybe a good idea for the next release? Thanks everybody for the suggestions!
I have a number in binary (base-2) representation:
"10100110"
How can I transform it to a number in Snowflake?
Snowflake does not provide a number-to-binary (or integer-to-binary) function out of the box; however, these UDFs can be used instead.
I also overloaded the UDF in case a string gets passed.
CREATE OR REPLACE FUNCTION int_to_binary(NUM VARIANT)
RETURNS string
LANGUAGE JAVASCRIPT
AS $$
return (NUM >>> 0).toString(2);
$$;
CREATE OR REPLACE FUNCTION int_to_binary(NUM STRING)
RETURNS string
LANGUAGE JAVASCRIPT
AS $$
return (NUM >>> 0).toString(2);
$$;
I first tried a pure SQL UDF. It worked at first, but not when I applied it to data from a table.
So I had to create a JavaScript UDF:
create or replace function bin_str_to_number(a string)
returns float
language javascript
as
$$
return parseInt(A, 2)
$$
;
select bin_str_to_number('110');
For the record, this is the error I got when attempting a pure SQL UDF for the same:
SQL compilation error: Unsupported subquery type cannot be evaluated
The UDF:
create or replace function bin_str_to_number(a string)
returns number
as
$$
(select sum(value::number*pow(2,index))::number
from table(flatten(input=>split_string_to_char(reverse(a)))))
$$
That was a fun challenge for this morning! If you want to do it in pure SQL:
with binary_numbers as (
select column1 as binary_string
from (values('10100110'), ('101101101'), ('1010011010'), ('1011110')) tab
)
select
binary_string,
sum(to_number(tab.value) * pow(2, (tab.index - 1))) decimal_number
from
binary_numbers,
table(split_to_table(trim(replace(replace(reverse(binary_numbers.binary_string), '1', '1,'), '0', '0,' ), ','), ',')) tab
group by binary_string
Produces:
 BINARY_STRING | DECIMAL_NUMBER
---------------+----------------
 10100110      | 166
 101101101     | 365
 1010011010    | 666
 1011110       | 94
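The query above reverses the string, splits it into digits, and sums digit * 2^position. As a cross-check of that arithmetic, here is a small Python sketch of the same positional sum. The function name is borrowed from the JavaScript UDF purely for illustration; in real Python code int(s, 2) does this directly.

```python
def bin_str_to_number(binary_string):
    """Sum digit * 2**index over the reversed string (least significant bit first)."""
    total = 0
    for index, digit in enumerate(reversed(binary_string)):
        total += int(digit) * 2 ** index
    return total


for s in ("10100110", "101101101", "1010011010", "1011110"):
    print(s, bin_str_to_number(s))
```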
I'm running PostgreSQL 9.4.
Is there a replace string function which can take an array of words, or other similar function?
Ex.
SELECT REPLACE(my_column, ['blue', 'red'], ['ColorBlue', 'ColorRed']);
So blue becomes ColorBlue, and red becomes ColorRed?
My real replacements are not this simple, but this example illustrates the idea.
One way is to create it yourself:
create or replace function rep_arr(str text, src text[], rep text[])
returns text as $$
begin
for i in 1..array_length(src, 1) loop
str := replace(str, src[i], rep[i]);
end loop;
return str;
end; $$ language plpgsql
Call:
select rep_arr('bla bla blue bla red bla', '{blue,red}' , '{ColorBlue,ColorRed}');
I agree with @OtoShavadze that you can write your own function.
Here is my solution:
I use the generate_subscripts(array anyarray, dim int) function, as suggested in the "Searching in Arrays" documentation.
CREATE OR REPLACE FUNCTION translate(string text, from_array text[], to_array text[])
RETURNS text AS
$BODY$
DECLARE
output text;
BEGIN
SELECT INTO output
to_array[idx]
FROM
generate_subscripts(from_array, 1) AS idx
WHERE
from_array[idx] = string; -- here you can change the search condition
IF FOUND THEN
RETURN output;
ELSE
RETURN string;
END IF;
END;
$BODY$
LANGUAGE plpgsql VOLATILE;
It replaces the input only when the whole string equals one of the array elements, but you can change the condition (the line marked in the code) to match a substring, compare case-insensitively, etc.
You should also add parameter checking: the arrays must not be NULL, must be one-dimensional, and must have the same length:
IF from_array IS NULL OR to_array IS NULL THEN
RAISE EXCEPTION 'NULL parameters';
END IF;
IF array_ndims(from_array) != 1 OR array_ndims(to_array) != 1 THEN
RAISE EXCEPTION 'Multidimensional parameters';
END IF;
IF array_length(from_array, 1) != array_length(to_array, 1) THEN
RAISE EXCEPTION 'Parameters size differ';
END IF;
SELECT translate('red', ARRAY['blue', 'red'], ARRAY['ColorBlue', 'ColorRed']);
returns
ColorRed
I have a column flag_acumu in a table in PostgreSQL with values like:
'SSNSSNNNNNNNNNNNNNNNNNNNNNNNNNNNNSNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN'
I need to show all positions with an 'S'. With this code, I only get the first such position, but not the later ones.
SELECT codn_conce, flag_acumu, position('S' IN flag_acumu) AS the_pos
FROM dh12
WHERE position('S' IN flag_acumu) != 0
ORDER BY the_pos ASC;
How to get all of them?
In Postgres 9.4 or later you can conveniently use unnest() in combination with WITH ORDINALITY:
SELECT *
FROM dh12 d
JOIN unnest(string_to_array(d.flag_acumu, NULL))
WITH ORDINALITY u(elem, the_pos) ON u.elem = 'S'
WHERE d.flag_acumu LIKE '%S%' -- optional, see below
ORDER BY d.codn_conce, u.the_pos;
This returns one row per match.
WHERE d.flag_acumu LIKE '%S%' is optional. It quickly eliminates source rows without any matches, which pays off if there are more than a few such rows.
Detailed explanation and alternatives for older versions:
PostgreSQL unnest() with element number
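As a rough mental model of what the query does per row: string_to_array(..., NULL) splits the value into single characters, WITH ORDINALITY numbers them from 1, and the join condition keeps only the 'S' elements. A minimal Python sketch of that idea follows; positions_of is a hypothetical helper name, not something from the answer.

```python
def positions_of(flags, wanted="S"):
    """Return the 1-based positions at which `wanted` occurs in `flags`."""
    # enumerate(..., start=1) plays the role of WITH ORDINALITY here.
    return [pos for pos, elem in enumerate(flags, start=1) if elem == wanted]


print(positions_of("SSNSSN"))
```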
Since the question doesn't fully specify the requirement, I'm assuming that you want a list of the positions at which a substring (possibly longer than one character) occurs.
Here's the function to do that using:
FOR .. LOOP control structure,
function substr(text, int, int).
CREATE OR REPLACE FUNCTION get_all_positions_of_substring(text, text)
RETURNS text
STABLE
STRICT
LANGUAGE plpgsql
AS $$
DECLARE
output_text TEXT := '';
BEGIN
FOR i IN 1..length($1)
LOOP
IF substr($1, i, length($2)) = $2 THEN
output_text := CONCAT(output_text, ';', i);
END IF;
END LOOP;
-- Remove first semicolon
output_text := substr(output_text, 2, length(output_text));
RETURN output_text;
END;
$$;
Sample call and output
postgres=# select * from get_all_positions_of_substring('soklesocmxsoso','so');
get_all_positions_of_substring
--------------------------------
1;6;11;13
This works too. And a bit faster I think.
create or replace function findAllposition(_pat varchar, _tar varchar)
returns int[] as
$body$
declare _poslist int[]; _pos int;
begin
_pos := position(_pat in _tar);
while (_pos>0)
loop
if array_length(_poslist,1) is null then
_poslist := _poslist || (_pos);
else
-- _tar has already been shortened by the last absolute position, so add only that offset
_poslist := _poslist || (_pos + _poslist[array_length(_poslist,1)]);
end if;
_tar := substr(_tar, _pos + 1, length(_tar));
_pos := position(_pat in _tar);
end loop;
return _poslist;
end;
$body$
language plpgsql;
Will return a position list which is an int array.
{position1, position2, position3, etc.}
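Both functions above boil down to "search again, starting one character after the previous hit". Here is a short Python sketch of that scan, returning 1-based positions to match SQL conventions. The name find_all_positions is made up for this illustration, and a non-empty pattern is assumed.

```python
def find_all_positions(pattern, target):
    """Return the 1-based start positions of every (possibly overlapping) match."""
    positions = []
    start = target.find(pattern)  # 0-based index, or -1 if absent
    while start != -1:
        positions.append(start + 1)  # convert to 1-based
        # Resume one character after the previous hit so overlaps are found too.
        start = target.find(pattern, start + 1)
    return positions


print(find_all_positions("so", "soklesocmxsoso"))
```

Tracking the absolute offset in the original string avoids re-deriving positions from a repeatedly shortened copy.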
I have strings (saved in the database as varchar) and I have to cut them just before the n-th occurrence of a delimiter.
Example input:
String: 'My-Example-Awesome-String'
Delimiter: '-'
Occurrence: 2
Output:
My-Example
I implemented this function as a quick prototype:
CREATE OR REPLACE FUNCTION find_position_delimiter(fulltext varchar, delimiter varchar, occurence integer)
RETURNS varchar AS
$BODY$
DECLARE
result varchar = '';
arr text[] = regexp_split_to_array( fulltext, delimiter);
word text;
counter integer := 0;
BEGIN
FOREACH word IN ARRAY arr LOOP
EXIT WHEN ( counter = occurence );
IF (counter > 0) THEN result := result || delimiter;
END IF;
result := result || word;
counter := counter + 1;
END LOOP;
RETURN result;
END;
$BODY$
LANGUAGE 'plpgsql' IMMUTABLE;
SELECT find_position_delimiter('My-Example-Awesome-String', '-', 2);
For now it assumes that the string is not empty (guaranteed by the query that calls the function) and that the string contains at least one occurrence of the delimiter.
But now I need something better for a performance test. If possible, I would love to see the most universal solution, because not every user of my system works on a PostgreSQL database (a few of them prefer Oracle, MySQL or SQLite), but that is not the most important point. Performance is, because in a specific search the function can be called as many as a few hundred times.
I didn't find anything about a fast and easy way to treat a varchar as an array of characters and check for occurrences of the delimiter (I could remember the positions of the occurrences and then take the substring from the first character to the n-th delimiter position - 1). Any ideas? Are there smarter solutions?
EDIT: Yes, I know that the function will be a bit different in every database, but the body of the function can be very similar or the same. Generality is not the main goal :) And sorry for the poor working name of the function; I just noticed it doesn't convey the right meaning.
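The idea sketched in the question (find the position of the n-th delimiter, then take everything before it) can be illustrated in a few lines of Python. This is only a sketch of the algorithm, not a database solution. The name cut_before_nth is hypothetical, and returning the whole string when there are fewer delimiters than requested is an assumption that matches what the prototype does.

```python
def cut_before_nth(text, delimiter, occurrence):
    """Return the part of `text` before the n-th occurrence of `delimiter`."""
    if occurrence < 1:
        raise ValueError("occurrence must be at least 1")
    position = -1
    for _ in range(occurrence):
        # Continue searching just past the previously found delimiter.
        position = text.find(delimiter, position + 1)
        if position == -1:
            return text  # assumption: too few delimiters, keep everything
    return text[:position]


print(cut_before_nth("My-Example-Awesome-String", "-", 2))
```

This scans the string once and never builds an intermediate array, which is the property the question is after.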
You can try something based on this:
select
varcharColumnName,
INSTR(varcharColumnName,'-',1,2),
case when INSTR(varcharColumnName,'-',1,2) <> 0
THEN SUBSTR(varcharColumnName, 1, INSTR(varcharColumnName,'-',1,2) - 1)
else '...'
end
from tableName;
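Of course, you have to handle the "else" branch the way you want. I tested it on Postgres and Oracle. Note, however, that INSTR with a start position and an occurrence argument is an Oracle function rather than standard SQL: stock PostgreSQL has no built-in INSTR (it has to come from an extension or from a user-defined function such as the sample in the PostgreSQL "Porting from Oracle PL/SQL" documentation), and other DBMSs may need a different approach.
Edit: the same as a function, although this way it is rather hard to make it cross-DBMS: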
CREATE OR REPLACE FUNCTION find_position_delimiter(fulltext varchar, delimiter varchar, occurence integer)
RETURNS varchar as
$BODY$
DECLARE
result varchar := '';
delimiterPos integer := 0;
BEGIN
delimiterPos := INSTR(fulltext,delimiter,1,occurence);
result := SUBSTR(fulltext, 1, delimiterPos - 1);
RETURN result;
END;
$BODY$
LANGUAGE 'plpgsql' IMMUTABLE;
SELECT find_position_delimiter('My-Example-Awesome-String', '-', 2);
create or replace function trunc(string text, delimiter char, occurence int) returns text as $$
return delimiter.join(string.split(delimiter)[:occurence])
$$ language plpythonu;
# select trunc('My-Example-Awesome-String', '-', 2);
trunc
------------
My-Example
(1 row)
I'm currently studying SQL statements and got curious about this specific exercise I found on the net.
The problem is:
"How can I encrypt every letter in a row that contain two or more spaces?( with spaces not included)"
I have created a sample table here
sample
-----------------------
first
first second
first second third
first second third fourth
Here is what I would like to get:
sample
-----------------------
first
first second
first ****** *****
first ****** ***** fourth
and this is what I've tried so far:
select name, substring(name, E'(\\s\\w+\\s.*)') from sample ;
This is too complicated for a simple SELECT, so I wrote a function that uses PostgreSQL's rich set of string functions:
CREATE OR REPLACE FUNCTION hide_middle(s varchar)
RETURNS varchar AS
$BODY$
DECLARE
r varchar;
arr varchar[];
BEGIN
r := s;
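-- E'' strings so that \\S reaches the regex engine as \S regardless of standard_conforming_strings
arr := regexp_matches(s, E'^(\\S+ )(.*)( \\S+)$');
IF array_length(arr, 1) = 3 THEN
r := arr[1] || regexp_replace(arr[2], E'\\S', '*', 'g') || arr[3];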
END IF;
RETURN r;
END;
$BODY$
LANGUAGE 'plpgsql' VOLATILE;
You can use it with:
select hide_middle('first second third fourth')
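To see what the regular expression in the function captures, here is a small Python sketch that uses the same pattern: group 1 is the first word plus a space, group 3 is a space plus the last word, and every non-space character in between is replaced by an asterisk. The function name is reused only for illustration.

```python
import re


def hide_middle(s):
    """Mask every non-space character between the first and the last word."""
    match = re.match(r"^(\S+ )(.*)( \S+)$", s)
    if match is None:
        return s  # fewer than three words: nothing to hide
    first, middle, last = match.groups()
    return first + re.sub(r"\S", "*", middle) + last


print(hide_middle("first second third fourth"))
```

Like the function above, this leaves the last word of a three-word row unmasked ("first ****** third").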