Simple word replacement in output from SQL - sql

I'm running PostgreSQL 9.4.
Is there a replace string function which can take an array of words, or other similar function?
Ex.
SELECT REPLACE(my_column, ['blue', 'red'], ['ColorBlue', 'ColorRed']);
So blue becomes ColorBlue, and red becomes ColorRed?
It's not only such simple replacements, but for the example I'm using this.

One way is to create it yourself:
create or replace function rep_arr(str text, src text[], rep text[])
returns text as $$
begin
for i in 1..array_length(src, 1) loop
str := replace(str, src[i], rep[i]);
end loop;
return str;
end; $$ language plpgsql;
Call:
select rep_arr('bla bla blue bla red bla', '{blue,red}' , '{ColorBlue,ColorRed}');

I agree with @OtoShavadze that you can write your own function.
Here is my solution:
I use the generate_subscripts(array anyarray, dim int) function, as suggested in the Searching in Arrays documentation.
CREATE OR REPLACE FUNCTION translate(string text, from_array text[], to_array text[])
RETURNS text AS
$BODY$
DECLARE
output text;
BEGIN
SELECT INTO output
to_array[idx]
FROM
generate_subscripts(from_array, 1) AS idx
WHERE
from_array[idx] = string; -- here you can change the search condition
IF FOUND THEN
RETURN output;
ELSE
RETURN string;
END IF;
END;
$BODY$
LANGUAGE plpgsql VOLATILE;
It replaces the input only when the whole string equals one of the array elements, but you can change the condition (the line marked in the code) to match a substring, match case-insensitively, etc.
You should also add parameter checking: the arrays should not be NULL, should not be multidimensional, and should not differ in size:
IF from_array IS NULL OR to_array IS NULL THEN
RAISE EXCEPTION 'NULL parameters';
END IF;
IF array_ndims(from_array) != 1 OR array_ndims(to_array) != 1 THEN
RAISE EXCEPTION 'Multidimensional parameters';
END IF;
IF array_length(from_array, 1) != array_length(to_array, 1) THEN
RAISE EXCEPTION 'Parameters size differ';
END IF;
SELECT translate('red', ARRAY['blue', 'red'], ARRAY['ColorBlue', 'ColorRed']);
returns
ColorRed
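As a rough illustration of changing the marked search condition, the lookup can be tried as a standalone query with a case-insensitive comparison. The inline arrays and the input 'RED' are only sample values, and the implicit lateral reference to generate_subscripts assumes PostgreSQL 9.3 or later.

```sql
-- Sketch: the lookup from the function body as a plain query,
-- with the search condition made case-insensitive via lower().
-- The arrays and the input 'RED' are sample values (assumptions).
SELECT a.to_array[idx] AS replacement
FROM (SELECT ARRAY['blue', 'red']           AS from_array,
             ARRAY['ColorBlue', 'ColorRed'] AS to_array) AS a,
     generate_subscripts(a.from_array, 1) AS idx
WHERE lower(a.from_array[idx]) = lower('RED');
```

With these sample values the query should return ColorRed.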

Related

Is it possible to perform multiple replacements in one function call?

If you want to replace multiple strings in one go, you can of course nest the REPLACE function, e.g. like this:
SELECT REPLACE(REPLACE(REPLACE(foo, 'apple', 'fruit'), 'banana', 'fruit'), 'lettuce', 'vegetable')
FROM bar
If you have to do a lot of replacing, your code will become ugly and hard to read. Is there such a thing as a multi-replace function? Which would maybe take 2 arrays as arguments? To be sure, I'm familiar with the TRANSLATE function, but as my example indicates I want to replace whole words, not just single characters.
I would implement such a function in the following way:
CREATE OR REPLACE FUNCTION multi_replace(_string text, variadic _param_args text[])
RETURNS TEXT
AS
$BODY$
DECLARE
_index integer;
BEGIN
FOR _index IN 1 .. cardinality(_param_args) - 1 by 2 loop
_string := replace(_string, _param_args[_index], _param_args[_index+1]);
end loop;
RETURN _string;
END;
$BODY$
LANGUAGE plpgsql VOLATILE;
cardinality() returns the length of the parameter array, and by 2 increases the loop index by 2 for every iteration so that it's safe to use _index and _index + 1 inside the loop to access the pairs that belong together.
Online example: https://rextester.com/LITG61720
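To see which elements the step-2 loop pairs up, the same index arithmetic can be sketched as a plain query. The args CTE and its sample values are assumptions made for illustration; they stand in for the variadic parameter array.

```sql
-- Sketch: list the (search, replace) pairs that a step-2 walk over
-- the argument array visits. "args" is a hypothetical stand-in for
-- the variadic _param_args array.
WITH args(a) AS (
    SELECT ARRAY['apple', 'fruit', 'banana', 'fruit', 'lettuce', 'vegetable']
)
SELECT i AS idx, a[i] AS search_for, a[i + 1] AS replace_with
FROM args,
     generate_series(1, cardinality(a) - 1, 2) AS i
ORDER BY i;
```

For the six sample arguments this visits indexes 1, 3 and 5, i.e. the pairs (apple, fruit), (banana, fruit) and (lettuce, vegetable).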
I was looking for cleanliness, not performance, so I went with your suggestion, 404.
This is the function I wrote:
CREATE OR REPLACE FUNCTION multi_replace(
_string TEXT,
VARIADIC _param_args TEXT[]
)
RETURNS TEXT AS
$BODY$
DECLARE
_old RECORD;
_new TEXT;
BEGIN
FOR _old in (
SELECT
_param_args[i] AS tekst,
i AS indexnummer
FROM generate_series(1, cardinality(_param_args), 2) i -- visit every (search, replace) pair instead of a hard-coded 3 pairs
)
LOOP
_new := COALESCE(_param_args[_old.indexnummer+1], '');
_string := REPLACE(_string, _old.tekst, _new);
END LOOP;
RETURN _string;
END;
$BODY$
LANGUAGE plpgsql VOLATILE;
And this is a sample call:
select multi_replace('I love to drink milk in the morning', 'milk', 'beer', 'morning', 'evening', 'I', 'we');
Odd that this isn't a standard function, maybe a good idea for the next release? Thanks everybody for the suggestions!

SQL function to convert NUMERIC to BYTEA and BYTEA to NUMERIC

In PostgreSQL, how can I convert a NUMERIC value to a BYTEA value? And BYTEA to NUMERIC? For TEXT values I can use CONVERT_TO() and CONVERT_FROM(). Is there anything similar? If not, what would the SQL function code look like?
Here are functions tested with PG 11. Note that numeric2bytea handles only nonnegative numbers.
CREATE OR REPLACE FUNCTION bytea2numeric(_b BYTEA) RETURNS NUMERIC AS $$
DECLARE
_n NUMERIC := 0;
BEGIN
FOR _i IN 0 .. LENGTH(_b)-1 LOOP
_n := _n*256+GET_BYTE(_b,_i);
END LOOP;
RETURN _n;
END;
$$ LANGUAGE PLPGSQL IMMUTABLE STRICT;
CREATE OR REPLACE FUNCTION numeric2bytea(_n NUMERIC) RETURNS BYTEA AS $$
DECLARE
_b BYTEA := '\x';
_v INTEGER;
BEGIN
WHILE _n > 0 LOOP
_v := _n % 256;
_b := SET_BYTE(('\x00' || _b),0,_v);
_n := (_n-_v)/256;
END LOOP;
RETURN _b;
END;
$$ LANGUAGE PLPGSQL IMMUTABLE STRICT;
Example:
=> select bytea2numeric('\xdeadbeef00decafbad00cafebabe');
bytea2numeric
------------------------------------
4516460495214885311638200605653694
(1 row)
=> select numeric2bytea(4516460495214885311638200605653694);
numeric2bytea
--------------------------------
\xdeadbeef00decafbad00cafebabe
(1 row)
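The loop in bytea2numeric is a base-256 positional expansion: each step multiplies the running total by 256 and adds the next byte. A minimal sketch of that arithmetic for a two-byte sample value (the literal '\x0102' is an assumption chosen for illustration):

```sql
-- Sketch: base-256 expansion of the two-byte value 0x0102,
-- the same arithmetic the loop in bytea2numeric performs.
SELECT get_byte('\x0102'::bytea, 0) * 256
     + get_byte('\x0102'::bytea, 1) AS value;
```

This should yield 258, i.e. 1 * 256 + 2.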
In SQL Server, the counterpart of bytea is VARBINARY. Note that the following is T-SQL and does not run on PostgreSQL.
To convert a number to binary, use the following script:
select CONVERT(VARBINARY,10)
The answer will be 0x0000000A.
To convert VARBINARY back to a number:
select CONVERT(int,0x0000000A)
The answer will be 10.

How to get all matching positions in a string?

I have a column flag_acumu in a table in PostgreSQL with values like:
'SSNSSNNNNNNNNNNNNNNNNNNNNNNNNNNNNSNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN'
I need to show all positions with an 'S'. With this code, I only get the first such position, but not the later ones.
SELECT codn_conce, flag_acumu, position('S' IN flag_acumu) AS the_pos
FROM dh12
WHERE position('S' IN flag_acumu) != 0
ORDER BY the_pos ASC;
How to get all of them?
In Postgres 9.4 or later you can conveniently use unnest() in combination with WITH ORDINALITY:
SELECT *
FROM dh12 d
JOIN unnest(string_to_array(d.flag_acumu, NULL))
WITH ORDINALITY u(elem, the_pos) ON u.elem = 'S'
WHERE d.flag_acumu LIKE '%S%' -- optional, see below
ORDER BY d.codn_conce, u.the_pos;
This returns one row per match.
WHERE d.flag_acumu LIKE '%S%' is optional to quickly eliminate source rows without any matches. Pays if there are more than a few such rows.
Detailed explanation and alternatives for older versions:
PostgreSQL unnest() with element number
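To try the technique without the dh12 table, here is a minimal sketch on a string literal. The sample value 'SSNSN' is an assumption; WITH ORDINALITY requires PostgreSQL 9.4 or later, as noted above.

```sql
-- Sketch: one row per position of 'S' in a sample string.
SELECT u.the_pos
FROM unnest(string_to_array('SSNSN', NULL))
     WITH ORDINALITY AS u(elem, the_pos)
WHERE u.elem = 'S'
ORDER BY u.the_pos;

-- Variant: collect all positions into a single array instead.
SELECT array_agg(u.the_pos ORDER BY u.the_pos) AS positions
FROM unnest(string_to_array('SSNSN', NULL))
     WITH ORDINALITY AS u(elem, the_pos)
WHERE u.elem = 'S';
```

For the sample string the first query should return the rows 1, 2 and 4, and the second the array {1,2,4}.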
Since you didn't specify your needs precisely enough for a definitive answer, I'm going with the assumption that you want a list of the positions at which a substring occurs (the substring can be more than 1 character long).
Here's the function to do that using:
FOR .. LOOP control structure,
function substr(text, int, int).
CREATE OR REPLACE FUNCTION get_all_positions_of_substring(text, text)
RETURNS text
STABLE
STRICT
LANGUAGE plpgsql
AS $$
DECLARE
output_text TEXT := '';
BEGIN
FOR i IN 1..length($1)
LOOP
IF substr($1, i, length($2)) = $2 THEN
output_text := CONCAT(output_text, ';', i);
END IF;
END LOOP;
-- Remove first semicolon
output_text := substr(output_text, 2, length(output_text));
RETURN output_text;
END;
$$;
Sample call and output
postgres=# select * from get_all_positions_of_substring('soklesocmxsoso','so');
get_all_positions_of_substring
--------------------------------
1;6;11;13
This works too. And a bit faster I think.
create or replace function findAllposition(_pat varchar, _tar varchar)
returns int[] as
$body$
declare _poslist int[]; _pos int;
begin
_pos := position(_pat in _tar);
while (_pos>0)
loop
if array_length(_poslist,1) is null then
_poslist := _poslist || (_pos);
else
_poslist := _poslist || (_pos + _poslist[array_length(_poslist,1)]); -- _tar now starts right after the previous match position, so the offsets simply add up
end if;
_tar := substr(_tar, _pos + 1, length(_tar));
_pos := position(_pat in _tar);
end loop;
return _poslist;
end;
$body$
language plpgsql;
Will return a position list which is an int array.
{position1, position2, position3, etc.}

How to cut varchar/text before the n-th occurrence of a delimiter? PostgreSQL

I have strings (stored in the database as varchar) and I have to cut them just before the n-th occurrence of a delimiter.
Example input:
String: 'My-Example-Awesome-String'
Delimiter: '-'
Occurrence: 2
Output:
My-Example
I implemented this function as a quick prototype:
CREATE OR REPLACE FUNCTION find_position_delimiter(fulltext varchar, delimiter varchar, occurence integer)
RETURNS varchar AS
$BODY$
DECLARE
result varchar = '';
arr text[] = regexp_split_to_array( fulltext, delimiter);
word text;
counter integer := 0;
BEGIN
FOREACH word IN ARRAY arr LOOP
EXIT WHEN ( counter = occurence );
IF (counter > 0) THEN result := result || delimiter;
END IF;
result := result || word;
counter := counter + 1;
END LOOP;
RETURN result;
END;
$BODY$
LANGUAGE 'plpgsql' IMMUTABLE;
SELECT find_position_delimiter('My-Example-Awesome-String', '-', 2);
For now it assumes that the string is not empty (guaranteed by the query that calls the function) and that the string contains at least one occurrence of the delimiter.
But now I need something that performs better. If possible, I would love to see the most universal solution, because not every user of my system works on a PostgreSQL database (a few of them prefer Oracle, MySQL or SQLite), but that is not the most important point. Performance is, because for a specific search that function can be called several hundred times.
I didn't find a fast and easy way to treat a varchar as an array of chars and check for occurrences of the delimiter (I could record the positions of the occurrences and then take the substring from the first char to the n-th delimiter position - 1). Any ideas? Are there smarter solutions?
# EDIT: yes, I know that the function will be a bit different in every database, but the body of the function can be very similar or the same. Generality is not the main goal :) And sorry for the bad working name of the function; I just noticed it does not describe what it does.
You can try doing something based on this:
select
varcharColumnName,
INSTR(varcharColumnName,'-',1,2),
case when INSTR(varcharColumnName,'-',1,2) <> 0
THEN SUBSTR(varcharColumnName, 1, INSTR(varcharColumnName,'-',1,2) - 1)
else '...'
end
from tableName;
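Of course, you have to handle the "else" branch the way you want. I tested it on Postgres and Oracle. Note that INSTR with four arguments is an Oracle function rather than standard SQL: stock PostgreSQL does not ship it, so there it needs an Oracle-compatibility implementation (for example the orafce extension, or the sample instr functions in the PL/pgSQL porting documentation), and other DBMSs may differ as well.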
//edit - as a function, however this way it's rather hard to make it cross-dbms
CREATE OR REPLACE FUNCTION find_position_delimiter(fulltext varchar, delimiter varchar, occurence integer)
RETURNS varchar as
$BODY$
DECLARE
result varchar := '';
delimiterPos integer := 0;
BEGIN
delimiterPos := INSTR(fulltext,delimiter,1,occurence);
result := SUBSTR(fulltext, 1, delimiterPos - 1);
RETURN result;
END;
$BODY$
LANGUAGE 'plpgsql' IMMUTABLE;
SELECT find_position_delimiter('My-Example-Awesome-String', '-', 2);
create or replace function trunc(string text, delimiter char, occurence int) returns text as $$
return delimiter.join(string.split(delimiter)[:occurence])
$$ language plpythonu;
# select trunc('My-Example-Awesome-String', '-', 2);
trunc
------------
My-Example
(1 row)
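The same split, slice and join idea as the PL/Python version can be sketched with built-in PostgreSQL functions only, so no extra procedural language is needed. The literal input values are taken from the question. Unlike regexp_split_to_array, string_to_array treats the delimiter as a literal string rather than a regular expression.

```sql
-- Sketch: split on the delimiter, keep the first 2 parts, join again.
SELECT array_to_string(
           (string_to_array('My-Example-Awesome-String', '-'))[1:2],
           '-'
       ) AS truncated;
```

For the example input this should return My-Example.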

Intersection of multiple arrays in PostgreSQL

I have a view defined as:
CREATE VIEW View1 AS
SELECT Field1, Field2, array_agg(Field3) AS AggField
FROM Table1
GROUP BY Field1, Field2;
What I would like to do is get the intersection of the arrays in AggField with something like:
SELECT intersection(AggField) FROM View1 WHERE Field2 = 'SomeValue';
Is this at all possible, or is there a better way to achieve what I want?
The closest thing to an array intersection that I can think of is this:
select array_agg(e)
from (
select unnest(a1)
intersect
select unnest(a2)
) as dt(e)
This assumes that a1 and a2 are single dimension arrays with the same type of elements. You could wrap that up in a function something like this:
create function array_intersect(a1 int[], a2 int[]) returns int[] as $$
declare
ret int[];
begin
-- The reason for the kludgy NULL handling comes later.
if a1 is null then
return a2;
elseif a2 is null then
return a1;
end if;
select array_agg(e) into ret
from (
select unnest(a1)
intersect
select unnest(a2)
) as dt(e);
return ret;
end;
$$ language plpgsql;
Then you could do things like this:
=> select array_intersect(ARRAY[2,4,6,8,10], ARRAY[1,2,3,4,5,6,7,8,9,10]);
array_intersect
-----------------
{6,2,4,10,8}
(1 row)
Note that this doesn't guarantee any particular order in the returned array but you can fix that if you care about it. Then you could create your own aggregate function:
-- Pre-9.1
create aggregate array_intersect_agg(
sfunc = array_intersect,
basetype = int[],
stype = int[],
initcond = NULL
);
-- 9.1+ (AFAIK, I don't have 9.1 handy at the moment;
-- see the comments below).
create aggregate array_intersect_agg(int[]) (
sfunc = array_intersect,
stype = int[]
);
And now we see why array_intersect does funny and somewhat kludgey things with NULLs. We need an initial value for the aggregation that behaves like the universal set and we can use NULL for that (yes, this smells a bit off but I can't think of anything better off the top of my head).
Once all this is in place, you can do things like this:
> select * from stuff;
a
---------
{1,2,3}
{1,2,3}
{3,4,5}
(3 rows)
> select array_intersect_agg(a) from stuff;
array_intersect_agg
---------------------
{3}
(1 row)
Not exactly simple or efficient but maybe a reasonable starting point and better than nothing at all.
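On the ordering caveat mentioned above: one way to get a deterministic result is to add ORDER BY inside array_agg. A minimal sketch using the same sample arrays as the earlier call:

```sql
-- Sketch: intersection of two arrays with a deterministic element order.
SELECT array_agg(e ORDER BY e) AS ordered_intersection
FROM (
    SELECT unnest(ARRAY[2, 4, 6, 8, 10])
    INTERSECT
    SELECT unnest(ARRAY[1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
) AS dt(e);
```

This should return {2,4,6,8,10}.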
Useful references:
array_agg
create aggregate
create function
PL/pgSQL
unnest
The accepted answer did not work for me. This is how I fixed it.
create or replace function array_intersect(a1 int[], a2 int[]) returns int[] as $$
declare
ret int[];
begin
-- RAISE NOTICE 'a1 = %', a1;
-- RAISE NOTICE 'a2 = %', a2;
if a1 is null then
-- RAISE NOTICE 'a1 is null';
return a2;
-- elseif a2 is null then
-- RAISE NOTICE 'a2 is null';
-- return a1;
end if;
if a1 = '{}'::integer[] then -- array_length() returns NULL, not 0, for an empty array
return '{}'::integer[];
end if;
select array_agg(e) into ret
from (
select unnest(a1)
intersect
select unnest(a2)
) as dt(e);
if ret is null then
return '{}'::integer[];
end if;
return ret;
end;
$$ language plpgsql;
It is a bit late to answer this question, but maybe somebody will need it, so I decided to share something I wrote because I did not find any ready-made solution for the intersection of any number of arrays. The function receives an array of arrays. If there is only a single array, it returns that array. If there are two arrays, it intersects them and returns the result. If there are more than two arrays, it takes the intersection of the first two, stores it in a variable, and loops through the remaining arrays, intersecting each one with the stored result. If the intermediate result becomes empty, it stops early. At the end, the variable holding the intersected data is returned from the function.
CREATE OR REPLACE FUNCTION array_intersected(iarray bigint[][])
RETURNS bigint[] AS
$BODY$
declare out_arr bigint[]; set1 bigint[]; set2 bigint[];
BEGIN
--RAISE NOTICE '%', array_length(iarray, 1);
if array_length(iarray, 1) = 1 then
SELECT ARRAY(SELECT unnest(iarray[1:1])) into out_arr;
elseif array_length( iarray, 1) = 2 then
set1 := iarray[1:1];
set2 := iarray[2:2];
SELECT ARRAY(SELECT unnest(set1) INTERSECT SELECT unnest(set2))into out_arr;
elseif array_length(iarray, 1) > 2 then
set1 := iarray[1:1];
set2 := iarray[2:2];
--exit if no common numbers exists int 2 first arrays
SELECT ARRAY(SELECT unnest(set1) INTERSECT SELECT unnest(set2))into out_arr;
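-- ARRAY(subquery) yields an empty array, never NULL, and "= NULL" is never true;
-- EXIT is only valid inside a loop, so return early here instead
if cardinality(out_arr) = 0 then
RETURN out_arr;
END IF;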
FOR i IN 3 .. array_upper(iarray, 1)
LOOP
set1 := iarray[i:i];
SELECT ARRAY(SELECT unnest(set1) INTERSECT SELECT unnest(out_arr))into out_arr;
-- stop early once the running intersection is empty ("= NULL" is never true)
if cardinality(out_arr) = 0 then
EXIT;
END IF;
END LOOP;
end if;
return out_arr;
END;
$BODY$
LANGUAGE plpgsql VOLATILE;
Here is the code to validate it works.
select array_intersected(array[[1, 2]]::bigint[][]);
select array_intersected(array[[1, 2],[2, 3]]::bigint[][]);
select array_intersected(array[[1, 2],[2, 3], [2, 4]]::bigint[][]);
select array_intersected(array[[1, 2, 3, 4],[null, null, 4, 3], [3, 1, 4, null]]::bigint[][]);