I'm looking for some tips to get this PostgreSQL plpgsql function to work. I would like to do a couple of regex's on each record and update only one time as doing them one by one takes 15 min. ...
CREATE OR REPLACE FUNCTION clean_column() RETURNS void AS $$
DECLARE
r record;
reg text[] := array[
['search','replace'],
['search','replace'],
['search','replace'],
['search','replace']
];
var text[];
tmp text;
BEGIN
for r in
select column from mytable
loop -- loop over all records
tmp = r;
FOREACH var SLICE 1 IN ARRAY reg
LOOP -- loop over all changes
tmp = regexp_replace(tmp,var[1],var[2]);
END LOOP;
UPDATE mytable SET r = tmp;
end loop;
END
$$ LANGUAGE plpgsql;
... there is a problem with r as it is not assigned. Probably my lack of understanding of how plpgsql works.
Maybe there is an other way to do multiple changes on a record field?
Your function would update the row repeatedly, writing a new row version every time. That would still be hugely inefficient.
The point must be to update every row only once. And only if anything actually changes.
CREATE OR REPLACE FUNCTION clean_column(INOUT _text text)
LANGUAGE plpgsql IMMUTABLE STRICT PARALLEL SAFE AS
$func$
DECLARE
_reg CONSTANT text[] := ARRAY[
['search','replace']
, ['search','replace']
, ['search','replace']];
_var text[];
BEGIN
FOREACH _var SLICE 1 IN ARRAY _reg
LOOP
_text := regexp_replace(_text, _var[1], _var[2]);
END LOOP;
END
$func$;
The function does not run the UPDATE itself, just the string processing. Use that function in your UPDATE like so:
UPDATE tbl
SET col = clean_column(col)
WHERE col IS DISTINCT FROM clean_column(col) -- ① !
AND col IS NOT NULL -- ② ?
① Skip updates that would not change anything.
② Skip rows with NULL early (without even evaluating the function). Only relevant if column can be NULL, of course.
Performance will differ by orders of magnitude.
I have a stored procedure which I'm trying to use to delete several rows of a table based on an array of id's, from the rows that are deleted I want to return those id's and store them in a variable so that it can be used in another delete statement. Here's a reduced segment of my function.
create or replace function "table_a"."deletes"
(p_ids int[]
)
returns int
as $$
declare
v_directories int[];
begin
delete from "table_a"."foo" where "hoo_id" = any(unnest(p_ids)) returning id into v_dirs;
delete from "table_a"."bar" where "foo_id" = any(unnest(v_dirs));
return 1;
exception
when others
then raise exception '% %', sqlstate, sqlerrm;
end;
$$ LANGUAGE plpgsql;
This gives me an error of -
'set-returning functions are not allowed in WHERE'
What am I missing?
Use a CTE instead:
with ids as (
delete from "table_a"."foo"
where "hoo_id" = any(unnest(p_ids))
returning id
)
delete from "table_a"."bar"
where "foo_id" in (select id from ids);
I have 2 tables .The first table contains rows with placeholders and the second table contains those placeholders values.
I want a query which fetches data from the first table and replaces placeholders with actual values which are stored in the second table.
Ex:
Table1 Data
id value
608CB424-90BF-4B08-8CF8-241C7635434F jdbc:postgresql://{POSTGRESIP}:{POSTGRESPORT}/{TESTDB}
CDA4C3D4-72B5-4422-8071-A29D32BD14E0 https://{SERVICEIP}/svc/{TESTSERVICE}/
Table2 Data
id placeolder value
201FEBFE-DF92-4474-A945-A592D046CA02 POSTGRESIP 1.2.3.4
20D9DE14-643F-4CE3-B7BF-4B7E01963366 POSTGRESPORT 5432
45611605-F2D9-40C8-8C0C-251E300E183C TESTDB mytest
FA8E2E4E-014C-4C1C-907E-64BAE6854D72 SERVICEIP 10.90.30.40
45B76C68-8A0F-4FD3-882F-CA579EC799A6 TESTSERVICE mytest-service
Required output is
id value
608CB424-90BF-4B08-8CF8-241C7635434F jdbc:postgresql://1.2.3.4:5432/mytest
CDA4C3D4-72B5-4422-8071-A29D32BD14E0 https://10.90.30.40/svc/mytest-service/
If you want to use Python-like named placeholders then you need the helper function written on plpythonu:
create extension plpythonu;
create or replace function formatpystring( str text, a json ) returns text immutable language plpythonu as $$
import json
d = json.loads(a)
return str.format(**d)
$$;
Then simple test:
select formatpystring('{foo}.{bar}', '{"foo": "win", "bar": "amp"}');
formatpystring
----------------
win.amp
Finally you need to compose those arguments from your tables. It is simple:
select t1.id, formatpystring(t1.value, json_object_agg(t2.placeholder, t2.value)) as value
from table1 as t1, table2 as t2
group by t1.id, t1.value;
(Query was not tested but you have the direction)
(Clumsy) dynamic SQL implementation, featuring an outer join, but generating a recursive function call:
This function will not be very efficient, but probably the translation table is relatively small.
CREATE TABLE xlat_table (aa text ,bb text);
INSERT INTO xlat_table (aa ,bb ) VALUES( 'BBB', '/1.2.3.4/')
,( 'ccc', 'OMG') ,( 'ddd', '/4.3.2.1/') ;
CREATE FUNCTION dothe_replacements(_arg1 text) RETURNS text
AS
$func$
DECLARE
script text;
braced text;
res text;
found record; -- (aa text, bb text, xx text);
BEGIN
script := '';
res := format('%L', _arg1);
for found IN SELECT xy.aa,xy.bb
, regexp_matches(_arg1, '{\w+}','g' ) AS xx
FROM xlat_table xy
LOOP
-- RAISE NOTICE '#xx=%', found.xx[1];
-- RAISE NOTICE 'aa=%', found.aa;
-- RAISE NOTICE 'bb=%', found.bb;
braced := '{'|| found.aa || '}';
IF (found.xx[1] = braced ) THEN
-- RAISE NOTICE 'Res=%', res;
script := format ('replace(%s, %L, %L)'
,res,braced,found.bb);
res := format('%s', script);
END IF;
END LOOP;
if(length(script) =0) THEN return res; END IF;
script :='Select '|| script;
-- RAISE NOTICE 'script=%', script;
EXECUTE script INTO res;
return res;
END;
$func$
LANGUAGE plpgsql;
SELECT dothe_replacements( 'aaa{BBB}ccc{ddd}eee' );
SELECT dothe_replacements( '{AAA}bbb{CCC}DDD}{EEE}' );
Results:
CREATE TABLE
INSERT 0 3
CREATE FUNCTION
dothe_replacements
-----------------------------
aaa/1.2.3.4/ccc/4.3.2.1/eee
(1 row)
dothe_replacements
--------------------------
'{AAA}bbb{CCC}DDD}{EEE}'
(1 row)
The above method has quadratic behaviour(wrt the numberof xlat-entries); which is horrible.
But,we could dynamically create a function (once) and call it multiple times
(a poor man's generator)
Selecting only the relevant entries from the xlat table should probably be added.
And, you should of course re-create the function everytime the xlat table is changed.
CREATE FUNCTION create_replacement_function(_name text) RETURNS void
AS
$func$
DECLARE
argname text;
res text;
script text;
braced text;
found record; -- (aa text, bb text, xx text);
BEGIN
script := '';
argname := '_arg1';
res :=format('%I', argname);
for found IN SELECT xy.aa,xy.bb
FROM xlat_table xy
LOOP
-- RAISE NOTICE 'aa=%', found.aa;
-- RAISE NOTICE 'bb=%', found.bb;
-- RAISE NOTICE 'Res=%', res;
braced := '{'|| found.aa || '}';
script := format ('replace(%s, %L, %L)'
,res,braced,found.bb);
res := format('%s', script);
END LOOP;
script :=FORMAT('CREATE FUNCTION %I (_arg1 text) RETURNS text AS
$omg$
BEGIN
RETURN %s;
END;
$omg$ LANGUAGE plpgsql;', _name, script);
RAISE NOTICE 'script=%', script;
EXECUTE script ;
return ;
END;
$func$
LANGUAGE plpgsql;
SELECT create_replacement_function( 'my_function');
SELECT my_function('aaa{BBB}ccc{ddd}eee' );
SELECT my_function( '{AAA}bbb{CCC}DDD}{EEE}' );
And the result:
CREATE FUNCTION
NOTICE: script=CREATE FUNCTION my_function (_arg1 text) RETURNS text AS
$omg$
BEGIN
RETURN replace(replace(replace(_arg1, '{BBB}', '/1.2.3.4/'), '{ccc}', 'OMG'), '{ddd}', '/4.3.2.1/');
END;
$omg$ LANGUAGE plpgsql;
create_replacement_function
-----------------------------
(1 row)
my_function
-----------------------------
aaa/1.2.3.4/ccc/4.3.2.1/eee
(1 row)
my_function
------------------------
{AAA}bbb{CCC}DDD}{EEE}
(1 row)
The following offers a plpgsql solution in a with a single function.
You'll notice I've 'renamed' the value column. It's bad practice using rserved/key words as object names. Also soq is the schema I use for all SO code.
The process first takes the holder-values from table2 and generates a set of key-value pairs (in this case hstore, but jsonb would also work). It then builds an array from the value column (my column name: val_string) containing the place_holder name from the value. Finally, it iterates that array replacing the actual holder-name with the value from the key-values using the array value as the lookup key.
The performance would not be great with a larger volume from either table. If you need to process a large volume at a time to a single row temp table may yield better performance.
create or replace function soq.replace_holders( place_holder_line_in text)
returns text
language plpgsql
as $$
declare
l_holder_values hstore;
l_holder_line text;
l_holder_array text[];
l_indx integer;
begin
-- transform cloumns to key-value pairs of holder-value
select string_agg(place,',')::hstore
into l_holder_values
from (
select concat( '"',place_holder,'"=>"',place_value,'"') place
from soq.table2
) p;
-- raise notice 'holder_array_in==%',l_holder_values;
-- extract the text line and build array of place_holder names
select phv, string_to_array (string_agg(v,','),',')
into l_holder_line,l_holder_array
from (
select replace(replace(place_holder_line_in,'{',''),'}','') phv
, replace(replace(replace(regexp_matches(place_holder_line_in,'({[^}]+})','g')::text ,'{',''),'}',''),'"','') v
) s
group by phv;
-- raise notice 'Array==%',l_holder_array::text;
-- replace each key from text line with the corresponding value
for l_indx in 1 .. array_length(l_holder_array,1)
loop
l_holder_line = replace(l_holder_line,l_holder_array[l_indx],l_holder_values -> l_holder_array[l_indx]);
end loop;
-- done
return l_holder_line;
end;
$$;
-- Test driver
select id, soq.replace_holders(val_string) result_value from soq.table1;
I have created a simple query for this solution and it working as required.
WITH RECURSIVE cte(id, value, level) AS (
SELECT id,value, 0 as level
FROM Table1
UNION
SELECT ts.id,replace(ts.value,'{'||tp.placeholder||'}',tp.value) as value, level+1
FROM cte ts, Table2 tp WHERE ts.value LIKE CONCAT('%',tp.placeholder, '%')
)
SELECT id, value FROM cte c
where level =
(
select Max(level)
from cte c2 where c.id=c2.id
)
Output is
id value
CDA4C3D4-72B5-4422-8071-A29D32BD14E0 https://10.90.30.40/svc/mytest-service/
608CB424-90BF-4B08-8CF8-241C7635434F jdbc:postgresql://1.2.3.4:5432/mytest
I am creating a function in pgsql script language, and what I want to do in this point is iterate over the results of a query and for each row do something specific. My current try is the following, where temprow is declared as temprow user_data.users%rowtype. The code in question is the following:
FOR temprow IN
SELECT * FROM user_data.users ORDER BY user_seasonpts DESC LIMIT 10
LOOP
SELECT user_id,user_seasonpts INTO player_idd,season_ptss FROM temprow;
INSERT INTO user_data.leaderboards (season_num,player_id,season_pts) VALUES (old_seasonnum,player_idd,season_ptss);
END LOOP;
However I get the following error from this: ERROR: relation "temprow" does not exist. If it's clear what I want to be done, could you point to me the right way to do it?
temprow is a record variable which is bound in turn to each record of the first SELECT.
So you should write:
FOR temprow IN
SELECT * FROM user_data.users ORDER BY user_seasonpts DESC LIMIT 10
LOOP
INSERT INTO user_data.leaderboards (season_num,player_id,season_pts) VALUES (old_seasonnum,temprow.userd_id,temprow.season_ptss);
END LOOP;
This loop could be further simplified as a single query:
INSERT INTO user_data.leaderboards (season_num,player_id,season_pts)
SELECT old_seasonnum,player_idd,season_ptss FROM user_data.users ORDER BY user_seasonpts DESC LIMIT 10
A function that loop through the select and use loop item values to filter and calculate other values,
CREATE FUNCTION "UpdateTable"() RETURNS boolean
LANGUAGE plpgsql
AS
$$
DECLARE
TABLE_RECORD RECORD;
BasePrice NUMERIC;
PlatformFee NUMERIC;
MarketPrice NUMERIC;
FinalAmount NUMERIC;
BEGIN
FOR TABLE_RECORD IN SELECT * FROM "SchemaName1"."TableName1" -- can select required fields only
LOOP
SELECT "BasePrice", "PlatformFee" INTO BasePrice, PlatformFee
FROM "SchemaName2"."TableName2" WHERE "UserID" = TABLE_RECORD."UserRID";
SELECT "MarketPrice" / 100 INTO MarketPrice FROM "SchemaName3"."TableName3" WHERE "DateTime" = TABLE_RECORD."DateTime";
FinalAmount = TABLE_RECORD."Qty" * (BasePrice + PlatformFee - MarketPrice);
UPDATE "SchemaName1"."TableName1" SET "MarketPrice" = MarketPrice, "Amount" = CFDAmount
WHERE "ID" = CFD_RECORD."ID"; -- can update other schema tables also
END LOOP;
RETURN TRUE;
END
$$;
For future reference, I want to emphasise Thushara comment on the accepted answer
On Postgres#12 the following would work:
DO $$
DECLARE temprow RECORD;
BEGIN FOR temprow IN
SELECT * FROM user_data.users ORDER BY user_seasonpts DESC LIMIT 10
LOOP
INSERT INTO user_data.leaderboards (season_num,player_id,season_pts) VALUES (old_seasonnum,temprow.userd_id,temprow.season_ptss);
END LOOP;
END $$
I have the following plpgsql function in PostgreSQL:
CREATE OR REPLACE FUNCTION func1()
RETURNS SETOF type_a AS
$BODY$
declare
param text;
sqls varchar;
row type_a;
begin
code.....
sqls='select * from func3(' || param || ') ';
for row in execute sqls LOOP
return next row;
END LOOP;
end if;
return;
end
$BODY$
LANGUAGE plpgsql VOLATILE
I want to add an insert statment into the loop, so that the loop will work as it is now but also all rows will be saved in a table.
for row in execute sqls LOOP
INSERT INTO TABLE new_tab(id, name)
return next row;
the thing is that I don't know how to do that... the insert statment normaly has syntax of:
INSERT INTO new_tab(id, name)
SELECT x.id, x.name
FROM y
but this syntax doesn't fit here. There is no query to select rows from.... the rows are in the loop.
Basic insert with values looks like this:
INSERT INTO table_name (column1,column2,column3,...)
VALUES (value1,value2,value3,...);
Based on the additional comments you need to use cursor instead of execute sqls.
No need for a loop, you can use insert .. select ... returning in dynamic SQL just as well:
create or replace function func1()
returns table (id integer, name text)
as
$$
declare
param text;
begin
param := ... ;
return query execute
'insert into new_tab (id, name)
select id, name
from func3($1)
returning *'
using param;
end;
$$
language plpgsql;
Note that I used a parameter placeholder and the USING clause instead of concatenating the parameter into the query - much more robust.