Scope of column names, aliases, and OUT parameters in PL/pgSQL function

I have a hard time understanding why I can refer to the output columns declared in returns table(col type).
There is a subtle bug in the code below: the order by var refers to res in returns, not to data1, which we aliased to res. res in where is always null, so we get 0 rows.
Why can I refer to the column name in output?
In what cases do I want this?
CREATE OR REPLACE FUNCTION public.test(var INTEGER)
RETURNS table(res int )
LANGUAGE plpgsql
AS $function$
begin
return query
select data1 res
from table_with_data
where res < var;
end
$function$

Why can I refer to the column name in output
From the manual, the section about function parameters:
column_name The name of an output column in the RETURNS TABLE syntax. This is effectively another way of declaring a named OUT parameter, except that RETURNS TABLE also implies RETURNS SETOF.
What this means is that in your case res is effectively a writable variable of the type you plan to return a set of. Like any other variable without a default value, it starts off as null.
In what case do I want this
You can return multiple records from a function of this type with a single return query. Another way is a series of return query or return next statements; with return next, you fill in the fields of your output record before each call. You might expect any return statement to end the function, but in this scenario only a bare return; has that effect.
create table public.test_res (data integer);
CREATE OR REPLACE FUNCTION public.test(var INTEGER)
RETURNS table(res int )
LANGUAGE plpgsql
AS $function$
begin
insert into public.test_res select res;--to inspect its initial value later
select 1 into res;
return next;
return next;--note that res isn't reset after returning next
return query select 2;--doesn't affect the current value of res
return next;--returning something else earlier didn't affect res either
return;--it will finish here
select 3 into res;
return next;
end
$function$;
select * from test(0);
-- res
-------
-- 1
-- 1
-- 2
-- 1
--(4 rows)
table public.test_res; --this was the initial value of res within the function
-- data
--------
-- null
--(1 row)
This is most useful with loops:
CREATE OR REPLACE FUNCTION public.test(var INTEGER)
RETURNS table(comment text,res int) LANGUAGE plpgsql AS $function$
declare rec record;
array_slice int[];
begin
return query select 'return query returned these multiple records in one go', a from generate_series(1,3,1) a(a);
res:=0;
comment:='loop exit when res>4';
loop exit when res>4;
select res+1 into res;
return next;
end loop;
comment:='while res between 5 and 8 loop';
while res between 5 and 8 loop
select res+2 into res;
return next;
end loop;
comment:='for element in reverse 3 .. -3 by 2 loop';
for element in reverse 3 .. -3 by 2 loop
select element into res;
return next;
end loop;
comment:='for <record> in <expression> loop';
for rec in select pid from pg_stat_activity where state<>'idle' loop
select rec.pid into res;
return next;
end loop;
comment:='foreach array_slice slice 1 in array arr loop';
foreach array_slice SLICE 1 in array ARRAY[[1,2,3],[11,12,13],[21,22,23]] loop
select array_slice[1] into res;
return next;
end loop;
end
$function$;
Example results
select * from public.test(0);
-- comment | res
----------------------------------------------------------+--------
-- return query returned these multiple records in one go | 1
-- return query returned these multiple records in one go | 2
-- return query returned these multiple records in one go | 3
-- loop exit when res>4 | 1
-- loop exit when res>4 | 2
-- loop exit when res>4 | 3
-- loop exit when res>4 | 4
-- loop exit when res>4 | 5
-- while res between 5 and 8 loop | 7
-- while res between 5 and 8 loop | 9
-- for element in reverse 3 .. -3 by 2 loop | 3
-- for element in reverse 3 .. -3 by 2 loop | 1
-- for element in reverse 3 .. -3 by 2 loop | -1
-- for element in reverse 3 .. -3 by 2 loop | -3
-- for <record> in <expression> loop | 118786
-- foreach array_slice slice 1 in array arr loop | 1
-- foreach array_slice slice 1 in array arr loop | 11
-- foreach array_slice slice 1 in array arr loop | 21
--(18 rows)

True, OUT parameters (including field names in a RETURNS TABLE (...) clause) are visible in all SQL DML statements in a PL/pgSQL function body, just like other variables. Find details in the manual chapters Variable Substitution and Returning from a Function for PL/pgSQL.
However, a more fundamental misunderstanding comes first here. The syntax of your nested SELECT is invalid to begin with. The PL/pgSQL variable happens to mask this problem (with a different problem). In SQL, you cannot refer to output column names (column aliases in the SELECT clause) in the WHERE clause. This is invalid:
select data1 res
from table_with_data
where res < var;
The manual:
An output column's name can be used to refer to the column's value in
ORDER BY and GROUP BY clauses, but not in the WHERE or HAVING clauses;
there you must write out the expression instead.
This is different for ORDER BY, which you mention in the text, but don't include in the query. See:
GROUP BY + CASE statement
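To make the rule from the manual concrete, here is a minimal sketch in plain SQL, outside any function. The temporary table and its three rows are made up for illustration; only the column name data1 comes from the question.

```sql
-- Hypothetical stand-in for the question's table
CREATE TEMP TABLE table_with_data (data1 int);
INSERT INTO table_with_data VALUES (3), (1), (2);

-- Valid: ORDER BY may refer to the output column name res
SELECT data1 AS res
FROM   table_with_data
ORDER  BY res;

-- Valid: WHERE refers to the input column data1
SELECT data1 AS res
FROM   table_with_data
WHERE  data1 < 3;

-- Invalid in plain SQL, because WHERE cannot see the alias:
-- SELECT data1 AS res FROM table_with_data WHERE res < 3;
-- ERROR:  column "res" does not exist
```

Inside the PL/pgSQL function the last query does not fail, because res resolves to the OUT parameter instead, which is exactly the silent bug described in the question.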
Fixing immediate issue
Could be repaired like this:
CREATE OR REPLACE FUNCTION public.test1(var int)
RETURNS TABLE(res int)
LANGUAGE plpgsql AS
$func$
BEGIN
RETURN QUERY
SELECT data1 AS res -- column alias is just noise (or documentation)
FROM table_with_data
WHERE data1 < var; -- original column name!
END
$func$
See:
Real number comparison for trigram similarity
The column alias is just noise in this case. The name of the column returned from the function is res in any case - as defined in the RETURNS TABLE clause.
Aside: It's recommended not to omit the AS keyword for column aliases (unlike table aliases). See:
Query to ORDER BY the number of rows returned from another SELECT
If there was actual ambiguity between column and variable name - say, you declared an OUT parameter or variable named data1 - you'd get an error message like this:
ERROR: column reference "data1" is ambiguous
LINE 2: select data1
^
DETAIL: It could refer to either a PL/pgSQL variable or a table column.
Brute force fix
Could be fixed with a special command at the start of the function body:
CREATE OR REPLACE FUNCTION public.test3(var int)
RETURNS TABLE(data1 int)
LANGUAGE plpgsql AS
$func$
#variable_conflict use_column -- ! to resolve conflicts
BEGIN
RETURN QUERY
SELECT data1
FROM table_with_data
WHERE data1 < var; -- !
END
$func$
See:
Naming conflict between function parameter and result of JOIN with USING clause
Proper fix
Table-qualify column names, and avoid conflicting variable names to begin with.
CREATE OR REPLACE FUNCTION public.test4(_var int)
RETURNS TABLE(res int)
LANGUAGE plpgsql STABLE AS
$func$
BEGIN
RETURN QUERY
SELECT t.data1 -- table-qualify column name
FROM table_with_data t
WHERE t.data1 < _var; -- !
END
$func$
Example:
Calling a PostgreSQL function from Java
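If renaming a parameter is not an option, a parameter can also be qualified with the function's name, since PL/pgSQL labels the hidden outer block that holds the parameters with that name. A minimal sketch under assumptions: test5 is a made-up function name, and the temporary table with its rows is a stand-in for the question's table.

```sql
-- Hypothetical stand-in for the question's table
CREATE TEMP TABLE table_with_data (data1 int);
INSERT INTO table_with_data VALUES (1), (5), (10);

CREATE OR REPLACE FUNCTION test5(var int)   -- test5 is a made-up name
  RETURNS TABLE(data1 int)                  -- same name as the table column, on purpose
  LANGUAGE plpgsql STABLE AS
$func$
BEGIN
   RETURN QUERY
   SELECT t.data1                 -- table column, via the table alias
   FROM   table_with_data t
   WHERE  t.data1 < test5.var;    -- parameter, via the function name
END
$func$;

SELECT * FROM test5(6);   -- returns the rows 1 and 5
```

With every reference qualified, neither the OUT parameter data1 nor the input parameter var can be confused with a column, so no #variable_conflict setting is needed.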

Related

combine results from calling same function multiple times

I have a function func1(integer);
it returns rows of: partid,qty
for example:
select * from func1(1);
partid,qty
10 50
20 30
select * from func1(2);
partid,qty
10 5
20 30
11 10
I need to write a function that calls func1 for each element of an array and groups the results: func2(integer[]);
for example:
select * from func2(array[1,2]); should give:
partid,qty
10 55
20 60
11 10
I wrote this:
CREATE OR REPLACE FUNCTION func2(listx integer[])
RETURNS SETOF records_d AS
$BODY$
declare
item integer;
begin
foreach item in array listx loop
select * from func1(item);
end loop;
end;
$BODY$
LANGUAGE plpgsql VOLATILE
records_d is a type (integer, integer).
This function doesn't work... I don't know how to combine the results from different iterations of func1() and then return them.
You don't need another function for this, you can do that in a single SQL statement:
select f.partid, sum(f.qty)
from unnest(array[1,2]) i, func1(i) f
group by f.partid;
SQLFiddle example: http://sqlfiddle.com/#!15/8b083/1
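If you still want func2() as a wrapper, the same statement fits in a plain SQL function. This is a sketch under assumptions: the definition of records_d is inferred from the question, and func1() here is a hypothetical stand-in that just reproduces the sample rows shown there.

```sql
-- Assumed from the question: a two-column row type
CREATE TYPE records_d AS (partid integer, qty integer);

-- Hypothetical stand-in for func1(), returning the rows shown in the question
CREATE OR REPLACE FUNCTION func1(p integer)
  RETURNS SETOF records_d
  LANGUAGE sql AS
$$
   SELECT v.partid, v.qty
   FROM  (VALUES (1, 10, 50), (1, 20, 30)
               , (2, 10, 5), (2, 20, 30), (2, 11, 10)) v(arg, partid, qty)
   WHERE  v.arg = p;
$$;

CREATE OR REPLACE FUNCTION func2(listx integer[])
  RETURNS SETOF records_d
  LANGUAGE sql AS
$$
   SELECT f.partid, sum(f.qty)::integer       -- sum(integer) is bigint; cast to match records_d
   FROM   unnest(listx) AS i, func1(i) AS f   -- implicit LATERAL join
   GROUP  BY f.partid;
$$;

SELECT * FROM func2(ARRAY[1, 2]) ORDER BY partid;
```

With the stand-in data this gives (10, 55), (11, 10) and (20, 60), matching the totals requested in the question.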
As is often the case with Postgres, there is a clause for exactly this: RETURN NEXT.
You will probably have to adjust the code to cast the results of the inner select to the records_d type, but RETURN NEXT is what you want.
CREATE OR REPLACE FUNCTION getAllFoo() RETURNS SETOF foo AS
$BODY$
DECLARE
r foo%rowtype;
BEGIN
FOR r IN SELECT * FROM foo
WHERE fooid > 0
LOOP
-- can do some processing here
RETURN NEXT r; -- return current row of SELECT
END LOOP;
RETURN;
END
$BODY$
LANGUAGE plpgsql;

PL/pgSQL: Add static column to query result

I have this function:
CREATE OR REPLACE FUNCTION func2(a integer[])
RETURNS SETOF newset AS
$BODY$
declare
x int;
begin
FOREACH x IN ARRAY $1
LOOP
RETURN QUERY SELECT * FROM func1(x);
END LOOP;
return;
end;
$BODY$
LANGUAGE plpgsql VOLATILE
func2 simply appends all rows from all calls to func1. If the first call to func1 gave 2 rows and the second call gave 3 rows, func2 will return 5 rows in total (the rows themselves).
func1 returns a schema of 3 columns, so currently func2 returns the same schema.
I want to change func2 so it will return 4 columns: the 3 from func1 and another column which contains the value of x.
for example:
calling func2(ARRAY[500,200])
and assume func1(500) return 2 rows and func1(200) return 3 rows.
I will get:
first second third fourth
a b c 500
d e f 500
g h i 200
j k l 200
m n o 200
I created a newset2 which is newset with another column of int for func2
CREATE OR REPLACE FUNCTION func2(a integer[])
RETURNS SETOF newset2 AS
How do I add the desired column to the function?
You could just return the extra column:
RETURN QUERY SELECT *, x FROM func1(x);
This can be substantially more efficient with a plain SELECT query using unnest() and a LATERAL join. Applying any sort order is simpler, too.
SELECT f.*, x
FROM unnest(ARRAY[500,200]) x, func1(x) f -- implicit LATERAL join
ORDER BY x;
That's all, including the additionally requested sort order.
It's a drop-in replacement for your SELECT * FROM func2(ARRAY[500,200]), no func2() needed.
You can still wrap this into a function, of course. I suggest a simple SQL function:
CREATE OR REPLACE FUNCTION func2(a integer[])
RETURNS SETOF newset2 AS -- the four-column row type
$func$
SELECT f.*, x
FROM unnest(a) x, func1(x) f -- use the parameter, not a hard-coded array
ORDER BY x;
$func$ LANGUAGE sql
The row type newset2 has to be pre-defined somehow.
Related:
PostgreSQL unnest() with element number
What is the difference between LATERAL and a subquery in PostgreSQL?
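Note that ORDER BY x sorts by value, so 200 would come before 500, while the expected output in the question keeps the array order. A sketch of one way to preserve it, using WITH ORDINALITY (PostgreSQL 9.4 or later). The func1() below is a hypothetical stand-in with made-up column names c1, c2, c3, since the real one is not shown.

```sql
-- Hypothetical stand-in for func1(): 2 rows for 500, 3 rows for anything else
CREATE OR REPLACE FUNCTION func1(p integer)
  RETURNS TABLE(c1 text, c2 text, c3 text)
  LANGUAGE sql AS
$$
   SELECT 'a' || n::text, 'b' || n::text, 'c' || n::text
   FROM   generate_series(1, CASE WHEN p = 500 THEN 2 ELSE 3 END) n;
$$;

SELECT f.*, u.x
FROM   unnest(ARRAY[500, 200]) WITH ORDINALITY AS u(x, ord)
     , func1(u.x) AS f          -- implicit LATERAL join
ORDER  BY u.ord;                -- array order: rows for 500 first, then 200
```

The ord column carries each element's position in the input array, so sorting by it reproduces the order in which the loop version would have emitted the rows.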

Simple update set postgres stored procedure

I've a problem trying to make my stored procedure work.
This is my problem:
I have a table with a column called a; this column contains telephone numbers.
I have to add 0039 if the number starts with 1,8,3 or 0 (or leave it as is if not) and store the new number in the column b.
This is my code:
CREATE OR REPLACE FUNCTION upg_table() RETURNS void AS $$
BEGIN
IF (substring(a from 0 for 2)!='00')
AND (substring( a from 0 for 1)='3')
OR (substring(a from 0 for 1)='0')
OR (substring(a from 0 for 1)='1')
OR ( substring(a from 0 for 1)='8')
THEN
UPDATE cdr
SET
b = '0039'||a;
ELSE
UPDATE cdr
SET
b = a;
END IF;
END;
$$ LANGUAGE plpgsql;
The error is:
ERROR: the column "a" does not exist
ROW 1: SELECT substring(a from 0 for 2)!='00' AND ...
Your code has two errors:
You cannot reference the column a like it was a (non-existent) plpgsql variable. You would have to loop, too. Your approach does not work at all.
You got operator precedence wrong. AND binds before OR.
But most importantly, you don't need a plpgsql function. A plain UPDATE will do the job:
UPDATE cdr
SET b = CASE WHEN left(a, 1) IN ('0', '1', '3', '8')
AND left(a, 2) <> '00'
THEN '0039' || a ELSE a END;
This writes b in every row, but only the rows matching the condition get a value that differs from a.
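A small, self-contained run of that UPDATE, with made-up sample numbers in a temporary cdr table. One more detail worth knowing: string positions in substring() are 1-based, so substring(a from 0 for 1) returns an empty string and would never equal '3'; left() sidesteps that.

```sql
-- Hypothetical sample data for the question's cdr table
CREATE TEMP TABLE cdr (a text, b text);
INSERT INTO cdr (a) VALUES
   ('3331234')   -- starts with 3: gets the prefix
 , ('0612345')   -- starts with a single 0: gets the prefix
 , ('0039333')   -- already starts with 00: left alone
 , ('5551234');  -- other leading digit: left alone

UPDATE cdr
SET    b = CASE WHEN left(a, 1) IN ('0', '1', '3', '8')
                 AND left(a, 2) <> '00'
                THEN '0039' || a
                ELSE a
           END;

SELECT a, b FROM cdr ORDER BY a;
```

Because the two conditions are combined in a single CASE branch with AND, there is no OR left for operator precedence to trip over.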

How can I retrieve a column value of a specific row

I'm using PostgreSQL 9.3.
The table partner.partner_statistic contains the following columns:
id reg_count
serial integer
I wrote the function convert(integer):
CREATE FUNCTION convert(d integer) RETURNS integer AS $$
BEGIN
--Do something and return integer result
END
$$ LANGUAGE plpgsql;
And now I need to write a function that returns an array of integers, as follows:
CREATE FUNCTION res() RETURNS integer[] AS $$
<< outerblock >>
DECLARE
arr integer[]; --That array of integers I need to fill in depends on the result of query
r partner.partner_statistic%rowtype;
table_name varchar DEFAULT 'partner.partner_statistic';
BEGIN
FOR r IN
SELECT * FROM partner.partner_statistic offset 0 limit 100
LOOP
--
-- I need to add convert(r[reg_count]) to arr where r[id] = 0 (mod 5)
--
-- How can I do that?
END LOOP;
RETURN;
END;
$$ LANGUAGE plpgsql;
You don't need (and shouldn't use) PL/pgSQL loops for this. Just use an aggregate. I'm guessing at what you mean by "where r[id] = 0 (mod 5)", but I'm assuming you mean "where id is evenly divisible by 5". (Note that this is NOT the same thing as "every fifth row", because generated IDs have gaps.)
Something like:
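SELECT array_agg(convert(ps.reg_count))
FROM (SELECT id, reg_count
FROM partner.partner_statistic
LIMIT 100) ps -- limit the rows before aggregating; after array_agg() only one row is left
WHERE ps.id % 5 = 0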
probably meets your needs.
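If you want to return the value from the PL/pgSQL function, use RETURN (SELECT ...); because the function returns a single array, not a set (RETURN QUERY only works in set-returning functions). Preferably, use a simple sql language function instead.
If you want a dynamic table name, run the query with EXECUTE and store the result in arr:
-- %s rather than %I: the regclass value is already quoted and schema-qualified as needed
-- %% is a literal % inside format()
EXECUTE format('
SELECT array_agg(convert(ps.reg_count))
FROM (SELECT id, reg_count FROM %s LIMIT 100) ps
WHERE ps.id %% 5 = 0', table_name::regclass)
INTO arr;
RETURN arr;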
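For the "simple sql language function" route, a sketch could look like the following. The setup part is hypothetical: the table definition is inferred from the two columns listed in the question, the sample rows are invented, and the body of convert() is a placeholder, since the question does not show it. An ORDER BY id is added so that "the first 100 rows" is well defined.

```sql
-- Hypothetical setup so the sketch runs on its own
CREATE SCHEMA IF NOT EXISTS partner;
CREATE TABLE IF NOT EXISTS partner.partner_statistic (
   id        serial PRIMARY KEY,
   reg_count integer
);
INSERT INTO partner.partner_statistic (reg_count)
SELECT g * 10 FROM generate_series(1, 20) g;

-- Placeholder for the question's convert(integer); the real logic is unknown
CREATE OR REPLACE FUNCTION convert(d integer) RETURNS integer
  LANGUAGE sql AS
$$ SELECT d + 1 $$;

CREATE OR REPLACE FUNCTION res() RETURNS integer[]
  LANGUAGE sql AS
$func$
   SELECT array_agg(convert(ps.reg_count) ORDER BY ps.id)
   FROM  (SELECT id, reg_count
          FROM   partner.partner_statistic
          ORDER  BY id
          LIMIT  100) ps          -- take 100 rows first ...
   WHERE  ps.id % 5 = 0;          -- ... then keep ids divisible by 5
$func$;

SELECT res();
```

On a freshly created table, where the ids run from 1 to 20, this returns {51,101,151,201}.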

Postgresql record to array

I need to convert from an array to rows and back to an array for filtering records.
I'm using information_schema._pg_expandarray in a SELECT query to get one row per value in the array.
Given the following array :
"char"[]
{i,i,o,t,b}
_pg_expandarray returns 5 rows with 1 column of type record :
record
(i,1)
(i,2)
(o,3)
(t,4) <= to be filtered out later
(b,5)
I need to filter this result set to exclude record(s) that contains 't'.
How can I do that? Should I convert back to an array?
Is there a way to filter on the array directly?
Thanks in advance.
If your objective is to produce a set of rows as above but with the row containing 't' removed, then this does the trick:
test=> select *
from information_schema._pg_expandarray(array['i','i','o','t','b']) as a(i)
where a.i!='t';
i | n
---+---
i | 1
i | 2
o | 3
b | 5
(4 rows)
As an aside, unless you particularly want the index returned as a second column, I'd be inclined to use unnest() over information_schema._pg_expandarray(), which does not appear to be documented and judging by the leading '_' in the name is probably intended for internal usage.
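If you do want the index column but would rather stay with documented features, unnest() combined with WITH ORDINALITY (PostgreSQL 9.4 or later) gives the same shape of result. A minimal sketch; the aliases u, x and n are arbitrary names chosen here.

```sql
SELECT u.x, u.n
FROM   unnest(ARRAY['i','i','o','t','b']) WITH ORDINALITY AS u(x, n)
WHERE  u.x <> 't'
ORDER  BY u.n;
```

This returns the same four rows as the _pg_expandarray() query above: (i,1), (i,2), (o,3) and (b,5).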
Older PostgreSQL versions have no built-in function to filter arrays (array_remove() was only added in 9.3). Your question implies you may want the result in array form; if that is the case, writing a simple function is trivial. Here is an example:
CREATE OR REPLACE FUNCTION array_filter(anyarray, anyelement) RETURNS anyarray
AS $$
DECLARE
inArray ALIAS FOR $1;
filtValue ALIAS FOR $2;
outArray ALIAS FOR $0;
outIndex int=0;
BEGIN
FOR I IN array_lower(inArray, 1)..array_upper(inArray, 1) LOOP
IF inArray[I] != filtValue THEN
outArray[outIndex] := inArray[I];
outIndex=outIndex+1;
END IF;
END LOOP;
RETURN outArray;
END;
$$ LANGUAGE plpgsql
STABLE
RETURNS NULL ON NULL INPUT;
Usage:
test=> select array_filter(array['i','i','o','t','b'],'t');
array_filter
-----------------
[0:3]={i,i,o,b}
(1 row)
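On PostgreSQL 9.3 or later, the single-value case needs no custom function, and more general filters can be written by unnesting and rebuilding the array. A short sketch; the second filter (dropping both 't' and 'b') is only an illustration of a condition array_remove() cannot express in one call.

```sql
-- Built-in: remove every element equal to the given value
SELECT array_remove(ARRAY['i','i','o','t','b'], 't');   -- {i,i,o,b}

-- General filter: unnest, filter, and rebuild in the original order
SELECT ARRAY(
   SELECT u.x
   FROM   unnest(ARRAY['i','i','o','t','b']) WITH ORDINALITY AS u(x, n)
   WHERE  u.x NOT IN ('t', 'b')
   ORDER  BY u.n
);                                                      -- {i,i,o}
```

Unlike the array_filter() function above, both variants return an array with the default lower bound of 1, so the result prints as {i,i,o,b} rather than [0:3]={i,i,o,b}.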