grouping by parameters in postgres sql function and returning column name identical to the parameter - sql

I have a query
select count(distinct id), usertype from mytable group by usertype;
which I want to wrap in a function where usertype is a parameter
create or replace function myfun(facet text)
RETURNS TABLE(total_accounts bigint, facet text)
LANGUAGE plpgsql
AS $$
select count(distinct id), facet from mytable group by facet;
$$;
select * from myfun('usertype');
so that output column name also depends on the parameter I send.
But it shows an error 'parameter facet is used more than once'.
how do I make the output identical to the original query when I pass 'usertype' as a parameter?

You can use a dynamic query in a plpgsql function :
create or replace function myfun(OUT total_accounts bigint, INOUT facet text)
RETURNS setof record LANGUAGE plpgsql
AS $$
BEGIN
RETURN QUERY EXECUTE '
select count(distinct id),' || quote_ident(facet) || ' from mytable group by ' || quote_ident(facet) ;
END ;
$$;
select * from myfun('usertype');
test result in dbfiddle

Related

PostgreSQL call function returning setof record with table and additional columns

I am trying to call a function returning all columns of an existing table + some unrelated additional ones and retrieve the returned data using the following syntax:
select *
from test_func(...)
as (a_table my_table_name, rows_count numeric);
The function has the following format:
CREATE OR REPLACE FUNCTION public.test_func(...)
RETURNS SETOF record
LANGUAGE plpgsql
AS $function$
DECLARE
_sql VARCHAR;
begin
_sql := 'SELECT mtn.*, count(*) over() as rows_count
from public.my_table_name mtn
... inner joins and other stuff';
return query
execute _sql using ..._sql params...;
END;
$function$
;
However, it doesn't work. The error I receive:
ERROR: structure of query does not match function result type
Detail: Returned type numeric does not match expected type "my_table_name" in column 1.
If the query in your function should return a my_table_name (a “whole-row reference), you should write it like
SELECT mtn, count(*) OVER () ...

Postgresql, function returns query by calling another function

Postgresql 12. Want a function to return query by calling another function but don't know how to call.
create or replace function getFromA()
returns table(_id bigint, _name varchar) as $$
begin
RETURN QUERY SELECT id, name from groups;
end; $$ language plpgsql;
create or replace function getFromB()
returns table(_id bigint, _name varchar) as $$
begin
return query select getFromA();
end; $$ language plpgsql;
select getFromB();
gets error:
SQL Error [42804]: ERROR: structure of query does not match function result type
Detail: Returned type record does not match expected type bigint in column 1.
Where: PL/pgSQL function getfromb() line 3 at RETURN QUERY
How to fix this?
The problem is in getFromB():
return query select getFromA();
Unlike some other databases, Postgres allows set-returning functions directly in the select clause. This works, but can be tricky: this returns a set, hence not the expected structure.
You would need to select ... from getFromA() instead: this way it returns the proper data structure.
create or replace function getFromB()
returns table(_id bigint, _name varchar) as $$
begin
return query select * from getFromA();
end; $$ language plpgsql;
Demo on DB Fiddle

Multiple ALTER TABLE ADD COLUMN in one SQL function call

I came across some weird behaviour I'd like to understand.
I create a plpgsql function doing nothing except of ALTER TABLE ADD COLUMN. I call it 2 times on the same table:
A) In a single SELECT sentence
B) In a SQL function with same SELECT as in A)
Results are different: A) creates two columns, while B) creates only one column. Why?
Code:
CREATE FUNCTION add_text_column(table_name text, column_name text) RETURNS VOID
LANGUAGE plpgsql
AS $fff$
BEGIN
EXECUTE '
ALTER TABLE ' || table_name || '
ADD COLUMN ' || column_name || ' text;
';
END;
$fff$
;
-- this function is called only in B
CREATE FUNCTION add_many_text_columns(table_name text) RETURNS VOID
LANGUAGE SQL
AS $fff$
WITH
col_names (col_name) AS (
VALUES
( 'col_1' ),
( 'col_2' )
)
SELECT add_text_column(table_name, col_name)
FROM col_names
;
$fff$
;
-- A)
CREATE TABLE a (id integer);
WITH
col_names (col_name) AS (
VALUES
( 'col_1' ),
( 'col_2' )
)
SELECT add_text_column('a', col_name)
FROM col_names
;
SELECT * FROM a;
-- B)
CREATE TABLE b (id integer);
SELECT add_many_text_columns('b');
SELECT * FROM b;
Result:
CREATE FUNCTION
CREATE FUNCTION
CREATE TABLE
add_text_column
-----------------
(2 rows)
id | col_1 | col_2
----+-------+-------
(0 rows)
CREATE TABLE
add_many_text_columns
-----------------------
(1 row)
id | col_1
----+-------
(0 rows)
I'm using PostgreSQL 10.4. Please note that this is only a minimal working example, not the full functionality I need.
CREATE OR REPLACE FUNCTION g(i INTEGER)
RETURNS VOID AS $$
BEGIN
RAISE NOTICE 'g called with %', i;
END
$$ LANGUAGE plpgsql;
CREATE OR REPLACE FUNCTION t(i INTEGER)
RETURNS VOID AS $$
SELECT g(id)
FROM generate_series(1, i) id;
$$ LANGUAGE SQL;
What do you think happens when I run SELECT t(4)? The only statement printed from g() is g called with 1.
The reason for this is your add_many_text_columns function returns a single result (void). Because it's SQL and is simply returning the result of a SELECT statement, it seems to stop executing after getting the first result, which makes sense if you think of it - it can only return one result after all.
Now change the function to:
CREATE OR REPLACE FUNCTION t(i INTEGER)
RETURNS SETOF VOID AS $$
SELECT g(id)
FROM generate_series(1, i) id;
$$ LANGUAGE SQL;
And run SELECT t(4) again, and now this is printed:
g called with 1
g called with 2
g called with 3
g called with 4
Because the function now returns SETOF VOID, it doesn't stop after the first result and executes it fully.
So back to your functions, you could change your SQL function to return SETOF VOID, but it doesn't really make much sense - better I think to change it to plpgsql and have it do a PERFORM:
CREATE OR REPLACE FUNCTION t(i INTEGER)
RETURNS VOID AS $$
BEGIN
PERFORM g(id)
FROM generate_series(1, i) id;
END
$$ LANGUAGE plpgsql;
That will execute the statement fully and it still returns a single VOID.
eurotrash provided a good explanation.
Alternative solution 1
CREATE OR REPLACE FUNCTION t(i INTEGER)
RETURNS VOID AS
$func$
SELECT g(id)
FROM generate_series(1, i) id;
SELECT null::void;
$func$ LANGUAGE sql;
Because, quoting the manual:
SQL functions execute an arbitrary list of SQL statements, returning
the result of the last query in the list. In the simple (non-set)
case, the first row of the last query's result will be returned.
By adding a dummy SELECT at the end we avoid that Postgres stops after processing the the first row of the query with multiple rows.
Alternative solution 2
CREATE OR REPLACE FUNCTION t(i INTEGER)
RETURNS VOID AS
$func$
SELECT count(g(id))
FROM generate_series(1, i) id;
$func$ LANGUAGE sql;
By using an aggregate function, all underlying rows are processed in any case. The function returns bigint (that's what count() returns), so we get the number of rows as result.
Alternative solution 3
If you need to return void for some unknown reason, you can cast:
CREATE OR REPLACE FUNCTION t(i INTEGER)
RETURNS VOID AS
$func$
SELECT count(g(id))::text::void
FROM generate_series(1, i) id;
$func$ LANGUAGE sql;
The cast to text is a stepping stone because the cast from bigint to void is not defined.

Return a query from a function?

I am using PostgreSQL 8.4 and I want to create a function that returns a query with many rows.
The following function does not work:
create function get_names(varchar) returns setof record AS $$
declare
tname alias for $1;
res setof record;
begin
select * into res from mytable where name = tname;
return res;
end;
$$ LANGUAGE plpgsql;
The type record only allows single row.
How to return an entire query? I want to use functions as query templates.
CREATE OR REPLACE FUNCTION get_names(_tname varchar)
RETURNS TABLE (col_a integer, col_b text) AS
$func$
BEGIN
RETURN QUERY
SELECT t.col_a, t.col_b -- must match RETURNS TABLE
FROM mytable t
WHERE t.name = _tname;
END
$func$ LANGUAGE plpgsql;
Call like this:
SELECT * FROM get_names('name')
Major points:
Use RETURNS TABLE, so you don't have to provide a list of column names with every call.
Use RETURN QUERY, much simpler.
Table-qualify column names to avoid naming conflicts with identically named OUT parameters (including columns declared with RETURNS TABLE).
Use a named variable instead of ALIAS. Simpler, doing the same, and it's the preferred way.
A simple function like this could also be written in LANGUAGE sql:
CREATE OR REPLACE FUNCTION get_names(_tname varchar)
RETURNS TABLE (col_a integer, col_b text) AS
$func$
SELECT t.col_a, t.col_b --, more columns - must match RETURNS above
FROM mytable t
WHERE t.name = $1;
$func$ LANGUAGE sql;

Can I have a postgres plpgsql function return variable-column records?

I want to create a postgres function that builds the set of columns it
returns on-the-fly; in short, it should take in a list of keys, build
one column per-key, and return a record consisting of whatever that set
of columns was. Briefly, here's the code:
CREATE OR REPLACE FUNCTION reports.get_activities_for_report() RETURNS int[] AS $F$
BEGIN
RETURN ARRAY(SELECT activity_id FROM public.activity WHERE activity_id NOT IN (1, 2));
END;
$F$
LANGUAGE plpgsql
STABLE;
CREATE OR REPLACE FUNCTION reports.get_amount_of_time_query(format TEXT, _activity_id INTEGER) RETURNS TEXT AS $F$
DECLARE
_label TEXT;
BEGIN
SELECT label INTO _label FROM public.activity WHERE activity_id = _activity_id;
IF _label IS NOT NULL THEN
IF lower(format) = 'percentage' THEN
RETURN $$TO_CHAR(100.0 *$$ ||
$$ (SUM(CASE WHEN activity_id = $$ || _activity_id || $$ THEN EXTRACT(EPOCH FROM ended - started) END) /$$ ||
$$ SUM(EXTRACT(EPOCH FROM ended - started))),$$ ||
$$ '990.99 %') AS $$ || quote_ident(_label);
ELSE
RETURN $$SUM(CASE WHEN activity_id = $$ || _activity_id || $$ THEN ended - started END)$$ ||
$$ AS $$ || quote_ident(_label);
END IF;
END IF;
END;
$F$
LANGUAGE plpgsql
STABLE;
CREATE OR REPLACE FUNCTION reports.build_activity_query(format TEXT, activities int[]) RETURNS TEXT AS $F$
DECLARE
_activity_id INT;
query TEXT;
_activity_count INT;
BEGIN
_activity_count := array_upper(activities, 1);
query := $$SELECT agent_id, portal_user_id, SUM(ended - started) AS total$$;
FOR i IN 1.._activity_count LOOP
_activity_id := activities[i];
query := query || ', ' || reports.get_amount_of_time_query(format, _activity_id);
END LOOP;
query := query || $$ FROM public.activity_log_final$$ ||
$$ LEFT JOIN agent USING (agent_id)$$ ||
$$ WHERE started::DATE BETWEEN actual_start_date AND actual_end_date$$ ||
$$ GROUP BY agent_id, portal_user_id$$ ||
$$ ORDER BY agent_id$$;
RETURN query;
END;
$F$
LANGUAGE plpgsql
STABLE;
CREATE OR REPLACE FUNCTION reports.get_agent_activity_breakdown(format TEXT, start_date DATE, end_date DATE) RETURNS SETOF RECORD AS $F$
DECLARE
actual_end_date DATE;
actual_start_date DATE;
query TEXT;
_rec RECORD;
BEGIN
actual_start_date := COALESCE(start_date, '1970-01-01'::DATE);
actual_end_date := COALESCE(end_date, now()::DATE);
query := reports.build_activity_query(format, reports.get_activities_for_report());
FOR _rec IN EXECUTE query LOOP
RETURN NEXT _rec;
END LOOP;
END
$F$
LANGUAGE plpgsql;
This builds queries that look (roughly) like this:
SELECT agent_id,
portal_user_id,
SUM(ended - started) AS total,
SUM(CASE WHEN activity_id = 3 THEN ended - started END) AS "Label 1"
SUM(CASE WHEN activity_id = 4 THEN ended - started END) AS "Label 2"
FROM public.activity_log_final
LEFT JOIN agent USING (agent_id)
WHERE started::DATE BETWEEN actual_start_date AND actual_end_date
GROUP BY agent_id, portal_user_id
ORDER BY agent_id
When I try to call the get_agent_activity_breakdown() function, I get this error:
psql:2009-10-22_agent_activity_report_test.sql:179: ERROR: a column definition list is required for functions returning "record"
CONTEXT: SQL statement "SELECT * FROM reports.get_agent_activity_breakdown('percentage', NULL, NULL)"
PL/pgSQL function "test_agent_activity" line 92 at SQL statement
The trick is, of course, that the columns labeled 'Label 1' and 'Label
2' are dependent on the set of activities defined in the contents of the
activity table, which I cannot predict when calling the function. How
can I create a function to access this information?
If you really want to create such table dynamically, maybe just create a temporary table within the function so it can have any columns you want. Let the function insert all rows into the table instead of returning them. The function can return only the name of the table or you can just have one exact table name that you know. After running that function you can just select data from the table. The function should also check if the temporary table exists so it should delete or truncate it.
Simon's answer might be better overall in the end, I'm just telling you how to do it without changing what you've got.
From the docs:
from_item can be one of:
...
function_name ( [ argument [, ...] ] ) [ AS ] alias [ ( column_alias [, ...] | column_definition [, ...] ) ]
function_name ( [ argument [, ...] ] ) AS ( column_definition [, ...] )
In other words, later it says:
If the function has been defined as
returning the record data type, then
an alias or the key word AS must be
present, followed by a column
definition list in the form (
column_name data_type [, ... ] ). The
column definition list must match the
actual number and types of columns
returned by the function.
I think the alias thing is only an option if you've predefined a type somewhere (like if you're mimicing the output of a predefined table, or have actually used CREATE TYPE...don't quote me on that, though.)
So, I think you would need something like:
SELECT *
FROM reports.get_agent_activity_breakdown('percentage', NULL, NULL)
AS (agent_id integer, portal_user_id integer, total something, ...)
The problem for you lies in the .... You'll need to know before you execute the query the names and types of all the columns--so you'll end up selecting on public.activity twice.
Both Simon's and Kev's answers are good ones, but what I ended up doing was splitting the calls to the database into two queries:
Build the query using the query constructor methods I included in the question, returning that to the application.
Call the query directly, and return that data.
This is safe in my case because the dynamic column list is not subject to frequent change, so I don't need to worry about the query's target data changing in between these calls. Otherwise, though, my method might not work.
you cannot change number of output columns, but you can to use refcursor, and you can return opened cursor.
more on http://okbob.blogspot.com/2008/08/using-cursors-for-generating-cross.html