SQL function return type: TABLE vs SETOF records

What's the difference between a function that returns TABLE and one that returns SETOF record, all else being equal?
CREATE FUNCTION events_by_type_1(text) RETURNS TABLE(id bigint, name text) AS $$
SELECT id, name FROM events WHERE type = $1;
$$ LANGUAGE SQL STABLE;
CREATE FUNCTION events_by_type_2(text) RETURNS SETOF record AS $$
SELECT id, name FROM events WHERE type = $1;
$$ LANGUAGE SQL STABLE;
These functions seem to return the same results. See this SQLFiddle.

When returning SETOF record the output columns are not typed and not named. Thus this form can't be used directly in a FROM clause as if it was a subquery or a table.
That is, when issuing:
SELECT * from events_by_type_2('social');
we get this error:
ERROR: a column definition list is required for functions returning
"record"
The caller can supply the column names and types in the query, though. This form does work:
SELECT * from events_by_type_2('social') as (id bigint, name text);
and results in:
id | name
----+----------------
1 | Dance Party
2 | Happy Hour
...
For this reason SETOF record is considered less practical. It should be used only when the column types of the results are not known in advance.

This answer only adds a reminder of another context, one in which TABLE and SETOF are equivalent.
As @a_horse_with_no_name pointed out, this applies when the function does not return SETOF of an "unknown record" but SETOF of a defined type.
In this example, the TABLE and SETOF forms are equivalent:
CREATE TYPE footype AS (score int, term text);
CREATE FUNCTION foo() RETURNS SETOF footype AS $$
SELECT * FROM ( VALUES (1,'hello!'), (2,'Bye') ) t;
$$ language SQL immutable;
CREATE FUNCTION foo_tab() RETURNS TABLE (score int, term text) AS $$
SELECT * FROM ( VALUES (1,'hello!'), (2,'Bye') ) t;
$$ language SQL immutable;
SELECT * FROM foo(); -- works fine!
SELECT * FROM foo_tab(); -- works fine and is equivalent.
RETURNS SETOF has the advantage that the type (see footype) can be reused, which is impossible with RETURNS TABLE.
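To make the reuse point concrete, here is a minimal sketch that builds on the footype type and the foo() function defined above. The function name foo_top is hypothetical and only there for illustration.

```sql
-- Hypothetical second function that reuses the composite type footype.
CREATE FUNCTION foo_top() RETURNS SETOF footype AS $$
  SELECT * FROM foo() ORDER BY score DESC LIMIT 1;
$$ LANGUAGE sql IMMUTABLE;

SELECT * FROM foo_top();

-- The same type can also be used as a cast target outside of any function.
SELECT (ROW(3, 'again')::footype).term;
```

With RETURNS TABLE the column list would have to be repeated in every function signature.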

Related

Returning dynamic SQL statement in PL/pgSQL Functions

I have a table called "points" with a column called "geom" of type geometry.
I want to create a function that returns a table with a column of the "geometry" data type. I have been successful in returning a table with the correct data type when the name of the target table (points) is hardcoded in the "RETURN QUERY" clause.
I want to have the name of the table as an input of the function (in a dynamic way). How can I change this code to accept the name of the target table (called points in this code) as an input?
CREATE OR REPLACE FUNCTION milad_points()
RETURNS TABLE (geom points.geom%TYPE)
AS $$
BEGIN
RETURN QUERY SELECT points.geom FROM points;
END;
$$ LANGUAGE PLPGSQL;
I know that to handle dynamic queries we have to build the query as a string and run it with EXECUTE sql_string. However, I could not get it to work in the example above.
The only way to do something like that is to use the anyelement data type.
But in order to use anyelement as a return type, you have to specify an anyelement parameter too. It is not important what value you use as argument, but its data type determines the actual data type returned.
See the following example:
CREATE FUNCTION anyfun(tabname name, typdef anyelement) RETURNS SETOF anyelement
LANGUAGE plpgsql STABLE AS
$$BEGIN
RETURN QUERY EXECUTE format('SELECT id FROM %I', tabname);
END;$$;
Now let's test it with two different tables:
CREATE TABLE anytab (id integer);
INSERT INTO anytab VALUES (1), (42);
SELECT * FROM anyfun('anytab', NULL::integer);
anyfun
--------
1
42
(2 rows)
CREATE TABLE anothertab (id text);
INSERT INTO anothertab VALUES ('one'), ('two');
SELECT * FROM anyfun('anothertab', NULL::text);
anyfun
--------
one
two
(2 rows)
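The same polymorphic trick can return whole rows when the argument is a NULL of the table's row type. This is a sketch, not part of the answer above: the function name anyrows is hypothetical, and it reuses the anytab and anothertab tables from the test.

```sql
-- The argument's value is ignored; only its type (a table row type) matters.
CREATE FUNCTION anyrows(_rowtype anyelement) RETURNS SETOF anyelement
LANGUAGE plpgsql STABLE AS
$$BEGIN
   -- pg_typeof() returns a regtype, which is rendered as a properly quoted name
   RETURN QUERY EXECUTE format('SELECT * FROM %s', pg_typeof(_rowtype));
END;$$;

SELECT * FROM anyrows(NULL::anytab);
SELECT * FROM anyrows(NULL::anothertab);
```

The table name is passed as a type rather than as a string, so a nonexistent table is rejected when the cast is resolved.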

<column> is ambiguous in column comparison between two tables

I want this Postgres function to work:
CREATE OR REPLACE FUNCTION difference_of_match_ids_in_match_history_and_match_results()
returns table(match_id BIGINT)
as
$$
BEGIN
return QUERY
SELECT *
FROM sports.match_history
WHERE match_id NOT IN (SELECT match_id
FROM sports.match_results);
END $$
LANGUAGE 'plpgsql';
This stand alone query works just fine:
SELECT *
FROM sports.match_history
WHERE match_id NOT IN (SELECT match_id FROM sports.match_results);
But when I put it into this function and try to run it like this:
select *
from difference_of_match_ids_in_match_history_and_match_results();
I get this:
SQL Error [42702]: ERROR: column reference "match_id" is ambiguous
Detail: It could refer to either a PL/pgSQL variable or a table
column. Where: PL/pgSQL function
difference_of_match_ids_in_match_history_and_match_results() line 3 at
RETURN QUERY
I've seen other questions with this same error, and they suggest naming the sub queries to specify which instance of a column you're referring to, however, those examples use joins and my query works fine outside of the function.
If I do need to name the column, how would I do so with only one sub-query?
If that isn't the issue, then I'm assuming that there's something wrong with the way I'm defining a function.
Your query is fine. The ambiguity comes from the match_id in returns table(match_id BIGINT). Either rename it or prefix the columns with the table name in your query:
CREATE OR REPLACE FUNCTION difference_of_match_ids_in_match_history_and_match_results()
returns table(new_name BIGINT)
as
$$
BEGIN
return QUERY
SELECT match_id -- return only the one declared column, not *
FROM sports.match_history
WHERE match_id NOT IN (SELECT match_id
FROM sports.match_results);
END $$
LANGUAGE 'plpgsql';
or
CREATE OR REPLACE FUNCTION difference_of_match_ids_in_match_history_and_match_results()
returns table(match_id BIGINT)
as
$$
BEGIN
return QUERY
SELECT sports.match_history.match_id
FROM sports.match_history
WHERE sports.match_history.match_id NOT IN (SELECT sports.match_results.match_id
FROM sports.match_results);
END $$
LANGUAGE 'plpgsql';
Didn't test the code.
The structure of the result set must match the function result type. If you want to get only match_ids:
CREATE OR REPLACE FUNCTION difference_of_match_ids_in_match_history_and_match_results()
RETURNS TABLE(m_id BIGINT) -- !!
AS
$$
BEGIN
RETURN QUERY
SELECT match_id -- !!
FROM sports.match_history
WHERE match_id NOT IN (SELECT match_id
FROM sports.match_results);
END $$
LANGUAGE 'plpgsql';
If you want to get whole rows as a result:
DROP FUNCTION difference_of_match_ids_in_match_history_and_match_results();
CREATE OR REPLACE FUNCTION difference_of_match_ids_in_match_history_and_match_results()
RETURNS SETOF sports.match_history -- !!
AS
$$
BEGIN
RETURN QUERY
SELECT * -- !!
FROM sports.match_history
WHERE match_id NOT IN (SELECT match_id
FROM sports.match_results);
END $$
LANGUAGE 'plpgsql';
As others have answered, it's an ambiguity between the result definition and PL/pgSQL variables. The column name in a set-returning function is in fact also a variable inside the function.
But you don't need PL/pgSQL for this in the first place. If you use a plain SQL function it will be more efficient and the problem will go away as well:
CREATE OR REPLACE FUNCTION difference_of_match_ids_in_match_history_and_match_results()
returns table(match_id BIGINT)
as
$$
SELECT match_id --<< do not return * - only return one column
FROM sports.match_history
WHERE match_id NOT IN (SELECT match_id
FROM sports.match_results);
$$
LANGUAGE sql;
Note that the language name is an identifier and should not be quoted at all.
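If the function has to stay in PL/pgSQL for other reasons, the conflict can also be resolved with the #variable_conflict directive. The following is a sketch under that assumption, using the tables from the question; it has not been run against the asker's schema.

```sql
CREATE OR REPLACE FUNCTION difference_of_match_ids_in_match_history_and_match_results()
RETURNS TABLE(match_id bigint)
AS
$$
#variable_conflict use_column
BEGIN
   -- With use_column, an ambiguous name resolves to the table column,
   -- not to the OUT variable created by RETURNS TABLE.
   RETURN QUERY
   SELECT match_id
   FROM sports.match_history
   WHERE match_id NOT IN (SELECT match_id
                          FROM sports.match_results);
END
$$
LANGUAGE plpgsql;
```

Table-qualifying the columns remains the more explicit option, since the directive changes name resolution for the whole function body.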
The naming conflict between column names and plpgsql OUT parameters has been addressed. More details here:
Postgresql - INSERT RETURNING INTO ambiguous column reference
I would also use a different query style. NOT IN (SELECT ...) is typically slowest and carries traps with NULL values. Use NOT EXISTS instead:
SELECT match_id
FROM sports.match_history h
WHERE NOT EXISTS (
SELECT match_id
FROM sports.match_results
WHERE match_id = h.match_id
);
More:
Select rows which are not present in other table
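The NULL trap mentioned above can be reproduced without any tables. This is a small sketch that uses VALUES lists as stand-in data:

```sql
-- NOT IN: the comparison with NULL yields NULL, so the row is filtered out.
SELECT 1 AS id
WHERE  1 NOT IN (VALUES (2), (NULL));        -- returns no row

-- NOT EXISTS: a NULL in the other set simply does not match.
SELECT 1 AS id
WHERE  NOT EXISTS (
   SELECT 1
   FROM  (VALUES (2), (NULL)) AS r(id)
   WHERE  r.id = 1
   );                                        -- returns one row
```

A single NULL in sports.match_results.match_id would make the NOT IN version return nothing in the same way.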

Shouldn't this PostgreSQL function return zero rows?

Given the schema
CREATE TABLE users (
id bigserial PRIMARY KEY,
email varchar(254) NOT NULL
);
CREATE UNIQUE INDEX on users (lower(email));
CREATE FUNCTION all_users() RETURNS users AS $$
SELECT * FROM users;
$$ LANGUAGE SQL STABLE;
, shouldn't SELECT * FROM all_users() (assuming the users table is empty) return no rows, not a row with all null values?
See the SQL Fiddle here: http://sqlfiddle.com/#!15/b5ba8/2
That's because your function is broken by design. It should be:
CREATE FUNCTION all_users() RETURNS SETOF users AS
'SELECT * FROM users' LANGUAGE sql STABLE;
Or alternatively, the more flexible form RETURNS TABLE (...) like @Clodoaldo posted. But it's generally wiser to use RETURNS SETOF users for a query with SELECT * FROM users.
Your original function always returns a single value (a composite type), it has been declared that way. It will break in a more spectacular fashion if you insert some rows.
Consider this SQL Fiddle demo.
For better understanding, your function call does the same as this plain SELECT query:
SELECT (SELECT u from users u).*;
Returns:
id | email
-------+------
<NULL> | <NULL>
The difference: Plain SQL will raise an exception if the subquery returns more than one row, while a function will just return the first row and discard the rest.
As always, details in the manual.
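A quick way to see the difference is to compare both declarations on a table that holds a few rows. This is a sketch against the users table from the question; the function names one_user and every_user and the sample e-mail addresses are made up for illustration.

```sql
INSERT INTO users (email) VALUES ('a@example.com'), ('b@example.com');

-- Declared to return a single composite value: only the first row survives.
CREATE FUNCTION one_user() RETURNS users AS
'SELECT * FROM users ORDER BY id' LANGUAGE sql STABLE;

-- Declared as a set: every row is returned, and zero rows stay zero rows.
CREATE FUNCTION every_user() RETURNS SETOF users AS
'SELECT * FROM users ORDER BY id' LANGUAGE sql STABLE;

SELECT * FROM one_user();
SELECT * FROM every_user();
```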
Your function is declared to return a single record, so it always returns exactly one. If you want an empty result set, return a table instead:
CREATE or replace FUNCTION all_users()
RETURNS table (id bigint, email varchar(254)) AS $$
SELECT id, email FROM users;
$$ LANGUAGE SQL STABLE;

Passing a ResultSet into a Postgresql Function

Is it possible to pass the results of a postgres query as an input into another function?
As a very contrived example, say I have one query like
SELECT id, name
FROM users
LIMIT 50
and I want to create a function my_function that takes the resultset of the first query and returns the minimum id. Is this possible in pl/pgsql?
SELECT my_function(SELECT id, name FROM Users LIMIT 50); --returns 50
You could use a cursor, but that is very impractical for computing a minimum.
I would use a temporary table for that purpose, and pass the table name for use in dynamic SQL:
CREATE OR REPLACE FUNCTION f_min_id(_tbl regclass, OUT min_id int) AS
$func$
BEGIN
EXECUTE 'SELECT min(id) FROM ' || _tbl
INTO min_id;
END
$func$ LANGUAGE plpgsql;
Call:
CREATE TEMP TABLE foo ON COMMIT DROP AS
SELECT id, name
FROM users
LIMIT 50;
SELECT f_min_id('foo');
Major points
The first parameter is of type regclass to prevent SQL injection. More info in this related answer on dba.SE.
I made the temp table ON COMMIT DROP to limit its lifetime to the current transaction. May or may not be what you want.
You can extend this example to take more parameters. Search for code examples for dynamic SQL with EXECUTE.
-> SQLfiddle demo
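As one possible extension with more parameters, the column name can be passed as well and quoted with format(). This is a sketch: the name f_min_col and its parameters are hypothetical, and it assumes the temporary table foo from the call above.

```sql
CREATE OR REPLACE FUNCTION f_min_col(_tbl regclass, _col text, OUT min_val int) AS
$func$
BEGIN
   -- %s: a regclass value is rendered as an already quoted table name
   -- %I: quotes the column name as an identifier
   EXECUTE format('SELECT min(%I) FROM %s', _col, _tbl)
   INTO min_val;
END
$func$ LANGUAGE plpgsql;

SELECT f_min_col('foo', 'id');
```

Only the table name is validated by the regclass cast; a wrong column name still fails at execution time, but it cannot be used for injection because %I quotes it.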
I would approach the problem from the other side and call an aggregate function for each record of the result set. It's not as flexible, but it can give you a hint to work from.
As an example that follows your sample problem:
CREATE OR REPLACE FUNCTION myMin ( int,int ) RETURNS int AS $$
SELECT CASE WHEN $1 < $2 THEN $1 ELSE $2 END;
$$ LANGUAGE SQL STRICT IMMUTABLE;
CREATE AGGREGATE my_function ( int ) (
SFUNC = myMin, STYPE = int, INITCOND = 2147483647 --maxint
);
SELECT my_function(id) from (SELECT * FROM Users LIMIT 50) x;
It is not possible to pass an array of generic type RECORD to a plpgsql function which is essentially what you are trying to do.
What you can do is pass in an array of a specific user defined TYPE or of a particular table row type. In the example below you could also swap out the argument data type for the table name users[] (though this would obviously mean getting all data in the users table row).
CREATE TYPE trivial AS (
"ID" integer,
"NAME" text
);
CREATE OR REPLACE FUNCTION trivial_func(data trivial[])
RETURNS integer AS
$BODY$
DECLARE
BEGIN
--Implementation here using data
return 1;
END$BODY$
LANGUAGE 'plpgsql' VOLATILE;
I think there's no way to pass a recordset or table into a function (but I'd be glad to be proven wrong). The best I can suggest is to pass an array:
create or replace function my_function(data int[])
returns int
as
$$
select min(x) from unnest(data) as x
$$
language SQL;
sql fiddle demo
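A call for the array version could look like the following sketch. It assumes users.id can be cast to integer; the cast is only there because my_function is declared with int[].

```sql
-- Collect the ids of the subquery into an int[] and pass it to the function.
SELECT my_function(ARRAY(SELECT id::int FROM users LIMIT 50));
```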

plpgsql - column type auto detect by returning a resultset

I have PostgreSQL 8.4.
I have a user table
user_id INT,
user_name VARCHAR(255),
user_email VARCHAR(255),
user_salt VARCHAR(255)
and 2 functions:
one with SETOF:
CREATE FUNCTION test ()
RETURNS SETOF "user"
AS
$BODY$
BEGIN
RETURN QUERY SELECT
*
FROM
"user";
END;
$BODY$
LANGUAGE 'plpgsql' VOLATILE;
one with TABLE:
CREATE FUNCTION test ()
RETURNS TABLE (id INT, name VARCHAR, email VARCHAR)
AS
$BODY$
BEGIN
RETURN QUERY SELECT
"user".user_id, "user".user_name, "user".user_email
FROM
"user";
END;
$BODY$
LANGUAGE 'plpgsql' VOLATILE;
With SETOF the column types are filled in automatically, but I cannot set the column names or choose which columns appear in the result. With TABLE I can cut off the user_ prefix and set the exact column names, but I have to set the column types manually.
Is it possible to get the advantages of both?
Type handling in PostgreSQL (and SQL in general) is strict. Defining the RETURN type of a function can be tricky.
There are simple solutions with plain SQL:
Select and rename all columns:
SELECT * FROM t AS t(id, name, email, salt);
Select and rename some columns:
SELECT user_id AS id, user_name AS name FROM t;
Combine with function
If you need a server-side function for some reason, you can still combine these SQL features with a function. Given a table t as defined in your question and this simple function:
CREATE OR REPLACE FUNCTION f_test()
RETURNS SETOF t AS
$func$
SELECT * FROM t -- do stuff here
$func$ LANGUAGE sql STABLE;
To only rename columns (no need to provide types):
SELECT * FROM f_test() t(id, name, email, salt)
Select and rename some columns:
SELECT user_id AS id, user_name AS name FROM f_test() t;
You could possibly combine this with various different functions on different tables:
Refactor a PL/pgSQL function to return the output of various SELECT queries
But I am not entirely sure which problem in particular you want to tackle here.
Asides
Never quote the language name. plpgsql is an identifier. Quoting can lead to unexpected problems.
A function that only selects from tables can be declared STABLE. May be beneficial with repeated calls when nested.
Generally, I would advise upgrading to a current version of Postgres. 8.4 is rather old and will lose support in less than a year. There have been a number of improvements in this area since 8.4.
I don't think this is possible in PL/pgSQL, because it strongly depends on user-defined types. Sadly, the language is not smart enough for type auto-detection. I think I'll use my first possible solution: it solves the problem partially, because at least I won't need to refactor every function manually when the type of a table column changes.
1.) Possible solution with asking column types:
CREATE FUNCTION test ()
RETURNS TABLE (id "user".user_id%TYPE, name "user".user_name%TYPE, email "user".user_email%TYPE)
AS
$BODY$
BEGIN
return QUERY SELECT
"user".user_id, "user".user_name, "user".user_email
FROM
"user";
END;
$BODY$
LANGUAGE 'plpgsql' VOLATILE;
With this at least the type is not redundant. Not the best, but acceptable.
2.) Possible solution with SETOF RECORD:
CREATE FUNCTION test ()
RETURNS SETOF RECORD
AS
$BODY$
BEGIN
RETURN QUERY SELECT
"user".user_id AS id, "user".user_name AS name, "user".user_email AS email
FROM
"user";
END;
$BODY$
LANGUAGE 'plpgsql' VOLATILE;
Without a column definition list I get the following error:
ERROR: a column definition list is required for functions returning
"record"
So I have to use it like this:
SELECT * FROM test() AS (id INT, name VARCHAR, email VARCHAR);
Instead of this:
SELECT * FROM test()
The PHP client gives me every column as a string, so the column type definition is more than useless for me. This solution would be the best without the column definition list, but with it, it is not acceptable.
It is possible to use this in a way similar to TABLE:
CREATE FUNCTION test (OUT id "user".user_id%TYPE, OUT name "user".user_name%TYPE, OUT email "user".user_email%TYPE)
RETURNS SETOF RECORD
AS
$BODY$
BEGIN
RETURN QUERY SELECT
"user".user_id, "user".user_name, "user".user_email
FROM
"user";
END;
$BODY$
LANGUAGE 'plpgsql' VOLATILE;
For example I could set everything to TEXT:
CREATE FUNCTION test (OUT id TEXT, OUT name TEXT , OUT email TEXT )
RETURNS SETOF RECORD
AS
$BODY$
BEGIN
RETURN QUERY SELECT
"user".user_id::TEXT , "user".user_name::TEXT , "user".user_email::TEXT
FROM
"user";
END;
$BODY$
LANGUAGE 'plpgsql' VOLATILE;
This works, but it is far from type auto-detection, and it would result in a lot of redundant text conversion code.
3.) Possible solution with refcursors:
CREATE FUNCTION test ()
RETURNS SETOF "user"
AS
$BODY$
DECLARE
refc "user";
BEGIN
FOR refc IN
SELECT
"user".user_id, "user".user_name, "user".user_email
FROM
"user"
LOOP
RETURN NEXT refc;
END LOOP ;
RETURN ;
END;
$BODY$
LANGUAGE 'plpgsql' VOLATILE;
This fills the missing columns with null values, I cannot name the columns, and SQL loops are very slow. So this is not acceptable.
With refcursors there is another way: returning the refcursor itself. That is not acceptable either, because I cannot use it as a normal variable; I have to give a string as the cursor name. By the way, I did not manage to use the refcursor itself as a result in PhpStorm; I got a JDBC "cursor not found" error. Maybe I set the name wrong, I don't know, and I don't think it's worth more effort.