How fix problem "column reference "id" is ambiguous" in PostgreSQL function? - sql

In PostgreSQL database I have 2 table: services and services_organizations_relationship. Each organization has a specific list of services.
My next function need to create new records in services table, then create relationship between services and organizations and finally return list of all new created services.
CREATE OR REPLACE FUNCTION test (
SERVICE_NAME_ARRAY VARCHAR[],
ACTIVE_ARRAY BOOLEAN[],
DESCRIPTION_ARRAY TEXT[],
ORGANIZATION_ID_ARRAY INT[]
) RETURNS TABLE (
ID UUID,
NAME VARCHAR,
ACTIVE BOOLEAN,
DESCRIPTION TEXT
) AS $$
BEGIN
RETURN QUERY
WITH RESULTS AS (
INSERT INTO SERVICES (NAME, ACTIVE, DESCRIPTION)
SELECT
UNNEST(ARRAY[SERVICE_NAME_ARRAY]) AS NAME,
UNNEST(ARRAY[ACTIVE_ARRAY]) AS ACTICE,
UNNEST(ARRAY[DESCRIPTION_ARRAY]) AS DESCRIPTION
RETURNING ID, NAME, ACTIVE, DESCRIPTION
),
GENERATE_SERVICES_ORGANIZATIONS_RELATIONSHIP AS
(
INSERT INTO SERVICES_ORGANIZATIONS_RELATIONSHIP (SERVICE_ID, ORGANIZATION_ID)
SELECT
UNNEST(ARRAY_AGG(ID)) AS SERVICE_ID,
UNNEST(ARRAY[ORGANIZATION_ID_ARRAY]) AS ORGANIZATION_ID
FROM RESULTS
ON CONFLICT ON CONSTRAINT SERVICES_ORGANIZATIONS_RELATIONSHIP_UNIQUE_KEY DO NOTHING
)
SELECT ID, NAME, ACTIVE, DESCRIPTION FROM RESULTS;
END;
$$ LANGUAGE plpgsql;
When I call this function:
SELECT * FROM test(ARRAY['SLOT', 'JTC'], ARRAY[TRUE, FALSE], ARRAY['SLOT', 'JTC'], ARRAY[30572, 30573]);
I see such error:
SQL Error [42702]: ERROR: column reference "id" is ambiguous
Details: It could refer to either a PL/pgSQL variable or a table column.
Where: PL/pgSQL function test(character varying[],boolean[],text[],integer[]) line 3 at RETURN QUERY
How to fix this problem?

The final line of the query should be
SELECT result.id, result.name,... FROM result
To avoid such collisions, you can use different names for the columns in the RETURNS TABLE clause (which are variables) and the columns in the queries (e.g. by using aliases).

Try this
GENERATE_SERVICES_ORGANIZATIONS_RELATIONSHIP AS
(
INSERT INTO SERVICES_ORGANIZATIONS_RELATIONSHIP (SERVICE_ID, ORGANIZATION_ID)
SELECT
UNNEST(ARRAY_AGG(t1.ID)) AS SERVICE_ID,
UNNEST(ARRAY[ORGANIZATION_ID_ARRAY]) AS ORGANIZATION_ID
FROM RESULTS t1
ON CONFLICT ON CONSTRAINT SERVICES_ORGANIZATIONS_RELATIONSHIP_UNIQUE_KEY DO NOTHING
)

Related

Is SELECT "faster" than function with nested INSERT?

I'm using a function that inserts a row to a table if it doesn't exist, then returns the id of the row.
Whenever I put the function inside a SELECT statement, with values that don't exist in the table yet, e.g.:
SELECT * FROM table WHERE id = function(123);
... it returns an empty row. However, running it again with the same values will return the row with the values I want to see.
Why does this happen? Is the INSERT running behind the SELECT speed? Or does PostgreSQL cache the table when it didn't exist, and at next run, it displays the result?
Here's a ready to use example of how this issue can occur:
CREATE TABLE IF NOT EXISTS test_table(
id INTEGER,
tvalue boolean
);
CREATE OR REPLACE FUNCTION test_function(user_id INTEGER)
RETURNS integer
LANGUAGE 'plpgsql'
AS $$
DECLARE
__user_id INTEGER;
BEGIN
EXECUTE format('SELECT * FROM test_table WHERE id = $1')
USING user_id
INTO __user_id;
IF __user_id IS NOT NULL THEN
RETURN __user_id;
ELSE
INSERT INTO test_table(id, tvalue)
VALUES (user_id, TRUE)
RETURNING id
INTO __user_id;
RETURN __user_id;
END IF;
END;
$$;
Call:
SELECT * FROM test_table WHERE id = test_function(4);
To reproduce the issue, pass any integer that doesn't exist in the table, yet.
The example is broken in multiple places.
No need for dynamic SQL with EXECUTE.
SELECT * in the function is wrong.
Your table definition should have a UNIQUE or PRIMARY KEY constraint on (id).
Most importantly, the final SELECT statement is bound to fail. Since the function is VOLATILE (has to be), it is evaluated once for every existing row in the table. Even if that worked, it would be a performance nightmare. But it does not. Like #user2864740 commented, there is also a problem with visibility. Postgres checks every existing row against the result of the function, which in turn adds 1 or more rows, and those rows are not yet in the snapshot the SELECT is operating on.
SELECT * FROM test_table WHERE id = test_function(4);
This would work (but see below!):
CREATE TABLE test_table (
id int PRIMARY KEY --!
, tvalue bool
);
CREATE OR REPLACE FUNCTION test_function(_user_id int)
RETURNS test_table LANGUAGE sql AS
$func$
WITH ins AS (
INSERT INTO test_table(id, tvalue)
VALUES (_user_id, TRUE)
ON CONFLICT DO NOTHING
RETURNING *
)
TABLE ins
UNION ALL
SELECT * FROM test_table WHERE id = _user_id
LIMIT 1
$func$;
And replace your SELECT with just:
SELECT * FROM test_function(1);
db<>fiddle here
Related:
Return a value if no record is found
How to use RETURNING with ON CONFLICT in PostgreSQL?
There is still a race condition for concurrent calls. If that can happen, consider:
Is SELECT or INSERT in a function prone to race conditions?

Function to select existing value or insert new row

I'm new to working with PL/pgSQL, and I'm attempting to create a function that will either find the ID of an existing row, or will insert a new row if it is not found, and return the new ID.
The query contained in the function below works fine on its own, and the function gets created fine. However, when I try to run it, I get an error stating "ERROR: column reference "id" is ambiguous". Can anybody identify my problem, or suggest a more appropriate way to do this?
create or replace function sp_get_insert_company(
in company_name varchar(100)
)
returns table (id int)
as $$
begin
with s as (
select
id
from
companies
where name = company_name
),
i as (
insert into companies (name)
select company_name
where not exists (select 1 from s)
returning id
)
select id
from i
union all
select id
from s;
end;
$$ language plpgsql;
This is how I call the function:
select sp_get_insert_company('TEST')
And this is the error that I get:
SQL Error [42702]: ERROR: column reference "id" is ambiguous
Detail: It could refer to either a PL/pgSQL variable or a table column.
Where: PL/pgSQL function sp_get_insert_company(character varying) line 3 at SQL statement
As the messages says, id is in there twice. Once in the queries, once in the table definition of the return type. Somehow this clashes.
Try qualifying the column expressions, everywhere.
...
with s as (
select
companies.id
from
companies
where name = company_name
),
i as (
insert into companies (name)
select company_name
where not exists (select 1 from s)
returning companies.id
)
select i.id
from i
union all
select s.id
from s;
...
By qualifying the column expression the DBMS does no longer confuse id of a table with the id in the return type definition.
The next problem will be, that your SELECT has no target. It will tell you to do a PERFORM instead. But I assume you want to return the results. Change the body to
...
RETURN QUERY (
with s as (
select
companies.id
from
companies
where name = company_name
),
i as (
insert into companies (name)
select company_name
where not exists (select 1 from s)
returning companies.id
)
select i.id
from i
union all
select s.id
from s);
...
to do so.
In the function you display there is no need for returns table (id int). It's supposed to always return exactly one integer ID. Simplify to RETURNS int. This also makes ERROR: column reference "id" is ambiguous go away, since we implicitly removed the OUT parameter id (visible in the whole function block).
How to return result of a SELECT inside a function in PostgreSQL?
There is also no need for LANGUAGE plpgsql. Could simply be LANGUAGE sql, then you wouldn't need to add RETURNS QUERY, either. So:
CREATE OR REPLACE FUNCTION sp_get_insert_company(_company_name text)
RETURNS int AS
$func$
WITH s as (
select c.id -- still good style to table-qualify all columns
from companies c
where c.name = _company_name
),
i as (
insert into companies (name)
select _company_name
where not exists (select 1 from s)
returning id
)
select s.id from s
union all
select i.id from i
LIMIT 1; -- to optimize performance
$func$ LANGUAGE sql;
Except that it still suffers from concurrency issues. Find a proper solution for your undisclosed version of Postgres in this closely related answer:
Is SELECT or INSERT in a function prone to race conditions?

Selecting and passing a record as a function argument

It may look like a duplicate of existing questions (e.g. This one) but they only deal with passing "new" arguments, not selecting rows from the database.
I have a table, for example:
CREATE TABLE my_table (
id bigserial NOT NULL,
name text,
CONSTRAINT my_table_pkey PRIMARY KEY (id)
);
And a function:
CREATE FUNCTION do_something(row_in my_table) RETURNS void AS
$$
BEGIN
-- does something
END;
$$
LANGUAGE plpgsql;
I would like to run it on data already existing in the database. It's no problem if I would like to use it from another PL/pgSQL stored procedure, for example:
-- ...
SELECT * INTO row_var FROM my_table WHERE id = 123; -- row_var is of type my_table%rowtype
PERFORM do_something(row_var);
-- ...
However, I have no idea how to do it using an "ordinary" query, e.g.
SELECT do_something(SELECT * FROM my_table WHERE id = 123);
ERROR: syntax error at or near "SELECT"
LINE 1: SELECT FROM do_something(SELECT * FROM my_table ...
Is there a way to execute such query?
You need to pass a scalar record to that function, this requires to enclose the actual select in another pair of parentheses:
SELECT do_something( (SELECT * FROM my_table WHERE id = 123) );
However the above will NOT work, because the function only expects a single column (a record of type my_table) whereas select * returns multiple columns (which is something different than a single record with multiple fields).
In order to return a record from the select you need to use the following syntax:
SELECT do_something( (SELECT my_table FROM my_table WHERE id = 123) );
Note that this might still fail if you don't make sure the select returns exactly one row.
If you want to apply the function to more than one row, you can do that like this:
select do_something(my_table)
from my_table;

RETURN QUERY-Record in PostgreSQL

I am trying to write a PostgreSQL function that inserts data in the database and then receives some data and returns it.
Here is the code:
CREATE OR REPLACE FUNCTION newTask(projectid api.objects.projectid%TYPE, predecessortaskid api.objects.predecessortaskid%TYPE, creatoruserid api.objects.creatoruserid%TYPE, title api.objects.title%TYPE, description api.objects.description%TYPE, deadline api.objects.deadline%TYPE, creationdate api.objects.creationdate%TYPE, issingletask api.tasks.issingletask%TYPE)
RETURNS SETOF api.v_task AS
$$
DECLARE
v_objectid api.objects.objectid%TYPE;
BEGIN
INSERT INTO api.objects(objectid, projectid, predecessortaskid, creatoruserid, title, description, deadline, creationdate) VALUES (DEFAULT, projectid, predecessortaskid, creatoruserid, title, description, deadline, creationdate)
RETURNING objectid INTO v_objectid;
INSERT INTO api.tasks(objectid, issingletask) VALUES (v_objectid, issingletask);
RETURN QUERY (SELECT * FROM api.v_task WHERE objectid = v_objectid);
END;
$$ LANGUAGE plpgsql;
objects and tasks are both tables and v_task is a view, which is a join of the two. The reason why I return data that I just inserted is that there are some triggers doing work on it.
So far so good. I use RETURNS SETOF api.v_task as my return type and RETURN QUERY (...) and therefore expect the result to look like a SELECT from v_task (same columns with same data types). However, what really happens is (output from pgAdmin, same result from my node.js-application):
SELECT newTask( CAST(NULL AS integer), CAST(NULL AS integer), 1, varchar 'a',varchar 'a', cast(NOW() as timestamp(0) without time zone), cast(NOW() as timestamp(0) without time zone), true);
newtask
api.v_task
--------
"(27,,,1,a,a,"2012-03-19 12:15:50","2012-03-19 12:15:50","2012-03-19 12:15:49.629997",,t)"
Instead of several column the output is forced into one, separated by commas.
As I am already using a special record type I can't use the AS keyword to specify the fields of my output.
Calling a table function
To retrieve individual columns from a function returning multiple columns (effectively a composite type or row type), call it with:
SELECT * FROM func();
If you want to, you can also just SELECT some columns and not others. Think of such a function (also called table function) like of a table:
SELECT objectid, projectid, title FROM func();
Alternative here: plain SQL
If you use PostgreSQL 9.1 or later you might be interested in this variant. I use a writable CTE to chain the INSERTs.
One might be tempted to add the final SELECT as another module to the CTE, but that does not work in this case, because the newly inserted values are not visible in the view within the same CTE. So I left that as a separate command - without brackets around the SELECT:
CREATE OR REPLACE FUNCTION new_task (
_projectid api.objects.projectid%TYPE
,_predecessortaskid api.objects.predecessortaskid%TYPE
,_creatoruserid api.objects.creatoruserid%TYPE
,_title api.objects.title%TYPE
,_description api.objects.description%TYPE
,_deadline api.objects.deadline%TYPE
,_creationdate api.objects.creationdate%TYPE
,_issingletask api.tasks.issingletask%TYPE)
RETURNS SETOF api.v_task AS
$func$
DECLARE
_objectid api.objects.objectid%TYPE;
BEGIN
RETURN QUERY
WITH x AS (
INSERT INTO api.objects
( projectid, predecessortaskid, creatoruserid, title
, description, deadline, creationdate)
VALUES (_projectid, _predecessortaskid, _creatoruserid, _title
, _description, _deadline, _creationdate)
RETURNING objectid
)
INSERT INTO api.tasks
(objectid, issingletask)
SELECT x.objectid, _issingletask
FROM x
RETURNING objectid INTO _objectid;
RETURN QUERY
SELECT * FROM api.v_task WHERE objectid = _objectid;
END
$func$ LANGUAGE plpgsql;

How to return a record from function, executed by INSERT/UPDATE rule (trigger)?

Do the following scheme for my database:
create sequence data_sequence;
create table data_table
{
id integer primary key;
field varchar(100);
};
create view data_view as
select id, field from data_table;
create function data_insert(_new data_view) returns data_view as
$$declare
_id integer;
_result data_view%rowtype;
begin
_id := nextval('data_sequence');
insert into data_table(id, field) values(_id, _new.field);
select * into _result from data_view where id = _id;
return _result;
end;
$$
language plpgsql;
create rule insert as on insert to data_view do instead
select data_insert(new);
Then type in psql:
insert into data_view(field) values('abc');
Would like to see something like:
id | field
----+---------
1 | abc
Instead see:
data_insert
-------------
(1, "abc")
Is it possible to fix this somehow?
Thanks for any ideas.
Ultimate idea is to use this in other functions, so that I could obtain id of just inserted record without selecting for it from scratch. Something like:
insert into data_view(field) values('abc') returning id into my_variable
would be nice but doesn't work with error:
ERROR: cannot perform INSERT RETURNING on relation "data_view"
HINT: You need an unconditional ON INSERT DO INSTEAD rule with a RETURNING clause.
I don't really understand that HINT. I use PostgreSQL 8.4.
What you want to do is already built into postgres. It allows you to include a RETURNING clause on INSERT statements.
CREATE TABLE data_table (
id SERIAL,
field VARCHAR(100),
CONSTRAINT data_table_pkey PRIMARY KEY (id)
);
INSERT INTO data_table (field) VALUES ('testing') RETURNING id, field;
If you feel you must use a view, check this thread on the postgres mailing list before going any further.