Function to select existing value or insert new row - sql

I'm new to working with PL/pgSQL, and I'm attempting to create a function that will either find the ID of an existing row, or will insert a new row if it is not found, and return the new ID.
The query contained in the function below works fine on its own, and the function gets created fine. However, when I try to run it, I get an error stating "ERROR: column reference "id" is ambiguous". Can anybody identify my problem, or suggest a more appropriate way to do this?
create or replace function sp_get_insert_company(
in company_name varchar(100)
)
returns table (id int)
as $$
begin
with s as (
select
id
from
companies
where name = company_name
),
i as (
insert into companies (name)
select company_name
where not exists (select 1 from s)
returning id
)
select id
from i
union all
select id
from s;
end;
$$ language plpgsql;
This is how I call the function:
select sp_get_insert_company('TEST')
And this is the error that I get:
SQL Error [42702]: ERROR: column reference "id" is ambiguous
Detail: It could refer to either a PL/pgSQL variable or a table column.
Where: PL/pgSQL function sp_get_insert_company(character varying) line 3 at SQL statement

As the messages says, id is in there twice. Once in the queries, once in the table definition of the return type. Somehow this clashes.
Try qualifying the column expressions, everywhere.
...
with s as (
select
companies.id
from
companies
where name = company_name
),
i as (
insert into companies (name)
select company_name
where not exists (select 1 from s)
returning companies.id
)
select i.id
from i
union all
select s.id
from s;
...
By qualifying the column expression the DBMS does no longer confuse id of a table with the id in the return type definition.
The next problem will be, that your SELECT has no target. It will tell you to do a PERFORM instead. But I assume you want to return the results. Change the body to
...
RETURN QUERY (
with s as (
select
companies.id
from
companies
where name = company_name
),
i as (
insert into companies (name)
select company_name
where not exists (select 1 from s)
returning companies.id
)
select i.id
from i
union all
select s.id
from s);
...
to do so.

In the function you display there is no need for returns table (id int). It's supposed to always return exactly one integer ID. Simplify to RETURNS int. This also makes ERROR: column reference "id" is ambiguous go away, since we implicitly removed the OUT parameter id (visible in the whole function block).
How to return result of a SELECT inside a function in PostgreSQL?
There is also no need for LANGUAGE plpgsql. Could simply be LANGUAGE sql, then you wouldn't need to add RETURNS QUERY, either. So:
CREATE OR REPLACE FUNCTION sp_get_insert_company(_company_name text)
RETURNS int AS
$func$
WITH s as (
select c.id -- still good style to table-qualify all columns
from companies c
where c.name = _company_name
),
i as (
insert into companies (name)
select _company_name
where not exists (select 1 from s)
returning id
)
select s.id from s
union all
select i.id from i
LIMIT 1; -- to optimize performance
$func$ LANGUAGE sql;
Except that it still suffers from concurrency issues. Find a proper solution for your undisclosed version of Postgres in this closely related answer:
Is SELECT or INSERT in a function prone to race conditions?

Related

Table variable join equivalent in Oracle

I'm given quite a huge table My_Table and a user-defined collection Picture_Arr as an input usually having 5-10 rows declared as:
TYPE Picture_Rec IS RECORD (
seq_no NUMBER,
task_seq NUMBER);
TYPE Picture_Arr IS TABLE OF Picture_Rec;
In MS SQL I would normally write something like:
DECLARE #Picture_Arr TABLE (seq_no INT, task_seq INT)
SELECT M.*
FROM My_Table M
INNER JOIN #Picture_Arr A
ON M.seq_no = A.seq_no AND M.task_seq = A.task_seq
But I can't get my head around how to re-write the same code in Oracle as Picture_Arr is not a table. As some tutorials state that I could've looped through My_Table and compare keys, but is it efficient in Oracle or is there another way of doing that?
Perhaps this is what you are looking for. It is a bit complicated to understand what is the desired output, and whether the data of the record is stored somewhere or not
create type Picture_Rec as object(
seq_no NUMBER,
task_seq NUMBER);
)
/
create type Picture_Tab as table of Picture_Rec
/
create or replace function get_picture_list
return Picture_Tab
is
l_pic Picture_Tab;
begin
select Picture_Rec ( seqno, taskseq )
bulk collect into l_pic
from your_table; -- the table you have these records
return l_pic;
end;
/
Then you run
SELECT M.*
FROM My_Table M
JOIN TABLE ( get_picture_list() ) p
ON M.seq_no = p.seq_no AND M.task_seq = p.task_seq

Is SELECT "faster" than function with nested INSERT?

I'm using a function that inserts a row to a table if it doesn't exist, then returns the id of the row.
Whenever I put the function inside a SELECT statement, with values that don't exist in the table yet, e.g.:
SELECT * FROM table WHERE id = function(123);
... it returns an empty row. However, running it again with the same values will return the row with the values I want to see.
Why does this happen? Is the INSERT running behind the SELECT speed? Or does PostgreSQL cache the table when it didn't exist, and at next run, it displays the result?
Here's a ready to use example of how this issue can occur:
CREATE TABLE IF NOT EXISTS test_table(
id INTEGER,
tvalue boolean
);
CREATE OR REPLACE FUNCTION test_function(user_id INTEGER)
RETURNS integer
LANGUAGE 'plpgsql'
AS $$
DECLARE
__user_id INTEGER;
BEGIN
EXECUTE format('SELECT * FROM test_table WHERE id = $1')
USING user_id
INTO __user_id;
IF __user_id IS NOT NULL THEN
RETURN __user_id;
ELSE
INSERT INTO test_table(id, tvalue)
VALUES (user_id, TRUE)
RETURNING id
INTO __user_id;
RETURN __user_id;
END IF;
END;
$$;
Call:
SELECT * FROM test_table WHERE id = test_function(4);
To reproduce the issue, pass any integer that doesn't exist in the table, yet.
The example is broken in multiple places.
No need for dynamic SQL with EXECUTE.
SELECT * in the function is wrong.
Your table definition should have a UNIQUE or PRIMARY KEY constraint on (id).
Most importantly, the final SELECT statement is bound to fail. Since the function is VOLATILE (has to be), it is evaluated once for every existing row in the table. Even if that worked, it would be a performance nightmare. But it does not. Like #user2864740 commented, there is also a problem with visibility. Postgres checks every existing row against the result of the function, which in turn adds 1 or more rows, and those rows are not yet in the snapshot the SELECT is operating on.
SELECT * FROM test_table WHERE id = test_function(4);
This would work (but see below!):
CREATE TABLE test_table (
id int PRIMARY KEY --!
, tvalue bool
);
CREATE OR REPLACE FUNCTION test_function(_user_id int)
RETURNS test_table LANGUAGE sql AS
$func$
WITH ins AS (
INSERT INTO test_table(id, tvalue)
VALUES (_user_id, TRUE)
ON CONFLICT DO NOTHING
RETURNING *
)
TABLE ins
UNION ALL
SELECT * FROM test_table WHERE id = _user_id
LIMIT 1
$func$;
And replace your SELECT with just:
SELECT * FROM test_function(1);
db<>fiddle here
Related:
Return a value if no record is found
How to use RETURNING with ON CONFLICT in PostgreSQL?
There is still a race condition for concurrent calls. If that can happen, consider:
Is SELECT or INSERT in a function prone to race conditions?

How fix problem "column reference "id" is ambiguous" in PostgreSQL function?

In PostgreSQL database I have 2 table: services and services_organizations_relationship. Each organization has a specific list of services.
My next function need to create new records in services table, then create relationship between services and organizations and finally return list of all new created services.
CREATE OR REPLACE FUNCTION test (
SERVICE_NAME_ARRAY VARCHAR[],
ACTIVE_ARRAY BOOLEAN[],
DESCRIPTION_ARRAY TEXT[],
ORGANIZATION_ID_ARRAY INT[]
) RETURNS TABLE (
ID UUID,
NAME VARCHAR,
ACTIVE BOOLEAN,
DESCRIPTION TEXT
) AS $$
BEGIN
RETURN QUERY
WITH RESULTS AS (
INSERT INTO SERVICES (NAME, ACTIVE, DESCRIPTION)
SELECT
UNNEST(ARRAY[SERVICE_NAME_ARRAY]) AS NAME,
UNNEST(ARRAY[ACTIVE_ARRAY]) AS ACTICE,
UNNEST(ARRAY[DESCRIPTION_ARRAY]) AS DESCRIPTION
RETURNING ID, NAME, ACTIVE, DESCRIPTION
),
GENERATE_SERVICES_ORGANIZATIONS_RELATIONSHIP AS
(
INSERT INTO SERVICES_ORGANIZATIONS_RELATIONSHIP (SERVICE_ID, ORGANIZATION_ID)
SELECT
UNNEST(ARRAY_AGG(ID)) AS SERVICE_ID,
UNNEST(ARRAY[ORGANIZATION_ID_ARRAY]) AS ORGANIZATION_ID
FROM RESULTS
ON CONFLICT ON CONSTRAINT SERVICES_ORGANIZATIONS_RELATIONSHIP_UNIQUE_KEY DO NOTHING
)
SELECT ID, NAME, ACTIVE, DESCRIPTION FROM RESULTS;
END;
$$ LANGUAGE plpgsql;
When I call this function:
SELECT * FROM test(ARRAY['SLOT', 'JTC'], ARRAY[TRUE, FALSE], ARRAY['SLOT', 'JTC'], ARRAY[30572, 30573]);
I see such error:
SQL Error [42702]: ERROR: column reference "id" is ambiguous
Details: It could refer to either a PL/pgSQL variable or a table column.
Where: PL/pgSQL function test(character varying[],boolean[],text[],integer[]) line 3 at RETURN QUERY
How to fix this problem?
The final line of the query should be
SELECT result.id, result.name,... FROM result
To avoid such collisions, you can use different names for the columns in the RETURNS TABLE clause (which are variables) and the columns in the queries (e.g. by using aliases).
Try this
GENERATE_SERVICES_ORGANIZATIONS_RELATIONSHIP AS
(
INSERT INTO SERVICES_ORGANIZATIONS_RELATIONSHIP (SERVICE_ID, ORGANIZATION_ID)
SELECT
UNNEST(ARRAY_AGG(t1.ID)) AS SERVICE_ID,
UNNEST(ARRAY[ORGANIZATION_ID_ARRAY]) AS ORGANIZATION_ID
FROM RESULTS t1
ON CONFLICT ON CONSTRAINT SERVICES_ORGANIZATIONS_RELATIONSHIP_UNIQUE_KEY DO NOTHING
)

Selecting and passing a record as a function argument

It may look like a duplicate of existing questions (e.g. This one) but they only deal with passing "new" arguments, not selecting rows from the database.
I have a table, for example:
CREATE TABLE my_table (
id bigserial NOT NULL,
name text,
CONSTRAINT my_table_pkey PRIMARY KEY (id)
);
And a function:
CREATE FUNCTION do_something(row_in my_table) RETURNS void AS
$$
BEGIN
-- does something
END;
$$
LANGUAGE plpgsql;
I would like to run it on data already existing in the database. It's no problem if I would like to use it from another PL/pgSQL stored procedure, for example:
-- ...
SELECT * INTO row_var FROM my_table WHERE id = 123; -- row_var is of type my_table%rowtype
PERFORM do_something(row_var);
-- ...
However, I have no idea how to do it using an "ordinary" query, e.g.
SELECT do_something(SELECT * FROM my_table WHERE id = 123);
ERROR: syntax error at or near "SELECT"
LINE 1: SELECT FROM do_something(SELECT * FROM my_table ...
Is there a way to execute such query?
You need to pass a scalar record to that function, this requires to enclose the actual select in another pair of parentheses:
SELECT do_something( (SELECT * FROM my_table WHERE id = 123) );
However the above will NOT work, because the function only expects a single column (a record of type my_table) whereas select * returns multiple columns (which is something different than a single record with multiple fields).
In order to return a record from the select you need to use the following syntax:
SELECT do_something( (SELECT my_table FROM my_table WHERE id = 123) );
Note that this might still fail if you don't make sure the select returns exactly one row.
If you want to apply the function to more than one row, you can do that like this:
select do_something(my_table)
from my_table;

PostgreSQL Function Returning Table or SETOF

Quick background: very new to PostgreSQL, came from SQL Server. I am working on converting stored procedures from SQL Server to PostgreSQL. I have read Postgres 9.3 documentation and looked through numerous examples and questions but still cannot find a solution.
I have this select statement that I execute weekly to return new values
select distinct id, null as charges
, generaldescription as chargedescription, amount as paidamount
from mytable
where id not in (select id from myothertable)
What can I do to turn this into a function where I can just right click and execute on a weekly basis. Eventually I will automate this using a program another developer built. So it will execute with no user involvement and send results in a spreadsheet to the user. Not sure if that last part matters to how the function is written.
This is one of my many failed attempts:
CREATE FUNCTION newids
RETURNS TABLE (id VARCHAR, charges NUMERIC
, chargedescription VARCHAR, paidamount NUMERIC) AS Results
Begin
SELECT DISTINCT id, NULL AS charges
, generaldescription AS chargedescription, amount AS paidamount
FROM mytable
WHERE id NOT IN (SELECT id FROM myothertable)
END;
$$ LANGUAGE plpgsql;
Also I am using Navicat and the error is:
function result type must be specified
You can use PL/pgSQL here and it may even be the better choice.
Difference between language sql and language plpgsql in PostgreSQL functions
But you need to fix a syntax error, match your dollar-quoting, add an explicit cast (NULL::numeric) and use RETURN QUERY:
CREATE FUNCTION newids_plpgsql()
RETURNS TABLE (
id varchar
, charges numeric
, chargedescription varchar
, paidamount numeric
) AS -- Results -- remove this
$func$
BEGIN
RETURN QUERY
SELECT ...;
END;
$func$ LANGUAGE plpgsql;
Or use a simple SQL function:
CREATE FUNCTION newids_sql()
RETURNS TABLE (
id varchar
, charges numeric
, chargedescription varchar
, paidamount numeric
) AS
$func$
SELECT ...;
$func$ LANGUAGE sql;
Either way, the SELECT statement can be more efficient:
SELECT DISTINCT t.id, NULL::numeric AS charges
, t.generaldescription AS chargedescription, t.amount AS paidamount
FROM mytable t
LEFT JOIN myothertable m ON m.id = t.id
WHERE m.id IS NULL;
Select rows which are not present in other table
Of course, all data types must match. I can't tell, table definition is missing.
And are you sure you need DISTINCT?