Store and reuse value returned by INSERT ... RETURNING - sql

In PostgreSQL, it is possible to put RETURNING at the end of an INSERT statement to return, say, the row's primary key value when that value is automatically set by a SERIAL type.
Question:
How do I store this value in a variable that can be used to insert values into other tables?
Note that I want to insert the generated id into multiple tables. A WITH clause is, as far as I understand, only useful for a single insert. I take it that this will probably have to be done in PHP.
This is really the result of bad design; without a natural key, it is difficult to grab a unique row unless you have a handle on the primary key;

... that can be used to insert values into other tables?
You can even do that in a single SQL statement using a data-modifying CTE:
WITH ins1 AS (
INSERT INTO tbl1(txt)
VALUES ('foo')
RETURNING tbl1_id
)
INSERT INTO tbl2(tbl1_id)
SELECT * FROM ins1
Requires PostgreSQL 9.1 or later.
db<>fiddle here (Postgres 13)
Old sqlfiddle (Postgres 9.6)
Reply to question update
You can also insert into multiple tables in a single query:
WITH ins1 AS (
INSERT INTO tbl1(txt)
VALUES ('bar')
RETURNING tbl1_id
)
, ins2 AS (
INSERT INTO tbl2(tbl1_id)
SELECT tbl1_id FROM ins1
)
INSERT INTO tbl3(tbl1_id)
SELECT tbl1_id FROM ins1;

Related

What happens with duplicates when inserting multiple rows?

I am running a python script that inserts a large amount of data into a Postgres database, I use a single query to perform multiple row inserts:
INSERT INTO table (col1,col2) VALUES ('v1','v2'),('v3','v4') ... etc
I was wondering what would happen if it hits a duplicate key for the insert. Will it stop the entire query and throw an exception? Or will it merely ignore the insert of that specific row and move on?
The INSERT will just insert all rows and nothing special will happen, unless you have some kind of constraint disallowing duplicate / overlapping values (PRIMARY KEY, UNIQUE, CHECK or EXCLUDE constraint) - which you did not mention in your question. But that's what you are probably worried about.
Assuming a UNIQUE or PK constraint on (col1,col2), you are dealing with a textbook UPSERT situation. Many related questions and answers to find here.
Generally, if any constraint is violated, an exception is raised which (unless trapped in subtransaction like it's possible in a procedural server-side language like plpgsql) will roll back not only the statement, but the whole transaction.
Without concurrent writes
I.e.: No other transactions will try to write to the same table at the same time.
Exclude rows that are already in the table with WHERE NOT EXISTS ... or any other applicable technique:
Select rows which are not present in other table
And don't forget to remove duplicates within the inserted set as well, which would not be excluded by the semi-anti-join WHERE NOT EXISTS ...
One technique to deal with both at once would be EXCEPT:
INSERT INTO tbl (col1, col2)
VALUES
(text 'v1', text 'v2') -- explicit type cast may be needed in 1st row
, ('v3', 'v4')
, ('v3', 'v4') -- beware of dupes in source
EXCEPT SELECT col1, col2 FROM tbl;
EXCEPT without the key word ALL folds duplicate rows in the source. If you know there are no dupes, or you don't want to fold duplicates silently, use EXCEPT ALL (or one of the other techniques). See:
Using EXCEPT clause in PostgreSQL
Generally, if the target table is big, WHERE NOT EXISTS in combination with DISTINCT on the source will probably be faster:
INSERT INTO tbl (col1, col2)
SELECT *
FROM (
SELECT DISTINCT *
FROM (
VALUES
(text 'v1', text'v2')
, ('v3', 'v4')
, ('v3', 'v4') -- dupes in source
) t(c1, c2)
) t
WHERE NOT EXISTS (
SELECT FROM tbl
WHERE col1 = t.c1 AND col2 = t.c2
);
If there can be many dupes, it pays to fold them in the source first. Else use one subquery less.
Related:
Select rows which are not present in other table
With concurrent writes
Use the Postgres UPSERT implementation INSERT ... ON CONFLICT ... in Postgres 9.5 or later:
INSERT INTO tbl (col1,col2)
SELECT DISTINCT * -- still can't insert the same row more than once
FROM (
VALUES
(text 'v1', text 'v2')
, ('v3','v4')
, ('v3','v4') -- you still need to fold dupes in source!
) t(c1, c2)
ON CONFLICT DO NOTHING; -- ignores rows with *any* conflict!
Further reading:
How to use RETURNING with ON CONFLICT in PostgreSQL?
How do I insert a row which contains a foreign key?
Documentation:
The manual
The commit page
The Postgres Wiki page
Craig's reference answer for UPSERT problems:
How to UPSERT (MERGE, INSERT ... ON DUPLICATE UPDATE) in PostgreSQL?
Will it stop the entire query and throw an exception? Yes.
To avoid that, you can look on the following SO question here, which describes how to avoid Postgres from throwing an error for multiple inserts when some of the inserted keys already exist on the DB.
You should basically do this:
INSERT INTO DBtable
(id, field1)
SELECT 1, 'value'
WHERE
NOT EXISTS (
SELECT id FROM DBtable WHERE id = 1
);

SELECT * FROM NEW TABLE equivalent in Postgres

In DB2 I can do a command that looks like this to retrieve information from the inserted row:
SELECT *
FROM NEW TABLE (
INSERT INTO phone_book
VALUES ( 'Peter Doe','555-2323' )
) AS t
How do I do that in Postgres?
There are way to retrieve a sequence, but I need to retrieve arbitrary columns.
My desire to merge a select with the insert is for performance reasons. This way I only need to execute one statement to insert values and select values from the insert. The values that are inserted come from a subselect rather than a values clause. I only need to insert 1 row.
That sample code was lifted from Wikipedia Insert Article
A plain INSERT ... RETURNING ... does the job and delivers best performance.
A CTE is not necessary.
INSERT INTO phone_book (name, number)
VALUES ( 'Peter Doe','555-2323' )
RETURNING * -- or just phonebook_id, if that's all you need
Aside: In most cases it's advisable to add a target list.
The Wikipedia page you quoted already has the same advice:
Using an INSERT statement with RETURNING clause for PostgreSQL (since
8.2). The returned list is identical to the result of a SELECT.
PostgreSQL supports this kind of behavior through a returning clause in a common table expression. You generally shouldn't assume that something like this will improve performance simply because you're executing one statement instead of two. Use EXPLAIN to measure performance.
create table test (
test_id serial primary key,
col1 integer
);
with inserted_rows as (
insert into test (c1) values (3)
returning *
)
select * from inserted_rows;
test_id col1
--
1 3
Docs

SQLite : insert multiple value (select from) with insert specified value

i'm trying to make sqlite query that can insert multiple values , here's my query that i try :
insert into table1(idy_table1,idx_table1)
values ('1', //specified value insert to idy_table1
(select id_table2 from table2)) //insert value from select id_table2
i'm having some trouble, it just only insert one value,
and my question is how to make a proper query? so i can make it work.
The VALUES clause always adds one row.
(Except when you're using multiple tuples, but this does not work with queries.)
The easiest way to add multiple rows from a query is to use the SELECT form of the INSERT statement:
INSERT INTO Table1(idy_table1, idx_table1)
SELECT '1', id_table2 FROM table2;

What are the benefits of using the Row Constructor syntax in a T-Sql insert statement?

In SQL Server 2008, you can use the Row Constructor syntax to insert multiple rows with a single insert statement, e.g.:
insert into MyTable (Col1, Col2) values
('c1v', 0),
('c2v', 1),
('c3v', 2);
Are there benefits to doing this instead of having one insert statement for each record other than readability?
Aye, there is a rather large performance difference between:
declare #numbers table (n int not null primary key clustered);
insert into #numbers (n)
values (0)
, (1)
, (2)
, (3)
, (4);
and
declare #numbers table (n int not null primary key clustered);
insert into #numbers (n) values (0);
insert into #numbers (n) values (1);
insert into #numbers (n) values (2);
insert into #numbers (n) values (3);
insert into #numbers (n) values (4);
The fact that every single insert statement has its own implicit transaction guarantees this. You can prove it to yourself easily by viewing the execution plans for each statement or by timing the executions using set statistics time on;. There is a fixed cost associated with "setting up" and "tearing down" the context for each individual insert and the second query has to pay this penalty five times while the first only pays it once.
Not only is the list method more efficient but you can also use it to build a derived table:
select *
from (values
(0)
, (1)
, (2)
, (3)
, (4)
) as Numbers (n);
This format gets around the 1,000 value limitation and allows you to join and filter your list before it is inserted. One might also notice that we're not bound to the insert statement at all! As a de facto table, this construct can be used anywhere a table reference would be valid.
Yes - you will see performance improvements. Especially with large numbers of records.
If you will be inserting more than one column of data with a SELECT in addition to your explicitly typed rows, the Table Value Constructor will require you to spell out each column individually as opposed to when you are using one INSERT statement, you can specify multiple columns in the SELECT.
For example:
USE AdventureWorks2008R2;
GO
CREATE TABLE dbo.MyProducts (Name varchar(50), ListPrice money);
GO
-- This statement fails because the third values list contains multiple columns in the subquery.
INSERT INTO dbo.MyProducts (Name, ListPrice)
VALUES ('Helmet', 25.50),
('Wheel', 30.00),
(SELECT Name, ListPrice FROM Production.Product WHERE ProductID = 720);
GO
Would fail; you would have to do it like this:
INSERT INTO dbo.MyProducts (Name, ListPrice)
VALUES ('Helmet', 25.50),
('Wheel', 30.00),
((SELECT Name FROM Production.Product WHERE ProductID = 720),
(SELECT ListPrice FROM Production.Product WHERE ProductID = 720));
GO
see Table Value Constructor Limitations and Restrictions
There is no performance benefit as Abe mentioned.
The order of the columns constructor is the required order for the values (or select statement). You can list the columns in any order - the values will have to follow that order.
If you accidently switch columns in the select statement (or values clause) and the data types are compatible, using the columns construct will help you find the problem.

Does DB2 have an "insert or update" statement?

From my code (Java) I want to ensure that a row exists in the database (DB2) after my code is executed.
My code now does a select and if no result is returned it does an insert. I really don't like this code since it exposes me to concurrency issues when running in a multi-threaded environment.
What I would like to do is to put this logic in DB2 instead of in my Java code.
Does DB2 have an insert-or-update statement? Or anything like it that I can use?
For example:
insertupdate into mytable values ('myid')
Another way of doing it would probably be to always do the insert and catch "SQL-code -803 primary key already exists", but I would like to avoid that if possible.
Yes, DB2 has the MERGE statement, which will do an UPSERT (update or insert).
MERGE INTO target_table USING source_table ON match-condition
{WHEN [NOT] MATCHED
THEN [UPDATE SET ...|DELETE|INSERT VALUES ....|SIGNAL ...]}
[ELSE IGNORE]
See:
http://publib.boulder.ibm.com/infocenter/db2luw/v9/index.jsp?topic=/com.ibm.db2.udb.admin.doc/doc/r0010873.htm
https://www.ibm.com/support/knowledgecenter/en/SS6NHC/com.ibm.swg.im.dashdb.sql.ref.doc/doc/r0010873.html
https://www.ibm.com/developerworks/community/blogs/SQLTips4DB2LUW/entry/merge?lang=en
I found this thread because I really needed a one-liner for DB2 INSERT OR UPDATE.
The following syntax seems to work, without requiring a separate temp table.
It works by using VALUES() to create a table structure . The SELECT * seems surplus IMHO but without it I get syntax errors.
MERGE INTO mytable AS mt USING (
SELECT * FROM TABLE (
VALUES
(123, 'text')
)
) AS vt(id, val) ON (mt.id = vt.id)
WHEN MATCHED THEN
UPDATE SET val = vt.val
WHEN NOT MATCHED THEN
INSERT (id, val) VALUES (vt.id, vt.val)
;
if you have to insert more than one row, the VALUES part can be repeated without having to duplicate the rest.
VALUES
(123, 'text'),
(456, 'more')
The result is a single statement that can INSERT OR UPDATE one or many rows presumably as an atomic operation.
This response is to hopefully fully answer the query MrSimpleMind had in use-update-and-insert-in-same-query and to provide a working simple example of the DB2 MERGE statement with a scenario of inserting AND updating in one go (record with ID 2 is updated and record ID 3 inserted).
CREATE TABLE STAGE.TEST_TAB ( ID INTEGER, DATE DATE, STATUS VARCHAR(10) );
COMMIT;
INSERT INTO TEST_TAB VALUES (1, '2013-04-14', NULL), (2, '2013-04-15', NULL); COMMIT;
MERGE INTO TEST_TAB T USING (
SELECT
3 NEW_ID,
CURRENT_DATE NEW_DATE,
'NEW' NEW_STATUS
FROM
SYSIBM.DUAL
UNION ALL
SELECT
2 NEW_ID,
NULL NEW_DATE,
'OLD' NEW_STATUS
FROM
SYSIBM.DUAL
) AS S
ON
S.NEW_ID = T.ID
WHEN MATCHED THEN
UPDATE SET
(T.STATUS) = (S.NEW_STATUS)
WHEN NOT MATCHED THEN
INSERT
(T.ID, T.DATE, T.STATUS) VALUES (S.NEW_ID, S.NEW_DATE, S.NEW_STATUS);
COMMIT;
Another way is to execute this 2 queries. It's simpler than create a MERGE statement:
update TABLE_NAME set FIELD_NAME=xxxxx where MyID=XXX;
INSERT INTO TABLE_NAME (MyField1,MyField2) values (xxx,xxxxx)
WHERE NOT EXISTS(select 1 from TABLE_NAME where MyId=xxxx);
The first query just updateS the field you need, if the MyId exists.
The second insertS the row into db if MyId does not exist.
The result is that only one of the queries is executed in your db.
I started with hibernate project where hibernate allows you to saveOrUpdate().
I converted that project into JDBC project the problem was with save and update.
I wanted to save and update at the same time using JDBC.
So, I did some research and I came accross ON DUPLICATE KEY UPDATE :
String sql="Insert into tblstudent (firstName,lastName,gender) values (?,?,?)
ON DUPLICATE KEY UPDATE
firstName= VALUES(firstName),
lastName= VALUES(lastName),
gender= VALUES(gender)";
The issue with the above code was that it updated primary key twice which is true as
per mysql documentation:
The affected rows is just a return code. 1 row means you inserted, 2 means you updated, 0 means nothing happend.
I introduced id and increment it to 1. Now I was incrementing the value of id and not mysql.
String sql="Insert into tblstudent (id,firstName,lastName,gender) values (?,?,?)
ON DUPLICATE KEY UPDATE
id=id+1,
firstName= VALUES(firstName),
lastName= VALUES(lastName),
gender= VALUES(gender)";
The above code worked for me for both insert and update.
Hope it works for you as well.