What are the benefits of using the Row Constructor syntax in a T-SQL insert statement?

In SQL Server 2008, you can use the Row Constructor syntax to insert multiple rows with a single insert statement, e.g.:
insert into MyTable (Col1, Col2) values
('c1v', 0),
('c2v', 1),
('c3v', 2);
Are there benefits to doing this instead of having one insert statement for each record other than readability?

Aye, there is a rather large performance difference between:
declare @numbers table (n int not null primary key clustered);
insert into @numbers (n)
values (0)
, (1)
, (2)
, (3)
, (4);
and
declare @numbers table (n int not null primary key clustered);
insert into @numbers (n) values (0);
insert into @numbers (n) values (1);
insert into @numbers (n) values (2);
insert into @numbers (n) values (3);
insert into @numbers (n) values (4);
Each standalone insert statement runs in its own implicit (autocommit) transaction, which accounts for much of the difference. You can see this for yourself by comparing the execution plans for each version or by timing the executions using set statistics time on;. There is a fixed cost associated with "setting up" and "tearing down" the context for each individual insert; the second query pays this penalty five times while the first pays it only once.
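To try the timing comparison yourself, something along these lines should work. It is only a sketch: it assumes you run it as a single batch (table variables are scoped to the batch), and the second group uses the values 5 to 9 purely to avoid primary key collisions with the first insert.

```sql
-- Sketch: compare the CPU/elapsed times printed in the Messages output.
set statistics time on;

declare @numbers table (n int not null primary key clustered);

-- One statement, five rows.
insert into @numbers (n)
values (0), (1), (2), (3), (4);

-- Five statements, one row each (different values to avoid key collisions).
insert into @numbers (n) values (5);
insert into @numbers (n) values (6);
insert into @numbers (n) values (7);
insert into @numbers (n) values (8);
insert into @numbers (n) values (9);

set statistics time off;
```

With only five rows the absolute numbers will be tiny; the gap becomes easier to see as the row count grows.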
Not only is the list method more efficient but you can also use it to build a derived table:
select *
from (values
(0)
, (1)
, (2)
, (3)
, (4)
) as Numbers (n);
This format gets around the 1,000-row limit that applies when a values list is used directly in an insert statement, and it lets you join and filter your list before it is inserted. One might also notice that we're not bound to the insert statement at all! As a de facto table, this construct can be used anywhere a table reference would be valid.
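As an illustration of joining and filtering the list before inserting, here is a minimal sketch. The @existing table variable and its contents are made up for the example; the idea is simply to skip values that are already present somewhere else.

```sql
-- Hypothetical table of values that should be skipped.
declare @existing table (n int not null primary key);
insert into @existing (n) values (1), (3);

declare @numbers table (n int not null primary key clustered);

-- Treat the values list as a derived table, then join and filter it.
insert into @numbers (n)
select v.n
from (values (0), (1), (2), (3), (4)) as v (n)
left join @existing as e on e.n = v.n
where e.n is null;   -- keep only values not found in @existing

select n from @numbers order by n;   -- returns 0, 2, 4
```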

Yes, you will see performance improvements, especially with large numbers of records.

If you want to insert more than one column of data from a SELECT alongside your explicitly typed rows, the Table Value Constructor requires a separate scalar subquery for each column. A plain INSERT ... SELECT, by contrast, lets you select several columns at once.
For example:
USE AdventureWorks2008R2;
GO
CREATE TABLE dbo.MyProducts (Name varchar(50), ListPrice money);
GO
-- This statement fails because the third values list contains multiple columns in the subquery.
INSERT INTO dbo.MyProducts (Name, ListPrice)
VALUES ('Helmet', 25.50),
('Wheel', 30.00),
(SELECT Name, ListPrice FROM Production.Product WHERE ProductID = 720);
GO
That statement fails; you would have to write it like this:
INSERT INTO dbo.MyProducts (Name, ListPrice)
VALUES ('Helmet', 25.50),
('Wheel', 30.00),
((SELECT Name FROM Production.Product WHERE ProductID = 720),
(SELECT ListPrice FROM Production.Product WHERE ProductID = 720));
GO
See Table Value Constructor Limitations and Restrictions.
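If repeating the subquery once per column feels clumsy, one alternative is to keep the literal rows in a derived table and append the multi-column SELECT with UNION ALL. This is a sketch against the same dbo.MyProducts table as above, not something taken from the documentation page; it relies on the implicit conversions between the literal types and the table's column types.

```sql
-- Literal rows come from a values list used as a derived table;
-- the looked-up row is appended with UNION ALL.
INSERT INTO dbo.MyProducts (Name, ListPrice)
SELECT v.Name, v.ListPrice
FROM (VALUES ('Helmet', 25.50),
             ('Wheel', 30.00)) AS v (Name, ListPrice)
UNION ALL
SELECT Name, ListPrice
FROM Production.Product
WHERE ProductID = 720;
GO
```

This way Production.Product is referenced only once, however many columns you need from it.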

There is no performance benefit as Abe mentioned.
The order of the column list determines the required order of the values (or of the select list). You can list the columns in any order; the values just have to follow that order.
If you accidentally switch columns in the select statement (or values clause) and the data types are compatible, spelling out the column list will help you find the problem.

Related

Does SQL treat multiple records as one when you wrap them in parentheses in a VALUES clause?

I was recently working on a stored procedure that would insert multiple records into a table variable and came across what seemed to be an oddity, but now I realize that I may have just been confusing myself by overlooking the obvious. Regardless, take the table variable definition as:
DECLARE @TableVariable TABLE(ID INT NOT NULL PRIMARY KEY IDENTITY(1, 1), SomeValue NVARCHAR(1000));
When I execute the following insert statement, it succeeds with no issues:
INSERT INTO @TableVariable (SomeValue) VALUES
('Value 1'), ('Value 2')
However, when I came back to clean up the query, I added parentheses around the VALUES clause in an effort to show where it ended without adding a comment (don't know why, probably a habit from C# of wrapping in #region tags):
INSERT INTO @TableVariable (SomeValue) VALUES (
('Value 1'), ('Value 2')
);
This produced an error stating that there were fewer columns in the INSERT than there were in the VALUES clause:
There are fewer columns in the INSERT statement than values specified in the VALUES clause. The number of values in the VALUES clause must match the number of columns specified in the INSERT statement.
My theory is that it sees the version with the extra parentheses as an attempt to insert a single row with multiple columns, since I wrapped all of the entries in one set of parentheses.
My question is, why did that officially make a difference?
The syntax is simple. VALUES is followed by a list of values for a row. Each row is within its own set of parentheses.
So:
values (1), (2)
is two rows with one column.
values ( (1) ), ( (2) )
is two rows with one column -- in fact the same as above.
But:
values ( (1, 2) )
is either an error or, in databases that support tuples, one row with one column whose value is (1, 2).
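Applied to the table variable from the question, the three cases look roughly like this. It is a sketch for SQL Server; the failing statement is left commented out so the batch runs.

```sql
DECLARE @TableVariable TABLE(ID INT NOT NULL PRIMARY KEY IDENTITY(1, 1), SomeValue NVARCHAR(1000));

-- Two rows, one column each.
INSERT INTO @TableVariable (SomeValue) VALUES ('Value 1'), ('Value 2');

-- Still two rows: the inner parentheses only wrap a single expression.
INSERT INTO @TableVariable (SomeValue) VALUES (('Value 3')), (('Value 4'));

-- One row with two values for a single column: this is the form that fails.
-- INSERT INTO @TableVariable (SomeValue) VALUES (('Value 5'), ('Value 6'));

SELECT ID, SomeValue FROM @TableVariable;   -- 4 rows
```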

Inserting more than 1000 rows using the table value constructor as a derived table

I have the following T-SQL query which inserts a list of values into a temporary table. My issue is that an INSERT ... VALUES list is limited to 1,000 rows and I have a list of 4,000 rows.
Here is an extract of the query:
USE MyDatabase
create table #t1
(
ResaID numeric (20) NOT NULL,
CtyRes varchar (20) NOT NULL
);
INSERT INTO #t1 VALUES
('304475','GB'),
('304482','GB'),
('304857','GB'),
('314643','GB'),
('321711','GB'),
('321714','GB'),
...
...and the list goes on till Row 4,000
As per Microsoft documentation, this limitation can be bypassed using a table value constructor as a derived table.
Example from Microsoft: Inserting more than 1,000 rows
CREATE TABLE dbo.Test ([Value] int);
INSERT INTO dbo.Test ([Value])
SELECT drvd.[NewVal]
FROM (VALUES (0), (1), (2), (3), ..., (5000)) drvd([NewVal]);
How do I modify my existing SQL query to adapt it to this example?
You could extend the Microsoft example to handle multiple columns like this:
INSERT INTO #t1(ResaID, CtyRes)
SELECT ResaId, CtyRes
FROM (VALUES
('304475','GB'),
('304482','GB'),
('304857','GB'),
('314643','GB')) AS sub( ResaId, CtyRes);

SELECT * FROM NEW TABLE equivalent in Postgres

In DB2 I can do a command that looks like this to retrieve information from the inserted row:
SELECT *
FROM NEW TABLE (
INSERT INTO phone_book
VALUES ( 'Peter Doe','555-2323' )
) AS t
How do I do that in Postgres?
There are ways to retrieve a sequence value, but I need to retrieve arbitrary columns.
My desire to merge a select with the insert is for performance reasons. This way I only need to execute one statement to insert values and select values from the insert. The values that are inserted come from a subselect rather than a values clause. I only need to insert 1 row.
That sample code was lifted from the Wikipedia article on the SQL INSERT statement.
A plain INSERT ... RETURNING ... does the job and delivers best performance.
A CTE is not necessary.
INSERT INTO phone_book (name, number)
VALUES ( 'Peter Doe','555-2323' )
RETURNING * -- or just phonebook_id, if that's all you need
Aside: In most cases it's advisable to add a target list.
The Wikipedia page you quoted already has the same advice:
Using an INSERT statement with RETURNING clause for PostgreSQL (since
8.2). The returned list is identical to the result of a SELECT.
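Since the question mentions that the inserted values come from a subselect rather than a VALUES clause, note that RETURNING works the same way after an INSERT ... SELECT. A minimal sketch, assuming a hypothetical source table contacts with columns contact_id, name and phone (none of which appear in the question):

```sql
-- contacts, contact_id, name and phone are made-up names for illustration.
INSERT INTO phone_book (name, number)
SELECT c.name, c.phone
FROM   contacts c
WHERE  c.contact_id = 42      -- pick the single source row
RETURNING *;                  -- returns the row(s) actually inserted
```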
PostgreSQL supports this kind of behavior through a returning clause in a common table expression. You generally shouldn't assume that something like this will improve performance simply because you're executing one statement instead of two. Use EXPLAIN to measure performance.
create table test (
test_id serial primary key,
col1 integer
);
with inserted_rows as (
insert into test (col1) values (3)
returning *
)
select * from inserted_rows;
 test_id | col1
---------+------
       1 |    3

Store and reuse value returned by INSERT ... RETURNING

In PostgreSQL, it is possible to put RETURNING at the end of an INSERT statement to return, say, the row's primary key value when that value is automatically set by a SERIAL type.
Question:
How do I store this value in a variable that can be used to insert values into other tables?
Note that I want to insert the generated id into multiple tables. A WITH clause is, as far as I understand, only useful for a single insert. I take it that this will probably have to be done in PHP.
This is really the result of bad design; without a natural key, it is difficult to grab a unique row unless you have a handle on the primary key.
... that can be used to insert values into other tables?
You can even do that in a single SQL statement using a data-modifying CTE:
WITH ins1 AS (
INSERT INTO tbl1(txt)
VALUES ('foo')
RETURNING tbl1_id
)
INSERT INTO tbl2(tbl1_id)
SELECT * FROM ins1
Requires PostgreSQL 9.1 or later.
Reply to question update
You can also insert into multiple tables in a single query:
WITH ins1 AS (
INSERT INTO tbl1(txt)
VALUES ('bar')
RETURNING tbl1_id
)
, ins2 AS (
INSERT INTO tbl2(tbl1_id)
SELECT tbl1_id FROM ins1
)
INSERT INTO tbl3(tbl1_id)
SELECT tbl1_id FROM ins1;
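If you would rather hold the generated id in an actual variable, as the question literally asks, a PL/pgSQL block is another option. This is a sketch reusing the table names from above; it assumes tbl1_id is an integer (serial) column and that an anonymous DO block (PostgreSQL 9.0 or later) is acceptable instead of a stored function.

```sql
DO $$
DECLARE
    new_id integer;   -- holds the id generated for tbl1
BEGIN
    INSERT INTO tbl1 (txt)
    VALUES ('baz')
    RETURNING tbl1_id INTO new_id;

    -- Reuse the stored id as often as needed.
    INSERT INTO tbl2 (tbl1_id) VALUES (new_id);
    INSERT INTO tbl3 (tbl1_id) VALUES (new_id);
END
$$;
```

The CTE version stays a single statement; the variable version is easier to extend when further logic has to run between the inserts.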

What is the difference between an insert statement with INTO and without INTO?

I have created table #temp with columns id as int identity(1,1) and name as varchar.
Suppose I write the following two statements to insert rows:
insert into #temp (name) select ('Vikrant') ;
insert #temp (name) select ('Vikrant')
What is the difference between these two forms of the insert statement?
Is there really any difference between them?
From the MSDN documentation:
[INTO]
Is an optional keyword that can be used between INSERT and the target table.
There is no difference between the two statements.