How does insert into select work internally in SQL Server? - sql

Let's consider the following insertion:
INSERT INTO SOMETABLE
SELECT * FROM SOMETABLE
WHERE Column > NUMBER
Is there a risk that result set of the SELECT statement can be changed due to inserting?
In other words, how many times can the SELECT statement be evaluated by the query engine while performing an INSERT INTO SELECT into the same table?
Assumption: A single thread scenario without concurrent transactions.

Is there a risk that result set of the SELECT statement can be changed due to inserting?
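No. The result set of the SELECT statement will not be affected by the INSERT. SQL Server separates the read phase from the write phase of the statement (so-called Halloween protection), so the SELECT never sees rows that the same statement inserts. The result of the SELECT will be the same as if you ran it by itself.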
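A minimal sketch of this behaviour, assuming a throwaway temp table named #SomeTable with a single column Col (both names are hypothetical and exist only for this illustration). The self-referencing INSERT copies only the rows that matched before the statement started:

```sql
-- Hypothetical temp table, used only for this illustration.
CREATE TABLE #SomeTable (Col INT NOT NULL);

INSERT INTO #SomeTable (Col) VALUES (1), (5), (10);

-- The SELECT part reads the two rows with Col > 3 that existed
-- before the statement started; the copies it inserts are not re-read.
INSERT INTO #SomeTable (Col)
SELECT Col
FROM #SomeTable
WHERE Col > 3;

-- 3 original rows + 2 copies = 5 rows.
SELECT COUNT(*) AS TotalRows FROM #SomeTable;

DROP TABLE #SomeTable;
```

If the SELECT were re-evaluated against its own output, the statement would never terminate; the single pass is what keeps the row count at five.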

Related

Insert into select, target data unaffected

We have a simple query
INSERT INTO table2
SELECT *
FROM table1
WHERE condition;
I read somewhere that to use the INSERT INTO SELECT statement, the following condition must be fulfilled:
The existing records in the target table are unaffected
What does it mean?
INSERT is a SQL operation that adds new rows to your table without affecting the existing ones. This is in contrast to an UPDATE operation, which can affect multiple existing rows in your table if you use the wrong WHERE clause.
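As a rough illustration, assuming two hypothetical temp tables #table1 and #table2 with made-up columns id and name (T-SQL syntax), the row already present in the target survives the INSERT unchanged:

```sql
-- Hypothetical tables and data, for illustration only.
CREATE TABLE #table1 (id INT NOT NULL, name VARCHAR(20) NOT NULL);
CREATE TABLE #table2 (id INT NOT NULL, name VARCHAR(20) NOT NULL);

INSERT INTO #table1 (id, name) VALUES (1, 'a'), (2, 'b');
INSERT INTO #table2 (id, name) VALUES (100, 'existing');

-- Only appends rows to #table2; the row with id = 100 is not touched.
INSERT INTO #table2 (id, name)
SELECT id, name
FROM #table1
WHERE id > 1;

-- Returns (2, 'b') and (100, 'existing').
SELECT id, name FROM #table2 ORDER BY id;

DROP TABLE #table1;
DROP TABLE #table2;
```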

Get Id from a conditional INSERT

For a table like this one:
CREATE TABLE Users(
id SERIAL PRIMARY KEY,
name TEXT UNIQUE
);
What would be the correct one-query insert for the following operation:
Given a user name, insert a new record and return the new id. But if the name already exists, just return the id.
I am aware of the new syntax within PostgreSQL 9.5 for ON CONFLICT(column) DO UPDATE/NOTHING, but I can't figure out how, if at all, it can help, given that I need the id to be returned.
It seems that RETURNING id and ON CONFLICT do not belong together.
The UPSERT implementation is hugely complex in order to be safe against concurrent write access. Take a look at this Postgres Wiki page that served as a log during initial development. The Postgres hackers decided not to include "excluded" rows in the RETURNING clause for the first release in Postgres 9.5. They might build something in for the next release.
This is the crucial statement in the manual to explain your situation:
The syntax of the RETURNING list is identical to that of the output
list of SELECT. Only rows that were successfully inserted or updated
will be returned. For example, if a row was locked but not updated
because an ON CONFLICT DO UPDATE ... WHERE clause condition was not
satisfied, the row will not be returned.
Bold emphasis mine.
For a single row to insert:
Without concurrent write load on the same table
WITH ins AS (
INSERT INTO users(name)
VALUES ('new_usr_name') -- input value
ON CONFLICT(name) DO NOTHING
RETURNING users.id
)
SELECT id FROM ins
UNION ALL
SELECT id FROM users -- 2nd SELECT never executed if INSERT successful
WHERE name = 'new_usr_name' -- input value a 2nd time
LIMIT 1;
With possible concurrent write load on the table
Consider this instead (for single row INSERT):
Is SELECT or INSERT in a function prone to race conditions?
To insert a set of rows:
How to use RETURNING with ON CONFLICT in PostgreSQL?
How to include excluded rows in RETURNING from INSERT ... ON CONFLICT
All three with very detailed explanation.
For a single row insert and no update:
with i as (
insert into users (name)
select 'the name'
where not exists (
select 1
from users
where name = 'the name'
)
returning id
)
select id
from users
where name = 'the name'
union all
select id from i
The manual about the primary and the with subqueries parts:
The primary query and the WITH queries are all (notionally) executed at the same time
Although that sounds to me like "same snapshot", I'm not sure, since I don't know what "notionally" means in that context.
But there is also:
The sub-statements in WITH are executed concurrently with each other and with the main query. Therefore, when using data-modifying statements in WITH, the order in which the specified updates actually happen is unpredictable. All the statements are executed with the same snapshot
If I understand correctly, that "same snapshot" bit prevents a race condition. But again, I'm not sure whether "all the statements" refers only to the statements in the WITH subqueries, excluding the main query. To avoid any doubt, move the SELECT in the previous query into a WITH subquery:
with s as (
select id
from users
where name = 'the name'
), i as (
insert into users (name)
select 'the name'
where not exists (select 1 from s)
returning id
)
select id from s
union all
select id from i

OUTPUT Clause in Sql Server (Transact-SQL)

I know that the OUTPUT clause can be used in an INSERT, UPDATE, DELETE, or MERGE statement. The results of an OUTPUT clause in an INSERT, UPDATE, DELETE, or MERGE statement can be stored in a target table.
But, when i run this query
select * from <Tablename> output
I didn't get any error. The query executed just like select * from tablename, without any error and with the same number of rows.
So what exactly is the use of the OUTPUT clause in a SELECT statement? If there is one, how can it be used?
I searched for an answer but couldn't find one.
The query in your question is in the same category of errors as the following (that I have also seen on this site)
SELECT *
FROM T1 NOLOCK;  -- NOLOCK is parsed as a table alias, not as a hint
SELECT *
FROM T1
LOOP JOIN T2     -- LOOP is parsed as a table alias for T1
ON X = Y;
The first one just ends up aliasing T1 AS NOLOCK. The correct syntax for the hint would be (NOLOCK) or ideally WITH(NOLOCK).
The second one aliases T1 AS LOOP. To request a nested loops join, the syntax would need to be INNER LOOP JOIN.
Similarly, in your question it just ends up applying the table alias of OUTPUT to your table.
None of OUTPUT, LOOP, NOLOCK are actually reserved keywords in TSQL, so it is valid to use them as a table alias without needing to quote them, e.g. in square brackets.
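The aliasing can be made visible with a small sketch, assuming a hypothetical temp table #T1 with one column X. Qualifying the column with the "hint" name only works because that name is the table's alias:

```sql
-- #T1 is a hypothetical temp table used only for this demonstration.
CREATE TABLE #T1 (X INT NOT NULL);
INSERT INTO #T1 (X) VALUES (1), (2);

-- NOLOCK without parentheses is parsed as a table alias,
-- so it can be used to qualify the column.
SELECT [NOLOCK].X FROM #T1 NOLOCK;

-- The same happens with OUTPUT: it becomes the alias for #T1.
SELECT [OUTPUT].X FROM #T1 OUTPUT;

-- An actual table hint needs WITH (...).
SELECT X FROM #T1 WITH (NOLOCK);

DROP TABLE #T1;
```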
The OUTPUT clause returns information about the rows affected by a statement. It is used along with INSERT, UPDATE, DELETE, or MERGE statements, as you mentioned. The reason it is used is that these statements by themselves only return the number of rows affected, not the rows themselves. Using OUTPUT with INSERT, UPDATE, DELETE, or MERGE therefore helps the user by returning the actual rows affected.
A SELECT statement itself returns rows and doesn't affect any rows. Thus the OUTPUT clause is neither required nor supported with SELECT. If you want to store the results of a SELECT statement in a target table, use SELECT INTO or the standard INSERT along with the SELECT statement.
EDIT
I guess I misunderstood your question. As @Martin Smith mentioned, it is acting as an alias in the SELECT statement you mentioned.
IF OBJECT_ID('tempdelete') IS NOT NULL DROP TABLE tempdelete
GO
IF OBJECT_ID('tempdb..#asd') IS NOT NULL DROP TABLE #asd
GO
CREATE TABLE tempdelete (
name NVARCHAR(100)
)
INSERT INTO tempdelete VALUES ('a'),('b'),('c')
--Creating empty temp table with the same columns as tempdelete
SELECT * INTO #asd FROM tempdelete WHERE 1 = 0
DELETE FROM tempdelete
OUTPUT deleted.* INTO #asd
SELECT * FROM #asd
This is how you can put all the deleted records into a table. The drawback is that you have to define the target table with columns matching the table you are deleting from. This is how I do it.

SQL Error [512] [21000]: Subquery returned more than 1 value

Why does it not work? (SQL Server)
UPDATE
someTable
SET
name='AB'
WHERE
id IN (
SELECT t.id
FROM someTable t
WHERE t.name='ABC'
)
This one doesn't work either:
UPDATE
someTable
SET
name='AB'
WHERE
name='ABC'
Because you must have a broken UPDATE trigger on the table.
A common error in triggers is not taking into account that a statement can affect multiple or zero rows and thus that the INSERTED/DELETED tables don't always contain exactly one row.
Look in the trigger for constructs like
SET @ID = (select ID FROM INSERTED)
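A set-based trigger avoids the problem. The following is only a sketch under stated assumptions: the audit table dbo.someTableAudit, its columns, and the trigger name are hypothetical, and dbo.someTable is assumed to have an id column as in the question. The point is that the trigger body reads inserted as a table rather than as a single value:

```sql
-- Hypothetical audit table, for illustration only.
CREATE TABLE dbo.someTableAudit (
    id INT NOT NULL,
    changedAt DATETIME2 NOT NULL
);
GO
-- Hypothetical trigger name; dbo.someTable is assumed to exist with an id column.
CREATE TRIGGER dbo.trg_someTable_Update
ON dbo.someTable
AFTER UPDATE
AS
BEGIN
    SET NOCOUNT ON;

    -- Broken pattern: raises error 512 when the UPDATE touches several rows.
    -- DECLARE @ID INT;
    -- SET @ID = (SELECT id FROM inserted);

    -- Set-based pattern: works for zero, one, or many updated rows.
    INSERT INTO dbo.someTableAudit (id, changedAt)
    SELECT i.id, SYSUTCDATETIME()
    FROM inserted AS i;
END;
GO
```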

How to get last inserted row from a table?

I tried the query below, but it returns more than one row, and [SCOPE_IDENTITY] comes back as NULL. What are the alternatives?
SELECT TOP 1000
[RTID],xxx,xxx
FROM [RouteTiming]
GO
SELECT SCOPE_IDENTITY() AS [SCOPE_IDENTITY];
GO
SELECT @@IDENTITY AS [@@IDENTITY];
GO
Depending on the server version (SQL Server 2005+) you can use the OUTPUT clause:
INSERT INTO tablename (column names)
OUTPUT inserted.column_name -- list the inserted.* columns you want returned (new IDs etc.)
VALUES (values in here, or you can use a SELECT statement as per usual)
MSDN Article: OUTPUT Clause (Transact-SQL)
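A concrete version of that template might look like the sketch below. It assumes a hypothetical temp table #RouteTimingDemo with an identity column RTID and a made-up Label column; neither is taken from the real RouteTiming table in the question:

```sql
-- Hypothetical demo table; only the RTID column name is borrowed from the question.
CREATE TABLE #RouteTimingDemo (
    RTID INT IDENTITY(1,1) PRIMARY KEY,
    Label VARCHAR(50) NOT NULL
);

-- Table variable that receives the generated identity values.
DECLARE @NewIds TABLE (RTID INT NOT NULL);

INSERT INTO #RouteTimingDemo (Label)
OUTPUT inserted.RTID INTO @NewIds (RTID)
VALUES ('first'), ('second');

-- One row per inserted row, each with its new RTID.
SELECT RTID FROM @NewIds;

-- For a single-row insert, SCOPE_IDENTITY() also works, provided it runs
-- in the same scope (here: the same batch) as the INSERT.
INSERT INTO #RouteTimingDemo (Label) VALUES ('third');
SELECT SCOPE_IDENTITY() AS LastId;

DROP TABLE #RouteTimingDemo;
```

Unlike SCOPE_IDENTITY(), the OUTPUT clause returns every generated value when the statement inserts more than one row.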
If you are inside a DML trigger, you can use the INSERTED table (there is also a DELETED table), which SQL Server maintains for the duration of the triggering statement.
Within the trigger, you can join the inserted table just like any other table. I believe the inserted table has been around since SQL Server 2000, but it is definitely in 2005+.
MSDN Examples: Use the inserted and deleted Tables