Insert batch statement difference between two queries

Is there any difference between these two queries:
create table tab(value int, ts timestamp) timestamp(ts)
-- q1:
insert batch 2 into tab values(1, systimestamp()),(2, systimestamp())
-- q2:
insert batch 2 into tab select cast(x as int),systimestamp() from long_sequence(2)
Should they be equivalent?

In q1, batch 2 has no effect: a plain INSERT ... VALUES is committed as a single statement.
In q2, batch 2 delays the commit until two rows from the SELECT have been inserted; with an INSERT ... SELECT, rows are committed in batches of that size.
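To make the effect visible, here is a sketch along the same lines as q2 (the batch size and sequence length are arbitrary, chosen only so that several intermediate commits happen):
-- rows produced by the SELECT should be committed in chunks of 1000 rather
-- than in one final commit; tab is the table created above
insert batch 1000 into tab
select cast(x as int), systimestamp() from long_sequence(1000000)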

Related

How to maintain transaction integrity per common column value for inserting into two tables?

Let's say I have two temp tables created as below:
CREATE TABLE #Table1 (
A INT,
B NVARCHAR(100)
)
CREATE TABLE #Table2 (
A INT,
C NVARCHAR(100)
)
Here are the contents of #Table1:
1, 'Hello'
2, 'Hi'
3, 'Ola'
These are the contents of #Table2:
1, 'my old friend'
1, 'sweetheart'
2, 'buddy'
4, 'the end'
Now I want to insert #Table1 into a table X and #Table2 into a table Y. The catch is that I have to maintain transaction integrity across the inserts into X and Y for each value of column A.
For instance, say I am inserting the first row (1, 'Hello') of #Table1 into X. Then I must also insert the first two rows ((1, 'my old friend'), (1, 'sweetheart')) of #Table2 into Y in the same transaction, so that if any insert into Y fails for A=1, the insert into X for A=1 fails as well. Any value of column A that appears in only one of #Table1 and #Table2 forms a transaction on its own (e.g. A=3 in #Table1 and A=4 in #Table2).
Here are the ways I see to deal with this problem:
I fetch all values of A in both #Table1 and #Table2, run a cursor over them, and for each value of A insert into tables X and Y in a single transaction. The problems here are that I want to avoid cursors as far as possible, and that this would mean a very large number of individual inserts.
I pre-validate my #Table1 and #Table2 values and then do one single insert of #Table1 into X and one of #Table2 into Y. This will be much faster than the method above, but not wrapping it in a transaction somehow doesn't feel right, and there is a small chance I might have missed a validation somewhere (unlikely, yet possible).
Which approach should I go for? Is there a better solution?
P.S. Please also note that I do not want to fail the entire insert into X and Y if the insert fails for only one or a few values of A. Going back and deleting rows from my DB based on the failed inserts is also not an option, as it breaks the running-id continuity I am trying to preserve.
A DML statement is executed completely or not at all.
You can mix your two options:
first add as many validations as possible and run your option 2 (the single set-based insert); if it fails, fall back to your option 1, using a temp table instead of a cursor.
BEGIN TRY
    BEGIN TRAN Opt2
        -- Option 2: the single set-based insert goes here
    COMMIT TRAN Opt2
END TRY
BEGIN CATCH
    ROLLBACK TRAN Opt2

    -- Fall back to processing one row at a time from a temp copy
    DECLARE @A INT, @B NVARCHAR(100)
    SELECT * INTO #TMP FROM #Table1

    WHILE EXISTS (SELECT 1 FROM #TMP)
    BEGIN
        SELECT TOP 1 @A = A, @B = B FROM #TMP
        -- Option 1: insert this row (and its related rows) here
        DELETE #TMP WHERE A = @A
    END
END CATCH
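If the fallback itself must keep the per-A integrity across both targets, one possible sketch (assuming X has columns (A, B) and Y has columns (A, C); #Keys is just a scratch temp table invented here) loops over the distinct values of A instead of individual rows:
DECLARE @A INT
SELECT DISTINCT A INTO #Keys
FROM (SELECT A FROM #Table1 UNION SELECT A FROM #Table2) k

WHILE EXISTS (SELECT 1 FROM #Keys)
BEGIN
    SELECT TOP 1 @A = A FROM #Keys

    BEGIN TRY
        BEGIN TRAN
            INSERT INTO X (A, B) SELECT A, B FROM #Table1 WHERE A = @A
            INSERT INTO Y (A, C) SELECT A, C FROM #Table2 WHERE A = @A
        COMMIT TRAN
    END TRY
    BEGIN CATCH
        IF @@TRANCOUNT > 0 ROLLBACK TRAN   -- only this value of A is skipped
    END CATCH

    DELETE #Keys WHERE A = @A
END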

Using "BEGIN TRANSACTION" and "END TRANSACTION" to improve the performance

I am reading the post Improve INSERT-per-second performance of SQLite? to improve the performance of my SQLite database.
One question is: If I need to perform the following queries:
INSERT INTO
INSERT INTO
...
INSERT INTO(more than 10000 times)
SELECT ...
SELECT
UPDATE ...
To improve performance, should I put "BEGIN TRANSACTION" and "END TRANSACTION" at the very beginning and end of all the code, like this:
BEGIN TRANSACTION
INSERT INTO
INSERT INTO
...
INSERT INTO(more than 10000 times)
SELECT ...
SELECT
UPDATE ...
UPDATE ...
END TRANSACTION
Or should I put BEGIN/END TRANSACTION around just the insert operations?
BEGIN TRANSACTION
INSERT INTO
INSERT INTO
...
INSERT INTO(more than 10000 times)
END TRANSACTION
SELECT ...
SELECT
UPDATE ...
UPDATE ...
If the INSERTs are for the same table, with the same columns, using one multi-row insert will improve performance significantly. That is because each separate insert command involves a round trip to the DB, which takes much more time than the actual query.
Based on the limits of the server (other processes logged in, etc.), I would set a limit on the number of rows inserted per statement, for example 1000 rows at a time.
INSERT INTO table (col1, col2, col3,...) VALUES
(v1, v2, v3,...), ...   -- 1000 value tuples in a single statement
is much faster than
INSERT INTO table (col1, col2, col3,...) VALUES
(v1, v2, v3,...);
-- repeated as 1000 separate statements
hope that helps
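Combining that multi-row suggestion with the question's first option (one explicit transaction around all of the inserts), a minimal SQLite sketch with illustrative table and column names:
BEGIN TRANSACTION;
INSERT INTO t (a, b) VALUES
(1, 'x'),
(2, 'y'),
(3, 'z');                 -- up to ~1000 tuples per statement, repeated as needed
-- ... more batched INSERT statements ...
END TRANSACTION;          -- END TRANSACTION is the same as COMMIT in SQLite
-- the later SELECTs and UPDATEs can then run on their own, outside this transaction
SELECT count(*) FROM t;
UPDATE t SET b = 'w' WHERE a = 1;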

Difference in inserting values into SQL Server

I am using SQL Server 2012
The query is:
drop table x
create table x(id int primary key)
insert into x values(5)
insert into x values(6)
begin tran
insert into x values(1),(2),(3),(3),(4)--Primary key violation
commit tran
select * from x
This returns
5
6
and another query
drop table x
create table x(id int primary key)
insert into x values(5)
insert into x values(6)
begin tran
insert into x values(1)
insert into x values(2)
insert into x values(3)
insert into x values(3) --Primary key violation
insert into x values (4)
commit tran
select * from x
This returns
1
2
3
4
5
6
So what is the difference between those two ways of inserting values in SQL Server, and why the different result sets?
Sample 1 has a single insert statement for 1, 2, 3, 3, 4. This is one multi-row ("bulk"-style) insert statement (although SQL Server uses the term bulk insert for something else). Essentially it means all the values in this line are inserted as one single, atomic action, so the primary key violation makes the whole statement fail and none of its rows are kept.
Sample 2 has separate insert statements. Since there is no exception handling in place and the error only terminates the failing statement, there is no reason for the transaction to abort: the error is effectively ignored, the other records are added, the COMMIT succeeds, and the result is what you see.
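To see that statement-level atomicity in isolation, without any explicit transaction, here is a minimal sketch (x2 is just a scratch table invented for the demonstration):
create table x2(id int primary key)
insert into x2 values(1),(2),(2)  -- duplicate key: the whole statement fails
select * from x2                  -- returns no rows; 1 and 2 were not kept either
drop table x2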
SQL Server executes queries as batches. If an error occurs in a batch then, according to MSDN, one of the following is possible:
No statements in the batch are executed.
No statements in the batch are executed and the transaction is rolled back.
All of the statements before the error statement are executed.
All of the statements except the error statement are executed.
In your first case the failing statement is the single multi-row INSERT, so none of its rows are inserted. In your second case, "all of the statements except the error statement are executed".
For more about SQL batches, please refer to the following MSDN articles:
Batches of SQL Statements
Executing Batches
Errors and Batches
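If you want the second, statement-per-row form to fail as a whole like the first one, one option (a sketch only, reusing the question's table x with new key values) is SET XACT_ABORT ON, which makes a run-time error such as the primary key violation roll back the open transaction:
set xact_abort on
begin tran
insert into x values(11)
insert into x values(11)  -- PK violation: with XACT_ABORT ON the batch stops here
insert into x values(12)  -- never executed
commit tran               -- never reached; the open transaction is rolled back
-- afterwards neither 11 nor 12 is present in x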

Simulate a deadlock using stored procedure

Does anyone know how to simulate a deadlock using a stored procedure inserting or updating values? I could only do so in Sybase using individual commands.
Thanks,
Ver
Create two stored procedures.
The first should start a transaction, modify table 1 (and take a long time) and then modify table 2.
The second should start a transaction, modify table 2 (and take a long time) and then modify table 1.
Ideally, the modifications should affect the same rows, or create table locks.
Then, in a client application, start SP1 and then immediately also start SP2 (before SP1 has finished).
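A minimal sketch of that idea (the procedure names, tables t1 and t2, their columns, and the WAITFOR delay are all illustrative):
CREATE PROCEDURE usp_Deadlock_1 AS
BEGIN
    BEGIN TRAN
    UPDATE t1 SET col = col WHERE id = 1   -- lock a row in t1 first
    WAITFOR DELAY '00:00:10'               -- hold the lock long enough to overlap
    UPDATE t2 SET col = col WHERE id = 1   -- then try to lock the row in t2
    COMMIT
END
GO
CREATE PROCEDURE usp_Deadlock_2 AS
BEGIN
    BEGIN TRAN
    UPDATE t2 SET col = col WHERE id = 1   -- same rows, opposite order
    WAITFOR DELAY '00:00:10'
    UPDATE t1 SET col = col WHERE id = 1
    COMMIT
END
GO
-- Run usp_Deadlock_1 in one session and usp_Deadlock_2 in another within the
-- delay window; one of the two will be chosen as the deadlock victim.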
The simple and short answer for getting a deadlock is to access the table's data in reverse order from two connections, thereby introducing a cyclic wait between them. Let me show you the code:
Create table vin_deadlock (id int, Name Varchar(30))
GO
Insert into vin_deadlock values (1, 'Vinod')
Insert into vin_deadlock values (2, 'Kumar')
Insert into vin_deadlock values (3, 'Saravana')
Insert into vin_deadlock values (4, 'Srinivas')
Insert into vin_deadlock values (5, 'Sampath')
Insert into vin_deadlock values (6, 'Manoj')
GO
Now, with the table ready, just update the rows in reverse order from the two connections, like this:
-- Connection 1
Begin Tran
Update vin_deadlock
SET Name = 'Manoj'
Where id = 6
WAITFOR DELAY '00:00:10'
Update vin_deadlock
SET Name = 'Vinod'
Where id = 1
and from connection 2
-- Connection 2
Begin Tran
Update vin_deadlock
SET Name = 'Vinod'
Where id = 1
WAITFOR DELAY '00:00:10'
Update vin_deadlock
SET Name = 'Manoj'
Where id = 6
And this will result in a deadlock. You can see the deadlock graph in Profiler.
Start a process that continuously inserts into or updates a table using a WHILE loop in a script, and then run your desired stored procedure.

Insert into a temporary table and update another table in one SQL query (Oracle)

Here's what I'm trying to do:
1) Insert into a temp table some values from an original table
INSERT INTO temp_table SELECT id FROM original WHERE status='t'
2) Update the original table
UPDATE original SET valid='t' WHERE status='t'
3) Select based on a join between the two tables
SELECT * FROM original JOIN temp_table ON temp_table.id = original.id
Is there a way to combine steps 1 and 2?
You can combine the steps by doing the update in PL/SQL and using the RETURNING clause to get the updated ids into a PL/SQL table.
EDIT:
If you still need to do the final query, you can still use this method to insert into the temp_table; although depending on what that last query is for, there may be other ways of achieving what you want. To illustrate:
DECLARE
  TYPE id_table_t IS TABLE OF original.id%TYPE INDEX BY PLS_INTEGER;
  id_table id_table_t;
BEGIN
  UPDATE original SET valid = 't' WHERE status = 't'
  RETURNING id BULK COLLECT INTO id_table;
  FORALL i IN 1..id_table.COUNT
    INSERT INTO temp_table
    VALUES (id_table(i));
END;
/
SELECT * FROM original JOIN temp_table ON temp_table.id = original.id;
No, DML statements cannot be mixed.
There's a MERGE statement, but it's only for operations on a single table.
Maybe create a TRIGGER which fires after inserting into temp_table and updates the original table.
Create a cursor holding the values to insert, then loop through the cursor updating the table. No need to create the temp table in the first place.
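A minimal sketch of that cursor approach (column names taken from the question; the COMMIT placement is up to you):
DECLARE
  CURSOR c IS
    SELECT id FROM original WHERE status = 't' FOR UPDATE;
BEGIN
  FOR r IN c LOOP
    UPDATE original SET valid = 't' WHERE CURRENT OF c;
    -- r.id is available here if the ids are still needed elsewhere
  END LOOP;
  COMMIT;
END;
/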
You can combine steps 1 and 2 using a MERGE statement and DML error logging. Select twice as many rows, update half of them, and force the other half to fail and then be inserted into an error log that you can use as your temporary table.
The solution below assumes that you have a primary key constraint on ID, but there are other ways you could force a failure.
Although I think this is pretty cool, I would recommend you not use it. It looks very weird, has some strange issues (the inserts into TEMP_TABLE are auto-committed), and is probably very slow.
--Create ORIGINAL table for testing.
--Primary key will be intentionally violated later.
create table original (id number, status varchar2(10), valid varchar2(10)
,primary key (id));
--Create TEMP_TABLE as error log. There will be some extra columns generated.
begin
dbms_errlog.create_error_log(dml_table_name => 'ORIGINAL'
,err_log_table_name => 'TEMP_TABLE');
end;
/
--Test data
insert into original values(1, 't', null);
insert into original values(2, 't', null);
insert into original values(3, 's', null);
commit;
--Update rows in ORIGINAL and also insert those updated rows to TEMP_TABLE.
merge into original original1
using
(
--Duplicate the rows. Only choose rows with the relevant status.
select id, status, valid, rownumber
from original
cross join
(select 1 rownumber from dual union all select 2 rownumber from dual)
where status = 't'
) original2
on (original1.id = original2.id and original2.rownumber = 1)
--Only match half the rows, those with rownumber = 1.
when matched then update set valid = 't'
--The other half will be inserted. Inserting ID causes a PK error and will
--insert the data into the error table, TEMP_TABLE.
when not matched then insert(original1.id, original1.status, original1.valid)
values(original2.id, original2.status, original2.valid)
log errors into temp_table reject limit 999999999;
--Expected: ORIGINAL rows 1 and 2 have VALID = 't'.
--TEMP_TABLE has the two original values for ID 1 and 2.
select * from original;
select * from temp_table;