In SQL Server, when a deadlock occurs, which transaction will be aborted?

When a deadlock occurs in SQL Server, which of the transactions will be aborted? In other words, how does SQL Server decide which transaction should be killed?
Consider the two transactions below:
Transaction A
begin transaction
update Customers set LastName = 'Kharazmi'
waitfor delay '00:00:5'; -- waits for 5 second
update Orders set OrderId=13
commit transaction
Transaction B
begin transaction
update Orders set OrderId=14
waitfor delay '00:00:5'; -- waits for 5 second
update Customers set LastName = 'EbneSina'
commit transaction
If both transactions are executed at the same time, transaction A locks and updates the Customers table while transaction B locks and updates the Orders table. After the 5-second delay, transaction A requests a lock on the Orders table, which is already held by transaction B, and transaction B requests a lock on the Customers table, which is already held by transaction A. Neither transaction can proceed, so a deadlock occurs.
My question is: when the deadlock occurs, which of the two transactions will be aborted?
First I executed transaction A and then transaction B, and SQL Server aborted transaction A. Then I executed transaction B first and then A; the result was the same, and transaction A was aborted again. This confused me!
Thanks for any help.

You can learn the criteria in SET DEADLOCK_PRIORITY (Transact-SQL):
Which session is chosen as the deadlock victim depends on each
session's deadlock priority:
If both sessions have the same deadlock priority, the instance of SQL Server chooses the session that is less expensive to roll back as
the deadlock victim. For example, if both sessions have set their
deadlock priority to HIGH, the instance will choose as a victim the
session it estimates is less costly to roll back.
If the sessions have different deadlock priorities, the session with the lowest deadlock priority is chosen as the deadlock victim.
In your case, the DBMS presumably estimates that transaction A is the less expensive one to roll back.
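As a minimal sketch of the priority rule quoted above, assuming you would rather have transaction B chosen as the victim, the session running B can lower its own priority before it starts. The statements are the ones from the question; only the SET DEADLOCK_PRIORITY line is new.

```sql
-- Run in the session that executes transaction B.
-- LOW makes this session the preferred victim when the other
-- session involved in the deadlock runs at NORMAL (the default) or HIGH.
SET DEADLOCK_PRIORITY LOW;

BEGIN TRANSACTION;
UPDATE Orders SET OrderId = 14;
WAITFOR DELAY '00:00:05';  -- wait 5 seconds
UPDATE Customers SET LastName = 'EbneSina';
COMMIT TRANSACTION;
```

The setting applies to the session, so it stays in effect for later transactions on the same connection until it is changed again.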

There's no way to know just by looking at the queries. As far as I understand it, the engine considers all transactions involved in the deadlock and tries to work out which one will be the cheapest to roll back.
So if, say, there are 2 rows in your Customers table and 9,000,000 rows in your Orders table, it will be considerably cheaper to roll back the UPDATE applied to Customers than the one applied to Orders.
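One way to get a feel for the relative cost is to look at how much transaction log each open transaction has generated so far. This is only a rough sketch: it assumes that log volume is a reasonable proxy for rollback cost, and it has to be run from a third session while both transactions are still open.

```sql
-- Log bytes generated so far by each session's open transaction.
-- Smaller values suggest a cheaper rollback (assumption: log volume
-- approximates the cost the engine estimates).
SELECT st.session_id,
       dt.database_id,
       dt.database_transaction_log_bytes_used
FROM sys.dm_tran_session_transactions AS st
JOIN sys.dm_tran_database_transactions AS dt
    ON dt.transaction_id = st.transaction_id
ORDER BY dt.database_transaction_log_bytes_used;
```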

Related

Row locks on select

I have a stored procedure which does the following:
selects top N from table
sets these rows as processed
returns these rows to the client
Here is roughly how I am doing it in Sybase ASE:
set rowcount @count
begin tran get_items
insert into #temp_table
select item
from available_item
where is_processed = 0
update available_item
set is_processed = 1
from available_item, #temp_table
where available_item.item = #temp_table.item
-- select the processed items...
commit tran get_items
I am wondering whether there is a race condition here. If two separate processes execute this stored procedure at the same time, could they select and mark processed the same data? Or does having it in a transaction stop this?
If not, is there a way to hold locks on selected rows?
Some of the details will depend on your table's locking scheme. Allpages, datapages, and datarows locking have different impacts on your ability to run concurrent updates on a single table. I am assuming a page- or row-level scheme to allow for concurrency.
Your query will take an initial shared page/row lock, which will be upgraded to an update lock and then to an exclusive lock on the updated pages/rows. No other process will be able to change the selected pages/rows for the duration of the transaction, but another process could read the selected rows before your update, which could lead to some inconsistency.
To guard against this, you can set the isolation level of the transaction to either isolation level 2 (repeatable read) or isolation level 3 (serializable). You may want to read up on the specifics of each level to decide which one you want to enforce, and on the trade-offs associated with it.
In your transaction, you would use it like this:
set rowcount @count
set transaction isolation level 2
...
Note that, depending on the number of records your query touches, you could trigger a lock escalation that prevents concurrent transactions from executing even if they are not looking at the same rows as your first transaction. By default, the server will attempt to escalate to a table lock once it has acquired locks on more than 200 pages/rows. This threshold can be changed to an explicit value, or to a range of values plus a percentage, and is configurable at the server, database, or table level.
Relevant Documentation:
Transaction: Maintaining Data Consistency and Recovery
Performance and Tuning Series: Locking and Concurrency Control
Transact-SQL Users Guide 15.7 > Transactions: Maintaining Data Consistency and Recovery

Delete and Insert Inside one Transaction SQL

When two statements are wrapped in one transaction, is the first one always executed first? For example, I have 500k records to delete and 500k to insert; is there a possibility of locking?
I have already tested this query and it works fine, but I want to make sure my assumption is correct.
Note: this deletes and re-inserts the same records, possibly with updated values in other columns.
BEGIN TRAN;
DELETE FROM OutputTable WHERE ID IN (1, 2, 3, 4 /* etc. */);
INSERT INTO OutputTable VALUES (1, 2, 3, 4 /* etc. */);
COMMIT TRAN;
Within a transaction all write locks (all locks acquired for modifications) must obey the strict two phase locking rule. One of the consequences is that a write (X) lock acquired in a transaction cannot be released until the transaction commits. So yes, the DELETE and INSERT will execute sequentially and all locks acquired during the DELETE will be retained while executing the INSERT.
Keep in mind that deleting 500k rows in a transaction will escalate the locks to one table lock, see Lock Escalation.
Deleting 500k rows and inserting 500k rows in a single transaction, while possibly correct, is a bad idea. You should avoid such large units of work and long transactions if possible. Long transactions pin the log in place, create blocking and contention, increase recovery and database startup time, and increase SQL Server resource consumption (locks require memory).
You should consider doing the operation in small batches (perhaps 10,000 rows at a time), using MERGE instead of DELETE/INSERT (if possible) and, last but not least, a partitioned sliding window implementation; see How to Implement an Automatic Sliding Window in a Partitioned Table.
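The batching suggestion could look roughly like the sketch below. The table names dbo.OutputTable and dbo.IdsToReplace are hypothetical (a target table and a staging table holding the IDs to remove), and the batch size is only the figure mentioned above.

```sql
DECLARE @BatchSize INT = 10000;
DECLARE @Rows INT = 1;

WHILE @Rows > 0
BEGIN
    BEGIN TRANSACTION;

    -- Delete at most @BatchSize matching rows per iteration.
    DELETE TOP (@BatchSize)
    FROM dbo.OutputTable
    WHERE ID IN (SELECT ID FROM dbo.IdsToReplace);

    -- Capture the count before COMMIT resets @@ROWCOUNT.
    SET @Rows = @@ROWCOUNT;

    COMMIT TRANSACTION;
END;
```

Each batch commits on its own, so locks and log space are released as the loop progresses. The trade-off is that the overall delete is no longer atomic: a failure part-way through leaves some rows already removed.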
From the documentation on TRANSACTION (emphasis mine):
BEGIN TRANSACTION represents a point at which the data referenced by a
connection is logically and physically consistent. If errors are
encountered, all data modifications made after the BEGIN TRANSACTION
can be rolled back to return the data to this known state of
consistency. Each transaction lasts until either it completes without
errors and COMMIT TRANSACTION is issued to make the modifications a
permanent part of the database, or errors are encountered and all
modifications are erased with a ROLLBACK TRANSACTION statement.
BEGIN TRANSACTION starts a local transaction for the connection
issuing the statement. Depending on the current transaction isolation
level settings, many resources acquired to support the Transact-SQL
statements issued by the connection are locked by the transaction
until it is completed with either a COMMIT TRANSACTION or ROLLBACK
TRANSACTION statement. Transactions left outstanding for long periods
of time can prevent other users from accessing these locked resources,
and also can prevent log truncation.
Although BEGIN TRANSACTION starts a local transaction, it is not
recorded in the transaction log until the application subsequently
performs an action that must be recorded in the log, such as executing
an INSERT, UPDATE, or DELETE statement. An application can perform
actions such as acquiring locks to protect the transaction isolation
level of SELECT statements, but nothing is recorded in the log until
the application performs a modification action.

Check if table data has changed?

I am pulling the data from several tables and then passing the data to a long running process. I would like to be able to record what data was used for the process and then query the database to check if any of the tables have changed since the process was last run.
Is there a method of solving this problem that should work across all sql databases?
One possible solution that I've thought of is having a separate table that is only used for keeping track of whether the data has changed since the process was run. The table contains a "stale" flag. When I start running the process, stale is set to false. If any creation, update, or deletion occurs in any of the tables on which the operation depends, I set stale to true. Is this a valid solution? Are there better solutions?
One concern with my solution is situations like this:
One user starts inserting a new row into one of the tables. Stale gets set to true, but the new row has not actually been added yet. Another user has simultaneously started the long running process, pulling the data from the tables and setting the flag to false. The row is finally added. Now the data used for the process is out of date but the flag indicates it is not stale. Would transactions be able to solve this problem?
EDIT:
This is some SQL for my idea. Not sure if it works, but just to give you a better idea of what I was thinking:
-- First transaction reads the data and sets the flag to false
BEGIN TRANSACTION
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
UPDATE flag SET stale = false
SELECT * FROM DATATABLE
COMMIT TRANSACTION
-- Second transaction updates the data and sets the flag to true
BEGIN TRANSACTION
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
UPDATE data SET val = 15 WHERE ID = 10
UPDATE flag SET stale = true
COMMIT TRANSACTION
I do not have much experience with transactions or with writing SQL by hand, so there are probably issues with this. From what I understand, two serializable transactions cannot be interleaved. Please correct me if I'm wrong.
Is there a way to accomplish this with only the first transaction? The process will be run rarely, but the updates to the data table will occur more frequently, so it would be nice to not lock up the data table when performing updates.
Also, is the SET TRANSACTION ISOLATION syntax specific to MS?
The stale flag will probably work, but a timestamp would be better since it provides more metadata about the age of the records which could be used to tune your queries, e.g., only pull data that is over 5 minutes old.
To address your concern about inserting a row at the same time a query is run, transactions with an appropriate isolation level will help. For row inserts, updates, and selects, at least use a transaction with an isolation level that prevents dirty reads so that no other connections can see the updated data until the transaction is committed.
If you are strongly concerned about the case where an update happens at the same time as a record pull, you could use the REPEATABLE READ or even SERIALIZABLE isolation levels, but this will slow DB access down.
Your SQL Server sample should work. For other databases, here's an example that works in PostgreSQL:
Transaction 1
BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE;
-- run queries that update the tables, then set last_updated column
UPDATE sometable SET last_updated = now() WHERE id = 1;
COMMIT;
Transaction 2
BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE;
-- select data from tables, then set last_queried column
UPDATE sometable SET last_queried = now() WHERE id = 1;
COMMIT;
If transaction 1 starts, and transaction 2 then starts before transaction 1 has completed, transaction 2 will block on the update and then throw an error when transaction 1 commits. If transaction 2 starts first, and transaction 1 starts before it has finished, then transaction 1 will error. Your application code or process should be able to handle those errors.
Other databases use similar syntax - MySQL (with InnoDB plugin) requires you to set the isolation level before you start the transaction.
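For MySQL, the ordering described above would look something like this sketch, which reuses the sometable and last_queried names from the PostgreSQL example:

```sql
-- MySQL/InnoDB: the isolation level is set first and applies to the
-- next transaction started in this session.
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
START TRANSACTION;
-- select data from tables, then set last_queried column
UPDATE sometable SET last_queried = NOW() WHERE id = 1;
COMMIT;
```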

waitfor problem in SQL Server

while 1 = 1
begin
waitfor time @timeToRun
begin
/*delete some records from table X*/
end
end
In the code above, will SQL Server lock table X during the wait? I would like to insert records into table X during this wait time. Is that possible?
All write operations acquire X locks on the rows being updated (or deleted), and these X locks are held until the transaction commits. If no transaction is explicitly specified, every statement runs in an implicit transaction that commits automatically at the end of the statement.
So the answer to your question depends on whether you call this in the context of an existing transaction. If not (and assuming you do not start a transaction in the inner begin ... end block and leave it open), then no lock will be held during the wait. If the code runs in the context of an existing transaction (e.g. a TransactionScope in the client, started automatically by the WCF service behavior), then any lock placed by the delete will be held while you wait, until the transaction is committed.
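A sketch of the loop with the transaction scoped to the delete alone, so that nothing is held open across the wait. The table dbo.X, its CreatedAt column, the 30-day cutoff, and the run time are hypothetical placeholders.

```sql
DECLARE @timeToRun CHAR(8) = '23:00:00';  -- hypothetical daily run time

WHILE 1 = 1
BEGIN
    -- No transaction is open here, so no locks are held while waiting.
    WAITFOR TIME @timeToRun;

    BEGIN TRANSACTION;
    DELETE FROM dbo.X
    WHERE CreatedAt < DATEADD(DAY, -30, GETDATE());
    COMMIT TRANSACTION;  -- X locks taken by the DELETE are released here
END;
```

If this batch is itself called from inside an outer transaction, the inner COMMIT only decrements @@TRANCOUNT and the locks stay held until the outer transaction commits, which is the second case described above.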
Two part question:
1) In the code above, will SQL Server lock table X during the wait?
No. It may lock the rows, but not the table.
2) I would like to insert records into table X during this wait time. Is it possible?
Yes, but the inserted rows will be locked until you commit.
Note: you will want to wrap any work you are doing in a transaction with a BEGIN/COMMIT block. This will avoid the locking issue entirely.

SQL Server Insert query for a forum

Considering a forum table and many users simultaneously inserting messages into it, how safe is this transaction?
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
BEGIN TRANSACTION
DECLARE @LastMessageId SMALLINT
SELECT @LastMessageId = MAX(MessageId)
FROM Discussions
WHERE ForumId = @ForumId AND DiscussionId = @DiscussionId
INSERT INTO Discussions
(ForumId, DiscussionId, MessageId, ParentId, MessageSubject, MessageBody)
VALUES
(@ForumId, @DiscussionId, @LastMessageId + 1, @ParentId, @MessageSubject, @MessageBody)
IF @@ERROR = 0
BEGIN
COMMIT TRANSACTION
RETURN 0
END
ROLLBACK TRANSACTION
RETURN 1
Here I read the last MessageId and increment it. I can't use an identity field because the value needs to be incremented for every message inserted in a group (not for every message inserted into the table).
Your transaction should be quite safe indeed - check out the MSDN docs on the SERIALIZABLE transaction level:
SERIALIZABLE
Specifies the following:
Statements cannot read data that has been modified but not yet committed by other transactions.
No other transactions can modify data that has been read by the current transaction until the current transaction completes.
Other transactions cannot insert new rows with key values that would fall in the range of keys read by any statements in the current transaction until the current transaction completes.
Range locks are placed in the range of key values that match the search conditions of each statement executed in a transaction. This blocks other transactions from updating or inserting any rows that would qualify for any of the statements executed by the current transaction. This means that if any of the statements in a transaction are executed a second time, they will read the same set of rows. The range locks are held until the transaction completes. This is the most restrictive of the isolation levels because it locks entire ranges of keys and holds the locks until the transaction completes. Because concurrency is lower, use this option only when necessary. This option has the same effect as setting HOLDLOCK on all tables in all SELECT statements in a transaction.
The main problem with this transaction isolation level is that it's a pretty heavy load on the server, and serializes (as the name implies) any access, so your server performance and scalability will suffer, e.g. with very high numbers of users, you'll possibly get lots of timeouts for users waiting for a transaction to finish.
So using the more lightweight approach of a global message id as INT IDENTITY is definitely much better!
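If per-discussion numbering has to stay, one possible variation (not what the answer above proposes) is to take an update lock on the key range when reading the maximum, so that two concurrent inserters queue up instead of both holding shared range locks and then deadlocking on the insert. This is only a sketch: the procedure name and the parameter types are assumptions, and ISNULL is added because MAX returns NULL for the first message in a discussion.

```sql
CREATE PROCEDURE dbo.AddMessage          -- hypothetical name
    @ForumId        INT,                 -- parameter types are assumptions
    @DiscussionId   INT,
    @ParentId       SMALLINT,
    @MessageSubject NVARCHAR(200),
    @MessageBody    NVARCHAR(MAX)
AS
BEGIN
    SET NOCOUNT ON;
    SET XACT_ABORT ON;  -- a run-time error rolls back the whole transaction

    DECLARE @LastMessageId SMALLINT;

    BEGIN TRANSACTION;

    -- UPDLOCK + HOLDLOCK: serializable range lock in update mode,
    -- held until the transaction ends.
    SELECT @LastMessageId = ISNULL(MAX(MessageId), 0)
    FROM Discussions WITH (UPDLOCK, HOLDLOCK)
    WHERE ForumId = @ForumId AND DiscussionId = @DiscussionId;

    INSERT INTO Discussions
        (ForumId, DiscussionId, MessageId, ParentId, MessageSubject, MessageBody)
    VALUES
        (@ForumId, @DiscussionId, @LastMessageId + 1, @ParentId, @MessageSubject, @MessageBody);

    COMMIT TRANSACTION;
    RETURN 0;
END;
```

An index on (ForumId, DiscussionId, MessageId) would keep the locked range narrow; without one, the range lock may cover far more of the table.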