Single statement HOLDLOCK - locking

I have some legacy code in Sybase ASE, and one statement contains HOLDLOCK. These hold locks sometimes cause deadlocks, so I want to get rid of them. At the same time, I don't understand what purpose HOLDLOCK serves in this statement, and I wonder whether I can simply remove it and everything would be OK.
Statement is
INSERT INTO #temptable
SELECT
    T2.field1,
    T1.field1,
    T1.field2
FROM T1 HOLDLOCK
JOIN T2 ON T1.field3 = T2.field3
WHERE some_conditions
UNION
SELECT
    T2.field1,
    T1.field1,
    T1.field2
FROM T1 HOLDLOCK
JOIN T2 ON T1.field3 = T2.field3
WHERE some_other_conditions
While this code runs, other code that updates and inserts into T1 may run concurrently.
When I read about HOLDLOCK in the documentation, it says that isolation level 3 (or HOLDLOCK) prevents phantom rows and phantom updates. That matters when a multi-statement transaction needs to hold locks on data pages so that those pages cannot change between two of its statements.
But here there is only a single statement in one transaction, and I don't understand what HOLDLOCK is trying to overcome. Maybe the UNION creates a chance to read two different versions of the same row from the one table?
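For context, in ASE a HOLDLOCK hint applies isolation level 3 behaviour to the hinted table for the duration of that single statement only. A hedged sketch of the session-level equivalent, shown purely to illustrate the hint's scope (assuming ASE syntax; session-level isolation applies to all tables in the statement, whereas the hint applies only to T1):

```sql
-- assumption: Sybase ASE; level 1 is ASE's default isolation level
set transaction isolation level 3
-- run the INSERT ... SELECT from above, without the HOLDLOCK hints
set transaction isolation level 1
```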

Related

INSERT INTO SELECT with LEFT JOIN not preventing duplicates for simultaneous hits

I have this SQL query that inserts records from one table to another without duplicates.
It works fine if I call this SQL query from one instance of my application. But in production the application is horizontally scaled: more than one instance calls the query below simultaneously, and that causes duplicate records. Is there any way to fix this query so that it tolerates simultaneous hits?
INSERT INTO table1 (col1, col2)
SELECT DISTINCT TOP 10
t2.col1,
t2.col2
FROM
table2 t2
LEFT JOIN
table1 t1 ON t2.col1 = t1.col1
AND t2.col2 = t1.col2
WHERE
t1.col1 IS NULL
The corrective action here depends on the behavior you want. If you intend to allow just a single instance of your application to execute this query at a time, then you need to create a critical section that only one instance is allowed to enter. Since you are already using SQL Server, you could implement this by forcing each instance to get a lock on a certain resource. Only the instance that gets the lock will execute the query, and the others will drop off.
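One way to build that critical section in SQL Server is an application lock via sp_getapplock (my assumption of a concrete mechanism; the lock name is invented); a sketch:

```sql
-- hedged sketch: only the instance that acquires the app lock runs the insert
BEGIN TRAN;
DECLARE @rc int;
EXEC @rc = sp_getapplock
    @Resource    = 'table1_dedup_insert',  -- arbitrary lock name
    @LockMode    = 'Exclusive',
    @LockTimeout = 0;                      -- fail immediately instead of waiting
IF @rc >= 0  -- 0 = granted, 1 = granted after wait; negative = not granted
BEGIN
    INSERT INTO table1 (col1, col2)
    SELECT DISTINCT TOP 10 t2.col1, t2.col2
    FROM table2 t2
    LEFT JOIN table1 t1
        ON t2.col1 = t1.col1 AND t2.col2 = t1.col2
    WHERE t1.col1 IS NULL;
END
COMMIT;  -- the transaction-owned app lock is released here
```

With @LockTimeout = 0, losing instances return immediately rather than queueing behind the winner.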
If, on the other hand, you really want each instance to execute the query, then you should use a serializable transaction. Using a serializable transaction will ensure that only one instance can do the insert on the table at a given time. It would not be possible for two or more instances to interleave and execute the same insert.
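A sketch of what the serializable approach can look like, here expressed as table hints that give the anti-join serializable semantics for just this statement (UPDLOCK + HOLDLOCK is one common way to write it; wrapping the statement in SET TRANSACTION ISOLATION LEVEL SERIALIZABLE is the other):

```sql
-- hedged sketch: HOLDLOCK takes key-range locks so the "row absent" result
-- stays true until commit; UPDLOCK makes concurrent checkers queue up
-- instead of both passing the check and then deadlocking
INSERT INTO table1 (col1, col2)
SELECT DISTINCT TOP 10 t2.col1, t2.col2
FROM table2 t2
LEFT JOIN table1 t1 WITH (UPDLOCK, HOLDLOCK)
    ON t2.col1 = t1.col1 AND t2.col2 = t1.col2
WHERE t1.col1 IS NULL;
```

Note that under plain SERIALIZABLE without UPDLOCK, two instances can both take shared range locks and then deadlock on the insert, so at least one of them needs retry logic.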

Retrieve UNLOCKED records

In SSMS, in one session, I acquired an exclusive lock on table1 for a specific record, as below.
Session1
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
BEGIN TRAN
SELECT * FROM TABLE1 WITH (XLOCK,ROWLOCK)
WHERE (FIELD1+FIELD2) = ('0101R001');
In another session (Session 2), how can I retrieve only the unlocked records from table1?
When used with READPAST as below, the results are inconsistent (it displays all records). Is there an alternative way to identify only the unlocked records in table1?
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
SELECT * FROM TABLE1 WITH (READPAST)
Before modification, we acquire an exclusive lock, do some processing on the fetched values, and then issue an update statement followed by a commit. During this time, no other process should change the values of that record. Using XLOCK, we were able to get an exclusive lock on that record; this works. If we change the select statement to WITH (UPDLOCK), other processes can still read that record's values, which is what we want to restrict.
With the exclusive lock in place, if we use (READPAST) together with a WHERE clause on the primary key to check for the specific record, it skips the locked record, which is the expected behaviour. With no WHERE clause, it displays all records, including the locked one.
SELECT * FROM TABLE1 WITH (READPAST)
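As an alternative to READPAST for merely identifying locked rows, you can inspect the lock manager directly (assumption: the session has VIEW SERVER STATE permission); a sketch:

```sql
-- hedged sketch: list row/key-level exclusive locks currently held
-- in this database, with the session holding each one
SELECT resource_type,
       resource_description,   -- hash of the locked key/row
       request_mode,
       request_session_id
FROM sys.dm_tran_locks
WHERE resource_database_id = DB_ID()
  AND request_mode = 'X'
  AND resource_type IN ('KEY', 'RID');
```

Mapping resource_description back to an actual row takes extra work (e.g. via the undocumented %%lockres%% virtual column), so this tells you which rows are locked and by whom rather than directly giving you the unlocked rows.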

Is INSERT ... SELECT an atomic transaction?

I use a query like this:
INSERT INTO table
SELECT * FROM table2 t2
JOIN ...
...
WHERE table2.date < now() - '1 day'::INTERVAL
FOR UPDATE OF t2 SKIP LOCKED
ON CONFLICT (...)
DO UPDATE SET ...
RETURNING *;
My question is about FOR UPDATE OF t2 SKIP LOCKED. Should I use it here? Or will Postgres lock these rows automatically with INSERT SELECT ON CONFLICT till the end of the transaction?
My goal is to prevent other apps from (concurrently) capturing rows with the inner SELECT which are already captured by this one.
Yes, FOR UPDATE OF t2 SKIP LOCKED is the right approach to prevent race conditions with default Read Committed transaction isolation.
The added SKIP LOCKED also prevents deadlocks. Be aware that competing transactions might each get a partial set from the SELECT - whatever each could lock first.
While any transaction is atomic in Postgres, that does not prevent another (also atomic) transaction from selecting (and inserting, or at least trying to insert) the same row, because SELECT without FOR UPDATE does not take an exclusive lock.
The Postgres manual about transactions:
A transaction is said to be atomic: from the point of view of other transactions, it either happens completely or not at all.
Related:
Postgres UPDATE … LIMIT 1
Clarifications:
An SQL DML command like INSERT is always automatically atomic, since it cannot run outside a transaction. But you can't say that INSERT *is* a transaction - that is the wrong terminology.
In Postgres all locks are kept until and released at the end of the current transaction.
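To illustrate how competing transactions partition the rows, here is a hedged sketch against a hypothetical jobs queue table (table and column names are invented for the example):

```sql
-- two workers run this concurrently; SKIP LOCKED makes each claim
-- a disjoint batch instead of blocking or deadlocking
BEGIN;
WITH claimed AS (
    SELECT id
    FROM jobs
    WHERE processed_at IS NULL
    ORDER BY id
    LIMIT 10
    FOR UPDATE SKIP LOCKED
)
UPDATE jobs
SET processed_at = now()
WHERE id IN (SELECT id FROM claimed);
COMMIT;
```

The row locks taken by FOR UPDATE are held until COMMIT, so the second worker skips the first worker's batch and claims the next ten rows.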

Determining threshold for lock escalation

I have a table with around 2.5 million records and will be updating around 700k of them, and I want to do so while still allowing other users to see the data. My update statement looks something like this:
UPDATE A WITH (UPDLOCK,ROWLOCK)
SET A.field = B.field
FROM Table_1 A
INNER JOIN Table2 B ON A.id = B.id
WHERE A.field IS NULL
AND B.field IS NOT NULL
I was wondering whether there is any way to work out at what point SQL Server will escalate the locks taken by an update statement (as I don't want the whole table to be locked)?
I don't have permissions to run a server trace to see how the locks are being applied, so is there any other way of knowing at what point the lock will be escalated to cover the whole table?
Thanks!
According to BOL, once the statement has acquired 5,000 row- or page-level locks on a single instance of an object, an attempt is made to escalate the locks. If this attempt fails because another transaction holds a conflicting lock, it will try again after every additional 1,250 locks are acquired.
I'm not sure whether you can actually take these figures as gospel, or whether there are a few more subtleties than that (I guess you could always hit the instance's lock-memory limit at any number of locks).
As @Martin states, 5,000 is the number BOL gives; however, I've seen the actual number vary in production.
You have two options:
1) Batch your updates and try to keep the batch size under 5,000.
2) Disable lock escalation (be careful) via ALTER TABLE (SQL Server 2008), trace flags 1211/1224 (see "SQL Server 2008: Lock escalation changes"), or other locking tricks.
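A hedged sketch of those escalation controls (SQL Server 2008+ syntax; the trace flags are server-wide, so treat them with care):

```sql
-- per-table: never escalate locks on this table
ALTER TABLE Table_1 SET (LOCK_ESCALATION = DISABLE);

-- server-wide alternatives:
DBCC TRACEON (1211, -1);  -- disable lock escalation entirely
-- or trace flag 1224: disable count-based escalation, but still
-- escalate under lock-memory pressure
```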
Here's a method you can use to systematically determine your threshold. (Assuming you have VIEW SERVER STATE permissions).
DECLARE @BatchSize int;
SET @BatchSize = <Vary this number until you see a table lock taken>;
BEGIN TRAN
UPDATE TOP (@BatchSize) A WITH (UPDLOCK, ROWLOCK)
SET A.field = B.field
FROM Table_1 A
INNER JOIN Table2 B ON A.id = B.id
WHERE A.field IS NULL
AND B.field IS NOT NULL
SELECT *
FROM sys.dm_tran_locks
WHERE [request_session_id] = @@SPID
ROLLBACK
The ROWLOCK hint does not prevent lock escalation; it just tells the server not to assume an initial locking level and to start with row locks.
Those row locks may still be escalated to a table lock later.
To make the table data available for reading during the update, use SNAPSHOT transaction isolation level.
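If you go the batching route instead, here is a hedged sketch of keeping each statement under the ~5,000-lock threshold (batch size chosen for illustration; column names as in the question):

```sql
-- hedged sketch: update in batches small enough to stay below the
-- lock-escalation threshold; each UPDATE commits on its own
DECLARE @rows int = 1;
WHILE @rows > 0
BEGIN
    UPDATE TOP (4000) A
    SET A.field = B.field
    FROM Table_1 A
    INNER JOIN Table2 B ON A.id = B.id
    WHERE A.field IS NULL
      AND B.field IS NOT NULL;
    SET @rows = @@ROWCOUNT;
END
```

Because A.field IS NULL becomes false once a row is updated, each iteration naturally picks up the next unprocessed batch.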

SQL Server locks - avoid insertion of duplicate entries

After reading a lot of articles and many answers related to the above subject, I am still wondering how the SQL Server database engine works in the following example:
Let's assume that we have a table named t3:
create table t3 (a int , b int);
create index test on t3 (a);
and a query as follow:
INSERT INTO T3
SELECT -86,-86
WHERE NOT EXISTS (SELECT 1 FROM t3 where t3.a=-86);
The query inserts a row into table t3 after verifying that the row does not already exist, based on column "a".
Many articles and answers indicate that, with the above query, there is no way a row will be inserted twice.
For the execution of the above query, I assume the database engine works as follows:
The subquery is executed first.
The database engine sets a shared (S) lock on a range.
The data is read.
The shared lock is released. According to MSDN, a shared lock is released as soon as the data has been read.
If the row does not exist, a new row is inserted into the table.
The new row is locked with an exclusive (X) lock.
Now consider the following scenario:
The above query is executed by processor A (SPID 1).
The same query is executed by processor B (SPID 2).
[SPID 1] The database engine sets a shared (S) lock.
[SPID 1] The subquery reads the data. No rows are returned.
[SPID 1] The shared lock is released.
[SPID 2] The database engine sets a shared (S) lock.
[SPID 2] The subquery reads the data. No rows are returned.
[SPID 2] The shared lock is released.
Both processes proceed with the row insertion (and we get a duplicate entry).
Am I missing something? Is the above way a correct way for avoiding duplicate entries?
A safe way to avoid duplicate entries is to use the code below, but I am just wondering whether the method above is correct.
begin tran
if not exists (SELECT 1 FROM t3 with (updlock) where t3.a=-86)
begin
INSERT INTO T3
SELECT -86,-86
end
commit
If you just have a unique constraint on the column, you'll never have duplicates.
The technique you've outlined avoids having to catch an error or exception when the second, "simultaneous" operation fails.
I'd like to add that relying on "outer" code (even T-SQL) to enforce your database consistency is not a great idea. In all cases, using declarative referential integrity at the table level is important for the database to ensure consistency and matching expectations, regardless of whether application code is written well or not. As in security, you need to utilize a strategy of defense in depth - constraints, unique indexes, triggers, stored procedures, and views can all assist in making a multi-layered approach to ensure the database presents a consistent and reliable interface to the application or system.
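A sketch of the declarative approach for this example (the index name is invented):

```sql
-- the database itself now guarantees no duplicates on column a
CREATE UNIQUE INDEX ux_t3_a ON t3 (a);

-- optionally, silently discard duplicate inserts instead of raising
-- an error (SQL Server syntax):
-- CREATE UNIQUE INDEX ux_t3_a ON t3 (a) WITH (IGNORE_DUP_KEY = ON);
```

With the plain unique index, a concurrent duplicate insert fails with a duplicate-key error, which the application catches and treats as "already exists".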
To keep locks between multiple statements, they have to be wrapped in a transaction. In your example:
If not exists (SELECT 1 FROM t3 with (updlock) where t3.a=-86)
INSERT INTO T3 SELECT -86,-86
The update lock can be released before the insert is executed. This would work reliably:
begin transaction
If not exists (SELECT 1 FROM t3 with (updlock) where t3.a=-86)
INSERT INTO T3 SELECT -86,-86
commit transaction
Single statements are always wrapped in a transaction, so this would work too:
INSERT INTO T3 SELECT -86,-86
WHERE NOT EXISTS (SELECT 1 FROM t3 with (updlock) where t3.a=-86)
(This is assuming you have "implicit transactions" turned off, like the default SQL Server setting.)
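One caveat worth adding (my addition, not part of the answer above): UPDLOCK can only lock rows that exist. If no row with a = -86 exists yet, there is nothing to lock, and under the default READ COMMITTED level two sessions can still both pass the NOT EXISTS check. Adding HOLDLOCK takes a key-range lock on the index even when no matching row is found:

```sql
-- hedged sketch: the range lock from HOLDLOCK blocks a concurrent
-- session from inserting a = -86 between the check and the insert
INSERT INTO T3
SELECT -86, -86
WHERE NOT EXISTS (
    SELECT 1 FROM t3 WITH (UPDLOCK, HOLDLOCK) WHERE t3.a = -86
);
```

This relies on the unique-ish index on column a (the "test" index from the setup) so the range lock has a key range to attach to.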