I'm reading "SQL Server fundamentals" and don't understand why if I have an intent lock on a table, I can still have exclusive or shared locks on it's sub-parts. Can someone explain?
I mean we lock an entire table and don't want it to be modified, but here comes an exclusive lock and modifies this table's row. I obviously don't understand something, so, please explain to me how this thing works, thanks!
Below is the table that the author provides in his book.
When we modify rows in a table, we can't modify the same rows by other transactions, and non-intent locks guarantee that.
However, an intent lock on a table is compatible with an exclusive lock on parts of a table because the purpose of an intent lock is to protect modifications that modify all of data.
One example would be when you are reading rows of table A, so you have shared locks on those rows and an intent shared lock on that table, intent shared lock protects from a situation where another transaction can obtain an exclusive lock on that table A and try to delete it(that is a modification of all of data). Since the shared locks on rows were first to start reading rows' data, it is logical to allow that operation to finish first.
Related
I have 8 tables. A multi threaded simultaneous process, using entity framework, that first deletes the master record causing a cascaded delete to child records and then reinserts the data, no updates.
Because of the way the keys are aligned, inserts and deletes happen in a specific order. To be precise in the exact opposite order, also some records might not have data for a given table. (I think) when the locks are escalated from row to table, for whatever reason that SQL thinks it's right, it causes deadlock, which makes perfect sense. I am guessing the more sparse data I will have the less this will happen since escalation will be on row.
Is there a named term for this?
I am looking for a solution that does not interfere with SQL plan as well as SQL locking but its done via data process while keeping the max degree of parallelism.
So I have a rather large table (150 million rows) that data scrub queries get run on nightly. Now these queries don't update a lot of records, but to get the records needed, that have to query that single table multiple times in sub queries, which takes some time.
So, would it be better for me to do a normal update statement, or would it be better if I put the few results I needed in a temp table, and then just did an update for those few rows, which would greatly reduce the locks during update.
I'm unsure how an update statement locks work when most of the time is spent querying. If it is going to only update 5 records, and runs for half and hour, will it release a record that it updated in the first minute, or does it wait till the end of the query?
Thanks
You need to use (and look into) into the ROWLOCK table hint. You can use it with the update statement while updating in batches of 5000 rows of less. This will attempt to place row locks in the target table (or on index keys, if a covering index is present). If for some reason that fails, the lock will be escalated to a table lock.
From MSDN (as for reasons why lock escalation might occur):
When the Database Engine checks for possible escalations at every 1250
newly acquired locks, a lock escalation will occur if and only if a
Transact-SQL statement has acquired at least 5000 locks on a single
reference of a table. Lock escalation is triggered when a Transact-SQL
statement acquires at least 5,000 locks on a single reference of a
table. For example, lock escalation is not triggered if a statement
acquires 3,000 locks in one index and 3,000 locks in another index of
the same table. Similarly, lock escalation is not triggered if a
statement has a self join on a table, and each reference to the table
only acquires 3,000 locks in the table.
Actually, there's more to read in this last article. You should have a look at mixed lock type escalation section.
Can a race condition occur in sql under these conditions?
If I have this SQL update running in one thread call it statement 1:
Update Items
Set Flag = B
where Flag = A;
And this SQL update running in another call it statement 2:
Update Items
Set Flag = C
where Flag = A;
Is it possible for each thread to read the same record where Flag is equal to A and write the record with their own values? Such that statement 1 can write it first and then statement 2 writes it or visa versa?
The answer to this question depends on when the database exclusively locks the update. Does it happen before it finds the records or after it finds the records and evaluates the where clause?
First, there are three lock contexts:
Database level lock
Table level lock
Row level lock
Then you have four lock modes:
IX
IS
X
S
IX and IS locks are "intention" locks. These locks are held before acquiring other types of locks. X locks are exclusive (write) locks and S locks are shared (read) locks.
The locks (IX,IS,X or S) locks can be taken at any context level. An X lock at the database level will block all other operations in the database for example. This is the type of lock that SQLlite takes. An S lock is taken for the entire database during reads, and an X lock is taken for the entire database during writes. Writes will wait for any S locks to complete and will block new S and X locks until the write lock is released. This provides a serializable isolation transaction level.
For MySQL, the locking depends on the storage engine. MyISAM will take X and S locks on entire (sets of) tables. X locks will wait on existing S or X locks and block new locks. New X locks will be given higher priority in the queue, moved ahead of new S locks. This behavior can be changed by setting LOW_PRIORITY_UPDATES, which could result in write starvation because writes will be de-prioritized in favor of reads.
It is possible in MySQL to obtain an X lock over the entire database using 'FLUSH TABLES WITH READ LOCK'.
InnoDB locks rows as they are encountered via an index read. InnoDB locks index records and locks the records when the index records are traversed. InnoDB uses special locks called 'gap' locks to ensure REPEATABLE-READ transaction isolation level. Locks are held on index entries, so if a table is not well indexed for an UPDATE query, then many rows will be locked. Note that InnoDB does not create S locks for normal SELECT queries. It uses row versioning, not row level locking for consistent snapshots.
When acquiring X locks, the database needs to detect deadlocks. Consider the following:
>connection 1
start transaction;
update T set c = c + 1 order by id asc;
>connection 2
start transaction;
update T set c = c - 1 order by id desc;
In a row locking model, these two statements can not both complete successfully. The first would wait forever to acquire locks the second holds, and vice-versa. The database will pick one of the connections to roll back. InnoDB will pick the connection which has made the fewest number of changes. MyISAM will lock the whole table for whichever connection acquires the lock first, and then the second will run after the first completes.
The simple example given by you will be resolved by X locks at any context (database, table or row). If two connections begin at exactly the same type, both running two updates which try to update the same row, both will attempt to acquire an X lock. Only one connection can acquire the X lock. It is not possible to determine exactly which one will acquire the lock. The other connection will have to wait until the lock is released until it can acquire the X lock. Keep in mind, that if the row was locked by a DELETE or UPDATE, then the waiter might end up not acquiring a lock after waiting, because there is nothing left in the database to lock.
In your example, the first UPDATE to acquire the X lock, and the second UPDATE will then wait on the X lock and will eventually execute but not match any rows.
Exclusive lock, used for data-modification operations, such as INSERT, UPDATE, or DELETE will be used in this scenario.
An exclusive lock ensures that multiple updates cannot be made to the same resource at the same time.
You will not get a race condition in this scenario.
If you have a more complex scenario involving multiple tables then you may get race conditions, or deadlocks. There are many ways to avoid this, simplifying and separating queries, etc.
You can also apply hints to queries that tell SQL what type of lock to use.
http://msdn.microsoft.com/en-us/library/aa213026(v=sql.80).aspx
Sounds like you should read about locking. SQL server has a complex set of logic and will perform either table or row level locks based on the number of rows it estimates will require updates. Unless you specifically tell it which you want it to perform it can even vary from query to query. Usually if you are modifying a small subset of the table it will choose a row level lock.
SQL Server is designed with ACID in mind, thus it writes changes to its logs before performing any actual updates to the data. This allows any failed updates to be rolled back and allows consistency between queries (like your asking about). You can perform dirty reads to get around locking issues, however you cannot prevent SQL Server from locking inserted, updated and/or deleted records.
SQL Server Locking
EDIT: Here is an article about ACID.
ACID - Wikipedia
All SQL databases pretty much guarantee that such a collision will not occur. "When" locking occurs depends on whether locking is at the table, partition, page, or row level. Or, whether you have turned off such locking in your database.
What can happen, if you have concurrent update statements and multiple rows being updated, is that sone row are updated with the first, some with the second.
In general, I think of the where clause as being evaluated to select the row set, lock the rows one at a time, do the update and unlock. However, this depends on the type of locking. In this case, the scenario above would continue with the values flipping.
If you are concerned about this situation, use table level locking to force serialization when concurrent update requests are being processed.
I have a database table that I use as a queue system, where separate process that talk to each other create and read entries in the table. For example, when a user initiates a search an entry is created, then another process that runs every second or two will pick up that new entry, update the status and then do a search, updating the entry again when the search is complete. This all seems to work well with thousands of searches per hour.
However, I have a master admin screen that lets me view the status of all of these 'jobs' but it runs very slowly. I basically return all entries in the table for the last hour so I can keep an eye on what's going on. I think that I am running into lock issues of some sort. I only need to read each entry, and don't really care if it the data is a little bit out of date. I just use a standard 'Select * from Table' statement so maybe it is waiting for other locks to expire before returning data as the jobs are constantly updating the data.
Would this be handled better by a certain kind of cursor to return each row one at a time, etc? Any other ideas?
Thanks
If you really don't care if the data is a bit out of date... or if you only need the data to be 99.99% accurate, consider using WITH (NOLOCK):
SELECT * FROM Table WITH (NOLOCK);
This will instruct your query to use the READ UNCOMMITTED ISOLATION LEVEL, which has the following behavior:
Specifies that dirty reads are allowed. No shared locks are issued to
prevent other transactions from modifying data read by the current
transaction, and exclusive locks set by other transactions do not
block the current transaction from reading the locked data.
Be aware that NOLOCK may cause some inaccuracies in your data, so it probably isn't a good idea to use it throughout the rest of your system.
You need FROM yourtable WITH (NOLOCK) table hint.
You may also want to look at transaction isolation in your update process, if you aren't already
An alternative to NOLOCK (which can lead to very bad things, such as missed rows or duplicated rows) is to allow read committed snapshot isolation at the database level and then issue your query with:
SET TRANSACTION ISOLATION LEVEL SNAPSHOT;
Question is a follow up to this.
The SQL in question was
UPDATE stats SET visits = (visits+1)
And the question is, for the purpose of performance, does it matter if you lock all rows in stats in comparison to locking the table stats? Or, if the database uses a page-lock rather than a table/row lock?
There is no predicate on this. Any self respecting DB engine should work this out and realise all rows need updated.
Generally, don't second guess the DB engine: performance is subjectively the same.
Personally,
I'd not use table or locking hints unless I have to and know why I'm doing it.
I'd not issue a query like this anyway from an application without a WHERE clause
In theory you should lock the table, because 1 lock is cheaper than 1M locks.
Many DBs, though, will promote locks for operations like this. As they see the locks expanding, they'll automatically promote to page and table locks.
But, as with anything, "it depends", and it's better to be specific and lock the table yourself.
Edit:
sigh
Postgres example:
LOCK TABLE mytable IN EXCLUSIVE MODE;
UPDATE mytable SET field = field + 1;
COMMIT;
Here's the deal. This is going to happen ANYWAY, the LOCK TABLE command makes it more explicit, and ensures that your intent, locking the table, is clear and capable before the process takes place.
Would I do this on a 10 row table? No.
Would I do this on a database that I KNEW I had exclusive access to? No, there's no need.
Would I do this on a operational database with a table with a large amount a rows? You bet.