INSERT and UPDATE blocking other sessions' DML operations - SQL Server

I am inserting data into a table from XML, and when the XML is large the insert takes a long time.
The problem is that while the insert runs inside BEGIN TRAN / COMMIT TRAN, other sessions cannot read from or write to the BrokerData table.
I am using the default isolation level. Please guide me on what I should change in the database so that the table is not blocked while one session inserts or updates it, and users in other sessions can still run DML against BrokerData.
BEGIN TRAN
INSERT INTO BrokerData
SELECT col.value('(Section/text())[1]', 'NVARCHAR(MAX)') AS Section
,col.value('(LineItem/text())[1]', 'NVARCHAR(MAX)') AS LineItem
,col.value('(XFundCode/text())[1]', 'NVARCHAR(MAX)') AS XFundCode
,col.value('(StandardDate/text())[1]', 'NVARCHAR(MAX)') AS StandardDate
,col.value('(StandardValue/text())[1]', 'VARCHAR(MAX)') AS StandardValue
,col.value('(ActualProvidedByCompany/text())[1]', 'VARCHAR(MAX)') AS ActualProvidedByCompany
FROM @BogyXML.nodes('/Root/PeriodicalData') AS tab (col) -- assuming @BogyXML is an XML variable; .nodes() cannot be called directly on a temp table
COMMIT TRAN

First, it is good to know that the default isolation level in SQL Server is READ COMMITTED.
READ COMMITTED uses pessimistic locking to protect data. With this
isolation level set, a transaction cannot read uncommitted data that
is being added or changed by another transaction. A transaction
attempting to read data that is currently being changed is blocked
until the transaction changing the data releases the lock. A
transaction running under this isolation level issues shared locks,
but releases row or page locks after reading a row.
So, in your case the isolation level is not the only problem. Loading XML is a heavy operation, and the best practice is to load the data into a staging table first.
A staging table can be:
Temporary table stored on disk
Temporary table in memory
Memory-optimized table
In your case probably the best option is a memory-optimized table, which has two durability modes: SCHEMA_AND_DATA and SCHEMA_ONLY. If you want the faster mode and you do not need the data persisted to disk, choose SCHEMA_ONLY: non-durable memory-optimized tables do not incur logging overhead, so transactions writing to them run faster than write operations on durable tables.
The In-Memory OLTP feature built into SQL Server 2016 adds a new
memory-optimized relational data management engine and a native stored
procedure compiler to the platform that you can use to run
transactional workloads with higher concurrency.
There is a little configuration required to use In-Memory OLTP, and if you are not in a position to use this feature, a temporary table can be a replacement, although not as good as a memory-optimized table.
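As a sketch only, a possible SCHEMA_ONLY staging table for this data could look like the following. It assumes the database already has a MEMORY_OPTIMIZED_DATA filegroup (required for In-Memory OLTP), and the column sizes are guesses rather than your real schema:
-- Staging table kept entirely in memory; data is lost on restart (SCHEMA_ONLY),
-- which is fine because it only holds the shredded XML temporarily.
CREATE TABLE dbo.BrokerData_Staging
(
    StagingId               INT IDENTITY(1,1) NOT NULL PRIMARY KEY NONCLUSTERED,
    Section                 NVARCHAR(4000) NULL,
    LineItem                NVARCHAR(4000) NULL,
    XFundCode               NVARCHAR(4000) NULL,
    StandardDate            NVARCHAR(4000) NULL,
    StandardValue           VARCHAR(8000)  NULL,
    ActualProvidedByCompany VARCHAR(8000)  NULL
)
WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_ONLY);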
Once the data is loaded and held in SQL Server's memory, you can start updating your real table, which in your case is BrokerData.
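As a sketch, assuming @BogyXML is an XML variable and that the BrokerData column names match the aliases in your query, the two-step load could look like this:
-- Step 1: shred the XML into the staging table; no locks are taken on BrokerData here.
INSERT INTO dbo.BrokerData_Staging
        (Section, LineItem, XFundCode, StandardDate, StandardValue, ActualProvidedByCompany)
SELECT  col.value('(Section/text())[1]',                 'NVARCHAR(4000)')
       ,col.value('(LineItem/text())[1]',                'NVARCHAR(4000)')
       ,col.value('(XFundCode/text())[1]',               'NVARCHAR(4000)')
       ,col.value('(StandardDate/text())[1]',            'NVARCHAR(4000)')
       ,col.value('(StandardValue/text())[1]',           'VARCHAR(8000)')
       ,col.value('(ActualProvidedByCompany/text())[1]', 'VARCHAR(8000)')
FROM    @BogyXML.nodes('/Root/PeriodicalData') AS tab (col);

-- Step 2: one short, set-based statement against the real table.
-- A single INSERT ... SELECT is atomic on its own, so no explicit BEGIN TRAN is needed
-- and locks on BrokerData are held only for the duration of this statement.
INSERT INTO dbo.BrokerData
        (Section, LineItem, XFundCode, StandardDate, StandardValue, ActualProvidedByCompany)
SELECT  Section, LineItem, XFundCode, StandardDate, StandardValue, ActualProvidedByCompany
FROM    dbo.BrokerData_Staging;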
However, updating the real table can still block readers, and I am pretty sure the whole process will be faster if you prepare the data completely in the staging table and then move it to the production one. If only a small number of queries access the production table, you can use the READ UNCOMMITTED isolation level.
The READ UNCOMMITTED isolation level is the least restrictive setting.
It allows a transaction to read data that has not yet been committed
by other transactions. SQL Server ignores any locks and reads data
from memory. Furthermore, transactions running under this isolation
level do not acquire shared (S) locks to prevent other transactions
from changing the data being read.
For these reasons, READ UNCOMMITTED is never a good choice for line of business applications where accuracy matters most, but might be acceptable for a reporting application where the performance benefit outweighs the need for a precise value.
It is really easy to use the READ UNCOMMITTED isolation level directly in a query:
SELECT *
FROM BrokerData
WITH (NOLOCK)
If the data is really important, you cannot allow your queries to read dirty data, and there is a high probability that some of it will be rolled back, then you should think about READ_COMMITTED_SNAPSHOT. This isolation level is a combination of READ COMMITTED and SNAPSHOT.
The READ_COMMITTED_SNAPSHOT isolation level is an optimistic
alternative to READ COMMITTED. Like the SNAPSHOT isolation level, you
must first enable it at the database level before setting it for a
transaction.
During the transaction, SQL Server copies rows modified by other transactions into a collection of pages in tempdb known as the version store. When a row is updated multiple times, a copy of each change is in the version store. This set of row versions is called a version chain.
If you are going to use READ_COMMITTED_SNAPSHOT, make sure tempdb has plenty of space, because that is where the version store lives.
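Enabling it is a one-time, database-level setting; for example (the database name is illustrative, and WITH ROLLBACK IMMEDIATE forces the exclusive access the ALTER needs by rolling back open transactions):
ALTER DATABASE MyDatabase SET READ_COMMITTED_SNAPSHOT ON WITH ROLLBACK IMMEDIATE;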

Related

Avoid dirty/phantom reads while selecting from database

I have two tables A and B.
My transactions are like this:
Read -> read from table A
Write -> write in table B, write in table A
I want to avoid dirty/phantom reads since I have multiple nodes making requests to the same database.
Here is an example:
Transaction 1 - Update is happening on table B
Transaction 2 - Read is happening on table A
Transaction 1 - Update is happening on table A
Transaction 2 - Completed
Transaction 1 - Rollback
Now Transaction 2 client has dirty data. How should I avoid this?
If your database is not logged, there is nothing you can do. By choosing an unlogged database, those who set it up decided this sort of issue was not a problem. The only way to fix the problem here is change the database mode to logged, but that is not something you do casually on a whim — there are lots of ramifications to the change.
Assuming your database is logged — it doesn't matter here whether it is buffered logging or unbuffered logging or (mostly) a MODE ANSI database — then unless you set DIRTY READ isolation, you are running using at least COMMITTED READ isolation (it will be Informix's REPEATABLE READ level, standard SQL's SERIALIZABLE level, if the database is MODE ANSI).
If you want to ensure that data rows do not change after a transaction has read them, you need to run at a higher isolation level: REPEATABLE READ. (See SET ISOLATION in the manual for the details, and beware of the nomenclature for SET TRANSACTION; there's a section of the manual about Comparing SET ISOLATION and SET TRANSACTION, plus related sections.) The downside to using SET ISOLATION TO REPEATABLE READ (or SET TRANSACTION ISOLATION LEVEL SERIALIZABLE) is that the extra locks needed reduce concurrency, but they give you the best guarantees about the state of the database.
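A minimal sketch of that, using the A and B tables from the question; the id and val columns are purely illustrative:
-- Informix: hold the read locks on A until the transaction ends.
SET ISOLATION TO REPEATABLE READ;

BEGIN WORK;
    SELECT * FROM A WHERE id = 1;              -- shared locks kept until COMMIT/ROLLBACK
    UPDATE B SET val = val + 1 WHERE id = 1;   -- the writes
    UPDATE A SET val = val + 1 WHERE id = 1;
COMMIT WORK;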

How to reduce downtime of table during inserts SQL Server

I have an operative table, call it Ops. The table gets queried by our customers via a web service every other second.
There are two processes that affect the table:
Deleting expired records (daily)
Inserting new records (weekly)
My goal is to reduce downtime to a minimum during these processes. I know Oracle, but this is the first time I'm using SQL Server and T-SQL. In Oracle, I would do a truncate to speed up the first process of deleting expired records and a partition exchange to insert new records.
Partition Exchanges for SQL Server seem a bit harder to handle, because from what I can read, one has to create file groups, partition schemes and partition functions (?).
What are your recommendations for reducing downtime?
A table is not offline because someone is deleting or inserting rows. The table can be read and updated concurrently.
However, under the default isolation level READ COMMITTED, readers are blocked by writers and writers are blocked by readers. This means that a SELECT statement can take longer to complete because a not-yet-committed transaction is locking some rows the SELECT statement is trying to read. The SELECT statement is blocked until the transaction completes. This can be a problem if the transaction takes a long time, since it makes the table appear to be offline.
On the other hand, under READ COMMITTED SNAPSHOT and SNAPSHOT isolation levels readers don't block writers and writers don't block readers. This means that a SELECT statement can run concurrently with INSERT, UPDATE and DELETE statements without waiting to acquire locks, because under these isolation levels SELECT statements don't request locks.
The simplest thing you can do is to enable READ COMMITTED SNAPSHOT isolation level on the database. When this isolation level is enabled it becomes the default isolation level, so you don't need to change the code of your application.
ALTER DATABASE MyDataBase SET READ_COMMITTED_SNAPSHOT ON
If your problem is SELECTs getting blocked, you can try the NOLOCK hint, but be sure to read up on its implications first. See https://www.mssqltips.com/sqlservertip/2470/understanding-the-sql-server-nolock-hint/ for details.
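Applied to the Ops table from the question (schema name assumed), it would look like this:
-- Dirty read: no shared locks are requested, but the result may include
-- uncommitted rows, or miss/duplicate rows that are being changed.
SELECT *
FROM   dbo.Ops WITH (NOLOCK);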

Why does filtering sys.objects with OBJECT_ID(N'tablename') not block, while filtering with name = 'tablename' does block?

Experiment details:
I am running this in Microsoft SQL Server Management Studio.
On one query window I run:
BEGIN TRANSACTION a;
ALTER TABLE <table name>
ALTER COLUMN <column> VARCHAR(1025)
On the other I run:
SELECT 1
FROM sys.objects
WHERE name = '<other_table name>'
Or this:
SELECT 1
FROM sys.objects
WHERE object_id = OBJECT_ID(N'[<other_table name>]')
For some reason the SELECT filtering on name does not return until I commit the transaction.
I am using the transaction to simulate a long-running ALTER COLUMN operation that we sometimes have in our DB, and which I don't want to harm other operations.
This is because you are working under the default transaction isolation level, i.e. READ COMMITTED.
Read Committed
Under the READ COMMITTED transaction isolation level, when you explicitly begin a transaction, SQL Server takes exclusive locks on the resources it modifies in order to maintain data integrity and prevent dirty reads. Once you have begun a transaction and are working with some rows, other users will not be able to see those rows until you commit or roll back. This is SQL Server's default behaviour, but it can be changed with a different transaction isolation level; for instance, under READ UNCOMMITTED you will be able to read rows that are being modified/used by other users, but there is a chance of dirty reads: data that you think is still in the database when another user has already changed it.
My Suggestion
If dirty reads are something you can live with, go ahead with
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
Otherwise, preferably stick to SQL Server's default behaviour and let the user wait for a bit rather than give them dirty/wrong data. I hope it helps.
Edit
Whether the two queries run under the same or different isolation levels does not matter in itself; what matters is the isolation level the reading query is executed under, and it has to be READ UNCOMMITTED if you want other users to see the data while your explicit transaction is still open. That is not recommended and is bad practice; I would personally use snapshot isolation, which makes extensive use of tempdb and only shows the last committed data.
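For reference, a minimal sketch of switching snapshot isolation on and using it in the reading session (the database name is illustrative):
-- One-time setting at the database level.
ALTER DATABASE MyDatabase SET ALLOW_SNAPSHOT_ISOLATION ON;

-- In the reading session: rows are read from the version store in tempdb,
-- so only the last committed data is returned.
SET TRANSACTION ISOLATION LEVEL SNAPSHOT;

SELECT 1
FROM sys.objects
WHERE name = '<other_table name>';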

Confusion about SQL Server snapshot isolation and how to use it

I am a newbie to MS SQL Server, with just a few months' experience maintaining SQL Server SPs. I am reading up on transaction isolation levels to optimise an SP and am quite confused. Please help me with the questions below:
If I run DBCC USEROPTIONS in ClientDB, the default value for isolation level is 'read committed'. Does this mean the DB has READ_COMMITTED_SNAPSHOT set to ON?
Is SET TRANSACTION ISOLATION LEVEL (at the transaction level) the same as SET READ_COMMITTED_SNAPSHOT ON (at the DB level)? By this I mean: if my DB has SNAPSHOT enabled, can I set the isolation level in my SP and process data accordingly?
Is ALLOW_SNAPSHOT_ISOLATION similar to the above?
I have an SP which starts off with a very long-running SELECT statement that dumps its results into a temp table, then uses the temp table to UPDATE/INSERT the base table. About 8 million records are selected and dumped into the temp table, and a similar number of rows are then updated or inserted. The problem we are facing is that this SP takes up too much disk space.
It is a client DB and we do not have permission to check disk space/log size etc. in the DB, so I do not know whether it is tempdb/tempdb log or ClientDB/ClientDB log that is taking up the space. But free disk space can drop by as much as 10 GB at one go! This causes the transaction log to run out of disk space (as the disk is full) and the SP errors out.
If I use the SNAPSHOT isolation level, will this disk usage get even worse, since it uses tempdb to version the data?
What I really want to do is this:
SET the transaction isolation level to SNAPSHOT. Then do the SELECT into the temp table. Then BEGIN TRANSACTION and update/insert the base table for, say, 1 million records, in a loop until all records are processed, then END TRANSACTION. Do you think this is a good idea? Should the initial SELECT be kept out of the TRANSACTION? Will this help reduce the load on the transaction logs?
An isolation level of "read committed" isn't the same thing as setting READ_COMMITTED_SNAPSHOT ON. Setting READ_COMMITTED_SNAPSHOT ON changes what the default "read committed" level means for every query in the database: a query or procedure that then runs under "read committed" uses row versioning (statement-level snapshots) rather than shared locks. See Isolation Levels in the Database Engine and SET TRANSACTION ISOLATION LEVEL in Books Online.
When the READ_COMMITTED_SNAPSHOT database option is set ON, read
committed isolation uses row versioning to provide statement-level
read consistency. Read operations require only SCH-S table level locks
and no page or row locks. When the READ_COMMITTED_SNAPSHOT database
option is set OFF, which is the default setting, read committed
isolation behaves as it did in earlier versions of SQL Server. Both
implementations meet the ANSI definition of read committed isolation.
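You can check what is actually enabled on the client database with a query like this:
-- is_read_committed_snapshot_on = 1 means READ_COMMITTED_SNAPSHOT is ON;
-- snapshot_isolation_state_desc reports the ALLOW_SNAPSHOT_ISOLATION setting.
SELECT  name,
        is_read_committed_snapshot_on,
        snapshot_isolation_state_desc
FROM    sys.databases
WHERE   name = DB_NAME();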
ALLOW_SNAPSHOT_ISOLATION doesn't change the default isolation level. It lets each query or procedure use snapshot isolation if you want it to. Each session that should use snapshot isolation needs to issue SET TRANSACTION ISOLATION LEVEL SNAPSHOT. On big systems, if you want to use snapshot isolation, you probably want this rather than changing the default isolation level with READ_COMMITTED_SNAPSHOT.
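A sketch of that pattern, using the ClientDB name from your question (the body of the transaction is only indicated):
-- Enable the option once at the database level.
ALTER DATABASE ClientDB SET ALLOW_SNAPSHOT_ISOLATION ON;

-- Then opt in explicitly in the session or procedure that should use it.
SET TRANSACTION ISOLATION LEVEL SNAPSHOT;

BEGIN TRAN;
    -- the long-running SELECT into the temp table and the
    -- UPDATE/INSERT of the base table would go here
COMMIT TRAN;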
A database configured to use snapshot isolation does take more disk space.
Think about moving the log files to a bigger disk.

Deadlocks - Will this really help?

So I've got a query that keeps deadlocking on me. People who know the system well can't figure out why the sproc is deadlocking, but they tell me that I should just add this to it:
SET NOCOUNT ON
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED
Is this really a valid solution? What does that do?
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED
This will cause the system to return inconsistent data, including duplicate records and missing records. Read more at Previously committed rows might be missed if NOLOCK hint is used, or at Timebomb - The Consistency problem with NOLOCK / READ UNCOMMITTED.
Deadlocks can be investigated and fixed; it is not a big deal if you follow the proper procedure. Of course, throwing in a dirty read may seem easier, but down the road you'll be sitting long hours staring at your general ledger and wondering why the heck it does not balance debits and credits. So read this again until you really grok it: DIRTY READS ARE INCONSISTENT READS.
If you want a get-out-of-jail card, turn on snapshot isolation:
ALTER DATABASE MyDatabase
SET READ_COMMITTED_SNAPSHOT ON
But keep in mind that snapshot isolation does not fix the deadlocks, it only hides them. Proper investigation of the deadlock cause and fix is always the appropriate action.
NOCOUNT will keep your query from returning row counts to the calling application (e.g. "1000000 rows affected").
TRANSACTION ISOLATION LEVEL READ UNCOMMITTED will allow for dirty reads as indicated here.
The isolation level may help, but do you want to allow dirty reads?
Randomly adding SET options to the query is unlikely to help, I'm afraid.
SET NOCOUNT ON
Will have no effect on the issue.
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED
will prevent your query from taking out shared locks. As well as reading "dirty" data, it can also lead to your query reading the same rows twice, or not at all, depending on what other concurrent activity is happening.
Whether this will resolve your deadlock issue depends on the type of deadlock. It will have no effect at all if the issue is two writers deadlocking due to a non-linear ordering of lock requests (transaction 1 updates row a, transaction 2 updates row b, then tran 1 requests a lock on b and tran 2 requests a lock on a).
Can you post the offending query and deadlock graph (if you are on SQL 2005 or later)?
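If you are not sure how to capture one, a common option (a sketch; it needs sysadmin rights) is trace flag 1222, which writes deadlock details to the SQL Server error log:
-- Turn on detailed deadlock reporting globally (-1).
DBCC TRACEON (1222, -1);

-- ... reproduce the deadlock, then read the graph out of the error log:
EXEC sp_readerrorlog;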
The best guide is:
http://technet.microsoft.com/es-es/library/ms173763.aspx
Snippet:
Specifies that statements can read rows that have been modified by other transactions but not yet committed.
Transactions running at the READ UNCOMMITTED level do not issue shared locks to prevent other transactions from modifying data read by the current transaction. READ UNCOMMITTED transactions are also not blocked by exclusive locks that would prevent the current transaction from reading rows that have been modified but not committed by other transactions. When this option is set, it is possible to read uncommitted modifications, which are called dirty reads. Values in the data can be changed and rows can appear or disappear in the data set before the end of the transaction. This option has the same effect as setting NOLOCK on all tables in all SELECT statements in a transaction. This is the least restrictive of the isolation levels.
In SQL Server, you can also minimize locking contention while protecting transactions from dirty reads of uncommitted data modifications using either:
the READ COMMITTED isolation level with the READ_COMMITTED_SNAPSHOT database option set to ON, or
the SNAPSHOT isolation level.
On a different tack, there are two other aspects to consider that may help.
1) Indexes and the indexes used by the SQL. The indexing strategy used on the tables will affect how many rows are locked. If you make the data modifications using a unique index, you may reduce the chance of deadlocks.
One algorithm, which of course will not work in all cases, is to make the use of NOLOCK targeted rather than global.
The "old" way:
UPDATE dbo.change_table
SET somecol = newval
WHERE non_unique_value = 'something'
The "new" way:
INSERT INTO #temp_table
SELECT uid FROM dbo.change_table WITH (NOLOCK)
WHERE non_unique_value = 'something'
UPDATE dbo.change_table
SET somecol = newval
FROM dbo.change_table c
INNER JOIN
#temp_table t
ON (c.uid = t.uid)
2) Transaction duration
The longer a transaction is open, the more likely contention becomes. If there is a way to reduce the amount of time that records remain locked, you reduce the chances of a deadlock occurring.
For example, perform as many SELECT statements (e.g. lookups) as possible at the start of the code, instead of interleaving them with writes: an INSERT or UPDATE, then a lookup, then an INSERT, and then another lookup.
This is where one can use the NOLOCK hint for SELECTs on "static" tables that are not changing, reducing the lock "footprint" of the code.
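A hypothetical sketch of that shape; every table and column name below is illustrative only:
DECLARE @CustomerId INT,
        @RateCode   CHAR(3);

-- Lookups up front, against "static" reference tables, before any write locks are taken.
SELECT @CustomerId = CustomerId
FROM   dbo.Customer WITH (NOLOCK)
WHERE  CustomerName = 'ACME';

SELECT @RateCode = RateCode
FROM   dbo.RateLookup WITH (NOLOCK)
WHERE  RateName = 'Standard';

-- The transaction now contains only the writes, so its locks are held briefly.
BEGIN TRAN;
    INSERT INTO dbo.OrderHeader (CustomerId, RateCode)
    VALUES (@CustomerId, @RateCode);

    UPDATE dbo.CustomerBalance
    SET    LastOrderDate = GETDATE()
    WHERE  CustomerId = @CustomerId;
COMMIT TRAN;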