My database background is mainly Oracle, but I've recently been helping with some SQL Server work. My group has inherited some SQL server DTS packages that do daily loads and updates of large amounts of data. Currently it is running in SQL Server 2000, but will soon be upgraded to SQL Server 2005 or 2008. The mass updates are running too slowly.
One thing I noticed about the code is that some large updates are done in procedural code in loops, so that each statement only updates a small portion of the table in a single transaction. Is this a sound method to do updates in SQL server? Locking of concurrent sessions should not be an issue because user access to tables is disabled while the bulk loading takes place. I've googled around some, and found some articles suggesting that doing it this way conserves resources, and that resources are released each time an update commits, leading to greater efficiency. In Oracle this is generally a bad approach, and I've used single transactions for very large updates with success in Oracle. Frequent commits slow the process down and use more resources in Oracle.
My question is, for mass updates in SQL Server, is it generally a good practice to use procedural code, and commit many SQL statements, or to use one big statement to do the whole update?
Sorry Guys,
None of the above answer the question. They are just examples of how you can do things. The answer is, more resources get used with frequent commits, however, the transaction log cannot be truncated until a commit point. Thus, if your single spanning transaction is very big it will cause the transaction log to grow and possibly fregment which if undetected will cause problems later. Also, in a rollback situation, the duration is generally twice as long as the original transaction. So if your transaction fails after 1/2 hour it will take 1 hour to roll back and you can't stop it :-)
I have worked with SQL Server2000/2005, DB2, ADABAS and the above is true for all. I don't really see how Oracle can work differently.
You could possibly replace the T-SQL with a bcp command and there you can set the batch size without having to code it.
Issuing frequest commits in a single table scan is prefferable to running multiple scans with small processing numbers because generally if a table scan is required the whole table will be scanned even if you only returning a small subset.
Stay away from snapshots. A snapshot will only increase the number of IOs and competes for IO and CPU
In general, I find it better to update in batches - typically in the range of between 100 to 1000. It all depends on how your tables are structured: foreign keys? Triggers? Or just updating raw data? You need to experiment to see which scenario works best for you.
If I am in pure SQL, I will do something like this to help manage server resources:
SET ROWCOUNT 1000
WHILE 1=1 BEGIN
DELETE FROM MyTable WHERE ...
IF ##ROWCOUNT = 0
BREAK
END
SET ROWCOUNT 0
In this example, I am purging data. This would only work for an UPDATE if you could restrict or otherwise selectively update rows. (Or only insert xxxx number of rows into an auxiliary table that you can JOIN against.)
But yes, try not to update xx million rows at one time. It takes forever and if an error occurs, all those rows will be rolled back (which takes an additional forever.)
Well everything depends.
But ... assuming your db is in single user mode or you have table locks (tablockx) against all the tables involved, batches will probably perform worse. Especially if the batches are forcing table scans.
The one caveat is that very complex queries will quite often consume resources on tempdb, if tempdb runs out of space (cause the execution plan required a nasty complicated hash join) you are in deep trouble.
Working in batches is a general practice that is quite often used in SQL Server (when its not in snapshot isolation mode) to increase concurrency and avoid huge transaction rollbacks because of deadlocks (you tend to get deadlock galore when updating a 10 million row table that is active).
When you move to SQL Server 2005 or 2008, you will need to redo all those DTS packages in SSIS. I think you will pleasantly surprised to see how much faster SSIS can be.
In general, In SQL Server 2000, you want to run things in batches of records if the whole set ties up the table for too long. If you are running the packages at night when there is no use on the system, you may be able to get away with a set-based insert of the entire dataset. Row-by-row is always the slowest method, so avoid that if possible as well (Especially if all the row-row-row inserts are in one giant transaction!). If you have 24 hour access with no down time, you will almost certainly need to run in batches.
Related
I have a data flow, oledb source and oledb destination (both are SQL Server). In source, there are two tables A and B, A has 4M rows and B has 6M rows. They all have 30+ columns. When I do the query, I select 30 columns from A left join B where a.date > '2020-01-01'. it will return 50K rows. the query last 9 -10 seconds. Sometimes, I got error
Transaction (Process ID 75) was deadlocked on lock resources with another process and has been chosen as the deadlock victim. Rerun the transaction.
Even I did the query directly on the source server, I also could get
Transaction (Process ID 67) was deadlocked on lock resources with another process and has been chosen as the deadlock victim. Rerun the transaction.
but not as frequent as on the SSIS
Is that because they are transaction tables, user could do some update at the same time?
How to avoid it. Like in SSIS, if fail can SSIS wait for 5 second and reran it?
SSIS doesn't know anything about scheduling. Typically, that is done through SQL Agent in which you can specify retry on failure values.
The root of your question is why am I getting these deadlocks. You are asking for data and your request is preventing a more important query from completing. Since your query is less important, it gets snuffed out so the database as a system can remain operational.
Your question identifies that you are querying against transactional tables and yes, the day-to-day operation of the system is what is likely killing your queries. The deadlock graph in the default extended event would reveal precisely what happened (ask your DBAs for help).
As David Browne points out, you likely need to look at using a different isolation level to allow your read queries to operate on data while concurrent activity inserts/deletes/updates data. This tends to be decision points the business units that you are generating ETL for can provide guidance. Maybe working with "dirty" data is acceptable. If so, add SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED to your query.
If not, then you need to look at the query plans being generated and optimize them. Maybe the left join could be reworked as an Exists if you're only using it to test whether a condition exists. Perhaps there's implicit conversion going on all over the place. Or statistics are out of date. Or a covering index could be created. Lots of options here, but the key takeaway is make the query go faster so there's less resource contention.
Use one of the row-versioning isolation levels READ_COMMITTED_SNAPSHOT isolation or SNAPSHOT isolation to prevent your SSIS source query from acquiring locks on the data it reads.
I ran a query to delete around 4 million rows from my database. It ran for about 12 hours before my laptop lost the network connection. At that point, I decided to take a look at the status of the query in the database. I found that it was in the suspended state. Specifically:
Start Time SPID Database Executing SQL Status command wait_type wait_time wait_resource last_wait_type
---------------------------------------------------------------------------------------------------------------------------------------------------
2018/08/15 11:28:39.490 115 RingClone *see below suspended DELETE PAGEIOLATCH_EX 41 5:1:1116111 PAGEIOLATCH_EX
*Here is the sql query in question:
DELETE FROM T_INDEXRAWDATA WHERE INDEXRAWDATAID IN (SELECT INDEXRAWDATAID FROM T_INDEX WHERE OWNERID='1486836020')
After reading this;
https://dba.stackexchange.com/questions/87066/sql-query-in-suspended-state-causing-high-cpu-usage
I realize I probably should have broken this up into smaller pieces to delete them (or even delete them one-by-one). But now I just want to know if it is "safe" for me to KILL this query, as the answer in that post suggests. One thing the selected answer states is that "you may run into data consistency problems" if you KILL a query while it's executing. If it causes some issues with the data I am trying to delete, I'm not that concerned. However, I'm more concerned about this causing some issues with other data, or with the table structure itself.
Is it safe to KILL this query?
If you ran the delete from your laptop over the network and it lost connection with the server, you can either kill the spid or wait when it will disappear by itself. Depending on the ##version of your SQL Server instance, in particular how well it's patched, the latter might require instance restart.
Regarding the consistency issues, you seem to misunderstand it. It is possible only if you had multiple statements run in a single batch without being wrapped with a transaction. As I understand, you had a single statement; if that's the case, don't worry about consistency, SQL Server wouldn't have become what it is now if it would be so easy to corrupt its data.
I would have rewritten the query however, if T_INDEX.INDEXRAWDATAID column has NULLs then you can run into issues. It's better to rewrite it via join, also adding batch splitting:
while 1=1 begin
DELETE top (10000) t
FROM T_INDEXRAWDATA t
inner join T_INDEX i on t.INDEXRAWDATAID = i.INDEXRAWDATAID
WHERE i.OWNERID = '1486836020';
if ##rowcount = 0
break;
checkpoint;
end;
It definitely will not be any slower, but it can boost performance, depending on your schema, data and the state of any indices the tables have.
I have a job which is updating a table for 20 minutes and at these moments I can't update any row of it naturally.
Is there a way or method for to do this ?
The job can be more longer but I have to continue to update table.
On the other hand the job should rollback if it has got error during job.
Thanks..
Split the job into separate transactions. The way locks work in a professional DBMS like SQL Server is that they escalate to higher levels as they are required. Once a query has hit a lot of pages it's only natural that it's updated to a read-only table lock. The only way to circumvent this while keeping transactional integrity is to split it up in smaller jobs.
As Niels has pointed out you should attempt to update the table in batches explicitly committing each batch within a transaction. If you are creating enough locks to warrant a table lock then the chances are you have also ballooned your transaction log. Its probably worth checking and shrinking down to a more reasonable size if necessary.
Alternatively you could try enabling Trace flags 1224 or 1211 which "Disables lock escalation based on the number of locks" - http://msdn.microsoft.com/en-us/library/ms188396.aspx
Understanding the final decision is business decision, what are the accuracy considerations between NOLOCK & READPAST running in SQL 2008 R2? I would like to have a better understanding before discussing changes with the business area.
I have inherited a number of queries, used to create data views for management reporting. ‘WITH (NOLOCK)’ is used liberally but inconsistently. The data being read is from the production server of a widely used application that is constantly being updated. We are migrating from a SQL 2005 server to a SQL 2008 R2 server. These reports want data fresher than the 24 hour old data on the archive server. The use of NOLOCK suggests a past decision; potential for conflict exists and it a bit of accuracy loss is acceptable. Data is used to populate dashboards for human awareness/decision making.
All the Queries are SELECT, with Read Only access for the data view login. The majority of the queries are single table with a few 2 and 3 table joins. Given the low level of joins WITH () seems a better choice than SET TRANSACTION ISOLATION LEVEL {}
Table Hints (Transact-SQL) http://msdn.microsoft.com/en-us/library/ms187373.aspx (as well as multiple questions on SO) says that NOLOCK and/or READUNCOMMITTED are likely to have duplicate read issues, in addition to missing locked records.
READPAST looks like the more accurate, as it will only miss locked records without a chance of duplicates. But I am not sure the level of missing locked records is consistent between it and NOLOCK.
There is good article by Tim Chapman comparing the two but it was written in 2007, most of the comments revolve around 2000 & 2005, with one comment indicating READPAST is problematic in 2008 R2
References
Effect of NOLOCK hint in SELECT statements
When should you use "with (nolock)"
Using NOLOCK and READPAST table hints in SQL Server (By Tim Chapman)
Edit:
Snapshot isolation is suggested in two answers below. Snapshot isolation is dependent setting of the DB, this Q/A https://serverfault.com/questions/117104/how-can-i-tell-if-snapshot-isolation-is-turned-on describes how to see what setting are in place on the database. I now know it is disabled, I am reading for reports from a major applications database. Changing the setting is not an option. +- a couple of percent accuracy is acceptable, application (OLTP) impact is not acceptable. Most simple queries do not need lock considerations but in some extreme cases, lock consideration is required. With the advent of Snapshot isolation for SQL 2005, little information is available on NOLOCK & READPAST behavior in SQL 2008 or higher. Yet they remain my only choices.
A better option worth consideration is enabling READ COMMITTED SNAPSHOT for the database itself. This uses versioning in the tempdb to capture the state of a table at the beginning of the transaction.
There is a very good read on various aspects of NOLOCK, READPAST etc, at http://www.brentozar.com/archive/2013/01/implementing-snapshot-or-read-committed-snapshot-isolation-in-sql-server-a-guide/
WITH (NOLOCK) can provide incorrect results if someone is updating the table when you are selecting from it. If a page-split happens as a result of an insert while you are reading the table, and the new page happens to be beyond the point you've read, WITH (NOLOCK) will have already returned rows from the old page, and will then return duplicate rows from the new page. This is just a single example of why (NOLOCK) is bad.
WITH (READPAST) will skip any records that are being updated or inserted while you are reading from the table. Neither option is good in a busy database.
In light of the recent edit to your question where you state you cannot change the database setting for READ COMMITTED SNAPSHOT, perhaps you should consider using a stored procedure to gather data for you reports, and setting the transaction isolation level at the beginning of the stored proc using SET TRANSACTION ISOLATION LEVEL SNAPSHOT;. In order to do this, you would need to change the database option 'allow snapshot isolation'.
From SQL Server Books Online:
SNAPSHOT
Specifies that data read by any statement in a transaction will be the transactionally consistent version of the data that existed at the start of the transaction. The transaction can only recognize data modifications that were committed before the start of the transaction. Data modifications made by other transactions after the start of the current transaction are not visible to statements executing in the current transaction. The effect is as if the statements in a transaction get a snapshot of the committed data as it existed at the start of the transaction.
Except when a database is being recovered, SNAPSHOT transactions do not request locks when reading data. SNAPSHOT transactions reading data do not block other transactions from writing data. Transactions writing data do not block SNAPSHOT transactions from reading data.
During the roll-back phase of a database recovery, SNAPSHOT transactions will request a lock if an attempt is made to read data that is locked by another transaction that is being rolled back. The SNAPSHOT transaction is blocked until that transaction has been rolled back. The lock is released immediately after it has been granted.
The ALLOW_SNAPSHOT_ISOLATION database option must be set to ON before you can start a transaction that uses the SNAPSHOT isolation level. If a transaction using the SNAPSHOT isolation level accesses data in multiple databases, ALLOW_SNAPSHOT_ISOLATION must be set to ON in each database.
A transaction cannot be set to SNAPSHOT isolation level that started with another isolation level; doing so will cause the transaction to abort. If a transaction starts in the SNAPSHOT isolation level, you can change it to another isolation level and then back to SNAPSHOT. A transaction starts the first time it accesses data.
A transaction running under SNAPSHOT isolation level can view changes made by that transaction. For example, if the transaction performs an UPDATE on a table and then issues a SELECT statement against the same table, the modified data will be included in the result set.
NOLOCK can cause duplicate data to be read, data to be missed and the query to actually fail with an error messaged (something with "data movement").
On the other hand, a non-NONLOCK query can also read duplicate data and mis data! It is by no means a consistent snapshot of the database. The difference is that it will not read uncommitted data and will never fail.
The problem with NOLOCK is mostly that it can fail randomly, so you need retry. Also, the probability of wrong data being read is slightly higher.
NOLOCK has a big advantage when you're doing table scans: SQL Server can use allocation order scanning instead of index-order scans. TABLOCK has the same effect. This can be a significant speedup in the presence of fragmentation.
Consider just using the snapshot isolation level as it gets rid of all of these concerns. It comes with some other trade-offs and you don't get allocation-order scans. But it permanently and comprehensively removes locking problems.
Answering my own question after stress testing with SQLQueryStress http://www.datamanipulation.net/sqlquerystress/ (this is wonderful tool that is extremely easy to use). Results from SQLQueryStress are tested against SQL Server Profiler; the accuracy is the same as SQL Server Profiler, though the precession is two decimal places of a second less (but is sufficient for this test).
As mentioned in the question, the primary concern is application performance impact, with report accuracy and performance the secondary consideration. All testing occurred on the test server with where the test application is active and has some minor activity.
After downloading and becoming familiar with SQLQueryStress I set up a simple ‘ReportQuery’ to act as resource hog. It set to run 15 Iterations with 15 threads (225 total queries). Total run time is around 28 seconds, with an average iteration time of 1.49 seconds.
Created an Add/Delete ‘ApplicationQuery’ to represent ongoing application activity. It is set to run 2000 iterations with 1 thread. There are two versions, with a select statement (runs 31 seconds) and without a select statement (runs 28 seconds). These represent normal peak time application activity.
10 test runs of each of three versions of “ReportQuery” are run, this is to identify if there is any performance benefit between ‘with(nolock)’, ‘with(readpast), and without hints . Results indicate no significant difference the ReportQuery runs constantly in about 28 seconds with an average 1.5 second iteration time.
There are no big outliers so, decision to drop to 5 test runs for the following tests.
5 test runs of ApplicationQuery with a select statement; with one of each of three versions of “ReportQuery” also running. In each of 15 totals test, the ApplicationQuery is manually started, with the ReportQuery manually started immediately after. This scenario represents a resource heavy report query struggling with the applications ongoing activity for resources.
Repeated the test runs but this time used ApplicationQuery without a select statement.
Results: In every case the ApplicationQuery was throttled back to almost no forward progress, while the ReportQuery was running.
The ReportQuery had no significant loss of performance when struggling for resources with multiple ApplicationQuery’s against the database.
The ApplicationQuery was able to run queries parallel to the ReportQuery, but progress was very slow while competing for resources. In essence the total time to run 2000 Application Add/Delete Queries was extended by the time used by the ReportQuery.
The initial question was about which was more accurate, becomes pointless. There is essentially no report or application performance difference between using or not using the hints NOLOCK or READPAST, so don’t use either in a busy database and get the highest accuracy possible.
‘ReportQuery’
select
ID
, [TABLE_NAME]
, NUMBER
, FIELD
, OLD_VALUE
, NEW_VALUE
, SYSMODUSER
, SYSMODTIME
, SYSMODCOUNT
from dbo.UPMCINCIDENTMGMTAUDITRECORDSM1
where Number like '%'
or NUMBER like '2010-01-01'
‘ApplicationQuery’ (with Select Statement)
select *
from dbo.UPMCINCIDENTMGMTAUDITRECORDSM1
where FIELD = 'JJTestingPerformance'
insert into dbo.UPMCINCIDENTMGMTAUDITRECORDSM1 (ID
, [TABLE_NAME]
, NUMBER
, FIELD
, OLD_VALUE
, NEW_VALUE
)
values ('Test+Time'
, 'none'
, 'tst01'
, 'JJTestingPerformance'
, 'No Value'
, 'Test'
)
delete from dbo.UPMCINCIDENTMGMTAUDITRECORDSM1
where FIELD = 'JJTestingPerformance'
‘ApplicationQuery’ (without Select Statement)
insert into dbo.UPMCINCIDENTMGMTAUDITRECORDSM1 (ID
, [TABLE_NAME]
, NUMBER
, FIELD
, OLD_VALUE
, NEW_VALUE
)
values ('Test+Time'
, 'none'
, 'tst01'
, 'JJTestingPerformance'
, 'No Value'
, 'Test'
)
delete from dbo.UPMCINCIDENTMGMTAUDITRECORDSM1
where FIELD = 'JJTestingPerformance'
I noticed some performance problems with my DB. Such query (just for a example):
SELECT *
FROM ActionHistory
WHERE ObjectId = #id"
...executes at random with different reads and duration. ObjectId is Foreign Key, with index on it.
With SQL Profiler I found, that sometimes the results are: 5 reads, 0 duration, but in another case: 5 reads, 200 duration. Such big durations occurs accidentally.
I use distributed transaction with WCF. Such results I got when I was the only user at that time, so it likely not to be a locks or something else.
What is the reason of such behaviour: low reads, but high query duration ?
In general, distributed transactions are extremely expensive. Try disabling distributed transactions in your environment to see if that changes anything.
Since the query is exactly the same each time and the reads are the same, then it's most likely due to locking. Sometimes another query is executing and may have a lock on the records that need to be accessed. Waiting for the lock to be released would cause a slowdown.
Using SQL Profiler to compare start/stop times for queries you can identify overlapping queries that may cause locking.
This is not an indication of a problem, just an explanation of the differences you're seeing.
Enable read committed snapshot in the database:
ALTER DATABASE ... SET READ_COMMITTED_SNAPSHOT ON;
This will miraculously change your reads that occur under the default read-committed isolation into snapshot reads, which are not hindered by locks. See Choosing Row Versioning-based Isolation Levels for details, including the runtime resource usage caused by enabling snapshot reads.