SQL Server - is using @@ROWCOUNT safe in multithreaded applications?

I am using SQL Server 2008.
I have a table A that receives many inserts/updates per second. After an insert or update, I want to get the number of rows affected:
INSERT INTO A (ID) VALUES (1)
IF @@ROWCOUNT = 0
PRINT 'NO ROWS AFFECTED'
While the query is executing, the same query may be called again by the application. So what happens if, at that moment, the current execution is after the INSERT but before the IF block?
Could @@ROWCOUNT give a wrong result for that reason?
Or is it always safe in its context?

Yes - it's safe. It always refers to the previous statement in the current session and scope.
BUT
if you want to know the number of rows affected, save it to a variable first, because @@ROWCOUNT resets after the IF statement:
INSERT INTO A (ID) VALUES (1)
DECLARE @rc INT = @@ROWCOUNT
IF @rc = 0
PRINT 'NO ROWS AFFECTED'
ELSE
SELECT @rc AS RowsAffected

@@ROWCOUNT is both scope and connection safe.
In fact, it reads only the last statement row count for that connection and scope. The full rules are here on MSDN (cursors, DML, EXECUTE etc.)
To use it in subsequent statements, you need to store it in a local variable.

You must preserve the @@ROWCOUNT value in a local variable, otherwise its value will reset to zero after the IF statement:
DECLARE @rowCount INT
SET @rowCount = @@ROWCOUNT
IF @rowCount = 0
PRINT 'NO ROWS AFFECTED'
Other than that, yes, it is safe.

Short answer: Yes.
However, it is worth seeing the question in perspective, to understand more deeply why the answer is so clearly yes.
SQL Server is built to handle concurrent access correctly, regardless of whether the client application is multithreaded or not. Without this property, SQL Server would be useless in any multiuser scenario. From the server's point of view it does not matter whether the concurrent access is caused by one multithreaded application or by several applications being used concurrently by multiple users.
In this respect @@ROWCOUNT is only the tip of the iceberg; there is much more and deeper functionality that must be handled correctly when concurrent access is in the picture.
The most practical part of this area is transaction management and transaction isolation.
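A quick way to see the session- and statement-local behavior for yourself (a minimal sketch; run it in any query window):
SELECT 1 UNION ALL SELECT 2;  -- returns 2 rows
SELECT @@ROWCOUNT;            -- 2: only the previous statement in this session counts
SELECT @@ROWCOUNT;            -- 1: the SELECT @@ROWCOUNT above itself returned one row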

Related

How does @@rowcount work if two queries are executed simultaneously in two query windows?

I want to use @@rowcount to track rows processed. But I have a question: if one statement is executing and taking a long time, and another query is executed in another query window in the meantime, will it affect the @@rowcount value of the first one?
How will @@rowcount behave in this scenario?
Database: SQL Server
It will give the corresponding rows affected for each of the two queries. The two windows here are effectively two sessions, so they do not affect each other in any way whatsoever.

Do I have to include "SELECT @@RowCount" if I have more than one SQL statement?

I know that if I execute a single SQL statement that UPDATEs or DELETEs some data, it will return the number of rows affected.
But if I have multiple SQL statements in a sql script, and I want to know the number of rows affected from the last statement executed, will it still return that automatically, or do I need a
SELECT @@RowCount
at the end of the script?
The code in question is not a Stored Procedure. Rather, it is a parameterized SQL script stored in an arbitrary location, executed using the ExecuteStoreCommand function in Entity Framework, as in:
var numberOfRowsAffected = context.ExecuteStoreCommand<int>(mySqlScript, parameters);
It depends on the NOCOUNT setting when executing your query (or queries).
If NOCOUNT is ON, then no DONE_IN_PROC messages will be returned.
If NOCOUNT is OFF, the default setting, then DONE_IN_PROC messages will be returned (e.g. counts).
Both of these situations are different to executing,
SELECT @@ROWCOUNT;
which will return a result set with a single scalar value, different from a DONE_IN_PROC message. This will occur regardless of the setting of NOCOUNT.
I believe that SELECT @@ROWCOUNT is sometimes used to make Entity Framework "play" with more complex TSQL statements, because EF both
requires a count for post validation,
and will accept a scalar number result set as a substitute for a DONE_IN_PROC message.
It's important that SELECT @@ROWCOUNT; is executed immediately after the last query statement, because many statements reset @@ROWCOUNT and would therefore yield an unexpected result.
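For example, a sketch of the safe placement (the table and column names here are invented for illustration):
UPDATE dbo.A SET Name = 'x' WHERE ID = 1;
SELECT @@ROWCOUNT AS RowsAffected;  -- must immediately follow the statement of interest;
                                    -- any intervening statement would reset the count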
Just to be specific on the answer part: you would need to add SELECT @@RowCount to return the number of rows affected by the last statement.
I think the confusion might be due to the rows returned in the SSMS window while executing a query. By default SSMS shows the number of rows returned for all SQL statements, but it reports affected rows as a message, not a dataset.
@@ROWCOUNT will automatically return the number of rows affected by the last statement.
Please find the msdn link here
https://msdn.microsoft.com/en-us/library/ms187316.aspx

Initial value of @@ROWCOUNT varies across databases and servers

We are attempting to run a DELETE statement inside a WHILE loop (to avoid large transaction logs when deleting lots of rows) as follows:
WHILE (@@ROWCOUNT > 0)
BEGIN
DELETE TOP (250000)
FROM
MYDATABASE.MYSCHEMA.MYTABLE
WHERE
MYDATABASE.MYSCHEMA.MYTABLE.DATE_KEY = 20160301
END
When this command is executed inside a new SQL Server Management Studio connection in our development environment, it deletes rows in blocks of 250K, which is the expected behavior.
When this command is executed in the same way on our test server, we get the message
Command completed successfully
That is, the WHILE loop was not entered when the statement was run.
After some additional investigation, we have found that the behavior also varies depending on the database that we connect to. So if the code is run (in our test environment) while SQL Server Management Studio is connected to MYDATABASE, the DELETE statement does not run. If we run the code while connected to SOME_OTHER_DATABASE, it does.
We partially suspect that the value of @@ROWCOUNT is not reliable, and may be different for different connections. But when we run the code multiple times for each database & server combination, we see behavior that is 100% consistent. So random initial values of @@ROWCOUNT do not appear to explain things.
Any suggestions as to what could be going on here? Thanks for your help!
Edit #1
For those asking about the initial value of @@ROWCOUNT and where it is coming from, we're not sure. But in some cases @@ROWCOUNT is definitely being initialized to some value above zero, as the code works on a fresh connection as-is.
Edit #2
For those proposing the declaration of our own variable, for our particular application we are executing SQL commands via a programming language wrapper which only allows for the execution of one statement at a time (i.e., one semicolon).
We have previously tried to establish the value of @@ROWCOUNT by executing one delete statement prior to the loop:
Statement #1:
DELETE TOP (250000)
FROM
MYDATABASE.MYSCHEMA.MYTABLE
WHERE
MYDATABASE.MYSCHEMA.MYTABLE.DATE_KEY = 20160301
Statement #2 (@@ROWCOUNT is presumably now 250,000):
WHILE (@@ROWCOUNT > 0)
BEGIN
DELETE TOP (250000)
FROM
MYDATABASE.MYSCHEMA.MYTABLE
WHERE
MYDATABASE.MYSCHEMA.MYTABLE.DATE_KEY = 20160301
END
However, whatever is causing @@ROWCOUNT to take on a different value on start-up is also affecting the value between commands. So in some cases the second statement never executes.
You should not use a variable before you have set its value. That is equally true for system variables.
The code that you have is very dangerous. Someone could add something like SELECT 'Here I am in the loop' after the delete and it will break.
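For example (a sketch of that failure mode):
DELETE TOP (250000) FROM MYDATABASE.MYSCHEMA.MYTABLE WHERE DATE_KEY = 20160301;
SELECT 'Here I am in the loop';  -- this SELECT returns one row...
-- ...so @@ROWCOUNT is now 1 and the loop condition never sees the DELETE's count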
A better approach? Use your own variable:
DECLARE @RC int;
WHILE (@RC > 0 OR @RC IS NULL)
BEGIN
DELETE TOP (250000)
FROM MYDATABASE.MYSCHEMA.MYTABLE
WHERE MYDATABASE.MYSCHEMA.MYTABLE.DATE_KEY = 20160301;
SET @RC = @@ROWCOUNT;
END;
Where are you getting your initial @@ROWCOUNT from? I mean, you're never going to enter that block, because @@ROWCOUNT would be expected to be zero, so you'd never enter the loop. Also, deleting in 250K batches wouldn't change the size of your transaction log - all of the deletions will be logged if you're logging, so there's no benefit (and some penalty) for doing this within a loop.
Have you traced the session? Since @@ROWCOUNT returns the number of rows affected by the prior statement in the session, I would guess that either the last query SSMS executes as part of establishing the session returns a different number of rows in the two environments, or that you have a login trigger in one of the environments whose last statement returns a different number of rows. Either way, a trace should tell you exactly why the behavior is different.
Fundamentally, though, it makes no sense to refer to @@ROWCOUNT before you run the statement that you are interested in getting a count for. It's easy enough to fix this using a variable:
DECLARE @cnt int = -1;
WHILE (@cnt != 0)
BEGIN
DELETE TOP (250000)
FROM MYDATABASE.MYSCHEMA.MYTABLE
WHERE MYDATABASE.MYSCHEMA.MYTABLE.DATE_KEY = 20160301;
SET @cnt = @@ROWCOUNT;
END
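If the one-statement-per-call constraint from Edit #2 is firm, note that a single WHILE statement that checks @@ROWCOUNT after its own DELETE avoids depending on any initial value (a sketch; untested against the OP's wrapper):
WHILE 1 = 1
BEGIN
    DELETE TOP (250000)
    FROM MYDATABASE.MYSCHEMA.MYTABLE
    WHERE MYDATABASE.MYSCHEMA.MYTABLE.DATE_KEY = 20160301;
    IF @@ROWCOUNT = 0 BREAK;  -- here @@ROWCOUNT reflects the DELETE just executed
END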

Transactions within loop within stored procedure

I'm working on a procedure that will update a large number of items on a remote server, using records from a local database. Here's the pseudocode.
CREATE PROCEDURE UpdateRemoteServer
pre-processing
get cursor with ID's of records to be updated
while on cursor
process the item
No matter how much we optimize it, the routine is going to take a while, so we don't want the whole thing to be processed as a single transaction. The items are flagged after being processed, so it should be possible to pick up where we left off if the process is interrupted.
Wrapping the contents of the loop ("process the item") in a begin/commit tran does not do the trick... it seems that the whole statement
EXEC UpdateRemoteServer
is treated as a single transaction. How can I make each item process as a complete, separate transaction?
Note that I would love to run these as "non-transacted updates", but that option is only available (so far as I know) in 2008.
EXEC procedure does not create a transaction. A very simple test will show this:
create procedure usp_foo
as
begin
select @@trancount;
end
go
exec usp_foo;
The @@trancount inside usp_foo is 0, so the EXEC statement does not start an implicit transaction. If you have a transaction started when entering UpdateRemoteServer, it means somebody started that transaction; I can't say who.
That being said, using remote servers and DTC to update items is going to perform quite badly. Is the other server also at least SQL Server 2005? Maybe you can queue the update requests, use messaging between the local and remote server, and have the remote server perform the updates based on the info in the message. It would perform significantly better, because both servers then only have to deal with local transactions, and you get much better availability due to the loose coupling of queued messaging.
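A minimal sketch of that queued-messaging idea using Service Broker; every object name here is invented, and a real deployment would also need routes, activation, and error handling:
-- one-time setup:
CREATE MESSAGE TYPE UpdateRequest VALIDATION = WELL_FORMED_XML;
CREATE CONTRACT UpdateContract (UpdateRequest SENT BY INITIATOR);
CREATE QUEUE UpdateQueue;
CREATE SERVICE UpdateService ON QUEUE UpdateQueue (UpdateContract);
GO
-- enqueue one update request; this is a purely local transaction:
DECLARE @h uniqueidentifier;
BEGIN DIALOG CONVERSATION @h
    FROM SERVICE UpdateService
    TO SERVICE 'UpdateService'
    ON CONTRACT UpdateContract
    WITH ENCRYPTION = OFF;
SEND ON CONVERSATION @h MESSAGE TYPE UpdateRequest (N'<update id="1"/>');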
Updated
Cursors actually don't start transactions. Typical cursor-based batch processing groups the updates into transactions of a certain size. This is fairly common for overnight jobs, as it allows for better performance (log flush throughput due to larger transaction size), and jobs can be interrupted and resumed without losing everything. A simplified version of a batch processing loop is typically like this:
create procedure usp_UpdateRemoteServer
as
begin
declare @id int, @batch int;
set nocount on;
set @batch = 0;
declare crsFoo cursor
forward_only static read_only
for
select object_id
from sys.objects;
open crsFoo;
begin transaction
fetch next from crsFoo into @id;
while @@fetch_status = 0
begin
-- process here
declare @transactionId int;
SELECT @transactionId = transaction_id
FROM sys.dm_tran_current_transaction;
print @transactionId;
set @batch = @batch + 1
if @batch > 10
begin
commit;
print @@trancount;
set @batch = 0;
begin transaction;
end
fetch next from crsFoo into @id;
end
commit;
close crsFoo;
deallocate crsFoo;
end
go
exec usp_UpdateRemoteServer;
I omitted the error handling part (begin try/begin catch) and the fancy @@fetch_status checks (static cursors actually don't need them anyway). This demo code shows that during the run there are several different transactions started (different transaction IDs). Batches also often deploy transaction savepoints at each item processed so they can safely skip an item that causes an exception, using a pattern similar to the one in my link; but this does not apply to distributed transactions, since savepoints and DTC don't mix.
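A sketch of that savepoint pattern, so one failing item is skipped without losing the whole batch (dbo.ProcessItem is a placeholder for the per-item work):
SAVE TRANSACTION BeforeItem;
BEGIN TRY
    EXEC dbo.ProcessItem @id;  -- hypothetical per-item processing
END TRY
BEGIN CATCH
    -- undo this item only; the outer transaction stays open
    -- (this only works if the transaction is not doomed, i.e. XACT_STATE() <> -1)
    ROLLBACK TRANSACTION BeforeItem;
END CATCH;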
EDIT: as pointed out by Remus below, cursors do NOT open a transaction by default; thus, this is not the answer to the question posed by the OP. I still think there are better options than a cursor, but that doesn't answer the question.
Stu
ORIGINAL ANSWER:
The specific symptom you describe is due to the fact that a cursor opens a transaction by default; therefore, no matter how you work it, you're gonna have a long-running transaction as long as you are using a cursor (unless you avoid locks altogether, which is another bad idea).
As others are pointing out, cursors SUCK. You don't need them 99.9999% of the time.
You really have two options if you want to do this at the database level with SQL Server:
Use SSIS to perform your operation; very fast, but may not be available to you in your particular flavor of SQL Server.
Because you're dealing with remote servers, and you're worried about connectivity, you may have to use a looping mechanism, so use WHILE instead and commit batches at a time. Although WHILE has many of the same issues as a cursor (looping still sucks in SQL), you avoid creating the outer transaction; see the sketch below.
Stu
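A sketch of option 2, committing batches from a WHILE loop; dbo.Items and its Processed flag are invented for illustration:
DECLARE @rows int;
SET @rows = 1;
WHILE @rows > 0
BEGIN
    BEGIN TRANSACTION;
    UPDATE TOP (1000) dbo.Items
    SET Processed = 1
    WHERE Processed = 0;       -- stand-in for "process the item"
    SET @rows = @@ROWCOUNT;
    COMMIT TRANSACTION;        -- each batch commits on its own
END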
Are you running this only from within SQL Server, or from an app? If the latter, get the list to be processed, then loop in the app to process only the subsets required.
Then the transaction should be handled by your app, and should lock only the items being updated (or the pages the items are in).
NEVER process one item at a time in a loop when you are doing transactional work. You can loop through records processing groups of them, but never ever do one record at a time. Do set-based inserts instead and your performance will change from hours to minutes or even seconds. If you are using a cursor to insert, update, or delete and it isn't handling at least 1000 rows in each statement (not one at a time), you are doing the wrong thing. Cursors are an extremely poor practice for such things.
Just an idea...
Only process a few items when the procedure is called (e.g. only get the TOP 10 items to process).
Process those.
Hopefully, this will be the end of the transaction.
Then write a wrapper that calls the procedure as long as there is more work to do (either use a simple count(..) to see if there are items, or have the procedure return true to indicate that there is more work to do).
Don't know if this works, but maybe the idea is helpful.
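For what it's worth, a sketch of that wrapper idea in T-SQL, with every name invented: the procedure processes a few items and reports whether more work remains, and the caller loops until it says done.
CREATE PROCEDURE dbo.ProcessTopItems
    @MoreWork bit OUTPUT
AS
BEGIN
    SET NOCOUNT ON;
    UPDATE TOP (10) dbo.Items      -- "process" the next few items
    SET Processed = 1
    WHERE Processed = 0;
    IF EXISTS (SELECT 1 FROM dbo.Items WHERE Processed = 0)
        SET @MoreWork = 1;
    ELSE
        SET @MoreWork = 0;
END
GO
DECLARE @more bit;
SET @more = 1;
WHILE @more = 1
    EXEC dbo.ProcessTopItems @MoreWork = @more OUTPUT;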

SQL Server Race Condition Question

(Note: this is for MS SQL Server)
Say you have a table ABC with a primary key identity column, and a CODE column. We want every row in here to have a unique, sequentially-generated code (based on some typical check-digit formula).
Say you have another table DEF with only one row, which stores the next available CODE (imagine a simple autonumber).
I know logic like below would present a race condition, in which two users could end up with the same CODE:
1) Run a select query to grab next available code from DEF
2) Insert said code into table ABC
3) Increment the value in DEF so it's not re-used.
I know that two users could both be at Step 1) at the same time, and could end up with the same CODE in the ABC table.
What is the best way to deal with this situation? I thought I could just wrap a "begin tran" / "commit tran" around this logic, but I don't think that worked. I had a stored procedure like this to test, but I didn't avoid the race condition when I ran it from two different windows in SSMS:
begin tran
declare @x int
select @x = nextcode FROM def
waitfor delay '00:00:15'
update def set nextcode = nextcode + 1
select @x
commit tran
Can someone shed some light on this? I thought the transaction would prevent another user from being able to access my NextCodeTable until the first transaction completed, but I guess my understanding of transactions is flawed.
EDIT: I tried moving the wait to after the "update" statement, and I got two different codes... but I suspected that. I have the waitfor statement there to simulate a delay so the race condition can be easily seen. I think the key problem is my incorrect perception of how transactions work.
Set the Transaction Isolation Level to Serializable.
At lower isolation levels, other transactions can read data in a row that has been read (but not yet modified) in this transaction. So two transactions can indeed read the same value. At very low isolation (Read Uncommitted), other transactions can even read data after it's been modified (but before it's committed)...
Review details about SQL Server Isolation Levels here
So the bottom line is that the isolation level is the critical piece here, controlling what level of access other transactions get into this one.
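Applied to the test procedure, it is a one-line session setting. One caveat worth knowing: under SERIALIZABLE, two concurrent runs of the read-then-update pattern will deadlock rather than hand out a duplicate code, and one of them gets rolled back as the deadlock victim:
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
begin tran
declare @x int
select @x = nextcode FROM def   -- takes a range lock held to the end of the tran
update def set nextcode = nextcode + 1
select @x
commit tran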
NOTE. From the link, about Serializable
Statements cannot read data that has been modified but not yet committed by other transactions.
This is because the locks are placed when the row is modified, not when the BEGIN TRAN occurs. So what you have done may still allow another transaction to read the old value up until the point where you modify it. So I would change the logic to modify the value in the same statement as you read it, thereby putting the lock on it at the same time.
begin tran
declare @x int
update def set @x = nextcode, nextcode += 1
waitfor delay '00:00:15'
select @x
commit tran
As other responders have mentioned, you can set the transaction isolation level to ensure that anything you 'read' using a SELECT statement cannot change within a transaction.
Alternatively, you could take out a lock specifically on the DEF table by adding the syntax WITH HOLDLOCK after the table name, e.g.,
SELECT nextcode FROM DEF WITH HOLDLOCK
It doesn't make much difference here, as your transaction is small, but it can be useful to take out locks for some SELECTs and not others within a transaction. It's a question of 'repeatability versus concurrency'.
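A common refinement (my addition, not part of the answer above) is to pair HOLDLOCK with UPDLOCK: if two sessions merely hold shared locks and then both try to update, they deadlock, whereas an update lock admits only one reader-updater at a time:
begin tran
declare @x int
select @x = nextcode FROM DEF WITH (UPDLOCK, HOLDLOCK)
update DEF set nextcode = nextcode + 1
select @x
commit tran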
A couple of relevant MS-SQL docs.
Isolation levels
Table hints
Late answer. You want to avoid a race condition...
"SQL Server Process Queue Race Condition"
Recap:
You began a transaction. This doesn't actually "do" anything in and of itself, it modifies subsequent behavior
You read data from a table. The default isolation level is Read Committed, so this select statement is not made part of the transaction.
You then wait 15 seconds
You then issue an update. With the declared transaction, this will generate a lock until the transaction is committed.
You then commit the transaction, releasing the lock.
So, guessing you ran this simultaneously in two windows (A and B):
A read the "next" value from table def, then went into wait mode
B read the same "next" value from the table, then went into wait mode. (Since A only did a read, the transaction did not lock anything.)
A then updated the table, and probably committed the change before B exited the wait state.
B then updated the table, after A's write was committed.
Try putting the wait statement after the update, before the commit, and see what happens.
It's not a real race condition; it's more a common problem with concurrent transactions. One solution is to set a read lock on the table and therefore have serialization in place.
This is actually a common problem in SQL databases, and that is why most (all?) of them have some built-in features to take care of this issue of obtaining a unique identifier. Here are some things to look into if you are using MySQL or Postgres. If you are using a different database, I bet they provide something very similar.
A good example of this is postgres sequences which you can check out here:
Postgres Sequences
MySQL uses something called auto increment.
Mysql auto increment
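On SQL Server itself (the OP's platform), the built-in equivalents are IDENTITY columns and, from SQL Server 2012 onward, sequence objects; the question's SQL Server 2008 predates the latter. A sketch of the sequence form:
-- SQL Server 2012 or later:
CREATE SEQUENCE dbo.CodeSeq AS int START WITH 1 INCREMENT BY 1;
SELECT NEXT VALUE FOR dbo.CodeSeq;  -- atomic; concurrent callers never see the same value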
You can set the column to a computed value that is persisted. This will take care of the race condition.
Persisted Computed Columns
NOTE
Using this method means you do not need to store the next code in a table. The code column becomes the reference point.
Implementation
Give the column the following properties under computed column specification.
Formula = dbo.GetNextCode()
Is Persisted = Yes
Create Function dbo.GetNextCode()
Returns VarChar(10)
As
Begin
Declare @Return VarChar(10);
Declare @MaxId Int
Select @MaxId = Max(Id)
From [Table]
Select @Return = Code
From [Table]
Where Id = @MaxId;
/* Generate New Code ... */
Return @Return;
End