Deadlocks when running NServicebus service causes corrupt connection - nhibernate

We're running NServiceBus for a web application to handle situations where the user do "batch like" actions. Like fire a command that affects 1000 entities..
It works well, but during moderate load we get some deadlocks, this isn't a problem, just retry the message.. right? :)
The problem occurs when the next message arrives and tries to open a connection. The connection is then "corrupt".
We get the following error:
System.Data.SqlClient.SqlException (0x80131904): New request is not allowed to start because it should come with valid transaction descriptor
I've searched the web and I think our problem is a reported NH "bug":
A workaround should be to disable connection pooling. But I don't like that, since performce will degrade..
We're running NServiceBus 2.6, NHibernate 3.3.
Does anyone have any experience with this? Can a upgrade of NServiceBus help?

I’ve seen this in the past, if your design warrants, try breaking the transaction into two, if you flow the message transaction all the way to your database operations, any failures will have a cascading effect and it will impact (ideally it shouldn’t) any subsequent messages as well.

Instead of updating the 1000 entities in the command could you publishing an event to say that the command has been completed and then have several subscribers acting on this event to update effect entities. It sounds to me that a command that updates a 1000 entities should be split into a number of smaller commands. Take a look a the sagas to see how you can handle long running business process. For example, you might have something like, process started, step 1 completed, step 2 completed , process completed etc...

Related

How to pause Web API ? Is it even possible?

We are facing odd issue.
We have two parts
1. Windows task to update database
2. Web API using same database to provide search results
We want to pause API while Windows task updating the database. So Search results won't be partial or incorrect.
Is it possible to pause API request while database is being updated? Database update take about 10-15 seconds.
When you say "pause", what do you expect to happen to callers? It seems like you are choosing to give them errors instead of incomplete data.
If possible, your database updates should be wrapped in a transaction so consumers get current, complete data until the transaction is committed. Then, the next call will have updated and complete data.
I would hope that transactional processing would also help you recover from errors in your updates. What happens now if something fails part way through an update?
This post may help you: How to Decide to use Database Transactions
If the API knows when the this task is being starting, you can do have the thread sleep for 10 seconds by calling:
System.Threading.Thread.Sleep(10000)

NServiceBus - Long running handler prevents queue from processing any other messages

I am running NSB 5 and I am using NHibernate Persistence and have MaximumConcurrencyLevel set to 10.
I have a handler that calls a stored proc that executes an SSIS package. This package takes a non trivial amount of time to run. I started to notice that whenever this particular message type is handled all other message handling stops. I noticed via SQL Profiler that the constant querying of the queue table that NSB does in the background stops and that any extra messages put into the queue are not handled even though NSB is only handling one message.
Is there any guidelines or known issues for dealing with handlers that block the queue because database commands take a long time to complete?
Is sounds like 10 threads are busy, so the endpoint is blocked, can you test this?
I would recommend hosting this message handler in its own process
Make sense?

NServiceBus v5 Circuit Breaker

We have recently upgraded our NServiceBus project from version 4 to version 5. We are using NHibernate for data storage to an SQL server database. Since the upgrade we have started to encounter an error around connection timeouts and the TimeoutEntity table. The NServiceBus services run fine for a while - at least a couple of hours and then they stop.
When investigating the cause of this it seems to be down to polling query to the TimeoutEntity table - the query is done every minute and if the query takes more than 2 seconds to complete an error is raised and CriticalError.Raise is called - this causes the NServiceBus to stop the service.
One route of investigation is to find out of the cause of the timeouts, but we would also like to know why this functionality was changed - in the previous version of NServiceBus, Logger.Warn was called rather than CriticalError.Raise. Would anybody know why this change was made in NServiceBus 5 and what we can do to mitigate it?
You can configure the time to wait before raising a critical error, see http://docs.particular.net/nservicebus/errors/critical-exception-for-timeout-outages on how to do it.
You can also define your own critical error action by using
config.DefineCriticalErrorAction((message, exception) => {
<do something here>
});

nServiceBus : How do I make a non-transactional call to a database from within the context of a transactional operation

Quick overview of our topology:
Web sites sending commands to an nServiceBus server, which accepts the commands and then publishes the correct pub/sub events. This service also has message handlers that can do some process against the DB in response to the command, for instance:
1 user registers on web site
2 web site sends nServicebus command to nServicebus service on another server.
3 nServicebus server has a handler for that specific type of command, which logs something to the database and sends a welcome email
Since instituting this architecture we started to get deadlocks on the DB. I have traced it down to MSDTC on the database server. If I turn that service OFF on the database server nServicebus starts throwing up errors, which to me shows that nServiceBus has been enlisting the DB update in the transaction.
I don't wish this to happen, I want to handle the DB failing myself, I only want the transaction to ensure the message is delivered to my nServicebus proxy service. I don't want a transaction from the web all the way through 2 servers to the DB and back.
Any suggestions?
EDIT: this post provides some clues, however I'm not entirely sure it's the proper way to proceed.. NServiceBus - Problem with using TransactionScopeOption.Suppress in message handler
EDIT2: The reason that we want the DB work outside the scope of the transaction is that the intent is to 'asynchronously' process these commands on another server so as not to slow down the web site and/or cause users to wait for these long running aggregation commands. If the DB is within the scope of the transaction, is that blocking execution on the website at the point where the original command is fired to the distributor? Is there a better nServicebus architecture for this scenario? We want the command to fire quickly and return control to the web site so the user can quickly proceed and not have to wait for our longish running DB command, which is updating aggregate counts and sending emails etc.
I wouldn't recommend having the DB work outside the context of the NServiceBus transaction. Instead, try reducing the isolation level of the transactions. This can be done by calling:
.IsolationLevel(System.Transactions.IsolationLevel.ReadCommited)
in the fluent configuration. You'll have to put this after .MsmqTransport() in v2.6. In v3.0 you can put this call almost anywhere.
RESPONSE TO EDIT2:
Just using NServiceBus will achieve your objective of not slowing down the website, regardless of the level of the transactions run on the other server. The use of transactions is to provide a guarantee that messages won't be lost in case of failure and also that you won't have to write your own deduplication logic.

Setting a deadlock victim

We're using siteCore 6.5 and each time we start to publish items, users who are browsing the website will get server 500 errors which end up being
Transaction (Process ID ##) was deadlocked on lock resources with
another process and has been chosen as the deadlock victim. Rerun the
transaction.
How can we setup SQL Server to give priority to a specific application? We cannnot modify any queries or code so it has to be done via SQL Server (or connection string)
I've seen "deadlock victim" in transaction, how to change the priority? and looked at http://msdn.microsoft.com/en-us/library/ms186736(v=SQL.105).aspx but these seem to be per session, not globally.
I don't care if it's a fix/change to SiteCore or a SQL solution.
I don't think you can set the deadlock priority globally - it's a session-only setting. There are not any connection string settings that I know of. The list of possible SqlConnection string settings can be found here.
It sounds to me like you're actually having a problem with the cache and every time you publish, it's clearing the cache and thus you're getting deadlocks with all these calls made at the same time. I haven't seen this sort of thing happen with 6.5 so you might also want to check into your caching. It would help a lot to look at your Sitecore logs and see if this is happening when caches are being created. Either way, check the caching guide on the SDN and see if that helps.