NServiceBus v5 Circuit Breaker - NHibernate

We recently upgraded our NServiceBus project from version 4 to version 5. We use NHibernate for data storage in a SQL Server database. Since the upgrade we have started to encounter errors around connection timeouts and the TimeoutEntity table. The NServiceBus services run fine for a while, at least a couple of hours, and then they stop.
Our investigation suggests the cause is the polling query against the TimeoutEntity table. The query runs every minute, and if it takes more than 2 seconds to complete, an error is raised and CriticalError.Raise is called, which causes NServiceBus to stop the service.
One route of investigation is to find the cause of the timeouts, but we would also like to know why this functionality was changed: in the previous version of NServiceBus, Logger.Warn was called rather than CriticalError.Raise. Does anybody know why this change was made in NServiceBus 5, and what can we do to mitigate it?

You can configure the time to wait before raising a critical error; see http://docs.particular.net/nservicebus/errors/critical-exception-for-timeout-outages for how to do it.
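As a rough sketch of what that page describes for NServiceBus 5, the wait time is set on the BusConfiguration before the bus is started. The five-minute value and the EndpointSetup class are illustrative assumptions, and the method name should be checked against the NServiceBus version you have installed.

```csharp
using System;
using NServiceBus;

class EndpointSetup
{
    static BusConfiguration Configure()
    {
        var busConfiguration = new BusConfiguration();

        // Assumed value: tolerate up to 5 minutes of timeout storage
        // outage before the critical error is raised.
        busConfiguration.TimeToWaitBeforeTriggeringCriticalErrorOnTimeoutOutages(
            TimeSpan.FromMinutes(5));

        return busConfiguration;
    }
}
```

A longer wait only delays the critical error; it does not remove the underlying slow query against the TimeoutEntity table.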

You can also define your own critical error action by using:
config.DefineCriticalErrorAction((message, exception) =>
{
    // do something here, e.g. log the message and exception
});

Related

Unhandled Exception Error - Login Failed for User

We have a strange error here. In our ASP.NET 4.6 app, using Entity Framework 6.2, we are getting "Login failed for user" when accessing the SQL Azure database. I'm pretty sure the cause of the error is switching tiers in Azure. What I don't get is why the error isn't caught. Every SQL operation we have is inside a try...catch block. The errors fall out of the block and get caught by Global.asax just before the app crashes.
We have
SetExecutionStrategy("System.Data.SqlClient", Function() New SqlServer.SqlAzureExecutionStrategy(10, TimeSpan.FromSeconds(7)))
which, as I understand it, will retry any SQL execution 10 times for at least 70 seconds from the first error. According to Microsoft tech support, this isn't engaged because it hasn't made the connection to SQL Azure yet. The ConnectRetryCount and interval in the connection string do not apply since it is talking to the server. The server is just saying, "I know you are there, but I'm not going to let you in!"
According to MS tech support, the only way around this is to have a try...catch block around all of our SQL commands... which we do! It just falls through and crashes the app!
I can't do a retry in Global.asax because at that point, it has already crashed.
According to MS, there is no way to trap the error in the context and retry from there. So, what's the solution? There must be some answer other than, "just let the app crash and have them refresh the page!"
When the page is refreshed seconds later, all is fine. No errors, no problems.
Example of one of the lines of code throwing the error:
MapTo = ctx.BrowserMaps.FirstOrDefault(Function(x) code.Contains(x.NameOrUserAgent))
It's really very straightforward. This one just happens to come up a lot because this code block is called frequently. The actual SQL request is irrelevant because no matter what line is used, the connection, within EF, fails.
Server logins will be disconnected while scaling up/down to a new tier, and transactions are rolled back. However, contained database logins stay connected during the scaling process, and for that reason they are recommended over server logins.
Having a try...catch may not solve the issue because you may be catching error number 0, and a lot of errors in Azure SQL Database fall into that error 0 category.
Just a comment: performance may be poor right after scaling and improve after a few minutes. Query plans may also change.
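If you do end up retrying at the application level, a minimal sketch of a generic retry helper is shown below. The Retry class, its parameters, and the isTransient predicate are hypothetical names, not part of Entity Framework, and the sketch assumes the exception actually surfaces inside the delegate you pass in.

```csharp
using System;
using System.Threading;

public static class Retry
{
    // Runs operation and retries it while isTransient(ex) is true,
    // up to maxAttempts attempts in total, sleeping delay between attempts.
    public static T Execute<T>(Func<T> operation,
                               Func<Exception, bool> isTransient,
                               int maxAttempts,
                               TimeSpan delay)
    {
        if (maxAttempts < 1) throw new ArgumentOutOfRangeException("maxAttempts");

        for (int attempt = 1; ; attempt++)
        {
            try
            {
                return operation();
            }
            catch (Exception ex)
            {
                // Give up on the last attempt or on a non-transient failure.
                if (attempt >= maxAttempts || !isTransient(ex)) throw;
                Thread.Sleep(delay);
            }
        }
    }
}
```

For this to help with a failed login, the delegate should create the context itself and materialize the result inside the delegate, so each attempt opens a fresh connection. The predicate would need to inspect inner exceptions too, since EF may wrap the underlying SqlException.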

BizTalk WCF-SQL Receivelocation Notification callback returned an error

I'm trying to get SQL notifications to work with BizTalk, but I'm struggling at one point.
The binding of the receive location is the following:
The SQL Server supports notifications, and the connection string is correct.
When I start the receive location it works correctly exactly once, but when I disable it and start it again, I get the following error in the event log.
The Messaging Engine failed to add a receive location
"RL.MDM.SQL" with URL
"mssql://.//Database?InboundId=GetNewMDMChanges" to the adapter
"WCF-SQL". Reason:
"Microsoft.ServiceModel.Channels.Common.TargetSystemException: The
notification callback returned an error. Info=Invalid.
Source=Statement. Type=Subscribe.
I can't start the receive location again until I execute the following command on the database to enable the broker.
alter database MDMDEV set enable_broker with rollback immediate;
The strange thing here is that when I check whether the broker is still enabled before I execute the command above, I see that it is indeed still enabled.
So the command to enable the broker fixes my problem for exactly one more notification, and then I have to do it again.
Has anybody ever had this problem, or can anyone tell me what I'm doing wrong?
Thanks in advance.
Regarding the Notifications feature in general, my recommendation is to not use it.
With both SQL Server and Oracle, the Notifications feature is quite fragile and will stop receiving events with no warning or error. When this happens, the only way to recover is to disable and re-enable the receive location.
Basically, I have found it not reliable enough to use in production apps.
If you or your organization own the database, polling (plus triggers if needed) is 100% reliable.
This article describes some different Polling scenarios: BizTalk Server: SQL Patterns for Polling and Batch Retrieve

Disable DTC tracking in RavenDB with Voron storage

We are experimenting with using RavenDB as a log target within our current NLog usage. We created a new RavenDB server and a database using Voron as the storage engine. Then it was just a matter of updating our NLog configuration, which we did.
For a while, everything was great, but then we ran into a situation where we are calling Trace() within a database transaction. With RavenDB set up as the log target, this means the SaveChanges() call is made within the transaction as well.
Everything is great until we call Transaction.Commit(). RavenDB throws a 500 server error at that point, because DTC is not supported with the Voron engine (and is slated to be removed everywhere, from what I understand). I'm fine with this. I don't particularly care - writing to the log should not be part of the transaction anyway, because if something does go wrong, we don't want to remove the related log entries.
I searched the documentation hoping to find a configuration option that would tell RavenDB to ignore transactions, or at least ignore DTC, but I can't find anything. Is there any way I can get it to stop throwing these 500 server errors?
Set EnlistInDistributedTransactions = false on the document store, and it will work for you.
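A minimal sketch of where that setting goes, assuming you create the document store yourself. The URL, database name, and LogStoreFactory class are placeholders; if the NLog target builds its own store, the property has to be set wherever that store is constructed.

```csharp
using Raven.Client;
using Raven.Client.Document;

class LogStoreFactory
{
    static IDocumentStore Create()
    {
        var store = new DocumentStore
        {
            Url = "http://localhost:8080",  // assumed server URL
            DefaultDatabase = "Logs",       // assumed database name
            // Keep sessions from this store out of any ambient
            // System.Transactions transaction.
            EnlistInDistributedTransactions = false
        };
        store.Initialize();
        return store;
    }
}
```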

SQL Azure: Frequent error

We are having trouble with SQL Azure. We deployed an old .NET 4.0/C# website on Azure. It mostly works, except that every now and then we start getting timeouts, or anything that touches the database starts throwing errors. A couple of times inserts stopped working, and the DataSet.Fill function failed with a timeout or "Column not found".
As I understand it, this happens when SQL Azure is too busy to respond to our request and breaks the connection midway, probably because it is restarting or switching nodes. However, it looks bad to the end client when such an error appears. Our database is less than 1 GB in size and we get 10-15 at a time, so I don't think we have a heavy load (Azure supports databases up to 150 GB, and ours is less than 1% of that).
Can anyone suggest what we should do to avoid such errors? Can we detect them before they occur?
You will need to use transient fault handling; the Transient Fault Handling Application Block is a good place to start. When refactoring legacy code for SQL Azure, I found it easier to use the SqlConnection and SqlCommand extension methods:
Example of the SqlConnection extension:
var retryPolicy = RetryPolicyFactory.GetDefaultSqlConnectionRetryPolicy();
using (var connection = new SqlConnection(connectionString))
{
    // Opens the connection, retrying on transient failures according to the policy.
    connection.OpenWithRetry(retryPolicy);
    // Do the usual work with the connection.
}
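The SqlCommand extensions follow the same pattern. The sketch below assumes the Enterprise Library 6 namespace and that default retry policies are configured for the application; the dbo.Orders table and the CommandRetryExample class are hypothetical.

```csharp
using System.Data.SqlClient;
using Microsoft.Practices.EnterpriseLibrary.TransientFaultHandling;

class CommandRetryExample
{
    static int CountRows(string connectionString)
    {
        var connectionPolicy = RetryPolicyFactory.GetDefaultSqlConnectionRetryPolicy();
        var commandPolicy = RetryPolicyFactory.GetDefaultSqlCommandRetryPolicy();

        using (var connection = new SqlConnection(connectionString))
        using (var command = new SqlCommand("SELECT COUNT(*) FROM dbo.Orders", connection))
        {
            // Retry opening the connection on transient failures.
            connection.OpenWithRetry(connectionPolicy);
            // Retry executing the command on transient failures.
            return (int)command.ExecuteScalarWithRetry(commandPolicy);
        }
    }
}
```

Older releases of the block shipped under a different namespace, so the using directive may need adjusting for your version.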

Deadlocks when running NServicebus service causes corrupt connection

We're running NServiceBus for a web application to handle situations where the user performs "batch-like" actions, such as firing a command that affects 1000 entities.
It works well, but during moderate load we get some deadlocks. This isn't a problem, just retry the message... right? :)
The problem occurs when the next message arrives and tries to open a connection. The connection is then "corrupt".
We get the following error:
System.Data.SqlClient.SqlException (0x80131904): New request is not allowed to start because it should come with valid transaction descriptor
I've searched the web and I think our problem is a reported NH "bug":
A workaround should be to disable connection pooling. But I don't like that, since performance will degrade.
We're running NServiceBus 2.6, NHibernate 3.3.
Does anyone have any experience with this? Can an upgrade of NServiceBus help?
I’ve seen this in the past. If your design warrants it, try breaking the transaction into two. If you flow the message transaction all the way to your database operations, any failure will have a cascading effect and will impact subsequent messages as well (ideally it shouldn’t).
Instead of updating the 1000 entities in the command, could you publish an event to say that the command has been completed and then have several subscribers act on this event to update the affected entities? It sounds to me that a command that updates 1000 entities should be split into a number of smaller commands. Take a look at sagas to see how you can handle a long-running business process. For example, you might have something like: process started, step 1 completed, step 2 completed, process completed, etc.
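To illustrate the "smaller commands" idea, here is a minimal sketch that splits the entity ids into fixed-size batches. The Batching class and the batch size are assumptions for illustration; each batch would then be sent as its own message through whatever send call your NServiceBus version provides.

```csharp
using System;
using System.Collections.Generic;

public static class Batching
{
    // Splits ids into consecutive batches of at most batchSize items each.
    public static List<List<Guid>> Split(IList<Guid> ids, int batchSize)
    {
        if (batchSize < 1) throw new ArgumentOutOfRangeException("batchSize");

        var batches = new List<List<Guid>>();
        for (int start = 0; start < ids.Count; start += batchSize)
        {
            int count = Math.Min(batchSize, ids.Count - start);
            var batch = new List<Guid>(count);
            for (int i = start; i < start + count; i++)
            {
                batch.Add(ids[i]);
            }
            batches.Add(batch);
        }
        return batches;
    }
}
```

With smaller batches, each message holds its locks for a shorter time, and a deadlock only forces a retry of that one batch rather than all 1000 updates.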