I recently found out Entity Framework has a very easy way to make connections resilient in SQL Azure. Is there a recommended way to accomplish the same in Dapper?
The fastest way to protect against transient connection issues when talking to Windows Azure SQL Database from C# is the Microsoft Transient Fault Handling Application Block.
For example, the code below will retry up to 3 times, with a 1-second interval between attempts, when opening a connection to a Windows Azure SQL Database:
var retryStrategy = new FixedInterval(3, TimeSpan.FromSeconds(1));
var retryPolicy =
new RetryPolicy<SqlDatabaseTransientErrorDetectionStrategy>(retryStrategy);
retryPolicy.ExecuteAction(() => myConnection.Open());
FixedInterval is the back-off policy: it will try, wait 1 second, try again, and so on, until it has tried 3 times.
SqlDatabaseTransientErrorDetectionStrategy simply inspects the exception that was thrown: if it is a connection exception that should be retried, it tells the RetryPolicy to execute the action again; if it is not a connection exception, the action is not retried and the original exception is thrown as normal.
As for when you should use it with Dapper: you can safely retry opening connections and read operations, but be careful when retrying write operations, as there is a risk of duplicating inserts, trying to delete a row twice, and so on.
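For example, a Dapper read can be wrapped in the same policy. This is only a sketch of the idea, not something Dapper or the block ships out of the box; Customer, connectionString and the query text are placeholders:
// Namespaces assumed: Dapper, System.Data.SqlClient, System.Linq,
// plus the Transient Fault Handling Block assemblies referenced above.
var retryStrategy = new FixedInterval(3, TimeSpan.FromSeconds(1));
var retryPolicy = new RetryPolicy<SqlDatabaseTransientErrorDetectionStrategy>(retryStrategy);

// Reads are safe to retry as a whole, so open the connection and run the
// query inside the policy; a transient failure simply reruns the lambda.
var customers = retryPolicy.ExecuteAction(() =>
{
    using (var connection = new SqlConnection(connectionString))
    {
        connection.Open();
        return connection.Query<Customer>("SELECT Id, Name FROM Customers").ToList();
    }
});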
More detail here; the library is available as a NuGet package here, which includes the detection strategies for Windows Azure.
Related
We are having trouble with SQL Azure. We deployed an old .NET 4.0/C#-based website on Azure. Everything works fine, except that every now and then we start getting timeouts, or anything related to the database starts throwing errors. A couple of times inserts stopped working, and DataSet.Fill fails with a timeout or "Column not found".
As I understand it, this happens when SQL Azure is too busy to respond to our request and breaks the connection mid-way, probably while restarting or switching nodes. However, it looks bad to the end client when such errors appear. Our database is less than 1 GB in size and we only get 10-15 at a time, so I don't think we have a heavy load [Azure supports 150 GB databases; ours is less than 1% of that].
Can anyone suggest what we should do to avoid such errors? Can we detect them before they happen?
You will need to use transient fault handling; the Transient Fault Handling Application Block is a good place to start. When refactoring legacy code for SQL Azure, I found it easier to use the SqlConnection and SqlCommand extension methods.
Example of the SqlConnection extensions:
var retryPolicy = RetryPolicyFactory.GetDefaultSqlConnectionRetryPolicy();
using (var connection = new SqlConnection(connectionString))
{
    connection.OpenWithRetry(retryPolicy);
    // Do the usual stuff with the connection
}
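The SqlCommand extensions follow the same pattern. A rough sketch (the table, parameter and userId are placeholders, and the command-policy factory method name may differ between versions of the block):
var connectionPolicy = RetryPolicyFactory.GetDefaultSqlConnectionRetryPolicy();
var commandPolicy = RetryPolicyFactory.GetDefaultSqlCommandRetryPolicy();
using (var connection = new SqlConnection(connectionString))
{
    connection.OpenWithRetry(connectionPolicy);
    using (var command = connection.CreateCommand())
    {
        command.CommandText = "UPDATE Users SET LastSeen = GETUTCDATE() WHERE Id = @id";
        command.Parameters.AddWithValue("@id", userId);
        // Retries the command itself when a transient SQL Azure error is detected.
        command.ExecuteNonQueryWithRetry(commandPolicy);
    }
}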
We're running NServiceBus for a web application to handle situations where the user does "batch-like" actions, like firing a command that affects 1000 entities.
It works well, but under moderate load we get some deadlocks. That isn't a problem in itself; just retry the message... right? :)
The problem occurs when the next message arrives and tries to open a connection. The connection is then "corrupt".
We get the following error:
System.Data.SqlClient.SqlException (0x80131904): New request is not allowed to start because it should come with valid transaction descriptor
I've searched the web and I think our problem is a reported NH "bug".
A workaround should be to disable connection pooling, but I don't like that, since performance will degrade.
We're running NServiceBus 2.6, NHibernate 3.3.
Does anyone have any experience with this? Can an upgrade of NServiceBus help?
I’ve seen this in the past. If your design warrants it, try breaking the transaction into two: if you flow the message transaction all the way down to your database operations, any failure will have a cascading effect and will impact (ideally it shouldn't) subsequent messages as well.
Instead of updating the 1000 entities in one command, could you publish an event saying that the command has completed, and then have several subscribers act on that event to update the affected entities? It sounds to me like a command that updates 1000 entities should be split into a number of smaller commands; see the sketch below. Take a look at sagas to see how you can handle a long-running business process. For example, you might have something like: process started, step 1 completed, step 2 completed, process completed, etc.
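A rough sketch of the splitting idea in the classic IHandleMessages/IBus style; the message and handler names are made up for illustration:
using System.Collections.Generic;
using System.Linq;
using NServiceBus;

// Illustrative message types (names are placeholders).
public class UpdateEntitiesBatch : IMessage { public List<int> EntityIds { get; set; } }
public class UpdateEntities : IMessage { public List<int> EntityIds { get; set; } }

public class UpdateEntitiesBatchHandler : IHandleMessages<UpdateEntitiesBatch>
{
    public IBus Bus { get; set; }   // injected by NServiceBus

    public void Handle(UpdateEntitiesBatch message)
    {
        const int batchSize = 50;

        // Fan the big command out into small ones, each handled in its own
        // message (and transaction), so a deadlock only fails a small unit
        // of work and only that small message gets retried.
        for (var i = 0; i < message.EntityIds.Count; i += batchSize)
        {
            Bus.Send(new UpdateEntities
            {
                EntityIds = message.EntityIds.Skip(i).Take(batchSize).ToList()
            });
        }
    }
}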
We are trying to implement retry logic to recover from transient errors in Azure environment.
We are using long-running sessions to keep track of and commit the whole batch of changes at the end of an application transaction (which may span several web requests). Along the way we need to get additional data from the database. Our main problem is that we can't easily recover from a DB error, because we can't "replay" all of the user's actions.
So far we have used a straightforward recovery algorithm:
Try to perform operation in long-running session
In case of error, close the session, open a new one and merge entities into it
Retry the operation
It's a very expensive approach in terms of time (merging is really slow for big entity hierarchies), so we'd like to optimize things a little.
We'd like to perform query operations in a separate session (to keep the long-running one untouched and safe) and, on success, merge the results back into the long-running session. Retry is relatively simple here - we just need to open a new session and run the query once more. However, with this approach we have an issue with initializing lazy properties/collections:
If we do this in a separate session, we need to merge the results back (a lot of entities), but the merge could fail and break the long-running session
We tried different ways of "moving" the original entity to a different session, loading the details and returning it back, but without success (evict, replicate, etc.)
There is a well-known statement that a session should be discarded in case of an exception. However, the example shows a write operation. Is that still true for reads? I mean, if I guarantee that no data is written back to the database, can I reuse the same session to run the query again?
Do you have any other suggestions about retry logic with long-running sessions?
IMO there's no way to solve your issue. It's gonna take a lot of time to commit everything, or you will have to do a lot of work to break it up into smaller sessions and handle every error that can occur while merging.
To answer your question about using the session after an exception: you cannot trust ANYTHING anymore inside this session, not even loaded entities.
Read this paragraph from Ayende's article about building a simple todo app with a recovery plan in case of an exception in the session:
Then there is the problem of error handling. If you get an exception (such as StaleObjectStateException, because of concurrency conflict), your session and its loaded entities are toast, because with NHibernate, an exception thrown from a session moves that session into an undefined state. You can no longer use that session or any loaded entities. If you have only a single global session, it means that you probably need to restart the application, which is probably not a good idea.
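To illustrate the "throw the session away and retry in a fresh one" pattern for reads, here is a minimal sketch; it assumes an NHibernate ISessionFactory is available, and the Order entity, the query and the retry count are all illustrative:
using System;
using System.Collections.Generic;
using System.Linq;
using NHibernate;
using NHibernate.Linq;

public static IList<Order> LoadOrdersWithRetry(ISessionFactory sessionFactory, int customerId)
{
    const int maxAttempts = 3;

    for (var attempt = 1; ; attempt++)
    {
        // Each attempt gets its own short-lived session; the long-running
        // session is never touched, so a failure here cannot poison it.
        using (var session = sessionFactory.OpenSession())
        {
            try
            {
                return session.Query<Order>()
                              .Where(o => o.Customer.Id == customerId)
                              .ToList();
            }
            catch (Exception)
            {
                if (attempt >= maxAttempts)
                    throw;
                // The failed session is discarded by the using block;
                // the next loop iteration retries with a brand new one.
            }
        }
    }
}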
Quick overview of our topology:
Web sites send commands to an NServiceBus server, which accepts the commands and then publishes the correct pub/sub events. This service also has message handlers that can do some processing against the DB in response to a command, for instance:
1. A user registers on the web site.
2. The web site sends an NServiceBus command to the NServiceBus service on another server.
3. The NServiceBus server has a handler for that specific type of command, which logs something to the database and sends a welcome email.
Since introducing this architecture we have started to get deadlocks on the DB. I have traced it down to MSDTC on the database server: if I turn that service off on the database server, NServiceBus starts throwing errors, which tells me that NServiceBus has been enlisting the DB update in the transaction.
I don't want this to happen. I want to handle DB failures myself; I only want the transaction to ensure the message is delivered to my NServiceBus proxy service. I don't want a transaction spanning from the web all the way through two servers to the DB and back.
Any suggestions?
EDIT: this post provides some clues, however I'm not entirely sure it's the proper way to proceed: NServiceBus - Problem with using TransactionScopeOption.Suppress in message handler
EDIT2: The reason we want the DB work outside the scope of the transaction is that the intent is to process these commands "asynchronously" on another server, so as not to slow down the web site and/or make users wait for these long-running aggregation commands. If the DB is within the scope of the transaction, does that block execution on the web site at the point where the original command is fired to the distributor? Is there a better NServiceBus architecture for this scenario? We want the command to fire quickly and return control to the web site, so the user can proceed without waiting for our long-running DB command, which updates aggregate counts, sends emails, etc.
I wouldn't recommend having the DB work outside the context of the NServiceBus transaction. Instead, try reducing the isolation level of the transactions. This can be done by calling:
.IsolationLevel(System.Transactions.IsolationLevel.ReadCommitted)
in the fluent configuration. You'll have to put this after .MsmqTransport() in v2.6. In v3.0 you can put this call almost anywhere.
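For context, here is roughly where that call sits in a v2.6-style fluent configuration; treat this as a sketch, since the surrounding calls are just the usual defaults and may differ in your endpoint:
Configure.With()
         .DefaultBuilder()
         .XmlSerializer()
         .MsmqTransport()
             .IsTransactional(true)
             // Lower the isolation level of the handler's ambient transaction.
             .IsolationLevel(System.Transactions.IsolationLevel.ReadCommitted)
         .UnicastBus()
             .LoadMessageHandlers()
         .CreateBus()
         .Start();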
RESPONSE TO EDIT2:
Just using NServiceBus will achieve your objective of not slowing down the website, regardless of the level of the transactions run on the other server. The use of transactions is to provide a guarantee that messages won't be lost in case of failure and also that you won't have to write your own deduplication logic.
I'm using SQL Compact 3.5 SP2. My application is multi-threaded, but it does not share connections across threads. Instead, I use a custom object pool to ensure that each thread gets its own connection. That said, it's possible that a connection might be re-used on different threads at different times... in other words, I'm assuming that the connections don't have thread affinity. Also, not sure if it matters, but I'm using Entity Framework in .NET 3.5 SP1.
Anyway, when I've got high load situations (8+ threads), I'm getting lock timeout exceptions (regardless of the length of the timeout setting), and the exception always says the lock was on the __SysObjects table.
I'm not doing any DDL, so I don't understand why I would get locking timeouts on that table. Ideas?
I somewhat resolved this issue by making sure that my connections were closed after each use (as opposed to pooling the open connections), but if I let the code run for a long period of time I started getting OutOfMemoryException and AccessViolationException errors.
This smells like the SqlCeConnection class has some kind of thread affinity dependency. Either that, or it has a memory leak of some kind.
At any rate, I've given up on trying to pool these objects.
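For reference, the "close after each use" pattern that mostly worked is just the plain open-use-dispose idiom (a minimal sketch; the connection string and query are placeholders):
// Requires a reference to System.Data.SqlServerCe.
using (var connection = new SqlCeConnection(connectionString))
using (var command = new SqlCeCommand("SELECT COUNT(*) FROM Orders", connection))
{
    connection.Open();
    var count = (int)command.ExecuteScalar();
    // The connection and command are disposed here rather than returned to a
    // hand-rolled pool, so no SqlCeConnection is ever reused across threads.
}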
EDIT: This actually appears to be an issue addressed by Cumulative Update 2. Since updating my references to the new libs, I haven't seen this problem. See: http://support.microsoft.com/kb/983516