SQL Azure: Frequent error - sql

We are having trouble with SQL Azure. We deployed an old .NET 4.0/C# website on Azure. It mostly works, but every now and then database calls start failing: a couple of times inserts stopped working, and DataSet.Fill fails with a timeout or "Column not found".
As I understand it, this happens when SQL Azure is too busy to respond to our request and breaks the connection partway through, probably because it is restarting or switching nodes. However, it looks bad to the end user when such an error appears. Our database is less than 1 GB in size and we get 10-15 at a time, so I don't think we have a heavy load [Azure supports databases up to 150 GB; ours is less than 1% of that].
Can anyone suggest what we should do to avoid such errors? Can we detect them before they happen?

You will need to use transient fault handling. The Transient Fault Handling Application Block is a good place to start. When refactoring legacy code for SQL Azure, I found it easiest to use the SqlConnection and SqlCommand extension methods.
Example of the SqlConnection extension:
var retryPolicy = RetryPolicyFactory.GetDefaultSqlConnectionRetryPolicy();
using (var connection = new SqlConnection(connectionString))
{
    // Open the connection, retrying on transient faults
    connection.OpenWithRetry(retryPolicy);
    // Do the usual work with the connection
}

Related

The wait operation timed out. .aspx

I created an internal website for our company. It ran smoothly for several months, and then I added more items to the website. In production it ran normally at first. Then one of my users, connecting from another server, suddenly sent me a "The wait operation timed out." error. When I access that same link, it runs normally for me and for the others I asked to check the page. I have already increased the connection timeout, but still no luck. Does the error come from the other server? Can someone explain the possible causes?
This is what the other plant experiences: every time they first open the website, the error screen shows up, but when they refresh, they can use the website. I don't know why this happens. I need your help.
The error details are below:
Exception Details: System.ComponentModel.Win32Exception: The wait operation timed out
Source Error: An unhandled exception was generated during the execution of the current web request. Information regarding the origin and location of the exception can be identified using the exception stack trace below.
Thanks in advance
The fact that this happens for a user but not for the testers implies this may occur when the system is under load; database timeouts are pretty common in database queries functioning under stress if the database has been set up "out of the box" without tuning.
I would suggest referring to "The wait operation timed out. ASP".
I don't have enough information to troubleshoot the question properly, since I don't know what DBMS you are working with. But as a rule this happens because a call to the database is timing out. In SQL Server, increasing the CommandTimeout (NOT the connection timeout) is one of the quick-and-dirty ways to solve the problem.
In SQL Server, CommandTimeout is the time allowed for an operation to complete before it exits with a timeout error. ConnectionTimeout, by contrast, is the time the system waits when trying to open an initial connection to the database. Changing ConnectionTimeout won't help when an operation times out, but changing CommandTimeout will.
Other DBMS systems will have other mechanisms for resolving timeout issues.
That's one quick and dirty solution. The longer solution is to add more logging to your system to identify which calls are timing out, then doing some DBA work to optimize the query and database performance. My understanding is that entity frameworks also have tuning options for automatically generated queries, but exactly what those are depends on which one you're using!
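To make the CommandTimeout / ConnectionTimeout distinction concrete, here is a minimal sketch for ADO.NET with System.Data.SqlClient. The class and method names (TimeoutDemo, BuildCommand, WithConnectTimeout) and the timeout values are illustrative assumptions, not taken from the question.

```csharp
using System;
using System.Data.SqlClient;

public static class TimeoutDemo
{
    // Builds a command whose execution may run for up to commandTimeoutSeconds.
    // This is the setting that matters when a query itself is slow.
    public static SqlCommand BuildCommand(string sql, int commandTimeoutSeconds)
    {
        return new SqlCommand(sql)
        {
            // Seconds allowed for the statement to execute (SqlClient default: 30).
            CommandTimeout = commandTimeoutSeconds
        };
    }

    // Returns a connection string with an adjusted connect timeout.
    // This only affects how long opening the connection may take.
    public static string WithConnectTimeout(string connectionString, int seconds)
    {
        var builder = new SqlConnectionStringBuilder(connectionString)
        {
            // Seconds to wait while establishing the connection (SqlClient default: 15).
            ConnectTimeout = seconds
        };
        return builder.ConnectionString;
    }
}
```

If the timeouts come from slow queries, only the first method changes anything; raising the connect timeout leaves query execution limits untouched.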

Unhandled Exception Error - Login Failed for User

We have a strange error here. In our ASP.NET 4.6 app, using Entity Framework 6.2, we are getting "Login failed for user" when accessing the SQL Azure database. I'm pretty sure the cause of the error is switching tiers in Azure. What I don't get is why the error isn't caught. Every SQL operation we have is inside a try...catch block. The errors fall out of the block and get caught by Global.asax just before the app crashes.
We have
SetExecutionStrategy("System.Data.SqlClient", Function() New SqlServer.SqlAzureExecutionStrategy(10, TimeSpan.FromSeconds(7)))
which, as I understand it, will retry any SQL execution 10 times for at least 70 seconds from the first error. According to Microsoft tech support, this isn't engaged because it hasn't made the connection to SQL Azure yet. The ConnectRetryCount and interval in the connection string do not apply since it is talking to the server. The server is just saying, "I know you are there, but I'm not going to let you in!"
According to MS tech support, the only way around this is to have a try...catch block around all of our SQL commands... which we do! It just falls through and crashes the app!
I can't do a retry in Global.asax because at that point, it has already crashed.
According to MS, there is no way to trap the error in the context and retry from there. So, what's the solution? There must be some answer other than, "just let the app crash and have them refresh the page!"
When the page is refreshed seconds later, all is fine. No errors, no problems.
Example of one of the lines of code throwing the error:
MapTo = ctx.BrowserMaps.FirstOrDefault(Function(x) code.Contains(x.NameOrUserAgent))
It's really very straightforward. This one just happens to come up a lot because this code block is called frequently. The actual SQL request is irrelevant because no matter what line is used, the connection, within EF, fails.
Server logins are disconnected while scaling up or down to a new tier, and transactions are rolled back. Contained database logins, however, stay connected during the scaling process, and for that reason they are recommended over server logins.
A try/catch may not solve the issue, because the exception you catch may carry error number 0, and many errors in Azure SQL Database fall into that error 0 category.
Just a comment: performance may be poor right after scaling and improve after a few minutes. Query plans may also change.

Error 40 and SqlAzureExecutionStrategy

I have a service fabric service (guest executable), using entityframework core, talking to sql azure.
From time to time I see the following error:
A network-related or instance-specific error occurred while establishing a connection
to SQL Server. The server was not found or was not accessible. Verify that the
instance name is correct and that SQL Server is configured to allow remote connections.
(provider: Named Pipes Provider, error: 40 - Could not open a connection to SQL Server)
It seems transient as there are numerous database transactions that occur without errors. This seems to occur more when a node is busy.
I've added code at startup that calls EnableRetryOnFailure to enable the SqlServerRetryingExecutionStrategy:
services.AddEntityFrameworkSqlServer()
    .AddDbContext<MyDbContext>(options =>
        options.UseSqlServer(_configuration.GetConnectionString("MyDbConnection"),
            o => o.EnableRetryOnFailure()));
One major caveat is that at the moment I'm losing context, so I don't know what data was being updated or inserted, and I don't know whether the operation eventually succeeded.
Couple of questions:
From the transient detection code it doesn't look like error: 40 is caught, but my understanding is that error 40 may actually be another error (unclear). Is that correct?
Is this really a transient issue, or does it mean I have another problem?
Without additional logging (working on it), do we know whether the retry strategy logs the error but still retries, and may in fact have succeeded?
If this is a transient error but it's not caught by the default execution strategy, why not? And what would be the unintended consequences of subclassing the SqlAzureExecutionStrategy to include this error?
I've seen this question: "Sql Connection Error 40 Under Load". It feels familiar, but he seems to have resolved it by tuning his database, which I will look at doing. I'm trying to make my code more resilient in the case of database issues.
There is a certain version of EF Core that caches the query or requests if the time span between two database transactions is very small, so update your packages to make sure you are using the most recent version.
Query: Threading issues cause NullReferenceException in SimpleNullableDependentKeyValueFactory #5456
Check these other links:
https://github.com/aspnet/EntityFramework/issues/5456
https://github.com/aspnet/Security/issues/739
https://github.com/aspnet/EntityFramework/issues/6347
https://github.com/aspnet/EntityFramework/issues/4120
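On the question of treating extra errors as transient: as an alternative to subclassing the execution strategy, EF Core's SQL Server provider has an EnableRetryOnFailure overload that accepts additional error numbers. The sketch below is only an illustration under assumptions: the helper name AddResilientDb, the retry count and the delay are invented for the example, and the list of extra error numbers is deliberately left empty because the right values would have to come from the SqlException.Number values logged in your own environment.

```csharp
using System;
using Microsoft.EntityFrameworkCore;
using Microsoft.Extensions.DependencyInjection;

public static class ResilientDbRegistration
{
    // Placeholder: fill in with SqlException.Number values observed in your logs.
    // Left empty here on purpose; no specific numbers are being recommended.
    private static readonly int[] ExtraTransientErrorNumbers = new int[0];

    // Hypothetical helper that registers a DbContext with retry-on-failure enabled.
    public static IServiceCollection AddResilientDb<TContext>(
        this IServiceCollection services, string connectionString)
        where TContext : DbContext
    {
        return services.AddDbContext<TContext>(options =>
            options.UseSqlServer(connectionString, sql =>
                sql.EnableRetryOnFailure(
                    maxRetryCount: 5,                         // assumed value
                    maxRetryDelay: TimeSpan.FromSeconds(30),  // assumed value
                    errorNumbersToAdd: ExtraTransientErrorNumbers)));
    }
}
```

Adding an error number means every operation that fails with it is retried, including writes, so it is worth confirming from logs that the error really is transient before adding it.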

How to make Dapper resilient for SqlAzure?

I recently found out Entity Framework has a very easy way to make connections resilient in SQL Azure. Is there a recommended way to accomplish the same in Dapper?
The fastest way to protect against connection issues in C# against Azure is the Microsoft Transient Fault Handling Block.
For example, the code below would retry up to 3 times, with 1-second intervals in between, when attempting to open a connection to a Windows Azure SQL Database:
var retryStrategy = new FixedInterval(3, TimeSpan.FromSeconds(1));
var retryPolicy =
    new RetryPolicy<SqlDatabaseTransientErrorDetectionStrategy>(retryStrategy);
retryPolicy.ExecuteAction(() => myConnection.Open());
FixedInterval is the back-off policy, so it will try, wait 1 second, try again, and so on until it has retried 3 times.
SqlDatabaseTransientErrorDetectionStrategy simply checks the exception thrown: if it's a connection exception that should be retried, it tells the RetryPolicy to execute the action again. If it is not, the action is not retried and the original exception is thrown as normal.
As for when you should use it with Dapper: you can safely retry opening connections and read operations, but be wary of retrying write operations, as there is a risk of duplicating inserts, trying to delete a row twice, etc.
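To show the mechanics described above without depending on the library, here is a minimal hand-rolled sketch of a fixed-interval retry. It is not the Transient Fault Handling block's implementation; SimpleRetry, Execute, and the parameters maxAttempts and isTransient are names made up for this example, and deciding what counts as transient is left to the caller.

```csharp
using System;
using System.Threading;

public static class SimpleRetry
{
    // Runs action up to maxAttempts times in total, sleeping for `delay`
    // after each failed attempt that isTransient classifies as retryable.
    // Non-transient exceptions, and the failure of the final attempt,
    // propagate to the caller unchanged.
    public static T Execute<T>(Func<T> action, Func<Exception, bool> isTransient,
                               int maxAttempts, TimeSpan delay)
    {
        if (maxAttempts < 1)
            throw new ArgumentOutOfRangeException(nameof(maxAttempts));

        for (int attempt = 1; ; attempt++)
        {
            try
            {
                return action();
            }
            catch (Exception ex) when (attempt < maxAttempts && isTransient(ex))
            {
                // Fixed back-off: wait the same interval before every retry.
                Thread.Sleep(delay);
            }
        }
    }
}
```

A Dapper read could be passed as the action, since repeating a read is harmless. For writes the same caution applies as above: a retry after a failure that happened after the server applied the change can duplicate it.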
The library is available as a NuGet package, which includes the detection strategies for Windows Azure.

How long should you keep an ODBC SQL connection open?

A long-running application uses the MS C ODBC API to create and use SQL connections to an Oracle DB. The application was originally designed to establish an ODBC connection at startup and keep that connection indefinitely as the application runs, potentially for weeks or months.
We're seeing very infrequent cases where a connection suddenly dies and I wondered if that's because we're using them wrong, or if it's considered OK to hold a connection like this. Can anyone point me to some definitive information on the subject?
I'm not sure there's definitive information regarding this, but with long-running programs you always have to be prepared for this kind of incident. They just happen (and not only with DB connections but also with sockets that remain open for extended periods). I have no experience with Oracle, but I have a very similar setup with Informix, and this is (in pseudocode) what we do:
while (programissupposedtorun) {
    opendb();
    do {
        youractivities();
    } while (dbisok);
    closedbandcleanup();
}
As long as you are able to correctly detect that the connection died and are able to resume processing without losing data you should be OK.
I'm not familiar with ODBC, but such use cases are best handled with a connection pool. Your application can simply request a connection from the pool when it has some work and release it as soon as it's done; the pool will take care of actually (re)connecting to the database.
A quick search for ODBC connection pooling brought this up: Driver Manager Connection Pooling