Unhandled Exception Error - Login Failed for User - sql

We have a strange error here. In our ASP.NET 4.6 app, using Entity Framework 6.2, we are getting "Login failed for user" when accessing the SQL Azure database. I'm pretty sure the cause of the error is switching tiers in Azure. What I don't get is why the error isn't caught. Every SQL operation we have is inside a try...catch block. The errors fall out of the block and get caught by Global.asax just before the app crashes.
We have
SetExecutionStrategy("System.Data.SqlClient", Function() New SqlServer.SqlAzureExecutionStrategy(10, TimeSpan.FromSeconds(7)))
which, as I understand it, will retry any SQL execution 10 times for at least 70 seconds from the first error. According to Microsoft tech support, this isn't engaged because it hasn't made the connection to SQL Azure yet. The ConnectRetryCount and interval in the connection string do not apply since it is talking to the server. The server is just saying, "I know you are there, but I'm not going to let you in!"
According to MS Tech support, the only way around this is to have a try...catch block around all of our SQL commands... which we do! It just falls through and crashes the app!
I can't do a retry in Global.asax because at that point, it has already crashed.
According to MS, there is no way to trap the error in the context and retry from there. So, what's the solution? There must be some answer other than, "just let the app crash and have them refresh the page!"
When the page is refreshed seconds later, all is fine. No errors, no problems.
Example of one of the lines of code throwing the error:
MapTo = ctx.BrowserMaps.FirstOrDefault(Function(x) code.Contains(x.NameOrUserAgent))
It's really very straightforward. This one just happens to come up a lot because this code block is called frequently. The actual SQL request is irrelevant because no matter what line is used, the connection, within EF, fails.

Server logins will be disconnected while scaling up/down to a new tier, and transactions are rolled back. However, contained database logins stay connected during the scaling process, and for that reason they are recommended over server logins.
Having a try/catch may not solve the issue, because you may be catching error number 0, and a lot of errors in Azure SQL Database fall into that error 0 category.
Just a comment: performance may be poor right after scaling and improve after a few minutes. Query plans may also change.
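As a rough illustration of retrying a whole unit of work at the call site, here is a minimal sketch in Python, used only to keep it short and runnable. Everything in it is hypothetical: `TransientDbError` stands in for whatever exception the data layer raises, and the choice to treat error 0 and error 18456 (SQL Server's "Login failed for user") as retryable is an assumption to be checked against your own logs, not a recommendation.

```python
import time


class TransientDbError(Exception):
    """Stand-in for a database exception carrying a SQL error number."""

    def __init__(self, number, message=""):
        super().__init__(message)
        self.number = number


# Assumption: which error numbers count as transient is a judgment call.
# 18456 is "Login failed for user"; 0 is the catch-all category mentioned above.
RETRYABLE_NUMBERS = {0, 18456}


def run_with_retry(operation, max_attempts=5, base_delay=0.5, sleep=time.sleep):
    """Run operation(); on a retryable error, wait and run it again from scratch."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except TransientDbError as err:
            # Give up on errors we do not recognise, or once attempts run out.
            if err.number not in RETRYABLE_NUMBERS or attempt == max_attempts:
                raise
            # Exponential backoff: 0.5s, 1s, 2s, ...
            sleep(base_delay * 2 ** (attempt - 1))
```

The operation passed in should open its own connection or context each time, so that a retry starts from a clean state rather than reusing one that failed during login.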

Related

The wait operation timed out. .aspx

I created an internal website for our company. It ran smoothly for several months, and then I added more items to the website. When I ran it live, it ran normally. Then suddenly one of my users on another server sent me a "The wait operation timed out." error. When I accessed that link myself, it ran normally for me and for the others I asked to check the page. I have already increased the connection timeout, but still no luck. Does the error come from the other server? Can someone explain the possible causes?
This is what the other plant sees: every time they first open the website, the error screen shows up, but when they refresh it, they can use the website. I don't know why this happens. I need your help.
Below is the error detail:
1. Exception Details: System.ComponentModel.Win32Exception: The wait operation timed out
Source Error: An unhandled exception was generated during the execution of the current web request.
2. Information regarding the origin and location of the exception can be identified using the exception stack trace below.
Thanks in advance
The fact that this happens for a user but not for the testers implies this may occur when the system is under load; database timeouts are pretty common in database queries functioning under stress if the database has been set up "out of the box" without tuning.
I would suggest referring to
The wait operation timed out. ASP
I don't have enough information to troubleshoot the question properly, since I don't know which DBMS you are working with. But as a rule, this seems to happen because a call to the database is timing out. In SQL Server, increasing the CommandTimeout (NOT the connection timeout) is one of the quick-and-dirty ways to solve the problem.
In SQL Server, CommandTimeout is the time allowed for an operation before it exits with a timeout error. ConnectionTimeout, by contrast, is the time the system waits when trying to open an initial connection to the database. Changing ConnectionTimeout won't help with the timeout of an operation, but CommandTimeout will.
Other DBMS systems will have other mechanisms for resolving timeout issues.
That's one quick and dirty solution. The longer solution is to add more logging to your system to identify which calls are timing out, then doing some DBA work to optimize the query and database performance. My understanding is that entity frameworks also have tuning options for automatically generated queries, but exactly what those are depends on which one you're using!
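The "add more logging" step could look something like the sketch below: a small wrapper that records how long each database call takes and flags the slow ones. It is written in Python purely for brevity; the logger name, the `timed_call` helper and the 5-second threshold are all made up for illustration.

```python
import logging
import time
from contextlib import contextmanager

logger = logging.getLogger("db.timing")


@contextmanager
def timed_call(name, warn_after_seconds=5.0, clock=time.perf_counter):
    """Log how long the wrapped database call took, and whether it failed."""
    start = clock()
    try:
        yield
    except Exception:
        # Record the elapsed time of failing calls too, then re-raise.
        logger.exception("%s failed after %.2fs", name, clock() - start)
        raise
    else:
        elapsed = clock() - start
        # Slow calls are logged as warnings so they stand out in the log.
        level = logging.WARNING if elapsed >= warn_after_seconds else logging.DEBUG
        logger.log(level, "%s took %.2fs", name, elapsed)
```

Wrapping each query as `with timed_call("load_report"): ...` gives a log that shows which calls approach the timeout, which is the starting point for the tuning work.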

Error 40 and SqlAzureExecutionStrategy

I have a service fabric service (guest executable), using entityframework core, talking to sql azure.
From time to time I see the following error:
A network-related or instance-specific error occurred while establishing a connection
to SQL Server. The server was not found or was not accessible. Verify that the
instance name is correct and that SQL Server is configured to allow remote connections.
(provider: Named Pipes Provider, error: 40 - Could not open a connection to SQL Server)
It seems transient as there are numerous database transactions that occur without errors. This seems to occur more when a node is busy.
I've added code in start up to set EnableRetryOnFailure to set the SqlServerRetryingExecutionStrategy:
services.AddEntityFrameworkSqlServer()
.AddDbContext<MyDbContext>(options =>
options.UseSqlServer(_configuration.GetConnectionString("MyDbConnection"),
o => o.EnableRetryOnFailure()))
One major caveat is that at the moment I'm losing context, so I don't know what data the code was attempting to update or insert, and therefore I don't know whether it eventually succeeded.
Couple of questions:
From the Transient Detection Code it doesn't look like error: 40 is caught, but my understanding is that error 40 may actually be another error (unclear). Is that correct?
Is this really a transient issue or does it mean I have another problem?
Without additional logging (working on it), do we know if the retry strategy logs the error but still retries, and in fact may have been successful?
If this is a transient error, but it's not caught in the default execution strategy, why not? And what would be the unintended consequences of subclassing the SqlAzureExecutionStrategy to include this error?
I've seen this question: Sql Connection Error 40 Under Load, and it feels familiar, but he seems to have resolved it by tuning his database - which I will look at doing, I'm trying to make my code more resilient in the case of database issues.
There is a certain version of EF Core that caches the query or requests if the time span between two database transactions is very small, so update your packages to make sure you are using the most recent.
Query: Threading issues cause NullReferenceException in SimpleNullableDependentKeyValueFactory #5456
Check these other links:
https://github.com/aspnet/EntityFramework/issues/5456
https://github.com/aspnet/Security/issues/739
https://github.com/aspnet/EntityFramework/issues/6347
https://github.com/aspnet/EntityFramework/issues/4120
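On the question of subclassing the strategy to include error 40, and of whether retries get logged, the general shape can be sketched as follows. This is Python, not EF Core, and `RetryStrategy`, `DbError` and the error numbers in the base set are hypothetical stand-ins chosen for illustration; it makes no claim about what the real execution strategies log or detect. Backoff delays are left out to keep it short.

```python
import logging

logger = logging.getLogger("db.retry")


class DbError(Exception):
    """Stand-in for a driver exception that exposes a numeric error code."""

    def __init__(self, number):
        super().__init__(f"database error {number}")
        self.number = number


class RetryStrategy:
    """Hypothetical base strategy; the codes listed are illustrative only."""

    transient_numbers = frozenset({40613, 40501})

    def __init__(self, max_retries=3):
        self.max_retries = max_retries

    def should_retry(self, error):
        return error.number in self.transient_numbers

    def execute(self, operation):
        failures = 0
        while True:
            try:
                return operation()
            except DbError as err:
                failures += 1
                if failures > self.max_retries or not self.should_retry(err):
                    raise
                # Log every retry so a later success does not hide the error.
                logger.warning("retry %d after error %d", failures, err.number)


class RetryStrategyWithError40(RetryStrategy):
    """Subclass that additionally treats error 40 as transient."""

    transient_numbers = RetryStrategy.transient_numbers | {40}
```

The trade-off of widening the set is visible here: an error 40 caused by a genuinely wrong server name would also be retried until the limit is reached, which delays the failure without fixing it.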

Cannot access SQL azure

Just had a bizarre issue with SQL Azure, and it happened during a small phase just before full go-live, with some users doing data entry.
"Database 'dbname' on server 'xxx' is not currently available. Please retry the connection later. If the problem persists, contact customer support."
When I tried to connect via SQL Azure database website I got:
"Firewall check failed.
Resource ID : 1. The request minimum guarantee is 0,
maximum limit is 180 and the current usage for the database is 0.
However, the server is currently too busy to support request greater than 0 for this database."
Looking at the databases section of the Azure Management website the site reported it couldn't access the DB, but I didn't capture the exact error message unfortunately.
Bizarrely, a couple of my users were still able to login to our system website that access the DB, and view and save data. Eventually they lost connection too however.
After an hour or so, the databases came back to life and we could fully access them again.
I have looked at the server's master DB event table using queries from here, and there were a couple of connection failures but nothing interesting. No throttling or deadlocks, just a couple of failed connections that said "Client may have timed out when establishing connection. Try increasing the connection timeout." in the description.
Any ideas where else to look?
Business users have had a massive drop in confidence because of this.
What you're describing normally occurs because of:
1) The SQL connection limit being hit. Assuming you don't see this often, it is unlikely to be the cause, but it is worth checking; putting a limit on your connection pool can help.
2) Your neighbours being extremely noisy, so the node re-adjusts.
3) Hardware failure and Microsoft bringing your database back online on a different node. This can take some time.
Normally I have seen this when Microsoft has throttled or had problems with a box and had to recover everyone. Because you are on a shared system, keep in mind that they are recovering everyone else on that node as well, so sometimes this takes time.
The best bet, if you are worried and need a resolution for the business, is to open a support ticket with MS and give them the time of the incident and the error message you saw. They will investigate, and they generally have really good back-end telemetry that will point to a reason. This will let you give the business an explanation, and then you can make a call on future plans and contingencies. Keep in mind, though, that SQL Azure is a shared system and transient errors can happen; you might need to design more failover into your system.
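To make the connection-pool limit from point 1 concrete, here is a minimal sketch of capping concurrent connections in application code. It is in Python for brevity, and `BoundedConnectionGate` and its `connect` factory are hypothetical names. In ADO.NET the pool ceiling is normally set with the `Max Pool Size` connection string keyword rather than hand-written code, so treat this only as an illustration of the idea.

```python
import threading
from contextlib import contextmanager


class BoundedConnectionGate:
    """Caps how many connections this process holds open at once."""

    def __init__(self, connect, max_connections=10, wait_seconds=5.0):
        self._connect = connect  # hypothetical factory returning an object with close()
        self._slots = threading.BoundedSemaphore(max_connections)
        self._wait_seconds = wait_seconds

    @contextmanager
    def connection(self):
        # Fail fast in the application instead of piling requests onto the server.
        if not self._slots.acquire(timeout=self._wait_seconds):
            raise TimeoutError("no free connection slot")
        try:
            conn = self._connect()
            try:
                yield conn
            finally:
                conn.close()
        finally:
            self._slots.release()
```

With a cap like this, a burst of requests waits briefly or fails inside the application, rather than pushing the database past its connection limit.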

Async=true and Entity Framework

Background WCF Stack, Data Access Implemented in Entity Framework, Simple ASP.NET Front End
This is a two part question.
Recently we ran into an issue with periodic crashes with an exception that read:
A transport-level error has occurred when receiving results from the server. (provider: TCP Provider, error: 0 - The specified network name is no longer available)
We had been running our application without issues for over a week, and then all of a sudden we were hit with this random crash. If I had to guess, I would say it was network related, but we were unable to determine the exact source. Has anyone periodically gotten this message? If so, what was the root cause?
The second question: someone suggested that we set "async=true" in our Entity Framework connection string. I was under the impression this just enables the async API. Does this do anything when you are using EF? Does switching this flag do anything to the queries that get generated by EF?
To be that guy I will answer this one on my own.
First, I posted the question about the effect of "async=true" on Entity Framework to MS and no one answered ... as usual (if they answer, I will update this post).
Our issue:
A transport-level error has occurred when receiving results from the server. (provider: TCP Provider, error: 0 - The specified network name is no longer available)
Was environment related. Something was causing the DB to run a little bit slower, but it was hinting at a larger issue. Apparently EF has horrible issues when you share a context between threads (not an easy problem to solve), so we were seeing a race condition when opening connections.
We basically had a "read only context" that only did gets. Our issue was two threads attempting to open the connection at the same time: one wins, the other loses, resulting in some variation of the exception below:
The connection was not closed. The connection's current state is connecting.
Our solution was to convert our singleton to be thread specific. Not exactly what we wanted, but it worked, and when we pushed this fix our other issue magically went away.
The second half to this question was what does async=true do. When it comes to EF, it made our system crash. We had a block of code that did a join, and if async=true and MARS=false we got a:
There is already an open DataReader associated with this Command which must be closed first
Once we cut back on MARS and disabled async, things were good again.
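The "singleton converted to be thread specific" fix described above can be sketched roughly like this. It is Python rather than .NET, and `FakeContext` and `get_context` are made-up names; in .NET the comparable tools would be `ThreadLocal<T>` or a `[ThreadStatic]` field, which is an assumption about how the fix was done, not something the post states.

```python
import threading


class FakeContext:
    """Stand-in for a data context that is not safe to share across threads."""

    def __init__(self):
        self.owner = threading.get_ident()


# Each thread sees its own attributes on this object.
_local = threading.local()


def get_context():
    """Return this thread's context, creating it on first use."""
    ctx = getattr(_local, "ctx", None)
    if ctx is None:
        ctx = _local.ctx = FakeContext()
    return ctx
```

Because each thread gets its own context, and therefore its own connection, two threads can no longer race to open the same connection.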

SQL Server Express Idle Mode Partial Data Returns?

I'm attempting to help our network engineers troubleshoot a situation for one of our clients. This client purchased a point-of-sale system from quite literally a "mom-and-pop" vendor, and said vendor recommended SQL Server Express 2005 as the back-end database to save the client from having to incur extra licensing fees. (Please don't get me started on that!)
We didn't write the app, and because it's a commercial app, we have no source code available. (Not that it would help us if we did; the thing was built in PowerBuilder, so we don't have tooling for it.) The app does none of its own logging, that we can ascertain. All we have to go on is SQL Server Express's own logging.
In the application, an end user swipes a membership card. Occasionally (a few times a day), the swipe will not return data from the database. The message on screen will say, "Member 123 not found." (The member numbers are actually six digits, "000123.") A rescan immediately afterward returns the member data correctly.
We've eliminated the scanner itself as a source of issues -- it routinely scans the full six-digit number. A scan of SQL Server Express's log indicates that it is coming back online from being idle, often at the point of the scan (but also at several other times per day). (Idle mode is explained here.)
I understand that allocating/deallocating RAM the way SQL Express does is a time-consuming process, especially if we're talking about hundreds of megabytes at a time -- which appears to be the case.
What we're not sure of is whether or not we're getting back partial data, or if the app is simply failing to connect to the database and displaying a generic error message. Since everything is so opaque, and the client is (for obvious reasons) unwilling to pay us to sit in their facility for 8 hours or so to physically see it happen (perhaps with network monitoring/packet sniffing tools), we're kind of at a loss.
At this point, our recommendation is that the client upgrade to SQL Server 2005 Workgroup Edition, with 5 CALs. But that doesn't completely sit well with me as the solution to this issue, because I'm reasonably certain that no SQL Server ever returns partial data -- if you can't connect, you can't connect. (That said, I still recommend it because it's a solution to a number of their other issues!)
I don't have much experience with Express. (I never use it for anything but local development, and there only at home; I certainly never recommend it to my clients.)
My question to those who might have experience with Express is, have you ever seen an instance of SQL Express return partial data, without the app itself being the cause of it? Specifically, have you seen this behavior when returning from idle mode?
(For what it's worth, we're inclined to believe that the app is failing to connect and merely displaying a generic error message, lopping off leading zeroes on the member ID when it does. That seems the most reasonable answer -- a third question might be, do you guys concur with that assessment?)
I've never heard of or experienced SQL Server Express returning partial data. It's essentially the same code base as the full SQL Server.
It is more likely that the application is experiencing a timeout (which defaults to 30 seconds) due to SQL Server Express going idle. The application probably receives a timeout that it does not expect and does not handle it well.
The problem and possible solutions are discussed in this forum thread: http://social.msdn.microsoft.com/forums/en-US/sqlexpress/thread/a8fbf8d6-9949-47a5-a32b-50f8131f1127/
I suspect you have a connection string that looks like this:
Data Source=.\SQLEXPRESS; Integrated Security=True;AttachDbFilename=|DataDirectory|\myDatabase.mdf;User Instance=True
From the referenced thread:
This connection string will cause an initial connection to the main instance (.\SQLEXPRESS) and then instruct the main instance to spawn a new instance of SQL Server under the user's context and attach the database specified to that new User Instance. The User Instance is a completely separate running instance of SQL Server from the main instance that is unique to the user and that will be shut down when there are no longer any connections to it.
This is totally different than attaching a database to the main instance, which stays running at all times, unless you've manually shut it down. If your question is about the main instance going into an Idle state, then your question is not unique to SQL Express and you should ask this question in the Database Engine forum. I believe all Editions of SQL Server have an Idle state and the other forum would be where you can find out how to affect that behavior.