XDB Connection Reset Errot - emc

xDB server was running successfully in our environment.
But after some days, we are getting "Connection reset" exception.
What could be the possible reason and how we can solve this issue?
Thanks & Regards
Arpan

Without further information about your application and the way you use xDB it is very difficult to answer this, but "Connection reset" type of errors usually occur in two situations:
The server-side closed the socket, for instance as a result of an unexpected error/crash.
There was no communication on the socket for long time and the OS closed the channel because of the TCP keep-alive timeout (which I believe is usually 2 hours).
Can you describe in more detail the circumstances when you get the error? Is the server still operational after you start getting the error? If so, are you perhaps pooling xDB sessions and reusing them after potentially long periods of inactivity?

Related

The wait operation timed out. .aspx

I created an internal website for our company. It run smoothly for several months and then I add more items to website. When I run in live, it run normally. Then suddenly one of my user from another server sending me an "The Wait operation timed out." error. When I check access that certain link, It run normally for me and some other who I ask to check if they access that page. I already increase the connection timeout but still no luck. Is it the error come from another server? Can someone explain the possible causes?
This is how the another plant faced, every time they firstly open the website, error screen show up, but when they refresh it, they can use the website. I dont know why this happened. I need your help.
Down below is a error detail:
1.Exception Details: System.ComponentModel.Win32Exception: The wait operation timed out
source error :An unhandled exception was generated during the execution of the current web request.
2.Information regarding the origin and location of the exception can be identified using the exception stack trace below.
Thanks in advance
The fact that this happens for a user but not for the testers implies this may occur when the system is under load; database timeouts are pretty common in database queries functioning under stress if the database has been set up "out of the box" without tuning.
I would suggest referring to
The wait operation timed out. ASP
I don't have enough information to troubleshoot more question properly, since I don't know what DBMS you are working with. But as a rule this seems to happen because a call to the database is timing out. In SQL Server, increasing the CommandTimeout (NOT connection timeout) is one of the quick-and-dirty ways to solve the problem.
In SQL Server, CommandTimeout is the time allowed for an operation before exiting with a time out error. Connectiontimeout, by contrast, is the time the system waits when trying to open an initial connection to the database. Changing connectiontimeout won't help with the timeout of an operation, but commandtimeout will.
Other DBMS systems will have other mechanisms for resolving timeout issues.
That's one quick and dirty solution. The longer solution is to add more logging to your system to identify which calls are timing out, then doing some DBA work to optimize the query and database performance. My understanding is that entity frameworks also have tuning options for automatically generated queries, but exactly what those are depends on which one you're using!

Error 40 and SqlAzureExecutionStrategy

I have a service fabric service (guest executable), using entityframework core, talking to sql azure.
From time to time I see the following error:
A network-related or instance-specific error occurred while establishing a connection
to SQL Server. The server was not found or was not accessible. Verify that the
instance name is correct and that SQL Server is configured to allow remote connections.
(provider: Named Pipes Provider, error: 40 - Could not open a connection to SQL Server)
It seems transient as there are numerous database transactions that occur without errors. This seems to occur more when a node is busy.
I've added code in start up to set EnableRetryOnFailure to set the SqlServerRetryingExecutionStrategy:
services.AddEntityFrameworkSqlServer()
.AddDbContext<MyDbContext>(options =>
options.UseSqlServer(_configuration.GetConnectionString("MyDbConnection"),
o => o.EnableRetryOnFailure()))
One major caveat, is at the moment I'm losing context so I don't know what data was attempting to be updated/inserted, so I don't know if it was eventually successful or not.
Couple of questions:
From the Transient Detection Code it doesn't look like error: 40 is caught, but my understanding is that error 40 may actually be another error (unclear). Is that correct?
Is this really a transient issue or does it mean I have another problem?
Without additional logging (working on it), do we know if the retry strategy logs the error, but still retry's and in fact may have been successful?
If this is a transient error, but it's not caught in the default execution strategy, why not? and what would be unintentded consequences of sub classing the SqlAzureExecutionStrategy to include this error.
I've seen this question: Sql Connection Error 40 Under Load, and it feels familiar, but he seems to have resolved it by tuning his database - which I will look at doing, I'm trying to make my code more resilient in the case of database issues.
There is a certain version of EF Core that caches the query or requests if the time span between two database transactions is very small, so update your packages to make sure you are using the most recent.
Query: Threading issues cause NullReferenceException in SimpleNullableDependentKeyValueFactory #5456
check these other links
https://github.com/aspnet/EntityFramework/issues/5456
https://github.com/aspnet/Security/issues/739
https://github.com/aspnet/EntityFramework/issues/6347
https://github.com/aspnet/EntityFramework/issues/4120

Async=true and Entity Framework

Background WCF Stack, Data Access Implemented in Entity Framework, Simple ASP.NET Front End
This is a two part question.
Recently we ran into an issue with periodic crashes with an exception that read:
A transport-level error has occurred when receiving results from the server. (provider: TCP Provider, error: 0 - The specified network name is no longer available
We had been running our application without issues for over a week, and then all the sudden we were hit with this random crash/ If I had to guess I would say it was network related, but we were unable to determine the exact source. Has anyone periodically gotten this message? If so what was the root cause?
Second question is someone suggested to set "async=true" in our Entity Framework connection string. I was under the impression this just enables the async api. Does this do anything when you are using EF? Does switching this flag do anything with the queries that get generated by EF?
To be that guy I will answer this one on my own.
First I posted the question about the "async=true"s effect on entity framrwork to MS and no one answered ... as usual(if they answer i will update this post).
Our issue:
A transport-level error has occurred when receiving results from the server. (provider: TCP Provider, error: 0 - The specified network name is no longer available
Was environment related. Something was causing the DB to run a little bit slower, but it was hinting to a larger issue. Apparently EF has horrible issues when you share context between threads (not an easy problem to solve), so we were seeing a race condition with opening connections.
We basically had a "read only context" that only did gets. Our issue was two threads attempt to open the connection at the same time, one wins, the other loses resulting in some variation of the exception below:
The connection was not closed. The connection's current state is connecting.
Our solution was to convert our singleton to be thread specific. Not exactly what we wanted, but it worked, and when we pushed this fix our other issue magically went away.
The second half to this question was what does async=true do. When it comes to EF, it made our system crash. We had a block of code that did a join, and if async=true and MARS=false we got a:
There is already an open DataReader associated with this Command which must be closed first
Once we cut back on MARS, and disabled async things were good again.

SQL Server - Timed Out Exception

We are facing the SQL Timed out issue and I found that the Error event ID is either Event 5586 or 3355 (Unable to connect / Network Issue), also could see few other DB related error event ids (3351 & 3760 - Permission issues) reported at different times.
what could be the reason? any help would be appreciated..
Can you elaborate a little? When is this happening? Can you reproduce the behavior or is it sporadic?
It appears SharePoint is involved. Is it possible there is high demand for a large file?
You should check for blocking/locking that might be preventing your query from completing. Also, if you have lots of computed/calculated columns (or just LOTS of data), your query make take a long time to compute.
Finally, if you can't find something blocking your result or optimize your query, it's possible to increase the timeout duration (set it to "0" for no timeout). Do this in Enterprise Manager under the server or database settings.
Troubleshooting Kerberos Errors. It never fails.
Are some of your webapps running under either the Local Service or Network Service account? If so, if your databases are not on the same machine (i.e. SharePoint is on machine A and SQL on machine B), authentication will fail for some tasks (i.e. timerjob related actions etc.) but not all. For instance it seems content databases are still accessible (weird, i know, but i've seen it happen....).

ODBC SQL Server driver error

I have a VB6 app that access's a database thru a ODBC Connection. It will run fine for a few hours then I get the following Error. Any Ideas?
[Microsoft][ODBC SQL Server Driver][DBNETLIB]ConnectionWrite(WrapperWrite())
From Googling the error, it sounds like that's just ADO's way of saying it can't connect - that the server is unreachable. Are there are any other services on that server or that use the database that become unavailable at the same time as this error? It sounds like the client is just losing its connection, so I'd look for anything around that - dropped network connectivity, or a downed/overwhelmed server, to name a few examples.
Does your program have to reach across a network to get to the Access file?
If so I'd look into any intermittent network connectivity issues, especially if your program is always connected to the data source.
Check any logs you can to see what's happening on your network at the time of the error.
If possible change your app to connect to the data source only when you need to access it and then disconnect when done.
Is there more than one instance of the program running on the same and/or different machines? If so, do they all get the error at the same time?
If possible try to have more than one instance of you program running on the same machine and see if they all get the error at the same time.
Also:
Does the error happen about the same amount of time after initial connection?
Does the error happen about the same amount of inactivity in your application?