Azure SQL Server transport layer error - to retry or not?

I received a transport-level error from SQL Server. The exception looks like a network blip: the same query succeeded a second later.
Exception in SqlRetryStrategyExecutor System.Data.SqlClient.SqlException (0x80131904): A transport-level error has occurred when receiving results from the server. (provider: TCP Provider, error: 0 - The semaphore timeout period has expired.)
I have a retry executor that retries whitelisted transient exceptions. The exception raised in this instance was not whitelisted.
Here are the error details:
Error Number:121,State:0,Class:20
According to the Microsoft documentation at https://learn.microsoft.com/en-us/previous-versions/sql/sql-server-2008-r2/cc645611(v=sql.105), error 121 is not a transient error but an INSERT statement error.
How can I identify this transient network blip and retry the operation when the error code is 121? Should I rely on Class = 20?
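One possible approach, sketched in Python for brevity rather than in the .NET code the executor is presumably written in. For a transport-level failure the reported number appears to be the Win32 code from the network layer (121 is ERROR_SEM_TIMEOUT, which matches the "semaphore timeout period has expired" text) rather than server message 121 from the linked table. The sketch treats an error as a retryable blip only when the number is on a whitelist and the class is 20 or higher, so a server-side message that happens to share the number but has a lower severity is not retried. `SqlErrorInfo`, `SqlError`, `run_with_retry`, and the exact set of numbers are illustrative assumptions, not an official list.

```python
import time
from dataclasses import dataclass

# Assumed whitelist of Win32/Winsock codes that usually point to a network blip.
TRANSIENT_NETWORK_NUMBERS = frozenset({
    64,     # ERROR_NETNAME_DELETED
    121,    # ERROR_SEM_TIMEOUT ("The semaphore timeout period has expired")
    233,    # ERROR_PIPE_NOT_CONNECTED
    10053,  # WSAECONNABORTED
    10054,  # WSAECONNRESET
    10060,  # WSAETIMEDOUT
})


@dataclass(frozen=True)
class SqlErrorInfo:
    """Hypothetical stand-in for the Number/Class/State of a SQL exception."""
    number: int
    error_class: int
    state: int = 0


class SqlError(Exception):
    """Hypothetical exception type carrying the error details."""

    def __init__(self, info: SqlErrorInfo):
        super().__init__(f"SQL error {info.number}, class {info.error_class}")
        self.info = info


def is_transient_network_error(info: SqlErrorInfo) -> bool:
    # Require both conditions: class >= 20 means the connection itself failed,
    # and the whitelist narrows that down to known network-level codes.
    return info.error_class >= 20 and info.number in TRANSIENT_NETWORK_NUMBERS


def run_with_retry(operation, max_attempts=3, base_delay=0.5, sleep=time.sleep):
    """Run operation(), retrying transient network errors with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except SqlError as exc:
            last_attempt = attempt == max_attempts
            if last_attempt or not is_transient_network_error(exc.info):
                raise
            sleep(base_delay * 2 ** (attempt - 1))
```

Relying on Class = 20 alone would also retry other fatal connection errors, so combining it with the number keeps the whitelist narrow. Retrying is only safe when the operation is idempotent or wrapped in a transaction that was rolled back with the broken connection.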

Related

Multiple users connecting to WCF at the same time give a socket exception

When multiple users connect to the WCF service at the same time, we get a socket exception.
Error Message:
{"The socket transfer timed out after 00:00:09.9844000. You have exceeded the timeout set on your binding. The time allotted to this operation may have been a portion of a longer timeout."}
Error code:10060

Network path not found exception encountered randomly

An Azure .NET MVC application is encountering the following exception:
"The network path was not found"
Unlike the other questions I found before asking this one, the problem does not happen all the time. The application works as expected, but the issue occurs randomly, about once every 15-25 days. It has been roughly 50 days since the application was deployed to production, and we have encountered it twice; it never occurred in the Azure UAT environment (where there were few users).
The issue is temporarily resolved by an IIS reset using the command:
iisreset
Source Error:
An unhandled exception was generated during the execution of the
current web request. Information regarding the origin and location of
the exception can be identified using the exception stack trace below.
Stack Trace:
[Win32Exception (0x80004005): The network path was not found]
[SqlException (0x80131904): A network-related or instance-specific
error occurred while establishing a connection to SQL Server. The
server was not found or was not accessible. Verify that the instance
name is correct and that SQL Server is configured to allow remote
connections. (provider: Named Pipes Provider, error: 40 - Could not
open a connection to SQL Server)]
Any idea?
Checklist:
Can you isolate the problem? If you have two web servers and one db, for example, when it happens does it happen on both web servers at the same time?
Set up monitoring of all critical network paths. Start monitoring possible causes (DHCP leases, even if you get a static IP from the server, would be one of the first).
Log as much information as possible. Can you determine if you are getting an ICMP error from a router or an issue from a host?
In my experience, these things are usually intermittent network issues, often on peering connections, between data centers, and the like.
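As a rough illustration of the monitoring and logging points above, here is a minimal sketch of a probe that periodically opens a TCP connection to the database host and logs the outcome with a timestamp, so that a failure in the application can later be matched against a failure on the network path. The function names, the interval, and whatever host and port you pass in are assumptions made for the example; it checks TCP reachability only and says nothing about named pipes or DHCP.

```python
import logging
import socket
import time


def probe_tcp(host, port, timeout=3.0):
    """Try one TCP connection; return (ok, detail, elapsed_seconds)."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True, "connected", time.monotonic() - start
    except OSError as exc:
        # Covers timeouts, refused connections, and name resolution failures.
        return False, f"{type(exc).__name__}: {exc}", time.monotonic() - start


def monitor(host, port, interval=30.0, iterations=None, sleep=time.sleep):
    """Probe repeatedly and log each result; iterations=None runs until stopped."""
    log = logging.getLogger("netprobe")
    count = 0
    while iterations is None or count < iterations:
        ok, detail, elapsed = probe_tcp(host, port)
        level = logging.INFO if ok else logging.ERROR
        log.log(level, "%s:%s %s after %.3fs", host, port, detail, elapsed)
        count += 1
        if iterations is None or count < iterations:
            sleep(interval)
    return count
```

Running the same probe from each web server at once helps answer the isolation question: if only one server logs failures at the time of the exception, the problem is more likely local to that host than to the database.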

Prototype project with RabbitMQ+RavenDB repeated SharedQueue closed errors from RabbitMQ

I've created a simple saga prototype project with RabbitMQ as the transport and RavenDB as the persistence mechanism. The prototype runs as expected, but every few seconds I get this error message:
ERROR NServiceBus.Transports.RabbitMQ.RabbitMqDequeueStrategy Failed to receive messages from [Assembly].Retries
System.AggregateException: One or more errors occurred. --> System.IO.EndOfStreamException: SharedQueue closed
at RabbitMQ.Util.SharedQueue1.EnsureIsOpen()
at RabbitMQ.Util.SharedQueue1.Dequeue(int 32 milliseconds timeout.......
I also get an almost identical message immediately after the one above, but it says it failed to receive messages from RabbitMGPoller.Timeouts.
In addition to that there are constant INFO messages that say:
NServiceBus.Transports.RabbitMQ.RabbitMqConnectionManager Disconnected from RabbitMQ broker, reason: AMQP close-reason, initiated Library, code=0 text="End of stream"... cause=System.IOException:Unable to write data to the transport connection: An existing connection was forcibly closed by the remote host. ---> System.Net.Sockets.SocketException: An existing connection was forcibly closed by the remote host...
I have tried adding a DequeueTimeout=600 value to the transport connection, but the same errors still occur. I've also tried adding the following key in the config file, but it still didn't seem to help.
I eventually figured it out; it was just my lack of understanding of RabbitMQ and NServiceBus. I changed the RequestedHeartbeat value for the RabbitMQ connection to something larger, i.e. RequestedHeartbeat=6000. That solved my issue.

Timeout exception calling UserPrincipal.GetGroups from a Windows service

When I run a simple console app that calls UserPrincipal.GetGroups, it enumerates the user's groups with no problems. However, when I run the same code as the same user on the same server but from a Windows service hosting WCF, I get the following chain of errors:
Message : The socket transfer timed out after 00:00:10. You have exceeded the timeout set on your binding. The time allotted to this operation may have been a portion of a longer timeout.
Inner Exception
---------------
Message : The read operation failed, see inner exception.
Inner Exception
---------------
Message : The socket transfer timed out after 00:00:10. You have exceeded the timeout set on your binding. The time allotted to this operation may have been a portion of a longer timeout.
Inner Exception
---------------
Message : A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond
Could this have something to do with WCF thread impersonation? WindowsIdentity.GetCurrent().Name returns the same user in both cases, but Thread.CurrentPrincipal.Identity.Name differs: it is an empty string in the console app and the impersonated WCF user in the Windows service.

Are there other reasons for service broker to be disabled than a RESTORE DATABASE

We have a production database where service broker was disabled.
We have a profiler that logs every backup / restore operation. I cannot find any restore operation in its trace.
Are there any reasons other than a database restore for service broker to be disabled?
Note that this database is mirrored using high-availability and a witness server. In the error log, I can find
2011-07-29 09:00:52.53 spid25s Error: 1479, Severity: 16, State: 2.
2011-07-29 09:00:52.53 spid25s The mirroring connection to "TCP://DB84200:5022" has timed out for database "XXX" after 10 seconds without a response. Check the service and network connections.
2011-07-29 09:00:53.05 spid24s Database mirroring is inactive for database 'XXX'. This is an informational message only. No user action is required.
2011-07-29 09:00:53.72 spid24s Error: 1404, Severity: 16, State: 6.
2011-07-29 09:00:53.72 spid24s The command failed because the database mirror is busy. Reissue the command later.
Can a mirroring failure disable service broker? Or is it the opposite: does mirroring fail because service broker is disabled?
Any suggestion to solve this issue would be greatly appreciated!
Service Broker provides automatic poison message detection. Automatic poison message detection sets the queue status to OFF if a transaction that receives messages from the queue rolls back five times.
Check the SQL Server logs for the rollbacks.
This looks more like a mirroring error though.