WCF Net.Msmq Service occasionally faults - wcf

I have a self-hosted WCF service (it runs inside a Windows service). This service listens for messages on an MSMQ queue. The service is PerCall and transactional, running on Windows Server 2008 R2, .NET 4.0, MSMQ 5.0.
Once every couple of weeks the service stops processing messages. The Windows service remains running, but the WCF ServiceHost itself stops. The ServiceHost faults with the following exception:
    Timestamp: 3/21/2015 5:37:06 PM
    Message: HandlingInstanceID: a26ffd8b-d3b4-4b89-9055-4c376d586268
    An exception of type 'System.ServiceModel.MsmqException' occurred and was caught.
    ---------------------------------------------------------------------------------
    03/21/2015 13:37:06
    Type : System.ServiceModel.MsmqException, System.ServiceModel, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089
    Message : An error occurred while receiving a message from the queue: The transaction's operation sequence is incorrect. (-1072824239, 0xc00e0051). Ensure that MSMQ is installed and running. Make sure the queue is available to receive from.
    Source : System.ServiceModel
    Help link :
    ErrorCode : -1072824239
    Data : System.Collections.ListDictionaryInternal
    TargetSite : Boolean TryReceive(System.TimeSpan, System.ServiceModel.Channels.Message ByRef)
    dynatrace_invocationCount : 0
    Stack Trace :
       at System.ServiceModel.Channels.MsmqInputChannelBase.TryReceive(TimeSpan timeout, Message& message)
       at System.ServiceModel.Dispatcher.InputChannelBinder.TryReceive(TimeSpan timeout, RequestContext& requestContext)
       at System.ServiceModel.Dispatcher.ErrorHandlingReceiver.TryReceive(TimeSpan timeout, RequestContext& requestContext)
Searching for the particular exception ("The transaction's operation sequence is incorrect") doesn't yield a lot of info, and most suggestions for how to remedy a faulted service are to restart the ServiceHost within the Faulted event.
I can do that, but I am hoping that there is a known, fixable cause for this exception and/or a cleaner way to handle it.

We faced this issue in our product and opened a ticket with Microsoft. In the end they admitted it is a bug in the .NET Framework and said it would be fixed soon.
The issue was reported on Windows Server 2008 and 2012, but never on Windows Server 2016 or Windows 10.
So we did two things: we recommended that all customers upgrade to Windows Server 2016, and we added code that handles the Faulted event of the service host and restarts the service. (You can simulate the same error by restarting the MSMQ service while the WCF service host is open.)
The code to restore the service is below.
First, add an event handler for the host's "Faulted" event:
    SH.Faulted += new EventHandler(SH_Faulted);
    // SH is the ServiceHost

Then, inside the event handler:

    private static void SH_Faulted(object sender, EventArgs e)
    {
        if (SH.State != CommunicationState.Opened)
        {
            int intSleep = 15 * 1000;
            // Abort the faulted host; a faulted ServiceHost cannot be reopened
            SH.Abort();
            // Remove the event handler from the old host
            SH.Faulted -= new EventHandler(SH_Faulted);
            // Sleep to make sure that MSMQ has enough time to recover; better to make this optional
            System.Threading.Thread.Sleep(intSleep);
            try
            {
                ReConnectCounter++;
                LogEvent(string.Format("Service '{0}' faulted restarting service count # {1}", serviceName, ReConnectCounter));
                // Restart the service again here
            }
            catch (Exception ex)
            {
                // Failed; log it, and retry if you like
                LogEvent(string.Format("Restarting service '{0}' failed: {1}", serviceName, ex.Message));
            }
        }
    }
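The restart step that the handler above leaves as a comment may need more than one attempt, because MSMQ might not be available again when the first attempt runs. Here is a minimal sketch of a retry loop. `HostRestart.TryRestart` is a hypothetical helper, not part of WCF; the delegate passed to it would create a new ServiceHost, re-attach the Faulted handler and call Open().

```csharp
using System;
using System.Threading;

public static class HostRestart
{
    // Hypothetical helper: runs openHost until it succeeds or maxAttempts is
    // used up, sleeping for 'delay' after each failed attempt except the last.
    // Returns the 1-based attempt that succeeded, or 0 if every attempt failed.
    public static int TryRestart(Action openHost, int maxAttempts, TimeSpan delay)
    {
        for (int attempt = 1; attempt <= maxAttempts; attempt++)
        {
            try
            {
                openHost();
                return attempt;
            }
            catch (Exception)
            {
                // A real service would log the exception here.
                if (attempt < maxAttempts)
                {
                    Thread.Sleep(delay);
                }
            }
        }
        return 0;
    }
}
```

Inside SH_Faulted it could be called as `HostRestart.TryRestart(() => { SH = new ServiceHost(typeof(MyService)); SH.Faulted += SH_Faulted; SH.Open(); }, 5, TimeSpan.FromSeconds(15));`, where `MyService` stands in for your own service type.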
Eventually the error will happen again, but your service will continue working fine until Microsoft solves the issue or you upgrade to Windows Server 2016.
Updated:
After further investigation, and with help from Microsoft, we found the root cause of the issue, which is the ordering of the following timeouts:

    MachineLevelDTCTimeOut(20 minutes) >=
    DefaultTimeOut(15 minutes) >=
    WCF service transactionTimeout >
    receiveTimeout()

Adding the following to the configuration file should fix the issue:

    <system.transactions>
        <defaultSettings timeout="00:05:00"/>
    </system.transactions>
More detailed article:
https://blogs.msdn.microsoft.com/asiatech/2013/02/18/wcfmsmq-intermittent-mq_error_transaction_sequence-error/
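To make the ordering described above concrete, here is a small sketch that checks four timeout values against it. The class and method names are hypothetical, and the values in the usage note are illustrative rather than taken from any particular environment.

```csharp
using System;

public static class TimeoutOrder
{
    // Returns true when
    // machine >= defaultTimeout >= transactionTimeout > receiveTimeout,
    // which is the ordering stated in the answer above.
    public static bool IsValid(
        TimeSpan machine,
        TimeSpan defaultTimeout,
        TimeSpan transactionTimeout,
        TimeSpan receiveTimeout)
    {
        return machine >= defaultTimeout
            && defaultTimeout >= transactionTimeout
            && transactionTimeout > receiveTimeout;
    }
}
```

For example, a machine timeout of 20 minutes, a default timeout of 5 minutes, a transactionTimeout of 1 minute and a receiveTimeout of 55 seconds satisfy the check. Leaving receiveTimeout at 10 minutes (the default for WCF bindings) with the same other values does not.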

We have the same problem in our production environment. Unfortunately, there is an issue opened with Microsoft about it, but it's marked "Closed as Deferred" since 2013. The following workaround is mentioned by EasySR20:
If you set the service's receiveTimeout a few seconds less than the
service's transactionTimeout this will prevent the exception from
happening and taking down the service host. These are both settings
that can be set in the server's app.config file.
I haven't confirmed this resolves the issue, but it's one option.
We have implemented the service fault restart option instead.
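For reference, the quoted workaround could look roughly like this in the service's app.config. This is a sketch only: the binding and behavior names are hypothetical, the timeout values are illustrative, and the service and endpoint elements that would reference these names are omitted.

```xml
<system.serviceModel>
  <bindings>
    <netMsmqBinding>
      <!-- receiveTimeout is a few seconds less than transactionTimeout below -->
      <binding name="txQueueBinding" receiveTimeout="00:00:55" />
    </netMsmqBinding>
  </bindings>
  <behaviors>
    <serviceBehaviors>
      <behavior name="txQueueBehavior">
        <serviceTimeouts transactionTimeout="00:01:00" />
      </behavior>
    </serviceBehaviors>
  </behaviors>
</system.serviceModel>
```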

Related

Topshelf Windows Service times out Error 7000 7009

I have a Windows service written in VB.NET, using Topshelf as the service host.
Once in a while the service doesn't start. In the event log, the SCM writes errors 7000 and 7009 (service did not respond in a timely fashion). I know this is a common issue, but I think I have tried everything, with no result.
The service only relies on WMI, and has no time-consuming operations.
I read this question (Error 1053: the service did not respond to the start or control request in a timely fashion), but none of the answers worked for me.
I tried:
Set Topshelf's start timeout.
Request additional time in the first line of the "OnStart" method.
Set a periodic timer which requests additional time from the SCM.
Remove Topshelf and build the service with the Visual Studio service template.
Move the initialization code and "OnStart" code to a new thread to return immediately.
Build in RELEASE mode.
Set GeneratePublisherEvidence = false in the app.config file (per application).
Unchecked "Check for publisher's certificate revocation" in the internet settings (per machine).
Deleted all alternate data streams (in case some DLL was marked as coming from the web and blocked).
Removed any "debug code".
Increased Windows' general service timeout to 120000 ms.
Also:
The service doesn't try to communicate with the user's desktop in any way.
The UAC is disabled.
The Service Runs on LOCAL SYSTEM ACCOUNT.
I believe that the code of the service itself is not the problem because:
It has been on production for over two years.
Usually the service starts fine.
There is no exception logged in the Event Log.
The "On Error" options for the service don't get called (since the service doesn't actually fail, it just doesn't respond to the SCM).
I've commented out almost everything in it while pursuing this error! ;-)
Any help is welcome, since I'm completely out of ideas and I've been struggling with this for over 15 days...
For me, the 7009 error was produced by my .NET Core app because I was using this construct:

    var builder = new ConfigurationBuilder()
        .SetBasePath(Directory.GetCurrentDirectory())
        .AddJsonFile("appsettings.json");

The appsettings.json file obviously couldn't be found in C:\WINDOWS\system32, which is the current directory when the app runs as a Windows service. Changing it to Path.Combine(AppContext.BaseDirectory, "appsettings.json") solved the issue.
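The fix above can be wrapped in a small helper so every configuration file is resolved against the application's own directory. A minimal sketch, using a hypothetical `ConfigPaths` class and only `AppContext.BaseDirectory` and `Path.Combine` from the base class library:

```csharp
using System;
using System.IO;

public static class ConfigPaths
{
    // Resolves a file name against the directory that contains the
    // application's binaries instead of the process's current directory.
    public static string FromAppDirectory(string fileName)
    {
        return Path.Combine(AppContext.BaseDirectory, fileName);
    }
}
```

The result of `ConfigPaths.FromAppDirectory("appsettings.json")` can then be passed to AddJsonFile. Calling `.SetBasePath(AppContext.BaseDirectory)` in place of `Directory.GetCurrentDirectory()` should have the same effect.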
More general help: with Topshelf you can add custom exception handling, which is where I finally found some meaningful error info, unlike in Event Viewer:

    HostFactory.Run(x => {
        ...
        x.OnException(e =>
        {
            // Write the full exception to a log file
            using (var fs = new StreamWriter(@"C:\log.txt"))
            {
                fs.WriteLine(e.ToString());
            }
        });
    });
I've hit the 7000 and 7009 issue, which fails straight away (even though the error message says A timeout was reached (30000 milliseconds)) because of misconfiguration between TopShelf and what the service gets installed as.
The bottom line - what you pass in HostConfigurator.SetServiceName(name) needs to match exactly the SERVICE_NAME of the Windows service which gets installed.
If they don't match it'll fail straight away and you get the two event log messages.
I had this start happening to a service after Windows Creator's Edition update installed. Basically it made the whole computer slower, which is what I think triggered the problem. Even one of the Windows services had a timeout issue.
What I learned online is that the constructor for the service needs to be fast, but OnStart has more leeway with the SCM. My service had a C# wrapper and it included an InitializeComponent() that was called in the constructor. I moved that call to OnStart and the problem went away.

NServiceBus Subscriber failing on server

I have a rather simple Pub/Sub setup which works fine on our developer machines, but when I deploy to our test servers it throws this error for all messages:

    System.NullReferenceException: Object reference not set to an instance of an object.
       at NServiceBus.Unicast.UnicastBus.HandleTransportMessage(IBuilder childBuilder, TransportMessage msg) in c:\BuildAgent\work\nsb.master_6\src\unicast\NServiceBus.Unicast\UnicastBus.cs:line 1328
       at NServiceBus.Unicast.UnicastBus.TransportMessageReceived(Object sender, TransportMessageReceivedEventArgs e) in c:\BuildAgent\work\nsb.master_6\src\unicast\NServiceBus.Unicast\UnicastBus.cs:line 1247
       at System.EventHandler`1.Invoke(Object sender, TEventArgs e)
       at NServiceBus.Unicast.Transport.Transactional.TransactionalTransport.OnTransportMessageReceived(TransportMessage msg) in c:\BuildAgent\work\nsb.master_6\src\impl\unicast\transport\NServiceBus.Unicast.Transport.Transactional\TransactionalTransport.cs:line 480

We already have other SendOnly endpoints, distributors and workers running on the same servers, so MSMQ etc. should be installed correctly. This is, however, the first time we are using Pub/Sub on these servers.
If I use the exact same binaries and config on a developer machine it runs smoothly, but not on the servers, which run Windows Server 2008 R2 with PowerShell v3.
We are using a fluent configuration for the subscriber:
    return NServiceBus.Configure.With()
        .DefineEndpointName(queuePrefix)
        .Log4Net(_serviceBusLog.Build())
        .StructureMapBuilder()
        .JsonSerializer()
        .License(ConfigTable.GetConfigString(ConfigTableKeys.NServiceBus, "License"))
        .MsmqTransport()
        .IsTransactional(true)
        .RunTimeoutManagerWithInMemoryPersistence()
        .EnablePerformanceCounters()
        .UnicastBus()
        .CreateBus()
        .Start(() => NServiceBus.Configure.Instance.ForInstallationOn<NServiceBus.Installation.Environments.Windows>().Install());
We also have our own UnicastBus config which scans for message handlers (they're message types) and then automatically creates the endpoint mappings. This was my first concern so I disabled it and used the app.config way of setting up endpoints, but the error still occurs.
Note that the error occurs for every single message.
Note that we are running version 3.3.5 of NSB.
I'm still going through the server settings, as I believe there must be some difference that causes this, but I have not found it yet.
Does anyone have any recommendations as to what to look for?
Kind regards
It appears that I have found the error.
After testing a bare-bones console Pub/Sub on the server, I added a try/catch in the handler and caught... my own exception.
I'm embarrassed.
But it appears that the exception is not forwarded correctly to the log in NSB, and I was therefore completely thrown off from the real problem.
I do not know if this is something that is fixed in later versions of NSB, but I hope so.
Until then I'm using my own try/catch logic to add a custom log entry.
Kind regards.
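The try/catch logic mentioned above could be factored into a small wrapper so every handler logs its own exceptions before they reach the bus. A minimal sketch, assuming a hypothetical `HandlerLogging` class; it is plain C# and does not use any NServiceBus API:

```csharp
using System;

public static class HandlerLogging
{
    // Runs 'handle', passes any exception to 'log', then rethrows so the
    // caller (for example the bus) still sees the failure.
    public static void Run(Action handle, Action<Exception> log)
    {
        try
        {
            handle();
        }
        catch (Exception ex)
        {
            log(ex);
            throw;
        }
    }
}
```

A message handler would wrap its body in `HandlerLogging.Run(() => { /* handler logic */ }, ex => logger.Error(ex))`, where `logger` is whatever logging object the endpoint already uses.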

Exposing wcf service in BizTalk 2010

I ran into this issue a few weeks ago and I still cannot figure out why it happens.
I have a BizTalk orchestration that receives a BizTalk message, processes it and eventually writes it to a file. For simplicity's sake, let's say our goal is to take the message (which is actually a survey) and save it to a folder as is. I have created a public receive port to catch the message and a send port to write the result out to disk.
I have successfully deployed the project and have successfully published it to IIS. I was able to see the WCF service in IIS and have set it to use Framework 4.0. I have set up all the relevant receive and send ports and have started the application successfully from BizTalk management console. (The receive location was automatically created when I published the wcf service, the send location is a type FILE located in a folder in the computer)
From DefaultSite in IIS, I could see my WCF. However, when I tried http://localhost/TestSurvey/TestSurvey_Orchestration_1_getSurveyPort.svc,
I got the following error:
Server Error in '/TestSurvey' Application.
--------------------------------------------------------------------------------
Login failed for user 'Domain\ComputerName$'.
Description: An unhandled exception occurred during the execution of the current web request. Please review the stack trace for more information about the error and where it originated in the code.
Exception Details: System.Data.SqlClient.SqlException: Login failed for user 'ACAD1\DENBIZDEV$'.
Source Error:
An unhandled exception was generated during the execution of the current web request. Information regarding the origin and location of the exception can be identified using the exception stack trace below.
Stack Trace:
[SqlException (0x80131904): Login failed for user 'Domain\ComputeName$'.]
[TargetInvocationException: Exception has been thrown by the target of an invocation.]
Microsoft.BizTalk.TransportProxy.Interop.IBTTransportProxy.RegisterIsolatedReceiver(String url, IBTTransportConfig callback) +0
Microsoft.BizTalk.Adapter.Wcf.Runtime.WcfIsolatedReceiver`2.RegisterIsolatedReceiver(Uri uri) +1028
Microsoft.BizTalk.Adapter.Wcf.Runtime.WebServiceHostFactory`3.CreateServiceHost(String constructorString, Uri[] baseAddresses) +363
System.ServiceModel.HostingManager.CreateService(String normalizedVirtualPath) +1413
System.ServiceModel.HostingManager.ActivateService(String normalizedVirtualPath) +50
System.ServiceModel.HostingManager.EnsureServiceAvailable(String normalizedVirtualPath) +1172
[ServiceActivationException: The service '/TestSurvey/TestSurvey_Orchestration_1_getSurveyPort.svc' cannot be activated due to an exception during compilation. The exception message is: Exception has been thrown by the target of an invocation..]
System.Runtime.AsyncResult.End(IAsyncResult result) +901424
System.ServiceModel.Activation.HostedHttpRequestAsyncResult.End(IAsyncResult result) +178638
System.Web.AsyncEventExecutionStep.OnAsyncEventCompletion(IAsyncResult ar) +107
--------------------------------------------------------------------------------
Version Information: Microsoft .NET Framework Version:4.0.30319; ASP.NET Version:4.0.30319.272
My question is: what was BizTalk trying to log in to? Domain\ComputerName$ is definitely not a user. I have not tried to access any database from the orchestration. I believe my IIS was set up correctly, as I created a quick and dirty WCF test service in VS2010, published it to my IIS, and then successfully created a simple client to consume it. I did not encounter the same issue with that test service.
Any help or hint is highly appreciated!
EDIT
I managed to fix this: it was not the web.config, as that was created by BizTalk. I had to create a dedicated app pool and tie my application to it. In the app pool I had to use the custom credentials that BizTalk is using. Using the built-in credentials caused the issue I experienced. My rookie error! Once the BizTalk login credentials were used, everything went smoothly.
The database which is throwing the SQL exception is one of the BizTalk databases, probably the management or SSO DB.
The error is almost certainly being caused by the setup of IIS, whether the app pool identity or the web site security settings.
What is happening is that when you call the service for the first time, IIS compiles your service, and to do this it needs access to one of the BizTalk databases, but this call is not authenticated for some reason.
This could be because the app pool user must be a member of the BizTalk Isolated Host Users group (which may be called something different depending on how BizTalk was set up).

HELP! - DefaultServiceHostFactory executing before application_startup and container creation

I am using the WCF facility for a service hosted in WAS (net.tcp binding in iis7) and experiencing a weird issue only upon a cold application startup (i.e. not already running).
The following statement should be executed upon first instantiation of my container.
DefaultServiceHostFactory.RegisterContainer(c.Kernel);
When the service is requested, I get the following exception in my WCF tracefile
Kernel was null, did you forgot to call DefaultServiceHostFactory.RegisterContainer()
The issue appears to be that the ServiceHostFactory is attempting to create an instance of the service's host before my container has been created.
Note:
This exception is happening BEFORE the Application_Start is executed
If the application is running (and the container has been initialised) then the service will operate as expected. The application can be started by going to the appropriate IIS site over HTTP or starting a debugging session from Visual Studio.
Steps to recreate issue
Issue a IISReset to shutdown all IIS app pools.
Call the service in question
WCF tracing spits out:
System.ServiceModel.ServiceActivationException: The service '/abcd.svc' cannot be activated due to an exception during compilation. The exception message is: Exception has been thrown by the target of an invocation.. ---> System.Reflection.TargetInvocationException: Exception has been thrown by the target of an invocation. ---> System.ArgumentNullException: Kernel was null, did you forgot to call DefaultServiceHostFactory.RegisterContainer() ?
Parameter name: kernel
at Castle.Facilities.WcfIntegration.WindsorServiceHostFactory`1..ctor(IKernel kernel)
at Castle.Facilities.WcfIntegration.DefaultServiceHostFactory..ctor()
The problem is that global.asax and all its methods relate only to HTTP processing. The class in global.asax is derived from HttpApplication, which should make this pretty clear. Once you host the application in WAS with a non-HTTP binding (which is the case for net.tcp), you can't rely on these methods. Try something like AppInitialize instead.

Exception in creating a WCF Service using MsmqIntegrationBinding

My machine runs Windows 7 Ultimate (64-bit). I have installed MSMQ and checked that it is working fine (I ran some sample code for MSMQ).
When I try to create a WCF service using the MsmqIntegrationBinding class, I get the exception below:
"An error occurred while opening the queue:The queue does not exist or you do not have sufficient permissions to perform the operation. (-1072824317, 0xc00e0003). The message cannot be sent or received from the queue. Ensure that MSMQ is installed and running. Also ensure that the queue is available to open with the required access mode and authorization."
I am running Visual Studio as Administrator and have explicitly granted permission to myself via a URL ACL using:
netsh http add urlacl url=http://+:80/ user=DOMAIN\user
Below is the code:
    public static void Main()
    {
        // Verbatim string so the backslashes in the queue path are not treated as escapes
        Uri baseAddress = new Uri(@"msmq.formatname:DIRECT=OS:AJITDELL2\private$\Orders");
        using (ServiceHost serviceHost = new ServiceHost(typeof(OrderProcessorService), baseAddress))
        {
            MsmqIntegrationBinding serviceBinding = new MsmqIntegrationBinding();
            serviceBinding.Security.Transport.MsmqAuthenticationMode = MsmqAuthenticationMode.None;
            serviceBinding.Security.Transport.MsmqProtectionLevel = System.Net.Security.ProtectionLevel.None;
            //serviceBinding.SerializationFormat = MsmqMessageSerializationFormat.Binary;
            serviceHost.AddServiceEndpoint(typeof(IOrderProcessor), serviceBinding, baseAddress);
            serviceHost.Open();
            // The service can now be accessed.
            Console.WriteLine("The service is ready.");
            Console.WriteLine("The service is running in the following account: {0}", WindowsIdentity.GetCurrent().Name);
            Console.WriteLine("Press <ENTER> to terminate service.");
            Console.WriteLine();
            Console.ReadLine();
            // Close the ServiceHostBase to shut down the service.
            serviceHost.Close();
        }
    }
Can you please help?
Make sure you have created the "Orders" queue in MSMQ.
In Windows Server 2008, you can do so from the Server Manager (right click on My Computer and select Manage), then Features -> Message Queuing -> Private Queues. Right click on Private Queues and add your "Orders" queue there.
You may also want to check Nicholas Allen's article: Diagnosing Common Queue Errors. It suggests that your error can only be: "that the queue does not exist, or perhaps you've specified the queue name incorrectly". All the other error cases would have thrown a different exception.