.NET Hosted workflow clustering problem - serialization

I am attempting to reproduce a server farm in my test environment, in order to try something out for an application that I am busy designing.
I have two web servers, running IIS 6.0 and 7.0 respectively, each hosting a workflow service with exactly the same dlls. They share a persistence database.
When ServerA saves the workflow to the persistence store, on a subsequent request ServerB is happy to load the workflow instance and do some work on it. Once ServerB has saved the workflow, ServerA gets a serialization exception when attempting to perform a further call on the workflow. I get the same behaviour if I use a different server as ServerB.
And I can fix the problem by using two different servers and leaving ServerA out of the equation.
My question though is: How can I debug exactly why ServerA will not load workflows saved by other machines?
Update - I did try with two IIS 6.0 servers, same OS and same strongly named assemblies, and had the exact same issue.

OK, I figured it out.
ServerA had a hotfix installed for .NET, which meant that the actual binary signature of one of the classes being serialized was different.
By pure chance it can be deserialised one way and not the other.
I loaded the hotfix on all of the servers, and now the serialization works correctly.
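For anyone debugging the same thing, one way to spot this kind of mismatch is to dump the loaded assemblies with their file versions on each server and diff the output. This is a minimal sketch, not what was actually run here: the class and method names are made up, and it assumes the hotfix changed the file version of an assembly while leaving its assembly version alone, which is typical but not guaranteed.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Reflection;

// Hypothetical helper: run it inside the workflow host on each server and diff the results.
static class AssemblyReport
{
    public static List<string> Describe()
    {
        var lines = new List<string>();
        foreach (Assembly asm in AppDomain.CurrentDomain.GetAssemblies())
        {
            string fileVersion;
            try
            {
                // Location throws for dynamic assemblies and is empty for in-memory ones.
                fileVersion = FileVersionInfo.GetVersionInfo(asm.Location).FileVersion;
            }
            catch (Exception ex)
            {
                fileVersion = "(unavailable: " + ex.GetType().Name + ")";
            }
            lines.Add(asm.FullName + " | file version " + fileVersion);
        }
        lines.Sort(StringComparer.Ordinal); // stable order makes the two outputs easy to diff
        return lines;
    }

    static void Main()
    {
        foreach (string line in Describe())
            Console.WriteLine(line);
    }
}
```

A line that differs between ServerA and the other machines points at the assembly to look into.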

Related

ASP .NET Core Application Process Isolation for IIS hosted Kestrel Services

I'm migrating a service-based integration platform from .Net Framework to .Net Core. The original versions of the integration platform have proven very successful, and compared to replacing it with an 'off the shelf' integration solution, it has a far better ROI.
So after redeveloping the code, all tests have been working very well, and I have achieved higher levels of performance with a single IIS server than I could with 2 IIS servers running the original versions.
Except... If I go over ~3 messages/sec with multiple clients, I start seeing duplicate GUID key errors when trying to save instrumentation data to my DB. All these errors are generated from the on-ramp service. The on-ramp places the message on a queue. The messages are then consumed by an off-ramp service and sent to the destination (for this load test the destination is a file folder).
Even though the off-ramp is also running on the same server as the on-ramp, we do not see any duplication errors generated by the off-ramp. I suspect this is due to the queue creating a linear process, so only one instance of the off-ramp is running at any time, whereas the on-ramp has up to 4 clients firing concurrent messages at its API.
Initially I thought the issue was caused by a static global variable class I had implemented, crossing process boundaries. But I would expect that the issue would be seen with the off-ramp as well, as the service architecture for both are virtually identical.
Summary of thoughts on issue:
If it is a pure coding issue, then errors would happen at low messaging rates.
The error would also be seen on the off-ramp if the GUID duplication was down to chance.
The on- and off-ramps are both running on the same server, but duplication is only seen on the on-ramp, i.e. the on-ramp is not impacting the off-ramp and vice versa.
Duplication has to be due to shared memory between concurrently running on-ramp instances, caused by the multiple-client scenario.
To try and resolve the issue I removed the static global variable class but I'm still seeing the duplication errors.
This issue was never observed in the original IIS implementation (after millions of messages processed). I suspect the issue is with process isolation in the IIS hosted Kestrel .Net Core service host. From what I have read there is good isolation between different apps (based on IIS path) but not within the same app, so basically not within the same IIS app pool. This could explain why .Net Core does not support multiple apps running in the same IIS app pool.
If anyone has a good idea how I can achieve process isolation between instances of the same app running in the same IIS app pool, I would appreciate your thoughts/suggestions.
After running more tests I was able to resolve the issue. The problem was with the scope of the instrumentation variable. At low rates there was never a problem, but at high throughput, the same instrumentation object was being accessed by a second instance of the process.
The issue was difficult to track down due to the short lived nature of the integration services.
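For anyone hitting something similar, here is a minimal sketch of how a too-widely scoped instrumentation object produces duplicate keys only when requests overlap. The post does not show the real code, so the type and method names are hypothetical and the overlap is simulated by call order rather than by real threads.

```csharp
using System;

// Hypothetical instrumentation record; the real type is not shown in the post.
class InstrumentationRecord
{
    public readonly Guid Id = Guid.NewGuid();
}

static class ScopeDemo
{
    // Too-wide scope: one slot shared by every request in the process.
    static InstrumentationRecord current;

    static void BeginRequest() { current = new InstrumentationRecord(); }
    static Guid SaveRequest() { return current.Id; }

    // Two overlapping requests: both begin before either one saves.
    public static bool SharedScopeDuplicates()
    {
        BeginRequest();                // request A
        BeginRequest();                // request B overwrites A's record
        Guid savedByA = SaveRequest();
        Guid savedByB = SaveRequest();
        return savedByA == savedByB;   // true: the second insert hits a duplicate key
    }

    // Narrow scope: each request owns its record as a local variable.
    public static bool LocalScopeDuplicates()
    {
        var a = new InstrumentationRecord();
        var b = new InstrumentationRecord();
        return a.Id == b.Id;           // false: every request saves its own Id
    }

    static void Main()
    {
        Console.WriteLine(SharedScopeDuplicates());
        Console.WriteLine(LocalScopeDuplicates());
    }
}
```

At low rates each request finishes before the next one begins, so the shared slot never gets overwritten mid-request, which matches the errors only showing up under load.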
Thanks to anyone who reviewed the question.
Martin

Behavior of WL.server.createEventSource on a Worklight Cluster Environment

Let's assume I have a cluster of 2 worklight servers sharing the same WL runtime.
On that runtime, I've installed an application with an adapter that has a createEventSource function.
Just like this IBM article.
https://www.ibm.com/developerworks/community/blogs/worklight/entry/configuring_a_polling_event_source_to_send_push_notifications?lang=en
My question is, what will happen in a cluster environment?
Will repeated work ensue?
In other words, would both of my WL Servers be polling for events?
Or perhaps that functionality is writing a task on the WL DB that the WL Servers poll regularly to check for work if no instance is taking care of it, so that only a server at a time would be "the event source"?
I'm working with IBM Worklight 6.2 and Websphere Liberty Profile 8.5.5
Thanks in advance!
Here's my attempt to answer this after some consultation:
My question is, what will happen in a cluster environment? Will
repeated work ensue? In other words, would both of my WL Servers be
polling for events?
While the Worklight Servers share the same runtime, they are still considered as 2 instances. This means that each of them will attempt to perform a polling action. This is considered OK.
However, it is important to note that the backend system that is being polled should likely be smart enough to handle such a situation where 2 polling attempts are done for the same message.
If the backend doesn't know how to handle polling properly, the same message can be pulled more than once. This is true even if you have a single event source running. So this is something to keep in mind.

Sharp Architecture WCF

I have an unusual problem that I can't figure out.
I have a Sharp Architecture project that I am developing, using WCF services which I host in IIS with ASP.NET.
When the services were hosted on my machine everything worked fine. Now I have hosted the services on a different server and am running the client from my machine. Once I did that, the SaveOrUpdate() methods seem to have stopped working. No errors are thrown and the operation reports success, but the data is not persisted to the db. What I can't figure out is why this worked when the services were hosted locally and does not work now that they are hosted somewhere else.
I think I ran into a similar issue before; basically the transaction is never committed. Try manually beginning and committing your transactions.
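Something along these lines, as a rough sketch only. It assumes you can get at the underlying NHibernate ISession (Sharp Architecture sits on NHibernate); the helper class and method names are made up, and it needs the NHibernate assembly referenced.

```csharp
using NHibernate;

// Hypothetical helper; session is an open NHibernate ISession, entity is any mapped object.
static class TransactionalSave
{
    public static void Save(ISession session, object entity)
    {
        using (ITransaction tx = session.BeginTransaction())
        {
            session.SaveOrUpdate(entity);
            // Without an explicit Commit, disposing the transaction rolls the work back.
            tx.Commit();
        }
    }
}
```

If this makes the data show up, the transaction handling that worked on your machine is probably not wired up in the hosted service's configuration.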

Strange WCF Error - IIS hosted - context being aborted

I have a WCF service that does some document conversions and returns the document to the caller. When developing on my local dev server, the service is hosted on the ASP.NET Development Server; a console application invokes the operation and it executes within seconds.
When I host the service in IIS via a .svc file, two of the documents work correctly but the third one bombs out. It begins to construct the Word document using the OpenXml Sdk, but then just dies. I think this has something to do with IIS, but I cannot put my finger on it.
There are a total of three types of documents I generate. In a nutshell this is how it works
SQL 2005 DB/IBM DB2 -> WCF Service written by other developer to expose data. This service only has one endpoint using basicHttpBinding
My Service invokes his service, gets the relevant data, uses the Open Xml Sdk to generate a Microsoft Word Document, saves it on a server and returns the path to the user.
The word documents are no bigger than 100KB.
I am also using basicHttpBinding although I have tried wsHttpBinding with the same results.
What is amazing is how fast it is locally, and even more so that two of the documents generate just fine; it's the third document type that refuses to work.
To the error message:
An error occured while receiving the HTTP Response to http://myservername.mydomain.inc/MyService/Service.Svc. This could be due to the service endpoint binding not using the HTTP Protocol. This could also be due to an HTTP request context being aborted by the server (possibly due to the server shutting down). See server logs for more details.
I have spent the last 2 days trying to figure out what is going on. I have tried everything, including changing maxReceivedMessageSize, maxBufferSize, maxBufferPoolSize, etc. to large values. I even included:
<httpRuntime maxRequestLength="2097151" executionTimeout="120"/>
To see maybe if IIS was choking because of that.
Programmatically the service does nothing special; it just constructs the Word documents from the data using the Open Xml Sdk and, like I said, locally all 3 documents work when invoked via a console app against the ASP.NET dev server, i.e. http://localhost:3332/myService.svc
When I host it on IIS and I try to get a Windows Forms application to invoke it, I get the error.
I know you will ask for logs, so yes I have logging enabled on my Host.
And there is no error in the logs, I am logging everything.
Basically I invoke two service operations written by another developer.
MyOperation calls -> HisOperation1 and then HisOperation2; both of those calls give me complex types. I am going to look at his code tomorrow, because he is using LINQ2SQL and there may be some funny business going on there. He is using a variety of collections etc., but the fact that I can run the exact same document, let's call it "Document 3", within seconds when the service is hosted locally on the ASP WebDev Server is what is most odd. Why would it run on scaled-down Cassini and blow up on IIS?
From the log it seems that after calling HisOperation1 and HisOperation2 the service just goes into la-la land and dies. There is an application pool (w3wp.exe) error in the Windows Event Log.
Faulting application w3wp.exe, version 6.0.3790.1830, stamp 42435be1, faulting module kernel32.dll, version 5.2.3790.3311, stamp 49c5225e, debug? 0, fault address 0x00015dfa.
It's classified as .NET 2.0 Runtime error.
Any help is appreciated, the lack of sleep is getting to me.
Help me Obi-Wan Kenobi, you're my only hope.
I had this message appearing:
An error occured while receiving the HTTP Response to http://myservername.mydomain.inc/MyService/Service.Svc. This could be due to the service endpoint binding not using the HTTP Protocol. This could also be due to an HTTP request context being aborted by the server (possibly due to the server shutting down). See server logs for more details.
And the problem was that the object I was trying to transfer was not [Serializable]. The object I was trying to transfer was a DataTable.
I believe the Word documents you were trying to transfer are also non-serializable, so that might be the problem.
Yes, we'd want logs, or at least some idea of what you're logging. I assume you have both message and transport logging on at the WCF level.
One thing to look at is permissions. When you run under Cassini the web server is running as the currently logged in user. This hides any SQL or CAS permission problems (as, let's be honest, your account is usually a local administrator). As soon as you publish to IIS you are running under the application pool user, which is, by default, a lot more limited.
Try turning on IIS debug dumps and following the steps in KB919789.
Fyi, I changed IIS 6 to work in IIS 5.0 Isolation mode and everything works. Odd.
I had the same error when using an IEnumerable<T> DataMember in my WCF service. Turned out that in some cases I was returning an IQueryable<T> as an IEnumerable<T>, so all I had to do was add .ToList<T>() to my LINQ statements.
I changed the IEnumerable<T> to IList<T> to prevent making the same mistake again.
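To illustrate the difference, here is a small self-contained sketch using an in-memory list behind AsQueryable() in place of a real LINQ to SQL table (that substitution is an assumption made for the demo). The point is that the un-materialized value is still a live query object, whereas ToList() hands the serializer a plain list.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class MaterializeDemo
{
    // Stand-in for a table; a real service would query a DataContext here.
    static readonly List<int> Source = new List<int> { 1, 2, 3 };

    // Returns the query itself, typed as IEnumerable<int>.
    public static IEnumerable<int> Deferred()
    {
        return Source.AsQueryable().Where(n => n > 1);
    }

    // Runs the query now and returns the results as a list.
    public static IList<int> Materialized()
    {
        return Source.AsQueryable().Where(n => n > 1).ToList();
    }

    static void Main()
    {
        IEnumerable<int> deferred = Deferred();
        IList<int> materialized = Materialized();

        Console.WriteLine(deferred is IQueryable<int>);     // True: still a query
        Console.WriteLine(materialized is IQueryable<int>); // False: plain List<int>
    }
}
```

Declaring the member as IList<T> means the compiler rejects a bare query, which is why the change stops the mistake from creeping back in.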

Why Would You Host a wcf service in a Windows service?

What would be reasons that you want to host a wcf service in a windows service and not in IIS?
One reason is that IIS6 only supports bindings based on HTTP. If you want to use TCP, MSMQ, etc., then you need to host in a separate program.
When hosting in IIS you are only allowed to bind to a single port per base address in each web site (meaning you can't specify two bindings, or endpoints, that use different ports).
You can only use a single base address in IIS; the only way around this is deploying multiple versions of the same project in different websites (yuck).
The IIS process must recycle eventually, and when it does it dumps everything and restarts. That is good a lot of the time, since memory is freed and trapped resources are released, but when using singletons this can have an undesired effect depending on your code.
[edit] : more points
In a standard setup your worker process always has 2GB of virtual memory available (no matter whether you have 1, 2 or 4GB of physical memory in the machine).
Freedom. You as the developer don't need someone to administer the box
Sometimes IIS6 is really just overkill
You are using it as interprocess communication conduit
You wish to declare all of the bindings in code. This is far less confusing and more powerful than the xml config files that seem to be all the rage. I can't envision many scenarios where I would want a non-programmer messing with bindings. The xml approach is fine for prototyping and systems that need to be highly dynamic, but overall I don't think it's a good idea.
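As a rough sketch of what declaring the bindings in code looks like when self-hosting over TCP: the contract, service class, address and port below are all made up for illustration, and the code needs a reference to System.ServiceModel.

```csharp
using System;
using System.ServiceModel;

// Hypothetical contract and implementation, only here to make the sketch complete.
[ServiceContract]
public interface IEcho
{
    [OperationContract]
    string Echo(string text);
}

public class EchoService : IEcho
{
    public string Echo(string text) { return text; }
}

public static class CodeConfiguredHost
{
    // Builds the host with its endpoint declared entirely in code, no xml config.
    public static ServiceHost Build()
    {
        var host = new ServiceHost(typeof(EchoService),
                                   new Uri("net.tcp://localhost:9000/echo")); // assumed address
        host.AddServiceEndpoint(typeof(IEcho), new NetTcpBinding(), "");
        return host;
    }

    static void Main()
    {
        ServiceHost host = Build();
        host.Open();            // in a Windows service this would go in OnStart
        Console.WriteLine("Listening. Press Enter to stop.");
        Console.ReadLine();
        host.Close();           // and this in OnStop
    }
}
```

The same Build() call works from a console app or a Windows service, which is part of the appeal of keeping the bindings out of config.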