Paging in Mule ESB - while-loop

We have a Mule flow that processes a batch of records. We want to implement paging because one of the steps calls an external system that can only accept a limited number of records at a time.
We tried to solve this by adding a choice to the flow that checks whether there are more records to process and, if so, calls the same flow again (a self-referencing flow), but this caused stack overflow errors.
We have also tried using the until-successful scope but we need errors to break out of the loop and be caught by the exception strategy.
Thanks

Mule can process messages in batches; see:
http://www.mulesoft.org/documentation/display/current/Batch+Processing
It is the best option for your requirement.
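
Independently of the batch module, the underlying idea from the question (walk over fixed-size pages in a loop instead of letting the flow call itself) can be sketched in plain Java, for example inside a custom Java component. This is only a minimal sketch under assumptions: PagingSketch, processInPages and the sender callback are hypothetical names, not part of Mule, and the sender stands in for the call to the external system.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical sketch: none of these names come from Mule.
class PagingSketch {

    // Splits the records into pages of at most pageSize and hands each page
    // to the sender in a plain loop, so the call stack does not grow per page.
    // Any exception thrown by the sender propagates and ends the loop.
    static <T> int processInPages(List<T> records, int pageSize, Consumer<List<T>> sender) {
        if (pageSize <= 0) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        int pagesSent = 0;
        for (int start = 0; start < records.size(); start += pageSize) {
            int end = Math.min(start + pageSize, records.size());
            // Copy the sub-list so the sender receives an independent page.
            sender.accept(new ArrayList<>(records.subList(start, end)));
            pagesSent++;
        }
        return pagesSent;
    }

    public static void main(String[] args) {
        List<Integer> records = new ArrayList<>();
        for (int i = 0; i < 25; i++) {
            records.add(i);
        }
        // The lambda stands in for the call to the external system.
        int pages = processInPages(records, 10,
                page -> System.out.println("Sending " + page.size() + " records"));
        System.out.println("Pages sent: " + pages);
    }
}
```

Because the loop is iterative, an exception from the sender simply unwinds out of processInPages, which is the behaviour the question asks for (errors break out of the loop and reach the exception strategy).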

Related

Mule outbound endpoint level statistics to facilitate integration testing

I am looking for a pragmatic solution to do Integration testing of our Integration tier based on Mule.
This article has some excellent pointers about it, but looks a tad outdated. I am reproducing an excellent idea from the article here:
Keeping track of the delivery of messages to external systems. Interrogating all the systems that have been contacted with test messages after the suite has run to ensure they all received what was expected would be too tedious to realize. How to keep track of these test messages? One option could be to run Mule ESB with its logging level set to DEBUG and analyze the message paths by tracking them with their correlation IDs. This is very possible. I decided to follow a simpler and coarser approach, which would give me enough certitude about what happened to the different messages I have sent. For this, I decided to leverage component routing statistics to ensure that the expected number of messages were routed to the expected endpoints (including error messages to error processing components). Of course, if two messages get cross-sent to wrong destinations, the count will not notice that. But this error would be caught anyway because each destination will complain about the error, hence raising the count of error messages processed.
Using this technique, I would not have to stand up all the external systems when I test my integration tier, and could test the integration tier in isolation, which would be great.
@David Dassot has provided a reference implementation as well; however, I think it was based on Mule 2.x, and I cannot find the classes in the Mule 3.x codebase.
Looking around, I did find FlowConstructStatistics, but these are flow-specific statistics and I am looking for endpoint-specific statistics.
I do agree that as a workaround we could wrap all outbound endpoints in sub-flows and get this working, but I would like to avoid doing this ...
Any techniques that help query an endpoint for the number of calls made and the payloads passed through it would be great!
First, take a look at JMX; perhaps what you need is already available there.
Otherwise, if logging is not enough and upgrading to the Enterprise edition is not an option for you, give endpoint-level notifications a try.
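
Whichever source the events come from, the counting side of this idea can be sketched in a few lines of plain Java. This is a hedged illustration only: EndpointCounterSketch, recordDispatch and countFor are hypothetical names, and the wiring to Mule (for example, a notification listener calling recordDispatch once per dispatched message) is deliberately not shown because it depends on the Mule version.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: a thread-safe per-endpoint message counter.
class EndpointCounterSketch {

    private final Map<String, AtomicLong> counts = new ConcurrentHashMap<>();

    // Meant to be called once for every message sent to an endpoint,
    // e.g. from whatever listener receives the endpoint notifications.
    void recordDispatch(String endpointUri) {
        counts.computeIfAbsent(endpointUri, key -> new AtomicLong()).incrementAndGet();
    }

    // Returns how many dispatches were recorded for the endpoint (0 if none).
    long countFor(String endpointUri) {
        AtomicLong count = counts.get(endpointUri);
        return count == null ? 0L : count.get();
    }

    public static void main(String[] args) {
        EndpointCounterSketch counter = new EndpointCounterSketch();
        // The URIs below are made-up examples.
        counter.recordDispatch("jms://orders");
        counter.recordDispatch("jms://orders");
        counter.recordDispatch("jms://errors");
        System.out.println("orders: " + counter.countFor("jms://orders"));
        System.out.println("errors: " + counter.countFor("jms://errors"));
    }
}
```

An integration test could then assert on countFor for each expected endpoint after the suite has run, which matches the coarse "count per destination" check described in the quoted article.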

Will a Mule Flow stop processing a message after some arbitrary default number of exceptions thrown?

I have a flow that inserts objects in a Mongo database using the Mongo connector, which uses the MongoClientImpl provided by the connector. This client has a line that tries to cast the _id to an ObjectId prior to returning that value as a string to the user after the insert has been submitted to the database. As far as I can tell, this line does not impact whether the object is inserted, but it does throw an exception when trying to cast a string to an ObjectId.
My flow is throwing hundreds of ClassCastExceptions. It also does not appear to be processing nearly as many inserts as I would expect. I expect to see tens of thousands, but instead the flow is only inserting 136 documents.
Is there a limit to the number of exceptions that can be thrown and captured by Mule's DefaultMessagingExceptionStrategy before the flow will stop processing a given message?
The answer is no, but you can implement some custom logic to stop the flow. You would probably need to start it again manually afterwards. This is how I would do it:
Use the object store module to count the occurrences of a given exception.
Once a defined number of occurrences is reached, stop the flow programmatically. One option would be something like #[groovy:<flowName>.stop()], given that Groovy has direct access to the registry.
Send an email notification so you are aware of the errors (especially useful in production).
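
The counting logic behind these steps can be sketched in plain Java. This is a minimal illustration under assumptions, not the object store module itself: ErrorThresholdSketch and its methods are hypothetical names, the counter lives in memory rather than in an object store, and the stop action stands in for whatever actually stops the flow and sends the notification.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: run a stop action once a failure threshold is reached.
class ErrorThresholdSketch {

    private final int threshold;
    private final Runnable stopAction;
    private final AtomicInteger failures = new AtomicInteger();

    ErrorThresholdSketch(int threshold, Runnable stopAction) {
        if (threshold <= 0) {
            throw new IllegalArgumentException("threshold must be positive");
        }
        this.threshold = threshold;
        this.stopAction = stopAction;
    }

    // Records one failure. Returns true only for the call that reaches the
    // threshold, so the stop action runs exactly once even under concurrency.
    boolean recordFailure() {
        if (failures.incrementAndGet() == threshold) {
            stopAction.run();
            return true;
        }
        return false;
    }

    int failureCount() {
        return failures.get();
    }

    public static void main(String[] args) {
        ErrorThresholdSketch guard = new ErrorThresholdSketch(3,
                () -> System.out.println("Threshold reached: stop the flow and notify"));
        for (int i = 0; i < 5; i++) {
            guard.recordFailure();
        }
        System.out.println("Failures recorded: " + guard.failureCount());
    }
}
```

Comparing with == rather than >= is a deliberate choice here: failures after the threshold are still counted, but the stop action is not triggered again.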

How to figure out if mule flow message processing is in progress

I have a requirement where I need to make sure that only one message is being processed at a time by a Mule flow. The flow is triggered by a Quartz scheduler, which reads one file from an FTP server each time it fires.
My proposed solution is to keep a global variable "FLOW_STATUS", which is set to "RUNNING" when a message is received and reset to "STOPPED" once processing of the message is done.
Any message fed to the flow checks this variable and aborts if "FLOW_STATUS" is "RUNNING".
This setup seems to be working, but I was wondering if there is a better way to do it.
Are there any best practices around this, or any built-in Mule helper functions that achieve the same thing without relying on global variables?
It seems like a simpler solution would be to set maxActiveThreads for the flow to 1. In Mule, each message processed gets its own thread, so setting maxActiveThreads to 1 would effectively make your flow single-threaded. Other pending requests will wait in the receiver threads. You will need to make sure your receiver thread pool is large enough to accommodate all of the potential waiting threads. That may mean throttling back your Quartz scheduler to allow time to process the files so the receiver thread pool doesn't fill up. For more information on the thread pools and how to tune performance, here is a good link: http://www.mulesoft.org/documentation/display/current/Tuning+Performance
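
As an analogy only (this is plain Java, not Mule configuration), the effect of a single active thread can be sketched with a single-thread executor: submitted tasks queue up and run one at a time. SingleWorkerSketch and runSerially are hypothetical names, and the short sleep merely stands in for processing one file.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: one worker thread, pending tasks wait in a queue.
class SingleWorkerSketch {

    // Runs taskCount tasks on a single worker thread and returns the highest
    // number of tasks that were observed running at the same time.
    static int runSerially(int taskCount) {
        ExecutorService worker = Executors.newSingleThreadExecutor();
        AtomicInteger active = new AtomicInteger();
        AtomicInteger maxActive = new AtomicInteger();
        for (int i = 0; i < taskCount; i++) {
            worker.execute(() -> {
                int now = active.incrementAndGet();
                maxActive.accumulateAndGet(now, Math::max);
                try {
                    Thread.sleep(5); // stands in for processing one file
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    active.decrementAndGet();
                }
            });
        }
        worker.shutdown();
        try {
            if (!worker.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("tasks did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return maxActive.get();
    }

    public static void main(String[] args) {
        System.out.println("Max tasks running at once: " + runSerially(5));
    }
}
```

Unlike the FLOW_STATUS flag from the question, queued work is not dropped in this arrangement; it simply waits its turn, which is why the size of the waiting pool matters.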

Long running workflow in asp.net mvc

I'm developing an intranet site using ASP.NET MVC 4 to manage some of our data. One important feature of this site is triggering import/export jobs. These jobs can take anywhere between 5 minutes and 1 hour. Users of the site need to be able to determine whether a job is currently running, as well as the status of prior jobs. Many jobs will include warning messages concerning duplicate data, and these warnings need to be visible on the site.
My plan is to implement these long running processes as a WCF Workflow Service that the asp.net site will interact with. I've got much of the business logic implemented via activities and have tested it using a simple console application. I should note I'm using a correlation handle in order to partition the service based on specific "Projects" on the site.
My problem is how to go about querying the status of an active job (if one exists) as well as the warning messages of previous jobs. I suspect the best way to do this would be to use the AppFabric tracking service and have my ASP.NET site query a SQL monitoring store and report back on the current status. After setting up AppFabric and adding custom tracking messages, I ran into a few issues. The first is that I cannot figure out how to filter out workflow instances that were not using the correct correlation handle, as I'd like to show only workflows for a specific project. The other is that the tracking database can be delayed quite a bit, which causes problems when I try to determine whether a workflow is currently running.
Another possible solution could be to have the workflow explicitly update a database with its current status and any error messages. I'm leaning towards this solution but could use some expert advice.
TL;DR: I need to know the best way to query the execution status and any warning messages of a WCF Workflow service.
As you want to query workflow status and messages even after the workflow has finished, I would start by creating a table that maps the correlation values a client sends to the related workflow ID. I would create a custom activity to do that and drop it right after the receive that creates the workflow.
Next, I would create a regular WCF service that the client app uses to query the status. This WCF service can query the WF persistence store to see if a given workflow is still running. If so, the active bookmarks column will tell you which SOAP messages the workflow is currently waiting for.
As far as messages go you can either use the AppFabric tracking infrastructure to store and retrieve them or you could create a custom activity and store them in your own database. It really depends if you are also interested in the standard WF tracking messages generated.
Update on checking for running workflow instances:
There are several downsides to adding an IsRunning message to your workflow. For one, you would need to make sure one branch keeps looping and waiting for the message but stops as soon as the other, real workflow branch is done. That is certainly possible, but it complicates the workflow and is a possible source of errors. And as it is not part of the business problem, it really has no place in the workflow as far as I am concerned. It also means that you will have to load a workflow from disk and persist it back just to tell you that it is there. If it has finished, you will need to wait for a fault to indicate there was no workflow instance, and that usually means you get a timeout exception after, by default, 60 seconds. Add throttling to that and your request might be queued because too many other workflow instances or SOAP requests are being processed. So a timeout might mean that a workflow instance exists but is unreachable due to system constraints. Instead, I would opt for the simple thing and check if the record in the instance store is still available. The additional info from the active bookmarks column will tell you what the workflow is waiting on, information I have used in the past to dynamically update the UI by enabling/disabling UI elements.

How to rollback an NHibernate Transaction within NServiceBus

It's my understanding we have essentially 2 kinds of exceptions when using NServiceBus.
Environmental : Meaning any required component is not currently available. Usually resulting in a full rollback of the transaction. This is the description I see behind the rollback within NServiceBus Documentation (Including putting the message back on the bus - which sounds fantastic). How do I do this?
Validation : A message is being processed that cannot succeed because of business logic, rules, etc. Here I want to roll back all database interaction, but there's no value in keeping the command in the queue. In that case I just want to roll back the NHibernate part of the transaction, not the MSMQ portion. How do I do this? Typically I would perform validation before any single message is processed, but when multiple messages are bound together into a single transaction and you want to roll them all back, this isn't possible via pre-validation.
My assumption is either the answer is insanely obvious and I've overlooked it or what I'm trying to do isn't possible (in regards to the Validation exception).
NSB takes care of getting the message out of the way by moving it to an error queue (v2.5). In v3 this functionality is enhanced and will give you more options for handling faults (DB, custom, etc.). The error queue is configured in your app.config.
In my experience, it's easiest (and probably also more appropriate) to ensure that messages have a very high probability that they can succeed when they participate in a distributed transaction.
Therefore, most validation logic should already have been carried out when you dispatch the command message, and rollback is reserved for the truly exceptional case.
If your client cannot perform the validation, maybe you should insert a validation service in front of your current service. This validation service could route invalid command messages somewhere else before they reach the real service.
Thank you for your answers. I believe the answer lies somewhere between the two.
We are unfortunately unable to implement a validation service but we've simply added better upfront validation to the message processing logic.
Unfortunately, until we get to v3 we cannot use the error queue, because we rely on the message response functionality to alert integrators to issues with their messages, and throwing an unhandled error prevents any responses from being generated.