DMLC and concurrent consumers working - activemq

Does DMLC creates separate threads for each concurrent consumer? What happens under the hood? The documentation writes this:
Actual MessageListener execution happens in asynchronous work units which are created through Spring's TaskExecutor abstraction. By default, the specified number of invoker tasks will be created on startup, according to the "concurrentConsumers" setting.
I am not able to understand this, are these tasks executed in parallel? If yes, what are the default limits for this, like thread count etc.?
Thanks!

Yes a separate thread is used for each consumer (obtained from the task executor). By default, a SimpleAsyncTaskExecutor is used and the thread is destroyed when the consumer is stopped. There is no thread limit beyond the container's concurrency settings.
If you inject a different kind of task executor (such as a ThreadPoolTaskExecutor) you must make sure it has enough available threads to support your container's concurrency settings. Container threads are generally long-lived.

Related

How does ActiveMQ AMQ_SCHEDULED_DELAY message works?

We want to use delay feature from activeMQ to delay particural event. How does AMQ_SCHEDULED_DELAY work internaly? In documentation is information about scheduler but no information what mechanism it utilize to delay message. For that reason we are not sure how delaying is going to affect activeMQ. Does activeMQ utilize pooling or async to achive delay.
I ask this question because people from my organization want to pick diffrent technology. I do not have any proof delay from activeMQ is any better.
Here is link to source code. I was thinking of looking up code but I'm not good in java. Can anyone help?
Default implementation of ActiveMQ does utilize the polling.
Active MQ internally keep polling for the scheduled (or delayed) messages by a background scheduler thread. This thread read the list of scheduled events (or messages) and fires the jobs, reschedule repeating jobs as needed before firing the job event.
The list of scheduled events is stored in a sorted order in internal storage of activemq. So during poll, it just read event which are scheduled for earliest processing. Since the messages are persisted during enquing, scheduling many not have visible performance impact during processing.
However before adopting, you can setup your benchmark, without worries much internal implementation detail, to see that your performance/SLA requirement are getting met.
For more details, you may refer to Javadoc of job scheduler API. For default implementation can you refers to the code.
Hope this helps.
In looking at the source code mentioned by #skadya, the term "polling" is not what I interpret. It appears to use the Java Object class' wait(long timeout) method to determine when to "wake up" the thread that runs the jobs.
So, I wouldn't call it polling. I would call it an asynchronous mechanism in which the delay / timeout is set such that the thread will wake up (e.g. to run the next scheduled job at the appropriate time) via the timeout set to a value that is appropriate for the next scheduled job's commencement.
Javadoc for Object.wait(long timeout)
Note that the implementation for Object.wait is a native (i.e. non-java) implementation provided by the JDK / JRE / JVM for a given platform. For what that's worth.
It is possible to do performance test with activemq web console. There is an option to send message with configurable delay and number of messages to send. It doesn't answer my question but it seems like best option to compare two approaches.

How to figure out if mule flow message processing is in progress

I have a requirement where I need to make sure only one message is being processed at a time by a mule flow.Flow is triggered by a quartz scheduler which reads one file from FTP server every time
My proposed solution is to keep a global variable "FLOW_STATUS" which will be set to "RUNNING" when a message is received and would be reset to "STOPPED" once the processing of message is done.
Any messages fed to the flow will check for this variable and abort if "FLOW_STATUS" is "RUNNING".
This setup seems to be working , but I was wondering if there is a better way to do it.
Is there any best practices around this or any inbuilt mule helper functions to achieve the same instead of relying on global variables
It seems like a more simple solution would be to set the maxActiveThreads for the flow to 1. In Mule, each message processed gets it's own thread. So setting the maxActiveThreads to 1 would effectively make your flow singled threaded. Other pending requests will wait in the receiver threads. You will need to make sure your receiver thread pool is large enough to accommodate all of the potential waiting threads. That may mean throttling back your quartz scheduler to allow time process the files so the receiver thread pool doesn't fill up. For more information on the thread pools and how to tune performance, here is a good link: http://www.mulesoft.org/documentation/display/current/Tuning+Performance

On Heroku, does utilising Node.js prevent the need for queues + worker dynos for third-party API calls?

The Heroku Dev Center on the page about using worker dynos and background jobs states that you need to use worker's + queues to handle API calls, such as fetching an RSS feed, as the operation may take some time if the server is slow and doing this on a web dyno would result in it being blocked from receiving additional requests.
However, from what I've read, it seems to me that one of the major points of Node.js is that it doesn't suffer from blocking under these conditions due to its asynchronous event-based runtime model.
I'm confused because wouldn't this imply that it would be ok to do API calls (asynchronously) in the web dynos? Perhaps the docs were written more for the Ruby/Python/etc use cases where a synchronous model was more prevalent?
NodeJS is an implementation of the reactor pattern. The default build of of NodeJS uses 5 reactors. Once these 5 reactors are being used for IO bound tasks, the main event loop will block.
A common misconception about NodeJS is that it is a system that allows you to do many things at once. This is not necessarily the case, it allows you to do other things while waiting on IO bound tasks, up to 5 at a time.
Any CPU bound tasks are always executed in the main event loop, meaning they will block.
This means if your "job" is IO bound, like putting things in databases then you can probably get away with not using dynos. This of course is dependent on how many things you plan on having go on at once. Remember, any task you put in your main app will take away resources from other incoming requests.
Generally it is not recommended for things like this, if you have a job that does some processing, it belongs in a queue that is executed in its own process or thread.

NServiceBus: GridInterceptingMessageHandler

What is "GridInterceptingMessageHandler"? I did a search and I can find no mention of this on nservicebus.com. Also, I see the samples have the line:
.LoadMessageHandlers(First<GridInterceptingMessageHandler>.Then<SagaMessageHandler>())
What does that do exactly?
If you look at the source and its documentation you'll see the following:
Intercepts all messages, not allowing any through if the endpoint has had its number of worker threads reduced to zero.
GridInterceptingMessageHandler
NSB allows you to dynamically tune the number of work threads and endpoint is using to process messages. If the number of work threads has been reduced to zero, the endpoint becomes disabled and will not continue to process messages. The tuning of threads is useful if you would like to increase the speed of message processing(assuming everything else will scale as well) while not having to restart the endpoint.
This is especially helpful if you want to slowing drain the system of messages so that you can perform upgrades or other maintenance duties. By default this is wired up for you, you would only reference it if you decided to override how the message handlers are loaded(as in the example).

WCF: Is it safe to spawn an asynchronous worker thread on the server?

I have a WCF service method that I want to perform some action asynchronously (so that there's little extra delay in returning to the caller). Is it safe to spawn a System.ComponentModel.BackgroundWorker within the method? I'd actually be using it to call one of the other service methods, so if there were a way to call one of them asynchronously, that would work to.
Is a BackgroundWorker the way to go, or is there a better way or a problem with doing that in a WCF service?
BackgroundWorker is really more for use within a UI. On the server you should look into using a ThreadPool instead.
when-to-use-thread-pool-in-c has a good write-up on when to use thread pools. Essentially, when handling requests on a server it is generally better to use a thread pool for many reasons. For example, over time you will not incur the extra overhead of creating new threads, and the pool places a limit on the total number of active threads at any given time, which helps conserve system resources while under load.
Generally BackgroundWorker is discussed when a background task needs to be performed by a GUI application. For example, the MSDN page for System.ComponentModel.BackgroundWorker specifically refers to a UI use case:
The BackgroundWorker class allows you to run an operation on a separate, dedicated thread. Time-consuming operations like downloads and database transactions can cause your user interface (UI) to seem as though it has stopped responding while they are running. When you want a responsive UI and you are faced with long delays associated with such operations, the BackgroundWorker class provides a convenient solution.
That is not to say that it could not be used server-side, but the intent of the class is for use within a UI.