Kotlin coroutines: concurrent execution throttling

Imagine we are reading messages from a message queue and, on receipt, pushing them into a thread pool for processing. There is a limited number of threads, so if all the threads are busy, we get natural backpressure.
How can this be solved in the Kotlin coroutine world? If we create a coroutine for each incoming message, we can very quickly end up with out-of-memory errors (e.g. if each task needs to load some data from a DB) and other issues.
Are there any mechanisms or patterns to solve this problem?

One way to solve the issue is to create a Channel and send your data to it. Worker coroutines can call consumeEach on the channel to receive data from it. The channel's capacity can be tuned to your concurrency needs.
Fan-out and Fan-in examples in the coroutines docs can be helpful too.
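As a rough, language-neutral sketch of the same idea, here is a plain Java version in which an ArrayBlockingQueue plays the role of the Channel and a fixed set of worker threads plays the role of the consuming coroutines. The class name BoundedFanOut, the stop sentinel and the 2 ms sleep are made up for illustration; in Kotlin you would use a Channel with a capacity and a fixed number of consumer coroutines instead.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration: a bounded queue stands in for a coroutine Channel.
class BoundedFanOut {
    // Unique sentinel object telling a worker to exit (compared by identity).
    private static final String STOP = new String("STOP");

    /** Pushes `messages` items through `workers` consumers; returns {processed, peak in flight}. */
    static int[] run(int messages, int workers, int capacity) {
        BlockingQueue<String> channel = new ArrayBlockingQueue<>(capacity);
        AtomicInteger inFlight = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        AtomicInteger processed = new AtomicInteger();

        Thread[] pool = new Thread[workers];
        for (int i = 0; i < workers; i++) {
            pool[i] = new Thread(() -> {
                try {
                    while (true) {
                        String msg = channel.take();          // blocks while the channel is empty
                        if (msg == STOP) return;
                        int now = inFlight.incrementAndGet();
                        peak.accumulateAndGet(now, Math::max); // record the highest concurrency seen
                        Thread.sleep(2);                       // stand-in for real work
                        inFlight.decrementAndGet();
                        processed.incrementAndGet();
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            pool[i].start();
        }

        try {
            for (int i = 0; i < messages; i++) {
                channel.put("msg-" + i);                       // blocks when full: backpressure
            }
            for (int i = 0; i < workers; i++) channel.put(STOP);
            for (Thread t : pool) t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return new int[] {processed.get(), peak.get()};
    }
}
```

Calling BoundedFanOut.run(20, 3, 2) processes all 20 messages while never running more than 3 at once; the producer stalls in put whenever 2 messages are already waiting.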

Related

Is there a way to automatically unsubscribe from a channel in Redis?

I have simple code that subscribes to a channel, receives one message and then unsubscribes.
I am using StackExchange.Redis, which, as far as I can see, has one connection to Redis for subscriptions.
The method that I described will be called by multiple threads simultaneously, and the channel is dynamic. What I am wondering is what happens if one of the threads cannot perform the unsubscribe (e.g. due to an exception).
If this keeps happening, I'll have a lot of useless stale subscriptions that no one is listening to, since, from what I understand, a subscription is not closed when the ChannelMessageQueue goes out of scope and is eventually garbage collected.
Is there a good way to handle this situation?

ActiveMQ prevent consumer handling specific message

We have a design challenge where the situation is as follows:
There are multiple producers and multiple consumers (on the same queue).
Each message represents a task with parameters that a consumer needs to handle.
The problem is that certain tasks take a lot of memory (and CPU power), more than a given consumer has the capacity to handle. The good thing is that we know in advance approximately how much memory (and CPU power) a task will take, so we could prevent a consumer from taking that task and give a chance to another consumer with enough memory to handle it.
There is the prefetch setting, but I can't see how it can be configured to meet this requirement.
Finally, I found an option to roll back a transaction, so the consumer can check whether it has enough hardware resources to handle the task and, if not, roll back, which returns the message to the queue and allows the next consumer to take it, and so forth.
I'm not sure whether that's the right approach or whether there is a better way.
The messages could have properties set which indicate whether or not they will require high CPU and/or memory and then consumers could use selectors to only receive the messages which fit their hardware constraints.

How does the ActiveMQ AMQ_SCHEDULED_DELAY message work?

We want to use ActiveMQ's delay feature to delay a particular event. How does AMQ_SCHEDULED_DELAY work internally? The documentation describes the scheduler but says nothing about the mechanism it uses to delay a message, so we are not sure how delaying will affect ActiveMQ. Does ActiveMQ use polling or an asynchronous mechanism to achieve the delay?
I ask because people in my organization want to pick a different technology, and I have no proof that ActiveMQ's delay is any better.
Here is a link to the source code. I was thinking of reading the code, but I'm not good at Java. Can anyone help?
The default implementation in ActiveMQ does use polling.
ActiveMQ internally keeps polling for scheduled (or delayed) messages on a background scheduler thread. This thread reads the list of scheduled events (or messages) and fires the jobs, rescheduling repeating jobs as needed before firing the job event.
The list of scheduled events is kept in sorted order in ActiveMQ's internal storage, so during a poll it only reads the events scheduled for the earliest processing. Since the messages are persisted when they are enqueued, scheduling may not have a visible performance impact during processing.
However, before adopting it, you can set up your own benchmark, without worrying much about the internal implementation details, to check that your performance/SLA requirements are met.
For more details, you may refer to the Javadoc of the job scheduler API. For the default implementation, you can refer to the code.
Hope this helps.
Looking at the source code mentioned by @skadya, "polling" is not how I would describe it. It appears to use the Java Object class's wait(long timeout) method to determine when to "wake up" the thread that runs the jobs.
So, I wouldn't call it polling. I would call it an asynchronous mechanism in which the delay / timeout is set such that the thread will wake up (e.g. to run the next scheduled job at the appropriate time) via the timeout set to a value that is appropriate for the next scheduled job's commencement.
Javadoc for Object.wait(long timeout)
Note that the implementation for Object.wait is a native (i.e. non-java) implementation provided by the JDK / JRE / JVM for a given platform. For what that's worth.
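To make the distinction concrete, here is a minimal sketch of a runner thread that sleeps in Object.wait(long timeout) until the next job is due, assuming a single runner and an in-memory job list. It is not ActiveMQ's code; TimedWaitScheduler, Job and demo are hypothetical names used only for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical sketch, not ActiveMQ's actual scheduler.
class TimedWaitScheduler {
    private static final class Job implements Comparable<Job> {
        final long dueAtMillis;
        final String name;
        Job(long dueAtMillis, String name) { this.dueAtMillis = dueAtMillis; this.name = name; }
        public int compareTo(Job other) { return Long.compare(dueAtMillis, other.dueAtMillis); }
    }

    private final PriorityQueue<Job> jobs = new PriorityQueue<>(); // earliest due time first
    private final List<String> fired = new ArrayList<>();
    private boolean stopped = false;

    synchronized void schedule(String name, long delayMillis) {
        jobs.add(new Job(System.currentTimeMillis() + delayMillis, name));
        notifyAll(); // the new job may be due before the one the runner is waiting for
    }

    synchronized void stop() { stopped = true; notifyAll(); }

    synchronized List<String> firedJobs() { return new ArrayList<>(fired); }

    /** Runner loop: waits until the next job is due instead of checking repeatedly. */
    synchronized void runLoop() throws InterruptedException {
        while (!stopped) {
            Job next = jobs.peek();
            if (next == null) { wait(); continue; }            // nothing scheduled: wait for notify
            long remaining = next.dueAtMillis - System.currentTimeMillis();
            if (remaining > 0) { wait(remaining); continue; }  // timed wait, then re-check state
            jobs.poll();
            fired.add(next.name);                              // "fire" the job
        }
    }

    /** Schedules two jobs out of order and returns the order in which they fired. */
    static List<String> demo() {
        TimedWaitScheduler scheduler = new TimedWaitScheduler();
        Thread runner = new Thread(() -> {
            try { scheduler.runLoop(); } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        });
        runner.start();
        scheduler.schedule("late", 150);
        scheduler.schedule("early", 30);
        try {
            Thread.sleep(400);
            scheduler.stop();
            runner.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return scheduler.firedJobs();
    }
}
```

The runner wakes up only when a timeout expires or when schedule/stop notifies it, so an idle scheduler costs no CPU between jobs.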
It is possible to run a performance test with the ActiveMQ web console. There is an option to send messages with a configurable delay and a configurable number of messages. It doesn't answer my question, but it seems like the best option for comparing the two approaches.

RabbitMQ: throttling fast producer against large queues with slow consumer

We're currently using RabbitMQ, where a continuously super-fast producer is paired with a consumer constrained by a limited resource (e.g. slow-ish MySQL inserts).
We don't like declaring a queue with x-max-length, since messages will be dropped or dead-lettered once the limit is reached, and we don't want to lose messages.
Adding more consumers is easy, but they'll all be limited by the one shared resource, so that won't work. The problem still remains: How to slow down the producer?
Sure, we could put a flow control flag in Redis, memcached, MySQL or something else that the producer reads as pointed out in an answer to a similar question, or perhaps better, the producer could periodically test for queue length and throttle itself, but these seem like hacks to me.
I'm mostly questioning whether I have a fundamental misunderstanding. I had expected this to be a common scenario, and so I'm wondering:
What is best practice for throttling producers? How is this done with RabbitMQ? Or do you do this in a completely different way?
Background
Assume the producer actually knows how to slow itself down given the right input, e.g. a hardware sensor or hardware random number generator that can generate as many events as needed.
In our particular real case, we have an API that users can use to add messages. Instead of devouring and discarding messages, we'd like to apply back-pressure by having our API return an error if the queue is "full", so the caller/user knows to back-off, or have the API block until the consumer catches up. We don't control our user, so regardless of how fast the consumer is, I can create a producer that is faster.
I was hoping for something like the API for a TCP socket, where a write() can block and where a select() can be used to determine if a handle is writable. So either having the RabbitMQ API block or have it return an error if the queue is full.
For the x-max-length property, you said you don't want messages to be dropped or dead-lettered. I see there has been an update adding more capabilities for this. As specified in the documentation:
"Use the overflow setting to configure queue overflow behaviour. If overflow is set to reject-publish, the most recently published messages will be discarded. In addition, if publisher confirms are enabled, the publisher will be informed of the reject via a basic.nack message"
So as I understand it, you can use queue limit to reject the new messages from publishers thus pushing some backpressure to the upstream.
I don't think this is in any way RabbitMQ-specific. Basically, you have a scenario with two systems of different processing capabilities, and this mismatch will either risk overflowing the queue (whatever it is) or, if the mismatch between producer and consumer is constant, simply create a growing delay between event creation and event handling.
I used to deal with this kind of scenario, and unfortunately there is no magic bullet. You either have to speed up event handling (better hardware, better-suited software?) or throttle event creation (which has nothing to do with MQ, really).
Now, I would ask what the goal is and how the events are produced. Are the events produced constantly, at an unlimited or just very high rate (for example, readings from sensors: the more, the better), or are they created in batches/spikes (for example, user requests in specific time periods, or batch loads from a CRM system)? I assume the goal is to process everything, because you mention you don't want to lose any queued message.
If the output is constant, then some limiter (either an internal counter, if the producer is the only producer, or external queue-length checks, if the queue can also be filled by some other system) is definitely in order.
IF eventsInTimePeriod / timePeriod > estimatedConsumerBandwidth
    THEN LowerRate()
    ELSE RiseRate()
In real-world scenarios we used to simply limit the output manually to the estimated values, and there were alerts set for queue length, time from queue entry to queue exit, etc. Where such limiters were omitted (mostly by mistake), we later found tasks that were supposed to be handled within a few hours but had been waiting three months for their turn.
I'm afraid it's hard to answer "How to slow down the producer?" if we know nothing about it, but some ideas are the aforementioned rate check or maybe a blocking AddMessage method:
AddMessage(message)
    WHILE (getQueueLength() > maxAllowedQueueLength)
        spin(1000); // or sleep or whatever
    mqAdapter.AddMessage(message)
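A variant of the blocking AddMessage idea, sketched in Java under the assumption that the API process keeps a small in-memory buffer in front of the broker: the call blocks for a bounded time and then reports failure, so the caller can be told to back off. BackpressureGate and its methods are hypothetical names, not part of any RabbitMQ client API.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical in-process buffer placed in front of the broker; not a RabbitMQ API.
class BackpressureGate {
    private final BlockingQueue<String> buffer;

    BackpressureGate(int maxQueued) {
        buffer = new ArrayBlockingQueue<>(maxQueued);
    }

    /**
     * Blocks for up to maxWaitMillis waiting for free space.
     * Returns false if the buffer is still full, so the caller can report "busy, retry later".
     */
    boolean addMessage(String message, long maxWaitMillis) {
        try {
            return buffer.offer(message, maxWaitMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    /** Called by the consumer side; returns null if nothing is queued. */
    String poll() {
        return buffer.poll();
    }
}
```

With a capacity of 2, a third addMessage call returns false after the timeout until the consumer has polled at least one message, which covers both behaviours asked for in the question: block for a while, then return an error.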
I'd say it all depends on the specifics of the producer application and on your architecture in general.

blocking call on two Queues?

I have an algorithm (a task in VxWorks) that reads data from multiple queues so that it can manage priorities accordingly. The msgQReceive() function can be set to WAIT_FOREVER, which makes it a blocking call until something is available to receive and process. How can I do this if I have multiple queues? Currently I check in a while(1) loop whether any of the queues has content and receive it if so, but if nothing is there, my algorithm just spins and eats CPU resources for nothing. How can I best prevent this?
You should be able to use VxWorks events coupled with a Message Queue.
See msgQEvStart function and Kernel Programmer's Guide, section 7.9.
This is akin to using a select() for I/O operation.
You do a blocking eventReceive which returns a bitmask indicating which queue has content and you then do a non-blocking msgQReceive to retrieve the data.
Or you can look at "How can a task wait on multiple vxworks Queues?", which I wrote a while ago.
As already mentioned, you could use events. Alternatively, if you can use a pipe instead of a msgQ, you could potentially use select.
As another alternative, perhaps consider having multiple tasks, each servicing a single msgQ.