We have a setup with many consumers on a single queue.
The problem is that only a subset of those consumers seem to be doing any work.
Example
One queue has 120 consumers and about 1000 messages.
It seems to process only 20 messages at a time, though.
Any ideas?
It sounds like you're running into the prefetch count limit. I believe the default is 20, although the exact default depends on the client library you use.
From https://rabbitmq.docs.pivotal.io/36/rabbit-web-docs/consumer-prefetch.html:
channel.basicQos(10, false); // Per consumer limit
channel.basicQos(15, true); // Per channel limit
Just remember that there are design complexities to working with large numbers of concurrent operations. (It can be done, but be careful that you're maintaining data integrity.)
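To see why a shared limit would cap the whole pool at 20, here is a rough model of the two limits shown above. It is only a sketch: it assumes all consumers share one channel (the question doesn't say), treats 0 as "no limit", and the function name is made up for illustration.

```python
def max_unacked_on_channel(consumers, per_consumer=0, per_channel=0):
    """Upper bound on delivered-but-unacknowledged messages on one channel.

    Simplified model: a per-consumer limit applies to each consumer
    separately, a per-channel limit is shared by all consumers on the
    channel, and 0 means "no limit". Returns None when nothing is limited.
    """
    bounds = []
    if per_consumer > 0:
        bounds.append(consumers * per_consumer)
    if per_channel > 0:
        bounds.append(per_channel)
    return min(bounds) if bounds else None


# 120 consumers behind a shared limit of 20: at most 20 messages in flight.
shared = max_unacked_on_channel(120, per_channel=20)
# The same consumers with a per-consumer limit of 10 instead.
separate = max_unacked_on_channel(120, per_consumer=10)
```

If that model matches your setup, raising the shared limit or moving to a per-consumer limit should let more of the 120 consumers receive work.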
Is there an upper limit to the number of unique IEndpointInstances that can be hosted within a single process?
I'm considering a design that will see up to 100 unique IEndpointInstances, all listening on separate queues, active simultaneously.
Will this cause a problem for NServiceBus? Could the process deadlock or spin up so many threads as to be unresponsive and useless?
The question "NServiceBus - How to get separate queue for each message type receiver subscribes to?" seems to suggest that you cannot have multiple endpoints in a process, but this is an older post. I have built a small sample against NServiceBus 6--beta4 that does work.
There is a similar question, "NServiceBus Single Process, but Multiple Input queues", which concluded that, given the OP's context, using Satellite Features was the recommended approach. However, in my case, I have 100 (functionally different) sagas (1 per queue), where each saga could need to receive similar messages, but I need to make sure that only the correct saga receives the message. Therefore, I don't think implementing a custom feature will meet my requirements. Or will Satellite Features support Sagas?
One of the options is to use self multi-hosting. Using this approach, you host the endpoints yourself in the same process. There are a few things to take into consideration, such as:
Assembly scanning (might require custom scanning logic per endpoint).
Throughput (for heavy throughput endpoints I'd recommend a separate hosting process).
To update/redeploy a single endpoint, you'll be taking all of the other 99 endpoints down as well.
While there's no hard limit on how many endpoints can be co-hosted, 100 sounds like a lot. That said, it also depends on how heavy the load on those endpoints is. Whether you process 1 msg/sec or 1K msg/sec goes a long way toward determining if this is a viable option or not.
Have a look at the sample that does exactly that.
I'm looking to solve a problem that I have with the FIFO nature of messaging servers and queues. In some cases, I'd like to distribute the messages in a queue to the pool of consumers based on a criterion other than the order in which they were delivered. Ideally, this would prevent users from hogging shared resources in the system. Take this overly simplified scenario:
There is a feature within an application where a user can empty their trash can.
This event dispatches a DELETE message for each item in the trash can.
The consumers for this queue invoke a web service that has a rate limited API.
Given that each user can have very large volumes of messages in their trash can, what options do we have to allow concurrent processing of each trash can without regard to the enqueue time? It seems to me that there are a few obvious solutions:
Create a separate queue and pool of consumers for each user
Randomize the message delivery from a single queue to a single pool of consumers
In our case, creating a separate queue and managing the consumers for each user really isn't practical. It can be done but I think I really prefer the second option if it's reasonable. We're using RabbitMQ but not necessarily tied to it if there is a technology more suited to this task.
I'm entertaining the idea of using Rabbit's message priorities to help randomize delivery. By randomly assigning a message a priority between 1 and 10, this should help distribute the messages. The problem with this method is that the messages with the lowest priority may be stuck in the queue forever if the queue is never completely emptied. I thought I could use a TTL on the message and then re-queue the message with an escalated priority but I noticed this in the docs:
Messages which should expire will still only expire from the head of
the queue. This means that unlike with normal queues, even per-queue
TTL can lead to expired lower-priority messages getting stuck behind
non-expired higher priority ones. These messages will never be
delivered, but they will appear in queue statistics.
I fear that I may be heading down the rabbit hole with this approach. I wonder how others are solving this problem. Any feedback on creative routing, messaging patterns, or any alternative solutions would be appreciated.
So I ended up taking a page out of the network router handbook. This is a problem that routers need to solve to allow fair traffic patterns. This video has a good breakdown of the problem and the solution.
The translation of the problem into my domain:
And the solution:
The load balancer is a wrapper around a channel and a known number of queues that uses a weighted algorithm to balance between messages received on each queue. We found a really interesting article/implementation that seems to be working well so far.
With this solution, I can also prioritize workspaces after messages have been published to increase their throughput. That's a really nice feature.
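For anyone curious what "a weighted algorithm to balance between queues" can look like, here is a minimal sketch using smooth weighted round-robin. To be clear, this is not the article's implementation: the in-memory deques stand in for the real RabbitMQ queues, and the queue names and weights are made up.

```python
from collections import deque


def weighted_round_robin(queues, weights):
    """Yield (queue_name, message) pairs until every queue is empty.

    Smooth weighted round-robin: on each step every non-empty queue earns
    its weight in credit, the queue with the most credit is served, and
    that queue then pays back the total weight of the non-empty queues.
    """
    credit = {name: 0 for name in queues}
    while any(queues.values()):
        active = [name for name in queues if queues[name]]
        total = sum(weights[name] for name in active)
        for name in active:
            credit[name] += weights[name]
        best = max(active, key=lambda name: credit[name])
        credit[best] -= total
        yield best, queues[best].popleft()


# Hypothetical workspaces: ws_a gets twice the share of ws_b.
queues = {"ws_a": deque(["a1", "a2", "a3", "a4"]), "ws_b": deque(["b1", "b2"])}
order = [msg for _, msg in weighted_round_robin(queues, {"ws_a": 2, "ws_b": 1})]
```

Because the weights are just a dictionary, bumping a workspace's weight after its messages have been published changes its share from the next step on.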
The biggest challenge ahead of me is management of the queues. There will be too many queues to leave bound to the exchange for an extended period of time. I'm working on some tools to manage their lifecycle.
One solution could be to interpose a Resequencer. The principle is outlined in the diagram in that link. In your case, something like:
The app dispatches its DELETE messages into the delete queue as originally.
The Resequencer (a new component you write) is interposed between the original publishers and original consumers. It:
pulls messages off the DELETE queue into memory
places them into (in-memory) queues-by-user
republishes them to a new queue (eg FairPriorityDeleteQueue), round-robinning to interleave fairly any messages from different original users
limits its republish rate into FairPriorityDeleteQueue, either such that the length of FairPriorityDeleteQueue (obtainable by periodically polling the RabbitMQ management API) never exceeds some integer N that you choose, or to some rate related to the rate-limited delete API the consumers use.
doesn't ack any message it pulled off the original DELETE queue, until it's republished it to FairPriorityDeleteQueue (so you never lose a message)
The original consumers subscribe instead to FairPriorityDeleteQueue.
You set the preFetchCount on these consumers fairly low (<10), to prevent them in turn bulk-buffering the contents of FairPriorityDeleteQueue in memory.
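The round-robin step at the heart of the Resequencer could be sketched like this. It only shows the in-memory interleaving; pulling from the DELETE queue, republishing, rate limiting and acking are left out, and the (user, payload) message shape is an assumption for illustration.

```python
from collections import deque


def interleave_by_user(messages):
    """Reorder (user, payload) pairs so users take turns.

    Messages are first bucketed into per-user FIFO queues, then emitted
    one per user per round, so each user's own order is preserved.
    """
    per_user = {}
    for user, payload in messages:
        per_user.setdefault(user, deque()).append(payload)

    fair_order = []
    while per_user:
        # Copy the keys, because finished users are removed during the round.
        for user in list(per_user):
            fair_order.append((user, per_user[user].popleft()))
            if not per_user[user]:
                del per_user[user]
    return fair_order


incoming = [("u1", "f1"), ("u1", "f2"), ("u1", "f3"),
            ("u2", "f1"), ("u3", "f1"), ("u2", "f2")]
fair = interleave_by_user(incoming)
```

Here u1 no longer blocks u2 and u3, even though all of u1's messages were enqueued first.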
--
Some points to watch:
Rate- or length-limiting publishing into and/or drawing messages out of FairPriorityDeleteQueue is essential. If you don't limit, Resequencer may just hand messages on as fast as it receives them, limiting the potential for resequencing.
Resequencer of course acts as a kind of in-memory buffer while resequencing. If the original publishers can suddenly publish very large numbers of messages into the queue, you may need to memory-limit the Resequencer process so that it doesn't ingest more than it can hold.
Your particular scenario is greatly helped by the fact that you have an external factor (the final delete API) limiting throughput. Without such an extrinsic limiting factor, it is much harder to choose the optimum parameters for such a resequencer, to balance throughput-versus-resequencing in a particular environment.
I don't think a resequencer is needed in this case. Maybe it is, if you need to ensure the items are deleted in a specific order. But that only comes into play when you send multiple messages at roughly the same time and need to guarantee order on the consumer end.
You should also avoid the timeout scenario, for the reasons you've mentioned. A timeout (TTL) is meant to tell RabbitMQ that a message doesn't need to be processed, or that it needs to be routed to a dead letter queue so that it can be processed by some other code. While you might be able to make a timeout work, I don't think it's a good choice.
Priorities may solve part of the problem, but could introduce a scenario where files never get processed. If you have a priority 1 message sitting back in the queue somewhere, and you keep putting priority 2, 3, 5, 10, etc. messages into the queue, the priority 1 message might never be processed. The timeout doesn't solve this, as you've noted.
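A toy simulation shows the starvation. This is plain Python with a heap standing in for a priority queue, not RabbitMQ itself, and the arrival rate and message labels are made up.

```python
import heapq
import itertools


def simulate_priority_queue(steps, arrivals_per_step=2):
    """Deliver one message per step while higher-priority messages keep arriving.

    Returns (delivered_messages, remaining_backlog).
    """
    sequence = itertools.count()
    heap = []
    # heapq is a min-heap, so priorities are negated to pop the highest first;
    # the sequence number keeps FIFO order among equal priorities.
    heapq.heappush(heap, (-1, next(sequence), "priority-1 delete"))
    delivered = []
    for _ in range(steps):
        for _ in range(arrivals_per_step):
            heapq.heappush(heap, (-5, next(sequence), "priority-5 delete"))
        if heap:
            delivered.append(heapq.heappop(heap)[2])
    return delivered, len(heap)


delivered, backlog = simulate_priority_queue(steps=100)
starved = "priority-1 delete" not in delivered
```

As long as arrivals outpace deliveries, the priority 1 message stays at the back no matter how long the simulation runs.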
For my money, I would suggest a different approach: sending delete requests serially, for a single file.
That is, send 1 message to delete 1 file. Wait for a response to say it's done. Then send the next message to delete the next file.
Here's why I think that will work, and how to manage it:
Long-Running Workflow, Single File Delete Requests
In this scenario, I would suggest taking a multi-step approach to the problem using the idea of a "saga" (aka a long-running workflow object).
When a user requests to delete their trashcan, you send a single message through RabbitMQ to the service that can handle the delete process. That service creates an instance of the saga for that user's trashcan.
The saga gathers a list of all files in the trashcan that need to be deleted. Then it starts to send the requests to delete the individual files, one at a time.
With each request to delete a single file, the saga waits for the response to say the file was deleted.
When the saga receives the message to say the previous file has been deleted, it sends out the next request to delete the next file.
Once all the files are deleted, the saga updates itself and any other part of the system to say the trash can is empty.
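In code, the saga's state machine might look roughly like this. It's a bare-bones sketch: the class and method names are made up, the messaging itself is left out, and a real saga would also need persistence and error handling.

```python
class TrashcanSaga:
    """Tracks one user's trashcan and allows one delete request at a time."""

    def __init__(self, user, files):
        self.user = user
        self.pending = list(files)
        self.in_flight = None
        self.done = not self.pending

    def next_request(self):
        """Return the next delete request, or None while waiting or when finished."""
        if self.in_flight is not None or not self.pending:
            return None
        self.in_flight = self.pending.pop(0)
        return {"user": self.user, "delete": self.in_flight}

    def on_deleted(self, file_name):
        """Handle the confirmation for the file that is currently in flight."""
        if file_name != self.in_flight:
            raise ValueError(f"unexpected confirmation for {file_name!r}")
        self.in_flight = None
        self.done = not self.pending


saga = TrashcanSaga("u1", ["a.txt", "b.txt"])
first = saga.next_request()          # request for a.txt goes out
blocked = saga.next_request()        # nothing more until a.txt is confirmed
saga.on_deleted("a.txt")
second = saga.next_request()         # now b.txt goes out
saga.on_deleted("b.txt")
```

The important part is that next_request returns nothing while a delete is in flight, which is what forces the one-at-a-time behaviour.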
Handling Multiple Users
When you have a single user requesting a delete, things will happen fairly quickly for them. They will get their trash emptied soon.
u1 = User 1 Trashcan Delete Request
|u1|u1|u1|u1|u1|u1|u1|u1|u1|u1done|
When you have multiple users requesting a delete, the process of sending one file delete request at a time means each user will have an equal chance of getting the next file delete.
u1 = User 1 Trashcan Delete Request
u2 = User 2 Trashcan Delete Request
|u1|u2|u1|u1|u2|u2|u1|u2|u1|u2|u2|u1|u1|u1|u2|u2|u1|u2|u1|u1done|u2|u2done|
This way, there will be shared use of the resources to delete the files. Over-all, it will take a little longer for each person's trashcan to be emptied, but they will see progress sooner and that's an important aspect of people thinking the system is fast / responsive to their request.
Optimizing Small File Set vs Large File Set
In a scenario where you have a small number of users with a small number of files, the above solution may prove to be slower than if you deleted all the files at once. After all, there will be more messages sent across RabbitMQ: at least 2 for every file that needs to be deleted (one delete request, one delete confirmation response).
To optimize this further, you could do a couple of things:
Have a minimum trashcan size before you split up the work like this. Below that minimum, you just delete it all at once.
Chunk the work into groups of files, instead of one at a time. Maybe 10 or 100 files would be a better group size than 1 file at a time.
Either (or both) of these solutions would help to improve the over-all performance of the process by reducing the number of messages being sent, and batching the work a bit.
You would need to do some testing in your real scenario to see which of these (or maybe both) would help and at what settings.
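Both optimizations fit in one small helper. The thresholds below (50 and 10) are placeholders to tune in your own testing, and the function name is made up.

```python
def plan_delete_batches(files, min_split_size=50, chunk_size=10):
    """Split a trashcan into the batches that would each get one delete message.

    Below min_split_size everything goes out as a single batch; at or above
    it, the files are chunked into groups of chunk_size.
    """
    files = list(files)
    if not files:
        return []
    if len(files) < min_split_size:
        return [files]
    return [files[i:i + chunk_size] for i in range(0, len(files), chunk_size)]


small = plan_delete_batches(range(5))    # one batch of 5
large = plan_delete_batches(range(95))   # 9 batches of 10 plus one of 5
```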
Many Users Problem
There's one additional problem you may face - many users. If you have 2 or 3 users requesting deletes, it won't be a big deal.
But if you have 100 or 1000 users requesting deletes, it could take a very long time for an individual to get their trashcan emptied.
You may need to have a higher level controlling process for this situation, where all requests to empty trashcans would be managed by yet another Saga. This saga would rate-limit the number of active trashcan-deletion sagas.
For example, if you have 10 active requests for deleting trashcans, the rate-limiting saga would only start 3 of them and it would wait for one to finish before starting the next one.
Again, you would need to test your actual scenario to see if this is needed and see what the limits should be, for performance reasons.
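The controlling saga boils down to a small admission gate. Again, this is a sketch only: the names are made up, the limit of 3 comes from the example above, and persistence and failure handling are not shown.

```python
from collections import deque


class TrashcanCoordinator:
    """Limits how many trashcan-deletion sagas are active at once."""

    def __init__(self, max_active=3):
        self.max_active = max_active
        self.active = set()
        self.waiting = deque()

    def request(self, user):
        """Register a request; return True if its saga may start right away."""
        if len(self.active) < self.max_active:
            self.active.add(user)
            return True
        self.waiting.append(user)
        return False

    def finished(self, user):
        """Mark a saga as done; return the user whose saga starts next, if any."""
        self.active.discard(user)
        if self.waiting and len(self.active) < self.max_active:
            next_user = self.waiting.popleft()
            self.active.add(next_user)
            return next_user
        return None


coordinator = TrashcanCoordinator(max_active=3)
started = [coordinator.request(f"u{i}") for i in range(10)]
promoted = coordinator.finished("u0")   # the first waiting user takes the free slot
```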
There may be additional scenarios that have to be considered in your actual scenario, but I hope this gets you down the path! :)
All,
I have a problem with RabbitMQ performance when consuming messages from a queue that holds a large number of messages, e.g. 280,000. Throughput seems to go up and down. The graph taken from the management console shows this: a consumer averages around 40 messages per second but then jumps up to around 120 messages per second:
The pattern then repeats itself, going back to 40 and up to 120 again, and so on.
Also, if I run the same test 1 hour later, the same up and down effect occurs but the range can vary widely, e.g. from 140 to 400 messages per second.
Note: The consumer does nothing with the messages
Note: Single consumer and ConsumerMessagePrefetchCount = 500
In relation to performance I have the following questions:
Is this up and down behaviour normal and expected or should the consumption speed of messages be steady?
Are the numbers that I am quoting expected or should they be much better/worse?
Any help appreciated
Billy
This behavior is quite normal. Queues are designed to stay close to empty. 280,000 is a high number; it means that the producer is faster than the consumer(s), so you have to increase the number of consumers.
If this is a load spike, 280,000 may not be a high number, because you have time to consume the messages afterwards.
There are lots of techniques to increase performance, for example:
Increase the number of consumer threads (how many threads do you use to consume the messages?)
Consume messages with noAck
PrefetchCount is very important; a high value is not necessarily the right solution.
The consumers should be steady, but the producers should be steady too. In a load spike situation you need more time or more resources.
A few questions:
What rate do you have?
Do you consume the messages from the same queue?
Do you need the ACK?
Why do you have 280,000 messages in the queue? Is it just a test or a real situation?
I hope it can be useful
As Alexis Richardson (RabbitMQ) said:
The easiest way to increase performance is to change what you are measuring ;-)
I'm setting up a web service with pyramid. A typical request for a view will be very long, about 15 min to finish. So my idea was to queue jobs with celery and a rabbitmq broker.
I would like to know what would be the best way to ensure that bad things cannot happen.
Specifically, I would like to prevent the task queue from overflowing, for example.
A first measure will be defining quotas per IP, to limit the number of requests a given IP can submit per hour.
However I cannot predict the number of involved IPs, so this cannot solve everything.
I have read that it's not possible to limit the queue size with celery/rabbitmq. I was thinking of retrieving the queue size before pushing a new item into it but I'm not sure if it's a good idea.
I'm not familiar with good practices in messaging/job scheduling. Is there a recommended way to handle this kind of problem?
RabbitMQ has flow control built into the QoS. If RabbitMQ cannot handle the publishing rate it will adjust the TCP window size to slow down the publishers. In the event of too many messages being sent to the server it will also overflow to disk. This will allow your consumer to be a bit more naive although if you restart the connection on error and flood the connection you can cause problems.
I've always decided to spend more time making sure the publishers/consumers could work with multiple queue servers instead of trying to make them more intelligent about a single queue server. The benefit is that if you are really overloading a single server you can just add another one (or another pair if using RabbitMQ HA). There is a useful video from PyCon about Messaging at Scale using Celery and RabbitMQ that should be of use.
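If you still want the two safeguards from the question (a per-IP hourly quota plus a queue-size check before enqueueing), the web-side check could be sketched as below. The class is hypothetical, how you obtain queue_depth from your broker is left to you, and the limits are placeholders.

```python
import time
from collections import defaultdict, deque


class JobAdmission:
    """Decides whether a new job may be queued."""

    def __init__(self, per_ip_per_hour, max_queue_depth, clock=time.monotonic):
        self.per_ip_per_hour = per_ip_per_hour
        self.max_queue_depth = max_queue_depth
        self.clock = clock
        self.history = defaultdict(deque)  # ip -> timestamps of accepted jobs

    def allow(self, ip, queue_depth):
        """Return True and record the job if both limits permit it."""
        now = self.clock()
        recent = self.history[ip]
        # Drop accepted jobs that are older than one hour.
        while recent and now - recent[0] >= 3600:
            recent.popleft()
        if queue_depth >= self.max_queue_depth:
            return False
        if len(recent) >= self.per_ip_per_hour:
            return False
        recent.append(now)
        return True


fake_now = [0.0]  # injectable clock, so the behaviour can be checked without waiting
admission = JobAdmission(per_ip_per_hour=2, max_queue_depth=100,
                         clock=lambda: fake_now[0])
results = [admission.allow("10.0.0.1", queue_depth=0) for _ in range(3)]
```

Keep in mind the queue depth you read can be stale by the time you publish, so treat it as a soft limit rather than a guarantee.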