I am using ActiveMQ 5.5.0 in my project.
For some reason the in-flight count on my queue keeps increasing.
The rate of increase is not high.
After some days it becomes equal to the prefetch size, and then my queue stops responding.
Can anyone help me bring the in-flight count back to 0 without deleting the queue?
You can make the in-flight count go to 0 by purging the queue from the web console (there's a Purge link on each queue on the Queues page); that will delete all messages in the queue without deleting the queue itself.
That will satisfy the question as you've asked it, though I suspect it's not really what you want and what you meant was "without deleting any messages from the queue". But if I'm wrong and this is what you wanted, then great; if not, answer my questions in the comment I added and we'll go from there.
Related
To keep it short, here is a simplified situation:
I need to implement a queue for background processing of imported data files. I want to dedicate a number of consumers to this specific task (let's say 10) so that multiple users can be processed in parallel. At the same time, to avoid problems with concurrent data writes, I need to make sure that no single user is processed by multiple consumers at the same time; basically, all files of a single user should be processed sequentially.
Current solution (but it does not feel right):
Have 1 queue where all import tasks are published (file_queue_main)
Have 10 queues for file processing (file_processing_n)
Have 1 result queue (file_results_queue)
Have a manager process (in this case in node.js) which consumes messages from file_queue_main one by one and decides which file_processing queue to send each message to. It basically keeps track of which file_processing queue each user is currently being processed in.
Here is a little animation of my current solution and expected behaviour:
Is RabbitMQ even the tool for the job? For some reason, it feels like some sort of an anti-pattern. Appreciate any help!
The part about this that doesn't "feel right" to me is the manager process. It has to know the current state of each consumer, and it also has to stop and wait if all processors are working on other users. Ideally, you'd prefer to keep each process ignorant of the others. You're also getting very little benefit out of your processing queues, which are only used when a processor is already working on a message from the same user.
Ultimately, the best solution here is going to depend on exactly what your expected usage is and how likely it is that the next message is from a user that is already being processed. If you're expecting most of your messages coming in at any one time to be from 10 users or fewer, what you have might be fine. If you're expecting to be processing messages from many different users with only the occasional duplicate, your processing queues are going to be empty much of the time and you've created a lot of unnecessary complexity.
Other things you could do here:
Have all consumers pull from the same queue and use some sort of distributed locking to prevent collisions. If a consumer gets a message from a user that's already being worked on, requeue it and move on.
Set up your queue routing so that messages from the same user will always go to the same consumer. The downside is that if you don't spread the traffic out evenly, you could have some consumers backed up while others sit idle.
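As a rough illustration of the first option, here is a minimal sketch of "lock per user, requeue on collision". Everything in it is an assumption made for the example: `UserLocks` is an in-process stand-in for whatever distributed lock you would actually use, the `deque` stands in for the shared queue, and the `user_id` message field is hypothetical.

```python
import threading
from collections import deque


class UserLocks:
    """In-process stand-in for a distributed lock service (hypothetical)."""

    def __init__(self):
        self._guard = threading.Lock()
        self._held = set()

    def try_acquire(self, user_id):
        # Non-blocking: report failure instead of waiting for the lock.
        with self._guard:
            if user_id in self._held:
                return False
            self._held.add(user_id)
            return True

    def release(self, user_id):
        with self._guard:
            self._held.discard(user_id)


def consume_one(queue, locks, process):
    """Take one message; process it, or requeue it if its user is busy.

    Returns True if the message was processed, False if it was requeued.
    """
    message = queue.popleft()
    user_id = message["user_id"]
    if not locks.try_acquire(user_id):
        queue.append(message)  # requeue at the tail and move on
        return False
    try:
        process(message)
    finally:
        locks.release(user_id)
    return True
```

Note that requeuing at the tail can change the relative order of one user's messages, so this only fits if strict per-user ordering is not required or is restored some other way.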
Also, if you're getting a lot of messages in from the same user at once that must be processed sequentially, I would question if they should be separate messages at all. Why not send a single message with a list of things to be processed? Much of the benefit of event queues comes from being able to treat each event as a discrete item that can be processed individually.
If the user has a unique ID, or the file being worked on has a unique ID then hash the ID to get the processing queue to enter. That way you will always have the same user / file task queued on the same processing queue.
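A minimal sketch of that idea, assuming the ten processing queues are named file_processing_0 through file_processing_9 (the numbering and the function name are my own, not from the question). It uses a SHA-256 digest rather than Python's built-in `hash()`, because the built-in hash of a string can differ between processes.

```python
import hashlib


def processing_queue_for(user_id, queue_count=10):
    """Map a user ID to one of the processing queues, always the same one."""
    digest = hashlib.sha256(str(user_id).encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an integer, then pick a bucket.
    index = int.from_bytes(digest[:8], "big") % queue_count
    return f"file_processing_{index}"
```

The publisher would call `processing_queue_for(user_id)` and publish the task to the returned queue name, so no manager process has to track which user is where.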
I am not sure how this will affect queue length for the processing queues.
In ActiveMQ 5, each queue had a folder containing its data, its messages, everything.
This meant that in case of an issue, for example an out-of-disk-space error, some files would get corrupted before the server crashed. In that case, in ActiveMQ 5, we would find logs indicating the corrupted files, and we could delete the folder of the corrupted queue, resulting in a small loss of messages instead of ALL messages.
In Artemis, it seems that messages are stored in the same files, independently of the queue they belong to. This means that if I get an out-of-disk-space error, I might have to delete all my messages.
First, can you confirm the change of behaviour? Secondly, is there a way to recover? And as a bonus, if anyone knows why this change happened, I would like to understand.
Artemis uses a completely new message journal implementation as compared to 5.x. The same journal is used for all messages. However, it isn't subject to the same corruption problems as you've seen with 5.x. If records from the journal can't be processed then they are simply skipped.
If you get an out of disk space error you should never need to delete all your messages. The journal files themselves are allocated and filled with zeroes to meet their configured size before they are actually used so if you were going to run out of disk space you'd do so during that process before any messages were written to them.
The Artemis journal implementation was written from the ground up for high performance specifically in conjunction with the broker's non-blocking architecture.
I am new to ActiveMQ, and sometimes my queues are not being processed and keep piling up. Is it good practice to purge? Is there any other solution apart from purging, one that lets me keep all my messages for reprocessing? I really don't want to lose the queues. Is this possible?
The correct way to deal with this is to set an expiration on messages so that after a given time the broker can discard them. Letting messages just pile up in queues without regard to their lifetime will lead you into all sorts of problems, most notably with storage.
You need to develop a strategy for how long the messages should live so that the broker can start getting rid of them once they are no longer of use. If you don't do that, then purging the queue is your only option.
If I declare a queue with x-max-length, all messages will be dropped or dead-lettered once the limit is reached.
I'm wondering if, instead of dropping or dead-lettering, RabbitMQ could activate the Flow Control mechanism, as it does for the Memory/Disk watermarks. The reason is that I want to preserve the message order (when submitting; FIFO behaviour), and slowing down the producers would be much more convenient.
Try implementing the queue length limit at the application level. For example, increment and decrement a Redis key and check it against the maximum value. It might not be as accurate as the native RabbitMQ mechanism, but it works pretty well for a single queue/exchange without affecting the others on the same broker.
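Here is one way the counter idea could look, as a sketch only. `QueueLengthGate` and `InMemoryCounter` are names invented for this example; the in-memory counter is a stand-in so the snippet runs on its own. The assumption is that in a real setup you would pass a Redis client object with `incr(key)` and `decr(key)` methods instead.

```python
import threading


class InMemoryCounter:
    """Stand-in for a Redis key; incr/decr mimic Redis INCR/DECR."""

    def __init__(self):
        self._lock = threading.Lock()
        self._values = {}

    def incr(self, key):
        with self._lock:
            self._values[key] = self._values.get(key, 0) + 1
            return self._values[key]

    def decr(self, key):
        with self._lock:
            self._values[key] = self._values.get(key, 0) - 1
            return self._values[key]


class QueueLengthGate:
    """Application-level length limit for a single queue (hypothetical)."""

    def __init__(self, counter, key, max_length):
        self.counter = counter
        self.key = key
        self.max_length = max_length

    def try_reserve(self):
        # Producer side: call before publishing; publish only on True.
        if self.counter.incr(self.key) > self.max_length:
            self.counter.decr(self.key)  # undo the reservation
            return False
        return True

    def release(self):
        # Consumer side: call after a message has been acknowledged.
        self.counter.decr(self.key)
```

A producer that gets `False` from `try_reserve()` can sleep and retry, which slows it down without dropping or reordering messages. The count can drift if a process crashes between reserving and publishing, which is part of why this is less accurate than a native limit.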
P.S. Alternatively, for some tasks RabbitMQ is not the best choice and an old-school relational database (MySQL, PostgreSQL or whatever you like) works best, while RabbitMQ can still be used as an event bus.
There are two open issues related to this topic in the rabbitmq-server GitHub repo. I recommend expressing your interest there:
Block publishers when queue length limit is reached
Nack messages that cannot be deposited to all queues due to max length reached
I am using Celery on RabbitMQ. I have been sending thousands of messages to the queue and they are being processed successfully and everything is working just fine. However, the number of messages in several RabbitMQ queues is growing quite large (hundreds of thousands of items in the queue). The queues are named celeryev.[...] (see screenshot below). Is this appropriate behavior? What is the purpose of these queues, and shouldn't they be regularly purged? Is there a way to purge them more regularly? I think they are taking up quite a bit of disk space.
You can use the CELERY_EVENT_QUEUE_TTL Celery option (it only works with amqp), which sets the message expiry time, after which the message will be deleted from the queue.
For anyone else who is running into problems with a celeryev queue becoming very large and threatening the disk space on your rabbitmq server, beware the accepted answer! Here's my suggestion. Just issue this command on your rabbitmq instance:
rabbitmqctl set_policy limit_celeryev_queues "^celeryev\." '{"max-length":1000000}' --apply-to queues
This will limit any queue beginning with "celeryev" to 1 Million entries. I did some experimenting with a stuck flower instance causing a runaway celeryev queue, and setting CELERY_EVENT_QUEUE_TTL / CELERY_EVENT_QUEUE_EXPIRES did not help control the queue size.
In my testing, I started a flower process, then SIGSTOP'ed it, and watched its celeryev queue start running away. Neither of these two settings helped at all. I confirmed SIGCONT'ing the flower process would bring the queue back to 0 rapidly. I am not certain why these two knobs didn't help, but it may have something to do with how RabbitMQ implements these two settings.
First, the Per-Message TTL corresponding to CELERY_EVENT_QUEUE_TTL only establishes an expiration time on each queue entry -- AIUI it will not automatically delete the message out of the queue to save space upon expiration. Second, the Queue TTL corresponding to CELERY_EVENT_QUEUE_EXPIRES says that it "... guarantees that the queue will be deleted, if unused for at least the expiration period". However, I believe that their definition of "unused" may be too strict to kick in for e.g. an overburdened, stuck, or killed flower process.
EDIT: Unfortunately, one problem with this suggestion is that the set_policy ... apply-to queues will only impact existing queues, and flower can and will create new queues which may overflow.
Celery uses celeryev-prefixed queues (and an exchange) for monitoring. You can configure them as you want or disable them entirely (celery control disable_events).
You just have to set an option in your Celery configuration.
If you want to prevent Celery from creating celeryev.* queues:
CELERY_SEND_EVENTS = False # Will not create celeryev.* queues
If you need these queues for monitoring purposes (Celery Flower, for instance), you can have them cleaned up regularly:
CELERY_EVENT_QUEUE_EXPIRES = 60 # Will delete all celeryev. queues without consumers after 1 minute.
The solution came from here: https://www.cloudamqp.com/docs/celery.html
You can limit the queue size in RabbitMQ with the x-max-length queue declaration argument:
http://www.rabbitmq.com/maxlength.html
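For example, a small sketch of declaring such a queue from Python. The helper names are made up for illustration, and `channel` is assumed to be an already opened pika channel (or anything else with a compatible `queue_declare` method); opening the connection is left out.

```python
def max_length_arguments(limit):
    """Build the queue arguments that cap a queue at `limit` messages."""
    if limit <= 0:
        raise ValueError("limit must be a positive integer")
    return {"x-max-length": limit}


def declare_bounded_queue(channel, name, limit):
    """Declare a durable queue holding at most `limit` messages."""
    # Assumption: `channel` behaves like a pika BlockingChannel.
    return channel.queue_declare(
        queue=name,
        durable=True,
        arguments=max_length_arguments(limit),
    )
```

Keep in mind that RabbitMQ refuses to redeclare an existing queue with different arguments, so for queues that already exist a policy like the set_policy command above is the easier route.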