I'm running tornado-celery with RabbitMQ as the broker, and I'm running into the same issue already posted in SO question 14636534, but the accepted answer doesn't fit my needs.
What I want is to find a way to delete the result queue after the task is done (I need the results).
I have auto_delete enabled and x-expires set to a reasonably short time, but the queue is never deleted until the subscribed client is removed (in practice, when the application shuts down).
I think I'm close to the solution, and what I'm looking for is a way to tell Celery to "unsubscribe" from that queue after receiving the result. Am I on the right track? If so, how can I do it?
Related
We have been trying to use RabbitMQ to transfer data from Project A to Project B.
We created a producer that takes the data from Project A and puts it in a queue, and that was relatively easy. Then we created a k8s pod for Project B, which listens to the appropriate queue using kombu's ConsumerMixin.
Overall, the integration was reasonable and straightforward. But when we started to process long messages, we noticed that they were coming back into the queue repeatedly.
After some research, we found out that whenever processing a message takes more than 20 seconds, the message shows up in the queue again, even though the processing was successful.
The source of this issue lies with the RabbitMQ heartbeat. We set the heartbeat to 10 seconds, and RabbitMQ checks the connection twice before it kills it. However, because the callback takes more than 20 seconds, and the .ack() (acknowledge) of the message happens at the end of the callback (to ensure processing succeeded), the heartbeat is blocked while the message is being processed (as described here: https://github.com/celery/kombu/issues/621#issuecomment-251836611).
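Taking the numbers in this post at face value (a 10-second heartbeat and two missed checks before the connection is dropped), the 20-second limit falls out of simple arithmetic. The helper names below are made up for illustration and are not part of kombu or RabbitMQ.

```python
def blocking_budget(heartbeat_seconds, missed_checks_allowed=2):
    """Longest time a callback may block before the connection is considered dead."""
    return heartbeat_seconds * missed_checks_allowed


def will_be_redelivered(processing_seconds, heartbeat_seconds):
    """True when a blocking callback outlives the heartbeat budget.

    In that case the connection is closed before .ack() runs, so the broker
    puts the still-unacknowledged message back in the queue.
    """
    return processing_seconds > blocking_budget(heartbeat_seconds)


print(blocking_budget(10))          # 20
print(will_be_redelivered(15, 10))  # False
print(will_be_redelivered(45, 10))  # True
```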
We have been trying to find a workaround with Threading, to process the message on a different thread and avoid the block of the heartbeat, but it didn't work. Also, it feels like we were trying to hack things and not solve the problem.
So my question here is if there is a proper workaround to handle this situation, or what alternatives do we have? RabbitMQ seemed like the right choice since we use it in standalone projects with Celery, and it is also recommended on the internet.
RabbitMQ introduced streams last year. They claim streams work with AMQP 0.9 and 1.0 as well, as mentioned here. That is, theoretically we should be able to create a queue backed by a stream, connect as many workers as we need to fan out to the queue, and each worker should get every message delivered.
My question is, has anyone tried to use streams with celery yet? If so, please share any info on how to configure streams in Celery and your experience with them so far. There are unfortunately no blog posts nor any documentation I could find on this topic. I am hoping this post brings together all this information in one place.
The big advantage of streams is they allow large fan-out using the existing infra of RabbitMQ + Celery.
As far as I am aware, there is no way for Celery to utilise streams. However, you can probably spin up a long-running Celery task that processes a particular stream. This is probably the reason why nobody has attempted this (or, better said, written it up as a blog post or something similar). Why bother using Celery for something it is not made for?
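For anyone who wants to try the long-running consumer route, here is a rough sketch of reading a stream over plain AMQP with a pika-style channel, outside of Celery's own consumer. It assumes RabbitMQ 3.9+ and relies on the documented x-queue-type and x-stream-offset arguments; the function names and the queue name "events" are hypothetical, and I have not run this inside a Celery task.

```python
def stream_queue_arguments(max_length_bytes=None):
    """Arguments that make queue_declare create a stream instead of a classic queue."""
    arguments = {"x-queue-type": "stream"}
    if max_length_bytes is not None:
        # Optional retention cap; old data is dropped once the stream exceeds it.
        arguments["x-max-length-bytes"] = max_length_bytes
    return arguments


def consume_stream(channel, queue, handler, offset="first", prefetch=100):
    """Attach a consumer to a stream on a pika-style channel.

    Streams require a prefetch limit and manual acks. Consuming does not remove
    messages, so every consumer attached this way sees every message (fan-out).
    """
    channel.queue_declare(queue=queue, durable=True, arguments=stream_queue_arguments())
    channel.basic_qos(prefetch_count=prefetch)

    def on_message(ch, method, properties, body):
        handler(body)
        ch.basic_ack(delivery_tag=method.delivery_tag)

    channel.basic_consume(
        queue=queue,
        on_message_callback=on_message,
        auto_ack=False,
        arguments={"x-stream-offset": offset},  # e.g. "first", "last", "next"
    )
    return on_message
```

With pika this would be called as consume_stream(channel, "events", print), followed by channel.start_consuming().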
I'm trying to use RabbitMQ in a rather unconventional way (though at this point I can pick any other message queue implementation if needed).
I have one queue (I can have more if needed) from which customers fetch N messages asynchronously. After they do their work, I send the results from the client to the db.
I have two problems: first, I don't want them to work on the same message; second, I want to guarantee that I won't lose messages if a customer closes the browser or just stops working.
I looked at the documentation and saw TTL, which would be perfect for me if I could change it so that a message that times out is not deleted but moved to another queue. I can't find a way to change this.
Moreover, I looked at the confirmation option, which at first glance looked like what I wanted. The mechanism works like this: when the consumer gets a message, it sends a confirmation to the queue. I thought I could delay this confirmation and send it when the work is done on the client side.
My problem was that I can't program the queue so that any message that didn't get a confirmation is returned to the queue (or to another one).
I also found out how to schedule a message, but that didn't help either, because I don't want the message to be inserted into the queue in five minutes. I want a message that a customer receives to be locked in the queue for 5 minutes until a confirmation to delete it arrives, and otherwise returned to the queue.
Can I create a temporary queue that enables this mechanism?
If someone can help with one of the problems or suggest another architecture or option to do it in another MQ it would be great.
Resources:
confirmation:
http://www.rabbitmq.com/blog/2011/02/10/introducing-publisher-confirms/
post about locks but his problem was a batcher component:
Locks and batch fetch messages with RabbitMq
TTL:
https://www.rabbitmq.com/ttl.html
Schedule a message:
https://www.rabbitmq.com/blog/2015/04/16/scheduling-messages-with-rabbitmq/
my problem was that I can't program the queue that if any message
didnt get confirm then return it to the queue (or to another).
RabbitMQ does this anyway, so all you have to do is switch off the auto-ack flag. You already figured this out:
I thought I can delay this confirm and send it when the work is done
on the client side.
So just send the ACK once you've finished processing the message.
Unacknowledged messages remain in the queue and, once the consumer's channel or connection closes, are redelivered to the next consumer (or to the same one when it's up again, depending on your setup).
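A minimal sketch of that, assuming pika's BlockingChannel API (the question doesn't say which client is used); make_callback, start_consumer, process, and the queue name are placeholders.

```python
def make_callback(process):
    """Build a pika-style on_message_callback that acks only after process() succeeds."""

    def on_message(channel, method, properties, body):
        try:
            process(body)
        except Exception:
            # Work failed: hand the message back so another consumer can take it.
            channel.basic_nack(delivery_tag=method.delivery_tag, requeue=True)
        else:
            # Work finished: only now tell RabbitMQ it may drop the message.
            channel.basic_ack(delivery_tag=method.delivery_tag)

    return on_message


def start_consumer(channel, queue, process, prefetch=1):
    """Register the callback with auto-ack switched off."""
    # prefetch caps how many unacked messages this consumer may hold at once.
    channel.basic_qos(prefetch_count=prefetch)
    channel.basic_consume(
        queue=queue,
        on_message_callback=make_callback(process),
        auto_ack=False,
    )
```

Note that the requeue is triggered by the channel or connection going away (browser closed, process killed), so there is no per-message 5-minute lock out of the box.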
I was wondering if this is possible. I want to pull a task from a queue and do some work that could take anywhere from 3 seconds to (possibly) several minutes before an ack is sent back to RabbitMQ to signal that the work has been completed. The work is done by a user, which is why the time it takes to process the job varies.
I don't want to ack the message immediately after I pop off the queue because I want the message to be requeued if no ack is received. Can anyone give me any insights into how to solve my problem?
Having a long timeout should be fine, and certainly as you say you want redelivery if something goes wrong, so you want to only ack after you finish.
The best way to achieve that, IMO, would be to have multiple consumers on the queue (i.e. multiple threads/processes consuming from the same queue). That should be fine as long as there's no particular ordering constraint on your queue contents (i.e. the way there might be if the queue were to contain contents representing Postgres data that involves FK constraints).
This tutorial on the RabbitMQ website provides more info (Python linked, but there should be similar tutorials for other languages): https://www.rabbitmq.com/tutorials/tutorial-two-python.html
Edit in response to comment from OP:
What's your heartbeat set to? If your worker doesn't acknowledge the heartbeat within the set period of time, the server will consider the connection to be dead.
Not sure which language you're using, but for Java you would use the setRequestedHeartbeat method to specify the heartbeat.
However you implement your workers, it's vital that the heartbeat can still be sent back to the RabbitMQ server. If something blocks the client from sending the heartbeat, the server will kill the connection after the time interval expires.
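One way to keep heartbeats flowing in Python is to run the slow work on a separate thread and hand the ack back to the connection's own thread. The sketch below assumes pika's BlockingConnection, whose add_callback_threadsafe method exists for this purpose; do_work and the function names are placeholders, and other clients will differ.

```python
import functools
import threading


def ack_from_worker(connection, channel, delivery_tag):
    """Schedule the ack on the connection's thread; pika channels are not thread-safe."""
    connection.add_callback_threadsafe(
        functools.partial(channel.basic_ack, delivery_tag=delivery_tag)
    )


def make_threaded_callback(connection, do_work, threads):
    """Build an on_message_callback that returns at once so the I/O loop stays free."""

    def on_message(channel, method, properties, body):
        def run():
            do_work(body)  # may take minutes; the connection thread is not blocked
            ack_from_worker(connection, channel, method.delivery_tag)

        worker = threading.Thread(target=run)
        worker.start()
        threads.append(worker)  # kept so the caller can join them on shutdown

    return on_message
```

Register the result as the on_message_callback in basic_consume with auto_ack=False, and set basic_qos(prefetch_count=1) so the broker does not push more messages than there are workers.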
I have a somewhat unique use case with RabbitMQ and I'm not sure how to go about solving the problem. I want to have one queue with multiple consumers bound to it, and then have RabbitMQ send out one message to only one consumer at a time and wait for an ACK before sending out another message to any other consumer.
I realize this kills throughput and can essentially starve the other consumers, but for me that's OK. The reason for this odd use case is that the service the consumers talk to can only handle one request at a time, so I need a way to limit this. But consumers can also die unexpectedly, and I need another consumer to pick up processing the messages if this happens. I know there is the prefetch option, but that still allows multiple consumers to each get a message at the same time. There are also exclusive queues, but I'm not sure those accomplish what I want. Is it possible to configure RabbitMQ to do this?
No; there is no way to limit competing consumers on the same queue such that there is one and only one message in process across all consumers until the ack is received.
A similar question came up some time ago; I don't remember if it was here or in the Spring forums but I believe the solution was to have the consumers acquire a global lock of some kind, using something like hazelcast, or even a simple database table row lock (with prefetch=1 so each consumer had only one "in process" message which was processed as and when each one got the lock).
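A rough illustration of the idea, using a threading.Lock as a stand-in for the global lock (Hazelcast, a database row lock, or similar) and threads as stand-ins for consumers running with prefetch=1. Everything here is hypothetical scaffolding; a real setup would swap in a lock that works across processes and hosts.

```python
import threading
import time

global_lock = threading.Lock()   # stand-in for a distributed lock
state_guard = threading.Lock()   # protects the bookkeeping below
in_flight = 0
max_in_flight = 0


def call_backend(message):
    """Pretend to call the service that tolerates only one request at a time."""
    global in_flight, max_in_flight
    with state_guard:
        in_flight += 1
        max_in_flight = max(max_in_flight, in_flight)
    time.sleep(0.01)
    with state_guard:
        in_flight -= 1


def handle(message, ack):
    """What each consumer does with its single prefetched message."""
    with global_lock:        # wait until no other consumer is processing
        call_backend(message)
    ack(message)             # ack after the work, so a crash leads to redelivery


def run_demo(n_consumers=5):
    """Start n consumers at once; return the acked messages and peak concurrency."""
    acked = []
    consumers = [
        threading.Thread(target=handle, args=(f"msg-{i}", acked.append))
        for i in range(n_consumers)
    ]
    for c in consumers:
        c.start()
    for c in consumers:
        c.join()
    return sorted(acked), max_in_flight
```

If a consumer dies while holding its message, the unacked message is redelivered to another consumer, which then takes its turn at the lock; the lock itself needs a lease or timeout so a dead holder cannot block everyone.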