I have two servers running Celery and one Redis database. Both listen to the same queue, since they are meant to divide the workload. Tasks are queued onto Redis, but it looks like both Celery servers pick up the same task at the same time and so execute it twice (once on each server). Is there a way to prevent this with the Redis/Celery setup?
Thank you,
Each of my servers was using the same name for its Celery workers. Since then I've added %h to the end of the worker name (-n my_worker_%h) so that it includes the hostname. This way Celery Flower displays each worker on its own line, and there is no more room for confusion.
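For anyone launching workers from a script, here is a minimal sketch of building that command line in Python. The app name `tasks` and the helper `worker_command` are assumptions for illustration; only the `-n my_worker_%h` part comes from the setup described above.

```python
import shlex


def worker_command(app="tasks", name_template="my_worker_%h"):
    """Build the argv for a Celery worker whose name includes the hostname.

    `app` is a hypothetical module name; `%h` is left for Celery to expand.
    """
    return ["celery", "-A", app, "worker", "-n", name_template]


# Show the command as it would be typed in a shell.
print(shlex.join(worker_command()))
```

The same list could be passed to subprocess.run on each server, so every machine starts its worker under a distinct name.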
Related
With Celery, I sometimes see that the worker is shown as offline. I run Flower in one Docker container and the Celery worker in another one. I use a RabbitMQ broker.
I see that the worker flips between offline and online quite often.
What does it mean that a worker is offline? How does Flower figure that out?
A worker is considered "offline" if it does not broadcast a heartbeat signal for some (short) period of time.
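To make the idea concrete, here is a minimal sketch of heartbeat-based liveness tracking. It is not Flower's actual code; the class `HeartbeatMonitor` and the 10-second timeout are assumptions for illustration.

```python
import time


class HeartbeatMonitor:
    """Hypothetical monitor: a worker is online if it was heard from recently."""

    def __init__(self, timeout=10.0, clock=time.monotonic):
        self.timeout = timeout  # seconds of silence tolerated (assumed value)
        self.clock = clock      # injectable clock, handy for testing
        self.last_seen = {}     # worker name -> time of last heartbeat

    def heartbeat(self, worker):
        """Record that a heartbeat from `worker` has just arrived."""
        self.last_seen[worker] = self.clock()

    def is_online(self, worker):
        """Return True if `worker` sent a heartbeat within the timeout."""
        seen = self.last_seen.get(worker)
        return seen is not None and self.clock() - seen <= self.timeout
```

Under a scheme like this, a worker whose heartbeats are delayed (for example by a flaky network between containers) would flip to offline and come back online as soon as the next heartbeat arrives.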
I have set up a Celery task that is using RabbitMQ as the broker and Redis as the backend. After running I noticed that my Redis server was still using a lot of memory. Upon inspection I found that there were still keys for each task that was created.
Is there a way to get Celery to clean up these keys only after the response has been received? I know some message brokers use acks; is there an equivalent for a Redis backend in Celery?
Yes, use result_expires. Please note that celery beat should run as well, as written in the documentation:
A built-in periodic task will delete the results after this time (celery.backend_cleanup), assuming that celery beat is enabled. The task runs daily at 4am.
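As a rough sketch, the setting could live in a plain-Python config module. The file name celeryconfig.py and the one-hour value are assumptions for illustration; such a module would typically be loaded with app.config_from_object("celeryconfig").

```python
# celeryconfig.py (hypothetical file name)
from datetime import timedelta

# Keep task results for one hour (assumed value); Celery accepts either
# a number of seconds or a timedelta here.
result_expires = timedelta(hours=1)
```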
Unfortunately Celery doesn't have acks for its backend, so the best solution for my project was to call forget on my responses after I was done with them.
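A minimal sketch of that pattern, assuming `async_result` is the AsyncResult returned when the task was queued. The helper name `fetch_and_forget` and the 10-second timeout are made up for illustration.

```python
def fetch_and_forget(async_result, timeout=10):
    """Read a task's return value, then drop its entry from the result backend.

    If get() raises (timeout or task failure), forget() is not called, so
    the stored entry is left for result_expires to clean up.
    """
    value = async_result.get(timeout=timeout)
    async_result.forget()
    return value
```

Usage would look like fetch_and_forget(my_task.delay(1, 2)), where `my_task` stands for one of your own tasks.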
Is there a way for a Celery job to be retried if the server where the worker is running dies? I don't just mean the sub-process that executes the job, but the entire server becoming unavailable.
I tried with RabbitMQ and Redis as brokers. In both cases, if a job is currently being processed, it is entirely forgotten. When a worker restarts, it doesn't even try to reprocess the job, and looking at Rabbit or Redis, their queues are empty. The result backend is also empty.
It looks like the worker grabs the message and assumes it will put it back if the subprocess fails, but if the worker dies too, it can't put it back.
(yes, I work in an environment where this happens more than once a year, and I don't want to lose tasks)
In theory, setting task_acks_late=True should do the trick. (doc)
With a Redis broker, the task will be redelivered after visibility_timeout, which defaults to one hour. (doc)
With RabbitMQ, the task is redelivered as soon as RabbitMQ notices that the worker has died.
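Put together, the settings mentioned above might look like this in a config module. This is a sketch, not a tested production setup: the file name celeryconfig.py is an assumption, and 3600 seconds simply spells out the one-hour default so it is easy to change.

```python
# celeryconfig.py (hypothetical file name)

# Acknowledge a message only after the task has finished, so a task that
# was running when the worker's server died stays on the broker.
task_acks_late = True

# Redis broker only: seconds to wait before an unacknowledged task is
# handed to another worker. 3600 matches the one-hour default.
broker_transport_options = {"visibility_timeout": 3600}
```

Since a task may then run more than once, it should be safe to repeat (idempotent).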
I'm running Celery on my laptop, with RabbitMQ as the broker and Redis as the backend. I used all the default settings and ran celery -A tasks worker --loglevel=info, and it all worked. The workers get jobs done and I can fetch the execution results by calling result.get(). My question is why it works even though I didn't start the RabbitMQ and Redis servers at all. I did not set up accounts on the servers either. In many tutorials, the first step is to run the broker and backend servers before starting Celery.
I'm new to these tools and do not quite understand how they work behind the scenes. Any input would be greatly appreciated. Thanks in advance.
Never mind. I just realized that Redis and RabbitMQ start automatically after installation or at shell startup. They must be running for Celery to work.
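A quick way to confirm this is to check whether anything is listening on the usual ports. The sketch below assumes both services run on localhost with their default ports (5672 for RabbitMQ, 6379 for Redis); the helper names are made up for illustration.

```python
import socket


def is_listening(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_services(host="127.0.0.1"):
    """Report whether the assumed default broker and backend ports are open."""
    ports = {"RabbitMQ": 5672, "Redis": 6379}
    return {name: is_listening(host, port) for name, port in ports.items()}
```

Calling check_services() returns a dict such as {"RabbitMQ": ..., "Redis": ...} with a boolean per service. An open port only shows that something is listening there, not that it is the expected server.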
I'm trying to figure out how HA (high availability queues) works.
The current configuration I have is: every machine has multiple celery workers and points to itself as broker. Each machine can do this rather than point at one broker machine because of HA; in this way, there is less load on any one machine, as all are brokers and have copies of the same queue.
My question is, is my above logic correct? Or do all workers need to point to one broker machine regardless of HA?
If you have looked at HA and clustering and have ensured that the queues mirror each other, then what you are doing should be fine. But it may be a tad inefficient to run a broker on every server where you run your workers.
The other option is to run your queues on a few servers for HA and have other servers running the workers point to them. But since a Celery worker talks to a single broker at a time, you would need to give the workers one stable address, possibly by using a load balancer that all workers point to. This is the best of what I've come to understand over the past few years on RabbitMQ HA for Celery.
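One more option that may avoid the load balancer: as far as I know, current Celery documentation describes broker_url as accepting more than one URL of the same transport, with the worker failing over between them. A sketch of such a config, where the file name, hostnames and credentials are all made up for illustration:

```python
# celeryconfig.py (hypothetical file name)

# Two RabbitMQ nodes of the same cluster; hostnames and credentials
# are placeholders, not real servers.
broker_url = [
    "amqp://guest:guest@rabbit1.example:5672//",
    "amqp://guest:guest@rabbit2.example:5672//",
]

# How the next broker is chosen when the current connection is lost.
broker_failover_strategy = "round-robin"
```

The worker still uses one broker connection at a time; the list only tells it where to reconnect if that broker goes away.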