I was using Celery as the task queue and RabbitMQ as the message broker. When I pushed my tasks to the queue using the delay function, I saw that 3 queues were created in RabbitMQ. I don't understand what these 2 extra queues are or why we need them. Also, how do I identify which queue my tasks are actually getting pushed into?
Started celery:
celery -A myproject worker -l info
[tasks]
. app1.tasks.add
[2022-06-10 06:16:14,132: INFO/MainProcess] Connected to amqp://himanshu:**@IPADDRESS/vhostcheck
[2022-06-10 06:16:14,142: INFO/MainProcess] mingle: searching for neighbors
[2022-06-10 06:16:15,165: INFO/MainProcess] mingle: all alone
[2022-06-10 06:16:15,182: WARNING/MainProcess] /etc/myprojectenv/lib/python3.8/site-packages/celery/fixups/django.py:203: UserWarning: Using settings.DEBUG leads to a memory
leak, never use this setting in production environments!
warnings.warn('''Using settings.DEBUG leads to a memory
[2022-06-10 06:16:15,182: INFO/MainProcess] celery@ubuntu-s-1vcpu-1gb-blr1-01 ready.
[2022-06-10 06:17:38,485: INFO/MainProcess] Task app1.tasks.add[be566921-b320-466c-b406-7a6ed7ab06e7] received
[2022-06-10 06:19:18,544: INFO/ForkPoolWorker-1] Task app1.tasks.add[be566921-b320-466c-b406-7a6ed7ab06e7] succeeded in 100.05838803993538s: 13
So whenever I run my Celery worker, I see these 3 queues being generated.
(screenshot of the RabbitMQ Management UI showing the queues)
What are those 3 queues and what is Celery using them for?
Also, since queues are basically persistent databases and therefore persist their messages, why do they get deleted when I stop my workers? I see there is only 1 queue left after I stop Celery.
The celery queue is the default task queue: tasks sent with delay() go there unless you route them to a different queue. Every Celery worker subscribed to this queue will be able to reserve and run tasks sent to it.
The .pidbox queue is created by every Celery worker to support execution of remote commands.
The celeryev queue is also created by every Celery worker and is used for monitoring. For example, every Celery worker broadcasts a heartbeat message every few seconds; these messages go to the celeryev queue.
The Celery documentation does not give any details about these queues, so people had to look for answers in the Celery/Kombu source code. Here is one example: https://github.com/celery/celery/issues/6371#issuecomment-716839203
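To answer the "which queue do my tasks actually go to" part of the question: with no routing configured, delay() publishes to the default queue, which is named celery (the task_default_queue setting). Here is a minimal sketch of making the destination explicit; the broker URL, project name and the queue name my_queue are illustrative placeholders, not something from the question:

# celery_app.py -- illustrative sketch, names and broker URL are placeholders
from celery import Celery

app = Celery('myproject', broker='amqp://user:password@localhost/vhostcheck')

# Route this task to an explicitly named queue instead of the default "celery" queue.
app.conf.task_routes = {'app1.tasks.add': {'queue': 'my_queue'}}

# Alternatively, pick the queue per call:
# add.apply_async((2, 3), queue='my_queue')

Then start the worker consuming only that queue with celery -A myproject worker -Q my_queue -l info, and watch that queue's message counts in the RabbitMQ management UI to confirm where your tasks land.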
Related
With Celery, I sometimes see that the worker is offline. I run Flower in one Docker container and the Celery worker in another one. I use a RabbitMQ broker.
I see that the worker jumps between offline <-> online quite often.
What does it mean that a worker is offline? How does Flower figure that out?
A worker is considered "offline" if it has not broadcast a heartbeat signal for some (short) period of time.
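If you want to check reachability yourself, outside of Flower, a quick sketch is to broadcast a ping from your own code. The import path of the Celery app instance below is an assumption about your project layout:

# check_workers.py -- liveness check sketch; the app import path is hypothetical
from myproject.celery import app

# Broadcast a ping and wait up to 2 seconds for replies.
# Each reachable worker answers with something like {'celery@hostname': {'ok': 'pong'}}.
replies = app.control.ping(timeout=2.0)

if not replies:
    print("no workers responded; they are offline or unreachable")
else:
    for reply in replies:
        print(reply)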
I was going through the Celery code. With acks_late, the acknowledgement is sent once the task function has run (via task_trace).
However, with Redis, once a task is received (i.e. popped from the Redis queue), RedisWorkerController creates a task request for it. How is it enqueued again in the event that the worker node dies?
The messages aren't re-enqueued when they are not acknowledged (that would be impossible if the worker dies), but they do still exist in Redis as unacknowledged.
According to the Celery docs, the Redis broker has a visibility timeout mechanism.
So we should be able to expect the message to be delivered again to a worker if it was not acknowledged within the visibility timeout, and that is what happens. If the power goes out during the processing of an acks_late task, the task is received again by an online worker after the visibility timeout has passed.
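As a rough illustration of that mechanism, this is what the relevant configuration looks like with a Redis broker; the broker URL, task and timeout value are placeholders, not taken from the question:

# sketch -- Redis broker with an explicit visibility timeout and an acks_late task
from celery import Celery

app = Celery('myproject', broker='redis://localhost:6379/0')

# How long Redis waits before handing an unacknowledged message to another worker
# (defaults to 3600 seconds, i.e. one hour).
app.conf.broker_transport_options = {'visibility_timeout': 3600}

@app.task(acks_late=True)
def add(x, y):
    # The message is acknowledged only after this function returns, so a crash
    # mid-execution leaves it unacknowledged in Redis until the timeout expires.
    return x + y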
Is there a way for a Celery job to be retried if the server where the worker is running dies? I don't just mean the sub-process that executes the job, I mean the entire server becoming unavailable.
I tried with RabbitMQ and Redis as brokers. In both cases, if a job is currently being processed, it is entirely forgotten. When a worker restarts, it doesn't even try to reprocess the job, and looking at Rabbit or Redis, their queues are empty. The result backend is also empty.
It looks like the worker grabs the message and assumes it will put it back if the subprocess fails, but if the worker dies as well, it can't put it back.
(yes, I work in an environment where this happens more than once a year, and I don't want to lose tasks)
In theory, setting task_acks_late=True should do the trick. (doc)
With a Redis broker, the task will be redelivered after visibility_timeout, which defaults to one hour. (doc)
With RabbitMQ, the task is redelivered as soon as Rabbit notices that the worker died.
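For reference, a minimal settings sketch of this; the broker URL is a placeholder, and task_reject_on_worker_lost covers the narrower case where only the child process executing the task dies, not the whole machine:

# sketch -- late acknowledgement so unfinished tasks stay on the broker
from celery import Celery

app = Celery('proj', broker='amqp://guest@localhost//')

# Acknowledge the message after the task finishes rather than when it is received.
app.conf.task_acks_late = True

# Separate setting for the narrower failure mode where only the child process
# executing the task dies: re-queue the message instead of acknowledging it.
app.conf.task_reject_on_worker_lost = True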
I started my Celery worker for my_queue (in the background):
celery worker -Q my_queue -l info
After this, its broker (Redis) was stopped, and meanwhile the background Celery worker keeps trying to reconnect to Redis at growing intervals.
Now my goal is to restart a non-duplicate my_queue worker after restarting Redis. I realize that the following Celery API will not return my_queue until the reconnection is made:
celery.task.control.inspect().active_queues()
Now if I start a new my_queue worker, I will end up with a duplicate my_queue worker if the previous background Celery worker reconnects afterwards.
A solution might be to let the Celery worker actively quit if its broker is found to be stopped, but I can't find the right way to do this. I also don't want to kill it by a previously saved PID. Any suggestions or alternatives will be appreciated.
Well, I know it contradicts my requirement, but it seems that I do need help from a PID file:
celery worker -Q my_queue -l info --pidfile=pid.log
which will raise an exception if the PID saved in pid.log is already running.
This is still not an ideal solution, and any suggestion on how to make the Celery worker actively quit when its broker is found to be stopped would still be appreciated.
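One direction to explore, which I have not verified end to end: the broker_connection_retry and broker_connection_max_retries settings control how long the worker keeps retrying the broker connection, so capping the retries should let the connection error propagate instead of the worker reconnecting in the background forever. The exact shutdown behaviour depends on the Celery version, so treat this purely as a sketch to test:

# sketch -- cap broker reconnection attempts; verify behaviour on your Celery version
from celery import Celery

app = Celery('proj', broker='redis://localhost:6379/0')

# Keep retrying the broker connection, but only a few times instead of the
# default 100 attempts; once the retries are exhausted, the connection error propagates.
app.conf.broker_connection_retry = True
app.conf.broker_connection_max_retries = 3  # 0 or None would mean retry forever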
I am running Celery and Flower, with RabbitMQ as a message broker. When I have no running workers and start a task, it sits on the queue until a worker starts. Then, when I start my workers, the task is consumed and executed as expected. However, when I try to use the Flower API to get task info, args and kwargs are null. This never happens when my workers are already running when I call a task. Why is this, and how can I fix it? Thanks.
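For reference, this is roughly how I query Flower; the host, port and task id are placeholders:

# sketch -- query Flower's task info endpoint; URL and task id are placeholders
import requests

task_id = "replace-with-a-real-task-id"
resp = requests.get(f"http://localhost:5555/api/task/info/{task_id}")
print(resp.json())  # args and kwargs come back as null in the problematic case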