Consuming from rabbitmq using Celery - rabbitmq

I am using Celery with Rabbitmq broker on Server A. Some tasks require interaction with another server say, Server B and I am using Rabbitmq queues for this interaction.
Queue 1 - Server A (Producer), Server B (Consumer)
Queue 2 - Server B (Producer), Server A (Consumer)
My celery is unexpectedly hanging and I have found the reason to be incorrect implementation of Server A consumer code.
channel.start_consuming() keeps polling Rabbitmq as expected however putting this in a celery task creates multiple pollers which don't expire. I can add expiry but the time completion for the data being sent to Server B cannot be guaranteed. The code pasted below is one method I used to tackle the issue but I am not convinced this is best solution.
I wish to know what I am doing wrong and what is the right way to implement this because I have failed searching for articles on the web. Any tips, insights and even links to articles would be extremely helpful.
Finally, my code -
#celery.task
def task_a(data):
do_some_processing
# Create only 1 Rabbitmq consumer instance to avoid celery hangups
task_d.delay()
#celery.task
def task_b(data):
do_some_processing
if data is not None:
task_c.delay()
#celery.task
def task_c():
data = some_data
data = json.dumps(data)
conn_params = pika.ConnectionParameters(host=RABBITMQ_HOST)
connection = pika.BlockingConnection(conn_params)
channel = connection.channel()
channel.queue_declare(queue=QUEUE_1)
channel.basic_publish(exchange='',
routing_key=QUEUE_1,
body=data)
channel.close()
#celery.task
def task_d():
def queue_helper(ch, method, properties, body):
'''
Callback from queue.
'''
data = json.loads(body)
task_b.delay(data)
conn_params = pika.ConnectionParameters(host=RABBITMQ_HOST)
connection = pika.BlockingConnection(conn_params)
channel = connection.channel()
channel.queue_declare(queue=QUEUE_2)
channel.basic_consume(queue_helper,
queue=QUEUE_2,
no_ack=True)
channel.start_consuming()
channel.close()

Related

Task queues and result queues with Celery and Rabbitmq

I have implemented Celery with RabbitMQ as Broker. I rely on Celery v4.4.7 since I have read that v5.0+ doesn't support RabbitMQ anymore. RabbitMQ is a MUST in my case.
Everything has been containerized then deployed as pods within Kubernetes 1.19. I am able to execute long running tasks and everything apparently looks fine at first glance. However, I have few concerns which require your expertise.
I have declared inbound and outbound queues but Celery created his owns and I do not see any message within those queues (inbound or outbound) :
inbound_queue = "_IN"
outbound_queue = "_OUT"
app = Celery()
app.conf.update(
broker_url = 'pyamqp://%s//' % path,
broker_heartbeat = None,
broker_connection_timeout = int(timeout)
result_backend = 'rpc://',
result_persistent = True,
task_queues = (
Queue(algorithm_queue, Exchange(inbound_queue), routing_key='default', auto_delete=False),
Queue(result_queue, Exchange(outbound_queue), routing_key='default', auto_delete=False),
),
task_default_queue = inbound_queue,
task_default_exchange = inbound_exchange,
task_default_exchange_type = 'direct',
task_default_routing_key = 'default',
)
#app.task(bind=True,
name='osmq.tasks.add',
queue=inbound_queue,
reply_to = outbound_queue,
autoretry_for=(Exception,),
retry_kwargs={'max_retries': 5, 'countdown': 2})
def execute(self, data):
<method_implementation>
I have implemented callbacks to get results back via REST APIs. However, randomly, it can return or not some results when the status is successfull. This is probably related to message persistency. In details, when I implement flower API to get info, status is successfull and the result is partially displayed (shortened json messages) - when I call AsyncResult, for the same status, result is either None or the right one. I do not understand the mechanism between rabbitmq queues and kombu which seems to cache the resulting message. I must guarantee to retrieve results everytime the task has been successfully executed.
def callback(uuid):
task = app.AsyncResult(uuid)
Specifically, it was that Celery 5.0+ did not support amqp:// as a result back end anymore. However, as your example, rpc:// is supported.
The relevant snippet is here: https://docs.celeryproject.org/en/stable/getting-started/backends-and-brokers/index.html#rabbitmq
We tend to always ignore_results=True in our implementation, so I can't give any practical tips of how to use rpc://, other than to infer that any response is put on an application-specific queue, instead of being able to put on a specified queue (or even different broker / rabbitmq instance) via amqp://.

Celery fanout signals with `send_task`

I'm trying to establish communication between different processes running celery. I successfully sent tasks from one process to others using app.send_task on a celery instance. I am struggling now to broadcast tasks through a fanout rabbitmq exchange to all other instances (basically a publish-subscribe pattern for celery).
It must be related to the routing across exchanges and queues but I simply can't make it work.
This is the master emitter which broadcasts a task named signal through the default exchange of type fanout:
from celery import Celery
from kombu import Exchange, Queue
app = Celery('emitter',
broker='pyamqp://test#localhost//',
backend='db+sqlite:///results.db')
default_queue_name = 'default'
default_exchange_name = 'default'
default_routing_key = 'default'
default_exchange = Exchange(default_exchange_name, type='fanout')
default_queue = Queue(
default_queue_name,
default_exchange,
routing_key=default_routing_key)
app.conf.task_queues = (
default_queue,
)
app.conf.task_default_queue = default_queue_name
app.conf.task_default_exchange = default_exchange_name
app.conf.task_default_routing_key = default_routing_key
if __name__ == '__main__':
app.send_task(name='signal', exchange='default')
To my understanding the routing, queue and exchange setup on the other app needs to be identical. Thus, this is a very similarly looking piece of code but defining a task that gets called:
from celery import Celery
from kombu import Exchange, Queue
app = Celery('CLIENT_A',
broker='pyamqp://test#localhost//',
backend='db+sqlite:///results.db')
default_queue_name = 'default'
default_exchange_name = 'default'
default_routing_key = 'default'
default_exchange = Exchange(default_exchange_name, type='fanout')
default_queue = Queue(
default_queue_name,
default_exchange,
routing_key=default_routing_key)
app.conf.task_queues = (
default_queue,
)
app.conf.task_default_queue = default_queue_name
app.conf.task_default_exchange = default_exchange_name
app.conf.task_default_routing_key = default_routing_key
#app.task(name='signal')
def signal():
print('client_a signal')
return 'signal'
The second client will look exactly the same as the first except for the name and the print message:
# [...]
app = Celery('CLIENT_B', ...
# [...] identical to the part above
#app.task(name='signal')
def signal():
print('client_b signal')
I'm starting both client workers with different node names (otherwise celery will complain):
celery -A client_a worker -n node_a
celery -A client_b worker -n node_b
If I then call the emitter (first piece of code) I see the signals being triggered alternately by client_a and client_b but never both as I would like them to be triggered.
The rabbitmq management platform looks as expected with the default exchange defined as fanout and the routing looks alright.
I'm not sure if I'm on the completely wrong track here but that's what I imagined should be possible with correct routing.

kombu not reconnecting to RabbitMQ

I have two servers, call them A and B. B runs RabbitMQ, while A connects to RabbitMQ via Kombu. If I restart RabbitMQ on B, the kombu connection breaks, and the messages are no longer delivered. I then have to reset the process on A to re-establish the connection. Is there a better approach, i.e. is there a way for Kombu to re-connect automatically, even if the RabbitMQ process is restarted?
My basic code implementation is below, thanks in advance! :)
def start_consumer(routing_key, incoming_exchange_name, outgoing_exchange_name):
global rabbitmq_producer
incoming_exchange = kombu.Exchange(name=incoming_exchange_name, type='direct')
incoming_queue = kombu.Queue(name=routing_key+'_'+incoming_exchange_name, exchange=incoming_exchange, routing_key=routing_key)#, auto_delete=True)
outgoing_exchange = kombu.Exchange(name=outgoing_exchange_name, type='direct')
rabbitmq_producer = kombu.Producer(settings.rabbitmq_connection0, exchange=outgoing_exchange, serializer='json', compression=None, auto_declare=True)
settings.rabbitmq_connection0.connect()
if settings.rabbitmq_connection0.connected:
callbacks=[]
queues=[]
callbacks.append(callback)
# if push_queue:
# callbacks.append(push_message_callback)
queues.append(incoming_queue)
print 'opening a new *incoming* rabbitmq connection to the %s exchange for the %s queue' % (incoming_exchange.name, incoming_queue.name)
incoming_exchange(settings.rabbitmq_connection0).declare()
incoming_queue(settings.rabbitmq_connection0).declare()
print 'opening a new *outgoing* rabbitmq connection to the %s exchange' % outgoing_exchange.name
outgoing_exchange(settings.rabbitmq_connection0).declare()
with settings.rabbitmq_connection0.Consumer(queues=queues, callbacks=callbacks) as consumer:
while True:
settings.rabbitmq_connection0.drain_events()
On the consumer side, kombu.mixins.ConsumerMixin handles reconnecting when the connection goes away (and also does heartbeats, etc., and lets you write less code). There doesn't seem to be a ProducerMixin, unfortunately but you could potentially dig into the code and adapt it...?

RabbitMQ routing key does not route

i'm trying to do a simple message queue with RabitMQ i push a message with create_message
and then trying to get the message by the routing key.
it works great when the routing key is the same. the problem is when the routing key is different i keep on getting the message with the wrong routing key:
for example
def callback(ch, method, properties, body):
print("%r:%r" % (method.routing_key, body))
def create_message(self):
connection = pika.BlockingConnection(pika.ConnectionParameters(
'localhost'))
channel = connection.channel()
channel.exchange_declare(exchange='www')
channel.queue_declare(queue='hello')
channel.basic_publish(exchange='www',
routing_key="11",
body='Hello World1111!')
connection.close()
self.get_analysis_task_celery()
def get_message(self):
connection = pika.BlockingConnection(pika.ConnectionParameters(
'localhost'))
channel = connection.channel()
channel.exchange_declare(exchange='www')
timeout = 1
connection.add_timeout(timeout, on_timeout)
channel.queue_bind(exchange="www", queue="hello", routing_key="10")
channel.basic_consume(callback,
queue='hello',
no_ack=True,
consumer_tag= "11")
channel.start_consuming()
example for my output: '11':'Hello World1111!'
what am i doing wrong?
tnx for the help
this is a total guess, since i can't see your rabbitmq server..
if you open the RabbitMQ management website and look at your exchange, you will probably see that the exchange is bound to the queue for routing key 10 and 11, both of which are bound to the same queue.
since both go to the same queue, your message will always be delivered to that queue, the consumer will always pick up the message
again, i'm guessing since i can't see your server. but check the server to make sure you don't have leftover / extra bindings

Pika + RabbitMQ: setting basic_qos to prefetch=1 still appears to consume all messages in the queue

I've got a python worker client that spins up a 10 workers which each hook onto a RabbitMQ queue. A bit like this:
#!/usr/bin/python
worker_count=10
def mqworker(queue, configurer):
connection = pika.BlockingConnection(pika.ConnectionParameters(host='mqhost'))
channel = connection.channel()
channel.queue_declare(queue=qname, durable=True)
channel.basic_consume(callback,queue=qname,no_ack=False)
channel.basic_qos(prefetch_count=1)
channel.start_consuming()
def callback(ch, method, properties, body):
doSomeWork();
ch.basic_ack(delivery_tag = method.delivery_tag)
if __name__ == '__main__':
for i in range(worker_count):
worker = multiprocessing.Process(target=mqworker)
worker.start()
The issue I have is that despite setting basic_qos on the channel, the first worker to start accepts all the messages off the queue, whilst the others sit there idle. I can see this in the rabbitmq interface, that even when I set worker_count to be 1 and dump 50 messages on the queue, all 50 go into the 'unacknowledged' bucket, whereas I'd expect 1 to become unacknowledged and the other 49 to be ready.
Why isn't this working?
I appear to have solved this by moving where basic_qos is called.
Placing it just after channel = connection.channel() appears to alter the behaviour to what I'd expect.