Celery fanout signals with `send_task` - rabbitmq

I'm trying to establish communication between different processes running celery. I successfully sent tasks from one process to others using app.send_task on a celery instance. I am struggling now to broadcast tasks through a fanout rabbitmq exchange to all other instances (basically a publish-subscribe pattern for celery).
It must be related to the routing across exchanges and queues but I simply can't make it work.
This is the master emitter, which broadcasts a task named `signal` through the `default` exchange of type fanout:
from celery import Celery
from kombu import Exchange, Queue
app = Celery('emitter',
             broker='pyamqp://test@localhost//',
             backend='db+sqlite:///results.db')

default_queue_name = 'default'
default_exchange_name = 'default'
default_routing_key = 'default'

default_exchange = Exchange(default_exchange_name, type='fanout')
default_queue = Queue(
    default_queue_name,
    default_exchange,
    routing_key=default_routing_key)

app.conf.task_queues = (
    default_queue,
)

app.conf.task_default_queue = default_queue_name
app.conf.task_default_exchange = default_exchange_name
app.conf.task_default_routing_key = default_routing_key

if __name__ == '__main__':
    app.send_task(name='signal', exchange='default')
To my understanding, the routing, queue and exchange setup on the other app needs to be identical. Thus, this is a very similar-looking piece of code, but it defines the task that gets called:
from celery import Celery
from kombu import Exchange, Queue
app = Celery('CLIENT_A',
             broker='pyamqp://test@localhost//',
             backend='db+sqlite:///results.db')

default_queue_name = 'default'
default_exchange_name = 'default'
default_routing_key = 'default'

default_exchange = Exchange(default_exchange_name, type='fanout')
default_queue = Queue(
    default_queue_name,
    default_exchange,
    routing_key=default_routing_key)

app.conf.task_queues = (
    default_queue,
)

app.conf.task_default_queue = default_queue_name
app.conf.task_default_exchange = default_exchange_name
app.conf.task_default_routing_key = default_routing_key

@app.task(name='signal')
def signal():
    print('client_a signal')
    return 'signal'
The second client will look exactly the same as the first except for the name and the print message:
# [...]
app = Celery('CLIENT_B', ...
# [...] identical to the part above

@app.task(name='signal')
def signal():
    print('client_b signal')
I'm starting both client workers with different node names (otherwise celery will complain):
celery -A client_a worker -n node_a
celery -A client_b worker -n node_b
If I then run the emitter (first piece of code) I see the signal being handled alternately by client_a and client_b, but never by both, which is what I would like to happen.
The rabbitmq management platform looks as expected with the default exchange defined as fanout and the routing looks alright.
I'm not sure if I'm on the completely wrong track here but that's what I imagined should be possible with correct routing.
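For reference (not part of the original question): the usual reason only one worker fires is that both workers consume from the same queue, so the broker load-balances messages between them. One way to get true publish-subscribe is kombu's Broadcast queue, which binds a unique, auto-delete queue per worker to a shared fanout exchange. A minimal sketch, with illustrative names:

from celery import Celery
from kombu.common import Broadcast

app = Celery('client_a', broker='pyamqp://test@localhost//')

# Broadcast declares a fanout exchange named 'signals' and binds a
# uniquely named, auto-delete queue to it for every worker that starts.
app.conf.task_queues = (Broadcast('signals'),)

@app.task(name='signal')
def signal():
    print('client_a signal')

The emitter would then publish into that fanout exchange, e.g. app.send_task('signal', exchange='signals'), and every running worker receives its own copy.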

Related

Task queues and result queues with Celery and Rabbitmq

I have implemented Celery with RabbitMQ as Broker. I rely on Celery v4.4.7 since I have read that v5.0+ doesn't support RabbitMQ anymore. RabbitMQ is a MUST in my case.
Everything has been containerized and deployed as pods within Kubernetes 1.19. I am able to execute long-running tasks and everything apparently looks fine at first glance. However, I have a few concerns which require your expertise.
I have declared inbound and outbound queues, but Celery created its own, and I do not see any messages within those queues (inbound or outbound):
inbound_queue = "_IN"
outbound_queue = "_OUT"

app = Celery()
app.conf.update(
    broker_url = 'pyamqp://%s//' % path,
    broker_heartbeat = None,
    broker_connection_timeout = int(timeout),
    result_backend = 'rpc://',
    result_persistent = True,
    task_queues = (
        Queue(algorithm_queue, Exchange(inbound_queue), routing_key='default', auto_delete=False),
        Queue(result_queue, Exchange(outbound_queue), routing_key='default', auto_delete=False),
    ),
    task_default_queue = inbound_queue,
    task_default_exchange = inbound_exchange,
    task_default_exchange_type = 'direct',
    task_default_routing_key = 'default',
)

@app.task(bind=True,
          name='osmq.tasks.add',
          queue=inbound_queue,
          reply_to = outbound_queue,
          autoretry_for=(Exception,),
          retry_kwargs={'max_retries': 5, 'countdown': 2})
def execute(self, data):
    <method_implementation>
I have implemented callbacks to get results back via REST APIs. However, it randomly does or does not return results even when the status is successful. This is probably related to message persistence. In detail, when I use the Flower API to get info, the status is successful and the result is partially displayed (shortened JSON messages); when I call AsyncResult for the same status, the result is either None or the right one. I do not understand the mechanism between the RabbitMQ queues and kombu, which seems to cache the resulting message. I must guarantee that results can be retrieved every time a task has been successfully executed.
def callback(uuid):
    # state/result are fetched from the rpc:// result backend
    task = app.AsyncResult(uuid)
Specifically, what changed is that Celery 5.0+ no longer supports amqp:// as a result backend. However, as in your example, rpc:// is still supported (and RabbitMQ is still fully supported as a broker).
The relevant snippet is here: https://docs.celeryproject.org/en/stable/getting-started/backends-and-brokers/index.html#rabbitmq
We tend to always set ignore_result=True in our implementation, so I can't give any practical tips on how to use rpc://, other than to infer that any response is put on an application-specific queue, rather than on a queue you specify (or even a different broker / RabbitMQ instance) as was possible with amqp://.
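For completeness, a minimal sketch of disabling result storage entirely (standard Celery configuration, not code from the question; names are illustrative):

from celery import Celery

app = Celery('app', broker='pyamqp://guest@localhost//')

# Globally: never store task results.
app.conf.task_ignore_result = True

# Or per task:
@app.task(ignore_result=True)
def add(x, y):
    return x + y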

How to keep receiving messages when the master node goes down? [RabbitMQ]

I have two machines running RabbitMQ, a master (ip1) and a slave (ip2). I mirrored them, added a policy, and synced a queue and exchange, but when I'm sending messages and the master goes down, the slave stops responding and my application is not able to send messages anymore.
I have a virtual host, a durable queue and exchange, and a policy to promote the slave.
The policy:
Pattern: ^test
Apply to: all
Definition:
    ha-mode: exactly
    ha-params: 5
    ha-promote-on-failure: always
    ha-promote-on-shutdown: always
    ha-sync-mode: automatic
    queue-master-locator: random
Priority: 0
This is the application I'm using to publish messages (EDIT: the queue is already set up in the RabbitMQ UI):
import pika
import time

credentials = pika.PlainCredentials('usr', 'pass')
parameters = pika.ConnectionParameters('ip1',
                                       5672,
                                       'virtual_host',
                                       credentials)
connection = pika.BlockingConnection(parameters)
channel = connection.channel()

channel.exchange_declare(
    exchange="test_exchange",
    exchange_type="direct",
    passive=False,
    durable=True,
    auto_delete=False)

# channel.queue_declare(queue='test', durable=True, exclusive=False, auto_delete=False)

input('Press ENTER to begin')
for i in range(10000):
    channel.basic_publish(exchange='',
                          routing_key='test',
                          body=('Hello World! ' + str(i)))
    time.sleep(0.001)
    print(" [x] Sent 'Hello World!'")
I want the slave to be promoted when the master goes down, so that it keeps receiving the messages.
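For reference (not from the question): pika's BlockingConnection accepts a list of ConnectionParameters and tries them until one node accepts the connection, so the publisher can fall back to the mirror when the first node is unreachable. A minimal sketch, reusing the credentials above:

import pika

credentials = pika.PlainCredentials('usr', 'pass')

# Try ip1 first, then ip2; pika walks the list until a node accepts.
parameters = [
    pika.ConnectionParameters('ip1', 5672, 'virtual_host', credentials),
    pika.ConnectionParameters('ip2', 5672, 'virtual_host', credentials),
]
connection = pika.BlockingConnection(parameters)
channel = connection.channel()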

Consuming from rabbitmq using Celery

I am using Celery with a RabbitMQ broker on Server A. Some tasks require interaction with another server, say Server B, and I am using RabbitMQ queues for this interaction.
Queue 1 - Server A (Producer), Server B (Consumer)
Queue 2 - Server B (Producer), Server A (Consumer)
My Celery workers hang unexpectedly, and I have found the reason to be an incorrect implementation of the Server A consumer code.
channel.start_consuming() keeps polling RabbitMQ as expected; however, putting this in a Celery task creates multiple pollers which don't expire. I can add an expiry, but the completion time for the data being sent to Server B cannot be guaranteed. The code pasted below is one method I used to tackle the issue, but I am not convinced it is the best solution.
I wish to know what I am doing wrong and what the right way to implement this is, because I have failed to find articles on the web. Any tips, insights and even links to articles would be extremely helpful.
Finally, my code -
@celery.task
def task_a(data):
    do_some_processing
    # Create only 1 RabbitMQ consumer instance to avoid Celery hangups
    task_d.delay()

@celery.task
def task_b(data):
    do_some_processing
    if data is not None:
        task_c.delay()

@celery.task
def task_c():
    data = some_data
    data = json.dumps(data)
    conn_params = pika.ConnectionParameters(host=RABBITMQ_HOST)
    connection = pika.BlockingConnection(conn_params)
    channel = connection.channel()
    channel.queue_declare(queue=QUEUE_1)
    channel.basic_publish(exchange='',
                          routing_key=QUEUE_1,
                          body=data)
    channel.close()

@celery.task
def task_d():
    def queue_helper(ch, method, properties, body):
        '''
        Callback from queue.
        '''
        data = json.loads(body)
        task_b.delay(data)

    conn_params = pika.ConnectionParameters(host=RABBITMQ_HOST)
    connection = pika.BlockingConnection(conn_params)
    channel = connection.channel()
    channel.queue_declare(queue=QUEUE_2)
    channel.basic_consume(queue_helper,
                          queue=QUEUE_2,
                          no_ack=True)
    channel.start_consuming()
    channel.close()
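One alternative (not from the question) is to avoid running a blocking pika consumer inside a task altogether and instead poll the queue with kombu's SimpleQueue, for example from a periodic task. A minimal sketch with an illustrative broker URL, reusing the question's task_b:

from kombu import Connection

with Connection('amqp://guest:guest@localhost//') as conn:
    queue = conn.SimpleQueue('QUEUE_2')
    try:
        # Wait up to 5 seconds for a single message, then hand it to task_b.
        message = queue.get(block=True, timeout=5)
        task_b.delay(message.payload)
        message.ack()
    except queue.Empty:
        pass
    finally:
        queue.close()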

celery - Programmatically list queues

How can I programmatically, using Python code, list current queues created on a RabbitMQ broker and the number of workers connected to them? It would be the equivalent to:
rabbitmqctl list_queues name consumers
I do it this way and display all the queues and their details (messages ready, unacknowledged etc.) on a web page -
import kombu

conn = kombu.Connection(broker_url)  # example: 'amqp://guest:guest@localhost:5672/'
conn.connect()
client = conn.get_manager()
queues = client.get_queues('/')  # assuming the vhost is '/'
You will need kombu to be installed and queues will be a dictionary with keys representing the queue names.
I think I got this when digging through the code of Celery Flower (the tool used for monitoring Celery).
Update: As pointed out by @zaq178miami, you will also need the management plugin, which provides the HTTP API. I had forgotten that I had enabled that in RabbitMQ.
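Assuming the management plugin is enabled, the HTTP API returns one JSON object per queue, so (if get_queues mirrors that payload) listing names and consumer counts might look like this:

for q in queues:
    # 'consumers' and 'messages_ready' are fields of the management API payload
    print(q['name'], q.get('consumers'), q.get('messages_ready'))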
This way did it for me:
def get_queue_info(queue_name):
    with celery.broker_connection() as conn:
        with conn.channel() as channel:
            return channel.queue_declare(queue_name, passive=True)
This will return a namedtuple with the name, number of messages waiting and consumers of that queue.
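For illustration, a hypothetical call against the default Celery queue (attribute names as returned by queue_declare):

info = get_queue_info('celery')
print(info.message_count, info.consumer_count)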
ksrini's answer is correct too, and can be used when you require more information about a queue.
Thanks to Ask Solem who gave me the hint.
As a RabbitMQ client you can use pika. However, it doesn't have an option for list_queues. The easiest solution would be calling the rabbitmqctl command from Python using subprocess:
import subprocess

command = "/usr/local/sbin/rabbitmqctl list_queues name consumers"
process = subprocess.Popen(command.split(), stdout=subprocess.PIPE)
print(process.communicate())
I would simply use this. Just replace the user (default: guest), passwd (default: guest) and port with your values:
import requests
import json

def call_rabbitmq_api(host, port, user, passwd):
    url = 'https://%s:%s/api/queues' % (host, port)
    r = requests.get(url, auth=(user, passwd), verify=False)
    return r

def get_queue_name(json_list):
    res = []
    for q in json_list:  # avoid shadowing the json module
        res.append(q["name"])
    return res

if __name__ == '__main__':
    host = 'rabbitmq_host'
    port = 55672
    user = 'guest'
    passwd = 'guest'

    res = call_rabbitmq_api(host, port, user, passwd)
    print("--- dump json ---")
    print(json.dumps(res.json(), indent=4))
    print("--- get queue name ---")
    q_name = get_queue_name(res.json())
    print(q_name)
Referred from here: https://gist.github.com/hiroakis/5088513#file-example_rabbitmq_api-py-L2

Pika + RabbitMQ: setting basic_qos to prefetch=1 still appears to consume all messages in the queue

I've got a Python worker client that spins up 10 workers, each of which hooks onto a RabbitMQ queue. A bit like this:
#!/usr/bin/python
import multiprocessing
import pika

worker_count = 10

def mqworker(queue, configurer):
    connection = pika.BlockingConnection(pika.ConnectionParameters(host='mqhost'))
    channel = connection.channel()
    channel.queue_declare(queue=qname, durable=True)
    channel.basic_consume(callback, queue=qname, no_ack=False)
    channel.basic_qos(prefetch_count=1)
    channel.start_consuming()

def callback(ch, method, properties, body):
    doSomeWork()
    ch.basic_ack(delivery_tag=method.delivery_tag)

if __name__ == '__main__':
    for i in range(worker_count):
        worker = multiprocessing.Process(target=mqworker)
        worker.start()
The issue I have is that despite setting basic_qos on the channel, the first worker to start accepts all the messages off the queue, whilst the others sit there idle. I can see this in the RabbitMQ interface: even when I set worker_count to 1 and dump 50 messages on the queue, all 50 go into the 'unacknowledged' bucket, whereas I'd expect 1 to become unacknowledged and the other 49 to stay ready.
Why isn't this working?
I appear to have solved this by moving where basic_qos is called.
Placing it just after channel = connection.channel() appears to alter the behaviour to what I'd expect.
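In other words, the fix is roughly to set the prefetch limit before registering the consumer (a sketch, keeping the question's pika call style):

channel = connection.channel()
channel.basic_qos(prefetch_count=1)   # at most one unacked message per consumer
channel.queue_declare(queue=qname, durable=True)
channel.basic_consume(callback, queue=qname, no_ack=False)
channel.start_consuming()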