I have two servers, call them A and B. B runs RabbitMQ, while A connects to RabbitMQ via Kombu. If I restart RabbitMQ on B, the kombu connection breaks, and the messages are no longer delivered. I then have to reset the process on A to re-establish the connection. Is there a better approach, i.e. is there a way for Kombu to re-connect automatically, even if the RabbitMQ process is restarted?
My basic code implementation is below, thanks in advance! :)
def start_consumer(routing_key, incoming_exchange_name, outgoing_exchange_name):
global rabbitmq_producer
incoming_exchange = kombu.Exchange(name=incoming_exchange_name, type='direct')
incoming_queue = kombu.Queue(name=routing_key+'_'+incoming_exchange_name, exchange=incoming_exchange, routing_key=routing_key)#, auto_delete=True)
outgoing_exchange = kombu.Exchange(name=outgoing_exchange_name, type='direct')
rabbitmq_producer = kombu.Producer(settings.rabbitmq_connection0, exchange=outgoing_exchange, serializer='json', compression=None, auto_declare=True)
settings.rabbitmq_connection0.connect()
if settings.rabbitmq_connection0.connected:
callbacks=[]
queues=[]
callbacks.append(callback)
# if push_queue:
# callbacks.append(push_message_callback)
queues.append(incoming_queue)
print 'opening a new *incoming* rabbitmq connection to the %s exchange for the %s queue' % (incoming_exchange.name, incoming_queue.name)
incoming_exchange(settings.rabbitmq_connection0).declare()
incoming_queue(settings.rabbitmq_connection0).declare()
print 'opening a new *outgoing* rabbitmq connection to the %s exchange' % outgoing_exchange.name
outgoing_exchange(settings.rabbitmq_connection0).declare()
with settings.rabbitmq_connection0.Consumer(queues=queues, callbacks=callbacks) as consumer:
while True:
settings.rabbitmq_connection0.drain_events()
On the consumer side, kombu.mixins.ConsumerMixin handles reconnecting when the connection goes away (and also does heartbeats, etc., and lets you write less code). There doesn't seem to be a ProducerMixin, unfortunately but you could potentially dig into the code and adapt it...?
Related
I'm trying to achieve a reject/delay loop using Rabbit's operations, i.e. :
I Have:
Main Queue with Main Exchange binded to it and DLX to StandBy Exchange.
StandBy Queue with StandBy Exchange binded to it with 60s TTL and DLX to Main Exchange
Basically I want to:
Consume from Main Queue
Rejects message (under certain circunstances)
Will get redirect it to StandBy Queue because rejection
When TTL expire, re-queue message to Main Queue.
The steps 1, 2 and 3 are OK but the last one drop the message instead of re-queue it.
Some theory from RabbitMQ's docs what I used to design this was:
Messages from a queue can be 'dead-lettered'; that is, republished to another exchange when any of the following events occur:
The message is rejected (basic.reject or basic.nack) with requeue=false,
The TTL for the message expires; or
The queue length limit is exceeded.
...
It is possible to form a cycle of message dead-lettering. For instance, this can happen when a queue dead-letters messages to the default exchange without specifiying a dead-letter routing key. Messages in such cycles (i.e. messages that reach the same queue twice) will be dropped if there was no rejections in the entire cycle.
The theory says that it should be re-queue because it has a rejection in the cycle from step #2, so, can you help me figure it out why it drops the message instead of re-queue it?
UPDATE:
The version I was targeting was 2.8.4 and it seems that in that moment the if there was no rejections in the entire cycle wasn't in the uses cases, anyway you can check this yourselves RabbitMQ 2.8.x Docs
I'll accept #george answer as the original objective can be achieved by this code.
Rafael, I am not sure what client you are using but with the Pika client in Python you could implement something like this. For simplicity I only use one exchange. Are you sure you are setting the exchange and the routing-key properly?
sender.py
import sys
import pika
connection = pika.BlockingConnection(pika.ConnectionParameters(
'localhost'))
channel = connection.channel()
channel.exchange_declare(exchange='cycle', type='direct')
channel.queue_declare(queue='standby_queue',
arguments={
'x-message-ttl': 10000,
'x-dead-letter-exchange': 'cycle',
'x-dead-letter-routing-key': 'main_queue'})
channel.queue_declare(queue='main_queue',
arguments={
'x-dead-letter-exchange': 'cycle',
'x-dead-letter-routing-key': 'standby_queue'})
channel.queue_bind(queue='main_queue', exchange='cycle')
channel.queue_bind(queue='standby_queue', exchange='cycle')
channel.basic_publish(exchange='cycle',
routing_key='main_queue',
body="message body")
connection.close()
receiver.py
import sys
import pika
def callback(ch, method, properties, body):
print "Processing message: {}".format(body)
# replace with condition for rejection
if True:
print "Rejecting message"
ch.basic_nack(method.delivery_tag, False, False)
connection = pika.BlockingConnection(pika.ConnectionParameters('localhost'))
channel = connection.channel()
channel.basic_consume(callback, queue='main_queue')
channel.start_consuming()
I am using Celery with Rabbitmq broker on Server A. Some tasks require interaction with another server say, Server B and I am using Rabbitmq queues for this interaction.
Queue 1 - Server A (Producer), Server B (Consumer)
Queue 2 - Server B (Producer), Server A (Consumer)
My celery is unexpectedly hanging and I have found the reason to be incorrect implementation of Server A consumer code.
channel.start_consuming() keeps polling Rabbitmq as expected however putting this in a celery task creates multiple pollers which don't expire. I can add expiry but the time completion for the data being sent to Server B cannot be guaranteed. The code pasted below is one method I used to tackle the issue but I am not convinced this is best solution.
I wish to know what I am doing wrong and what is the right way to implement this because I have failed searching for articles on the web. Any tips, insights and even links to articles would be extremely helpful.
Finally, my code -
#celery.task
def task_a(data):
do_some_processing
# Create only 1 Rabbitmq consumer instance to avoid celery hangups
task_d.delay()
#celery.task
def task_b(data):
do_some_processing
if data is not None:
task_c.delay()
#celery.task
def task_c():
data = some_data
data = json.dumps(data)
conn_params = pika.ConnectionParameters(host=RABBITMQ_HOST)
connection = pika.BlockingConnection(conn_params)
channel = connection.channel()
channel.queue_declare(queue=QUEUE_1)
channel.basic_publish(exchange='',
routing_key=QUEUE_1,
body=data)
channel.close()
#celery.task
def task_d():
def queue_helper(ch, method, properties, body):
'''
Callback from queue.
'''
data = json.loads(body)
task_b.delay(data)
conn_params = pika.ConnectionParameters(host=RABBITMQ_HOST)
connection = pika.BlockingConnection(conn_params)
channel = connection.channel()
channel.queue_declare(queue=QUEUE_2)
channel.basic_consume(queue_helper,
queue=QUEUE_2,
no_ack=True)
channel.start_consuming()
channel.close()
I've got a bunch of celery tasks that take their results and post them to a RabbitMQ message queue. The results that get posted can become quite large (up to a few meg). Opinion is mixed as to whether putting large amounts of data in a RabbitMQ message is a good idea, but I've seen this work in other situations and as long as memory is kept under control, it seems to work.
However, for my current set of tasks, rabbit appears to be just dropping messages that seem to be too big. I've reduced it down to a fairly simple test case:
#!/usr/bin/env python
import string
import random
import pika
import os
qname='examplequeue'
connection = pika.BlockingConnection(pika.ConnectionParameters(
host='mq.example.com'))
channel = connection.channel()
channel.queue_declare(queue=qname,durable=True)
N=100000
body = ''.join(random.choice(string.ascii_uppercase) for x in range(N))
promise = channel.basic_publish(exchange='', routing_key=qname, body=body, mandatory=0, immediate=0, properties=pika.BasicProperties(content_type="text/plain",delivery_mode=2))
print " [x] Sent 'Hello World!'"
connection.close()
I have a 3-node RabbitMQ cluster, and mq.example.com round-robins to each node. Client is using Pika 0.9.5 on Ubuntu 12.04 and the RabbitMQ cluster is running RabbitMQ 2.8.7 on Erlang R14B04.
Executing this script prints the print statement and exits without any exceptions being raised. The message never appears in RabbitMQ.
Changing N to 10000 makes it work as expected.
Why?
I suppose you have problem with tcp-backpressure mechanizm in RabbitMq. You can read about http://www.rabbitmq.com/memory.html.
I see two ways to solve this problem:
Add tcp-callback and make reconnect every tcp-call from rabbit
Use compressing messages before sending it to rabbit, It will make easier push to rabbit.
def compress(s):
return binascii.hexlify(zlib.compress(s))
def decompress(s):
return zlib.decompress(binascii.unhexlify(s))
This is what I do to send and receive packets. It is somewhat more efficient than hexlify, because base64 may use one byte where two bytes are needed by hexlify to represent one character.
import zlib
import base64
def hexpress(send: str):
print(f"send: {send}")
bsend = send.encode()
print(f"byte-encoded send: {bsend}")
zbsend = zlib.compress(bsend)
print(f"zipped-byte-encoded-send: {zbsend}")
hzbsend = base64.b64encode(zbsend)
print(f"hex-zip-byte-encoded-send: {hzbsend}")
shzbsend = hzbsend.decode()
print(f"string-hex-zip-byte-encoded-send: {shzbsend}")
return shzbsend
def hextract(recv: str):
print(f"string-hex-zip-byte-encoded-recv: {recv}")
zbrecv = base64.b64decode(recv)
print(f"zipped-byte-encoded-recv: {zbrecv}")
brecv = zlib.decompress(zbrecv)
print(f"byte-encoded-recv: {brecv}")
recv = brecv.decode()
print(f"recv: {recv}")
return recv
print("sending ...\n")
send = "hello this is dog"
packet = hexpress(send)
print("\nover the wire -------->>>>>\n")
print("receiving...\n")
recv = hextract(packet)
I've got a python worker client that spins up a 10 workers which each hook onto a RabbitMQ queue. A bit like this:
#!/usr/bin/python
worker_count=10
def mqworker(queue, configurer):
connection = pika.BlockingConnection(pika.ConnectionParameters(host='mqhost'))
channel = connection.channel()
channel.queue_declare(queue=qname, durable=True)
channel.basic_consume(callback,queue=qname,no_ack=False)
channel.basic_qos(prefetch_count=1)
channel.start_consuming()
def callback(ch, method, properties, body):
doSomeWork();
ch.basic_ack(delivery_tag = method.delivery_tag)
if __name__ == '__main__':
for i in range(worker_count):
worker = multiprocessing.Process(target=mqworker)
worker.start()
The issue I have is that despite setting basic_qos on the channel, the first worker to start accepts all the messages off the queue, whilst the others sit there idle. I can see this in the rabbitmq interface, that even when I set worker_count to be 1 and dump 50 messages on the queue, all 50 go into the 'unacknowledged' bucket, whereas I'd expect 1 to become unacknowledged and the other 49 to be ready.
Why isn't this working?
I appear to have solved this by moving where basic_qos is called.
Placing it just after channel = connection.channel() appears to alter the behaviour to what I'd expect.
I am using celery+rabbitmq. I can't find convenient way to clear queue in celery+rabbitmq. I do it with remove and create vhost.
rabbitmqctl delete_vhost <vhostpath>
rabbitmqctl add_vhost <vhostpath>
Is it prefer way to clear some celery queue ?
I'm not quite sure how celery works, but I suspect you want to purge a RabbitMQ queue (you're currently simulating this by deleting the queues and having celery re-create them).
You could install RabbitMQ's Management Plugin. Its WebUI will allow you to purge the required queue. This should also tell you which queue you're aiming for, so you wouldn't need to delete everything.
Once you know which queue it is, you could purge it programatically. For instance, using py-amqplib, you would do something like:
from amqplib import client_0_8 as amqp
conn = amqp.Connection(host="localhost:5672", userid="guest", password="guest", virtual_host="/", insist=False)
conn = conn.channel()
conn.queue_purge("the-target-queue")
There's probably a better way to do it, though.
If you are facing this problem because you used rabbitmq for the result backend and as a result you got too many queues, then i would suggest using a different result backend (redis or mongodb)
This is one well known flaw with the celery. It will create a separate queue for each result if you amqp for result backend.
If you still want to stick to amqp as result backend. It will clear itself in 24 hours. You can however set it to a smaller value using CELERY_AMQP_TASK_RESULT_EXPIRES setting.
If you need to delete ALL items in queue (especially when the list is long)
1) Saves all items into the file
sudo rabbitmqctl list_queues -p /yourvhost name > queues.txt
don't forget to remove first and last lines from 'queues.txt'
2) Use mentioned python code to do the job
from amqplib import client_0_8 as amqp
conn = amqp.Connection(host="127.0.0.1:5672", userid="guest", password="guest", virtual_host="/yourvhost", insist=False)
conn = conn.channel()
queues = None
with open('queues.txt', 'r') as f:
queues = f.readlines()
for q in queues:
if q:
#print 'deleting %s' % q
conn.queue_purge(q.strip())
print 'purged %d items' % len(queues)