Finding which task went to which queue - RabbitMQ

I am using RabbitMQ with Celery and I have set some custom routing settings for my tasks: a specific type of task goes to one queue and all the other tasks go to another queue. Now I want to verify that this is working.
For this, I want to inspect which tasks went to which queue. Unfortunately, I haven't found anything that could help me with this. The celeryev monitor just provides information about which tasks have been received and their completion status. rabbitmqctl gives me information only about the currently running and waiting tasks, so I cannot see which queue my intended task went to.
Could anyone help me with this?
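(For context, the kind of routing configuration meant here looks roughly like the following old-style Celery settings; the task and queue names are made up:)
# celeryconfig.py / Django settings (names are placeholders)
CELERY_ROUTES = {
    'myapp.tasks.special_task': {'queue': 'special'},
}
CELERY_DEFAULT_QUEUE = 'default'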

With AMQP you normally can't inspect the messages on a queue without consuming them (not sure about Celery, though).
If you just need this as a one-off test, the simplest way would probably be to write a quick program in Python that gets all messages from the queues and prints them out.
Using py-amqplib, this should do it:
from amqplib import client_0_8 as amqp

conn = amqp.Connection(host="localhost:5672", userid="guest",
                       password="guest", virtual_host="/", insist=False)
chan = conn.channel()

queue_name = "the_queue"
print "Draining", queue_name
while True:
    # no_ack=True removes each message from the queue as it is read
    msg = chan.basic_get(queue_name, no_ack=True)
    if msg is None:
        break
    print msg.body
print "All done"

chan.close()
conn.close()
If you need more help, a good place to ask is the RabbitMQ Discuss mailing list. The RabbitMQ developers do their best to answer all the questions posted there, and Celery's author also reads it.

Related

Can I use Celery for publishing and subscribing to topics?

All the examples I have seen of executing/scheduling Celery tasks are like this:
add.delay()
I was wondering if I could do something like this with Celery:
celery_app.publish(topic='my-topic')
And in other codebase/service:
@task(topic='my-topic')
def mytask():
    do_stuff()
This way I don't need to know which tasks have to do something when an event happens.
I probably have some misconceptions causing this question, but I couldn't find the answer myself.
No topics, just queues. And yes, you can send a task to any queue. Subscribing to a queue is a worker-level remote command, so that is possible too.
Also, you can't send arbitrary messages to queues, just Celery tasks. If you want to produce/consume arbitrary messages, use kombu.
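For illustration, a minimal sketch of sending a task to a named queue and subscribing a worker to it (the app, task, and queue names here are made up):
from celery import Celery

app = Celery('proj', broker='amqp://guest@localhost//')

@app.task
def mytask():
    print('handling my-topic work')

# Producer side: direct the task to a named queue at call time.
mytask.apply_async(queue='my-topic')

# Worker side: subscribe at startup with `celery -A proj worker -Q my-topic`,
# or at runtime via the remote control command:
app.control.add_consumer('my-topic')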

rabbitmq pika: don't take more than one job when using a thread

I get a ConnectionResetError when I block the IOLoop that channel.start_consuming() is running on for a long time. So I've read this code:
https://github.com/pika/pika/blob/0.12.0/examples/basic_consumer_threaded.py
In this code the job is running in a background thread.
The problem is that while my job is running in a thread, the worker can still take more jobs (i.e., it keeps getting on_message callbacks). But I don't want my worker to process more than one job at a time.
What should I do? Is it possible to inform the queue that the worker is 'busy' and can't accept jobs for some time?
As long as you are setting the channel's QoS value via the channel.basic_qos method, your consumer will not get more unacknowledged messages than specified by prefetch_count.
If you use the prefetch_count=1 argument, your consumer will only get one message at a time and will not get more until basic_ack is called for that message.
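For reference, a minimal sketch of that setup (pika 1.x calling convention; the queue name and sleep are stand-ins):
import time
import pika

connection = pika.BlockingConnection(pika.ConnectionParameters('localhost'))
channel = connection.channel()
channel.queue_declare(queue='task_queue')

# Deliver at most one unacknowledged message to this consumer at a time.
channel.basic_qos(prefetch_count=1)

def on_message(ch, method, properties, body):
    time.sleep(10)  # stand-in for the long-running job
    # Only after this ack will the broker deliver the next message.
    ch.basic_ack(delivery_tag=method.delivery_tag)

channel.basic_consume(queue='task_queue', on_message_callback=on_message)
channel.start_consuming()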
If, for some reason, you see something different, share all of your code as an attachment or link in a message on the pika-python mailing list and I will check it out.
NOTE: the RabbitMQ team monitors the rabbitmq-users mailing list and only sometimes answers questions on StackOverflow.

Celery + HaProxy + RabbitMQ lose task messages

Probably not the best place to ask (maybe Server Fault), but I'll try here:
I have django celery sending tasks via HaProxy to a RabbitMQ cluster, and we are losing messages (tasks not being executed) every now and then.
Observed
We turned off the workers and monitored the queue size; we noticed that when we started 100 jobs, only 99 showed up in the queue.
It seems to happen when other processes are using RabbitMQ for other jobs.
Tried
I tried flooding RabbitMQ with dummy messages over many connections while putting some proper tasks into the queue, but I couldn't replicate the issue consistently.
Just wondering if anyone had experienced this before?
UPDATE:
So I dove into the code and eventually stumbled onto celery/app/amqp.py. I was debugging by adding an extra publish call to a non-existent exchange; see below:
log.warning(111111111)
self.publish(
    body,
    exchange=exchange, routing_key=routing_key,
    serializer=serializer or self.serializer,
    compression=compression or self.compression,
    headers=headers,
    retry=retry, retry_policy=_rp,
    reply_to=reply_to,
    correlation_id=task_id,
    delivery_mode=delivery_mode, declare=declare,
    **kwargs
)
log.warning(222222222)
self.publish(
    body,
    exchange='celery2', routing_key='celery1',
    serializer=serializer or self.serializer,
    compression=compression or self.compression,
    headers=headers,
    retry=retry, retry_policy=_rp,
    reply_to=reply_to,
    correlation_id=task_id,
    delivery_mode=delivery_mode, declare=declare,
    **kwargs
)
log.warning(333333333)
Then I tried to trigger 100 tasks from the project code, and the result was that only 1 message got put into the celery queue. I think it's caused by the ProducerPool or ConnectionPool.

Auto spawn RabbitMQ listener

I'm working on an application wherein I have a listener on a RabbitMQ queue. Depending on the kind of message, the listener goes ahead and performs a task. My problem is that I need a way to spawn a new listener if a single listener can't cope with the queue. As far as I can tell, I can use the RabbitMQ JSON API to find the length of the queue and take action based on that. So I'd write a script that checks the queue length using curl and spawns a new listener process. Am I on the right path here? Is there a better way to achieve this? I'm looking for a solution that scales with load, up to a certain limit at least.
Checking the RabbitMQ API to see the length of the queue is one way, and it would definitely work.
You should try to predict when the load spikes so that you can slowly increase the number of consumers as needed, rather than seeing a sudden burst of instances spawning. Having many instances spawn simultaneously could put unnecessary load on your system.
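As a rough sketch of the polling approach against the RabbitMQ management API (the URL, credentials, queue name, threshold, and listener script are all placeholders):
import subprocess
import requests

# %2F is the URL-encoded default vhost "/".
MGMT_URL = 'http://localhost:15672/api/queues/%2F/my_queue'
BACKLOG_THRESHOLD = 100

resp = requests.get(MGMT_URL, auth=('guest', 'guest'))
resp.raise_for_status()
backlog = resp.json()['messages']  # ready + unacknowledged messages

if backlog > BACKLOG_THRESHOLD:
    # Spawn one extra listener; a real script would also cap the total
    # number of listeners and scale back down when the backlog clears.
    subprocess.Popen(['python', 'listener.py'])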

RabbitMQ reordering messages

RabbitMQ ticks all the boxes for the project I am planning, save one. I would have different workers listening on a queue and it is important that they process the newest messages (i.e., latest sequence number) first (LIFO).
My application is such that newer messages pretty much make older ones obsolete. If you have workers to spare you could still process the older messages, but it is important that the newer ones are done first.
After trawling the various forums, the only solution I can see is that, for a client to process a message, it should first:
consume all messages
re-order them according to the sequence number
re-submit to the queue
consume the first message
Ugly, and problematic if the client dies halfway through. But maybe somebody here has a better solution.
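For what it's worth, a rough sketch of that workaround in Python with pika (1.x), assuming every message carries its sequence number in a 'seq' header (the queue name and header are made up):
import pika

connection = pika.BlockingConnection(pika.ConnectionParameters('localhost'))
channel = connection.channel()

# 1. Consume (drain) everything currently on the queue.
messages = []
while True:
    method, properties, body = channel.basic_get('work_queue', auto_ack=True)
    if method is None:
        break
    messages.append((properties.headers['seq'], properties, body))

# 2. Re-order so the highest sequence number comes first.
messages.sort(key=lambda m: m[0], reverse=True)

# 3. Re-submit in that order; the newest message is now at the head,
#    so the next consume will see it first.
for seq, properties, body in messages:
    channel.basic_publish(exchange='', routing_key='work_queue',
                          properties=properties, body=body)
Note that if the client dies between draining and re-publishing, those messages are lost, which is exactly the "dies halfway" problem mentioned above.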
My research is based (in part) on:
http://groups.google.com/group/rabbitmq-discuss/browse_thread/thread/e79e77d86bc7a3b8?fwc=1
http://lists.rabbitmq.com/pipermail/rabbitmq-discuss/2010-July/007934.html
http://groups.google.com/group/rabbitmq-discuss/browse_thread/thread/e40d1069dcebe2cc
http://old.nabble.com/Priority-Queue-implementation-and-performance-td29946348.html
Note: the expected traffic of messages will roughly be in the range of 1 msg/hour for some queues and 100/minute for others. So nothing stellar.
Since there is no reply I guess I did my homework rather well ;)
Anyway, after discussing the requirements with the other stakeholders it was decided I can drop the LIFO requirement for now. We can worry about that when it comes to it.
A solution that we will probably end up adopting is for the worker to open a second queue that the master can use to tell the worker which jobs to ignore, and to provide additional control/monitoring information (which it looks like we will need anyway).
RabbitMQ implementing the AMQP 1.0 spec may also help here.
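A rough sketch of that shape with pika (the queue names, message format, and job handler are all made up):
import json
import pika

def handle(job):
    print('processing', job['id'])  # hypothetical job handler

connection = pika.BlockingConnection(pika.ConnectionParameters('localhost'))
channel = connection.channel()
channel.queue_declare(queue='jobs')
channel.queue_declare(queue='control.worker-1')  # per-worker control queue

ignored = set()  # job ids the master has told this worker to skip

def on_control(ch, method, properties, body):
    ignored.add(json.loads(body)['ignore'])  # e.g. {"ignore": "job-42"}
    ch.basic_ack(delivery_tag=method.delivery_tag)

def on_job(ch, method, properties, body):
    job = json.loads(body)
    if job['id'] not in ignored:
        handle(job)
    ch.basic_ack(delivery_tag=method.delivery_tag)

channel.basic_consume(queue='control.worker-1', on_message_callback=on_control)
channel.basic_consume(queue='jobs', on_message_callback=on_job)
channel.start_consuming()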
So I will mark this question as answered for now. Somebody else is still free to add or improve.
One possibility might be to use basic.get in a loop and wait for the message-count field of the basic.get-ok response to become zero, throwing away all other messages. Roughly, in Python with pika (the queue name is a placeholder):
while True:
    method, properties, body = channel.basic_get('work_queue', auto_ack=True)
    if method is None:
        # get-empty: someone else got the last message
        break
    if method.message_count == 0:
        # nothing left behind this one, so body is the most recent message
        break
    # otherwise this is an older message: throw it away and loop
Of course, you'd have to set up the message routing patterns on the broker such that one consumer throwing away messages doesn't mess with another. Try to avoid requeueing messages, as they will requeue at the head of the queue, making them look like the most recent.