API design around RabbitMQ for publisher/subscriber - rabbitmq

TL;DR - What's the best way to expose RabbitMQ to a consumer via a REST API?
I'm creating an API to publish and consume messages from RabbitMQ. In my current design, the publisher makes a POST request, and my API routes it to the exchange. This way, the publisher doesn't have to know the server address, exchange name, etc. when publishing.
Now the consumer part is where I'm not sure how to proceed.
At the beginning there will be no queues. When a new consumer wants to subscribe to a TOPIC, I will create a queue and bind it to the exchange. I need help with a few questions:
Once I create a queue for the consumer, what's the next step to let the consumer get messages from that queue?
I make the consumer ask for a batch of messages (say 50) from the queue. Once I receive an ack from the consumer, I send the next 50 messages from the queue. If I don't receive an ack, I requeue the 50 messages. Isn't this expensive in terms of opening and closing connections between the consumer and my API?
If there is a better approach, please suggest it.

In general, your idea of putting RMQ behind a REST API is a good one. You don't want to expose RMQ to the world, directly.
For the specific questions:
Once I create a queue for the consumer, what's the next step to let the consumer get messages from that queue?
Have you read the tutorials? I would start there, for the language you are working with: http://www.rabbitmq.com/getstarted.html
Isn't this expensive in terms of opening and closing connection between the consumer and my API?
Don't open and close connections for each batch of messages.
Your application instance (the "consumer" app) should have a single connection. That connection stays open as long as you need it - across as many calls to RabbitMQ as you want.
I typically open my RMQ connection as soon as the app starts, and I leave it open until the app shuts down.
Within the consumer app, using that one single connection, you will create multiple channels through the connection. A channel is where the actual work is done.
Depending on your language, you will have a single channel per thread, a single channel per queue being consumed, etc.
You can create and destroy channels very quickly, unlike connections.
More specifically, your batch processing idea is handled by setting a consumer prefetch limit on your consumer and requiring each message to be acknowledged after it has been processed. RabbitMQ then keeps at most that many unacknowledged messages in flight to the consumer and delivers more as acks come in.
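To make the prefetch-and-ack flow concrete, here is a minimal in-memory sketch. It does not talk to RabbitMQ at all; SimulatedQueue and its methods are hypothetical names that only model how a prefetch limit caps the number of unacknowledged messages and what happens to them if the consumer goes away.

```python
from collections import deque


class SimulatedQueue:
    """In-memory stand-in for a broker queue with a per-consumer prefetch limit."""

    def __init__(self, messages, prefetch):
        self.ready = deque(messages)   # messages waiting to be delivered
        self.unacked = {}              # delivery tag -> message, delivered but not acked
        self.prefetch = prefetch
        self.next_tag = 1

    def deliver(self):
        """Deliver messages until the prefetch limit is hit or the queue is empty."""
        delivered = []
        while self.ready and len(self.unacked) < self.prefetch:
            tag = self.next_tag
            self.next_tag += 1
            self.unacked[tag] = self.ready.popleft()
            delivered.append((tag, self.unacked[tag]))
        return delivered

    def ack(self, tag):
        """The consumer confirms processing; the message is removed for good."""
        del self.unacked[tag]

    def requeue_unacked(self):
        """Model a consumer disconnect: unacked messages go back in their original order."""
        for tag in sorted(self.unacked, reverse=True):
            self.ready.appendleft(self.unacked.pop(tag))


queue = SimulatedQueue(messages=[f"msg-{i}" for i in range(1, 8)], prefetch=3)

first_batch = queue.deliver()    # at most 3 messages are in flight
queue.ack(first_batch[0][0])     # acking one message frees one slot
second_batch = queue.deliver()   # so exactly one more message is delivered
queue.requeue_unacked()          # the consumer disappears without acking the rest
```

With a real client library the same effect comes from setting the prefetch count on the channel and using manual acknowledgements, over one long-lived connection rather than one connection per batch.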

Related

RabbitMQ pause a queue

I am using a RabbitMQ Server (v3.8.9) with Java clients.
Use case is:
Our Backend creates messages for different clients. We send them out to their respective Endpoints.
1 Producer -> Outbound Queue -> 1 Consumer
The producer creates messages for n clients
Which the consumer should send out to the clients' endpoints
Messages must be kept in the correct order regarding each client
This works fine as long as all clients are up and running. Problem: if one client becomes unavailable, we need a bulletproof retry mechanism for that.
Say:
Wait 1 Minute and try again
All following messages must NOT be delivered before the first failed one and kept in the correct order
If a retry works, then ALL other messages should be sent to the client immediately
As you can see, it is not a solution to just "suspend" the consumer, because it should still deliver messages to the other (alive) clients. Due to application limitations and a dynamic number of clients, we cannot spawn one consumer per client queue.
My best approach right now is to dynamically create one queue per client, which are then routed to a single outbound queue. If one msg to one client cannot be delivered by the consumer, I would like to "pause" the clients queue for x minutes. An API call like "queue_pause('client_q1', '5 Minutes')" would help. But even then I have to deal with the other, already routed messages to that particular client and keep them in the correct order...
Any better ideas?
I think the key here is that a single consumer script can consume from multiple queues. So if I'm understanding correctly, you could model this as:
Each client has its own queue. These could be created by the consumer script when it starts up, or by a back-end process when a new client is created.
The consumer script subscribes to each queue separately
When a message is received, the consumer tries to send it immediately to the client; if it succeeds, it is manually acknowledged with basic.ack, and the consumer is ready to send the next message to that client.
When a message cannot be delivered to the client, it is requeued (basic.nack or basic.reject with requeue=1), retaining its position in the client's queue.
The consumer then needs to pause consuming from that particular queue. Depending on how it's written, that could be as simple as a sleep in that particular thread, but if that's not practical, you can effectively "pause" the subscription to the queue:
Cancel the subscription to that queue, leaving other subscriptions intact
Store the queue name and the retry time in an appropriate variable
If the consumer script is implemented with an event/polling loop, check the list of "paused" subscriptions each time around that loop; if the retry time has been reached, re-subscribe.
Alternatively, if the library / framework supports it, register a delayed event that will fire at the appropriate time and re-subscribe the queue. The exact mechanics of this depend on the technologies you're using.
All the other subscriptions will continue, so messages to other clients will be delivered. The queue with no subscribers will retain the messages for the offline client in order until the consumer script starts consuming them again.
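The pause-and-resubscribe loop described above can be sketched without a broker. This is only a toy model under stated assumptions: ClientDispatcher, its poll method, and the send callable are hypothetical names, the per-client deques stand in for per-client queues, and time is passed in explicitly so the behaviour is easy to follow.

```python
from collections import deque


class ClientDispatcher:
    """Toy model: one queue per client, one consumer loop, per-queue pause on failure."""

    def __init__(self, send, retry_delay=60.0):
        self.send = send              # hypothetical callable(client, message) -> bool
        self.retry_delay = retry_delay
        self.queues = {}              # client -> deque of pending messages
        self.paused_until = {}        # client -> time at which to re-subscribe

    def publish(self, client, message):
        self.queues.setdefault(client, deque()).append(message)

    def poll(self, now):
        """One pass of the event loop at time `now` (in seconds)."""
        for client, pending in self.queues.items():
            if self.paused_until.get(client, 0.0) > now:
                continue              # subscription is still cancelled for this queue
            self.paused_until.pop(client, None)
            while pending:
                if self.send(client, pending[0]):
                    pending.popleft()  # plays the role of basic.ack
                else:
                    # plays the role of requeue + cancel: the message keeps its position
                    self.paused_until[client] = now + self.retry_delay
                    break


offline = {"b"}                       # client "b" is unreachable at first
sent = []


def send(client, message):
    if client in offline:
        return False
    sent.append((client, message))
    return True


dispatcher = ClientDispatcher(send, retry_delay=60.0)
for message in ("a1", "b1", "a2", "b2"):
    dispatcher.publish(message[0], message)

dispatcher.poll(now=0.0)     # a1 and a2 go out; b1 fails, so queue "b" pauses until t=60
after_failure = list(sent)
offline.clear()              # client "b" comes back
dispatcher.poll(now=30.0)    # too early, queue "b" is still paused
after_early_poll = list(sent)
dispatcher.poll(now=60.0)    # retry time reached: b1 then b2 are delivered in order
```

Client "a" is never held up by client "b", and the messages for "b" stay in order because nothing behind the failed message is attempted while the queue is paused.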

Message Delivery Guarantee for Multiple Consumers in Pub/Sub and Messaging Queues

Requirement
A system undergoes some state change, and multiple other parts of the system have to know about it (let's call them observers) so that they can perform actions based on the current state. The actions of the observers are important: if some observers are not online (not currently listening due to some trouble, but will be back soon), the message should not be discarded until all observers have received it.
I'm trying to accomplish this with a pub/sub model. Here are my findings (please correct me if my understanding is wrong):
The publisher creates an event on a specific topic, and multiple subscribers can consume the same message. This model either provides no delivery guarantee (in Redis), or delivery is guaranteed once (with message queues), i.e. when one of the consumers acknowledges a message, the message is discarded (RabbitMQ).
Example
A new Person Profile entity gets created in DB
Now,
A background verification service has to know this to trigger the verification process.
Subscriptions service has to know this to add default subscriptions to the user.
Now both the tasks are important, unrelated and can run in parallel.
Now, in the queue model, if the subscription service is down for some reason and the BG verification process acknowledges the message, the message is removed from the queue. Or, if it is fire-and-forget like most pub/sub, delivery is not guaranteed for either service.
One more point is that both tasks are unrelated and need not be triggered one after the other.
In short, I need to make sure all consumers get the same message and can acknowledge it individually; the message should be evicted only after all consumers have acknowledged it. Neither of the above approaches does this.
Am I missing anything here? How should I approach this problem?
This scenario is explicitly supported by RabbitMQ's model, which separates "exchanges" from "queues":
A publisher always sends a message to an "exchange", which is just a stateless routing address; it doesn't need to know what queue(s) the message should end up in
A consumer always reads messages from a "queue", which contains its own copy of messages, regardless of where they originated
Multiple consumers can subscribe to the same queue, and each message will be delivered to exactly one consumer
Crucially, an exchange can route the same message to multiple queues, and each will receive a copy of the message
The key thing to understand here is that while we talk about consumers "subscribing" to a queue, the "subscription" part of a "pub-sub" setup is actually the routing from the exchange to the queue.
So a RabbitMQ pub-sub system might look like this:
A new Person Profile entity gets created in DB
This event is published as a message to an "events" topic exchange with a routing key of "entity.profile.created"
The exchange routes copies of the message to multiple queues:
A "verification_service" queue has been bound to this exchange to receive a copy of all messages matching "entity.profile.#"
A "subscription_setup_service" queue has been bound to this exchange to receive a copy of all messages matching "entity.profile.created"
The consuming scripts don't know anything about this routing, they just know that messages will appear in the queue for events that are relevant to them:
The verification service picks up the copy of the message on the "verification_service" queue, processes, and acknowledges it
The subscription setup service picks up the copy of the message on the "subscription_setup_service" queue, processes, and acknowledges it
If there are multiple consuming scripts looking at the same queue, they'll share the messages on that queue between them, but still completely independently of any other queue.
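The routing in this scenario can be illustrated with a small in-memory model. It is a sketch, not RabbitMQ itself: TopicExchange and topic_matches are hypothetical names, and the matcher follows the topic-exchange rules where "*" stands for exactly one word and "#" for zero or more words.

```python
def topic_matches(pattern, routing_key):
    """Topic matching: '*' is exactly one word, '#' is zero or more words."""
    def match(p, k):
        if not p:
            return not k
        if p[0] == "#":
            # '#' either absorbs nothing (skip it) or absorbs one more word (keep it)
            return match(p[1:], k) or (bool(k) and match(p, k[1:]))
        if not k:
            return False
        return (p[0] == "*" or p[0] == k[0]) and match(p[1:], k[1:])
    return match(pattern.split("."), routing_key.split("."))


class TopicExchange:
    """Stateless router: it only knows bindings, the queues hold the copies."""

    def __init__(self):
        self.bindings = []   # list of (pattern, queue_name)
        self.queues = {}     # queue_name -> list of message bodies

    def bind(self, queue_name, pattern):
        self.queues.setdefault(queue_name, [])
        self.bindings.append((pattern, queue_name))

    def publish(self, routing_key, body):
        # every matching queue gets its own copy, at most once per queue
        targets = {q for p, q in self.bindings if topic_matches(p, routing_key)}
        for queue_name in targets:
            self.queues[queue_name].append(body)
        return targets


events = TopicExchange()
events.bind("verification_service", "entity.profile.#")
events.bind("subscription_setup_service", "entity.profile.created")

created_targets = events.publish("entity.profile.created", {"profile_id": 1})
updated_targets = events.publish("entity.profile.updated", {"profile_id": 1})
```

Each service then acknowledges messages on its own queue, so one service being offline only means its own queue fills up until it returns.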
As you mentioned, this is not something you can control with Redis Pub/Sub.
But you can do it easily with Redis Streams.
Streams let you post messages using the XADD command, control which consumers handle each message, and acknowledge that a message has been processed.
You can look at these sample applications, which provide examples (in Java) of:
posting and consuming messages
creating multiple consumer groups
managing exceptions
Links:
Getting Started with Redis Streams and Java
Redis Streams in Action (a project that shows how to use ADD/ACK/PENDING/CLAIM and build an error-proof streaming application with Redis Streams and Spring Data)
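To show why consumer groups fit the requirement, here is a tiny in-memory model of a stream. It is not a Redis client: the Stream class is hypothetical, its methods only loosely mirror XADD, XREADGROUP, XACK and XPENDING, and entry IDs are simplified to plain integers. Groups are assumed to start reading from the beginning of the stream.

```python
class Stream:
    """Append-only log where every consumer group tracks its own progress."""

    def __init__(self):
        self.entries = []   # list of (entry_id, fields)
        self.groups = {}    # group -> {"next": index, "pending": {entry_id: consumer}}

    def add(self, fields):
        entry_id = len(self.entries) + 1
        self.entries.append((entry_id, fields))
        return entry_id

    def create_group(self, group):
        self.groups[group] = {"next": 0, "pending": {}}

    def read_group(self, group, consumer, count=10):
        """Hand out entries this group has not seen yet and mark them pending."""
        state = self.groups[group]
        batch = self.entries[state["next"]:state["next"] + count]
        state["next"] += len(batch)
        for entry_id, _ in batch:
            state["pending"][entry_id] = consumer
        return batch

    def ack(self, group, entry_id):
        """Acknowledge for one group only; other groups are unaffected."""
        return self.groups[group]["pending"].pop(entry_id, None) is not None

    def pending(self, group):
        return sorted(self.groups[group]["pending"])


stream = Stream()
stream.create_group("verification")
stream.create_group("subscriptions")

entry_id = stream.add({"event": "profile.created", "profile_id": 1})

verification_batch = stream.read_group("verification", consumer="worker-1")
subscription_batch = stream.read_group("subscriptions", consumer="worker-2")

stream.ack("verification", entry_id)   # only the verification group is done with it
```

The entry stays pending for the "subscriptions" group until that group acknowledges it, which is the per-consumer acknowledgement the question asks for.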

Scatter Gather : Wait for all "Gather-Workers" to complete [duplicate]

I've configured a RabbitMQ fanout exchange called "ex_foo" for an RPC workload. When clients connect to the server, they create their own non-durable RPC receive queue and connect to it with a BasicConsumer. The apps listen for messages/commands and respond to the queue defined in the reply_to part of the request.
One of the simple messages/commands I'm sending out to the fanout exchange (and thus to every application/client connected to it) is a type of ping request, and my problem is that I don't know how many ping responses I will get (or should expect), because I don't know how many clients are connected to the fanout exchange at any one time. All clients connected to the fanout exchange should reply.
If it gets delivered to 10 queues on the fanout exchange (i.e. 10 clients are connected), how do I know how many responses to expect? In order to know that, would I have to know how many times it was delivered? Is there anything more sophisticated than a sleep timer? Simply put, my admin tool can't wait indefinitely and needs to quit after it has received all pings (or a time-out has elapsed).
What you are looking for is something like a Scatter-Gather (http://www.eaipatterns.com/BroadcastAggregate.html) pattern, isn’t it?
You don’t know the consumers bound to the fan-out, so you can:
Implement a keep-alive from the consumer(s), for example using a queue that the producer listens on.
Each consumer sends a keep-alive every second; if you don't receive one, you can consider the consumer offline.
Use an in-memory database where the consumers are registered (again with a keep-alive).
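Use the HTTP API to list the queues bound to the fanout, like this:
http://rabbitmqip/vhost/yourfanout/bindings/source (with the management plugin, the full path has the form /api/exchanges/{vhost}/{exchange}/bindings/source), and the result looks like this: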
[{"source":"yourfanout","vhost":"/","destination":"amq.gen-xOpYc8m10Qy1s4KCNFCgFw","destination_type":"queue","routing_key":"","arguments":{},"properties_key":"~"},{"source":"yourfanout","vhost":"/","destination":"myqueue","destination_type":"queue","routing_key":"","arguments":{},"properties_key":"~"}]
Once you have counted the bound queues, you know how many replies to expect.
Call the API before sending a request.
NOTE: the last option only works if each consumer is bound through its own temporary queue.
I found this resource that could help you (http://geekswithblogs.net/michaelstephenson/archive/2012/08/06/150373.aspx)
I don't know your exact end goal, but with a keep-alive you wait at most one second before deciding whether a consumer is alive.
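The "count the bindings, then gather with a time-out" idea might look roughly like this. It is a sketch under assumptions: the bindings JSON is a trimmed copy of the response shown above, expected_replies and gather are hypothetical helper names, and a local queue.Queue stands in for the RPC reply queue.

```python
import json
import queue
import time


def expected_replies(bindings_json):
    """Count the queues bound to the fanout; one reply is expected per queue."""
    bindings = json.loads(bindings_json)
    return sum(1 for b in bindings if b["destination_type"] == "queue")


def gather(reply_queue, expected, timeout):
    """Collect replies until all expected ones arrived or the deadline passed."""
    deadline = time.monotonic() + timeout
    replies = []
    while len(replies) < expected:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break
        try:
            replies.append(reply_queue.get(timeout=remaining))
        except queue.Empty:
            break
    return replies


bindings_json = """[
  {"source": "yourfanout", "destination": "amq.gen-xOpYc8m10Qy1s4KCNFCgFw", "destination_type": "queue"},
  {"source": "yourfanout", "destination": "myqueue", "destination_type": "queue"}
]"""
expected = expected_replies(bindings_json)

# Case 1: every client answers, so gather returns as soon as the last pong arrives.
all_answer = queue.Queue()
all_answer.put("pong from client 1")
all_answer.put("pong from client 2")
complete = gather(all_answer, expected, timeout=5.0)

# Case 2: one client is silent, so gather gives up when the time-out elapses.
one_silent = queue.Queue()
one_silent.put("pong from client 1")
partial = gather(one_silent, expected, timeout=0.2)
```

The time-out is still needed as a safety net, because a queue can be bound while its client is hung or has just died.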

How does RabbitMQ send messages to consumers?

I am a newbie to RabbitMQ, hence need guidance on a basic question:
Does RabbitMQ send messages to consumer as they arrive?
OR
Does RabbitMQ send messages to consumer as they become available?
At message consumption endpoint, I am using com.rabbitmq.client.QueueingConsumer.
Looking at the sprint client source code, I could figure out that
QueueingConsumer keeps listening on socket for any messages the broker sends to it
Any message that is received is parsed and stored as Delivery in a LinkedBlockingQueue encapsulated inside the QueueingConsumer.
This implies that even if the message processing endpoint is busy, messages will be pushed to QueueingConsumer
Is this understanding right?
TLDR: you poll messages from RabbitMQ until the prefetch count is reached, at which point you block and only receive heartbeat frames until the fetched messages are ACKed. So you can poll, but you will only get new messages if the number of un-acked messages is less than the prefetch count. New messages are put on the QueueingConsumer, and in theory you should never have much more than the prefetch count in the QueueingConsumer's internal queue.
Details:
At a low level (I'm probably going to get some of this wrong), RabbitMQ itself doesn't actually push messages. The client has to continuously read the connection for frames, as defined by the AMQP protocol. It's hard to classify this as push or pull; just know that the client has to continuously read the connection, and because the Java client sadly uses blocking I/O (BIO), it is a blocking/polling operation. The blocking/polling is governed by AMQP heartbeat frames, regular frames, and the socket timeout configuration.
What happens in the Java RabbitMQ client is that there is a thread for each channel (or maybe it's per connection), and that thread loops gathering frames from RabbitMQ, which eventually become commands that are put in a blocking queue (I believe it's like a SynchronousQueue, aka a handoff queue, but Rabbit has its own special one).
The QueueingConsumer is a higher-level API and pulls commands off the handoff queue mentioned earlier, because commands left on the handoff queue block the channel's frame-gathering loop. This can be bad because it can cause the connection to time out. The QueueingConsumer also allows work to be done on a separate thread instead of on the frame-gathering thread mentioned earlier.
Now, if you look at most Consumer implementations, you will probably notice that they almost always use unbounded blocking queues. I'm not entirely sure why the bound on these queues can't be a multiple of the prefetch, but if it is less than the prefetch it will certainly cause problems with the connection timing out.
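The buffering behaviour can be pictured with two threads and a queue. This is only an analogy in Python, not the Java client: frame_reader is a hypothetical stand-in for the client's frame-gathering thread, the list of frames stands in for what the broker has sent (a prefetch of 5 is assumed), and queue.Queue plays the role of the unbounded blocking queue inside QueueingConsumer.

```python
import queue
import threading


def frame_reader(incoming_frames, deliveries):
    """Stand-in for the client's I/O thread: read what the broker sent and buffer it."""
    for frame in incoming_frames:
        deliveries.put(frame)


deliveries = queue.Queue()                        # unbounded, like the consumer's internal queue
frames = [f"delivery-{i}" for i in range(1, 6)]   # what the broker sent (prefetch of 5 assumed)

reader = threading.Thread(target=frame_reader, args=(frames, deliveries))
reader.start()
reader.join()

# Nothing has been processed yet, but every delivery is already buffered client-side.
buffered_before_processing = deliveries.qsize()

processed = []
while not deliveries.empty():
    processed.append(deliveries.get())            # roughly what nextDelivery() plus handling does
```

So a busy message handler does not stop deliveries from reaching the client; what limits the backlog is the prefetch count, since the broker stops sending once that many messages are unacknowledged.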
I think the best answer is the product's own. RMQ has both push and pull mechanisms defined as part of the protocol. Have a look: https://www.rabbitmq.com/tutorials/amqp-concepts.html
RabbitMQ mainly uses a push mechanism. Polling consumes server bandwidth and leaves time gaps between polls, so it cannot achieve low latency. RabbitMQ pushes messages to the client as soon as there are consumers available for the queue, so the connection is long-running. ReadFrame in RabbitMQ is basically waiting for incoming frames.

Can any of my consumers take messages from the queue?

I am developing an app and I am using ActiveMQ. Is there any way to have one producer always send messages to one broker while there are 3 consumers on the other side? Each consumer listens to the broker and can take any message from the queue. Is this possible?
I am using ActiveMQ to write my app's logs to a DB. As you know, writing logs to a DB is a time-consuming process, which is why the consumer is much slower than the producer. For example, I send 100,000 messages (huge objects). The producer finishes sending them in 20 minutes, but by the time the producer has finished, the consumer has processed only 4,000 messages.
Yes, what you are describing is possible. In fact, you can have any number of consumers listening on a single queue. The messages are dispatched in a round-robin fashion between consumers.
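As a rough picture of round-robin dispatch across three consumers, here is a small simulation. dispatch_round_robin is a hypothetical helper, not an ActiveMQ API, and it ignores things a real broker also takes into account, such as prefetch sizes and slow consumers.

```python
from itertools import cycle


def dispatch_round_robin(messages, consumers):
    """Hand each message to the next consumer in turn; each message goes to exactly one."""
    assignments = {name: [] for name in consumers}
    for message, name in zip(messages, cycle(consumers)):
        assignments[name].append(message)
    return assignments


result = dispatch_round_robin(range(1, 8), ["consumer-1", "consumer-2", "consumer-3"])
```

Since every message is handled by only one of the consumers, adding consumers spreads the slow database writes across them.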
What you should be aware of is that ActiveMQ performs much better sending small messages than large ones. If you need to send very large payloads (e.g. 100 MB), you are far better off saving the payload to a location that is accessible by both the producer and the consumers (e.g. a network file system) and sending the location instead. The consumer can then use that location to read the payload itself. This way you get a relatively small amount of traffic through the message broker.
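The "store the payload, send its location" approach could be sketched as follows. Everything here is an assumption for illustration: publish_large and consume are hypothetical helpers, a temporary directory stands in for the shared network file system, and a plain list stands in for the broker queue.

```python
import json
import tempfile
import uuid
from pathlib import Path


def publish_large(payload, shared_dir, broker_queue):
    """Write the payload to shared storage and enqueue only a small reference to it."""
    path = shared_dir / f"{uuid.uuid4().hex}.bin"
    path.write_bytes(payload)
    broker_queue.append(json.dumps({"location": str(path), "size": len(payload)}))


def consume(broker_queue):
    """Take the next reference, load the payload from shared storage, then clean up."""
    reference = json.loads(broker_queue.pop(0))
    path = Path(reference["location"])
    data = path.read_bytes()
    path.unlink()
    return data


with tempfile.TemporaryDirectory() as tmp:
    broker_queue = []
    payload = b"x" * 1_000_000                 # a "huge object" of about 1 MB
    publish_large(payload, Path(tmp), broker_queue)
    message_size = len(broker_queue[0])        # what actually travels through the broker
    received = consume(broker_queue)
```

The message that goes through the broker is just a short JSON reference, while the megabyte of data moves through the shared storage.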