I have an exchange of type topic that currently routes messages only to the queue payments.
At some point in the future, I will decide to add another queue, payment_analyze, to analyze all old and new messages that have been enqueued.
Durable exchanges and queues survive RabbitMQ restarts, and persistent messages get written to disk. But when binding a new queue to an existing durable exchange, old messages are not routed to it (only new ones are).
From my understanding, this is the intended behavior, as exchanges do not store messages and only act as a "proxy".
How do I achieve this, i.e. make old messages available to a newly bound queue as well?
Possible Solution
Create a queue named parking and route a copy of every enqueued message to it; whenever a new queue is added, consume the messages from parking without acknowledging them, to bring the new queue "semi" up to date.
Even though you've configured persistent messages on the payments queue, this just means messages will survive a broker restart; once a message has been consumed and acknowledged, it is removed.
If you know you're going to need the payment_analyze queue at some point in the future, is it viable to just create this queue/binding upfront and route messages to both payment_analyze and payments? Messages on payment_analyze will bank up until you're ready to start consuming them. Note: if you're producing a large number of messages, this approach might result in storage issues...
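For illustration, here's a minimal sketch with the Java amqp-client of declaring both queues and their bindings upfront; the exchange name payments_exchange and the routing-key pattern payments.# are assumptions, since they weren't specified:

    import com.rabbitmq.client.Channel;
    import com.rabbitmq.client.Connection;
    import com.rabbitmq.client.ConnectionFactory;

    public class DeclareUpfront {
        public static void main(String[] args) throws Exception {
            ConnectionFactory factory = new ConnectionFactory();
            factory.setHost("localhost");
            try (Connection conn = factory.newConnection();
                 Channel ch = conn.createChannel()) {
                // Durable topic exchange (the name is an assumption).
                ch.exchangeDeclare("payments_exchange", "topic", true);
                // Declare both queues now, durable so they survive restarts.
                ch.queueDeclare("payments", true, false, false, null);
                ch.queueDeclare("payment_analyze", true, false, false, null);
                // Same binding pattern on both: every message is copied to each queue.
                ch.queueBind("payments", "payments_exchange", "payments.#");
                ch.queueBind("payment_analyze", "payments_exchange", "payments.#");
            }
        }
    }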
As an alternative, you could write the messages to BLOB storage (or some other data store) as part of your payments queue consumer (or a different queue/consumer altogether), and then, when you're ready to introduce the payment_analyze queue, write a script that reads all the old messages from BLOB storage and sends them to the RabbitMQ exchange. With topic exchanges, you can probably be clever with wildcards and routing keys in your queue bindings to ensure that both old messages (from BLOB storage) and new messages are routed to the payment_analyze queue, but only new messages are routed to the payments queue (so that your payments queue consumer is not reprocessing old messages).
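A sketch of what such a binding scheme could look like, reusing the channel ch from the previous snippet; the routing keys payments.live and payments.replay are hypothetical:

    // Live messages are published with routing key "payments.live";
    // the replay script publishes old messages with "payments.replay".
    ch.queueBind("payments", "payments_exchange", "payments.live");
    // "payments.*" matches both live and replayed messages, so
    // payment_analyze receives old and new messages alike.
    ch.queueBind("payment_analyze", "payments_exchange", "payments.*");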
Another option (assuming you're not overly invested in RabbitMQ) could be to consider Apache Kafka instead which deals with this scenario quite nicely as messages aren't automatically removed from a partition once they've been processed by a subscriber.
Anyways, just a few options to consider...
Related
Requirement
A system undergoes some state change, and multiple other parts of the system (let's call them observers) have to know about it so that they can perform actions based on the current state. The observers' actions are important: if some of the observers are not online (not listening currently due to some trouble, but will be back soon), the message should not be discarded until all the observers have received it.
Trying to accomplish this with a pub/sub model, here are my findings (please correct me if this understanding is wrong):
The publisher creates an event on a specific topic, and multiple subscribers can consume the same message. This model either provides no delivery guarantee (in Redis), or guarantees delivery only once (with message queues), i.e. when one of the consumers acknowledges a message, the message is discarded (RabbitMQ).
Example
A new Person Profile entity gets created in DB
Now,
A background verification service has to know this to trigger the verification process.
Subscriptions service has to know this to add default subscriptions to the user.
Now both the tasks are important, unrelated and can run in parallel.
Now, in the queue model, if the subscription service is down for some reason and the BG verification process acknowledges the message, the message will be removed from the queue; and if it is fire-and-forget like most pub/sub, delivery is not guaranteed for either service anyway.
One more point: both tasks are unrelated and need not be triggered one after the other.
In short, I need to make sure all the consumers get the same message and can acknowledge it individually; the message should be evicted only after all the consumers have acknowledged it. Neither of the above approaches does this.
Am I missing anything here? How should I approach this problem?
This scenario is explicitly supported by RabbitMQ's model, which separates "exchanges" from "queues":
A publisher always sends a message to an "exchange", which is just a stateless routing address; it doesn't need to know what queue(s) the message should end up in
A consumer always reads messages from a "queue", which contains its own copy of messages, regardless of where they originated
Multiple consumers can subscribe to the same queue, and each message will be delivered to exactly one consumer
Crucially, an exchange can route the same message to multiple queues, and each will receive a copy of the message
The key thing to understand here is that while we talk about consumers "subscribing" to a queue, the "subscription" part of a "pub-sub" setup is actually the routing from the exchange to the queue.
So a RabbitMQ pub-sub system might look like this:
A new Person Profile entity gets created in DB
This event is published as a message to an "events" topic exchange with a routing key of "entity.profile.created"
The exchange routes copies of the message to multiple queues:
A "verification_service" queue has been bound to this exchange to receive a copy of all messages matching "entity.profile.#"
A "subscription_setup_service" queue has been bound to this exchange to receive a copy of all messages matching "entity.profile.created"
The consuming scripts don't know anything about this routing; they just know that messages will appear in the queue for events that are relevant to them:
The verification service picks up the copy of the message on the "verification_service" queue, processes, and acknowledges it
The subscription setup service picks up the copy of the message on the "subscription_setup_service" queue, processes, and acknowledges it
If there are multiple consuming scripts looking at the same queue, they'll share the messages on that queue between them, but each queue remains completely independent of any other queue.
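To make that concrete, here's a minimal sketch of this topology using the Java amqp-client (connection details and the message body are placeholders; the exchange, queue, and routing-key names come from the scenario above):

    import java.nio.charset.StandardCharsets;

    import com.rabbitmq.client.Channel;
    import com.rabbitmq.client.Connection;
    import com.rabbitmq.client.ConnectionFactory;

    public class PubSubTopology {
        public static void main(String[] args) throws Exception {
            ConnectionFactory factory = new ConnectionFactory();
            factory.setHost("localhost");
            try (Connection conn = factory.newConnection();
                 Channel ch = conn.createChannel()) {
                // The stateless routing address: a topic exchange named "events".
                ch.exchangeDeclare("events", "topic", true);

                // Each consumer gets its own queue with its own copy of the messages.
                ch.queueDeclare("verification_service", true, false, false, null);
                ch.queueDeclare("subscription_setup_service", true, false, false, null);

                // The "subscription" is really the binding from exchange to queue.
                ch.queueBind("verification_service", "events", "entity.profile.#");
                ch.queueBind("subscription_setup_service", "events", "entity.profile.created");

                // Publishing targets the exchange only; the broker copies the message
                // to every queue whose binding matches the routing key.
                ch.basicPublish("events", "entity.profile.created", null,
                        "profile 42 created".getBytes(StandardCharsets.UTF_8));
            }
        }
    }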
[Screenshot from an interactive visualisation tool showing this scenario.]
As you mentioned, this is not something that you can control with the Redis Pub/Sub data structure.
But you can do it easily with Redis Streams.
Streams allow you to post messages using the XADD command and then control which consumers are dealing with the message and acknowledge that the message has been processed.
You can look at these sample applications that provide (in Java) examples of:
posting and consuming messages
creating multiple consumer groups
managing exceptions
Links:
Getting Started with Redis Streams and Java
Redis Streams in Action (a project that shows how to use XADD/XACK/XPENDING/XCLAIM and build an error-proof streaming application with Redis Streams and Spring Data)
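As a minimal sketch of the XADD / XREADGROUP / XACK flow, here's roughly what it looks like with the Jedis client (the Jedis 4.x API is assumed; the stream, group, and consumer names are made up):

    import java.util.List;
    import java.util.Map;

    import redis.clients.jedis.Jedis;
    import redis.clients.jedis.StreamEntryID;
    import redis.clients.jedis.params.XAddParams;
    import redis.clients.jedis.params.XReadGroupParams;
    import redis.clients.jedis.resps.StreamEntry;

    public class StreamsSketch {
        public static void main(String[] args) {
            try (Jedis jedis = new Jedis("localhost", 6379)) {
                // XADD: append an event to the stream.
                jedis.xadd("profile-events", XAddParams.xAddParams(),
                        Map.of("type", "profile.created", "id", "42"));

                // One consumer group per observer; each group gets every message.
                jedis.xgroupCreate("profile-events", "verification", new StreamEntryID(0, 0), true);
                jedis.xgroupCreate("profile-events", "subscriptions", new StreamEntryID(0, 0), true);

                // XREADGROUP with ">" delivers messages not yet seen by this group.
                List<Map.Entry<String, List<StreamEntry>>> batch = jedis.xreadGroup(
                        "verification", "worker-1",
                        XReadGroupParams.xReadGroupParams().count(10),
                        Map.of("profile-events", StreamEntryID.UNRECEIVED_ENTRY));

                // XACK: acknowledgements are per group, so the "subscriptions"
                // group still sees the message after "verification" has acked it.
                for (Map.Entry<String, List<StreamEntry>> stream : batch) {
                    for (StreamEntry entry : stream.getValue()) {
                        jedis.xack("profile-events", "verification", entry.getID());
                    }
                }
            }
        }
    }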
Using RabbitMQ 3.7.16, with spring-amqp 2.2.3.RELEASE.
Multiple clients publish messages to the DataExchange topic exchange in our RabbitMQ server, using a unique routing key. In the absence of any bindings, the exchange routes all the messages to data.queue.generic through the AE (alternate exchange).
When a certain client (client ID 1 and 2 in the diagram) publishes lots of messages, in order to scale the consumption of their messages independently from other clients, we start consumers and assign them to handle only their client ID. To achieve this, each client consumer defines a new queue and binds it to the topic exchange with the routing key events.<clientID>.
So scaling up is covered and works well.
Now, when the message rate for this client goes down, we would like to also scale down its consumers, up to the point of removing all of them. The intention is to then have all those messages routed to the GenericExchange, where there's a pool of generic consumers taking care of them.
The problem is that if I delete data.queue.2 (in order to remove its binding, which will lead to new messages being routed to the GenericExchange), all its pending messages will be lost.
[Simplified architecture view diagram.]
It would be an acceptable solution to let the messages expire with a TTL in the client queue and then dead-letter them to the generic exchange, but then I also need to stop the topic exchange from routing new messages to this "dying" queue.
So what options do I have to stop the topic exchange from routing messages to the client queue, now that there's no consumer connected to it?
Or, to explore another path: how do I dead-letter messages from a deleted/expired queue?
If the client queue is the only one with a matching binding, as your explanation seems to suggest, you can just remove the binding between the exchange and the queue.
From then on, all new messages for the client will go through the alternate exchange, your "generic exchange", to be processed by your generic consumers.
As for the messages left over in the client queue, you could use a shovel to send them back to the topic exchange, for them to be routed to the generic exchange.
This is based on the assumption that the alternate exchange is internal. If it's not internal, you can target it directly with the shovel.
As discussed with Bogdan, another option to resolve this while ensuring no message loss is to perform multiple steps:
remove the binding between the specific queue and the exchange
have some logic to have the remaining messages be either consumed or rerouted to the generic queue
if the binding removal occurs before the consumer(s) disconnect, have the last consumer disconnect only once the queue is empty
if the binding removal occurs after the last consumer disconnects, then set a TTL on messages, with the generic exchange as the dead-letter exchange (see the sketch after this list)
depending on the options selected above, have some cleanup mechanism to remove the lingering empty queues
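Putting the unbind and TTL/dead-letter pieces together, a rough sketch with the Java amqp-client, using the names from the question (the 60-second TTL is an arbitrary assumption, and ch is an open Channel as usual):

    // At creation time: give the per-client queue a message TTL and a
    // dead-letter exchange, so leftover messages get re-routed on expiry.
    Map<String, Object> queueArgs = new HashMap<>();
    queueArgs.put("x-message-ttl", 60000);
    queueArgs.put("x-dead-letter-exchange", "GenericExchange");
    ch.queueDeclare("data.queue.2", true, false, false, queueArgs);

    // At scale-down time: remove the binding so new messages for client 2
    // fall through to the alternate exchange instead of this queue.
    ch.queueUnbind("data.queue.2", "DataExchange", "events.2");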
I have been looking at message queues (currently deciding between Kafka and RabbitMQ) for one of my projects, where the following are the biggest must-have features.
Must-have features
Messages in queues should be persistent (but only until they are processed successfully by consumers).
Messages in queues should be removed only when downstream consumers were able to process them successfully. Basically, a consumer should ACK that it processed a message successfully.
Good-to-have features
To increase throughput, consumers should be able to pull a batch of messages from the queue.
If you go with Kafka, it will only retain messages for a configurable duration of time, after which the messages will be discarded to free up space, no matter whether they have been consumed or not.
It is simply the responsibility of the Kafka consumers to keep track of what has been consumed.
IMHO, if you need to keep the messages persisted forever, then consider using a different storage medium (a database, maybe).
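For what it's worth, the closest Kafka gets to an ACK is committing offsets manually after successful processing; here's a rough sketch with the Java client (the topic and group names are made up):

    import java.time.Duration;
    import java.util.List;
    import java.util.Properties;

    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;

    public class AckStyleConsumer {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092");
            props.put("group.id", "payments-processor"); // hypothetical group
            props.put("enable.auto.commit", "false");    // commit only after successful processing
            props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
            props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
                consumer.subscribe(List.of("payments")); // hypothetical topic
                while (true) {
                    // Batch pull, which also covers the throughput requirement.
                    ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                    for (ConsumerRecord<String, String> record : records) {
                        process(record); // your business logic
                    }
                    // The closest equivalent of an ACK: advance the committed
                    // offset only once the whole batch has been processed.
                    consumer.commitSync();
                }
            }
        }

        private static void process(ConsumerRecord<String, String> record) {
            System.out.println(record.value());
        }
    }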
I have a RabbitMQ cluster used as a work queue. There are 5 kinds of consumers that want to consume exactly the same data.
What I know for now is to use a fanout exchange to "copy" the data to 5 DIFFERENT queues, and the 5 consumers each consume a different queue. This feels like a waste of resources because the data is the same in all five queues.
My question is: does RabbitMQ support pushing the same data to multiple consumers, such that a message needs to be acked a specified number of times before it is deleted?
I got the following answer from rabbitmq email group. In short, the answer is no... and what I did above is the correct way.
http://rabbitmq.1065348.n5.nabble.com/Does-rabbitmq-support-to-push-the-same-data-to-multi-consumers-td36169.html#a36170
... fanout exchange to "copy" the data to 5 DIFFERENT queues, and the 5 consumers each consume a different queue. This feels like a waste of resources because the data is the same in all five queues.
You can consume with 5 consumers from one queue if you do not want to duplicate messages.
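For reference, the fanout setup described above is only a few lines with the Java amqp-client (the exchange and queue names are made up, and ch is an open Channel):

    // Fanout ignores routing keys: every bound queue gets its own copy.
    ch.exchangeDeclare("broadcast", "fanout", true);
    for (int i = 1; i <= 5; i++) {
        String queue = "consumer." + i;
        ch.queueDeclare(queue, true, false, false, null);
        ch.queueBind(queue, "broadcast", "");
    }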
does rabbitmq support to push the same data to multiple consumers
In AMQP protocol terms, you publish a message to an exchange, and then the broker (RabbitMQ) decides what to do with it: it figures out which queue(s) the message is intended for and then appends the message to those queues (queues in RabbitMQ are classic FIFO queues, which somewhat deviates from the AMQP model). Only after that may the message be delivered to a consumer (or die due to a queue length limit or a per-queue or per-message TTL, if any).
message need to be acked for a specified times to be deleted
There is no way to change a message's body or attributes after the message has been published (actually, the Dead Letter Exchanges extension and some others may change the routing key, for example, and add, remove, or change some headers, but these are very specific cases). So if you want to track the number of acks, you have to re-publish the consumed message with a changed body or headers (it depends on where you plan to store the ack counter, but headers fit pretty nicely for this).
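A rough sketch of that re-publish approach with the Java amqp-client; the queue name work, the header x-ack-count, and the threshold of 5 are all made up for illustration:

    import java.util.HashMap;
    import java.util.Map;

    import com.rabbitmq.client.AMQP;
    import com.rabbitmq.client.Channel;
    import com.rabbitmq.client.Connection;
    import com.rabbitmq.client.ConnectionFactory;
    import com.rabbitmq.client.GetResponse;

    public class AckCounter {
        public static void main(String[] args) throws Exception {
            ConnectionFactory factory = new ConnectionFactory();
            factory.setHost("localhost");
            try (Connection conn = factory.newConnection();
                 Channel ch = conn.createChannel()) {
                GetResponse response = ch.basicGet("work", false); // hypothetical queue
                if (response == null) return;

                Map<String, Object> headers = new HashMap<>();
                if (response.getProps().getHeaders() != null) {
                    headers.putAll(response.getProps().getHeaders());
                }
                // Store the counter in a header, as suggested above.
                int acks = headers.get("x-ack-count") == null
                        ? 0 : ((Number) headers.get("x-ack-count")).intValue();
                headers.put("x-ack-count", acks + 1);

                if (acks + 1 < 5) {
                    // Not yet acked often enough: re-publish with the bumped counter...
                    AMQP.BasicProperties props = new AMQP.BasicProperties.Builder()
                            .headers(headers).build();
                    ch.basicPublish("", "work", props, response.getBody());
                }
                // ...and ack the original copy either way, so it is removed.
                ch.basicAck(response.getEnvelope().getDeliveryTag(), false);
            }
        }
    }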
Also note that there is a redelivered message attribute which denotes whether a message was already consumed but then redelivered. This flag doesn't count the number of redeliveries, so its usage is quite limited.
I am using ActiveMQ 5.4 with KahaDB as message store.
While publishing messages (with persistence true) to a topic which has a durable subscriber, the persistence store keeps growing even though the messages are dispatched to the subscriber. This is causing an issue, as the message store is getting full and not accepting any more messages.
So my question is: why is the persistence store not discarding the messages in KahaDB, even though the messages are getting dispatched?
What you are seeing is an interaction between the ActiveMQ message store behaviour and that for durable subscriptions on topics.
When you have durable subscriptions, a topic is treated like a queue for each subscriber's clientId (set on the Connection). The logic is that the client doesn't want to miss any messages while disconnected, so if they disconnect, the durable subscription hangs around and keeps the messages alive.
The AMQ message store uses data logs for its message journal. These are written sequentially, and entries are never removed from them in place (that would require random access). There is a second file which keeps track of which messages have been consumed. Once all the messages in a data file have been consumed, that file is deleted.
So what you're seeing is that some of the messages in the data files are not being consumed by these durable subscriptions and just hang around. Inconsistent use of clientIds for durable subscribers would cause this issue. It's likely that there is something wrong with the way the feature is being used; if you use JMX to inspect the subscriptions on the broker, that should help you track down the root cause.
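For comparison, a correctly used durable subscription looks something like this in JMS (the broker URL, clientID, topic, and subscription names are placeholders; the important part is that the clientID stays stable across reconnects):

    import javax.jms.Connection;
    import javax.jms.MessageConsumer;
    import javax.jms.Session;
    import javax.jms.Topic;

    import org.apache.activemq.ActiveMQConnectionFactory;

    public class DurableSub {
        public static void main(String[] args) throws Exception {
            ActiveMQConnectionFactory factory =
                    new ActiveMQConnectionFactory("tcp://localhost:61616");
            Connection connection = factory.createConnection();
            // The clientID must be set before starting, and must be the SAME
            // every time this subscriber reconnects; otherwise the broker keeps
            // a separate durable subscription (and its messages) per clientID.
            connection.setClientID("billing-service");
            connection.start();

            Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
            Topic topic = session.createTopic("events");
            // The subscription name identifies the durable subscription itself.
            MessageConsumer consumer =
                    session.createDurableSubscriber(topic, "billing-events");
            // Consume as usual; messages published while offline are kept for us.
        }
    }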
As a general rule, whenever you think that you might want to use a durable subscription, use virtual topics instead - they are much easier to reason about, inspect and load balance. On the other hand if you just want to get the last couple of messages when you reconnect a topic subscriber rather than all the messages you may have missed, use retroactive consumers.
An easy way to get around this issue is to always use a time-to-live when you send a message; pretty much every use case has a time limit by which a message ought to be consumed anyway. ActiveMQ will expire messages beyond this point and free up the messages in the data files for deletion.
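A sketch of a publisher doing exactly that (the one-hour TTL is an assumed limit; pick whatever fits your use case):

    import javax.jms.Connection;
    import javax.jms.MessageProducer;
    import javax.jms.Session;
    import javax.jms.TextMessage;
    import javax.jms.Topic;

    import org.apache.activemq.ActiveMQConnectionFactory;

    public class ExpiringPublisher {
        public static void main(String[] args) throws Exception {
            ActiveMQConnectionFactory factory =
                    new ActiveMQConnectionFactory("tcp://localhost:61616");
            Connection connection = factory.createConnection();
            connection.start();

            Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
            Topic topic = session.createTopic("events"); // hypothetical topic name
            MessageProducer producer = session.createProducer(topic);

            // Expire messages an hour after sending. Expired messages free up
            // their data files for deletion.
            producer.setTimeToLive(60 * 60 * 1000L);

            TextMessage message = session.createTextMessage("state changed");
            producer.send(message);

            connection.close();
        }
    }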