We have a weird issue.
There are two equivalent topics: topicA and topicB.
The topics are created dynamically. Consumers subscribe to VirtualTopic.topicA and VirtualTopic.topicB correspondingly.
Consumers get the messages. We see equal number of arrived and dequeued messages in each virtual topic.
However, topicA says it has active subscribers, whereas topicB does not. All messages in topicA are dequeued. topicB shows the same number of messages as VirtualTopic.topicB contains, but none of the message is dequeued in topicB.
It looks like ActiveMQ just silently copies messages from regular to virtual topic. But it is not clear why it does it only for one of the topics.
Code which subscribes consumers is the same. Code that produces messages is the same. Configuration differs only in name of topics.
Can anyone give a hint what can define that difference in behaviour?
Related
I'm building a software solution which creates JMS topics per new category of something. The topic is created when the first round of data is integrated and must be comunicated.
Durable subscriptions to that topic are created by consumers, but only some time after the category and first data are created. All the data belonging to the category is sent as messages to the consumers, so that they are updated too.
Between the moment when the category is created, and when the durable subscriptions are created, it would be better if the messages are discarded. The consumer first does an initial sync of the existing data, then created the durable subscription and listens for create/update messages.
One option would be to let the consumers create the topic when registering the first durable subscription. In the meantime, if data is added to the category, it is not sent by the produces, thus not creating the topic too.
Another option would be to discard the messages if no consumers exist. I'm not talking about active consumers, I'm talking about no consumers at all. Any idea if this can be implemented? Since there are no durable/non-durable subscriptions for the topic, I was expecting that the messages would be discarded automatically, but I was wrong.
Which option would you choose?
If you look at the image below you will see a topic which never had subscribers with 4498 messages enqueued. Am I interpreting this information in a wrong manner?
Messages sent to a topic when no subscriptions exist (whether durable or not) should be discarded. That's the expected behavior.
The "Messages Enqueued" metric visible on the web console does not mean what you think it means. This metric simply indicates the total number of messages sent to the topic since the last restart. It doesn't indicate how many messages have been retained in subscriptions on that topic (if any).
While looking at the Pub/Sub pattern, i came across the fellowing scenario:
Assume that you have a horizontally scaled app, that has X instances. All of them subscribe to a topic where messages like "Transfer $10 from account A to account B". When someone publish a message to that topic, all subscriber will get that message?
In the case above, clearly, the message should be taken by only 1 subscriber and handled only once.
How does one handle this scenario? Do you abandon the pub/sub and starts pooling?
Let me explain few things with example before you understand that completely. I have worked on Azure service bus so i will explain in that context.
In Pub/sub you have one topic and possible multiple subscription. Lets say we have topic "Shopping-Topic". We have 2 Subscriptions called "Payment-Subscription", "Cart-Subscription". Now we publish message "Payment-processed" on the topic. It's the discretion of subscription to pick that message and reason is that subscription have to mention that which messages it want pick.
In Azure service bus we have something called rule (message label). Default rule is that subscription is listening to all the messages but we can overwrite this behavior and say i am only interested in particular message. In the above case rule added against "Payment-Subscription" to listen the message "Payment-processed" so the message is added to "Payment-Subscription" subscription for it to process. Even though "Cart-Subscription" is also subscribed to the same topic but it is ignoring this message so it's not added to its subscription. This way any intended subscription can listen to particular message not necessarily all of them.
Now we discuss individual subscription. Let's say we have message added to "Payment-Subscription". This subscription has 2 instances/processes that are ready to process the message "Payment-processed". The first process to pick the message will process the message and remove it from subscription.
In RabbitMQ Normally, active consumers connected to the same queue receive messages from it in a round-robin fashion. So this insures that a message is processed exactly once.
So in your case you should design a queue where all the messages for
"Transfer $10 from account A to account B"
Are routed to and all the consumers register themselves on this queue itself , this insures that one message will go to only one subscriber.
Another point not related to your question but is important to know is that there is another concept called "Consumer Priorities" which allows you to ensure that high priority consumers receive messages while they are active, with messages only going to lower priority consumers when the high priority consumers block.
More info can be found here
One of the main characteristics of a message queue service, RabbitMQ included, is preserving message publication order. This is confirmed in the RabbitMQ documentation:
[QUOTE 1] Section 4.7 of the AMQP 0-9-1 core specification explains the
conditions under which ordering is guaranteed: messages published in
one channel, passing through one exchange and one queue and one
outgoing channel will be received in the same order that they were
sent. RabbitMQ offers stronger guarantees since release 2.7.0.
Let's assume in the following that there are no consumers active, to simplify things. We are publishing over one single channel.
So far, so good.
RabbitMQ also provides possibility to inform the publisher that a certain publication has been completely and correctly processed [*]. This is explained here. Basically, the broker will either send a basic.ack or basic.nack message. The documentation also says this:
[QUOTE 2] basic.ack for a persistent message routed to a durable queue will be
sent after persisting the message to disk.
In most cases, RabbitMQ will acknowledge messages to publishers in the
same order they were published (this applies for messages published on
a single channel). However, publisher acknowledgements are emitted
asynchronously and can confirm a single message or a group of
messages. The exact moment when a confirm is emitted depends on the
delivery mode of a message (persistent vs. transient) and the
properties of the queue(s) the message was routed to (see above).
Which is to say that different messages can be considered ready for
acknowledgement at different times. This means that acknowledgements
can arrive in a different order compared to their respective messages.
Applications should not depend on the order of acknowledgements when
possible.
At first glance, this makes sense: persisting a message takes much more time than just storing it in memory, so it's perfectly possibly that the acknowledgment of a later transient message will arrive before the acknowledgement of an earlier persistent message.
But, if we re-read the first quote regarding message order [QUOTE 1] here above, it gets confusing. I'll explain. Assume we are sending two messages to the same exchange: first a persistent and then a transient message. Since RabbitMQ claims to preserve message order, how can it send an acknowledgment of the second/transient message before it knows that the first/persistent message is indeed completely written to disk?
In other words, does the remark regarding illogical acknowledgement order [QUOTE 2] here above only apply in case the two messages are each routed to completely different target queue(s) (which might happen if they have different routing keys, for example)? In that case, we don't have to guarantee anything as done in [QUOTE 1].
[*] In most cases, this means 'queued'. However, if there are no routing rules applicable, it cannot be enqueued in a target queue. However, this is still a positive outcome regarding publication confirmation.
update
I read this answer on a similar question. This basically says that there are no guarantees whatsoever. Even the most naive implementation, where we delay the publication of message 2 to the point after we got an acknowledgment of message 1, might not result in the desired message order. Basically, [QUOTE 1] is not met.
Is this correct?
From this response on rabbitmq-users:
RabbitMQ knows message position in a queue regardless of whether it is transient or not.
My guess (I did not write that part of the docs) the ack ordering section primarily tries to communicate that if two messages are routed to two different queues, those queues will handle/replicate/persist them concurrently. Reasoning about ordering in more than one queue is pretty hard. A message can go into more than one queue as well.
Nonetheless, RabbitMQ queues know what position a message has in what queues. Once all routing/delivery acknowledgements are received by a channel that handled the publish, it is added to the list of acknowledgements to send out. Note that that
list may or may not be ordered the same way as the original publishes and worrying about that is not practical for many reasons, most importantly: the user typically primarily cares about the ordering in the queues.
NOTE: the RabbitMQ team monitors the rabbitmq-users mailing list and only sometimes answers questions on StackOverflow.
I know that achieving round-robin behaviour in a topic exchange can be tricky or impossible so my question in fact is if there is anything I can make out of RabbitMQ or look away to other message queues that support that.
Here's a detailed explanation of my application requirements:
There will be one producer, let's call it P
There (potentially) will be thousands of consumers, let's call them Cn
Each consumer can "subscribe" to 1 or more topic exchange and multiple consumers can be subscribed to the same topic
Every message published into the topic should be consumed by only ONE consumer
Use case #1
Assume:
Topics
foo.bar
foo.baz
Consumers
Consumer C1 is subscribed to topic #
Consumer C2 is subscribed to topic foo.*
Consumer C3 is subscribed to topic *.bar
Producer P publishes the following messages:
publish foo.qux: C1 and C2 can potentially consume this message but only one receives it
publish foo.bar: C1, C2 and C3 can potentially consume this message but only one receives it
Note
Unfortunately I can't have a separate queue for each "topic" therefore using the Direct Exchange doesn't work since the number of topic combinations can be huge (tens of thousands)
From what I've read, there is no out-of-the box solution with RabbitMQ. Does anybody know a workaround or there's another message queue solution that would support this, ex. Kafka, Kinesis etc.
Thank you
There appears to be a conflation of the role of the exchange, which is to route messages, and the queue, which is to provide a holding place for messages waiting to be processed. Funneling messages into one or more queues is the job of the exchange, while funneling messages from the queue into multiple consumers is the job of the queue. Round robin only comes into play for the latter.
Fundamentally, a topic exchange operates by duplicating messages, one for each queue matching the topic published with the message. Therefore, any expectation of round-robin behavior would be a mistake, as it goes against the very definition of the topic exchange.
All this does is to establish that, by definition, the scenario presented in the question does not make sense. That does not mean the desired behavior is impossible, but the terms and topology may need some clarifying adjustments.
Let's take a step back and look at the described lifetime for one message: It is produced by exactly one producer and consumed by one of many consumers. Ordinarily, that is the scenario addressed by a direct exchange. The complicating factor in this is that your consumers are selective about what types of messages they will consume (or, to put it another way, your producer is not consistent about what types of messages it produces).
Ordinarily in message-oriented processing, a single message type corresponds to a single consumer type. Therefore, each different type of message would get its own corresponding queue. However, based on the description given in this question, a single message type might correspond to multiple different consumer types. One issue I have is the following statement:
Unfortunately I can't have a separate queue for each "topic"
On its face, that statement makes no sense, because what it really says is that you have arbitrarily many (in fact, an unknown number of) message types; if that were the case, then how would you be able to write code to process them?
So, ignoring that statement for a bit, we are led to two possibilities with RabbitMQ out of the box:
Use a direct exchange and publish your messages using the type of message as a routing key. Then, have your various consumers subscribe to only the message types that they can process. This is the most common message processing pattern.
Use a topic exchange, as you have, and come up with some sort of external de-duplication logic (perhaps memcached), where messages are checked against it and discarded if another consumer has started to process it.
Now, neither of these deals explicitly with the round-robin requirement. Since it was not explained why or how this was important, it is assumed that it can be ignored. If not, further definition of the problem space is required.
We're seeing an issue where consumers of our message queues are picking up messages from queues at the top of the alphabetical range. We have two applications: a producer, and a subscriber. We're using RabbitMQ 3.6.1.
Let's say that the message queues are setup like so:
Our first application, the producer, puts say 100 messages/second onto each queue:
Our second application, the subscriber, has five unique consumer methods that can deal with messages on each respective queue. Each method binds to it's respective queue. A subscriber has a prefetch of 1 meaning it can only hold one message at a time, regardless of queue. We may run numerous instances of the subscriber like so:
So the situation is thus: each queue is receiving 100 msg/sec, and we have four instances of subscriber consuming these messages, so each queue has four consumers. Let's say that the consumer methods can deal with 25 msg/sec each.
What happens is that instead of all the queues being consumed equally, the alphabetically higher queues instead get priority. It's seems as though when the subscriber becomes ready, RabbitMQ looks down the list of queues that this particular ready channel is bound to, and picks the first queue with pending messages.
In our situation, A_QUEUE will have every message consumed. B_QUEUE may have some consumed in certain race conditions, but C_QUEUE/D_QUEUE and especially E_QUEUE will rarely get touched.
If we turn off the publisher, the queues will eventually drain, top to bottom.
Is it possible to configure either RabbitMQ itself or possibly even the channel to use some sort of round robin distribution policy or maybe even random policy so that when a channel has numerous bound queues, all with messages pending, the distribution is even?
to clarify: you have a single subscriber application with multiple consumers in it, right?
I'm guessing you're using a single RabbitMQ Connection within the subscriber app.
Are you also re-using a single RabbitMQ Channel for all of your consumers? If so, that would be a problem. Be sure to use a new Channel for each consumer you start.
Maybe the picture is wrong, but if it's not then your setup is wrong. You don't need 4 queues if you are going to have subscribers that listen to each and every queue. You'd just need one queue, that has multiple instances of the same subscriber consuming from it.
Now to answer, yes (but no need to configure, as long as prefetch is 1), actually rabbitmq does distribute messages evenly. You can find about about that here, and on the same place actually how your setup should look like. Here is a quote from the link.
RabbitMQ just dispatches a message when the message enters the queue.
It doesn't look at the number of unacknowledged messages for a
consumer. It just blindly dispatches every n-th message to the n-th
consumer.