Pub/sub with durable messages with Rebus - servicebus

I need a way to publish messages to an unknown number of subscribers. The messages should be durable/persisted and categorized into three priorities (high, medium and low). One of the subscribers can only handle a limited load, and some messages are simply more important, so high-priority messages should be processed first, and so on.
How do I do that with Rebus? I guess I need three queues per subscriber?
Where can I find a publish/subscribe example with durable queues and MSMQ?

First, some info: Rebus likes to work with durable queues, durable messaging, and guaranteed delivery. In fact, unless you actively do stuff to opt out, that's the way everything works. So if you manage to make pub/sub work with Rebus, it's durable :)
Publishing by definition works with an "unknown number of subscribers" - at least that's a bus concern, and not an application concern.
In reality, subscribers initiate pub/sub conversation by issuing a SubscriptionMessage (which can be seen as a subscription request), which is then followed by the publisher publishing some number of events (which can be seen as "subscription replies"). The "bus part" of the publisher keeps track of who subscribed to any given event type.
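To make that bookkeeping concrete, here is a minimal Python sketch of the idea. It is a toy in-memory model and not Rebus code: `ToyBus`, `subscribe` and `publish` are made-up names, and a real bus would keep subscriptions and messages in durable storage rather than in dictionaries.

```python
from collections import defaultdict


class ToyBus:
    """In-memory stand-in for the "bus part" of a publisher (hypothetical)."""

    def __init__(self):
        self.queues = defaultdict(list)      # queue name -> delivered messages
        self.subscribers = defaultdict(set)  # event type -> subscriber queue names

    def subscribe(self, event_type, queue_name):
        # Plays the role of handling a subscription request.
        self.subscribers[event_type].add(queue_name)

    def publish(self, event_type, payload):
        # The application does not know who subscribed; the bus looks it up.
        targets = sorted(self.subscribers[event_type])
        for queue_name in targets:
            self.queues[queue_name].append((event_type, payload))
        return len(targets)
```

The publishing code only ever calls `publish`; the number of subscribers (possibly zero) is purely the bus's concern.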
So far, so good.
Regarding priorities, there's no out-of-the-box way to achieve that with Rebus. One way to ensure a maximum latency on certain message types is, as you're suggesting, by making separate endpoints whose input queues will not be clogged by low priority messages.
But there is some stuff around how Rebus is configured that strongly suggests having only one single input queue in each process, so that would probably imply that you should create separate processes that subscribe to those high priority message types.
I know that MSMQ supports some kind of priority on messages, so I guess it could be supported by having MsmqMessageQueue understand certain headers (similar to how express delivery and time-to-be-received are implemented - see here) - pull requests are happily accepted and strongly encouraged :)
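The effect of keeping high-priority traffic out of the low-priority backlog can be illustrated with a small Python sketch. This is not Rebus API: `PrioritizedInbox` is a hypothetical name, and the three in-memory deques merely stand in for three separate input queues (which, with Rebus, would probably each belong to their own process).

```python
from collections import deque

# Highest priority first; the order of this tuple drives the draining order.
PRIORITIES = ("high", "medium", "low")


class PrioritizedInbox:
    """One queue per priority; the consumer always drains the highest first."""

    def __init__(self):
        self.queues = {priority: deque() for priority in PRIORITIES}

    def enqueue(self, priority, message):
        self.queues[priority].append(message)

    def next_message(self):
        # A backlog of "low" messages can never delay a "high" message.
        for priority in PRIORITIES:
            if self.queues[priority]:
                return self.queues[priority].popleft()
        return None  # all queues are empty
```

Within one priority the order stays first-in, first-out; only the choice of which queue to read from is prioritized.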

Related

RabbitMQ - Reprioritize message already in queue

We are building Spark-based jobs. Processing each message delivered by the queue takes time. We need to be able to reprioritize a message that has already been sent to the queue.
I am aware that a priority queue implementation is available, but I am not sure how to reprioritize a message that is already in the queue.
One bad workaround is to push that message again with a higher priority, so that it is handled sooner, and later drop the message with the same content that had low or no priority when its turn comes.
Is there a natural way to handle this situation, or is there another queue that supports this scenario better?
Unfortunately there isn't. Queues are to be considered as lists of messages in flight. It is not possible to delete/update them.
Your approach of submitting a higher priority message is the only feasible solution.
RabbitMQ is a messaging system (like the postal one); it is not a database or a storage service. Storage in the form of queues is a necessary feature, just as the postal service needs storage for postcards in transit. It is optimized for that purpose and does not make it easy to access individual messages.
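The republish-and-drop approach can be sketched in plain Python. This is a simulation with a heap, not RabbitMQ client code: `ToyPriorityQueue` and `consume_all` are hypothetical names, and the sketch assumes every message carries an ID that the consumer can use to recognize the stale copy.

```python
import heapq
import itertools


class ToyPriorityQueue:
    """In-memory stand-in for a priority queue; a larger number means more urgent."""

    def __init__(self):
        self._heap = []
        self._seq = itertools.count()  # keeps FIFO order within one priority

    def publish(self, message_id, body, priority=0):
        heapq.heappush(self._heap, (-priority, next(self._seq), message_id, body))

    def pop(self):
        if not self._heap:
            return None
        _, _, message_id, body = heapq.heappop(self._heap)
        return message_id, body


def consume_all(queue):
    """Handle each message ID once and skip copies that were already handled."""
    handled = set()
    order = []
    while (item := queue.pop()) is not None:
        message_id, _body = item
        if message_id in handled:
            continue  # stale low-priority copy of a republished message
        handled.add(message_id)
        order.append(message_id)
    return order
```

Nothing is ever removed from the queue; the original copy is still delivered later and the consumer simply ignores it.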

RabbitMQ direct exchange, with routing key and no queues or subscribers, is this ok for performance?

I have an exchange that's going to receive roughly 50 messages per second. These messages have a unique identifier which relates to each unit in the field. This unique identifier will be the routing key. Every now and again we need to debug or analyse a unit. At that point in time we will spin up a queue, with the correct routing key, and bind it to the exchange. This way, that queue will start receiving the messages for that unit and any consumers monitoring that queue, will then receive the messages.
What this does mean is that 99% of the time, the exchange will have no queues and no routing key. Then, every now and again, a queue and routing key will be created and bound.
It feels kind of wasteful to be sending 50 messages per second to an exchange when it's just going to immediately discard them. That said, it feels like this is how RabbitMQ exchanges are supposed to be used. I guess from a developer perspective I feel like this is wasteful, but my understanding of RabbitMQ also says that this is the correct way to do it.
Is there any overhead to doing this? Any performance concerns I should have? or maybe I am approaching this entirely wrong?
I did try to search before asking but nothing really describes a scenario where an exchange has no queue or routing key, but is still receiving messages.
This is basically how RabbitMQ works, as you have described. The broker is not responsible for how often and how many events you decide to publish. It will nonetheless protect itself from too much pressure: it has a credit-based flow control mechanism (see "RabbitMQ flow control").
RabbitMQ has different ways in which unroutable messages can be handled (see "Unroutable Message Handling" and "How to deal with unroutable messages").
To sum up a bit the information you will find on those links:
If the publisher does not set the message as mandatory, it will either be discarded or republished to a different alternate exchange that you can configure. This only makes sense if you want to persist all unroutable messages regardless of the source in a single queue, that you can handle later.
If the publisher sets the message as mandatory, the message will be returned to the publisher and the publisher can have a returned message handler setup in order to handle those events.
These strategies in addition to the flow control mechanism, also assure RabbitMQ reliability and protection.
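The decision the broker makes for each published message can be sketched as a small simulation. This is plain Python, not a RabbitMQ client: `DirectExchange` is a hypothetical class, and the alternate exchange is simplified to a single list standing in for a queue bound to it.

```python
class DirectExchange:
    """Toy direct exchange that reports what happened to each message."""

    def __init__(self, alternate=None):
        self.bindings = {}          # routing key -> list of bound queues (lists)
        self.alternate = alternate  # simplified alternate exchange, or None

    def bind(self, routing_key, queue):
        self.bindings.setdefault(routing_key, []).append(queue)

    def publish(self, routing_key, body, mandatory=False):
        queues = self.bindings.get(routing_key, [])
        if queues:
            for queue in queues:
                queue.append(body)
            return "routed"
        if self.alternate is not None:
            # Unroutable, but an alternate exchange takes it over.
            self.alternate.append((routing_key, body))
            return "alternate"
        if mandatory:
            return "returned"  # handed back to the publisher's return handler
        return "dropped"       # silently discarded
```

In the scenario from the question, publishing with routing key "unit-42" yields "dropped" most of the time and "routed" as soon as a debug queue is bound with that key.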
In your situation, if you want to limit the messages from the producer even more, you need to create a mechanism so that, for example, the producer starts publishing only when a consumer becomes active. So basically the consumer process tells the producer process that it is active and that publishing can start. But from my experience I don't think it's worth the overhead, at least at first, because 50 messages per second isn't much. You can monitor the RabbitMQ server and check its resource consumption to see whether you need to optimize. Optimization is best done with metrics and understanding.

Message bus: sender must wait for acknowledgements from multiple recipients

In our application the publisher creates a message and sends it to a topic.
It then needs to wait until all of the topic's subscribers have acked the message.
It does not appear that the message bus implementations can do this automatically. So we are leaning towards making each subscriber send its own new message to the client when it is done.
Now, the client can receive all such messages and, when it has got one from each destination, do whatever clean-ups it has to do. But what if the client (sender) crashes part way through the stream of acknowledgments? To handle such a misfortune, I need to (re)implement on the client what the buses already implement -- save the incoming acknowledgments until I get enough of them.
I don't believe our needs are that esoteric -- how would you handle the situation where the sender (publisher) must wait for confirmations from multiple recipients (subscribers)? Sort of like requesting (and awaiting) Return-Receipts from each subscriber to a mailing list...
We are using RabbitMQ, if it matters. Thanks!
The functionality that you are looking for sounds like a messaging solution that can perform transactions across publishers and subscribers of a message. In the Java world, JMS specifies such transactions. One example of a JMS implementation is HornetQ.
RabbitMQ does not provide such functionality, and for good reasons. RabbitMQ is built to be extremely robust and to perform like hell at the same time. The transactional behavior that you describe is only achievable at the cost of a noticeable performance loss (especially if you want to keep outstanding robustness).
With RabbitMQ, one way to make sure that a message was consumed successfully is indeed to publish an answer message on the consumer side that is then consumed by the original publisher. This can be achieved with RabbitMQ's RPC (remote procedure call) pattern, which might help you get a clean solution for your problem setting.
If the (original) publisher crashes before all answers have been received, you can assume that all outstanding answers are still queued on the broker. So you would have to build your publisher in a way that it is able to resume processing those remaining messages. This might turn out to be non-trivial.
Finally, I recommend the following solution: Design your producing component in a way that you can consume the answers with one or more dedicated answer consumers that are separated from the origin publisher.
Benefits of this solution are:
the origin publisher can finish its task independent of consumer success
the origin publisher is independent of consumer availability and speed
the origin publisher implementation is far less complex
in a crash scenario, the answer consumer can resume with processing answers
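A dedicated answer consumer of that kind might look roughly like the Python sketch below. It is only an illustration: `AckTracker` and its methods are hypothetical names, a JSON file stands in for whatever durable store you would really use, and the code that receives answers from RabbitMQ is left out entirely.

```python
import json
import os


class AckTracker:
    """Persists which subscribers have answered, so a restart can resume."""

    def __init__(self, path):
        self.path = path
        self.state = {}
        if os.path.exists(path):
            with open(path) as f:
                self.state = json.load(f)  # resume after a crash or restart

    def _save(self):
        # Write to a temporary file first so a crash cannot leave a half-written file.
        tmp = self.path + ".tmp"
        with open(tmp, "w") as f:
            json.dump(self.state, f)
        os.replace(tmp, self.path)

    def expect(self, message_id, subscribers):
        self.state[message_id] = {"expected": sorted(subscribers), "acked": []}
        self._save()

    def record_ack(self, message_id, subscriber):
        """Record one answer; return True once every expected subscriber answered."""
        entry = self.state[message_id]
        if subscriber in entry["expected"] and subscriber not in entry["acked"]:
            entry["acked"].append(subscriber)
            self._save()
        return set(entry["acked"]) == set(entry["expected"])
```

Because duplicate answers are ignored, an answer that is redelivered after a crash does no harm, which is what makes resuming straightforward.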
Now to a more general point: One of the major benefits of messaging is the decoupling of application components by the broker. In AMQP, this is achieved with exchanges and bindings that allow you to move message distribution logic from your application to a central point of configuration.
If you add RPC-style calls to your clients, then your components are most likely closely coupled again, meaning that the publishing component fails if one of the consuming components fails / is not available / too slow. This is exactly what you will want to avoid. Otherwise, why would you have split the components then?
My recommendation is that you design your application in a way that publishers can complete their tasks independent of the success of consumers wherever possible. Back-channels should be an exceptional case and be implemented in the loosely coupled way described above.

PubSub + Reliable message delivery to unreliably present subscribers

I need to build a system that uses a Publish/Subscribe bus (e.g. Mule, ZeroMQ, RabbitMQ), but the literature all implies that subscriber applications are reliably available to receive messages from topics to which they subscribe as soon as the Pub/Sub bus is able to deliver the message.
I have a system where some of the applications will be reliably connected to the Publish/Subscribe bus, but other applications will not be active or connected to the bus all the time.
The obvious solution is to have some sort of "presence" protocol between the unreliable application and the Publish/Subscribe bus so that "present" applications get their messages delivered immediately, and "not present" applications have their messages queued up in a persistent buffer of some kind, and as soon as they complete the "presence handshake", the queued messages are delivered to the newly present application.
Are there any Publish/Subscribe buses which have this kind of feature built in, or are there any open-source add-ons which do this? Can you point me to any URLs which describe this?
You can achieve this behaviour quite easily with any AMQP-compliant broker (such as RabbitMQ).
Choose the correct exchange type for your usage model. You'll want to use a direct exchange if you're always sending to absolutely named destinations, something like chat.messages.
If you want to do pattern-based routing, you'll want to use a topic exchange. Then you can route based on patterns such as chat.messages.*.
Routing is described in more detail in the RabbitMQ Tutorials.
To create the kind of persistent subscription that you mention, have each subscriber create a queue that is private to that subscriber. The queue is then bound to the relevant routing keys on your chosen exchange.
Since each subscriber has its own queue, messages will be consumed by the subscriber when active and stored when subscriber is inactive or disconnected.
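Here is a rough Python simulation of that setup. It is not RabbitMQ client code: `TopicExchange` and `topic_matches` are hypothetical names, the queues are in-memory deques rather than durable broker queues, and the matcher follows the AMQP topic convention where `*` stands for exactly one word and `#` for zero or more words.

```python
from collections import deque


def topic_matches(pattern, routing_key):
    """AMQP-style match: '*' is exactly one word, '#' is zero or more words."""
    def match(p, k):
        if not p:
            return not k
        if p[0] == "#":
            # '#' absorbs nothing, or absorbs one word and stays active.
            return match(p[1:], k) or (bool(k) and match(p, k[1:]))
        if not k:
            return False
        if p[0] == "*" or p[0] == k[0]:
            return match(p[1:], k[1:])
        return False
    return match(pattern.split("."), routing_key.split("."))


class TopicExchange:
    """Toy topic exchange; each subscriber owns one private queue."""

    def __init__(self):
        self.bindings = []  # (pattern, queue) pairs

    def bind(self, pattern, queue):
        self.bindings.append((pattern, queue))

    def publish(self, routing_key, body):
        delivered = set()  # a queue gets a message once, even with several bindings
        for pattern, queue in self.bindings:
            if id(queue) not in delivered and topic_matches(pattern, routing_key):
                queue.append(body)
                delivered.add(id(queue))
```

A subscriber that is offline simply does not read from its queue; the messages pile up there and are consumed in order once it reconnects.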
You haven't mentioned your language of choice, but in Java you can accomplish this with JMS using durable subscribers. JMS implementations (there are many, and RabbitMQ offers a JMS client as well) support this feature.

RabbitMQ fan out on a topic exchange

Pretty new to RabbitMQ and we're still in the investigation stage to see if it's a good fit for our use cases--
We've readily come to the conclusion that our desired topology would have us deploying a few topic-based exchanges, and then filtering from there to specific queues. For example, let's say we have a user and an upload exchange, where the user exchange might receive messages where the topic is "new-registration" or "friend-request" and the upload exchange might receive messages like "video-upload" or "picture-upload".
Creating the queues, getting messages routed to the appropriate queue, and then building listeners to handle the messages for the various queues has been quite straightforward.
What's unclear to me however is if it's possible to do a fanout on a topic exchange?
I.e. I have named queues that are bound to my topic exchange, but I'd like to be able to just throw tons of instances of my listeners at those queues to prevent single points of failure. But to the best of my knowledge, RabbitMQ treats these listeners in a straightforward round-robin fashion--e.g. every Nth message always goes to the same Nth listener rather than messages being dispatched to the first available consumer. This is generally acceptable to us, but given the load we anticipate, we'd like to avoid the possibility of hot spots developing amongst our consumer farm.
So, is there some way, either in the queue or exchange configuration or in the consumer code, where we can point our listeners to a topic queue but have the listeners treated in a fanout fashion?
Yes, by having the listeners bind using different queue names, they will be treated in a fanout fashion.
Fanout is 1:N though, i.e. each task can be delivered to multiple listeners like pub-sub. Note that this isn't restricted to a fanout exchange, but also applies if you bind multiple queues to a direct or topic exchange with the same binding key. (Installing the management plugin and looking at the exchanges there may be useful to visualize the bindings in effect.)
Your current setup is a task queue. Each task/message is delivered to exactly one worker/listener. Throw more listeners at the same queue name, and they will process the tasks round-robin as you say. With "fanout" (separate queues for a topic) you will process a task multiple times.
Depending on your platform there may be existing work queue solutions that meet your requirements, such as Resque or DelayedJob for Ruby, Celery for Python or perhaps Octobot or Akka for the JVM.
I don't know for a fact, but I strongly suspect that RabbitMQ will skip consumers with unacknowledged messages, so it should never bottleneck on a single stuck consumer. The comments on their FAQ seem to suggest that RabbitMQ will make an effort to keep things chugging along even in the presence of troublesome consumers.
This is a late answer, but in case others come across this question...
It sounds like what you want is fair dispatch rather than a fan out model (which would publish a given message to every queue).
Fair dispatch will give a message to the next available worker rather than using a simple round-robin approach. This should avoid the "hotspots" you are concerned about, without delivering the same message to multiple consumers.
If this is what you are looking for, then see the "Fair Dispatch" section on this page in the Rabbit docs. A prefetch count of 1 is the key here.
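The difference between the two dispatch styles can be seen in a small Python simulation. It does not talk to RabbitMQ: `round_robin` and `fair_dispatch` are hypothetical helpers, message processing times are given as plain numbers, and "prefetch count of 1" is modeled as "a worker gets its next message only after finishing the current one".

```python
import heapq


def round_robin(durations, workers):
    """Message i goes to worker i % workers; return each worker's total busy time."""
    busy = [0] * workers
    for i, duration in enumerate(durations):
        busy[i % workers] += duration
    return busy


def fair_dispatch(durations, workers):
    """Each message goes to the worker that becomes free first."""
    # Heap of (time at which the worker is free again, worker index).
    free_at = [(0, w) for w in range(workers)]
    heapq.heapify(free_at)
    busy = [0] * workers
    for duration in durations:
        free_time, w = heapq.heappop(free_at)
        busy[w] += duration
        heapq.heappush(free_at, (free_time + duration, w))
    return busy
```

With durations `[10, 1, 10, 1, 10, 1]` and two workers, round-robin piles all the slow messages onto the first worker (30 versus 3 time units of work), while the fair variant splits the load 21 to 12. Each message is still handled by exactly one worker in both cases.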