How to use Azure Service Bus topics & subscriptions to load balance messages

Many MSDN pages about Azure Service Bus allude to the ability to set up a "Load Balancing" pattern with the topic/subscription model, but never say how this is done.
My question is: is this possible? Essentially, we are looking to create topics that would have some number n of subscribers that could be dynamically ramped up and down based on incoming load. So it would not use the traditional "multicast" pattern, but would instead round-robin the messages across the subscribers. The reason we want to use this pattern is that we want to take advantage of the rules and filtering that reside in topics and subscriptions, while allowing for dynamic scaling.
Any ideas?
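The short answer is yes: when multiple receivers pull from the *same* subscription, they compete for messages, so each message is delivered to exactly one of them, while the topic's rules and filters still apply before messages land in the subscription. Here is a minimal stdlib simulation of that competing-consumers behaviour (worker names and counts are illustrative, not Service Bus API):

```python
import queue
import threading
from collections import defaultdict

# One subscription acts as a shared work queue: every receiver pulls
# from the same subscription, so each message is processed exactly once.
subscription = queue.Queue()
processed = defaultdict(list)  # receiver name -> messages it handled

def receiver(name):
    while True:
        try:
            msg = subscription.get(timeout=0.2)
        except queue.Empty:
            return  # no more work; receiver shuts down
        processed[name].append(msg)

for i in range(10):
    subscription.put(f"message-{i}")

# Scale out by adding more receivers on the SAME subscription.
workers = [threading.Thread(target=receiver, args=(f"worker-{n}",))
           for n in range(3)]
for w in workers:
    w.start()
for w in workers:
    w.join()

total = sum(len(v) for v in processed.values())
print(total)  # every message consumed exactly once across all workers
```

Scaling up is adding another receiver on the subscription; scaling down is stopping one. The remaining receivers simply pick up the slack.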

Related

ActiveMQ/JMS network messaging

Sorry if this is answered in the documentation, but I need some more insight. We currently use RabbitMQ and need a distributed system. I would like to build a distributed system with three or more distributed brokers, named NEWYORK, NEVADA, and TEXAS. I am looking to see whether it is workable to send queue messages with routing keys like NEWYORK.terminal.abc from NEVADA, with the ability to send a reply back via a replyTo-type option; also things like NEVADA.jobQueue.fastpace from TEXAS, or TEXAS.queues.ect.
Then the ability to send TOPIC-type messages from NEWYORK.weather, with other sites subscribing to NEWYORK.weather, etc.
Is this something that ActiveMQ/Artemis can do?
Yes, this sort of data transmission is done all the time with ActiveMQ.
Tip: topics become confusing and complicated to configure once you go to a multi-broker architecture. Look into using Virtual Topics or Composite Destinations to get your data subscriptions lined up how you want, while maintaining the pub-sub pattern.
Virtual Topic summary:
Producers send to a topic
Consumers read from specially named queue(s)
Ability to have multiple subscribers, and to separate local traffic from over-the-WAN traffic into separate queues
Support for consumer-provided server-side filtering using JMS standard selectors
ref: https://activemq.apache.org/virtual-destinations
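The Virtual Topic flow summarized above can be sketched as follows. The queue names follow the Consumer.*.VirtualTopic.* convention from the linked page, but the broker logic here is a plain-Python stand-in, not the ActiveMQ API:

```python
import queue

# Sketch of the Virtual Topic idea: a producer sends to one topic; the
# broker copies each message into a dedicated queue per subscribing
# application; consumers of that application then compete on their queue.
consumer_queues = {
    "Consumer.A.VirtualTopic.Orders": queue.Queue(),
    "Consumer.B.VirtualTopic.Orders": queue.Queue(),
}

def publish(topic_msg):
    # Broker-side fan-out: one copy of the message per consumer queue.
    for q in consumer_queues.values():
        q.put(topic_msg)

for i in range(3):
    publish({"order_id": i})

# Each application drains its own queue independently, at its own pace.
received = {name: [] for name in consumer_queues}
for name, q in consumer_queues.items():
    while not q.empty():
        received[name].append(q.get())

print({name: len(msgs) for name, msgs in received.items()})
```

The key property: you keep the pub-sub fan-out of a topic, but each application gets a real queue, so its consumers can compete, persist backlog, and apply selectors independently.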

Multiple subscriptions to a topic

I have been using Pub/Sub for a bit of asynchronous work, and was wondering why someone might create multiple subscriptions for a single topic. My default values are as follows:
project_id = 'project'
topic_name = 'app'
subscription_name = 'general'
The routing of the actual function -- and how to process it -- is being done in the subscriber receiver itself.
What would be the reasons for having various subscription names? The only thing I can think of is to spread items across multiple servers for processing, such as:
server1 -- `main-1`
server2 -- `main-2`
etc.
Are there any other reasons why a single subscription name would not work well?
In general, there are two paradigms for having multiple subscribers:
Load balancing: The goal is to parallelize the processing of the load by having multiple subscribers using the same subscription. In this scenario, every subscriber receives a subset of the messages. One can horizontally scale processing by creating more subscribers for the same subscription.
Fan out: The goal is to have multiple subscribers receive the entire feed of messages. This is accomplished by having multiple subscriptions. The reason to have fan out is if there are multiple downstream applications interested in the full feed of messages. Imagine there is a feed where the messages are user events on a shopping website. Perhaps one application backs up the data to files, another analyzes the feed for trends in what people are looking at, and another looks through activity to try to find potentially fraudulent transactions. In this scenario, every one of those applications acting as a subscriber needs the full feed of messages, which requires separate subscriptions.
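The two paradigms can be sketched side by side. This is plain Python, not the Pub/Sub client library, and real Pub/Sub does not guarantee strict round-robin; it only illustrates "each message goes to one subscriber" versus "every subscription gets the full feed":

```python
import itertools

messages = [f"event-{i}" for i in range(6)]

# Load balancing: ONE subscription, several subscribers. Each message
# goes to exactly one subscriber (simulated here as round-robin).
subscribers = ["worker-a", "worker-b", "worker-c"]
assignment = {s: [] for s in subscribers}
for msg, sub in zip(messages, itertools.cycle(subscribers)):
    assignment[sub].append(msg)

# Fan out: one subscription PER application. Each application
# (backup, analytics, fraud detection) receives the entire feed.
applications = ["backup", "analytics", "fraud-detection"]
feeds = {app: list(messages) for app in applications}

# Every message was handled exactly once across the load-balanced workers...
assert sum(len(v) for v in assignment.values()) == len(messages)
# ...while every fan-out application saw all of them.
assert all(feed == messages for feed in feeds.values())
```

So the rule of thumb: share one subscription to divide work, create separate subscriptions to duplicate it.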

Message types : how much information should messages contain?

We are currently starting to broadcast events from one central application to other possibly interested consumer applications, and we have differing opinions among members of our team about how much we should put in our published messages.
The general idea/architecture is the following:
In the producer application:
the user interacts with some entities (Aggregate Roots in the DDD sense) that can be created/modified/deleted
Based on what is happening, Domain Events are raised (e.g. EntityXCreated, EntityYDeleted, EntityZTransferred, etc.; i.e. not only CRUD)
Raised events are translated/converted into messages that we send to a RabbitMQ exchange
In RabbitMQ (we are using RabbitMQ, but I believe the question is actually technology-independent):
we define a queue for each consuming application
bindings connect the exchange to the consumer queues (possibly with message filtering)
In the consuming application(s):
the application consumes and processes messages from its queue
Based on Enterprise Integration Patterns, we are trying to define the canonical format for our published messages, and are hesitating between two approaches:
Minimalist / event-store-ish messages: for each event published by the Domain Model, generate a message that contains only the parts of the Aggregate Root that are relevant (for instance, when an update is done, only publish information about the updated section of the Aggregate Root, more or less matching the process the end user goes through when using our application)
Pros
small message size
very specialized message types
close to the "Domain Events"
Cons
problematic if delivery order is not guaranteed (i.e. what if an Update message is received before the Create message?)
consumers need to know which message types to subscribe to (possibly a big list; domain knowledge is needed)
what if consumer state and producer state get out of sync?
how to handle a new consumer that registers in the future but does not have knowledge of all the past events
Fully-contained, idempotent-ish messages: for each event published by the Domain Model, generate a message that contains a full snapshot of the Aggregate Root at that point in time, hence handling only two kinds of messages in practice, "Create or Update" and "Delete" (plus metadata with more specific info if necessary)
Pros
idempotent (declarative messages stating "this is what the truth is like; synchronize yourself however you can")
lower number of message formats to maintain/handle
allows consumers to progressively correct synchronization errors
consumers automatically handle new Domain Events as long as the resulting message follows the canonical data model
Cons
bigger message payload
less pure
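To make the contrast concrete, here is a sketch of the two message shapes. All type and field names are made up for illustration; the point is that a version (or timestamp) on the snapshot is what makes the fully-contained style safe to apply out of order or twice:

```python
from dataclasses import dataclass

# Minimalist / event-store-ish: one message type per domain event,
# carrying only the changed fields.
@dataclass
class EntityNameChanged:
    entity_id: str
    new_name: str

# Fully-contained: a single "upsert" shape carrying a full snapshot,
# plus a version so consumers can apply it idempotently.
@dataclass
class EntityUpserted:
    entity_id: str
    version: int
    snapshot: dict

consumer_state = {}  # the consumer's local view, keyed by entity id

def apply_upsert(msg):
    # Idempotent: applying the same (or an older) snapshot twice is a no-op.
    current = consumer_state.get(msg.entity_id)
    if current is None or current["version"] < msg.version:
        consumer_state[msg.entity_id] = {"version": msg.version, **msg.snapshot}

apply_upsert(EntityUpserted("42", 2, {"name": "after"}))
apply_upsert(EntityUpserted("42", 1, {"name": "before"}))  # stale: ignored
print(consumer_state["42"]["name"])  # prints: after
```

With the minimalist style, the consumer would instead need one handler per event type (EntityNameChanged and friends) and a strategy for ordering and replay; the version check above is what lets the snapshot style sidestep both.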
Would you recommend one approach over the other?
Is there another approach we should consider?
Is there another approach we should consider?
You might also consider not leaking information out of the service acting as the technical authority for that part of the business.
Which roughly means that your events carry identifiers, so that interested parties can know that an entity of interest has changed, and can query the authority for updates to the state.
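A sketch of that identifier-only style: the event says only *what* changed, and the consumer calls back to the authority for current state. The names below are illustrative, and `fetch_entity` stands in for an HTTP call to the owning service:

```python
# The authority's own store; in reality this sits behind the owning
# service's API, never exposed directly to consumers.
authority_db = {"42": {"name": "after", "version": 7}}

def fetch_entity(entity_id):
    # Hypothetical query back to the technical authority.
    return authority_db[entity_id]

def on_event(event, cache):
    # The event carries only an identifier (maybe a version), no state.
    # The consumer treats it as "your cached view is stale: refresh".
    entity_id = event["entity_id"]
    cache[entity_id] = fetch_entity(entity_id)

cache = {}
on_event({"entity_id": "42"}, cache)
print(cache["42"]["name"])  # prints: after
```

Nothing about the aggregate's internal shape travels on the bus, so the producer can change its representation without breaking the message schema.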
for each event published by the Domain Model, generate a message that contains a full snapshot of the Aggregate Root at that point in time
This also has the additional Con that any change to the representation of the aggregate also implies a change to the message schema, which is part of the API. So internal changes to aggregates start rippling out across your service boundaries. If the aggregates you are implementing represent a competitive advantage to your business, you are likely to want to be able to adapt quickly; the ripples add friction that will slow your ability to change.
what if consumer state and producer state get out of sync?
As best I can tell, this problem indicates a design error. If a consumer needs state, which is to say a view built from the history of an aggregate, then it should be fetching that view from the producer, rather than trying to assemble it from a collection of observed messages.
That is to say, if you need state, you need history (complete, ordered). All a single event really tells you is that the history has changed, and you can evict your previously cached history.
Again, responsiveness to change: if you change the implementation of the producer, and consumers are also trying to cobble together their own copy of the history, then your changes are rippling across the service boundaries.

RabbitMQ Pub/Sub setup with a large number of disconnected clients...

This is a new area for me so hopefully my question makes sense.
In my program I have a large number of clients, which are Windows services running on laptops that are often disconnected. Occasionally they come online, and I want them to receive updates based on user profiles. There are many types of notifications that require the client to perform some work on the local application (i.e. the laptop).
I realize that I could do this with a series of RESTful database queries, but since there are so many clients (upwards of 10,000) and there are lots of different notification types, I was curious whether this is a problem better suited to a messaging product like RabbitMQ or even 0MQ.
But how would one set this up (let's assume in RabbitMQ)?
Would each user be assigned their own queue?
Or is it preferable to have each queue be a distinct notification type, using some combination of direct exchanges or filtering messages based on a routing key, where the routing key could be a username?
Since each user may potentially have a different set of notifications based on their user profile, I am thinking that each client/consumer would have a specific message for each notification sitting on a queue waiting for them to come online and process it.
Is this the right way of thinking about the problem? Thanks in advance.
It will be easier for you to balance a lot of queues than to filter long ones, so it's better to use a queue per consumer.
Messages can have arbitrary headers and bodies, so that is the right place for notification types.
Since you will be using long-lived queues whose messages wait on disk for consumers, you should use lazy queues: https://www.rabbitmq.com/lazy-queues.html (available since version 3.6.0).
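The queue-per-consumer layout can be sketched like this. In RabbitMQ terms it is a direct exchange with the username as the routing key; here the broker is simulated with stdlib queues, and all names are illustrative:

```python
import queue
from collections import defaultdict

# One queue per user; the routing key (username) selects the queue.
# Messages accumulate while the laptop is offline.
user_queues = defaultdict(queue.Queue)

def publish(routing_key, notification_type, body):
    # The notification type rides in a header-like field, not in the
    # queue topology, as the answer above suggests.
    user_queues[routing_key].put({"type": notification_type, "body": body})

def drain(username):
    # Called when a client comes back online.
    out = []
    q = user_queues[username]
    while not q.empty():
        out.append(q.get())
    return out

publish("alice", "profile-update", "new settings")
publish("alice", "software-update", "v2 available")
publish("bob", "profile-update", "new settings")

alice_msgs = drain("alice")
bob_msgs = drain("bob")
print(len(alice_msgs), len(bob_msgs))  # 2 1
```

With ~10,000 users this means ~10,000 mostly-idle queues, which is exactly the shape lazy queues are designed to keep cheap.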

Windows Service Bus Point-to-Point Communications to Reduce Broadcasts

I am using Windows Service Bus 1.0 to communicate between different processes, each context event stream exists on the bus as a topic.
Using the service bus to link events between bounded contexts, I need a method to sync events (in other words, to request a replay of past events) when a bounded context comes back online, but I want to limit the potential flood of messages so that replies go only to the endpoint that requested them, at least if this is something that can be easily done using existing Service Bus features.
So given an imaginary ContextC sends a message to request all previous events from ContextA and ContextB, is there any way for these replay messages to be sent only to ContextC?
What would be the best way to map a context to be the owner of the topic (or in other words, an individual bus subscriber to a bus topic), to facilitate the unicast replaying above?
In my world, I keep this stuff loosely coupled - each context puts stuff onto a topic and anyone that needs stuff subscribes.
Each Service Bus subscription can use the filtering facilities of Service Bus based on properties (e.g. you could tag events by adding properties to the messages and then have a filtering condition on the subscription, meaning that only whitelisted classes of events ever apply to each consumer).
That, plus the fact that you're already segregating by topic.
The subscription and the topic then allow you to process the events without losing any, and without the publisher having to worry about or chase subscribers.
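The whitelist-by-property idea can be sketched as follows. Real Service Bus subscription rules use a SQL-like filter syntax; the Python predicates and all names here are simplified stand-ins:

```python
# Messages carry application properties alongside the body; the
# subscription's rule sees only the properties, never the body.
messages = [
    {"properties": {"event_type": "OrderPlaced", "context": "ContextA"}, "body": "..."},
    {"properties": {"event_type": "OrderShipped", "context": "ContextA"}, "body": "..."},
    {"properties": {"event_type": "UserRegistered", "context": "ContextB"}, "body": "..."},
]

# Each subscription whitelists the event classes it cares about.
subscription_filters = {
    "billing": lambda p: p["event_type"] in {"OrderPlaced"},
    "shipping": lambda p: p["event_type"] in {"OrderPlaced", "OrderShipped"},
}

# The broker evaluates each rule against each message's properties;
# a message lands in a subscription only if the rule matches.
delivered = {
    name: [m for m in messages if rule(m["properties"])]
    for name, rule in subscription_filters.items()
}
print({name: len(msgs) for name, msgs in delivered.items()})
```

Note the publisher never knows who, if anyone, matched; it just tags and sends.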
You also mentioned you are tying this to an Event Store in other questions - in that case there is a chance your messages need to be consumed in order. If that is the case, you need to put a session id on your messages.
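Session-based ordering boils down to grouping by a session id and replaying strictly in sequence within each group, even when messages arrive interleaved. A stdlib sketch (field names are illustrative):

```python
from collections import defaultdict

# Messages tagged with a session id arrive interleaved and out of order.
arrived = [
    {"session_id": "ContextA", "seq": 2, "body": "event-2"},
    {"session_id": "ContextB", "seq": 1, "body": "event-1"},
    {"session_id": "ContextA", "seq": 1, "body": "event-1"},
    {"session_id": "ContextA", "seq": 3, "body": "event-3"},
]

# Group by session...
sessions = defaultdict(list)
for msg in arrived:
    sessions[msg["session_id"]].append(msg)

# ...then, within each session, process strictly in sequence order.
for msgs in sessions.values():
    msgs.sort(key=lambda m: m["seq"])

print([m["seq"] for m in sessions["ContextA"]])  # [1, 2, 3]
```

In Service Bus, setting the session id makes the broker do this grouping for you: a session-aware receiver locks one session and sees its messages in order, independent of other sessions.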
I could speculate as to why you want this subscriber-driven redelivery, but I won't for now. You first need to explain/verify that concept and requirement (by asking questions that explain your higher-level goals) a lot further before anyone can answer how it would best be achieved using Service Bus.