Multiple subscriptions to a topic - redis

I have been using pubsub for a bit of asynchronous work, and was wondering why someone may create multiple subscriptions for a single topic. My default values are as follows:
project_id = 'project'
topic_name = 'app'
subscription_name = 'general'
The routing of the actual function -- and how to process it -- is being done in the subscriber receiver itself.
What would be the reasons for having multiple subscription names? The only thing I can think of is spreading items across multiple servers for processing, such as:
server1 -- `main-1`
server2 -- `main-2`
etc.
Are there any other reasons why a single subscription name would not work well?

In general, there are two paradigms for having multiple subscribers:
Load balancing: The goal is to parallelize the processing of the load by having multiple subscribers using the same subscription. In this scenario, every subscriber receives a subset of the messages. One can horizontally scale processing by creating more subscribers for the same subscription.
Fan out: The goal is to have multiple subscribers receive the entire feed of messages. This is accomplished by having multiple subscriptions. The reason to have fan out is if there are multiple downstream applications interested in the full feed of messages. Imagine there is a feed where the messages are user events on a shopping website. Perhaps one application backs up the data to files, another analyzes the feed for trends in what people are looking at, and another looks through activity to try to find potentially fraudulent transactions. In this scenario, every one of those applications acting as a subscriber needs the full feed of messages, which requires separate subscriptions.
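To make that concrete, here is a rough sketch using the Google Cloud Pub/Sub Python client (assuming the google-cloud-pubsub v2 API; the project, topic, and subscription names are just placeholders). Fan out means one subscription per downstream application; load balancing means several subscriber processes sharing a single subscription:

from google.cloud import pubsub_v1

project_id = 'project'
topic_name = 'app'

subscriber = pubsub_v1.SubscriberClient()
topic_path = subscriber.topic_path(project_id, topic_name)

# Fan out: one subscription per downstream application, each getting the full feed.
for sub_name in ('backup', 'analytics', 'fraud-detection'):
    subscriber.create_subscription(
        request={"name": subscriber.subscription_path(project_id, sub_name),
                 "topic": topic_path})

# Load balancing: many worker processes attach to the *same* subscription
# and each receives only a subset of the messages.
general_sub = subscriber.subscription_path(project_id, 'general')
future = subscriber.subscribe(general_sub, callback=lambda message: message.ack())
future.result()  # block so the streaming pull keeps running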

Related

Mass Transit: ensure message processing order when there are different message types

I'm new to Mass Transit and I would like to understand if it can help with my scenario.
I'm building a sample application implemented with a CQRS event sourcing architecture and I need a service bus in order to dispatch the events created by the command stack to the query stack denormalizers.
Let's suppose we have a single aggregate in our domain, call it Photo, and two different domain events: PhotoUploaded and PhotoArchived.
Given this scenario, we have two different message types, and the default Mass Transit behaviour is to create two different RabbitMQ exchanges: one for the PhotoUploaded message type and the other for the PhotoArchived message type.
Let's also suppose we have a single denormalizer called PhotoDenormalizer: this service will consume both message types, because the photo read model must be updated whenever a photo is uploaded or archived.
Given the default Mass Transit topology, there will be two different exchanges, so the processing order cannot be guaranteed across events of different types. The only guarantee is that all events of the same type will be processed in order; we cannot guarantee the processing order between events of different types (and, given the event semantics of my example, the processing order matters).
How can I handle such a scenario? Is Mass Transit suitable for my needs? Am I completely missing the point with domain event dispatching?
Disclaimer: this is not an answer to your question, but rather a preventive message about why you should not do what you are planning to do.
Whilst message brokers like RMQ and messaging middleware libraries like MassTransit are perfect for integration, I strongly advise against using message brokers for event-sourcing. I can refer you to my old answer, "Event-sourcing: when (and not) should I use Message Queue?", which explains the reasons behind it.
One of the reasons you have found yourself: event order will never be guaranteed.
Another obvious reason is that building read models from events published via a message broker effectively removes the possibility of replay. A new read model that needs to start processing events from the beginning of time cannot, because all it receives are the events being published now.
Aggregates form transactional boundaries, so every command needs to guarantee that it completes within one transaction. Whilst MT supports the transaction middleware, it only guarantees that you get a transaction for dependencies that support them, but not for context.Publish(@event) in the consumer body, since RMQ doesn't support transactions. There is a good chance of committing changes but not getting the events on the read side. So the rule of thumb for event stores is that you should subscribe to the stream of changes from the store, and not publish events from your code, unless those are integration events and not domain events.
For event-sourcing, it is crucial that each read model keeps its own checkpoint in the stream of events it is projecting. Message brokers don't give you that kind of power, since the "checkpoint" is actually your queue, and as soon as a message is gone from the queue, it is gone forever; there's no coming back.
Concerning the actual question:
You can use the message topology configuration to set the same entity name for different messages, and then they'll be published to the same exchange, but that falls into the "abuse" category, as Chris wrote on that page. I haven't tried it, but you can certainly experiment. The message CLR type is part of the metadata, so there shouldn't be deserialization issues.
But again, putting messages in the same exchange won't give you any ordering guarantees, except the fact that all messages will land in one queue for the consuming service.
You will have to at least set the partitioning filter based on your aggregate id, to prevent multiple messages for the same aggregate from being processed in parallel. That, by the way, is also useful for integration. That's how we do it:
// Register a handler wrapped in a partitioner keyed by the aggregate id,
// so messages for the same aggregate are never processed in parallel.
void AddHandler<T>(Func<ConsumeContext<T>, string> partition) where T : class
    => ep.Handler<T>(
        c => appService.Handle(c, aggregateStore),
        hc => hc.UsePartitioner(8, partition)); // 8 partitions, keyed by the supplied function

AddHandler<InternalCommands.V1.Whatever>(c => c.Message.StreamGuid);

Subscription API data for multiple days?

When a user does not sync with the Jawbone app for a few days and then finally syncs the next day, how is the push data sent? I tried this and it only sent me a single update in the pushed data.
[{"action":"creation","timestamp":1483892382,"user_xid":"RCLWx75WGKR_eIpHcR5gfA","type":"move","event_xid":"UDtVcjNFXvI7NpciWElZOTbfaRAF4oeQ"}]
When a user syncs multiple days' worth of data at once, the data will appear in subsequent pubsub notifications. The data could arrive as:
1. A single pubsub notification with multiple events
2. Multiple pubsub notifications, each with a single event
3. Some combination of 1 and 2
There's no way to determine ahead of time which of these scenarios you will get, since it depends on numerous factors (e.g., how much data is synced, how it is processed out of Jawbone's queues, what data you've asked to be notified about, etc.).
There's also no guarantee that your notifications/events will arrive in chronological order, so your application should be ready to handle any of these scenarios.
For more details, please refer to the pubsub documentation.
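In other words, a push handler should treat the body as a list of zero or more events, in no particular order. A minimal Python sketch of that idea, where process_event is a hypothetical placeholder for your own logic:

import json

def handle_push(body):
    # The pushed body is a JSON array; it may contain one event or many.
    events = json.loads(body)
    for event in events:
        # Ordering is not guaranteed, so processing should be idempotent
        # and keyed off the event fields rather than arrival order.
        process_event(event['user_xid'], event['type'],
                      event['action'], event['timestamp'])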

What is the diff between data-sync and pub-sub in Deepstream

All:
I am pretty new to deepstream; on its website, it is described in the core concepts section as:
data-sync: Interactive JSON documents that can be edited and observed. Changes are persisted and synced across clients.
and
publish-subscribe: Many clients can subscribe to topics and receive data whenever other clients publish it to the same topic.
I wonder what the difference is between its data-sync and pub-sub in terms of their purpose; in other words, what task can one do that the other cannot?
Thanks
PubSub is a way for clients and servers to send messages to each other. These messages can contain all sorts of data, but once a message is delivered it's gone; there's no storage or statefulness. If you're familiar with EventEmitters in, e.g., JavaScript, you're already familiar with the pattern.
Data-sync, on the other hand, is stateful, persistent data. Clients can request JSON documents called records, update them, and subscribe to changes made by other clients. Records can be arranged in lists, and lists can be referenced by records, allowing data-sync to become the realtime backbone for all the data that drives your app.
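A toy sketch (plain Python, not the deepstream API) may help show the difference: with pub/sub a message only exists at the moment of delivery, whereas a record keeps its state and can be read or observed at any time.

class EventBus:
    """Pub/sub: stateless delivery, nothing is kept after the emit."""
    def __init__(self):
        self.handlers = {}

    def subscribe(self, topic, fn):
        self.handlers.setdefault(topic, []).append(fn)

    def emit(self, topic, data):
        for fn in self.handlers.get(topic, []):
            fn(data)          # after this call the message is gone


class Record:
    """Data-sync: a stateful, observable document."""
    def __init__(self):
        self.data = {}
        self.observers = []

    def subscribe(self, fn):
        self.observers.append(fn)

    def set(self, key, value):
        self.data[key] = value
        for fn in self.observers:
            fn(self.data)     # observers see the change...

    def get(self):
        return self.data      # ...and the state is still there later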

RabbitMQ Pub/Sub setup with large number of disconnected clients...

This is a new area for me so hopefully my question makes sense.
In my program I have a large number of clients, which are Windows services running on laptops that are often disconnected. Occasionally they come online, and I want them to receive updates based on user profiles. There are many types of notifications that require the client to perform some work on the local application (i.e., the laptop).
I realize that I could do this with a series of RESTful database queries, but since there are so many clients (upwards of 10,000) and there are lots of different notification types, I was curious whether this is a problem better suited to a messaging product like RabbitMQ or even 0MQ.
But how would one set this up (let's assume in RabbitMQ)?
Would each user be assigned their own queue?
Or is it preferable to have each queue represent a distinct notification type, using some combination of direct exchanges or filtering messages based on a routing key, where the routing key could be a username?
Since each user may potentially have a different set of notifications based on their user profile, I am thinking that each client/consumer would have a specific message for each notification sitting on a queue waiting for them to come online and process it.
Is this the right way of thinking about the problem? Thanks in advance.
It will be easier for you to balance a lot of queues than to filter long ones, so it's better to use a queue per consumer.
Messages can have arbitrary headers and bodies, so that is the right place for notification types.
Since you will be using long-lived queues whose messages wait on disk for consumers, you are better off using lazy queues: https://www.rabbitmq.com/lazy-queues.html (available since version 3.6.0).
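Putting those pieces together with the Python pika client might look roughly like this (the exchange and queue names, and using the username as the routing key, are illustrative assumptions rather than a prescribed layout):

import pika

connection = pika.BlockingConnection(pika.ConnectionParameters('localhost'))
channel = connection.channel()

# One direct exchange for all notifications; routing key = username.
channel.exchange_declare(exchange='notifications', exchange_type='direct', durable=True)

def provision_user_queue(username):
    # One durable, lazy queue per user so messages wait on disk
    # until that user's laptop comes online.
    queue = f'notifications.{username}'
    channel.queue_declare(queue=queue, durable=True,
                          arguments={'x-queue-mode': 'lazy'})
    channel.queue_bind(queue=queue, exchange='notifications', routing_key=username)
    return queue

def publish_notification(username, notification_type, body):
    channel.basic_publish(exchange='notifications', routing_key=username, body=body,
                          properties=pika.BasicProperties(
                              delivery_mode=2,                       # persistent message
                              headers={'type': notification_type}))  # notification type in headers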

How to use Azure service bus topics & Subscriptions to load balance messages

In reading many MSDN pages about the Azure Service Bus, it alludes to the ability to set up a "Load Balancing" pattern with the "Topic/Subscription" model, but never says how this is done.
My question is, is this possible? Essentially, we are looking to create topics that would have n subscribers that could be dynamically ramped up and down based upon incoming load. So it would not be using the traditional "multicast" pattern but round-robining the messages to the subscribers. The reason we want to use this pattern is that we want to take advantage of the rules and filtering that reside in topics and subscriptions, while allowing for dynamic scaling.
Any ideas?
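For what it's worth, the load-balancing half of this is the competing-consumers idea described earlier: a subscription behaves like a queue, so several receivers attached to the same subscription split the messages between them, while the topic's rules and filters still decide which messages land in that subscription. A rough sketch with the azure-servicebus Python SDK (the connection string, topic, and subscription names are placeholders, and handle is a hypothetical processing function):

from azure.servicebus import ServiceBusClient

# Run several copies of this worker; they all drain the *same* subscription,
# so Service Bus delivers each message to only one of them.
client = ServiceBusClient.from_connection_string("<connection-string>")
with client:
    receiver = client.get_subscription_receiver(topic_name="orders",
                                                subscription_name="workers")
    with receiver:
        for message in receiver:
            handle(message)                      # your processing logic
            receiver.complete_message(message)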