Pub/Sub and Redis Clustering

On this link it says "The current implementation will simply broadcast all the publish messages to all the other nodes" and adds that it will be improved in the future.
For the current implementation: if losing messages is not important, does it make sense to use Redis for pub/sub for now? It looks like a single instance is better, to stop the broadcast traffic, because besides writes, reads would have to be propagated to the other nodes too (so that the client will not be notified twice).
Am I missing something?

No, I don't think you missed any point. Redis Cluster is an ongoing work, and this includes the specification. The section about pub/sub is rather light and could probably be improved.
In Salvatore's proposal, a client is subscribed on a single instance (not on all of them), so when publications are broadcast to all instances, the client is only notified once. If that Redis instance goes down, it is up to the client to subscribe on one of the surviving nodes of the cluster (any other one).
Another possibility would have been to elect one node of the cluster as a unique pub/sub node, so that clients publish and subscribe on this node only. But high availability of the pub/sub service would be more difficult to support this way.
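To make the model concrete, here is a minimal redis-py sketch of that single-subscription behaviour; the two node addresses are hypothetical, and it assumes a cluster that broadcasts publications as described:

    import redis

    node_a = redis.Redis(host="10.0.0.1", port=7000)
    node_b = redis.Redis(host="10.0.0.2", port=7000)

    # Subscribe on a single node only.
    pubsub = node_a.pubsub()
    pubsub.subscribe("news")

    # Publish on a *different* node: the cluster broadcasts the message
    # to all nodes, so the subscriber on node_a still receives it once.
    node_b.publish("news", "hello")

    for message in pubsub.listen():
        if message["type"] == "message":
            print(message["data"])  # b'hello', delivered exactly once
            break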

Related

Redis Pub/Sub vs RabbitMQ

My team wants to move to a microservices architecture. Currently we are using Redis Pub/Sub as the message broker for some legacy parts of our system. My colleagues think it is natural to continue using Redis as the service bus, as they don't want to spend their time studying a new product. But in my opinion RabbitMQ (especially with MassTransit) is a better approach for microservices. Could you please compare Redis Pub/Sub with RabbitMQ and give me some arguments for RabbitMQ?
Redis is a fast in-memory key-value store with optional persistence. The pub/sub feature of Redis is a marginal case for Redis as a product.
RabbitMQ is the message broker that does nothing else. It is optimized for reliable delivery of messages, both in command style (send to an endpoint exchange/queue) and publish-subscribe. RabbitMQ also includes the management plugin that delivers a helpful API to monitor the broker status, check the queues and so on.
Dealing with Redis pub/sub at the low level of the Redis client can be a very painful experience. You could use a library like ServiceStack, which has a higher-level abstraction that makes it more manageable.
However, MassTransit adds a lot of value compared to raw messaging over RabbitMQ. As soon as you start doing stuff for real, no matter which transport you decide to use, you will hit the typical issues associated with messaging, such as handling replies, scheduling, long-running processes, re-delivery, dead-letter queues, and poison queues. MassTransit does all of that for you. Neither the Redis nor the RabbitMQ client would deliver any of it. If your team wants to spend time dealing with those concerns in their own code, that's more like reinventing the wheel. Using the argument of "not willing to learn a new product" in this context sounds a bit weird, since, instead of delivering value for the product, the developers want to spend their time dealing with infrastructure concerns.
RabbitMQ is far more stable and robust than Redis for passing messages.
RabbitMQ is able to hold and store a message if there is no consumer for it yet (e.g. your listener crashed).
RabbitMQ supports different communication patterns, pub/sub as well as queues, which you can use for load balancing and so on.
Redis is convenient for simple cases. If you can afford losing a message and you don't need queues, then I think Redis is also a good option.
If, however, you cannot afford losing a message, then Redis is not a good option.
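To illustrate the fire-and-forget nature of Redis pub/sub, here is a small redis-py sketch (assuming a local Redis instance); PUBLISH returns how many subscribers received the message, and an unreceived message is simply gone:

    import redis

    r = redis.Redis()

    # PUBLISH returns the number of clients that received the message.
    # With no subscriber listening, the return value is 0 and the
    # message is dropped; Redis keeps no copy for later delivery.
    receivers = r.publish("orders", "order-42-created")
    print(receivers)  # 0: the message is lost if nobody was subscribed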

Key-aware consumers in RabbitMQ

Let's consider a system where data for thousands of clients is published to a RabbitMQ exchange (client_id is known at this stage). The exchange routes the messages to a single queue. Finally, messages are consumed by a single application. Works great.
However, over time, the consuming application becomes a bottleneck and needs to be scaled horizontally. The problem is that the system requires messages concerning a particular client to be consumed by the same instance of the application.
I could create lots of queues: either one per client, or use a topic exchange and route based on some client_id prefix. Still, I don't see an elegant way to design the consumer application so that it can be scaled horizontally (as it requires stating explicitly which queues it consumes).
I'm looking for RabbitMQ way for solving this problem.
RabbitMQ has x-consistent-hash and x-modulus-hash exchanges that can be used to solve the problem. When these exchanges are used, messages get partitioned to different queues according to hash values of the routing keys. Of course, there are differences between x-consistent-hash and x-modulus-hash in the way partitioning is implemented, but the main idea stays the same: messages with the same routing key (client_id) will be routed to the same queue and should eventually be consumed by the same application instance.
For example, the system can have the following topology: every application instance defines an exclusive queue (used by only one connection; the queue is deleted when that connection closes) that is bound to the exchange (x-consistent-hash or x-modulus-hash).
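As a rough sketch of that topology using the Python pika client (assuming the rabbitmq_consistent_hash_exchange plugin is enabled; the exchange name and client id here are made up):

    import pika

    connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
    channel = connection.channel()

    # Messages are partitioned across bound queues by a hash of the
    # routing key, so all messages for one client_id land in one queue.
    channel.exchange_declare(exchange="client-events",
                             exchange_type="x-consistent-hash")

    # Each consumer instance declares its own exclusive, server-named
    # queue, which disappears when the consumer's connection closes.
    result = channel.queue_declare(queue="", exclusive=True)
    queue_name = result.method.queue

    # For a consistent-hash exchange the binding key is a weight,
    # not a pattern: "1" gives every queue an equal share of the ring.
    channel.queue_bind(queue=queue_name, exchange="client-events",
                       routing_key="1")

    # Publish with client_id as the routing key: same client, same queue.
    channel.basic_publish(exchange="client-events",
                          routing_key="client-12345",
                          body=b"payload for client 12345")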
In my opinion, a distributed cache layer would also be a good fit in this particular scenario, but RabbitMQ provides the plugins to tackle this kind of problem.

Redis keyspace notifications subscriptions in distributed environment using ServiceStack

We have some Redis keys with a given TTL that we would like to subscribe to and take action upon once the TTL expires (a la job scheduler).
This works well in a single-host environment: when you subscribe in ServiceStack, using its Redis client, to '__keyevent@0__:expired', that service will pick it up and take action. That's fantastic...
... until you have a high-availability topology set up, with more than one API instance in that cluster. Then every single host appears to pick up on that message and potentially act on it.
I know keyspace notifications don't work exactly the same as traditional pub/sub or messaging-layer events, but is there a way to perform some kind of acknowledgement on these kinds of events, so that, at the end of the day, only one host will carry on with the task?
Otherwise, is there a way to delay a message publishing?
Thanks!
As described in https://redis.io/topics/notifications:
Every node of a Redis cluster generates events about its own subset of the keyspace as described above. However, unlike regular Pub/Sub communication in a cluster, events' notifications are not broadcasted to all nodes. Put differently, keyspace events are node-specific. This means that to receive all keyspace events of a cluster, clients need to subscribe to each of the nodes.
So the client should create a separate connection to each node to get Redis keyspace notifications.
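As a rough illustration in redis-py (the node addresses are hypothetical, and notify-keyspace-events must be configured to emit expiration events), the per-node subscription could look like this:

    import redis

    nodes = [("10.0.0.1", 7000), ("10.0.0.2", 7000), ("10.0.0.3", 7000)]

    pubsubs = []
    for host, port in nodes:
        r = redis.Redis(host=host, port=port)
        p = r.pubsub()
        # Each node only reports expirations for its own keys.
        p.psubscribe("__keyevent@0__:expired")
        pubsubs.append(p)

    # Poll each node's subscription; message["data"] is the expired key.
    while True:
        for p in pubsubs:
            message = p.get_message(timeout=0.1)
            if message and message["type"] == "pmessage":
                print("expired:", message["data"])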
My understanding of your question: you need an event-based unicast notification whenever a key expires.
This solution will be helpful to you if the above assumption is correct. It's a somewhat crude solution, but it works!
Solution:
You need to put (perhaps using a service/thread) the expired keys into a Redis list/queue. A blocking B*POP operation from the client instances on this list/queue will then give you what you want!
How does it work?
Let's assume a single background thread continuously pushes the expired keys into a Redis list/queue, while the cluster of API instances calls a blocking pop on this list/queue.
Since each item popped from a Redis list with a blocking pop is consumed by only one client, only one API instance will get the notification of the expired key!
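A minimal redis-py sketch of this producer/consumer split, assuming a shared Redis and a hypothetical list name "expired-keys":

    import redis

    r = redis.Redis()

    # Producer side: a single background thread/service that listens for
    # expiration events and pushes each expired key onto the list.
    def enqueue_expired_key(key):
        r.rpush("expired-keys", key)

    # Consumer side: run on every API instance. BLPOP blocks until an
    # item is available, and each item is handed to exactly one caller,
    # so only one instance processes any given expired key.
    def consume_expired_keys():
        while True:
            _list_name, key = r.blpop("expired-keys")
            print("handling expired key:", key)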
Ref:
List pop operation: https://redis.io/commands/lpop
Similar problem with pub/sub: Competing Consumer on Redis Pub/Sub supported?

RabbitMQ clustering and mirror queues behavior behind the scenes

Can someone please explain what is going on behind the scenes in a RabbitMQ cluster with multiple nodes and queues in mirrored fashion when publishing to a slave node?
From what I read, it seems that all actions other than publishes go only to the master, and the master then broadcasts the effect of the actions to the slaves (this is from the documentation). From my understanding, it means a consumer will always consume messages from the master queue. Also, if I send a request to a slave for consuming a message, that slave will do an extra hop by going to the master to fetch that message.
But what happens when I publish to a slave node? Will this node do the same thing of first sending the message to the master?
It seems there are so many extra hops when dealing with slaves that you could get better performance if you talked only to the master. But how do you handle master failure? Then one of the slaves will be elected master, so you have to know where to connect?
Asking all of this because we are using a RabbitMQ cluster with HAProxy in front, so we can decouple the cluster structure from our apps. This way, whenever a node goes down, HAProxy will redirect to the living nodes. But we have problems when we kill one of the rabbit nodes. The connection to rabbit is permanent, so if it fails, you have to recreate it. Also, you have to resend the messages in these cases, otherwise you will lose them.
Even with all of this, messages can still be lost, because they may be in transit when I kill a node (in some buffers, somewhere on the network, etc.). So you have to use transactions or publisher confirms, which guarantee delivery only after all the mirrors have received the message. But here is another issue: you may get duplicate messages, because the broker might have sent a confirmation that never reached the producer (due to network failures, etc.). Therefore consumer applications will need to perform deduplication or handle incoming messages in an idempotent manner.
Is there a way of avoiding this? Or do I have to choose between losing a couple of messages and duplicating some messages?
Can someone please explain what is going on behind the scenes in a RabbitMQ cluster with multiple nodes and queues in mirrored fashion when publishing to a slave node?
This blog outlines exactly what happens.
But what happens when I publish to a slave node? Will this node do the same thing of sending first the message to the master?
The message will be redirected to the master Queue - that is, the node on which the Queue was created.
But how do you handle master failure? Then one of the slaves will be elected master, so you have to know where to connect to?
Again, this is covered here. Essentially, you need a separate service that polls RabbitMQ and determines whether nodes are alive or not. RabbitMQ provides a management API for this. Your publishing and consuming applications need to refer to this service either directly, or through a mutual data-store in order to determine that correct node to publish to or consume from.
The connection to rabbit is permanent, so if it fails, you have to recreate it. Also, you have to resend the messages in this cases, otherwise you will lose them.
You need to subscribe to connection-interrupted events to react to severed connections. You will need to build in some level of redundancy on the client in order to ensure that messages are not lost. I suggest, as above, that you introduce a service specifically designed to interrogate RabbitMQ. Your client can attempt to publish a message to the last known active connection, and should this fail, the client can ask the monitor service for an up-to-date listing of the RabbitMQ cluster. Assuming that there is at least one active node, the client may then establish a connection to it and publish the message successfully.
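As a rough sketch of such a monitor using the management API's /api/nodes endpoint (the host and credentials here are placeholders), in Python:

    import requests

    def live_nodes():
        # /api/nodes reports every cluster node and whether it is running.
        response = requests.get("http://rabbit-1:15672/api/nodes",
                                auth=("guest", "guest"))
        response.raise_for_status()
        return [node["name"] for node in response.json() if node["running"]]

    # A publisher whose connection failed could call live_nodes() and
    # reconnect to any node in the returned list before resending.
    print(live_nodes())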
Even with all of this, messages can still be lost, because they may be in transit when I kill a node
There are certain edge cases that you can't cover with redundancy, and neither can RabbitMQ. For example, consider what happens when a message lands in a Queue and the HA policy invokes a background process to copy the message to a backup node: during this process there is potential for the message to be lost before it is persisted to the backup node. Should the active node immediately fail, the message will be lost for good. There is nothing that can be done about this. Unfortunately, when we get down to the level of actual bytes travelling across the wire, there's a limit to the safeguards that we can build.
Therefore consumer applications will need to perform deduplication or handle incoming messages in an idempotent manner.
You can handle this a number of ways. For example, setting the message-ttl to a relatively low value will ensure that duplicated messages don't remain on the Queue for extended periods of time. You can also tag each message with a unique reference, and check that reference at the consumer level. Of course, this would require storing a cache of processed messages to compare incoming messages against; the idea being that if a previously processed message arrives, its tag will have been cached by the consumer, and the message can be ignored.
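Here is a rough Python/pika sketch of the unique-reference idea, assuming publishers stamp each message with a unique message_id property; the in-memory cache of processed ids is a stand-in for a proper shared, bounded store:

    import pika

    seen_ids = set()  # in production, a shared and bounded cache

    def on_message(channel, method, properties, body):
        message_id = properties.message_id  # assumes publishers set this
        if message_id in seen_ids:
            # Duplicate delivery: acknowledge and drop it.
            channel.basic_ack(delivery_tag=method.delivery_tag)
            return
        seen_ids.add(message_id)
        print("processing:", body)
        channel.basic_ack(delivery_tag=method.delivery_tag)

    connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
    channel = connection.channel()
    channel.queue_declare(queue="work")
    channel.basic_consume(queue="work", on_message_callback=on_message)
    channel.start_consuming()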
One thing that I'd stress with AMQP and queue-based solutions in general is that your infrastructure provides the tools, but not the entire solution. You have to bridge those gaps based on your business needs. Often, the best solution is derived through trial and error. I hope my suggestions are of use. I blog about a number of RabbitMQ design solutions, including the issues you mentioned, here if you're interested.

CQRS using Redis MQ

I have been working on a CQRS project (my first) for the last 9 months, which has been a steep learning curve. I am currently using JOliver's excellent EventStore in my write model and PostgreSQL for my read model.
Both my read and write databases are on the same machine which means that when a change is made to the write database, in the same synchronous call a change is made to the read model.
As I was learning CQRS I felt this was the best way to go as I had no experience with message queue/service bus frameworks such as MassTransit, NServiceBus etc.
I am now at a point with most of my architecture in place to introduce a message queue framework.
Today, I came across Redis MQ which is part of ServiceStack and as we are already using ServiceStack for our Rest based HTTP clients, this seems like the right way to go.
My question is more about understanding what I need to know (or if I have any misunderstandings) to implement Redis MQ and whether Redis MQ is the right choice?
Now, from what I understand, I would use Redis MQ as a durable queue between the write and read databases. Once my event store has recorded that something has happened in my domain, it will publish to Redis MQ. The services listening for events/messages would receive the event/message from Redis MQ and, once they have processed it (i.e. updated or written to the read model), a notification/response would go back to the event store to tell it that the message has been received and processed by the listener/subscriber.
Does this sound correct?
Also would the Redis MQ architecture give me everything that NSB, RavenDB, MassTransit etc offer?
Also, I will be deploying to windows 2008 and 2003 server. Is Redis stable for these OSs?
I think the ServiceStack implementation of message queueing in Redis is more appropriate for job-queue scenarios - it pushes a message onto the end of a Redis list and then uses Redis pub-sub to notify listening subscribers that there is a message to pull from the queue. Any consumers would be competing for messages.
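For illustration, here is a rough sketch of that list-plus-notification pattern in plain redis-py rather than ServiceStack (the queue and channel names are made up):

    import redis

    r = redis.Redis()

    # Producer: append the job to a list, then notify any listeners.
    def enqueue(job):
        r.rpush("mq:myqueue", job)
        r.publish("mq:myqueue:notify", "new")

    # Consumer: wake up on a notification, then *pull* from the list.
    # Because LPOP hands each item to exactly one caller, consumers
    # compete for messages rather than each receiving a copy.
    def consume():
        p = r.pubsub()
        p.subscribe("mq:myqueue:notify")
        for message in p.listen():
            if message["type"] != "message":
                continue
            job = r.lpop("mq:myqueue")
            if job is not None:  # another consumer may have won the race
                print("processing:", job)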
For event sourcing, you may be more interested in a type of fanout or topic based messaging topology as offered by RabbitMQ, not that that precludes you from building that sort of thing using Redis data structures yourself.
Now from what I understand, I would use Redis MQ as a durable queue
between the write and read database.
Yes this is correct.
Once my event store has recorded that something has happened in my
domain then it will publish to Redis MQ.
Yes, and this can be done in several ways. It can either happen as part of the transaction which persists to the event store, or you can have an out-of-band process which continuously publishes events from the event store.
a notification/response goes back to the event store to tell the event
store that the message has been received and processed by the
listener/subscriber.
The response back to the event publisher is usually omitted. This truly decouples the publishers from subscribers. You make the assumption that once the message is published, all interested subscribers will handle it. If something happens, an error should be logged.
Also would the Redis MQ architecture give me everything that NSB,
RavenDB, MassTransit etc offer?
I don't have experience running Redis MQ, but I do know that Redis supports pub/sub, which is one of the value propositions of NSB and MassTransit (as opposed to, say, bare-bones MSMQ). What MT and NSB offer beyond pub/sub are sagas, and it doesn't seem like Redis MQ supports those out of the box, at least. You may not ever need sagas, so this should not automatically be a deterrent. RavenDB is not a message queue, so it doesn't apply here.
Also, I will be deploying to windows 2008 and 2003 server. Is Redis
stable for these OSs?
I've run Redis on 2008 R2 and it has been stable so I would think Redis MQ would be stable as well.
You may be interested in a little side project of mine on GitHub, which is a queue and persistence implementation for NServiceBus using Redis: https://github.com/mackie1001/NServicebus.Redis
I'd not call it production ready, and I want to port it to NSB 4 and do some thorough testing, but the meat of it is done.