Redis keyspace notifications subscriptions in distributed environment using ServiceStack - redis

We have some Redis keys with a given TTL that we would like to subscribe to and take action upon once the TTL expires (a la job scheduler).
This works well in a single-host environment, and when you subscribe in ServiceStack, using its Redis client, to '__keyspace#0__:expired', that service will pick it and take action. That's fantastic...
... until you have a high-availability topology set up, with more than one API instance in that cluster. Then every single host appears to be picking up on that message and potentially doing things with it.
I know keyspace notifications don't work exactly the same as traditional pub/sub or messaging-layer events, but is there a way to perform some kind of acknowledgement on these kinds of events, so that, at the end of the day, only one host will carry on with the task?
Otherwise, is there a way to delay a message publishing?
Thanks!

As describe in https://redis.io/topics/notifications
very node of a Redis cluster generates events about its own subset of the keyspace as described above. However, unlike regular Pub/Sub communication in a cluster, events' notifications are not broadcasted to all nodes. Put differently, keyspace events are node-specific. This means that to receive all keyspace events of a cluster, clients need to subscribe to each of the nodes.
So client should create separate connection to each node to get redis keyspace notification.

My understanding of your question: You need an event based unicast notification whenever a key is expired.
This solution will be helpful to you if above assumption is correct. It's kind of crude solution but works!
Solution:
You need to put(may be using a service/thread) the expired keys in the Redis List/queue. Then blocking B*POP operation from the client instances on this list/queue will give you what you want!
How does it work?
Let's assume, a single background thread will continuously push the expired keys into a redis list/queue. The cluster of API instances will be calling blocking pop on this list/queue.
Since, blocking pop operation on each item of redis list will be consumed by only one client, only one API instance will the get the notification of expired key!!!
Ref:
List pop operation: https://redis.io/commands/lpop
Similar problem with pub/sub: Competing Consumer on Redis Pub/Sub supported?

Related

Redis Subscribing to a channel ( key space notifications should be enabled ??)

I am working on a node JS app that connect to redis server and subscribe to a channel to get the messages.
There is a bit of confusion, should we really enable "key space notifications" on redis config to get the events in client
The same scenario I have tried using rdis cli, with which i see "key space notifications" are not enabled at the same time I have subscribed to a channel with a pattern, so when ever I publish a message from the other client, I am able to capture that event in subscribed client.
Is the "key space notifications" mandatory , but the POC says other way.
Does any one know what should be the right approach here, subscribing to channel is suffice to get messages, and its nothing to do with "key-space-notifications" ??
From Redis Keyspace Notifications
Keyspace notifications allow clients to subscribe to Pub/Sub channels in order to receive events affecting the Redis data set in some way.
Examples of events that can be received are:
All the commands affecting a given key.
All the keys receiving an LPUSH operation.
All the keys expiring in the database 0.
Events are delivered using the normal Pub/Sub layer of Redis, so clients implementing Pub/Sub are able to use this feature without modifications.
So, if you need just pub/sub, there is no need of extra configuration regarding Keyspace Notifications

Why pub sub in redis cannot be used together with other commands?

I'm reading here, and I see a warning stating that PUB/SUB subscribers in Redis should not issue other commands:
A client subscribed to one or more channels should not issue commands,
although it can subscribe and unsubscribe to and from other channels.
I have two questions:
Why is this limitation?
For the scope of the paragraph, what's a client? A whole process? A Redis connection? A complete Redis instance? Or is it a bad idea in general to issue commands and subscribe to channels, and the admonition goes for every and any scope I can think of?
A client, in this case, is an instance of a connection to Redis. An application could well have multiple clients, each with different responsibilities or as a way to provide higher degrees of parallelism to the application.
What they are suggesting here, however, is that you use an individual client (think 'connection') to handle your incoming subscription messages and to react to those messages as its sole responsibility. The reason it's recommended not to make calls with this connection is because while it is waiting on incoming messages from subscribed channels, the client is in a blocked state.
Trying to make a call on a given client won't work while it's awaiting response from a blocking call.

AWS: Broadcast notifications for multiple worker processes running on multiple instances

I have multiple application instances inside of Amazon EC2, each running several worker processes. What I want is each worker process to be subscribed to some notification(e.g. configuration change). This notification should be basically broadcast message, so that once it is sent - every worker receives it.
I know SQS does not support messages broadcast. Looking through similar questions/threads I see the suggestions to use SNS instead of SQS. I'm not sure this will work for me due to the following reasons:
application instances are part of autoscaling group so they can be dynamically added and removed. In this case I don't see any clear way to unsubscribe every worker(I have multiple workers per instance) once instance gets terminated, which means I'll end up with the mess of dead subscribers after some time.
protocol to use for subscription is also not clear. HTTP endpoint looks like the only option, which means my every worker should run HTTP server on its own port. It also looks I should listen only on instance public IP, which adds one more layer of complexity and insecurity.
At the moment I have a solution based on third party - I'm using 0MQ pub/sub server. But I'm looking for some out-of-box solutions AWS provides.
Thanks,
Vovan
The out-of-the-box AWS solution that comes to mind would be to create one SNS topic, and then for each instance, when the instance boots up, it would create its own SQS queue and subscribe the queue to the SNS topic, so that each individual queue gets a broadcast copy of each message you publish to SNS.
You'd want unsubscribe and delete these queues on instance termination, which could be done with lifecycle hooks.
If you didn't want to use a server to manage the processing of the lifecycle hooks (which publish the launch or termination events to SNS or SQS) you could create an AWS API Gateway endpoint to fire an AWS Lambda function, then subscribe the API Gateway endpoint to the SNS topic using https, to handle the cleanup tasks in Lambda, with no server needed.
That's several services working together and may sound a little complicated, but would be very inexpensive and require little maintenance or attention.
One more solution I've figured out is to use Amazon Kinesis.The implication here is that each subscriber has to maintain it's own checkpoint to receive only most recent notifications.
I realize this is an old thread, but I'd like to share my experience with this. Kinesis has a 5 reads/sec throttle. So if you have 10 nodes polling for events in the stream 1/sec, you're going to be in a constant state of throttling.
Kinesis looks to be primarily for massive writes with just a few readers, which doesn't quite fit a broadcast to many nodes use-case.
Redis is handy solution for broadcasting a message to all subscribers on a topic. It is convenient because it can be used as a docker container for rapid prototyping, but is also offered by AWS as a managed service for multi-node clusters.

Pub/Sub and Redis Clustering

On this link it says "The current implementation will simply broadcast all the publish messages to all the other nodes" and adds that it will be improved in future.
For current implementation: If loosing messages is not important; does it make sense to use redis for pub/sub for now? It looks like one instance is better to stop broadcast traffic. Because beside writes; reads should be propgated to other nodes too! (so that the client will not be notified twice.)
Am I missing something?
No, I don't think you missed any point. Redis Cluster is an on-going work, and this includes the specifications. The section about pub/sub is rather light and could probably be improved.
In Salvatore's proposal, a client is subscribed on a single instance (not to all of them), so when the publications are broadcasted to all instances, the client is only notified once. If the Redis instance is down, it is up to the client to subscribe on one of the surviving node of the cluster (any other).
Another possibility would have been to elect one node of the cluster as a unique pub/sub node, so that clients can publish and subscribe on this node only. But high-availability of the pub/sub service would be more difficult to support this way.

Does the redis pub/sub model require persistent connections to redis?

In a web application, if I need to write an event to a queue, I would make a connection to redis to write the event.
Now if I want another backend process (say a daemon or cron job) to process the or react the the publishing of the event in redis, do I need a persistant connection?
Little confused on how this pub/sub process works in a web application.
Basically in Redis there are two different messaging models:
Fire and Forget / One to Many: Pub/Sub. At the time a message is PUBLISH-ed all the subscribers will receive it, but this message is then lost forever. If a client was not subscribed there is no way it can get it back.
Persisting Queues / One to One: Lists, possibly used with blocking commands such as BLPOP. With lists you have a producer pushing into a list, and one or many consumers waiting for elements, but one message will reach only one of the waiting clients. With lists you have persistence, and messages will wait for a client to pop them instead of disappearing. So even if no one is listening there is a backlog (as big as your available memory, or you can limit the backlog using LTRIM).
I hope this is clear. I suggest you studying the following commands to understand more about Redis and messaging semantics:
LPUSH/RPUSH, RPOP/LPOP, BRPOP/BLPOP
PUBLISH, SUBSCRIBE, PSUBSCRIBE
Doc for this commands is available at redis.io
I'm not totally sure, but I believe that yes, pub/sub requires a persistent connection.
For an alternative I would take a peek at resque and how it handles that. Instead of using pub/sub it simply adds an item to a list in redis, and then whatever daemon or cron job you have can use the lpop command to get the first one.
Sorry for only giving a pseudo answer and then a plug.