Redis Pub Sub - Only one subscriber to Act on expired event - redis

I'm implementing redis Keyspace notifications in my application which is having 10 instances on our production environment.
My pubsub listens for expired event in map1 and decrements in map2 based on that.
This works fine on my local machine. My issue is that when I deploy my application with multiple instances , I think all instances will read expired event and all will decrement the key whereas I want to restrict that only 1 instance should decrement.
Is there any way to achieve this ?

Your listeners will have to coordinate the decrement somehow. You can do that with some sort of locking, but a simpler way perhaps would be embed a notion of version/timestamp into this logic. Here's what I had in mind.
What if you include a timestamp in your "map2"? An expired event has it's own timestamp, so you can have the listeners check-and-set against that (tip: I'd use Lua for the CAS). This will prevent race-like conditions and multiple decrements in one go.
Note: Redis PubSub is amazing, but note that your current solution does not ensure the decrement in "map2" in case a message is lost. In the very near future, Redis will offer the Stream data type, that is much more suitable for that type of job. Specifically, the Stream Consumer Groups functionality is IMO just what you need here for replacing keyspace notifications.

Related

Redis keyspace notifications - get values(small size) of set operations

I'm working on creating DB with Redis.
One of my recruitments is that all the clients in the system will be able to listen to set events and get information about both key and value change.
I know that publishing value may be big(512 MB) but I know that in my system the size of value will not be more than 100 chars.
I have 3 possible solutions and I wonder which one will be better or consider other solutions:
1) After each set operation client will also publish it (PUB/SUB)
2)Edit setGenericCommand function to publish the value as well and use keyspace binding.
3)After client receive keyspace notification it will get the value with get operation.
I would like to understand which approach will be better?
Thank you!
So, 1st and foremost, remember that PubSub is at-most-once delivery. If you really need to process every change in the client, you should consider a more resilient way to do so.
That said, assuming you're ok with PubSub's promises, 1 is the simplest and I'd go with that. At most, I'd provide the clients with a Lua wrapper that combines the SET and PUBLISH commands. This, of course, removes the need to actually listen to Keyspace notifications as you basically implementing it yourself.
2 means hacking Redis, which is great but means you'll have to maintain your own which is meh--;
3 is also simple enough, but with 1 you get away with a single round trip instead of 2.
Another (4) approach is to write a custom module, but IMO too complex for this need. Go with 1 and Lua, and may the force be with you.

Redis keyspace notifications subscriptions in distributed environment using ServiceStack

We have some Redis keys with a given TTL that we would like to subscribe to and take action upon once the TTL expires (a la job scheduler).
This works well in a single-host environment, and when you subscribe in ServiceStack, using its Redis client, to '__keyspace#0__:expired', that service will pick it and take action. That's fantastic...
... until you have a high-availability topology set up, with more than one API instance in that cluster. Then every single host appears to be picking up on that message and potentially doing things with it.
I know keyspace notifications don't work exactly the same as traditional pub/sub or messaging-layer events, but is there a way to perform some kind of acknowledgement on these kinds of events, so that, at the end of the day, only one host will carry on with the task?
Otherwise, is there a way to delay a message publishing?
Thanks!
As describe in https://redis.io/topics/notifications
very node of a Redis cluster generates events about its own subset of the keyspace as described above. However, unlike regular Pub/Sub communication in a cluster, events' notifications are not broadcasted to all nodes. Put differently, keyspace events are node-specific. This means that to receive all keyspace events of a cluster, clients need to subscribe to each of the nodes.
So client should create separate connection to each node to get redis keyspace notification.
My understanding of your question: You need an event based unicast notification whenever a key is expired.
This solution will be helpful to you if above assumption is correct. It's kind of crude solution but works!
Solution:
You need to put(may be using a service/thread) the expired keys in the Redis List/queue. Then blocking B*POP operation from the client instances on this list/queue will give you what you want!
How does it work?
Let's assume, a single background thread will continuously push the expired keys into a redis list/queue. The cluster of API instances will be calling blocking pop on this list/queue.
Since, blocking pop operation on each item of redis list will be consumed by only one client, only one API instance will the get the notification of expired key!!!
Ref:
List pop operation: https://redis.io/commands/lpop
Similar problem with pub/sub: Competing Consumer on Redis Pub/Sub supported?

Redis as a message broker

Question
I want to pass data between applications, in a publish-subscribe manner. Data may be produced at a much higher rate than consumed and messages get lost, which is not a problem. Imagine a fast sensor and a slow sensor data processor. For that, I use redis pub/sub and wrote a class which acts as a subscriber, receives every message and puts that into a buffer. The buffer is overwritten when a new message comes in or nullified when the message is requested by the "real" function. So when I ask this class, I immediately get a response (hint that my function is slower than data comes in) or I have to wait (hint that my function is faster than the data).
This works pretty good for the case that data comes in fast. But for data which comes in relatively seldom, let's say every five seconds, this does not work: imagine my consumer gets launched slightly after the producer, the first message is lost and my consumer needs to wait nearly five seconds, until it can start working.
I think I have to solve this with Redis tools. Instead of a pub/sub, I could simply use the get/set methods, thus putting the cache functionality into Redis directly. But then, my consumer would have to poll the database instead of the event magic I have at the moment. Keys could look like "key:timestamp", and my consumer now has to get key:* and compare the timestamps permamently, which I think would cause a lot of load. There is no natural possibility to sleep, since although I don't care about dropped messages (there is nothing I can do about), I do care about delay.
Does someone use Redis for a similar thing and could give me a hint about clever use of Redis tools and data structures?
edit
Ideally, my program flow would look like this:
start the program
retrieve key from Redis
tell Redis, "hey, notify me on changes of key".
launch something asynchronously, with a callback for new messages.
By writing this, an idea came up: The publisher not only publishes message on topic key, but also set key message. This way, an application could initially get and then subscribe.
Good idea or not really?
What I did after I got the answer below (the accepted one)
Keyspace notifications are really what I need here. Redis acts as the primary source for information, my client subscribes to keyspace notifications, which notify the subscribers about events affecting specific keys. Now, in the asynchronous part of my client, I subscribe to notifications about my key of interest. Those notifications set a key_has_updates flag. When I need the value, I get it from Redis and unset the flag. With an unset flag, I know that there is no new value for that key on the server. Without keyspace notifications, this would have been the part where I needed to poll the server. The advantage is that I can use all sorts of data structures, not only the pub/sub mechanism, and a slow joiner which misses the first event is always able to get the initial value, which with pub/sib would have been lost.
When I need the value, I obtain the value from Redis and set the flag to false.
One idea is to push the data to a list (LPUSH) and trim it (LTRIM), so it doesn't grow forever if there are no consumers. On the other end, the consumer would grab items from that list and process them. You can also use keyspace notifications, and be alerted each time an item is added to that queue.
I pass data between application using two native redis command:
rpush and blpop .
"blpop blocks the connection when there are no elements to pop from any of the given lists".
Data are passed in json format, between application using list as queue.
Application that want send data (act as publisher) make a rpush on a list
Application that want receive data (act as subscriber) make a blpop on the same list
The code shuold be (in perl language)
Sender (we assume an hash pass)
#Encode hash in json format
my $json_text = encode_json \%$hash_ref;
#Connect to redis and send to list
my $r = Redis->new(server => "127.0.0.1:6379");
$r->rpush("shared_queue","$json_text");
$r->quit;
Receiver (into a infinite loop)
while (1) {
my $r = Redis->new(server => "127.0.0.1:6379");
my #elem =$r->blpop("shared_queue",0);
#Decode hash element
my $hash_ref=decode_json($elem\[1]);
#make some stuff
}
I find this way very usefull for many reasons:
The element are stored into list, so temporary disabling of receiver has no information loss. When recevier restart, can process all items into the list.
High rate of sender can be handled with multiple instance of receiver.
Multiple sender can send data on unique list. In ths case should be easily implmented a data collector
Receiver process that act as daemon can be monitored with specific tools (e.g. pm2)
From Redis 5, there is new data type called "Streams" which is append-only datastructure. The Redis streams can be used as reliable message queue with both point to point and multicast communication using consumer group concept Redis_Streams_MQ

Redis publish/subscribe: see what channels are currently subscribed to

I am currently interested in seeing what channels are subscribed to in a Redis pub/sub application I have. When a client connects to our server, we register them to a channel that looks like:
user:user_id
The reason for this is I want to be able to see who's "online". I currently blindly fire off messages to a channel without knowing if a client is online since it's not critical that they receive these types of messages.
In an effort to make my application smarter, I'd like to be able to discover if a client is online or not using the pub/sub API, and if they are offline, cache their messages to a separate redis queue which I can push to them when they get back online.
This does not have to be 100% accurate, but the more accurate it is, the better. I'm assuming a generic key does not get created when a channel gets subscribed to, so I cannot do something as trivial as:
redis-cli keys user* to find all online users.
The other strategy I've thought of is just maintaining my own Redis Set whenever a user published or removes themselves from a channel (which the client automatically handles when they hop online and close the app). That would be an additional layer of complexity that I need to manage and I'm hoping there is a more trivial approach with the data that's already available.
As of Redis 2.8 you can do:
PUBSUB CHANNELS [pattern]
The PUBSUB CHANNELS command has O(N) complexity, where N is the number of active channels.
So in your case:
redis-cli PUBSUB CHANNELS user*
would give you want you want.
There is currently no command for showing what channels "exist" by way of being subscribed to, but there is and "approved" issue and a pull request that implements this.
https://github.com/antirez/redis/issues/221
https://github.com/antirez/redis/pull/412
Due to the nature of this call, it is not something that can scale, and is thus a "DEBUG" command.
There are a few other ways to solve your problem, however.
If you have reason to believe that a channel may be subscribed to, you can send it a message and look at the result. The result is the number of subscribers that got the message. If you got 0, you know that they're not there.
Assuming that your user_ids are incremental, you might be interested in using SETBIT to set a 1 or 0 to a user's offset bit to track presence. You can then do cool things like the new BITCOUNT to see how many users are online, and GETBIT to determine if a specific user is online.
The way I have solved your problem more specifically in the past is by signaling a subscription manager that I have subscribed to a channel. The manager then "pings" the channel by sending a blank message to confirm that there is a subscriber, and occasionally pings the channel thereafter to determine if the user is still online. Not ideal, but better than using DEBUG CHANNELS in production.
From version 2.8.0 redis has a pubsub command that would help in this case:
http://redis.io/commands/pubsub
Remark: currently the state of 2.8.0 is not stable yet (RC2)
I am unaware of any specific way to query what channels are being subscribed to, and you are correct that there isn't any key created when this happens. Also, I wouldn't use the KEYS command in production anyway, as it's really a debugging command.
You have the right idea about using a set to add the user when they're online, and then query this with SISMEMBER <set> <user_id> to determine if the messages should be sent to them or added to a Redis list for processing once they do come online.
You will need to figure out when a user logs off so you can remove them from the list of online users, but I don't know enough about your system to know exactly how you would go about that.
If the connected clients have the ability to send a message back to inform the server that the message(s) were consumed, you could use this to keep track of which messages should be stored for later retrieval.
Cheers,
Mike
* PUBSUB NUMSUB [channel-1 ... channel-N]
Returns the number of subscribers (not counting clients subscribed to patterns) for the specified channels.
https://redis.io/commands/pubsub

How is Redis used in Trello?

I understand that, roughly speaking, Trello uses Redis for a transient data store.
Is anyone able to elaborate further on the part it plays in the application?
We use Redis on Trello for ephemeral data that we would be okay with losing. We do not persist the data in Redis to disk, and we use it allkeys-lru, so we only store things there can be kicked out at any time with only very minor inconvenience to users (e.g. momentarily seeing an incorrect user status). That being said, we give it more than 5x the space it needs to store its actual working set and choose from 10 keys for expiry, so we really never see anything get kicked out that we're using.
It's our pubsub server. When a user does something to a board or a card, we want to send a message with that delta to all websocket-connected clients that are subscribed to the object that changed, so all of our Node processes are subscribed to a pubsub channel that propagates those messages, and they propagate that out to the appropriately permissioned and subscribed websockets.
We SORT OF use it to back socket.io, but since we only use the websockets, and since socket.io is too chatty to scale like we need it to at the moment, we have a patch that disables all but the one channel that is necessary to us.
For our users who don't have websockets, we have to keep a list of the actions that have happened on each object channel since the user's last poll request. For that we use a list which we cap at the most recent 100 elements, and an auxilary counter of how many elements have been added to the list since it was created. So when we're answering a poll request from such a browser, we can check the last element it reports that it has seen, and only send down any messages that have been added to the queue since then. So that gets a poll request down to just a permissions check and a single Redis key check in most cases, which is very fast.
We store some ephemeral data about the active status of connected users in Redis, because that data changes frequently and it is not necessary to persist it to disk.
We store short-lived keys to support OAuth logins in Redis.
We love Redis; once you have an instance of it up and running, you want to use it for all kinds of things. The only real trouble we have had with it is with slow-consuming clients eating up the available space.
We use MongoDB for our more traditional database needs.
Trello uses Redis with Socket.IO (RedisStore) for scaling, with the following two features:
key-value store, to set and get values for a connected client
as a pub-sub service
Resources:
Look at the code for RedisStore in Socket.IO here: https://github.com/LearnBoost/socket.io/blob/master/lib/stores/redis.js
Example of Socket.IO with RedisStore: http://www.ranu.com.ar/2011/11/redisstore-and-rooms-with-socketio.html