Redis publish/subscribe: see what channels are currently subscribed to - redis

I am currently interested in seeing what channels are subscribed to in a Redis pub/sub application I have. When a client connects to our server, we register them to a channel that looks like:
user:user_id
The reason for this is I want to be able to see who's "online". I currently blindly fire off messages to a channel without knowing if a client is online since it's not critical that they receive these types of messages.
In an effort to make my application smarter, I'd like to be able to discover if a client is online or not using the pub/sub API, and if they are offline, cache their messages to a separate redis queue which I can push to them when they get back online.
This does not have to be 100% accurate, but the more accurate it is, the better. I'm assuming a generic key does not get created when a channel gets subscribed to, so I cannot do something as trivial as:
redis-cli keys user* to find all online users.
The other strategy I've thought of is just maintaining my own Redis Set whenever a user published or removes themselves from a channel (which the client automatically handles when they hop online and close the app). That would be an additional layer of complexity that I need to manage and I'm hoping there is a more trivial approach with the data that's already available.

As of Redis 2.8 you can do:
PUBSUB CHANNELS [pattern]
The PUBSUB CHANNELS command has O(N) complexity, where N is the number of active channels.
So in your case:
redis-cli PUBSUB CHANNELS user*
would give you want you want.

There is currently no command for showing what channels "exist" by way of being subscribed to, but there is and "approved" issue and a pull request that implements this.
https://github.com/antirez/redis/issues/221
https://github.com/antirez/redis/pull/412
Due to the nature of this call, it is not something that can scale, and is thus a "DEBUG" command.
There are a few other ways to solve your problem, however.
If you have reason to believe that a channel may be subscribed to, you can send it a message and look at the result. The result is the number of subscribers that got the message. If you got 0, you know that they're not there.
Assuming that your user_ids are incremental, you might be interested in using SETBIT to set a 1 or 0 to a user's offset bit to track presence. You can then do cool things like the new BITCOUNT to see how many users are online, and GETBIT to determine if a specific user is online.
The way I have solved your problem more specifically in the past is by signaling a subscription manager that I have subscribed to a channel. The manager then "pings" the channel by sending a blank message to confirm that there is a subscriber, and occasionally pings the channel thereafter to determine if the user is still online. Not ideal, but better than using DEBUG CHANNELS in production.

From version 2.8.0 redis has a pubsub command that would help in this case:
http://redis.io/commands/pubsub
Remark: currently the state of 2.8.0 is not stable yet (RC2)

I am unaware of any specific way to query what channels are being subscribed to, and you are correct that there isn't any key created when this happens. Also, I wouldn't use the KEYS command in production anyway, as it's really a debugging command.
You have the right idea about using a set to add the user when they're online, and then query this with SISMEMBER <set> <user_id> to determine if the messages should be sent to them or added to a Redis list for processing once they do come online.
You will need to figure out when a user logs off so you can remove them from the list of online users, but I don't know enough about your system to know exactly how you would go about that.
If the connected clients have the ability to send a message back to inform the server that the message(s) were consumed, you could use this to keep track of which messages should be stored for later retrieval.
Cheers,
Mike

* PUBSUB NUMSUB [channel-1 ... channel-N]
Returns the number of subscribers (not counting clients subscribed to patterns) for the specified channels.
https://redis.io/commands/pubsub

Related

How does Redis PubSub subscribe mechanism works?

I want to create a Publish-Subscribe infrastructure in which every subscriber will listen to multiple (say 100k) channels.
I think to use Redis PubSub for that purpose but I'm not sure if subscribing to thousands of channels is the best practice here.
To answer this I want to know how subscribing mechanism in Redis works in the background.
Another option is to create a channel per subscriber and put some component in between, that will get all messages and publish it to relevant channels.
Any other Idea?
Salvatore/creator of Redis has answered this here: https://groups.google.com/forum/#!topic/redis-db/R09u__3Jzfk
All the complexity on the end is on the PUBLISH command, that performs
an amount of work that is proportional to:
a) The number of clients receiving the message.
b) The number of clients subscribed to a pattern, even if they'll not
match the message.
This means that if you have N clients subscribed to 100000 different
channels, everything will be super fast.
If you have instead 10000 clients subscribed to the same channel,
PUBLISH commands against this channel will be slow, and take maybe a
few milliseconds (not sure about the actual time taken). Since we have
to send the same message to everybody.

RabbitMQ Pub/Sub setup with large number of disconnected clients...

This is a new area for me so hopefully my question makes sense.
In my program I have a large number of clients which are windows services running on laptops - that are often disconnected. Occasionally they come on line and I want them to receive updates based on user profiles. There are many types of notifications that require the client to perform some work on the local application (i.e. the laptop).
I realize that I could do this with a series of restful database queries, but since there are so many clients (upwards to 10,000) and there are lots of different notification types, I was curious if perhaps this was not a problem better suited for a messaging product like RabbitMQ or even 0MQ.
But how would one set this up. (let's assume in RabbitMQ?
Would each user be assigned their own queue?
Or is it preferable to have each queue be a distinct notification type and you would use some combination of direct exchanges or filtering messages based on a routing key, where the routing key could be a username.
Since each user may potentially have a different set of notifications based on their user profile, I am thinking that each client/consumer would have a specific message for each notification sitting on a queue waiting for them to come online and process it.
Is this the right way of thinking about the problem? Thanks in advance.
It will be easier for you to balance a lot of queues than filter long ones, so it's better to use queue per consumer.
Messages can have arbitrary headers and body so it is the right place for notification types.
Since you will be using long-living queues, waiting for consumers on disk - you better use lazy queues https://www.rabbitmq.com/lazy-queues.html (it's available since version 3.6.0)

RabbitMQ Message Lifetime Replay Message

We are currently evaluating RabbitMQ. Trying to determine how best to implement some of our processes as Messaging apps instead of traditional DB store and grab. Here is the scenario. We have a department of users who perform similar tasks. As they submit work to the server applications we would like the server app to send messages back into a notification window saying what was done - to all the users, not just the one submitting the work. This is all easy to do.
The question is we would like these message to live for say 4 hours in the Queue. If a new user logs in or say a supervisor they would get all the messages from the last 4 hours delivered to their notification window. This gives them a quick way to review what has recently happened and what is going on without having to ask others, "have you talked to John?", "Did you email him is itinerary?", etc.
So, how do we publish messages that have a lifetime of x hours from the time they were published AND any new consumers that connect will get all of these messages delivered in chronological order? And preferably the messages just disappear after they have expired from the queue.
Thanks
There is Per-Queue Message TTL and Per-Message TTL in RabbitMQ. If I am right you can utilize them for your task.
In addition to the above answer, it would be better to have the application/client publish messages to two queues. Consumer would consume from one of the queues while the other queue can be configured using per queue-message TTL or per message TTL to retain the messages.
Queuing messages you do to get a message from one point to the other reliable. So the sender can work independently from the receiver. What you propose is working with a temporary persistent store.
A sql database would fit perfectly, but also a mongodb would work nicely. You drop a document in mongo, give it a ttl and let the database handle the expiration.
http://docs.mongodb.org/master/tutorial/expire-data/

"Archiving" publish/subscribe message in Redis

I am using Redis' publish/subscribe feature. So the server is publishing 10 items then the client gets those 10 items.
Now however, a new client subscribes to the feed. I would like them to get the previous 10 items as well as any new items.
Does Redis have a way of doing this using the publish and subscribe functionality? Is a feed history stored anywhere in the database? Is there an easy way of doing this? Is the best way to also store the messages in a list and have the client do an LRANGE my_list 0 10 on the list?
I'd keep a separate archive of the data and have events added to both. New clients can subscribe and queue the real time events, read the archive until it's up to date with the first published event, then catch up with the published events. That way you shouldn't miss any published events while switching between the archive and real time events.
Stumbled on this during some research. I know it is old but I wanted to add that with the Redis Streams data structure it is not overly complex to implement persistent messaging.
The publisher would publish messages to a Stream and a subscriber would just get the latest message if that is all it cared about. You can also create user groups to limit how many subscribers can get the message and then mark them as acknowledged to avoid duplicate processing. This is good when you want a message to be handled only once and need a way to confirm that.
I ended up creating a nodejs app for this sort of purpose. In my case, user data was published to the redis server which i wanted to store, I subscribed to the redis channel with a nodejs app and then saved the details to a database, ive played around with mysql and mongo so far, let me know if this is of any interest and ill paste some code, there are some similarities in trying to store a publish history...
Cheers

How is Redis used in Trello?

I understand that, roughly speaking, Trello uses Redis for a transient data store.
Is anyone able to elaborate further on the part it plays in the application?
We use Redis on Trello for ephemeral data that we would be okay with losing. We do not persist the data in Redis to disk, and we use it allkeys-lru, so we only store things there can be kicked out at any time with only very minor inconvenience to users (e.g. momentarily seeing an incorrect user status). That being said, we give it more than 5x the space it needs to store its actual working set and choose from 10 keys for expiry, so we really never see anything get kicked out that we're using.
It's our pubsub server. When a user does something to a board or a card, we want to send a message with that delta to all websocket-connected clients that are subscribed to the object that changed, so all of our Node processes are subscribed to a pubsub channel that propagates those messages, and they propagate that out to the appropriately permissioned and subscribed websockets.
We SORT OF use it to back socket.io, but since we only use the websockets, and since socket.io is too chatty to scale like we need it to at the moment, we have a patch that disables all but the one channel that is necessary to us.
For our users who don't have websockets, we have to keep a list of the actions that have happened on each object channel since the user's last poll request. For that we use a list which we cap at the most recent 100 elements, and an auxilary counter of how many elements have been added to the list since it was created. So when we're answering a poll request from such a browser, we can check the last element it reports that it has seen, and only send down any messages that have been added to the queue since then. So that gets a poll request down to just a permissions check and a single Redis key check in most cases, which is very fast.
We store some ephemeral data about the active status of connected users in Redis, because that data changes frequently and it is not necessary to persist it to disk.
We store short-lived keys to support OAuth logins in Redis.
We love Redis; once you have an instance of it up and running, you want to use it for all kinds of things. The only real trouble we have had with it is with slow-consuming clients eating up the available space.
We use MongoDB for our more traditional database needs.
Trello uses Redis with Socket.IO (RedisStore) for scaling, with the following two features:
key-value store, to set and get values for a connected client
as a pub-sub service
Resources:
Look at the code for RedisStore in Socket.IO here: https://github.com/LearnBoost/socket.io/blob/master/lib/stores/redis.js
Example of Socket.IO with RedisStore: http://www.ranu.com.ar/2011/11/redisstore-and-rooms-with-socketio.html