How would I use Reddis + Azure Event Hubs to handle mobile push notifications archiving for billions of topics? - redis

I need to design a system that allows
Users to subscribe to any topic
No defined topic limit
Control over sending to one device, or all
Recovery when offline clients, (or APNS) that drops a notification. Provide a way to catch up via REST
Discard all updates older than age T.
I studied many different solutions, such as Notification Hubs, Service Bus, Event Hub... and now discovered Kafka and not sure if that's a good fit.
Draft architecture
Use an Event Hub to listen for mobile deviceID registrations, and userIDs that requests for topic subscriptions .. Pass that to Reddis, below
If registering a phone/subscribing to a topic, save the deviceID userID to the topic key.
If sending a message to a topic, query Reddis for the topic key, and send that result to a FIFO queue for processing.
Pipe the output of the previous query into the built in Reddis Pub/Sub features to alert worker roles that there is work pending.
While the workers send notices to Apple and Firebase, archive out the sent notices to some in-memory store below.
Archive server maintains a history of sent events, so that out-of-sync devices can get the most up to date information LIFO-queue style.
Question
What are your thoughts on using this approach to solve the above needs?
What other things should I learn, research, or experiment (measure)?

Related

How do I monitor RabbitMQ exchange lifecycle events

I'm working with a product suite which uses RabbitMQ as a back end for service bus messaging. Many of the clients use software (NeuronESB) which is supposed to automatically configure exchanges, queues and channels as needed. Somewhere in the system exchanges in Rabbit are being deleted and not re-created, resulting in unexpected issues. Because of the size of the system and closed source nature of at least one of the service bus clients, an audit of code has been unsuccessful in determining the source of the deletion of these exchanges.
I have tried using the firehose functionality of Rabbit, but that only provides the messages being sent through Rabbit, not the internal activities I need.
What methods are available for logging the creation and deletion of exchanges in RabbitMQ? Ideally I would like to know the date, time and client IP of the deleter, but even just getting the date and time would allow me to narrow my search of logs to help find the offender.
Try Events Exchange plugin that should do the trick.
If not working for some reason, the last resort I can think of:
Get a test environment with less clients/messages if you app is busy, then analyse your traffic with wireshark (it can understand amqp) to filter out requests to delete exchange.

RabbitMQ Message Lifetime Replay Message

We are currently evaluating RabbitMQ. Trying to determine how best to implement some of our processes as Messaging apps instead of traditional DB store and grab. Here is the scenario. We have a department of users who perform similar tasks. As they submit work to the server applications we would like the server app to send messages back into a notification window saying what was done - to all the users, not just the one submitting the work. This is all easy to do.
The question is we would like these message to live for say 4 hours in the Queue. If a new user logs in or say a supervisor they would get all the messages from the last 4 hours delivered to their notification window. This gives them a quick way to review what has recently happened and what is going on without having to ask others, "have you talked to John?", "Did you email him is itinerary?", etc.
So, how do we publish messages that have a lifetime of x hours from the time they were published AND any new consumers that connect will get all of these messages delivered in chronological order? And preferably the messages just disappear after they have expired from the queue.
Thanks
There is Per-Queue Message TTL and Per-Message TTL in RabbitMQ. If I am right you can utilize them for your task.
In addition to the above answer, it would be better to have the application/client publish messages to two queues. Consumer would consume from one of the queues while the other queue can be configured using per queue-message TTL or per message TTL to retain the messages.
Queuing messages you do to get a message from one point to the other reliable. So the sender can work independently from the receiver. What you propose is working with a temporary persistent store.
A sql database would fit perfectly, but also a mongodb would work nicely. You drop a document in mongo, give it a ttl and let the database handle the expiration.
http://docs.mongodb.org/master/tutorial/expire-data/

Camel route "to" specific websocket endpoint

I have some camel routes with mina sockets and jetty websockets. I am able to broadcast a message to all the clients connected to the websocket but how do i send a message to a specific endpoint. How do i maintain a list of all connected clients with a client id as reference so i can route to a specific client. Is that possible? Will i be able to mention a dynamic client in the to URI?
Or maybe i am thinking about this wrong and i need to create topics on active mq and have the clients subscribe to it. That would mean that i create a topic for every websocket client? and route the message to the right topic.
Am i atleast on the right track here, any examples you can point out? Google was not helpful.
The approach you take depends on how sensitive the client information is. The downside of a single topic with selectors is that anyone can subscribe to the topic without a selector and see all the information for everyone - not usually something that you want to do.
A better scheme is to use a message distribution mechanism (set of Camel routes) that act as an intermediary between the websocket clients and the system producing the messages. This mechanism is responsible for distributing messages from a single destination to client-specitic destinations. I have worked on a couple of banking web front-ends that used a similar scheme.
In order for this to work you first generate for each user a distinct token/UUID; this is presented to the user when the session is established (usually through some sort of profile query/message).
It's essential that the UUID can be worked out as a hash of the clientId rather than being stored in a DB, as it will be used all the time and you want to make sure this is worked out quickly.
The user then uses that information to connect to specific topics that use that UUID as a suffix. For example two users subscribing to an orderConfirmation topic would each subscribe to their own version of that topic:
clientA -> orderConfirmation.71jqsd87162iuhw78162wd7168
clientB -> orderConfirmation.76232hdwe7r23j92irjh291e0d
To keep track of "presence", your clients would need to periodically send a heartbeat message containing their clientId to a well-known topic that your distribution mechanism listens on. Clients should not be able to subscribe to this topic for reads (see ActiveMQ Security). The message distribution mechanism needs to keep in memory a data structure that contains the clientId and the time a heartbeat was last seen.
When a message is received by the distribution mechanism, it checks whether the clientID for which it received the message has a "live/present" session, determines the UUID for the client, and broadcasts the message on the appropriate topic.
Over time this will create a large number of topics on your broker that you don't want hanging around when the user has gone away. You can configure ActiveMQ to delete these if they have been inactive for some time.
You definitely do not want to create separate endpoint for each client.
Topic and a subscription with selector is an elegant way to resolve it.
I would say the best one.
You need single topic, which every client would subscribe to with the selector looking like where clientId in ('${myClientId}', 'EVERYONE'). Now when you want to publish a message to specific client, you set a property clientId to the id of this client. If you want to broadcast, you set it to 'EVERYONE'
I hope I understand the problem right...

Windows Service Bus Point-to-Point Communications to Reduce Broadcasts

I am using Windows Service Bus 1.0 to communicate between different processes, each context event stream exists on the bus as a topic.
Using the service bus to link events between bounded contexts I need a method to sync events (or in other words request a replay of past events) when a bounded context comes back online but want to limit the potential flood of messages coming back to only go to the endpoint that requested it, at least if this is something that can be easily done by using existing Service Bus features.
So given an imaginary ContextC sends a message to request all previous events from ContextA and ContextB, is there any way for these replay messages to be sent only to ContextC?
What would be the best way to map a context to be the owner of the topic (or in other words, an individual bus subscriber to a bus topic), to facilitate the unicast replaying above?
In my world, I keep this stuff loosely coupled - each context puts stuff onto a topic and anyone that needs stuff subscribes.
Each SB subscription can use the filtering facilities of Service Bus based on properties (e.g. you could tag events by adding Properties on the Messages and then have a filtering condition on the subscription meaning that only whitelisted classes of events ever apply to each consumer).
That plus the fact that you're already seggregating by topic.
The subscription and the topic then allow you to process the events without losing any or having the publisher go around worrying about or chasing subscribers.
You also mentioned you are tying this to an Event Store in other questions - in that case there is a chance your messages need to be consumed in order. If that is the case, you need to put a session id on your messages.
I could speculate as to why you want this subscriber driven redelivery but won't for now. You need to first explain / verify that concept and requirement (by asking questions which explain your higher level goals) a lot further before anyone answers how that would best be achieved using Service Bus.

Redis publish/subscribe: see what channels are currently subscribed to

I am currently interested in seeing what channels are subscribed to in a Redis pub/sub application I have. When a client connects to our server, we register them to a channel that looks like:
user:user_id
The reason for this is I want to be able to see who's "online". I currently blindly fire off messages to a channel without knowing if a client is online since it's not critical that they receive these types of messages.
In an effort to make my application smarter, I'd like to be able to discover if a client is online or not using the pub/sub API, and if they are offline, cache their messages to a separate redis queue which I can push to them when they get back online.
This does not have to be 100% accurate, but the more accurate it is, the better. I'm assuming a generic key does not get created when a channel gets subscribed to, so I cannot do something as trivial as:
redis-cli keys user* to find all online users.
The other strategy I've thought of is just maintaining my own Redis Set whenever a user published or removes themselves from a channel (which the client automatically handles when they hop online and close the app). That would be an additional layer of complexity that I need to manage and I'm hoping there is a more trivial approach with the data that's already available.
As of Redis 2.8 you can do:
PUBSUB CHANNELS [pattern]
The PUBSUB CHANNELS command has O(N) complexity, where N is the number of active channels.
So in your case:
redis-cli PUBSUB CHANNELS user*
would give you want you want.
There is currently no command for showing what channels "exist" by way of being subscribed to, but there is and "approved" issue and a pull request that implements this.
https://github.com/antirez/redis/issues/221
https://github.com/antirez/redis/pull/412
Due to the nature of this call, it is not something that can scale, and is thus a "DEBUG" command.
There are a few other ways to solve your problem, however.
If you have reason to believe that a channel may be subscribed to, you can send it a message and look at the result. The result is the number of subscribers that got the message. If you got 0, you know that they're not there.
Assuming that your user_ids are incremental, you might be interested in using SETBIT to set a 1 or 0 to a user's offset bit to track presence. You can then do cool things like the new BITCOUNT to see how many users are online, and GETBIT to determine if a specific user is online.
The way I have solved your problem more specifically in the past is by signaling a subscription manager that I have subscribed to a channel. The manager then "pings" the channel by sending a blank message to confirm that there is a subscriber, and occasionally pings the channel thereafter to determine if the user is still online. Not ideal, but better than using DEBUG CHANNELS in production.
From version 2.8.0 redis has a pubsub command that would help in this case:
http://redis.io/commands/pubsub
Remark: currently the state of 2.8.0 is not stable yet (RC2)
I am unaware of any specific way to query what channels are being subscribed to, and you are correct that there isn't any key created when this happens. Also, I wouldn't use the KEYS command in production anyway, as it's really a debugging command.
You have the right idea about using a set to add the user when they're online, and then query this with SISMEMBER <set> <user_id> to determine if the messages should be sent to them or added to a Redis list for processing once they do come online.
You will need to figure out when a user logs off so you can remove them from the list of online users, but I don't know enough about your system to know exactly how you would go about that.
If the connected clients have the ability to send a message back to inform the server that the message(s) were consumed, you could use this to keep track of which messages should be stored for later retrieval.
Cheers,
Mike
* PUBSUB NUMSUB [channel-1 ... channel-N]
Returns the number of subscribers (not counting clients subscribed to patterns) for the specified channels.
https://redis.io/commands/pubsub