AWS: Broadcast notifications for multiple worker processes running on multiple instances - notifications

I have multiple application instances in Amazon EC2, each running several worker processes. I want each worker process to subscribe to a notification (e.g. a configuration change). The notification should be a broadcast message: once it is sent, every worker receives it.
I know SQS does not support message broadcast. Looking through similar questions and threads, I see suggestions to use SNS instead of SQS. I'm not sure this will work for me, for the following reasons:
- The application instances are part of an Auto Scaling group, so they can be added and removed dynamically. I don't see a clear way to unsubscribe every worker (I have multiple workers per instance) once an instance is terminated, which means I'll end up with a mess of dead subscribers after some time.
- The protocol to use for the subscription is also unclear. An HTTP endpoint looks like the only option, which means every worker would have to run an HTTP server on its own port. It also looks like I would have to listen only on the instance's public IP, which adds one more layer of complexity and insecurity.
At the moment I have a solution based on a third-party tool: a 0MQ pub/sub server. But I'm looking for an out-of-the-box solution that AWS provides.
Thanks,
Vovan

The out-of-the-box AWS solution that comes to mind would be to create one SNS topic, and then for each instance, when the instance boots up, it would create its own SQS queue and subscribe the queue to the SNS topic, so that each individual queue gets a broadcast copy of each message you publish to SNS.
You'd want to unsubscribe and delete these queues on instance termination, which could be done with Auto Scaling lifecycle hooks.
If you didn't want to run a server to process the lifecycle hooks (which publish the launch or termination events to SNS or SQS), you could create an AWS API Gateway endpoint that fires an AWS Lambda function, then subscribe the API Gateway endpoint to the SNS topic using HTTPS, so that the cleanup tasks are handled in Lambda with no server needed.
That's several services working together and may sound a little complicated, but would be very inexpensive and require little maintenance or attention.
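One detail of the SNS-to-SQS fan-out that is easy to miss: each per-instance queue needs an access policy that lets the topic deliver to it, otherwise the subscription exists but no messages arrive. Below is a minimal sketch of building that policy document. The function name `build_queue_policy` and both ARNs are made up for illustration, and the actual queue creation and subscription calls are left out.

```python
import json


def build_queue_policy(queue_arn: str, topic_arn: str) -> str:
    """Return a JSON SQS access policy allowing one SNS topic to send to one queue."""
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowSnsTopicToSend",
                "Effect": "Allow",
                "Principal": {"Service": "sns.amazonaws.com"},
                "Action": "sqs:SendMessage",
                "Resource": queue_arn,
                # Restrict delivery to this one topic rather than any SNS topic.
                "Condition": {"ArnEquals": {"aws:SourceArn": topic_arn}},
            }
        ],
    }
    return json.dumps(policy)


# Hypothetical ARNs; a real instance might derive the queue name from its instance ID.
policy_json = build_queue_policy(
    "arn:aws:sqs:us-east-1:123456789012:config-updates-i-0abc1234",
    "arn:aws:sns:us-east-1:123456789012:config-updates",
)
print(policy_json)
```

The resulting string would be set as the queue's `Policy` attribute before subscribing the queue ARN to the topic with the `sqs` protocol.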

One more solution I've figured out is to use Amazon Kinesis. The implication here is that each subscriber has to maintain its own checkpoint in order to receive only the most recent notifications.

I realize this is an old thread, but I'd like to share my experience with this. Kinesis throttles reads at 5 per second per shard. So if you have 10 nodes each polling the stream for events once per second, you're going to be in a constant state of throttling.
Kinesis looks to be designed primarily for massive writes with just a few readers, which doesn't quite fit a broadcast-to-many-nodes use case.
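The arithmetic behind that throttling can be checked quickly. This is only a back-of-the-envelope sketch: it assumes all nodes read from the same shard and that the limit of 5 reads per second quoted above applies to that shard as a whole. The function name `read_budget` is made up for illustration.

```python
def read_budget(nodes: int, polls_per_node_per_sec: float, limit_per_sec: float = 5.0):
    """Compare the aggregate polling rate with the read limit quoted above."""
    demand = nodes * polls_per_node_per_sec
    # Shortest interval between polls per node that stays within the limit.
    min_interval = nodes / limit_per_sec
    return demand, demand > limit_per_sec, min_interval


demand, throttled, min_interval = read_budget(nodes=10, polls_per_node_per_sec=1)
print(demand, throttled, min_interval)  # 10 True 2.0
```

Under these assumptions, 10 nodes would each have to poll no more often than once every 2 seconds, and the interval grows linearly with the number of nodes.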

Redis is a handy solution for broadcasting a message to all subscribers on a topic. It is convenient because it can run as a Docker container for rapid prototyping, and AWS also offers it as a managed service for multi-node clusters.

Related

Handling of pubsub subscribers for distributed longrunning tasks

I am evaluating pubsub for long-running tasks such as video transcoding, where a particular transcode may take between 2 and 10 minutes. Is pubsub a good approach for this kind of task distribution? For example, let's say I have five servers:
- publisher1
- publisher2
- publisher3
- publisher4
- publisher5
And a topic called "videos". Would it be possible to spread out the messages equally across those five servers? What about when servers are added or removed? What would be a good approach to doing this, or is pubsub not the right tool for something like this?
This does sound like a reasonable use case for pubsub. Specifically, if you use a pull subscriber, you can configure the flow control settings to allow at most one outstanding message per server, and set the max ack extension period (in Java) to a reasonable upper bound on your processing time. This API is described here: http://googleapis.github.io/google-cloud-java/google-cloud-clients/apidocs/index.html?com/google/cloud/pubsub/v1/package-summary.html
This should effectively load balance across your servers by default, as long as all of them pull from the same subscription. If a server is added and a backlog exists, it will receive a message from the backlog. If a server is removed, it will no longer be sent messages. If it is removed or crashes while processing, the message it was working on will be redelivered to another server.
One concern, however, is that pubsub has a limit of 10 MB per message. You might instead put the data itself in a Google Cloud Storage bucket. Cloud Storage can publish the file location to a pubsub topic when an upload is complete: https://cloud.google.com/storage/docs/pubsub-notifications

Redis keyspace notifications subscriptions in distributed environment using ServiceStack

We have some Redis keys with a given TTL that we would like to subscribe to and take action upon once the TTL expires (a la job scheduler).
This works well in a single-host environment: when you subscribe in ServiceStack, using its Redis client, to '__keyevent@0__:expired', that service picks up the event and takes action. That's fantastic...
... until you have a high-availability topology set up, with more than one API instance in the cluster. Then every single host appears to pick up the message and potentially act on it.
I know keyspace notifications don't work exactly the same as traditional pub/sub or messaging-layer events, but is there a way to perform some kind of acknowledgement on these kinds of events, so that, at the end of the day, only one host will carry on with the task?
Otherwise, is there a way to delay a message publishing?
Thanks!
As described in https://redis.io/topics/notifications:
Every node of a Redis cluster generates events about its own subset of the keyspace as described above. However, unlike regular Pub/Sub communication in a cluster, events' notifications are not broadcasted to all nodes. Put differently, keyspace events are node-specific. This means that to receive all keyspace events of a cluster, clients need to subscribe to each of the nodes.
So a client should create a separate connection to each node to receive Redis keyspace notifications.
My understanding of your question: you need an event-based unicast notification whenever a key expires.
If that assumption is correct, this solution should help. It's a somewhat crude solution, but it works!
Solution:
Push the expired keys (perhaps from a dedicated service or thread) onto a Redis list used as a queue. A blocking pop (BLPOP/BRPOP) on this list from the client instances will then give you what you want.
How does it work?
Let's assume a single background thread continuously pushes the expired keys onto a Redis list. The cluster of API instances calls a blocking pop on this list.
Since each item in a Redis list is consumed by only one client's blocking pop, only one API instance will get the notification for each expired key.
Ref:
List pop operation: https://redis.io/commands/lpop
Similar problem with pub/sub: Competing Consumer on Redis Pub/Sub supported?
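The list-as-queue idea from the second answer can be sketched roughly as follows. The two helper functions only call `rpush` and `blpop`, which exist with these shapes in the redis-py client, so a real `redis.Redis` connection could be passed in instead. Everything else is an assumption for illustration: the list name `expired-keys`, the helper names, and the `FakeListClient` class, which is an in-memory stand-in so the sketch runs without a Redis server.

```python
from collections import deque


class FakeListClient:
    """In-memory stand-in exposing the two redis-py calls used below."""

    def __init__(self):
        self._lists = {}

    def rpush(self, name, *values):
        self._lists.setdefault(name, deque()).extend(values)
        return len(self._lists[name])

    def blpop(self, keys, timeout=0):
        # The real BLPOP blocks until an item arrives; this fake returns at once.
        items = self._lists.get(keys)
        if items:
            return keys, items.popleft()
        return None


EXPIRED_QUEUE = "expired-keys"  # hypothetical list name


def enqueue_expired(client, key):
    """Called by the single listener that receives the expiry event."""
    client.rpush(EXPIRED_QUEUE, key)


def take_next(client, timeout=1):
    """Called by every API instance; each queued key goes to exactly one caller."""
    popped = client.blpop(EXPIRED_QUEUE, timeout=timeout)
    return None if popped is None else popped[1]


client = FakeListClient()
enqueue_expired(client, "job:42")
first = take_next(client)   # one instance gets the key
second = take_next(client)  # the next instance finds the list empty
print(first, second)  # job:42 None
```

With a real Redis connection the popped value would arrive as bytes unless the client was created with `decode_responses=True`, and the single listener becomes a single point of failure that would need its own supervision.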

Why pub sub in redis cannot be used together with other commands?

I'm reading here, and I see a warning stating that PUB/SUB subscribers in Redis should not issue other commands:
A client subscribed to one or more channels should not issue commands,
although it can subscribe and unsubscribe to and from other channels.
I have two questions:
Why does this limitation exist?
For the scope of the paragraph, what's a client? A whole process? A Redis connection? A complete Redis instance? Or is it a bad idea in general to issue commands and subscribe to channels, and the admonition goes for every and any scope I can think of?
A client, in this case, is an instance of a connection to Redis. An application could well have multiple clients, each with different responsibilities or as a way to provide higher degrees of parallelism to the application.
What they are suggesting here, however, is that you dedicate an individual client (think 'connection') to handling your incoming subscription messages and reacting to them as its sole responsibility. The reason it's recommended not to make other calls on this connection is that, while it is waiting on incoming messages from subscribed channels, the client is in a blocked state.
Trying to make a call on a given client won't work while it's awaiting a response from a blocking call.

Approaches for reporting progress for competing consumer scenario

I am getting my head around messaging. Currently we are spiking a few scenarios using Rebus. We are also considering NServiceBus.
The scenario we are trying to build is a proof of concept for a background task processing system. Today we have a handful of backend services hosted in different ways (web, Windows services, console apps). I am looking to hook them up to Rebus and start consuming messages using competing consumers: some messages will have one listener and some will share the load of messages. Elegant :)
I got a pretty good start from this other question, "How should I set rebus up for one producer and many consumers", and it is working nicely in the proof of concept.
Now I want to start reporting progress. My initial approach is to set up pub/sub as well and spin up a service that listens to progress events from all the services. If another service becomes interested in a specific kind of progress in the future, it can easily subscribe to those messages and start listening.
But how should I approach setting up both competing consumers and pub/sub? Are they simply two separate things? (In the Rebus case, one adapter using UseSqlServerInOneWayClientMode / UseSqlServer and another adapter set up for pub/sub using whatever protocol we want?)
Or is there a better solution than having two "buses" here?
I've built something like that myself a couple of times, and I've had pretty good results using SignalR to report progress from this kind of backend worker process.
Our setup had a bunch of WPF clients, one single SignalR hub, and a bunch of backend worker processes. All WPF clients and all backend workers would then establish a connection to the hub, allowing workers to send progress reports while doing their work.
SignalR has some nice properties that make it very suitable for exactly this kind of problem:
- The published messages "escape" the Rebus unit of work, so progress report messages can be sent several times from within a single message handler, even if the handler takes a long time to complete.
- It was easy to get the messages all the way to the clients, because they subscribe directly.
- We could use the hub groups functionality to group users, so we could target progress/status messages from the backend at either all users or a single user (this could also be used for departments, etc.).
The most important point, I guess, is that this progress reporting thing (at least in our case) was not as important as our Rebus messages, i.e. it didn't require the same reliability etc, which we could use to our advantage and then pick a technology with some other nice properties that turned out to be cool.

Does the redis pub/sub model require persistent connections to redis?

In a web application, if I need to write an event to a queue, I would make a connection to Redis to write the event.
Now if I want another backend process (say a daemon or cron job) to process or react to the publishing of the event in Redis, do I need a persistent connection?
I'm a little confused about how this pub/sub process works in a web application.
Basically, in Redis there are two different messaging models:
- Fire and Forget / One to Many: Pub/Sub. At the time a message is PUBLISH-ed, all the subscribers will receive it, but the message is then lost forever. If a client was not subscribed at that moment, there is no way it can get the message back.
- Persisting Queues / One to One: Lists, possibly used with blocking commands such as BLPOP. With lists you have a producer pushing into a list and one or many consumers waiting for elements, but each message will reach only one of the waiting clients. With lists you have persistence: messages wait for a client to pop them instead of disappearing. So even if no one is listening there is a backlog (as big as your available memory, or you can limit the backlog using LTRIM).
I hope this is clear. I suggest studying the following commands to understand more about Redis and messaging semantics:
- LPUSH/RPUSH, RPOP/LPOP, BRPOP/BLPOP
- PUBLISH, SUBSCRIBE, PSUBSCRIBE
Documentation for these commands is available at redis.io.
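The difference between the two models described above can be seen in a toy in-memory model. This is not a Redis client: `TinyBroker` and its methods are invented for illustration and only mimic the delivery semantics (no blocking, no networking, no persistence to disk).

```python
from collections import deque


class TinyBroker:
    """Toy in-memory model of the two semantics; not a Redis client."""

    def __init__(self):
        self.subscribers = {}  # channel -> list of inboxes (pub/sub)
        self.lists = {}        # key -> deque of pending items (list queue)

    def subscribe(self, channel):
        inbox = []
        self.subscribers.setdefault(channel, []).append(inbox)
        return inbox

    def publish(self, channel, message):
        # Delivered to whoever is subscribed right now; never stored.
        receivers = self.subscribers.get(channel, [])
        for inbox in receivers:
            inbox.append(message)
        return len(receivers)

    def lpush(self, key, item):
        self.lists.setdefault(key, deque()).appendleft(item)

    def rpop(self, key):
        # Each stored item is handed to exactly one caller.
        items = self.lists.get(key)
        return items.pop() if items else None


broker = TinyBroker()

# Pub/sub: a message published before anyone subscribes is lost.
lost = broker.publish("events", "early")
a, b = broker.subscribe("events"), broker.subscribe("events")
broker.publish("events", "late")

# List: the item waits for a consumer, and only one consumer gets it.
broker.lpush("jobs", "job-1")
got_first, got_second = broker.rpop("jobs"), broker.rpop("jobs")

print(lost, a, b, got_first, got_second)  # 0 ['late'] ['late'] job-1 None
```

Both subscribers see "late" and neither sees "early", while "job-1" is delivered once, which is why a backend daemon that is not always connected is usually better served by the list model.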
I'm not totally sure, but I believe that yes, pub/sub requires a persistent connection.
For an alternative, I would take a peek at Resque and how it handles this. Instead of using pub/sub, it simply adds an item to a list in Redis, and then whatever daemon or cron job you have can use the LPOP command to get the first one.
Sorry for only giving a pseudo answer and then a plug.