I have simple code that subscribes to a channel, receives one message and then unsubscribes.
I am using Stackexchange.Redis that has as far as I can see one connection to Redis for subscriptions.
The method that i described will be called by multiple threads simultaneously and the channel is dynamic. What I am wondering is what's going to happen if one of the threads cannot perform the unsubscribe (due to an exception e.g.).
If this keeps going on I'll have a lot of useless stale subscriptions that noone is listening to since from what I understand subscription is not closed after ChannelMessageQueue goes out of scope and is garbage collected eventually.
Is there a good way to handle this situation?
Related
In my app(multiple instances), we occasionally see the case where connection is lost between my app and rabbitmq due to network issues(my app and rabbitmq are both alive), then after connection is recovered(re-established) we will receive messages that are unacked.
This creates an issue for us, because my app wasn't dead, and it is still processing the same message it received before, but now the message is redeivered, and it causes the app to process the message again (which can be fatal to us).
Since the app has multiple instances, it is not easy for an instance to check if another instance is processing the same message at the same time. We can't simply filter out redelivered message, because we need this feature to handle instance/app crashes/re-deployments.
It doesn't seem that there is an api to tell rabbitmq when to not redeliver unacked messages.
So what is the recommended practice to handle this situation ?
Thanks,
The general solution for such scenario is to make the consumers handle the messages in an idempotent manner . Generally what I do is from the producer side ( in case there is no unique identifier in the message body ) I add an attribute idempotencyId to the message body which is a guid and on the consumer side for each message this id is validated against the stored value in database , any duplicates are rejected.
This approach also works for messages which might be shoveled from another cluster or if in a same cluster multiple instances of consumers are listening then too this approach guarantee one time processing.
Would suggest to go over the RabbitMQ Reliability Guide here
Yeah, exactly-once delivery is not something RabbitMQ is good at. In fact, I'd say you should probably not be using it for these kinds of problems. Honestly, the only way to truly fix this is to use distributed transactions or locking.
Anyway, you could turn the problem on its head by ack'ing the message as soon as the consumer gets it, before it starts working on it. That would avoid the RabbitMQ-related duplication issue at least. This is at-most-once delivery.
Of course, it means that if the consumer crashes, the message is lost forever. So you need to persist the message right before you ack it so you can recover it later and also the consumer should remove it once it's complete.
Considering that crashes are rare, you can then have a single dedicated process that just works on those persisted messages. Or for that matter, handle them manually.
Just be aware that you are pushing the duplication problem in front of you, because the consumer might fail to remove the persisted message after it's done working with it anyway, but at least you have the option to implement it however you want.
Storage in this case could be anything from files, a RDBMS or something like ZooKeeper or Redis to lock/unlock in-flight messages.
In some exceptional situations I need somehow to tell consumer on receiving point that some messages shouldn’t be processed. Otherwise two systems will become out-of-sync (we deal with some outdates external systems, and if, for example, connection is dropped we have to discard all queued operations in scope of that connection).
Take a risk and resolve problem messages manually? Compensation actions (that could be tough to support in my case)? Anything else?
There are a few ways:
You can set a time-to-live when sending a message: await endpoint.Send(myMessage, c => c.TimeToLive = TimeSpan.FromHours(1));, but this will apply to all messages that are sent (or published) like this. I would consider this, after looking at your requirements. This is technical, but it is a proper messaging pattern.
Make TTL and generation timestamp properties of your message itself and let the consumer decide if the message is still worth processing. This is more business and, probably, the most correct way.
Combine tech and business - keep the timestamp and TTL in message headers so they don't pollute your message contracts, and filter them out using a custom middleware. In this case, you need to be careful to log such drops so you won't be left wonder why messages disappear now and then.
Almost any unreliable integration can be monitored using sagas, with timeouts. For example, we use a saga to integrate with Twilio. Since we have no ability to open a webhook for them, we poll after some interval to check the message status. You can start a saga when you get a message and schedule a message to check if the processing is still waiting. As discussed in comments, you can either use the "human intervention required" way to fix the issue or let the saga decide to drop the message.
A similar way could be to use a lookup table, where you put the list of messages that aren't relevant for processing. Such a table would be similar to the list of sagas. It seems that this way would also require scheduling. Both here, and for the saga, I'd recommend using a separate receive endpoint (a queue) for the DropIt message, with only one consumer. It would prevent DropIt messages from getting stuck behind the integration messages that are waiting to be processed (and some should be already dropped)
Use RMQ management API to remove messages from the queue. This is the worst method, I won't recommend it.
From what I understand, you're building a system that sends messages to 3rd party systems. In other words, systems you don't control. It has an API but compensating actions aren't always possible, because the API doesn't provide it or because actions are performed inside the 3rd party system that can't be compensated or rolled back?
If possible try to solve this via sagas. Make sure the saga executes the different steps (the sending of messages) in the right order. So that messages that cannot be compensated are sent last. This way message that can be compensated if they fail, will be compensated by the saga. The ones that cannot be compensated should be sent last, when you're as sure as possible that they don't have to be compensated. Because that last message is the last step in synchronizing all systems.
All in all this is one of the problems with distributed systems, keeping everything in sync. Compensating actions is the way to deal with this. If compensating actions aren't possible, you're in a very difficult situation. Try to see if the business can help by becoming more flexible and accepting that you need to compensate things, where they'll tell you it's not possible.
In some exceptional situations I need somehow to tell consumer on receiving point that some messages shouldn’t be processed.
Can't you revert this into:
Tell the consumer that an earlier message can be processed.
This way you can easily turn this in a state machine (like a saga) that acts on two messages. If the 2nd message never arrives then you can discard the 1st after a while or do something else.
The strategy here is to halt/wait until certain that no actions need to be reverted.
I'm reading here, and I see a warning stating that PUB/SUB subscribers in Redis should not issue other commands:
A client subscribed to one or more channels should not issue commands,
although it can subscribe and unsubscribe to and from other channels.
I have two questions:
Why is this limitation?
For the scope of the paragraph, what's a client? A whole process? A Redis connection? A complete Redis instance? Or is it a bad idea in general to issue commands and subscribe to channels, and the admonition goes for every and any scope I can think of?
A client, in this case, is an instance of a connection to Redis. An application could well have multiple clients, each with different responsibilities or as a way to provide higher degrees of parallelism to the application.
What they are suggesting here, however, is that you use an individual client (think 'connection') to handle your incoming subscription messages and to react to those messages as its sole responsibility. The reason it's recommended not to make calls with this connection is because while it is waiting on incoming messages from subscribed channels, the client is in a blocked state.
Trying to make a call on a given client won't work while it's awaiting response from a blocking call.
We have a bunch of requests that we plan to publish to the queue.
There will be several different subscriber types, each in their own round robin pool.
For example Request1 is pushed onto the queue
LoggingSubscriber1 and LoggingSubscriber2 both subscribe with the "LoggingSubscriber" subscriptionId so that only one of them gets the request.
There will be other groups like DoProcessSubscriber1, DoProcessSubscriber2, and DoProcessSubscriber3
And another DoOtherProcessSubscriber1, DoOtherProcessSubscriber2
We need some way to know that all three subscribers (Logging, DoProcess, and DoOtherProcess) have completed, so that we can perform some action...like sending a message to the client that all the entire request has completed.
How would we aggregate responses like this? We were thinking of having each subscriber put a response object on the queue, but we still aren't sure how to know that they are all done.
Ideally you'd use the Request/Response pattern built into EasyNetQ, but that's designed for a single (potentially farmed) consumer. It doesn't allow you to bind to multiple queues. In your case you should probably have your client set up a subscription for replies and have all three services publish a message when they are complete. The client can then wait until it has a response from all three before updating.
However, I'd encourage you to possibly re-think your design. By making the client responsible for acknowledging the completion of the subscribers, you're building a very tightly coupled system. Messaging System design works far better if you adopt the notion of eventual consistency. Allow your client to fire-and-forget and have some audit process ensure that all the expected processing did eventually occur.
In a web application, if I need to write an event to a queue, I would make a connection to redis to write the event.
Now if I want another backend process (say a daemon or cron job) to process the or react the the publishing of the event in redis, do I need a persistant connection?
Little confused on how this pub/sub process works in a web application.
Basically in Redis there are two different messaging models:
Fire and Forget / One to Many: Pub/Sub. At the time a message is PUBLISH-ed all the subscribers will receive it, but this message is then lost forever. If a client was not subscribed there is no way it can get it back.
Persisting Queues / One to One: Lists, possibly used with blocking commands such as BLPOP. With lists you have a producer pushing into a list, and one or many consumers waiting for elements, but one message will reach only one of the waiting clients. With lists you have persistence, and messages will wait for a client to pop them instead of disappearing. So even if no one is listening there is a backlog (as big as your available memory, or you can limit the backlog using LTRIM).
I hope this is clear. I suggest you studying the following commands to understand more about Redis and messaging semantics:
LPUSH/RPUSH, RPOP/LPOP, BRPOP/BLPOP
PUBLISH, SUBSCRIBE, PSUBSCRIBE
Doc for this commands is available at redis.io
I'm not totally sure, but I believe that yes, pub/sub requires a persistent connection.
For an alternative I would take a peek at resque and how it handles that. Instead of using pub/sub it simply adds an item to a list in redis, and then whatever daemon or cron job you have can use the lpop command to get the first one.
Sorry for only giving a pseudo answer and then a plug.