NServiceBus - Notify User On Failure (After SLR)

I have a service that handles messages and persists data to an external system. If (a.k.a. when) the writing of this data to the external system fails, our normal monitoring strategy will alert system admins of the failure.
I would like to also notify the user who submitted the message that there is a delay in processing their request.
What is the best way to accomplish this scenario, and where should it hook in? I've looked into IManageMessageFailures, but it seems that will bypass the SLR functionality.

Starting with version 5.1, NServiceBus has the ability to use Reactive Extensions to observe when a message is sent to the error queue. From there, you can log, email, or do whatever best meets your needs.
http://docs.particular.net/nservicebus/subscribing-to-push-based-error-notifications
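A minimal sketch of wiring this up, assuming NServiceBus 5.1+ and a reference to the Rx package (the Subscribe extension method comes from Reactive Extensions); the class and method names here are illustrative:

    using System;
    using NServiceBus;

    class ErrorNotifications
    {
        public static void Wire(BusConfiguration busConfiguration)
        {
            var errors = busConfiguration.Notifications.Errors;

            // Raised only after first-level retries and SLR are exhausted and
            // the message has been moved to the error queue, so SLR still runs.
            errors.MessageSentToErrorQueue.Subscribe(failed =>
            {
                // failed.Headers and failed.Exception identify the message and
                // the cause; from here, notify the submitter of the delay.
                Console.WriteLine("Sent to error queue: " + failed.Exception.Message);
            });
        }
    }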

Why don't you try and separate the two concerns?
Manage the 3rd party interaction in a saga, and if it fails, send a failure notification message (you can use a timeout to cater for the case where no proper reply arrives); a sketch of that shape follows.
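A hedged sketch of such a saga against the NServiceBus 5.x API; all message types and the five-minute timeout are made up for illustration:

    using System;
    using NServiceBus;
    using NServiceBus.Saga;

    // Hypothetical messages for the exchange with the external system.
    public class WriteToExternalSystem : ICommand { public Guid RequestId { get; set; } }
    public class ExternalWriteSucceeded : IMessage { public Guid RequestId { get; set; } }
    public class NotifyUserOfDelay : ICommand { public Guid RequestId { get; set; } }
    public class WriteTimedOut { }

    public class ExternalWriteSagaData : ContainSagaData
    {
        public virtual Guid RequestId { get; set; }
    }

    public class ExternalWriteSaga :
        Saga<ExternalWriteSagaData>,
        IAmStartedByMessages<WriteToExternalSystem>,
        IHandleMessages<ExternalWriteSucceeded>,
        IHandleTimeouts<WriteTimedOut>
    {
        protected override void ConfigureHowToFindSaga(SagaPropertyMapper<ExternalWriteSagaData> mapper)
        {
            mapper.ConfigureMapping<ExternalWriteSucceeded>(m => m.RequestId).ToSaga(s => s.RequestId);
        }

        public void Handle(WriteToExternalSystem message)
        {
            Data.RequestId = message.RequestId;
            // Kick off the 3rd party interaction here, then ask for a wake-up call.
            RequestTimeout<WriteTimedOut>(TimeSpan.FromMinutes(5));
        }

        public void Handle(ExternalWriteSucceeded message)
        {
            MarkAsComplete(); // the reply arrived in time, nothing to notify
        }

        public void Timeout(WriteTimedOut state)
        {
            // No proper reply: tell the user their request is delayed.
            Bus.Send(new NotifyUserOfDelay { RequestId = Data.RequestId });
            MarkAsComplete();
        }
    }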

How to handle secured API in service to service communication

I have a working monolith application (deployed in a container), to which I want to add a notifications feature as a separate microservice.
I'm planning for the monolith to emit events to a message bus (RabbitMQ), where they will be received by the new service, which will send the notification to the user. In order to compose a notification, it will need other information about the user from the monolith, so it will call the monolith's REST API to obtain it.
The problem is that access to the monolith's API requires authentication in the form of a token. I was thinking of:
using the secret from the monolith to issue a never-expiring token - I don't think this is a great idea from a security perspective, and I also know that the keys sometimes rotate, in which case the token would eventually become invalid anyway
using the message bus to retrieve the information - this does not seem like a good idea either, as the asynchrony would make it very complicated
providing all the info the notification service needs in the event - this would couple them more tightly, and moreover, I plan to also send notifications based on the state of the monolith that are not triggered by an event
removing the authentication from the monolith and implementing it differently (not sure how yet)
My question is: what are some good ways this kind of problem can be solved, and also, having just started learning about microservices, is what I am trying to do the right approach in the first place?
When dealing with internal security, you should always consider the deployment and how the APIs are exposed to the outside world; an API gateway might be used to simply make it impossible to reach internal APIs from outside. In that case, a fixed token might be good enough to ensure that the client is authorized.
In general, though, I would suggest looking into OAuth2 or a JWT-based solution, as it helps to validate the identity of the calling system as well as its access grants.
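To make the JWT suggestion concrete, here is a hedged sketch of issuing and validating a short-lived service-to-service token with the System.IdentityModel.Tokens.Jwt package; the key, issuer, and audience values are made up:

    using System;
    using System.IdentityModel.Tokens.Jwt;
    using System.Text;
    using Microsoft.IdentityModel.Tokens;

    class ServiceTokens
    {
        // Illustrative shared key; in practice, load it from configuration.
        static readonly SymmetricSecurityKey Key =
            new SymmetricSecurityKey(Encoding.UTF8.GetBytes("replace-with-a-32-byte-secret!!!"));

        public static string Issue()
        {
            var token = new JwtSecurityToken(
                issuer: "monolith",
                audience: "notification-service",
                expires: DateTime.UtcNow.AddMinutes(5), // short-lived, so key rotation is survivable
                signingCredentials: new SigningCredentials(Key, SecurityAlgorithms.HmacSha256));
            return new JwtSecurityTokenHandler().WriteToken(token);
        }

        public static bool Validate(string jwt)
        {
            try
            {
                new JwtSecurityTokenHandler().ValidateToken(jwt, new TokenValidationParameters
                {
                    ValidIssuer = "monolith",
                    ValidAudience = "notification-service",
                    IssuerSigningKey = Key
                }, out _);
                return true;
            }
            catch (SecurityTokenException)
            {
                return false;
            }
        }
    }

With short-lived tokens, the notification service requests a fresh one (or caches it for a few minutes), so a key rotation on the monolith only invalidates tokens briefly.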
As for your architecture doubts, you need to consider the following scenarios when building out the solution:
The remote call can fail at any time, for unknown reasons; as such, you shouldn't acknowledge the notification event until you're certain that the notification has been processed successfully (see the sketch after this list).
As you've mentioned RabbitMQ, you should aim to keep the notification queue as small as possible; to that effect, a cache that contains the user details might help speed things along (and reduce the chance of failure due to the external system not being available).
If your application sends a lot of notifications to potentially millions of different users, you could consider having a read-only database replica of the users which is accessible to the notification service, and directly read from the database cluster in batches. This reduces the load on the monolith and shifts it to the database layer.
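A minimal sketch of the "don't acknowledge until processed" point above, using the .NET RabbitMQ client (assuming client version 6.x); the queue name and SendNotification helper are hypothetical:

    using System;
    using System.Text;
    using RabbitMQ.Client;
    using RabbitMQ.Client.Events;

    class NotificationConsumer
    {
        public static void Start(IModel channel)
        {
            var consumer = new EventingBasicConsumer(channel);
            consumer.Received += (sender, ea) =>
            {
                try
                {
                    SendNotification(Encoding.UTF8.GetString(ea.Body.ToArray()));
                    channel.BasicAck(ea.DeliveryTag, multiple: false); // ack only on success
                }
                catch (Exception)
                {
                    // Requeue so the notification event is not lost.
                    channel.BasicNack(ea.DeliveryTag, multiple: false, requeue: true);
                }
            };
            channel.BasicConsume(queue: "notifications", autoAck: false, consumer: consumer);
        }

        static void SendNotification(string payload) { /* compose and send to the user */ }
    }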

How do RabbitMQ publisher confirms work?

I have gone through the RabbitMQ documentation:
https://www.rabbitmq.com/confirms.html#publisher-confirms
Using standard AMQP 0-9-1, the only way to guarantee that a message
isn't lost is by using transactions -- make the channel transactional
then for each message or set of messages publish, commit. In this
case, transactions are unnecessarily heavyweight and decrease
throughput by a factor of 250. To remedy this, a confirmation
mechanism was introduced. It mimics the consumer acknowledgements
mechanism already present in the protocol.
To enable confirms, a client sends the confirm.select method.
Depending on whether no-wait was set or not, the broker may respond
with a confirm.select-ok. Once the confirm.select method is used on a
channel, it is said to be in confirm mode. A transactional channel
cannot be put into confirm mode and once a channel is in confirm mode,
it cannot be made transactional.
Currently I am using RabbitTemplate.convertAndSend from the spring-rabbit library to send messages.
I am using a transactional channel to publish messages to RabbitMQ. As per the documentation this is slower, and I can improve the throughput by using publisher confirms.
But I am not very clear about it.
If I want to enable confirms, what changes are required, and how do I handle exceptions?
What will be my retry mechanism?
Do publisher confirms work in an asynchronous way?
And do transactions work synchronously?
Any suggestion is highly appreciated.
Using publisher confirms will not improve performance significantly over transactions if you wait for the confirm for each individual send. They help significantly if you send many messages and wait for the confirms later.
Transactions are synchronous. Confirms are completely asynchronous.
See Confirms and Returns.
When you enable confirms, you provide a callback to the template which will be called when the confirm is received. You add correlation data to the send, which is provided in the callback so you can determine which send this confirm is for. Furthermore, the correlation data (in recent versions) provides a Future<?> which you can wait on to receive the confirm in a synchronous manner.
That's where you would handle any exception(s).
I hope that helps.
There is a confirms and returns sample Spring Boot application in the samples repo but it was created before the future was added to the CorrelationData. That will be fixed soon.
The correlation data can contain the original message, enabling retry.
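Since the spring-rabbit callback wraps a broker-level feature, here is a hedged illustration of that underlying mechanism (the confirm.select mode from the quoted documentation) using the .NET RabbitMQ client; the queue name is made up, and spring-rabbit's ConfirmCallback surfaces the same acks and nacks:

    using System;
    using RabbitMQ.Client;

    class ConfirmingPublisher
    {
        public static void Publish(IModel channel, byte[] body)
        {
            channel.ConfirmSelect(); // put the channel in confirm mode

            // Confirms arrive asynchronously, keyed by publish sequence number.
            channel.BasicAcks += (s, ea) =>
                Console.WriteLine("Confirmed up to " + ea.DeliveryTag + " (multiple=" + ea.Multiple + ")");
            channel.BasicNacks += (s, ea) =>
                Console.WriteLine("Broker could not handle " + ea.DeliveryTag + "; a retry would go here");

            ulong seqNo = channel.NextPublishSeqNo; // correlate this publish with its confirm
            channel.BasicPublish(exchange: "", routingKey: "my-queue", basicProperties: null, body: body);
            Console.WriteLine("Published #" + seqNo);

            // Optional synchronous style: block until all outstanding confirms arrive.
            channel.WaitForConfirmsOrDie(TimeSpan.FromSeconds(5));
        }
    }

Note that blocking after every single publish (as the last line does) costs most of the throughput benefit; the asynchronous handlers above are what make confirms fast for batches.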

RabbitMQ+MassTransit: how to cancel queued message from processing?

In some exceptional situations I need to somehow tell the consumer on the receiving side that some messages shouldn't be processed. Otherwise the two systems will become out of sync (we deal with some outdated external systems, and if, for example, a connection is dropped, we have to discard all queued operations in the scope of that connection).
Take a risk and resolve problem messages manually? Compensation actions (that could be tough to support in my case)? Anything else?
There are a few ways:
You can set a time-to-live when sending a message: await endpoint.Send(myMessage, c => c.TimeToLive = TimeSpan.FromHours(1));, but this will apply to all messages that are sent (or published) like this. I would consider this, after looking at your requirements. This is technical, but it is a proper messaging pattern.
Make the TTL and a generation timestamp properties of your message itself and let the consumer decide if the message is still worth processing. This is more business-oriented and, probably, the most correct way.
Combine tech and business - keep the timestamp and TTL in message headers so they don't pollute your message contracts, and filter them out using custom middleware (see the filter sketch after this list). In this case, you need to be careful to log such drops so you won't be left wondering why messages disappear now and then.
Almost any unreliable integration can be monitored using sagas, with timeouts. For example, we use a saga to integrate with Twilio. Since we have no ability to open a webhook for them, we poll after some interval to check the message status. You can start a saga when you get a message and schedule a message to check if the processing is still waiting. As discussed in comments, you can either use the "human intervention required" way to fix the issue or let the saga decide to drop the message.
A similar way could be to use a lookup table, where you put the list of messages that aren't relevant for processing. Such a table would be similar to the list of sagas. It seems that this way would also require scheduling. Both here and for the saga, I'd recommend using a separate receive endpoint (a queue) for the DropIt message, with only one consumer. It would prevent DropIt messages from getting stuck behind the integration messages that are waiting to be processed (some of which should already have been dropped).
Use the RMQ management API to remove messages from the queue. This is the worst method; I won't recommend it.
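A hedged sketch of option 3's middleware, as a MassTransit consume filter; the "expires-at" header name is made up, and in MassTransit v7 and earlier the filter types live in GreenPipes (in v8 they moved into the MassTransit namespace):

    using System;
    using System.Threading.Tasks;
    using GreenPipes;
    using MassTransit;

    public class DropExpiredFilter<T> : IFilter<ConsumeContext<T>> where T : class
    {
        public async Task Send(ConsumeContext<T> context, IPipe<ConsumeContext<T>> next)
        {
            if (context.Headers.TryGetHeader("expires-at", out var raw) &&
                DateTime.TryParse(raw?.ToString(), out var expiresAt) &&
                expiresAt < DateTime.UtcNow)
            {
                // Drop it, but leave a trace so disappearing messages stay explainable.
                Console.WriteLine("Dropping expired " + typeof(T).Name);
                return; // not calling next consumes the message without processing it
            }

            await next.Send(context);
        }

        public void Probe(ProbeContext context) => context.CreateFilterScope("dropExpired");
    }

Register it on the receive endpoint with something like cfg.UseFilter(new DropExpiredFilter<MyMessage>()).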
From what I understand, you're building a system that sends messages to 3rd party systems; in other words, systems you don't control. It has an API, but compensating actions aren't always possible, either because the API doesn't provide them or because actions performed inside the 3rd party system can't be compensated or rolled back?
If possible, try to solve this via sagas. Make sure the saga executes the different steps (the sending of messages) in the right order, so that messages that cannot be compensated are sent last. This way, messages that can be compensated will be compensated by the saga if they fail, and the ones that cannot be compensated are sent only at the end, when you're as sure as possible that they won't have to be, because that last message is the final step in synchronizing all systems.
All in all, this is one of the problems with distributed systems: keeping everything in sync. Compensating actions are the way to deal with this. If compensating actions aren't possible, you're in a very difficult situation. Try to see if the business can help by becoming more flexible, accepting the need to compensate things even where they'd initially tell you it's not possible.
In some exceptional situations I need somehow to tell consumer on receiving point that some messages shouldn’t be processed.
Can't you invert this into:
Tell the consumer that an earlier message can be processed.
This way you can easily turn this into a state machine (like a saga) that acts on two messages. If the 2nd message never arrives, you can discard the 1st after a while, or do something else.
The strategy here is to halt/wait until certain that no actions need to be reverted.

Error notification on mule flows

I am relatively new to Mule and I'm wondering if there is a built-in error notification when there is an error on a Mule flow, or if this can be set up in the MMC to trigger an alert if something is wrong with the flow. Please advise.
Thanks, and have a good day!
You can do it in several ways.
The MMC can do log analysis and send an email if the log matches a certain pattern.
You can simply add exception handling in the flow, like a catch exception strategy, and send a mail there; I wrote a blog post with a sample of this case.
You can use the Mule notification system to create a Java class that will manage this notification, maybe using the log4j SMTP appender to easily send notification mails.
These are my opinions about the three methods:
The first method relies on the MMC being up. The MMC creates load on your Mule server anyway by making calls, and in some production environments it may be disabled depending on the internal policy of the company. Furthermore, if it goes down for some reason you will not receive any notification, so you also need to make sure your MMC stays highly available. Not an option for me.
I find the second method the most appropriate, as it is similar to standard exception handling in programming: manage your exceptions where you need to, and never let them pass silently. When needed, send some via mail.
The third approach is not bad but not my favorite, because a lot of the time you will see "fake exceptions" coming in that you need to filter. One example is when a client drops the connection to Mule (closing the browser, for example): you will get a socket exception because Mule cannot write back. This is totally normal, and I don't think you want to be spammed by this kind of mail. If you really want to use this system, keep in mind that you will need to filter out non-critical exceptions.
Hope this helps

Persisting Data in a Twisted App

I'm trying to understand how to persist data in a Twisted application. Let's say I've decided to write a Twisted server that:
Accepts inbound SMTP requests
Sends the message to a 3rd party system for modification
Relays the modified message to its destination
A typical Twisted tutorial would have you build this app using Deferreds and callbacks, roughly:
A Factory handles inbound requests
Each time a full email is received a call is sent to the remote message processor, returning a deferred
Add an errback that substitutes the original message if anything goes wrong in the modify call.
Add a callback to send the message on to the recipient, which again returns a deferred.
A real server would add/include additional call/errbacks to retry or notify the sender or whatnot. Again for simplicity, assume we consider this an acceptable amount of effort and just log errors.
Of course, this persists NO data in the event of a crash/restart/something else. I get that a solution involves a 3rd party persistent datastore (RabbitMQ is often mentioned) and could probably come up with a dozen random ways to achieve the outcome.
However, I imagine there are a few approaches that work best in a Twisted app. What do they look like? How do they store (and restore in the event of a crash) the in-process messages?
If you found this question, you probably already know that Twisted is event-based. It sounds simple, but the "hardest" part of the answer is to get the persistence platform generating the events we need when we need them. Naturally, you can persist the data in a DB or a message queue, but some platforms don't naturally generate events. For example:
ZeroMQ has (or at least had) no callback for new data. It's also relatively poor at persistence.
In other cases, events are easy but reliability is a problem:
PostgreSQL can be configured to generate events using triggers, but they're one-time things, so you can't resume incomplete events
The light at the end of the tunnel seems to be something like RabbitMQ.
RabbitMQ can persist the message in a database to survive a crash
Importantly, RabbitMQ supports acknowledgements, and we can use them on both legs (incoming SMTP to RabbitMQ, and RabbitMQ to outgoing SMTP) to ensure the application is reliable.
Finally, several of the RabbitMQ clients provide full asynchronous support (see, for example, pika, txAMQP, and puka)
It's enough for our purposes that the RabbitMQ client provides us an event-based interface.
At a more theoretical level, however, this need not be the case. In fact, despite the "notification" issue above, ZeroMQ has an event-based client. Even if our software is elegantly event-based, we will run into systems that aren't. In these cases, we have no choice but to fall back on polling. In principle, if not in practice, we just query the message provider for messages. When we exhaust the current queue (and immediately, if there are no messages), we use a reactor.callLater call to check again in the future. It may feel like an anti-pattern, but it's (to the best of my knowledge, anyway) the right way to handle this particular case.
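For what it's worth, the poll-and-reschedule shape is framework-agnostic; here is a hedged sketch of the same idea in C#, where the fetch and process delegates are hypothetical stand-ins for the message provider and handler:

    using System;
    using System.Threading.Tasks;

    static class Poller
    {
        public static async Task PollLoopAsync(
            Func<Task<string>> fetchMessageAsync,   // returns null when the queue is empty
            Func<string, Task> processAsync,
            TimeSpan idleDelay)
        {
            while (true)
            {
                var message = await fetchMessageAsync();
                if (message != null)
                    await processAsync(message);    // drain while messages keep arriving
                else
                    await Task.Delay(idleDelay);    // queue exhausted: check again later
            }
        }
    }

In Twisted itself, the equivalent move is scheduling the next poll with reactor.callLater instead of awaiting a delay.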