I am using BeginPeek() (no parameters) to subscribe to messages arriving in my private queue. This is done in a service hosted in the NServiceBus host. When NServiceBus encounters a transport connection timeout exception (I'm seeing "circuit breaker armed" logs and timeout exception logs), the peek event subscription seems to get lost. When database connectivity becomes stable again and new messages arrive in my queue, the service is no longer notified.
Any ideas or suggestions on how to address this?
I have read the MT documentation on error handling and faults, added some code to publish the fault, and written a fault consumer that listens for the fault message after a number of retries with Polly.
I have a queue consumer that gets messages from RabbitMQ using MassTransit and sends them to a cloud system through an HTTP API. I have handled all possible exceptions and also wrapped the HTTP calls in a Polly retry for transient network errors. The problem with this approach is that the message is effectively abandoned once the retries are exhausted.
Assume the destination system is down for 10 hours (an outage we don't know about in advance, otherwise I would plan to stop the consumer service). What is the best strategy with MassTransit to stop pulling messages from the queue into the consumer? Is there a way to stop receiving messages based on the number of failures, etc.?
Thanks
You need a circuit breaker, a well-known pattern in distributed systems. The circuit breaker activates when the remote system is struggling under load and sending it more requests would potentially strangle it. It also allows you to stop sending messages to the remote system when it is down.
The circuit breaker is available in MassTransit out of the box.
I would also not recommend implementing retries using Polly in the consumer. MassTransit has a comprehensive set of retry policies, and using them lets MassTransit see how many failures occur in the consumer, which is not possible when you use Polly. For example, the circuit breaker middleware won't know about failures in a Polly-wrapped call and therefore won't react properly.
If the remote system is down for a long time (hours, as you described), any retry policy with a limited number of attempts will eventually fail. The circuit breaker will open, but it will reset from time to time and try calling the consumer again. Otherwise, it would never know when the remote system has recovered. So you would need to either recover messages from the error queue or add the redelivery middleware.
You can therefore configure your receive pipeline this way:
redelivery -> circuit breaker -> retry -> consumer
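To make the ordering concrete, here is a minimal, language-agnostic sketch in Python. It is not MassTransit's API: `CircuitBreaker`, `with_retry`, and `process` are hypothetical names, and the thresholds are arbitrary. It only illustrates the nesting, with retry closest to the consumer, the breaker counting a failure once retries are exhausted, and redelivery parking whatever still fails.

```python
import time


class CircuitOpenError(Exception):
    """Raised when the breaker rejects a call without invoking the consumer."""


class CircuitBreaker:
    def __init__(self, trip_threshold, reset_seconds, clock=time.monotonic):
        self.trip_threshold = trip_threshold  # consecutive failures before opening
        self.reset_seconds = reset_seconds    # how long to stay open before a trial call
        self.clock = clock
        self.failures = 0
        self.opened_at = None

    def call(self, func, message):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_seconds:
                raise CircuitOpenError("circuit open, consumer not called")
            # Reset interval elapsed: fall through and allow one trial call.
        try:
            result = func(message)
        except Exception:
            self.failures += 1
            # A failed trial call, or reaching the threshold, (re)opens the circuit.
            if self.opened_at is not None or self.failures >= self.trip_threshold:
                self.opened_at = self.clock()
            raise
        self.failures = 0
        self.opened_at = None
        return result


def with_retry(func, attempts):
    """Wrap func so it is tried up to `attempts` times before the error escapes."""
    def wrapped(message):
        for attempt in range(1, attempts + 1):
            try:
                return func(message)
            except Exception:
                if attempt == attempts:
                    raise
    return wrapped


def process(message, breaker, handler, redelivery_queue):
    """Outermost layer: park the message for later redelivery if anything fails."""
    try:
        breaker.call(handler, message)
        return True
    except Exception:
        redelivery_queue.append(message)
        return False
```

With `handler = with_retry(consumer, attempts=3)`, the breaker only sees one failure per exhausted retry cycle, and while it is open the consumer is not called at all; the message goes straight to the redelivery queue instead of being lost.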
I have experienced the following situation with an ActiveMQ Pub/Sub implementation. If the connection to the message broker is lost the publisher could re-try to establish a connection since the publish method would throw an exception.
However if the connection to the message broker is lost at the subscriber end, the subscriber would not know it. This would be the same if the session expires.
Proposed solution:
One solution I thought of was to implement a heartbeat at the subscriber end that periodically publishes a ping message to a separate topic, so that the subscriber can tell when the connection has dropped. This works fine, but the downside is the number of ping messages generated by all the subscribers in the system. The second option I thought of was to implement the heartbeat by trying to create a connection at a regular interval. WDYT?
Do you see a better way of implementing this? Appreciate your thoughts.
Use the ActiveMQ Failover transport and don't disable the inactivity monitor; the client will then check the connection and automatically reconnect as needed. Without more information on your set-up, that's about the best answer.
I've got an iOS application which uses a STOMP Client to talk to RabbitMQ. The application loads a lot of state during startup, and then keeps that state in sync by receiving updates published on STOMP. Of course, if it loses its connection, it can no longer be sure it's in sync, and therefore has to re-load that large initial blob. Any kind of network interruption triggers this behavior and makes my customers sad.
There are a lot of big-picture ways to fix this (and I'm working on them) but in the meantime, I'm trying to use persistent queues to solve this problem. The idea is that the server will create a queue, bind it to the appropriate topics, and then start building the large startup bundle. When finished, it will hand everything off to the client. The client will set itself up with the startup bundle, open a subscription to the queue, and then process any updates which happened while the server was getting things ready. Similarly, if the client should become disconnected, it can simply reconnect and resume reading the messages it finds in the queue.
My problem is that while the client successfully receives messages sent after it connects, if there were any messages in the queue before it connected, they are not read. Likewise, if the client becomes disconnected, when it reconnects, it won't see any messages which arrived while it was away.
Can anyone suggest how I might get the client to be able to read those missing messages?
It turns out what was happening was that the STOMP adapter was consuming the messages but failing to deliver them. Thus, when the client reconnected, it wouldn't have any messages waiting for it.
To fix the problem, I changed the "ack" setting on the subscribe request to "client", meaning that STOMP shouldn't consider the message delivered until the client sends back an ACK frame. By changing my client appropriately, messages now get delivered even after the client has been away.
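For anyone wanting to see what that looks like on the wire, here is a rough sketch in Python that just builds the raw frames. It assumes STOMP 1.1 header names (`message-id` and `subscription` on ACK); the helper names and the destination and ids in the usage note are made up for illustration and are not from my actual client.

```python
def stomp_frame(command, headers, body=""):
    """Build a raw STOMP frame: command, header lines, blank line, body, NUL."""
    lines = [command] + [f"{key}:{value}" for key, value in headers.items()]
    return "\n".join(lines) + "\n\n" + body + "\x00"


def subscribe_frame(destination, subscription_id):
    # ack:client means the broker waits for an explicit ACK from the client
    # before treating the message as delivered.
    return stomp_frame(
        "SUBSCRIBE",
        {"id": subscription_id, "destination": destination, "ack": "client"},
    )


def ack_frame(subscription_id, message_id):
    # Sent by the client after it has actually processed the MESSAGE frame.
    return stomp_frame(
        "ACK",
        {"subscription": subscription_id, "message-id": message_id},
    )
```

For example, `subscribe_frame("/queue/updates", "sub-0")` followed later by `ack_frame("sub-0", "<id from the MESSAGE frame>")`. As far as I understand the 1.1 spec, `client` mode acknowledgments are cumulative, so there is also `client-individual` if you want to acknowledge one message at a time.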
I know that once a message has been delivered to MSMQ by a WCF client, the netMsmqBinding provides retries out of the box in case the service faults.
But if my client fails to put the message in MSMQ in the first place, is there an out-of-the-box client retry available in WCF, or do I have to implement a client queue and retry logic in my client code?
Thanks
It is highly unlikely that messages sent to the service will not even be placed in the client queue in the first place. If you have the MSMQ server running on the client station and the MSMQ listener service is up and running, you should have nothing to worry about. I don't think MSMQ offers anything to check this for you. You could code a method on your client that periodically calls Peek() on the local queue and sends an acknowledgment receipt for every message that has reached the queue. This is feasible since you can easily access your local queues in code, and every message sent via MSMQ from a client to a service always goes through the local queue. You can also tell that the message has reached the queue if your Send() method doesn't return an error. But I don't think you really need to worry about messages on the client not reaching the local queue first.
I am not an expert on MSMQ or WCF, but I have read up a fair bit about them and they sound and look great.
I am trying to develop something that needs to be robust and durable, but first some theory.
MSMQ I guess will be hosted on a separate server.
There will be 2 WCF services. One for incoming messages and the other for outgoing messages (takes a message, does some internal processing/validation then places it on the outgoing messages queue or maybe sending an email/text message/whatever)
I understand with the right configuration, we can have the system so that it can be transactional (no messages are ever lost) and can be sent exactly once, so no chance of duplication of messages.
The applications/services will be multithreaded to process messages, of which there will be hundreds or thousands.
BUT during the processing of a message or through the service's lifetime, what if the server crashes? What if the server reboots? What if the service throws an exception for whatever reason? How is it possible not to lose that message, but somehow put it back on the queue to wait to be processed again?
Also, how is it possible to make the service robust enough that it will restart itself?
I'd appreciate any advice and details here. There is quite a lot to take in, and WCF/MSMQ exposes quite a lot of options.
Your assumption:
MSMQ I guess will be hosted on a separate server.
is incorrect. MSMQ is installed on all machines which want to participate in message queuing.
There will be 2 WCF services. One for incoming messages and the other
for outgoing messages
In the most typical configuration, the destination queues are local to the listening service.
For example, your ServiceA would have a local queue from which it reads. ServiceB also has a local queue from which it reads. If ServiceA wants to call ServiceB it will put a message into ServiceB's local queue.
I understand with the right configuration, we can have the system so
that it can be transactional (no messages are ever lost)
This is correct. This is because MSMQ uses a messaging pattern called store-and-forward. See here for an explanation.
Essentially the reason it is safe to assume no message loss is because the transmission of a message from one machine to another actually takes place under three distinct transactions.
First transaction: ServiceA writes to its own temporary local queue. If this fails, the transaction rolls back and ServiceA can handle the exception.
Second transaction: The queue manager on the ServiceA machine transmits the message to the queue manager on the ServiceB machine. If this fails, the message remains on the temporary queue.
Third transaction: ServiceB reads the message off its local queue. If the ServiceB message handler method throws an exception, the transaction rolls the message back to the local queue.
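As a rough illustration of those three steps, here is a small in-memory simulation in Python. It is only a sketch of the store-and-forward idea and does not use or mirror the MSMQ API; `Machine`, `send`, `transmit`, and `receive` are hypothetical names, and a real queue manager persists to disk and uses real transactions.

```python
from collections import deque


class Machine:
    def __init__(self, name):
        self.name = name
        self.outgoing = deque()  # temporary local queue for messages awaiting transmission
        self.inbox = deque()     # local destination queue read by the service
        self.online = True


def send(sender, destination, message):
    """Step 1: the sender writes to its own outgoing queue."""
    sender.outgoing.append((destination, message))


def transmit(sender):
    """Step 2: move messages to online destinations; keep the rest locally."""
    remaining = deque()
    while sender.outgoing:
        destination, message = sender.outgoing.popleft()
        if destination.online:
            destination.inbox.append(message)
        else:
            remaining.append((destination, message))
    sender.outgoing = remaining


def receive(machine, handler):
    """Step 3: read one message; put it back if the handler throws."""
    if not machine.inbox:
        return False
    message = machine.inbox.popleft()
    try:
        handler(message)
    except Exception:
        machine.inbox.appendleft(message)  # simulated rollback
        return False
    return True
```

If ServiceB's machine is offline, `transmit` leaves the message in ServiceA's outgoing queue and a later `transmit` delivers it. If the handler passed to `receive` raises, the message is back at the head of the inbox for the next attempt.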
The applications/services will be multithreaded to process messages
This is fine except if you require order to be preserved in the message processing chain. If you need ordered processing then you cannot have multiple threads without implementing a re-sequencer to reapply order.
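A re-sequencer can be quite small. The following Python sketch assumes each message carries a consecutive sequence number assigned by the sender, which is an assumption on my part and not something MSMQ provides for you; `Resequencer` and `accept` are hypothetical names.

```python
import threading


class Resequencer:
    """Buffers out-of-order results and releases them in sequence order."""

    def __init__(self, first_sequence=0):
        self.next_sequence = first_sequence
        self.pending = {}
        self._lock = threading.Lock()  # workers may call accept() concurrently

    def accept(self, sequence, payload):
        """Store one result; return every payload that can now be released in order."""
        with self._lock:
            self.pending[sequence] = payload
            released = []
            while self.next_sequence in self.pending:
                released.append(self.pending.pop(self.next_sequence))
                self.next_sequence += 1
            return released
```

Worker threads process messages in any order and hand their results to `accept`; anything that arrives early is held until the gap before it has been filled.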
I thought that MSMQ can be hosted separately and have x servers share
that queue?
All servers which want to participate in the exchange of messages have MSMQ installed. Each server can then write to any queue on any other server.
The reason for my thinking was because what if the server goes down?
Then how will the messages get sent/received into MSMQ
If the queues are transactional then that means messages on them are persisted to disk. If the server goes down then when it comes back up the messages are still there. While a server is down it obviously cannot participate in the exchange of messages. However, messages can still be "sent" to that server - they just remain local to the sender (in a temporary queue) until the destination server comes back on-line.
so by having one central MSMQ server (and having it mirrored/failover)
then there will be a guarantee of uptime
The whole point of using message queueing is that it is a fault-tolerant transport, so you don't need to guarantee uptime. If you had 100% availability, there would be little reason to use message queueing.
how will WCF be notified of messages that are incoming?
Each service will listen on its own local queue. When a message arrives, the WCF runtime causes the handling method to be called and the message to be handled.
how will the service be notified of failures of sending messages
If ServiceA fails to transmit a message to ServiceB then ServiceB will never be notified of that failure. Nor should it be. ServiceA will handle the failure to transmit, not ServiceB. Your expectation in this instance creates a hard coupling between the services, something which message queueing is supposed to remove.
MSMQ can store messages even if the service is temporarily shut down or the computer is rebooted.
The main goal of WCF is to transport a message from source to destination, regardless of the transport. In your case MSMQ is the transport for WCF, so the client and the service do not both have to be online at the same time. But once a message is received, it is your responsibility to process it correctly, whatever transport was used to send it.