Issue while sending messages in a loop to a Windows RabbitMQ broker - rabbitmq

My Setup
I am sending about 15 messages in a loop from one machine to another via RabbitMQ.
There is a NAT setup between the sending and receiving machines.
I am using Spring RabbitMQ for all RabbitMQ operations.
On the receiving machine, I sometimes lose 2 messages, which are never received even after waiting a long time.
I also don't see any messages accumulated in the queue (on either the sending or the receiving machine).
There is only one listener for the queue, and it runs on the receiving machine.
My Question
If I send messages in a loop to RabbitMQ, is there any chance that it rejects some messages because it can't handle them? The overall size of the 15 messages is close to 8 MB.
I don't see any exceptions when I send the messages to RabbitMQ.
SENDING MACHINE CODE
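// Field of the sending class; it must be set to a configured template before send() is called.
private RabbitTemplate rabbitTemplate = null;

@Override
public boolean send(final Message message, final String routingKey)
        throws SinecnmsMessagingException {
    // Publishes to the template's default exchange using the given routing key.
    rabbitTemplate.send(routingKey, message);
    return true;
}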
RECEIVING MACHINE CODE
<rabbit:listener-container
    connection-factory="connectionFactory">
  <rabbit:listener ref="onMessageCommand"
      queue-names="TestQueue" />
</rabbit:listener-container>
<bean id="onMessageCommand"
    class="com.test.OnMessageListner">
  <property name="callBackObject" ref="callbackEvent" />
  <property name="template" ref="amqpTemplate" />
</bean>
<bean id="callbackEvent" class="com.test.SettingsListener"></bean>
OnMessageListner implements MessageListener.
I receive the messages in the SettingsListener class. This works fine for me in other code that I have developed; I only observe the issue in the use case described here.

So does this mean that publisher confirms were introduced because RabbitMQ may sometimes "reject/not accept" messages? With publisher confirms, we can know whether the first message was received by the RabbitMQ broker and only then send the second message.
Can we conclude this?
No, you cannot. Waiting for each confirmation would slow down publishing; confirms are designed so that you send a batch of messages and then wait for the confirms.
Confirms were not introduced because RabbitMQ may sometimes "reject/not accept" messages. Publishing with RabbitMQ is asynchronous, so a publish call generally succeeds, but anything can happen between sending the message and its arrival at the broker. If the connection is lost, the client is told, but that is too late for the publisher, whose publish call has already completed successfully.
NAT should make no difference, but a flaky network router might be the problem.
You can use a network monitor (e.g. Wireshark) to see what's happening.
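The "send a batch, then wait for the confirms" idea can be sketched without a broker. The class below is a hypothetical, broker-free model (ConfirmTracker is not a RabbitMQ or Spring class): it only does the bookkeeping a publisher needs, namely remembering each published message by sequence number and dropping it once an ack arrives. The handleAck(seqNo, multiple) shape mirrors the ConfirmListener callback of the RabbitMQ Java client, where multiple=true confirms everything up to and including that sequence number; wiring it to a real channel (confirmSelect() plus a confirm listener) is left out here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentSkipListMap;

/** Broker-free model of publisher-confirm bookkeeping (hypothetical helper, not a RabbitMQ API). */
final class ConfirmTracker {
    // Sequence number -> payload for every message published but not yet confirmed.
    private final ConcurrentSkipListMap<Long, String> outstanding = new ConcurrentSkipListMap<>();
    private long nextSeqNo = 1;

    /** Records a publish and returns the sequence number assigned to it. */
    synchronized long publish(String payload) {
        long seqNo = nextSeqNo++;
        outstanding.put(seqNo, payload);
        return seqNo;
    }

    /** Handles an ack; multiple=true confirms every message up to and including seqNo. */
    void handleAck(long seqNo, boolean multiple) {
        if (multiple) {
            outstanding.headMap(seqNo, true).clear();
        } else {
            outstanding.remove(seqNo);
        }
    }

    /** Payloads still unconfirmed, in publish order; candidates for republishing after a timeout. */
    List<String> unconfirmed() {
        return new ArrayList<>(outstanding.values());
    }
}
```

With this kind of tracking, the 15 messages from the question could all be published first, and whatever is still listed by unconfirmed() after a timeout would be the messages that never reached the broker.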

Related

Check if BeginPeek is still Subscribed

I am using BeginPeek() (no parameters) to subscribe to messages coming in to my private queue. This is done in a service hosted in the NServiceBus host. When NServiceBus encounters a transport connection timeout exception (I'm seeing "circuit breaker armed" logs and timeout exception logs), the peek event subscription seems to get lost. When database connectivity becomes stable and new messages come in to my queue, the service is no longer notified.
Any ideas or suggestions on how to address this?

RabbitMQ dropping messages after the first one

I'm using celery 3.0.18 with RabbitMQ 3.0.2. I have a task sent to another application using celery.send_task. I can see the send_task call in my logs, I can see the packets leaving the worker instance, and I can see the packets reaching the RabbitMQ instance when I call tcpflow -ce -i any port 5672. However, only the first message gets to the queue. They all have the same routing key. I tried recreating the exchange and bindings, and even a new RabbitMQ instance, and nothing seems to work. This worked fine for months, until we had to rebuild RabbitMQ from scratch after a crash in our AWS infrastructure. Strangely, I have the exact same setup working in another application, using the same broker and the same exchange, binding and queue, and it works perfectly there. It also works when I send the messages to the same exchange using the same call from a management script, run from the shell on the same instance, but it doesn't work when the message is sent from the celery task in the worker process.
Any ideas on what the problem might be?
Eventually, I figured out what was wrong, but it's not clear whether this is the expected behavior, a celery bug, or a RabbitMQ bug.
Besides our application tasks, I have a custom logging handler that sends logs to a central location through RabbitMQ, using celery.send_task. This logging handler sends messages to an exchange named application.logger, with a routing key like application.logger.info, application.logger.warning, etc., and the exchange has bindings that route some logging levels to specific queues. This exchange and its bindings and queues were created directly in RabbitMQ and not defined in the Celery routes.
When the worker tried to send a message to this exchange and it didn't exist, Celery logged a 404 NOT_FOUND error. After that, tasks sent to other exchanges using the same connection weren't delivered. They were sent by the worker instance, we could see the packets arriving, and the RabbitMQ management screen for that connection even showed the data arriving from the client in kb/s, but no messages were delivered.

How do you replay missed messages when using STOMP to connect to RabbitMQ?

I've got an iOS application which uses a STOMP Client to talk to RabbitMQ. The application loads a lot of state during startup, and then keeps that state in sync by receiving updates published on STOMP. Of course, if it loses its connection, it can no longer be sure it's in sync, and therefore has to re-load that large initial blob. Any kind of network interruption triggers this behavior and makes my customers sad.
There are a lot of big-picture ways to fix this (and I'm working on them) but in the meantime, I'm trying to use persistent queues to solve this problem. The idea is that the server will create a queue, bind it to the appropriate topics, and then start building the large startup bundle. When finished, it will hand everything off to the client. The client will set itself up with the startup bundle, open a subscription to the queue, and then process any updates which happened while the server was getting things ready. Similarly, if the client should become disconnected, it can simply reconnect and resume reading the messages it finds in the queue.
My problem is that while the client successfully receives messages sent after it connects, if there were any messages in the queue before it connected, they are not read. Likewise, if the client becomes disconnected, when it reconnects, it won't see any messages which arrived while it was away.
Can anyone suggest how I might get the client to be able to read those missing messages?
It turns out what was happening was that the STOMP adapter was consuming the messages but failing to deliver them. Thus, when the client reconnected, it wouldn't have any messages waiting for it.
To fix the problem, I changed the "ack" setting on the subscribe request to "client", meaning that STOMP shouldn't consider the message delivered until the client sends back an ACK frame. By changing my client appropriately, messages now get delivered even after the client has been away.
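As a rough illustration of that fix, the frames involved might look like the ones built below. This is only a sketch under assumptions: the StompFrames helper, the destination /queue/updates, and the ids are hypothetical, and the ACK headers follow the STOMP 1.1 form (subscription plus message-id); STOMP 1.2 uses an id header instead, so check which version the client negotiates. Header value escaping is omitted.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Builds raw STOMP frames as strings (hypothetical helper; assumes STOMP 1.1 header names). */
final class StompFrames {
    /** Serialises a frame: command line, header lines, blank line, body, NUL terminator. */
    static String frame(String command, Map<String, String> headers, String body) {
        StringBuilder sb = new StringBuilder(command).append('\n');
        for (Map.Entry<String, String> h : headers.entrySet()) {
            sb.append(h.getKey()).append(':').append(h.getValue()).append('\n');
        }
        return sb.append('\n').append(body).append('\0').toString();
    }

    /** SUBSCRIBE with ack:client, so a message only counts as delivered once the client ACKs it. */
    static String subscribe(String destination, String subscriptionId) {
        Map<String, String> headers = new LinkedHashMap<>();
        headers.put("id", subscriptionId);
        headers.put("destination", destination);
        headers.put("ack", "client");
        return frame("SUBSCRIBE", headers, "");
    }

    /** ACK for one received MESSAGE frame, identified by its message-id header. */
    static String ack(String messageId, String subscriptionId) {
        Map<String, String> headers = new LinkedHashMap<>();
        headers.put("subscription", subscriptionId);
        headers.put("message-id", messageId);
        return frame("ACK", headers, "");
    }
}
```

The client would send subscribe(...) once after connecting and ack(...) after it has finished processing each MESSAGE frame; anything not yet acknowledged when the connection drops should then stay in the queue for the next session.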

MSMQ, WCF and robustness

I'm not an expert on MSMQ or WCF, but I have read up a fair bit about them and they sound and look great.
I am eventually going to develop something that needs to be robust and durable, but first I want to get some theory straight.
MSMQ, I guess, will be hosted on a separate server.
There will be 2 WCF services: one for incoming messages and the other for outgoing messages (it takes a message, does some internal processing/validation, then places it on the outgoing message queue or perhaps sends an email/text message/whatever).
I understand that with the right configuration, the system can be transactional (no messages are ever lost) and each message can be sent exactly once, so there is no chance of duplicate messages.
The applications/services will be multithreaded to process messages, of which there will be hundreds and thousands.
But during the processing of a message, or at any point in the service's lifetime, what if the server crashes? What if the server reboots? What if the service throws an exception for whatever reason? How is it possible not to lose that message, but somehow put it back on the queue to be processed again?
Also, how is it possible to make the service robust enough that it will start itself again?
I'd appreciate any advice and details here. There is quite a lot to take in, and WCF/MSMQ expose quite a lot of options.
Your assumption:
MSMQ I guess will be hosted on a separate server.
is incorrect. MSMQ is installed on all machines which want to participate in message queuing.
There will be 2 WCF services. One for incoming messages and the other
for outgoing messages
In the most typical configuration, the destination queues are local to the listening service.
For example, your ServiceA would have a local queue from which it reads. ServiceB also has a local queue from which it reads. If ServiceA wants to call ServiceB it will put a message into ServiceB's local queue.
I understand with the right configuration, we can have the system so
that it can be transactional (no messages are ever lost)
This is correct. This is because MSMQ uses a messaging pattern called store-and-forward. See here for an explanation.
Essentially the reason it is safe to assume no message loss is because the transmission of a message from one machine to another actually takes place under three distinct transactions.
First transaction: ServiceA writes to its own temporary local queue. If this fails, the transaction rolls back and ServiceA can handle the exception.
Second transaction: the queue manager on ServiceA's machine transmits the message to the queue manager on ServiceB's machine. If this fails, the message remains on the temporary queue.
Third transaction: ServiceB reads the message off its local queue. If ServiceB's message handler throws an exception, the transaction rolls the message back onto the local queue.
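The three steps above can be modelled in a few lines of plain Java. This is a toy, in-memory sketch of the store-and-forward idea only, not MSMQ or WCF code: the StoreAndForward class, its two queues, and the linkUp flag are all hypothetical stand-ins for the sender's temporary queue, the receiver's local queue, and network availability.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Consumer;

/** Toy in-memory model of store-and-forward (hypothetical; real MSMQ persists queues to disk). */
final class StoreAndForward {
    final Deque<String> outgoing = new ArrayDeque<>();    // sender's temporary local queue
    final Deque<String> destination = new ArrayDeque<>(); // receiver's local queue
    boolean linkUp = true;                                // whether the receiver is reachable

    /** Step 1: the sender only ever writes to its own local queue. */
    void send(String message) {
        outgoing.addLast(message);
    }

    /** Step 2: move messages across while the link is up; otherwise they stay on the sender. */
    int transmit() {
        int moved = 0;
        while (linkUp && !outgoing.isEmpty()) {
            destination.addLast(outgoing.peekFirst()); // copy first, then remove from the sender
            outgoing.removeFirst();
            moved++;
        }
        return moved;
    }

    /** Step 3: hand one message to the handler; if it throws, put the message back. */
    boolean receive(Consumer<String> handler) {
        String message = destination.pollFirst();
        if (message == null) {
            return false;
        }
        try {
            handler.accept(message);
            return true;
        } catch (RuntimeException e) {
            destination.addFirst(message); // "roll back": the message is available again
            return false;
        }
    }
}
```

A failure at any step leaves the message on one of the two queues, which is the property the answer relies on when it says no message is lost.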
The applications/services will be multithreaded to process messages
This is fine except if you require order to be preserved in the message processing chain. If you need ordered processing then you cannot have multiple threads without implementing a re-sequencer to reapply order.
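A re-sequencer of the kind mentioned here can be quite small. The sketch below assumes each message carries a consecutive sequence number assigned by the sender, which is an assumption and not something MSMQ provides for you; the Resequencer class is hypothetical. It buffers out-of-order arrivals and releases items only when the next expected number is present.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Restores sequence order for items that worker threads finish out of order (hypothetical helper). */
final class Resequencer<T> {
    private final Map<Long, T> pending = new HashMap<>(); // items that arrived too early
    private long nextExpected;

    Resequencer(long firstSequenceNumber) {
        this.nextExpected = firstSequenceNumber;
    }

    /** Accepts one item and returns every item that can now be released, in order. */
    synchronized List<T> accept(long sequenceNumber, T item) {
        pending.put(sequenceNumber, item);
        List<T> ready = new ArrayList<>();
        while (pending.containsKey(nextExpected)) {
            ready.add(pending.remove(nextExpected));
            nextExpected++;
        }
        return ready;
    }
}
```

Worker threads can call accept(...) in whatever order they finish; the returned lists, concatenated, are always in sequence order. A missing sequence number stalls everything behind it, so a real implementation would also need a timeout or gap policy.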
I thought that MSMQ can be hosted separately and have x servers share
that queue?
All servers which want to participate in the exchange of messages have MSMQ installed. Each server can then write to any queue on any other server.
The reason for my thinking was because what if the server goes down?
Then how will the messages get sent/received into MSMQ
If the queues are transactional then that means messages on them are persisted to disk. If the server goes down then when it comes back up the messages are still there. While a server is down it obviously cannot participate in the exchange of messages. However, messages can still be "sent" to that server - they just remain local to the sender (in a temporary queue) until the destination server comes back on-line.
so by having one central MSMQ server (and having it mirrored/failover)
then there will be guarantee of uptime
The whole point of using message queueing is that it is a fault-tolerant transport, so you don't need to guarantee uptime. If you had 100% availability, there would be little reason to use message queueing.
how will WCF be notified of messages that are incoming?
Each service will listen on its own local queue. When a message arrives, the WCF runtime causes the handling method to be called and the message to be handled.
how will the service be notified of failures of sending messages
If ServiceA fails to transmit a message to ServiceB then ServiceB will never be notified of that failure. Nor should it be. ServiceA will handle the failure to transmit, not ServiceB. Your expectation in this instance creates a hard coupling between the services, something which message queueing is supposed to remove.
MSMQ keeps stored messages even if you temporarily shut down the service or reboot the computer.
The main goal of WCF is to transport a message from source to destination; it doesn't matter what the transport is. In your case MSMQ is the transport for WCF, so the client and the service do not both have to be online at the same time. But once a message is received, it's your responsibility to process it correctly, regardless of which transport was used to send it.

Behavior of channels in "confirm" mode with RabbitMQ

I'm having some trouble understanding confirms in RabbitMQ. I found the following explanation in the RabbitMQ documentation:
Notes
The broker loses persistent messages if it crashes before said
messages are written to disk. Under certain conditions, this causes
the broker to behave in surprising ways. For instance, consider this
scenario:
a client publishes a persistent message to a durable queue
a client consumes the message from the queue (noting that the message is persistent and the queue durable), but doesn't yet ack it,
the broker dies and is restarted, and
the client reconnects and starts consuming messages.
At this point, the client could reasonably assume that the message
will be delivered again. This is not the case: the restart has caused
the broker to lose the message. In order to guarantee persistence, a
client should use confirms. If the publisher's channel had been in
confirm mode, the publisher would not have received an ack for the
lost message (since the consumer hadn't ack'd it and it hadn't been
written to disk).
I then used http://hg.rabbitmq.com/rabbitmq-java-client/file/default/test/src/com/rabbitmq/examples/ConfirmDontLoseMessages.java to run some basic tests and verify confirms, but I got some odd results:
The waitForConfirmsOrDie method doesn't block the producer, which is not what I expected. I assumed waitForConfirmsOrDie would block the producer until all the messages have been ack'd or one of them is nack'd.
I removed channel.confirmSelect() and channel.waitForConfirmsOrDie() from the publisher and changed the consumer from auto ack to manual ack. I published all the messages to the queue and consumed them one by one, then stopped the RabbitMQ server while consuming. I expected the remaining messages to be lost after the RabbitMQ server restarted, because the channel is not in confirm mode, but I still see all the other messages in the queue after the restart.
Since I am new to RabbitMQ, can anyone tell me where my understanding of confirms is wrong?
My understanding is that a channel confirm is the broker confirming that it successfully received the message from the producer, regardless of whether a consumer has acked the message. When that happens depends on the queue type and the message delivery mode; see http://www.rabbitmq.com/confirms.html for details.
The broker confirms a message when:
it decides a message will not be routed to queues
(if the mandatory flag is set then the basic.return is sent first) or
a transient message has reached all its queues (and mirrors) or
a persistent message has reached all its queues (and mirrors) and been persisted to disk (and fsynced) or
a persistent message has been consumed (and if necessary acknowledged) from all its queues
Old question but oh well..
I publish all messages to the queue and consume messages one by one, then I stop the rabbitmq server during the consuming process, what I expect now is the left messages will be lost after the rabbitmq server is restarted, because the channel is not in confirm mode, but I still see all other messages in the queue after the server restart.
This is actually how it should work, if persistence is enabled. If the server crashes or something else goes wrong, the messages cannot be acknowledged and thus won't be removed from the queue.
Messages are only removed from the queue once they are acknowledged as handled, or if the broker had not yet written them to memory or disk when the server crashed.
Confirms and acknowledgements can be turned off if desired, in which case the producer won't wait for the acks. I cannot find the exact command for it right now, but it does exist.
More on the acks and confirms: https://www.rabbitmq.com/reliability.html