Server closes after pika.exceptions.StreamLostError: Stream connection lost - rabbitmq

I have some images in my queue, and I pass each image to my Flask server, where the processing is done and a response is sent back to my RabbitMQ server. After receiving the response, I get this error: "pika.exceptions.StreamLostError: Stream connection lost(104,'Connection reset by peer')". It happens when the RabbitMQ channel starts consuming on the connection again. I don't understand why this happens. I would also like to restart the server automatically whenever this error occurs. Is there any way to do that?

Your consumer is probably taking too long to finish its work and send the Ack/Nack back to the server. As a result, the server does not receive a heartbeat from your client in time and closes the connection. On the client side you then receive:
pika.exceptions.StreamLostError: Stream connection lost(104,'Connection reset by peer')
You should see it in the server logs as well, probably something like this:
missed heartbeats from client, timeout: 60s
See this issue for more information.

Do your work on another thread. See this code as an example -
https://github.com/pika/pika/blob/master/examples/basic_consumer_threaded.py
NOTE: the RabbitMQ team monitors the rabbitmq-users mailing list and only sometimes answers questions on StackOverflow.
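For reference, a minimal sketch of that threaded-consumer pattern, loosely following the linked example; the queue name 'images' and the body of do_work() are placeholders, not part of the original question:

import functools
import threading

import pika


def do_work(connection, channel, delivery_tag, body):
    # ... long-running processing goes here, off the connection's I/O thread ...
    # basic_ack must run on the connection's thread, so hand it back safely.
    cb = functools.partial(channel.basic_ack, delivery_tag=delivery_tag)
    connection.add_callback_threadsafe(cb)


def on_message(channel, method, properties, body, args):
    connection = args
    worker = threading.Thread(
        target=do_work, args=(connection, channel, method.delivery_tag, body))
    worker.daemon = True
    worker.start()


connection = pika.BlockingConnection(pika.ConnectionParameters(host='localhost'))
channel = connection.channel()
channel.queue_declare(queue='images', durable=True)  # placeholder queue name
channel.basic_qos(prefetch_count=1)
channel.basic_consume(queue='images',
                      on_message_callback=functools.partial(on_message, args=connection))

try:
    channel.start_consuming()
except KeyboardInterrupt:
    channel.stop_consuming()
connection.close()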

You can change this heartbeat timeout by setting heartbeat in ConnectionParameters:
connection_params = pika.ConnectionParameters(heartbeat=10)
where the value is in seconds; this example sets the heartbeat interval to 10 seconds.
More information: https://www.rabbitmq.com/heartbeats.html and https://www.rabbitmq.com/heartbeats.html#tcp-keepalives
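A minimal sketch of the same idea in context, assuming a local broker with default credentials; blocked_connection_timeout is an optional extra, shown here only for illustration:

import pika

params = pika.ConnectionParameters(
    host='localhost',
    heartbeat=10,                   # heartbeat interval in seconds
    blocked_connection_timeout=30,  # optional: give up if the broker blocks us
)
connection = pika.BlockingConnection(params)
channel = connection.channel()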

Related

Why does this connection sometimes close (RESET) Flags: 0x014 (RST, ACK) TCP

We have an issue in the acceptance environment during the handshake process. An external service tries to send us some data, and during the handshake we sometimes reset the connection after a timeout of around 2 minutes. In the picture below you can see the communication between the two services; our server's IP ends with .11 and the external service's IP ends with .5.
The strange thing is that it works more than 50% of the time, but when it fails, they retry every hour and we reject each attempt. In between, if we send data to them, their next attempt succeeds (second picture). In that case they use the server whose IP ends with .6.
Does anyone have a clue what the problem could be here? We have tried to find something in our logs, but nothing was logged. Any help regarding additional logging would be appreciated (we tried https://learn.microsoft.com/en-us/dotnet/framework/network-programming/how-to-configure-network-tracing and https://learn.microsoft.com/en-us/dotnet/framework/wcf/diagnostics/tracing/configuring-tracing?redirectedfrom=MSDN). Our backend is written in C# WCF. An additional fact: when we send data to them we never have an issue; it always works.

RabbitMQ durable queue losing messages over STOMP

I have a webpage connecting to a RabbitMQ broker using JavaScript/WebSockets that are exposed by a Spring app deployed in Tomcat. Messages are produced once per second by an external application and are rendered on the webpage. The JavaScript subscription is durable.
The issue I'm experiencing is that when the network connection is broken on the javascript client for a period of time (say 60 seconds), the first ~24 seconds of messages are missing. I've looked through the logs of the app deployed in tomcat and the missing messages seem to be up until the following log statement:
org.springframework.messaging.simp.stomp.StompBrokerRelayMessageHandler - DEBUG - TCP connection to broker closed in session 14
I think this is the point at which the endpoint realises the javascript client is disconnected and decides to close the connection to the broker resulting in future messages queueing up.
My question is how can I ensure that the messages between the time the network is severed and the time the endpoint realises the client is disconnected are not lost? Should the endpoint put the messages back on the queue somehow? Maybe there's a way to make it transactional?
Thanks in advance.
The RabbitMQ team monitors the rabbitmq-users mailing list and only sometimes answers questions on StackOverflow.
Your Tomcat application should not acknowledge messages from RabbitMQ until it confirms that your Javascript client has received them. This way, any messages that aren't ack-ed by the JS client won't be ack-ed by Tomcat, and RabbitMQ will re-deliver them.
I don't know how your JS app and Tomcat interact, but you may have to implement your own ack process there.
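The thread above is about Spring's STOMP relay, so this is only an illustration of the idea in pika (the library used elsewhere on this page): acknowledge a message only after the downstream client has confirmed it. forward_to_client() and the queue name 'updates' are hypothetical stand-ins, not part of the original setup.

import pika


def forward_to_client(body):
    # Hypothetical stand-in: push the message over the websocket and return
    # True only once the JS client has confirmed receipt.
    return True


def on_message(channel, method, properties, body):
    if forward_to_client(body):
        # Ack only after confirmed delivery downstream.
        channel.basic_ack(delivery_tag=method.delivery_tag)
    else:
        # Leave the message on the queue so RabbitMQ redelivers it.
        channel.basic_nack(delivery_tag=method.delivery_tag, requeue=True)


connection = pika.BlockingConnection(pika.ConnectionParameters(host='localhost'))
channel = connection.channel()
channel.basic_consume(queue='updates', on_message_callback=on_message)  # placeholder queue
channel.start_consuming()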

rabbitmq messages not consumed - tcp closure

UPDATE - apparently a tcp closure
I see on rabbit server:
=ERROR REPORT==== 24-Jan-2015::03:22:00 ===
closing AMQP connection <0.1070.22> (209.151.226.37:38040 -> 192.168.80.81:5672):
{inet_error,etimedout}
This connection appears alive on my app's side. How do I prevent this? The TCP keepalive params look OK.
I have two apps.
One, "processor", consumes jobs from a queue and sends replies to a response queue.
The other, "responder" consumes from this response queue and talks to a database.
I had some replies which apparently made it into the response queue, because upon restarting the responder they were handled and the database was updated appropriately. But before that restart, where were they?
How can I pinpoint why they weren't PREVIOUSLY handled? That responder seems to have been running fine.
In the responder I do
res = amqp_consume_message(Cx->conn, genvelope, &tqb, 0);
I ack (not multiple) after updating the database.
I have prefetch at 11.
The processor was closed and restarted a few times during this FWIW. Also the processor is the one that establishes the exchange used for the replies; the responder connects to it.
I have the management url up.
I saw no indication that the replies were available from the consume(), which makes sense since the database wasn't updated. The processor did do its processing and put a reply in the response queue according to logs.
In separate testing I saw that messages in the reply queue aren't destroyed by restarting the processor; the reply exchange is durable.
The apps generally work.
Any debugging suggestions or conceptual info that might be relevant would be appreciated.
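No answer was recorded for this one, but the usual remedy for a connection that looks alive on one side while the broker logs {inet_error,etimedout} is to enable AMQP heartbeats so a half-open connection is detected quickly. The question uses the C client; purely for illustration, the equivalent setup in pika looks like this:

import pika

# With heartbeats on, a dead peer is noticed after a couple of missed
# intervals instead of waiting for a TCP-level etimedout.
params = pika.ConnectionParameters(host='localhost', heartbeat=30)
connection = pika.BlockingConnection(params)
channel = connection.channel()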

RabbitMQ: Server heartbeat must fail 3 times before connection drop?

We have a HA RabbitMQ cluster (v3.2.x) with two nodes that sits behind a load-balancer. Our clients are configured to use a 300s heartbeat. Everything works as expected most of the time.
However, if the client's connection drops (say the client's NIC is disconnected), we have noticed (via TCPDump/wireshark) that the RabbitMQ node will attempt 3 heartbeat messages (in our case nearly 15 mins) before it closes the connection. Why? Why not close it after one failure?
Is there some means to change this behavior on the RabbitMQ server? Or do we have to shorten our heartbeat to something much smaller like 5s or 10s in order to get the connection to close sooner, thoughts?
Related issue...
Looking at the TCPDump (captured on load-balancer), I wonder why the LB doesn't close the connection when it doesn't receive the TCP-ACK from the dead client in response to the proxied RabbitMQ server heartbeat request? In fact, the LB will attempt to send the request several times (never receiving a response, of course). Wouldn't it make sense for the LB to make the assumption the connection has been dropped and close the entire session (including the connection to RabbitMQ node)?
It appears as though RabbitMQ is configured to tolerate two missed heartbeats before it terminates the connection. However, it waits until the next heartbeat would need to be sent before it drops the connection, which is what gives it the appearance of requiring 3 missed heartbeats:
Heartbeat 1 (no response), wait, Heartbeat 2 (no response), wait, Heartbeat 3, terminate.
There is a slight bug in RabbitMQ (it sends a 3rd heartbeat but immediately terminates the connection), but it isn't really affecting anything.
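A rough back-of-the-envelope check, assuming the behaviour described above (the connection is only dropped when the third heartbeat is due), shows why a 300 s heartbeat turns into roughly 15 minutes of detection time:

heartbeat_interval = 300   # seconds, as configured by the clients
missed_before_drop = 3     # dropped when the third heartbeat is due

print(heartbeat_interval * missed_before_drop / 60)  # ~15 minutes, matching the capture
print(10 * missed_before_drop)                        # 30 seconds with a 10 s heartbeat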

How to debug ActiveMQ client?

I'm a fairly new user of ActiveMQ and I'm looking for a way to get detailed debug information on the client side of a queue connection. My problem is this: I have a server that is sending a message through a queue to a client. Using the admin web page associated with the broker, I can verify the following: the queue was created, there is a consumer associated with the queue, the message has been enqueued, the message has been dispatched, the dispatched queue size is 1, the message has not been dequeued. This setup was working yesterday but mysteriously stopped working today even though I did a restart of the activemq service. The log file at /var/log/activemq.log does not contain any useful information.
At this point I'm stumped; I'm assuming that there is some sort of problem with the configuration, but it hasn't changed since yesterday. Does anybody have a suggestion about what my next step should be?
First of all, turn on debug (or even trace) logging in the broker, in conf/log4j.properties:
log4j.logger.org.apache.activemq=DEBUG
Restart the broker and re-run your scenario. The logging will hopefully provide you with some useful information.
JConsole is also a useful tool for monitoring the running broker.
Does your client use any message filters?
You can also enable remote debugging and then connect with an IDE.
To start remote debugging, execute
$ ACTIVEMQ_DEBUG=true bin/activemq
and then attach a remote debugger to port 5005.