RabbitMQ 3.10.1
rabbitmq-diagnostics status
...
Config files
* /etc/rabbitmq/rabbitmq.config
...
rabbitmq.config:
[
{rabbit,
[
{heartbeat, 90}
]
}
].
But the RabbitMQ Management UI shows a 5s heartbeat for the connection.
And the log says:
2022-05-13 19:56:43.235925+03:00 [error] <0.5979.0> closing AMQP connection <0.5979.0> (xxx.xxx.xxx.xxx:3555 -> xxx.xxx.xxx.xxx:5672):
2022-05-13 19:56:43.235925+03:00 [error] <0.5979.0> missed heartbeats from client, timeout: 5s
How to fix this?
Set the heartbeat to 90s in the client. Most client libraries let you configure the heartbeat timeout, and RabbitMQ will respect the value requested by the client, so a client that asks for 5s gets 5s regardless of the 90s in rabbitmq.config. More about that here: https://www.rabbitmq.com/heartbeats.html#heartbeats-timeout
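To make the negotiation concrete, here is a minimal sketch in Python. It assumes the rule described in the heartbeats guide linked above (when both sides give a non-zero value the lower one is used, and when one side gives 0 the other side's value is used). The function name is made up for illustration, and individual client libraries may reconcile the values differently.

```python
def negotiate_heartbeat(server_suggested: int, client_requested: int) -> int:
    """Return the heartbeat timeout in seconds that the connection ends up with.

    Assumed rule: if either side passes 0, the other side's value wins;
    otherwise the lower of the two values is used.
    """
    if server_suggested == 0 or client_requested == 0:
        return max(server_suggested, client_requested)
    return min(server_suggested, client_requested)


# The situation from the question: the broker suggests 90s, the client asks for 5s.
print(negotiate_heartbeat(90, 5))   # 5
# After configuring the client for 90s as well.
print(negotiate_heartbeat(90, 90))  # 90
```

Under that assumption, changing only the server-side value cannot raise the timeout above what the client requests, which matches the 5s shown in the management UI.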
I have an application that is pushing data into RabbitMQ and then some other apps are subscribing to the different exchanges.
But recently, I keep having errors like this after a few hours:
2020-07-09 12:45:12.670 [error] <0.23578.1> Error on AMQP connection <0.23578.1> (172.18.0.5:48230 ->
172.18.0.3:5672, vhost: '/', user: 'guest', state: running), channel 6:
operation basic.publish caused a connection exception unexpected_frame:
"expected content header for class 60, got non content"
2020-07-09 12:45:12.674 [info] <0.23578.1> closing AMQP connection <0.23578.1> (172.18.0.5:48230 ->
172.18.0.3:5672, vhost: '/'
On the client side, I get messages like this:
"Already closed: The AMQP operation was interrupted: AMQP close-reason, initiated by Peer,
code=505, text='UNEXPECTED_FRAME - expected content body, got non content body frame instead',
classId=60, methodId=40"
This is on a docker container.
What could this error be about?
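You are sharing a channel between threads for concurrent publishing. A publish is sent as several frames (method, content header, body), and when two threads publish on the same channel at once those frames can interleave on the wire, which the broker rejects with UNEXPECTED_FRAME. Either give each thread its own channel or serialize access to the shared one, for example: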
lock (ch) { ch.BasicPublish(); }
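The same idea as a self-contained Python sketch. FakeChannel is a hypothetical stand-in for a real client channel (it is not part of any RabbitMQ library); it only records the frames in the order they would be written, so the effect of the lock can be checked without a broker.

```python
import threading


class FakeChannel:
    """Hypothetical stand-in for an AMQP channel: records frames in write order."""

    def __init__(self):
        self.frames = []

    def basic_publish(self, tag, body_chunks):
        # One publish is several frames: method, content header, then body frames.
        self.frames.append((tag, "method"))
        self.frames.append((tag, "header"))
        for _ in range(body_chunks):
            self.frames.append((tag, "body"))


def publish_all(channel, lock, tag, count):
    for _ in range(count):
        # The lock keeps all frames of one publish together on the shared channel.
        with lock:
            channel.basic_publish(tag, body_chunks=2)


def frames_are_contiguous(frames, frames_per_message=4):
    """Check that every message's frames arrive together and in order."""
    expected = ["method", "header", "body", "body"]
    for i in range(0, len(frames), frames_per_message):
        group = frames[i:i + frames_per_message]
        tags = {tag for tag, _ in group}
        kinds = [kind for _, kind in group]
        if len(tags) != 1 or kinds != expected:
            return False
    return True


channel = FakeChannel()
lock = threading.Lock()
threads = [
    threading.Thread(target=publish_all, args=(channel, lock, name, 200))
    for name in ("A", "B", "C")
]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(channel.frames), frames_are_contiguous(channel.frames))
```

With the lock in place no publish can start until the previous one has written all of its frames. Using one channel per thread avoids the contention entirely and is usually the simpler choice.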
I'm running a RabbitMQ cluster in k8s where the pods have only IPv6 addresses. inet returns an nxdomain error when resolving the k8s service name.
The parameters passed to Erlang from RabbitMQ are:
RABBITMQ_SERVER_ADDITIONAL_ERL_ARGS="+A 128 -kernel inetrc '/etc/rabbitmq/erl_inetrc' -proto_dist inet6_tcp"
RABBITMQ_CTL_ERL_ARGS="-proto_dist inet6_tcp"
erl_inetrc: |-
{inet6, true}.
When RabbitMQ uses its rabbit_peer_discovery_k8s plugin to call the k8s API, it fails:
2019-10-15 07:33:55.000 [info] <0.238.0> Peer discovery backend does not support locking, falling back to randomized delay
2019-10-15 07:33:55.000 [info] <0.238.0> Peer discovery backend rabbit_peer_discovery_k8s does not support registration, skipping randomized start
up delay.
2019-10-15 07:33:55.000 [debug] <0.238.0> GET https://kubernetes.default.svc.cluster.local:443/api/v1/namespaces/tazou/endpoints/zt4-crmq
2019-10-15 07:33:55.015 [debug] <0.238.0> Response: {error,{failed_connect,[{to_address,{"kubernetes.default.svc.cluster.local",443}},{inet,[inet]
,nxdomain}]}}
2019-10-15 07:33:55.015 [debug] <0.238.0> HTTP Error {failed_connect,[{to_address,{"kubernetes.default.svc.cluster.local",443}},{inet,[inet],nxdom
ain}]}
2019-10-15 07:33:55.015 [info] <0.238.0> Failed to get nodes from k8s - {failed_connect,[{to_address,{"kubernetes.default.svc.cluster.local",443}}
,
{inet,[inet],nxdomain}]}
2019-10-15 07:33:55.016 [error] <0.237.0> CRASH REPORT Process <0.237.0> with 0 neighbours exited with reason: no case clause matching {error,"{fa
iled_connect,[{to_address,{\"kubernetes.default.svc.cluster.local\",443}},\n {inet,[inet],nxdomain}]}"} in rabbit_mnesia:init_from
_config/0 line 167 in application_master:init/4 line 138
2019-10-15 07:33:55.016 [info] <0.43.0> Application rabbit exited with reason: no case clause matching {error,"{failed_connect,[{to_address,{\"kub
ernetes.default.svc.cluster.local\",443}},\n
In the k8s console, the address can be resolved:
[rabbitmq]# nslookup -type=AAAA kubernetes.default.svc.cluster.local
Server: 2019:282:4000:2001::6
Address: 2019:282:4000:2001::6#53
kubernetes.default.svc.cluster.local has AAAA address fd01:abcd::1
inet itself can also return the IPv6 address:
kubectl exec -ti zt4-crmq-0 rabbitmqctl eval 'inet:gethostbyname("kubernetes.default.svc.cluster.local").'
{ok,{hostent,"kubernetes.default.svc.cluster.local",[],inet6,16,
[{64769,43981,0,0,0,0,0,1}]}}
As far as I know, the plugin calls httpc:request to invoke the k8s API. I don't understand why httpc:request behaves differently from inet:gethostbyname, and I don't know what httpc:request uses to resolve the hostname.
I asked in the RabbitMQ plugin repository and was told that the plugin is not aware of how Erlang resolves the address: https://github.com/rabbitmq/rabbitmq-peer-discovery-k8s/issues/55.
Is there anything else I could set in erl_inetrc so that Erlang resolves the IPv6 address? What did I miss in the configuration, or how can I debug this from the Erlang side? I'm new to Erlang.
B.R,
Tao
I am using Akka.Remote to communicate between a server-side service application and multiple desktop client applications. The clients send a request message to the server (using Akka.NET) and wait for the server to reply with a response message. The client applications are transient, meaning that they often connect to the server, stay connected for some time, disconnect and then reconnect again.
The problem I encountered is that sometimes when a client disconnects from the server actor (by shutting down its ActorSystem) and then reconnects back to the server, it does not receive any replies from the server for some time. After a few minutes the communication works without any problems. I found out that this issue occurs when the server sends a reply to a client that has disconnected during the request and is no longer reachable. The server cannot deliver the response message and it somehow marks the client endpoint as invalid.
In the log (on the server side) I am getting the following messages when the client is disconnected.
[DEBUG] 2016-01-21 13:04:58.6151 received AutoReceiveMessage <Terminated>: [akka.tcp://qb@client:8090/user/qb] - ExistenceConfirmed=True ServerActor
[DEBUG] 2016-01-21 13:04:58.6550 Stopped Akka.Remote.Transport.ProtocolStateActor
[ INFO] 2016-01-21 13:04:58.6550 Quarantined address [akka.tcp://qb@client:8090] is still unreachable or has not been restarted. Keeping it quarantined. Akka.Event.DummyClassForStringSources
[DEBUG] 2016-01-21 13:04:58.6725 Stopped Akka.Remote.ReliableDeliverySupervisor
[DEBUG] 2016-01-21 13:04:58.6725 no longer watched by [akka://myservice/system/endpointManager/reliableEndpointWriter-akka.tcp%3a%2f%2fqb%40client%3a8090-2] Akka.Remote.EndpointWriter
[DEBUG] 2016-01-21 13:04:58.6725 Disassociated [akka.tcp://myservice@server:8081] <- akka.tcp://qb@client:8090 Akka.Remote.EndpointWriter
[DEBUG] 2016-01-21 13:04:58.6725 Stopped Akka.Remote.EndpointWriter
And then when the client attempts to reconnect, I get:
[DEBUG] 2016-01-21 13:05:15.5883 ConnectResponse [akka.tcp://qb@client:8090/user/qb] ServerActor
[DEBUG] 2016-01-21 13:05:16.0467 Started (Akka.Remote.Transport.ProtocolStateActor) Akka.Remote.Transport.ProtocolStateActor
[DEBUG] 2016-01-21 13:05:16.0467 Stopped Akka.Remote.Transport.ProtocolStateActor
[ WARN] 2016-01-21 13:05:16.0467 AssociationError [akka.tcp://myservice@server:8081] -> akka.tcp://qb@client:8090: Error [Invalid address: akka.tcp://qb@client:8090] [] Akka.Remote.EndpointWriter
[ INFO] 2016-01-21 13:05:16.0467 Quarantined address [akka.tcp://qb@client:8090] is still unreachable or has not been restarted. Keeping it quarantined. Akka.Event.DummyClassForStringSources
[DEBUG] 2016-01-21 13:05:16.0643 Stopped Akka.Remote.ReliableDeliverySupervisor
[DEBUG] 2016-01-21 13:05:16.0711 no longer watched by [akka://myservice/system/endpointManager/reliableEndpointWriter-akka.tcp%3a%2f%2fqb%40client%3a8090-4] Akka.Remote.EndpointWriter
[DEBUG] 2016-01-21 13:05:16.0711 Disassociated [akka.tcp://myservice@server:8081] -> akka.tcp://qb@client:8090 Akka.Remote.EndpointWriter
[DEBUG] 2016-01-21 13:05:16.0711 Stopped Akka.Remote.EndpointWriter
[DEBUG] 2016-01-21 13:05:16.0867 received AutoReceiveMessage <Terminated>: [akka://myservice/system/endpointManager/reliableEndpointWriter-akka.tcp%3a%2f%2fqb%40client%3a8090-4] - ExistenceConfirmed=True Akka.Remote.EndpointManager
[DEBUG] 2016-01-21 13:05:16.0867 Terminated [akka.tcp://qb@client:8090/user/qb] ServerActor
I suspect that this behavior is a feature of Akka.net, however, I need to implement my system so that clients can disconnect and then reconnect back to the server without the need to wait. Is there any way to disable the quarantine mechanism or to gracefully close the client endpoint on the server so that the client endpoint doesn't get quarantined?
[ INFO] 2016-01-21 13:04:58.6550 Quarantined address [akka.tcp://qb@client:8090] is still unreachable or has not been restarted. Keeping it quarantined.
That says it all. The node was quarantined, which requires a restart of the actor system.
However, IMHO - just upgrade to Akka.NET 1.0.6, which we released on Monday. We made the remoting policy manager much less brittle than it has been historically.
I'm having difficulty configuring the connection to CloudAMQP in my deployed Grails application. I can run the application locally against a locally installed RabbitMQ instance, but I can't figure out how to configure it to run on CloudBees using the CloudAMQP service.
In my Config.groovy, I'm defining my connection info and a queue:
rabbitmq {
connectionfactory {
username = 'USERNAME'
password = 'PASSWORD'
hostname = 'lemur.cloudamqp.com'
}
queues = {
testQueue autoDelete: false, durable: false, exclusive: false
}
}
When the application starts and tries to connect, I see the following log messages:
2013-08-23 21:29:59,195 [main] DEBUG listener.SimpleMessageListenerContainer - Starting Rabbit listener container.
2013-08-23 21:29:59,205 [SimpleAsyncTaskExecutor-1] DEBUG listener.BlockingQueueConsumer - Starting consumer Consumer: tag=[null], channel=null, acknowledgeMode=AUTO local queue size=0
2013-08-23 21:30:08,405 [SimpleAsyncTaskExecutor-1] WARN listener.SimpleMessageListenerContainer - Consumer raised exception, processing can restart if the connection factory supports it
org.springframework.amqp.AmqpIOException: java.io.IOException
at org.springframework.amqp.rabbit.connection.RabbitUtils.convertRabbitAccessException(RabbitUtils.java:112)
at org.springframework.amqp.rabbit.connection.AbstractConnectionFactory.createBareConnection(AbstractConnectionFactory.java:163)
at org.springframework.amqp.rabbit.connection.CachingConnectionFactory.createConnection(CachingConnectionFactory.java:228)
at org.springframework.amqp.rabbit.connection.ConnectionFactoryUtils$1.createConnection(ConnectionFactoryUtils.java:119)
at org.springframework.amqp.rabbit.connection.ConnectionFactoryUtils.doGetTransactionalResourceHolder(ConnectionFactoryUtils.java:163)
at org.springframework.amqp.rabbit.connection.ConnectionFactoryUtils.getTransactionalResourceHolder(ConnectionFactoryUtils.java:109)
at org.springframework.amqp.rabbit.listener.BlockingQueueConsumer.start(BlockingQueueConsumer.java:199)
at org.springframework.amqp.rabbit.listener.SimpleMessageListenerContainer$AsyncMessageProcessingConsumer.run(SimpleMessageListenerContainer.java:524)
at java.lang.Thread.run(Unknown Source)
Caused by: java.io.IOException
at com.rabbitmq.client.impl.AMQChannel.wrap(AMQChannel.java:106)
at com.rabbitmq.client.impl.AMQChannel.wrap(AMQChannel.java:102)
at com.rabbitmq.client.impl.AMQChannel.exnWrappingRpc(AMQChannel.java:124)
at com.rabbitmq.client.impl.AMQConnection.start(AMQConnection.java:381)
at com.rabbitmq.client.ConnectionFactory.newConnection(ConnectionFactory.java:516)
at com.rabbitmq.client.ConnectionFactory.newConnection(ConnectionFactory.java:545)
Caused by: com.rabbitmq.client.ShutdownSignalException: connection error; reason: java.net.SocketException: Connection reset
at com.rabbitmq.utility.ValueOrException.getValue(ValueOrException.java:67)
at com.rabbitmq.utility.BlockingValueOrException.uninterruptibleGetValue(BlockingValueOrException.java:33)
at com.rabbitmq.client.impl.AMQChannel$BlockingRpcContinuation.getReply(AMQChannel.java:343)
at com.rabbitmq.client.impl.AMQChannel.privateRpc(AMQChannel.java:216)
at com.rabbitmq.client.impl.AMQChannel.exnWrappingRpc(AMQChannel.java:118)
... 3 more
Caused by: java.net.SocketException: Connection reset
at com.rabbitmq.client.impl.Frame.readFrom(Frame.java:95)
at com.rabbitmq.client.impl.SocketFrameHandler.readFrame(SocketFrameHandler.java:131)
at com.rabbitmq.client.impl.AMQConnection$MainLoop.run(AMQConnection.java:508)
2013-08-23 21:30:08,406 [SimpleAsyncTaskExecutor-1] INFO listener.SimpleMessageListenerContainer - Restarting Consumer: tag=[null], channel=null, acknowledgeMode=AUTO local queue size=0
2013-08-23 21:30:08,406 [SimpleAsyncTaskExecutor-1] DEBUG listener.BlockingQueueConsumer - Closing Rabbit Channel: null
2013-08-23 21:30:08,407 [SimpleAsyncTaskExecutor-2] DEBUG listener.BlockingQueueConsumer - Starting consumer Consumer: tag=[null], channel=null, acknowledgeMode=AUTO local queue size=0
Aug 23, 2013 9:30:11 PM org.apache.catalina.core.ApplicationContext log
INFO: Initializing Spring FrameworkServlet 'grails'
Aug 23, 2013 9:30:11 PM org.apache.coyote.http11.Http11Protocol init
INFO: Initializing Coyote HTTP/1.1 on http-8634
Aug 23, 2013 9:30:11 PM org.apache.coyote.http11.Http11Protocol start
INFO: Starting Coyote HTTP/1.1 on http-8634
According to https://developer.cloudbees.com/bin/view/RUN/CloudAMQP, when you bind your CloudAMQP service to your app, some config params are provided in the pattern of CLOUDAMQP_URL_. These are the values you need to put in your config files so they can be wired in when the app launches.
Make sure to specify the virtualHost for CloudAMQP connections. That worked for me.
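As an illustration of where the virtual host comes from, here is a rough Python sketch that splits an AMQP URL into the pieces a connection factory needs. The URL below is a made-up placeholder in the usual amqp://user:password@host/vhost shape, and parse_amqp_url is a hypothetical helper, not part of any RabbitMQ or Grails library. Treating a URL with no path as the default vhost "/" is an assumption of this sketch.

```python
from urllib.parse import urlsplit, unquote


def parse_amqp_url(url):
    """Split an amqp:// or amqps:// URL into connection settings."""
    parts = urlsplit(url)
    default_port = 5671 if parts.scheme == "amqps" else 5672
    # The path segment after the first "/" is the percent-encoded vhost name.
    # Assumption: no path at all means the broker's default vhost "/".
    vhost = unquote(parts.path[1:]) if parts.path else "/"
    return {
        "username": unquote(parts.username) if parts.username else None,
        "password": unquote(parts.password) if parts.password else None,
        "hostname": parts.hostname,
        "port": parts.port or default_port,
        "virtualHost": vhost,
    }


# Placeholder URL; the real value would come from the bound service's config params.
settings = parse_amqp_url("amqp://USERNAME:PASSWORD@lemur.cloudamqp.com/VHOST")
print(settings)
```

The virtualHost value from the URL is the piece missing from the connectionfactory block in the question, which only sets username, password and hostname.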
With the following rabbitmq config
[ {mnesia, [{dump_log_write_threshold, 100}]},
{rabbit, [{vm_memory_high_watermark, 0.4}]},
{rabbitmq_shovel,
[{shovels,
[{devShovel,
[{sources, [{broker, "amqp://shoveluser:shoveluser@server2:5672"}]},
{destinations, [{broker, "amqp://shoveluser:shoveluser@localhost:5672"}]},
{queue, <<"queue">>},
{publish_fields,[{exchange,<<"DataExchange">>}]}
]
}]
}]
}
].
and all of the relevant queues and exchanges declared, I am able to start my RabbitMQ server. However, when I check the shovel management page, the plugin always displays "starting" as the state of the shovel. What causes this, and is there any way to get more info?
Make sure the user is set up correctly on the brokers.