Mqtt connection lost. java.io.EOFException: null - azure-iot-hub

Context
OS and version used: Windows 10
Java runtime used: 17
Azure IotHub SDK version used: v1/v2
Description of the issue
I ran into this issue with Azure IoT Hub SDK v1 about 1-2 months ago. It appeared suddenly: I discovered the error in my automated tests, and there had been no changes to library versions. Moreover, switching to SDK v2 produces the same errors.
Despite the connection error, DeviceTwin still works, but DirectMethod has stopped working.
In summary: DirectMethod no longer works, regardless of the SDK version.
Code sample exhibiting the issue
I used the DirectMethod examples from the official repo:
https://github.com/Azure/azure-iot-sdk-java/blob/main/device/iot-device-samples
https://github.com/Azure/azure-iot-sdk-java/blob/main/service/iot-service-samples
Console log of the issue
From service:
[main] DEBUG com.microsoft.azure.sdk.iot.service.methods.DirectMethodsClient - Initialized a DirectMethodsClient instance using SDK version 2.1.2
[main] DEBUG com.microsoft.azure.sdk.iot.service.jobs.ScheduledJobsClient - Initialized a ScheduledJobsClient instance using SDK version 2.1.2
[main] DEBUG com.microsoft.azure.sdk.iot.service.query.QueryClient - Initialized a QueryClient instance client using SDK version 2.1.2
directly invoke method on the Device
{"errorCode":404103,"message":"The operation failed because the requested device isn't online or hasn't registered the direct method callback. To learn more, see https://aka.ms/iothub404103","trackingId":"D2F86180CA32433CB850C20C034A4D87-G2:-TimeStamp:2022-09-24T15:35:29.240331067+00:00","timestampUtc":"2022-09-24T15:35:29.240331067+00:00","info":null}
Shutting down sample...
From device:
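(onMethodInvoked in the callback was reached, but with an empty payload. The Paho message "Utracono połączenie" in the trace below is Polish for "Connection lost".)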
22:40:12.655 [MQTT Rec: int-test-1] WARN com.microsoft.azure.sdk.iot.device.transport.mqtt.Mqtt - Mqtt connection lost
org.eclipse.paho.client.mqttv3.MqttException: Utracono połączenie
at org.eclipse.paho.client.mqttv3.internal.CommsReceiver.run(CommsReceiver.java:197)
at java.base/java.lang.Thread.run(Thread.java:833)
Caused by: java.io.EOFException: null
at java.base/java.io.DataInputStream.readByte(DataInputStream.java:273)
at org.eclipse.paho.client.mqttv3.internal.wire.MqttInputStream.readMqttWireMessage(MqttInputStream.java:92)
at org.eclipse.paho.client.mqttv3.internal.CommsReceiver.run(CommsReceiver.java:137)
... 1 common frames omitted
22:40:12.655 [MQTT Rec: int-test-1] DEBUG com.microsoft.azure.sdk.iot.device.transport.IotHubTransport - Mapping throwable to NO_NETWORK because it was a retryable exception
com.microsoft.azure.sdk.iot.device.transport.ProtocolException: Mqtt connection lost
at com.microsoft.azure.sdk.iot.device.transport.mqtt.exceptions.PahoExceptionTranslator.convertToMqttException(PahoExceptionTranslator.java:63)
at com.microsoft.azure.sdk.iot.device.transport.mqtt.Mqtt.connectionLost(Mqtt.java:339)
at org.eclipse.paho.client.mqttv3.internal.CommsCallback.connectionLost(CommsCallback.java:304)
at org.eclipse.paho.client.mqttv3.internal.ClientComms.shutdownConnection(ClientComms.java:441)
at org.eclipse.paho.client.mqttv3.internal.CommsReceiver.run(CommsReceiver.java:197)
at java.base/java.lang.Thread.run(Thread.java:833)
Caused by: org.eclipse.paho.client.mqttv3.MqttException: Utracono połączenie
... 2 common frames omitted
Caused by: java.io.EOFException: null
at java.base/java.io.DataInputStream.readByte(DataInputStream.java:273)
at org.eclipse.paho.client.mqttv3.internal.wire.MqttInputStream.readMqttWireMessage(MqttInputStream.java:92)
at org.eclipse.paho.client.mqttv3.internal.CommsReceiver.run(CommsReceiver.java:137)
... 1 common frames omitted
22:40:12.656 [MQTT Rec: int-test-1] WARN com.microsoft.azure.sdk.iot.device.transport.IotHubTransport - Updating transport status to new status DISCONNECTED_RETRYING with reason NO_NETWORK
com.microsoft.azure.sdk.iot.device.transport.ProtocolException: Mqtt connection lost
at com.microsoft.azure.sdk.iot.device.transport.mqtt.exceptions.PahoExceptionTranslator.convertToMqttException(PahoExceptionTranslator.java:63)
at com.microsoft.azure.sdk.iot.device.transport.mqtt.Mqtt.connectionLost(Mqtt.java:339)
at org.eclipse.paho.client.mqttv3.internal.CommsCallback.connectionLost(CommsCallback.java:304)
at org.eclipse.paho.client.mqttv3.internal.ClientComms.shutdownConnection(ClientComms.java:441)
at org.eclipse.paho.client.mqttv3.internal.CommsReceiver.run(CommsReceiver.java:197)
at java.base/java.lang.Thread.run(Thread.java:833)
Caused by: org.eclipse.paho.client.mqttv3.MqttException: Utracono połączenie
... 2 common frames omitted
Caused by: java.io.EOFException: null
at java.base/java.io.DataInputStream.readByte(DataInputStream.java:273)
at org.eclipse.paho.client.mqttv3.internal.wire.MqttInputStream.readMqttWireMessage(MqttInputStream.java:92)
at org.eclipse.paho.client.mqttv3.internal.CommsReceiver.run(CommsReceiver.java:137)
... 1 common frames omitted
22:40:12.656 [MQTT Rec: int-test-1] DEBUG com.microsoft.azure.sdk.iot.device.transport.IotHubTransport - Invoking connection status callbacks with new status details
CONNECTION STATUS UPDATE: DISCONNECTED_RETRYING
CONNECTION STATUS REASON: NO_NETWORK
CONNECTION STATUS THROWABLE: Mqtt connection lost
Solution:
In my case the problem was the version of the Azure SDK iot-service-client.
I was using 2.1.2; after switching to 2.1.4 it started working.

Related

okhttp 3.7.0 SSL Shutdown failed threw UnknownHostException and not SSLException

When making a network request under poor connectivity, I very occasionally get
<-- HTTP FAILED: java.net.UnknownHostException: Unable to resolve host ....
even though my server appears to have received the request correctly. I found one instance of this with device logs, which show that an SSLException actually occurred:
D/NativeCrypto: jniThrowException: javax/net/ssl/SSLException: Read error: ssl=0x7dc365f080: I/O error during system call, Software caused connection abort
D/NativeCrypto: jniThrowException: javax/net/ssl/SSLException: SSL shutdown failed: ssl=0x7dc365f080: I/O error during system call, Broken pipe
My question is: why do OkHttp and Retrofit throw UnknownHostException rather than SSLException, and is there a way to get at the actual SSLException? Currently my app thinks the request was not sent, while the server processes it.
I am using
okhttp:3.10.0
retrofit:2.2.0
adapter-rxjava2:2.2.0

JUnit tests using embedded ActiveMQ throws javax.jms.JMSException: peer stopped

I have a Java application that consumes messages from ActiveMQ. My JUnit test cases use an embedded ActiveMQ broker (version 5.10.0). The test cases run fine, but the error below is thrown after execution. I tried the latest version (5.14.0) and the error is still thrown; there is no error with 5.8.0, though. I found a related thread that describes the same issue for ActiveMQ 5.6.0, but it has no solution. I would appreciate your input.
@Bean
public ConnectionFactory jmsConnectionFactory() {
    ActiveMQConnectionFactory amqConnectionFactory = new ActiveMQConnectionFactory("vm://my-amq-host");
    CachingConnectionFactory cachingConnectionFactory = new CachingConnectionFactory(amqConnectionFactory);
    cachingConnectionFactory.setCacheConsumers(false);
    return cachingConnectionFactory;
}
2016-09-23 13:53:37,083 WARN [org.springframework.jms.connection.CachingConnectionFactory][ActiveMQ Connection Executor: vm://my-amq-host#0][301] Encountered a JMSException - resetting the underlying JMS Connection
javax.jms.JMSException: peer (vm://my-amq-host#1) stopped.
at org.apache.activemq.util.JMSExceptionSupport.create(JMSExceptionSupport.java:54)
at org.apache.activemq.ActiveMQConnection.onAsyncException(ActiveMQConnection.java:1998)
at org.apache.activemq.ActiveMQConnection.onException(ActiveMQConnection.java:2017)
at org.apache.activemq.transport.TransportFilter.onException(TransportFilter.java:101)
at org.apache.activemq.transport.ResponseCorrelator.onException(ResponseCorrelator.java:126)
at org.apache.activemq.transport.TransportFilter.onException(TransportFilter.java:101)
at org.apache.activemq.transport.vm.VMTransport.stop(VMTransport.java:206)
at org.apache.activemq.transport.TransportFilter.stop(TransportFilter.java:65)
at org.apache.activemq.transport.TransportFilter.stop(TransportFilter.java:65)
at org.apache.activemq.transport.ResponseCorrelator.stop(ResponseCorrelator.java:132)
at org.apache.activemq.broker.TransportConnection.doStop(TransportConnection.java:1102)
at org.apache.activemq.broker.TransportConnection$4.run(TransportConnection.java:1068)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.activemq.transport.TransportDisposedIOException: peer (vm://my-amq-host#1) stopped.
... 9 more
My guess is that the test code runs in such a way that the connection factory, when it creates a connection over the VM transport (which creates an in-VM broker if none is running), actually captures an instance of an in-VM broker that has not yet fully cleaned up and shut down. Without seeing the complete test code it's hard to say for sure, though.
It's usually a good idea to have the test create its own BrokerService that you can control, and have the factories use the VM transport with the create=false URI option.
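As a rough illustration of that suggestion, here is a minimal sketch in which the test owns the broker lifecycle. It assumes the ActiveMQ broker and client jars are on the test classpath; the class name EmbeddedBrokerSketch and its helper methods are hypothetical, and the broker name is simply taken from the vm:// URL in the question.

```java
import javax.jms.Connection;
import org.apache.activemq.ActiveMQConnectionFactory;
import org.apache.activemq.broker.BrokerService;

/** Sketch only: the test starts and stops the broker itself. */
final class EmbeddedBrokerSketch {

    // Broker name taken from the question's vm:// URL.
    static final String BROKER_NAME = "my-amq-host";

    /** Starts a non-persistent in-VM broker; intended for test setup. */
    static BrokerService startBroker() throws Exception {
        BrokerService broker = new BrokerService();
        broker.setBrokerName(BROKER_NAME);
        broker.setPersistent(false); // keep test messages in memory only
        broker.setUseJmx(false);     // skip JMX registration in tests
        broker.start();
        broker.waitUntilStarted();
        return broker;
    }

    /** VM transport URI that will not create a broker implicitly. */
    static String brokerUri() {
        return "vm://" + BROKER_NAME + "?create=false";
    }

    public static void main(String[] args) throws Exception {
        BrokerService broker = startBroker();
        try {
            ActiveMQConnectionFactory factory = new ActiveMQConnectionFactory(brokerUri());
            Connection connection = factory.createConnection();
            try {
                connection.start();
                // ... run the code under test against this connection ...
            } finally {
                connection.close(); // close clients before the broker goes down
            }
        } finally {
            broker.stop();
            broker.waitUntilStopped();
        }
    }
}
```

With the Spring CachingConnectionFactory from the question, the cached connection would presumably also need to be released (for example through its destroy() method) before the broker is stopped; whether that removes the warning in this particular setup is untested here.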

Spring XD on YARN

I am getting the error below while trying to install Spring XD on YARN.
Error executing a spring application; nested exception is org.springframework.yarn.YarnSystemException:
Call From c01dfobi43.vcac.dc1.dsghost.net/100.98.226.45 to c01dfobi41.vcac.dc1.dsghost.net:8032 failed on connection exception:
java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused;
nested exception is java.net.ConnectException:
Call From c01dfobi43.vcac.dc1.dsghost.net/100.98.226.45 to c01dfobi41.vcac.dc1.dsghost.net:8032 failed on connection exception:
java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
I am not sure where I am making a mistake. Also, do we need to install Spring XD on YARN on all nodes?
It would be great if you could share any documentation written specifically for YARN.
I am going to assume that c01dfobi41.vcac.dc1.dsghost.net:8032 is a ResourceManager host. I am also going to assume, based on your comment that YARN applications do run, that you have more than one ResourceManager. In that case, what may be happening (and I see this all the time) is the following: your YARN client looks up the ResourceManagers in yarn-site.xml and tries the first one. It gets ConnectionRefused, because the standby ResourceManager does not listen on its RPC port, so it moves on to the next one and succeeds. If this is the case, the error is not fatal and can be ignored.
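The try-the-next-address behaviour described above can be illustrated with a small, self-contained sketch. This is not the Hadoop client's code, only an illustration of the pattern; the class name RmFailoverSketch, its methods, and the example host names are hypothetical, while port 8032 is the one from the error message.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.util.List;
import java.util.Optional;
import java.util.function.Predicate;

/** Illustration only: try each candidate ResourceManager address in order. */
final class RmFailoverSketch {

    /** Returns true if a TCP connection to host:port succeeds within timeoutMillis. */
    static boolean canConnect(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // A host that is not listening on the port typically surfaces here
            // as java.net.ConnectException: Connection refused.
            System.err.println("Skipping " + host + ":" + port + " (" + e + ")");
            return false;
        }
    }

    /** Returns the first address accepted by the probe, or empty if none is. */
    static Optional<String> firstReachable(List<String> addresses, Predicate<String> probe) {
        for (String address : addresses) {
            if (probe.test(address)) {
                return Optional.of(address);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Hypothetical host names standing in for the configured ResourceManagers.
        List<String> resourceManagers = List.of("rm1.example.com:8032", "rm2.example.com:8032");
        Optional<String> active = firstReachable(resourceManagers, address -> {
            String[] parts = address.split(":");
            return canConnect(parts[0], Integer.parseInt(parts[1]), 2000);
        });
        System.out.println(active.map(a -> "Reachable: " + a).orElse("No ResourceManager reachable"));
    }
}
```

The point of the sketch is that a refused connection on the first address is logged and skipped rather than treated as fatal, which matches the harmless error described above.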

JBoss AS7 Hornetq - RemoteConnectionFactory lookup fails with RoundRobinConnectionLoadBalancingPolicy NotSerializableException

I'm using JBoss AS 7 with HornetQ. Our standalone Java application interacts with a queue and sends messages. We had the entire environment set up and it was working smoothly. Then, suddenly, our standalone application failed with the exception below:
Caused by: java.io.NotSerializableException: org.hornetq.api.core.client.loadbalance.RoundRobinConnectionLoadBalancingPolicy
Detailed exception stack trace is below
javax.naming.NamingException: Failed to lookup [Root exception is java.io.NotSerializableException: org.hornetq.api.core.client.loadbalance.RoundRobinConnectionLoadBalancingPolicy]
at org.jboss.naming.remote.client.ClientUtil.namingException(ClientUtil.java:36)
at org.jboss.naming.remote.protocol.v1.Protocol$1.execute(Protocol.java:104)
at org.jboss.naming.remote.protocol.v1.RemoteNamingStoreV1.lookup(RemoteNamingStoreV1.java:79)
at org.jboss.naming.remote.client.RemoteContext.lookup(RemoteContext.java:79)
at org.jboss.naming.remote.client.RemoteContext.lookup(RemoteContext.java:83)
at javax.naming.InitialContext.lookup(Unknown Source)
at com.infosys.lbs.publishing.LocationProcessor.postMessageInQueue(LocationProcessor.java:377)
at com.infosys.lbs.publishing.LocationProcessor.process(LocationProcessor.java:69)
at com.infosys.lbs.publishing.main.Publisher.main(Publisher.java:34)
Caused by: java.io.NotSerializableException: org.hornetq.api.core.client.loadbalance.RoundRobinConnectionLoadBalancingPolicy
at org.jboss.marshalling.river.RiverMarshaller.doWriteObject(RiverMarshaller.java:891)
at org.jboss.marshalling.river.RiverMarshaller.doWriteFields(RiverMarshaller.java:1063)
at org.jboss.marshalling.river.RiverMarshaller.doWriteSerializableObject(RiverMarshaller.java:1019)
at org.jboss.marshalling.river.RiverMarshaller.doWriteObject(RiverMarshaller.java:885)
at org.jboss.marshalling.river.RiverMarshaller.doWriteFields(RiverMarshaller.java:1063)
at org.jboss.marshalling.river.RiverMarshaller.doWriteSerializableObject(RiverMarshaller.java:1019)
at org.jboss.marshalling.river.RiverMarshaller.doWriteSerializableObject(RiverMarshaller.java:998)
at org.jboss.marshalling.river.RiverMarshaller.doWriteObject(RiverMarshaller.java:885)
at org.jboss.marshalling.AbstractObjectOutput.writeObject(AbstractObjectOutput.java:62)
at org.jboss.marshalling.AbstractMarshaller.writeObject(AbstractMarshaller.java:119)
at org.jboss.naming.remote.protocol.v1.Protocol$1$2.write(Protocol.java:138)
at org.jboss.naming.remote.protocol.v1.WriteUtil.write(WriteUtil.java:61)
at org.jboss.naming.remote.protocol.v1.Protocol$1.handleServerMessage(Protocol.java:128)
at org.jboss.naming.remote.protocol.v1.RemoteNamingServerV1$MessageReciever$1.run(RemoteNamingServerV1.java:73)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
Caused by: an exception which occurred:
in field loadBalancingPolicy
in field serverLocator
in object org.hornetq.jms.client.HornetQJMSConnectionFactory@ea074d
The exception occurred when the app tried to look up the connection factory:
QueueConnectionFactory qcf = (QueueConnectionFactory)context.lookup("jms/RemoteConnectionFactory");
Below is how we resolved the issue.
There was almost no help available for this problem. A web search on this exception returned next to nothing. However, one thread on the JBoss AS Dev site sparked an idea: "RemoteConnectionFactory is not found when looking up in a remote client".
The scenario in that thread was not the same as ours (in our app this is the first and only lookup). Still, it got me thinking about a possible connection factory initialization issue. There was nothing I could do to debug or locate the problem, but I thought that reinitializing the factory might help.
So I tried a lookup with java:jboss/exported/jms/RemoteConnectionFactory. As expected, it failed with a NamingException. Hoping that this naming syntax (using java:/) had triggered a reinitialization, I tried the lookup again with jms/RemoteConnectionFactory. And bingo, it worked!
Unfortunately, we still don't know why it happened, or whether it was just a one-off case. I am documenting it here in case someone else hits this issue.

Liferay stopped at database shutdown caused a crash

I was stopping the Liferay portal, but a few seconds later I stopped the database (db2 quiesce, which means the connections are closed), and apparently Liferay did not shut down correctly.
After that, I restarted the database and Liferay, but the portal no longer works. It shows this message in the browser:
HTTP Status 500 -
type Exception report
message
description The server encountered an internal error () that prevented it from fulfilling this request.
exception
javax.servlet.ServletException: Servlet execution threw an exception
com.liferay.portal.kernel.servlet.filters.invoker.InvokerFilterChain.doFilter(InvokerFilterChain.java:72)
...
root cause
java.lang.NoSuchMethodError: com.liferay.portal.util.PortalUtil.getCDNHostHttp()Ljava/lang/String;
com.liferay.portal.events.ServicePreActionExt.servicePre(ServicePreActionExt.java:937)
After looking in the logs, I found the following messages (they are edited):
SEVERE: Error waiting for multi-thread deployment of directories to completehostConfig.deployWar=Deploying web application archive {0}
java.lang.InterruptedException
at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1000)
WARN [DefaultConnectionTester:203] SQL State '08001' of Exception which occurred during a Connection test (fallback DatabaseMetaData test) implies that the database is invalid, and the pool should refill itself with fresh Connections.
com.ibm.db2.jcc.am.DisconnectNonTransientConnectionException: [jcc][t4][2030][11211][3.63.75] A communication error occurred during operations on the connection's underlying socket, socket input stream, or socket output stream. Error location: Reply.fill() - insufficient data (-1). Message: Insufficient data. ERRORCODE=-4499, SQLSTATE=08001
at com.ibm.db2.jcc.am.fd.a(fd.java:321)
WARN [DefaultConnectionTester:136] SQL State '08001' of Exception tested by statusOnException() implies that the database is invalid, and the pool should refill itself with fresh Connections.
WARN [C3P0PooledConnectionPool:708] A ConnectionTest has failed, reporting that all previously acquired Connections are likely invalid. The pool will be reset.
WARN [NewPooledConnection:486] [c3p0] A PooledConnection that has already signalled a Connection error is still in use!
WARN [NewPooledConnection:487] [c3p0] Another error has occurred [ com.ibm.db2.jcc.am.SqlNonTransientConnectionException: [jcc][t4][10335][10366][3.63.75] Invalid operation: Connection is closed. ERRORCODE=-4470, SQLSTATE=08003 ] which will not be reported to listeners!
com.ibm.db2.jcc.am.SqlNonTransientConnectionException: [jcc][t4][10335][10366][3.63.75] Invalid operation: Connection is closed. ERRORCODE=-4470, SQLSTATE=08003
WARN [BasicResourcePool:1841] com.mchange.v2.resourcepool.BasicResourcePool$AcquireTask@4fad5112 -- Acquisition Attempt Failed!!! Clearing pending acquires. While trying to acquire a needed new resource, we failed to succeed more than the maximum number of allowed acquisition attempts (3). Last acquisition attempt exception:
com.ibm.db2.jcc.am.SqlNonTransientConnectionException: DB2 SQL Error: SQLCODE=-20157, SQLSTATE=08004, SQLERRMC=FUT5MAN;QUIESCE DATABASE;;, DRIVER=3.63.75
ERROR [PortalJobStore:109] MisfireHandler: Error handling misfires: Unexpected runtime exception: null
org.quartz.JobPersistenceException: Unexpected runtime exception: null [See nested exception: java.lang.reflect.UndeclaredThrowableException]
Caused by: java.lang.reflect.UndeclaredThrowableException
at $Proxy279.prepareStatement(Unknown Source)
at org.quartz.impl.jdbcjobstore.StdJDBCDelegate.countMisfiredTriggersInState(StdJDBCDelegate.java:413)
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.GeneratedMethodAccessor65.invoke(Unknown Source)
Caused by: java.sql.SQLException: Connections could not be acquired from the underlying database!
at com.mchange.v2.sql.SqlUtils.toSQLException(SqlUtils.java:106)
Caused by: com.mchange.v2.resourcepool.CannotAcquireResourceException: A ResourcePool could not acquire a resource from its primary factory or source.
at com.mchange.v2.resourcepool.BasicResourcePool.awaitAvailable(BasicResourcePool.java:1319)
It now seems almost impossible to start the current Liferay installation. However, I have the database (I made a full backup) and the Lucene data directory. How can I recreate a Liferay installation from these two things? I would like to recover some of this data in a new installation, but I do not know how.
This is not the best solution, but I installed Liferay with a new database. Once it was configured, I changed the database configuration to point to the original one.
It was probably a problem with the ROOT deployment, but this is very odd.
I was able to recover all the data from Lucene and the database.
The database is still quiesced and the Liferay user doesn't have the QUIESCE_CONNECT privilege.
Unquiesce the database and restart Liferay.
Run the following as the DB2 instance owner (on Windows, any administrator):
db2 connect to DBNAME
db2 unquiesce database
db2 connect reset
Regards.