Spring XD on YARN - hadoop-yarn

I am getting the below error, while I am trying to install Spring XD on YARN.
Error executing a spring application; nested exception is org.springframework.yarn.YarnSystemException:
Call From c01dfobi43.vcac.dc1.dsghost.net/100.98.226.45 to c01dfobi41.vcac.dc1.dsghost.net:8032 failed on connection exception:
java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused;
nested exception is java.net.ConnectException:
Call From c01dfobi43.vcac.dc1.dsghost.net/100.98.226.45 to c01dfobi41.vcac.dc1.dsghost.net:8032 failed on connection exception:
java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
Not sure where I am committing a mistake. Also do we need to install Spring XD Yarn on all nodes?
It would be great if you can share any documentation mentioned explicitly for YARN.

I am going to assume that c01dfobi41.vcac.dc1.dsghost.net:8032 is a ResourceManager host, I am also going to assume that based on your comment stating that yarn applications do run, you have more then one. In that case what may be happening (and I see this all the time) is that your yarn client attempts to contact the resource manager by looking it up in yarn-site.xml, it picks the first one and gets ConnectionRefused as the standby resource manager does not listen on its RPC port, it moves on to the next one and succeeds. If this is the case this is not a fatal error and can be ignored.

Related

JMX connection to Gemfire over SSL

I have used GFSH to start locator like below
start locator --name=gemfire_locator --security-properties-file="../config/gfsecurity.properties" --J=-Dgemfire.ssl-enabled-components=all --mcast-port=0 --J=-Dgemfire.jmx-manager-ssl=true
Also started server
start server --name=server1 --security-properties-file="../config/gfsecurity.properties" --J=-Dgemfire.ssl-enabled-components=all --mcast-port=0 --J=-Dgemfire.jmx-manager-ssl=true
I am trying to connect to Gemfire as ClientCache which works perfectly fine over SSL. But When I connect as JMX client, I am getting below error in Java code as well as Jconsole.
Error:
Exception in thread "main" java.io.IOException: Failed to retrieve RMIServer stub: javax.naming.CommunicationException [Root exception is java.rmi.ConnectIOException: non-JRMP server at remote endpoint]
at javax.management.remote.rmi.RMIConnector.connect(RMIConnector.java:369)
at javax.management.remote.JMXConnectorFactory.connect(JMXConnectorFactory.java:270)
at SamplePlugin.main(SamplePlugin.java:101)
Am I missing any other configuration?
Here is my JAVA_TOOL_OPTIONS:
-Dcom.sun.management.jmxremote
-Dcom.sun.management.jmxremote.local.only=false
-Dcom.sun.management.jmxremote.authenticate=false
-Dcom.sun.management.jmxremote.ssl=true
-Djava.rmi.server.hostname=myhostname
You will also need to add the geode-core jar to your classpath for jvisualvm. Use the --cp:a option. I would suggest just using geode-dependencies.jar as that will get everything you might need.
The reason this is required is explained a bit in the comments for ContextAwareSSLRMIClientSocketFactory. Basically it seems that when RMI uses SSL, the necessary RMIClientSocketFactory is exported from the server to the client for use there. In general this would simply just be SslRMIClientSocketFactory. But in our case, we have a custom socket factory and so the client (jvisualvm in this case) needs to have access to it.

okhttp 3.7.0 SSL Shutdown failed threw UnknownHostException and not SSLException

While making a network request with low connectivity very rarely I see that I get
<-- HTTP FAILED: java.net.UnknownHostException: Unable to resolve host ....
while my server seems to have got the request correctly. I found 1 instance of it with device logs which shows actually an SSLException happened
D/NativeCrypto: jniThrowException: javax/net/ssl/SSLException: Read error: ssl=0x7dc365f080: I/O error during system call, Software caused connection abort
D/NativeCrypto: jniThrowException: javax/net/ssl/SSLException: SSL shutdown failed: ssl=0x7dc365f080: I/O error during system call, Broken pipe
My question is why does okhttp and retrofit throw UnknownHostException and not SSLException, and is there a way to actually get the SSLException as currently my app thinks the request did not go while server processes that request.
I am using
okhttp:3.10.0
retrofit:2.2.0
adapter-rxjava2:2.2.0

JBoss AS7 Hornetq - RemoteConenctionFactory lookup fails with RoundRobinConnectionLoadBalancingPolicy NotSerializableException

I'm using JBoss AS 7 Hornetq. Our standalone java application interacts with a queue and sends messages. We had the entire environment setup and it was working pretty smoothly. Suddenly, one fine day, our standalone application failed with the below exception:
Caused by: java.io.NotSerializableException: org.hornetq.api.core.client.loadbalance.RoundRobinConnectionLoadBalancingPolicy
Detailed exception stack trace is below
javax.naming.NamingException: Failed to lookup [Root exception is java.io.NotSerializableException: org.hornetq.api.core.client.loadbalance.RoundRobinConnectionLoadBalancingPolicy]
at org.jboss.naming.remote.client.ClientUtil.namingException(ClientUtil.java:36)
at org.jboss.naming.remote.protocol.v1.Protocol$1.execute(Protocol.java:104)
at org.jboss.naming.remote.protocol.v1.RemoteNamingStoreV1.lookup(RemoteNamingStoreV1.java:79)
at org.jboss.naming.remote.client.RemoteContext.lookup(RemoteContext.java:79)
at org.jboss.naming.remote.client.RemoteContext.lookup(RemoteContext.java:83)
at javax.naming.InitialContext.lookup(Unknown Source)
at com.infosys.lbs.publishing.LocationProcessor.postMessageInQueue(LocationProcessor.java:377)
at com.infosys.lbs.publishing.LocationProcessor.process(LocationProcessor.java:69)
at com.infosys.lbs.publishing.main.Publisher.main(Publisher.java:34)
Caused by: java.io.NotSerializableException: org.hornetq.api.core.client.loadbalance.RoundRobinConnectionLoadBalancingPolicy
at org.jboss.marshalling.river.RiverMarshaller.doWriteObject(RiverMarshaller.java:891)
at org.jboss.marshalling.river.RiverMarshaller.doWriteFields(RiverMarshaller.java:1063)
at org.jboss.marshalling.river.RiverMarshaller.doWriteSerializableObject(RiverMarshaller.java:1019)
at org.jboss.marshalling.river.RiverMarshaller.doWriteObject(RiverMarshaller.java:885)
at org.jboss.marshalling.river.RiverMarshaller.doWriteFields(RiverMarshaller.java:1063)
at org.jboss.marshalling.river.RiverMarshaller.doWriteSerializableObject(RiverMarshaller.java:1019)
at org.jboss.marshalling.river.RiverMarshaller.doWriteSerializableObject(RiverMarshaller.java:998)
at org.jboss.marshalling.river.RiverMarshaller.doWriteObject(RiverMarshaller.java:885)
at org.jboss.marshalling.AbstractObjectOutput.writeObject(AbstractObjectOutput.java:62)
at org.jboss.marshalling.AbstractMarshaller.writeObject(AbstractMarshaller.java:119)
at org.jboss.naming.remote.protocol.v1.Protocol$1$2.write(Protocol.java:138)
at org.jboss.naming.remote.protocol.v1.WriteUtil.write(WriteUtil.java:61)
at org.jboss.naming.remote.protocol.v1.Protocol$1.handleServerMessage(Protocol.java:128)
at org.jboss.naming.remote.protocol.v1.RemoteNamingServerV1$MessageReciever$1.run(RemoteNamingServerV1.java:73)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
Caused by: an exception which occurred:
in field loadBalancingPolicy
in field serverLocator
in object org.hornetq.jms.client.HornetQJMSConnectionFactory#ea074d
Exception was happening when the app was trying to lookup the connection factory
QueueConnectionFactory qcf = (QueueConnectionFactory)context.lookup("jms/RemoteConnectionFactory");
Below are the steps on how we resolved the issue
There was almost zero help for resolving this issue. A web search on this exception returned next to nothing. However, this particular thread on JBoss AS Dev site spinned a thought in my head: RemoteConnectionFactory is not found when looking up in a remote client
The scenario mentioned in this thread was not same as ours. (In our app this is the first and the only lookup happening.) This thread got me thinking towards a possible connection factory initialization issue. While there is nothing I could do to debug or find the issue around it, I thought that if I could reinitialize it, that would help.
So I tried lookup with java:jboss/exported/jms/RemoteConnectionFactory As expected it failed with a NamingException. Hoping that this naming syntax (using java:/) would have resulted in a reinitialization, I tried lookup again with jms/RemoteConnectionFactory. And bingo!!! it worked!
Unfortunately, we still don't know why it happened, and if it is just a one-off case! Documenting it here just in case some mortal soul hits this issue.

Liferay stopped at database shutdown caused a crash

I was stopping the Liferay portal, but few seconds after, I stopped the database (db2 quiesce, that means, that the connections are closed) and apparently, Liferay did not stopped correctly its execution.
After that, I restarted the database and liferay, but the portal does not work now. It shows this message in the browser:
HTTP Status 500 -
type Exception report
message
description The server encountered an internal error () that prevented it from fulfilling this request.
exception
javax.servlet.ServletException: Servlet execution threw an exception
com.liferay.portal.kernel.servlet.filters.invoker.InvokerFilterChain.doFilter(InvokerFilterChain.java:72)
...
root cause
java.lang.NoSuchMethodError: com.liferay.portal.util.PortalUtil.getCDNHostHttp()Ljava/lang/String;
com.liferay.portal.events.ServicePreActionExt.servicePre(ServicePreActionExt.java:937)
After looking in the logs, I found the following messages (they are edited):
SEVERE: Error waiting for multi-thread deployment of directories to completehostConfig.deployWar=Deploying web application archive {0}
java.lang.InterruptedException
at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1000)
WARN [DefaultConnectionTester:203] SQL State '08001' of Exception which occurred during a Connection test (fallback DatabaseMetaData test) implies that the database is invalid, and the pool should refill itself with fresh Connections.
com.ibm.db2.jcc.am.DisconnectNonTransientConnectionException: [jcc][t4][2030][11211][3.63.75] A communication error occurred during operations on the connection's underlying socket, socket input stream, or socket output stream. Error location: Reply.fill() - insufficient data (-1). Message: Insufficient data. ERRORCODE=-4499, SQLSTATE=08001
at com.ibm.db2.jcc.am.fd.a(fd.java:321)
WARN [DefaultConnectionTester:136] SQL State '08001' of Exception tested by statusOnException() implies that the database is invalid, and the pool should refill itself with fresh Connections.
WARN [C3P0PooledConnectionPool:708] A ConnectionTest has failed, reporting that all previously acquired Connections are likely invalid. The pool will be reset.
WARN [NewPooledConnection:486] [c3p0] A PooledConnection that has already signalled a Connection error is still in use!
WARN [NewPooledConnection:487] [c3p0] Another error has occurred [ com.ibm.db2.jcc.am.SqlNonTransientConnectionException: [jcc][t4][10335][10366][3.63.75] Invalid operation: Connection is closed. ERRORCODE=-4470, SQLSTATE=08003 ] which will not be reported to listeners!
com.ibm.db2.jcc.am.SqlNonTransientConnectionException: [jcc][t4][10335][10366][3.63.75] Invalid operation: Connection is closed. ERRORCODE=-4470, SQLSTATE=08003
WARN [BasicResourcePool:1841] com.mchange.v2.resourcepool.BasicResourcePool$AcquireTask#4fad5112 -- Acquisition Attempt Failed!!! Clearing pending acquires. While trying to acquire a needed new resource, we failed to succeed more than the maximum number of allowed acquisition attempts (3). Last acquisition attempt exception:
com.ibm.db2.jcc.am.SqlNonTransientConnectionException: DB2 SQL Error: SQLCODE=-20157, SQLSTATE=08004, SQLERRMC=FUT5MAN;QUIESCE DATABASE;;, DRIVER=3.63.75
ERROR [PortalJobStore:109] MisfireHandler: Error handling misfires: Unexpected runtime exception: null
org.quartz.JobPersistenceException: Unexpected runtime exception: null [See nested exception: java.lang.reflect.UndeclaredThrowableException]
Caused by: java.lang.reflect.UndeclaredThrowableException
at $Proxy279.prepareStatement(Unknown Source)
at org.quartz.impl.jdbcjobstore.StdJDBCDelegate.countMisfiredTriggersInState(StdJDBCDelegate.java:413)
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.GeneratedMethodAccessor65.invoke(Unknown Source)
Caused by: java.sql.SQLException: Connections could not be acquired from the underlying database!
at com.mchange.v2.sql.SqlUtils.toSQLException(SqlUtils.java:106)
Caused by: com.mchange.v2.resourcepool.CannotAcquireResourceException: A ResourcePool could not acquire a resource from its primary factory or source.
at com.mchange.v2.resourcepool.BasicResourcePool.awaitAvailable(BasicResourcePool.java:1319)
Now, I see that it is almost impossible to start the current Liferay installation. However, I have the database (I made a full backup), and the lucene's data directory. How can I recreate a Liferay installation with these two things? I would like to recover some of this data in a new installation, but I do not how.
This is not the best solution, but I installed Liferay with a new database. Once it was configured, I change the database configuration in order to use the other one.
Probably, it was a problem with the ROOT deployment, but this is very weird.
I could recover all the data from the Lucene and the database.
The database is still quiesced and the Liferay user doesn't have the QUIESCE_CONNECT privilege.
Unquiesce the database and restart Liferay.
Using DB2 instance owner (if you're on Windows, any administrator):
db2 connect to DBNAME
db2 unquiesce database
db2 connect reset
Regards.

OSB WLS Initialisation issue

facing some strange behavior in OSB, i have configured WLS with MQ in client mode, i am doing some minor test to check the connection, i have created a proxy service to read the message from Q1 and a Business Service(BS) to route it to Q2. The issue is the proxy is able to read the message but the BS is throwing this:
JMSPool BEA-169807 There was an error while making the initial connection to the JMS resource named ALSB_JMS_SessionPool_491704821 from within an EJB or a servlet. The server will attempt the connection again later. The error was javax.jms.JMSException: [JMSPool:169803]JNDI lookup of the JMS connection factory AKBConnFact failed: javax.naming.NoInitialContextException: Cannot instantiate class: com.sun.jndi.fscontext.RefFSContextFactory [Root exception is java.lang.ClassNotFoundException: com.sun.jndi.fscontext.RefFSContextFactory
Note: The classpath or the domain/lib folder contains the RefFSContextFactory class
Any ideas gang..? TIA
The answer is this is a bug in OSB which needs to be reported.
As a workaround you need to individually set the jars in the weblogic classpath in your domain/server/bin folder. just go through the link below for more details:
http://forums.oracle.com/forums/thread.jspa?threadID=2135523&start=0&tstart=0