OpenStack service cannot find the RabbitMQ reply exchange

My environment is OpenStack communicating over RabbitMQ, with a single RabbitMQ node (not a cluster).
One compute node cannot work: the nova-compute service never receives an acknowledgement for its RPC requests to the nova-conductor service.
At the same time, the nova-conductor log fills with many messages like the ones below.
Does anyone know what is going on?
2016-06-07 14:37:54.184 13820 INFO oslo_messaging._drivers.impl_rabbit [req-83bfbe28-3729-4756-8dc3-fc1d8a05d71f - - - - -] The exchange to reply to reply_9270c332921a4fadb6b370fec2ecce16 doesn't exist yet, retrying...
2016-06-07 14:37:54.186 13820 INFO oslo_messaging._drivers.impl_rabbit [req-a9b99ca4-4dc2-4bb9-b003-279a7fab9aa4 - - - - -] The exchange to reply to reply_9270c332921a4fadb6b370fec2ecce16 doesn't exist yet, retrying...
2016-06-07 14:37:54.214 13820 INFO oslo_messaging._drivers.impl_rabbit [req-c7d6e857-f73e-45c3-9f1e-d92760822a2a - - - - -] The exchange to reply to reply_9270c332921a4fadb6b370fec2ecce16 doesn't exist yet, retrying...
2016-06-07 14:37:54.224 13822 INFO oslo_messaging._drivers.impl_rabbit [req-7be21851-d331-47f7-a0fc-8225a65ba68f - - - - -] The exchange to reply to reply_9270c332921a4fadb6b370fec2ecce16 doesn't exist yet, retrying...
2016-06-07 14:37:54.225 13822 INFO oslo_messaging._drivers.impl_rabbit [req-b14f1618-1fe7-4804-bb4d-25e87d9f764a - - - - -] The exchange to reply to reply_9270c332921a4fadb6b370fec2ecce16 doesn't exist yet, retrying...
2016-06-07 14:37:54.263 13824 INFO oslo_messaging._drivers.impl_rabbit [req-72e2a5fe-21b5-4221-bc00-3990e8f8d10e - - - - -] The exchange to reply to reply_9270c332921a4fadb6b370fec2ecce16 doesn't exist yet, retrying...
2016-06-07 14:37:54.263 13824 INFO oslo_messaging._drivers.impl_rabbit [req-b97dfd77-0955-4919-a4f1-469c73a6eb34 - - - - -] The exchange to reply to reply_9270c332921a4fadb6b370fec2ecce16 doesn't exist yet, retrying...
2016-06-07 14:37:54.269 13824 INFO oslo_messaging._drivers.impl_rabbit [req-ebb588bf-85ca-43ef-95d6-f755a28bb5be - - - - -] The exchange to reply to reply_9270c332921a4fadb6b370fec2ecce16 doesn't exist yet, retrying...
2016-06-07 14:37:54.281 13823 INFO oslo_messaging._drivers.impl_rabbit [req-b14f1618-1fe7-4804-bb4d-25e87d9f764a - - - - -] The exchange to reply to reply_9270c332921a4fadb6b370fec2ecce16 doesn't exist yet, retrying...
2016-06-07 14:37:54.313 13823 INFO oslo_messaging._drivers.impl_rabbit [req-3b5b4e32-dc3e-457d-9e48-ae545e4b34a7 - - - - -] The exchange to reply to reply_9270c332921a4fadb6b370fec2ecce16 doesn't exist yet, retrying...
2016-06-07 14:37:54.342 13820 INFO oslo_messaging._drivers.impl_rabbit [req-6692ac00-97a1-4ef4-9b05-8b271c6a2009 - - - - -] The exchange to reply to reply_9270c332921a4fadb6b370fec2ecce16 doesn't exist yet, retrying...
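For anyone hitting the same symptom: the reply_* queue and its matching direct exchange are normally declared by the RPC caller (nova-compute here), and nova-conductor keeps retrying the reply until they appear. A quick check on the RabbitMQ node for whether they were ever declared (the reply id is taken from the log above; add -p <vhost> if Nova uses a non-default virtual host) could look like this:
rabbitmqctl list_queues name consumers | grep reply_9270c332921a4fadb6b370fec2ecce16
rabbitmqctl list_exchanges name type | grep reply_9270c332921a4fadb6b370fec2ecce16
If neither shows up, the compute node's own connection to RabbitMQ is the thing to investigate rather than nova-conductor.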

Related

RabbitMQ exchange type getting changed at runtime from x-modulus-hash to x-consistent-hash

This is a very rare issue that we observed in production.
The exchange type changed at runtime from x-modulus-hash to x-consistent-hash, which caused the client to shut down. What could be the reason, and how can we prevent this error?
The error reported for the exchange:
2022-09-19 15:54:48.158 ERROR [AMQP Connection x.x.x.x:5671] [org.springframework.amqp.rabbit.connection.CachingConnectionFactory] - <Channel shutdown: channel error; protocol method: #method<channel.close>(reply-code=406, reply-text=PRECONDITION_FAILED - inequivalent arg 'type' for exchange 'message-exchange.shovel' in vhost '/': received ''x-modulus-hash'' but current is ''x-consistent-hash'', class-id=40, method-id=10)>
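For context, the 406 PRECONDITION_FAILED above is RabbitMQ refusing to re-declare an existing exchange with different properties; an exchange's type can only be changed by deleting it and declaring it again. A rough way to inspect and reset it (rabbitmqadmin from the management plugin assumed; deleting the exchange also drops its bindings) would be:
# show the type the broker currently has for the exchange
rabbitmqctl list_exchanges name type | grep message-exchange.shovel
# delete it so the declaring application (or the shovel/sharding plugin) can recreate it with the expected type
rabbitmqadmin delete exchange name=message-exchange.shovel
Why the type flipped in the first place is a separate question, but this at least clears the inequivalent-arg error.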

Connections handling

I've been using Karate for a while now, and there is an issue we have been facing for over a year: org.apache.http.conn.ConnectTimeoutException.
Other threads mention that this connection timeout exception can be solved by specifying a proxy, but that did not help us.
After a lot of investigation, it turned out that our Azure SNAT was exhausted, meaning Karate was opening far too many connections.
To verify this, I enabled debug logging and ran this feature:
Background:
* url "https://www.karatelabs.io/"
Scenario:
* method GET
* method GET
The logs then contained the following lines:
13:10:17.868 [main] DEBUG com.intuit.karate - request:
1 > GET https://www.karatelabs.io/
1 > Host: www.karatelabs.io
1 > Connection: Keep-Alive
1 > User-Agent: Apache-HttpClient/4.5.13 (Java/17.0.4.1)
1 > Accept-Encoding: gzip,deflate
13:10:17.868 [main] DEBUG o.a.h.i.c.PoolingHttpClientConnectionManager - Connection request: [route: {s}->https://www.karatelabs.io:443][total available: 0; route allocated: 0 of 5; total allocated: 0 of 10]
13:10:17.874 [main] DEBUG o.a.h.i.c.PoolingHttpClientConnectionManager - Connection leased: [id: 0][route: {s}->https://www.karatelabs.io:443][total available: 0; route allocated: 1 of 5; total allocated: 1 of 10]
13:10:17.875 [main] DEBUG o.a.h.impl.execchain.MainClientExec - Opening connection {s}->https://www.karatelabs.io:443
13:10:17.883 [main] DEBUG o.a.h.i.c.DefaultHttpClientConnectionOperator - Connecting to www.karatelabs.io/34.149.87.45:443
13:10:17.883 [main] DEBUG o.a.h.c.s.SSLConnectionSocketFactory - Connecting socket to www.karatelabs.io/34.149.87.45:443 with timeout 30000
13:10:17.924 [main] DEBUG o.a.h.c.s.SSLConnectionSocketFactory - Enabled protocols: [TLSv1.3, TLSv1.2]
13:10:17.924 [main] DEBUG o.a.h.c.s.SSLConnectionSocketFactory - Enabled cipher suites:[...]
13:10:17.924 [main] DEBUG o.a.h.c.s.SSLConnectionSocketFactory - Starting handshake
13:10:18.012 [main] DEBUG o.a.h.c.s.SSLConnectionSocketFactory - Secure session established
13:10:18.012 [main] DEBUG o.a.h.c.s.SSLConnectionSocketFactory - negotiated protocol: TLSv1.3
13:10:18.012 [main] DEBUG o.a.h.c.s.SSLConnectionSocketFactory - negotiated cipher suite: TLS_AES_256_GCM_SHA384
13:10:18.012 [main] DEBUG o.a.h.c.s.SSLConnectionSocketFactory - peer principal: CN=karatelabs.io
13:10:18.012 [main] DEBUG o.a.h.c.s.SSLConnectionSocketFactory - peer alternative names: [karatelabs.io, www.karatelabs.io]
13:10:18.012 [main] DEBUG o.a.h.c.s.SSLConnectionSocketFactory - issuer principal: CN=Sectigo RSA Domain Validation Secure Server CA, O=Sectigo Limited, L=Salford, ST=Greater Manchester, C=GB
13:10:18.014 [main] DEBUG o.a.h.i.c.DefaultHttpClientConnectionOperator - Connection established localIp<->serverIp
13:10:18.015 [main] DEBUG o.a.h.i.c.DefaultManagedHttpClientConnection - http-outgoing-0: set socket timeout to 120000
13:10:18.015 [main] DEBUG o.a.h.impl.execchain.MainClientExec - Executing request GET / HTTP/1.1
...
13:10:18.066 [main] DEBUG o.a.h.impl.execchain.MainClientExec - Connection can be kept alive indefinitely
...
...
13:10:18.196 [main] DEBUG com.intuit.karate - request:
2 > GET https://www.karatelabs.io/
13:10:18.196 [main] DEBUG o.a.h.i.c.PoolingHttpClientConnectionManager - Connection request: [route: {s}->https://www.karatelabs.io:443][total available: 0; route allocated: 0 of 5; total allocated: 0 of 10]
13:10:18.196 [main] DEBUG o.a.h.i.c.PoolingHttpClientConnectionManager - Connection leased: [id: 1][route: {s}->https://www.karatelabs.io:443][total available: 0; route allocated: 1 of 5; total allocated: 1 of 10]
13:10:18.196 [main] DEBUG o.a.h.impl.execchain.MainClientExec - Opening connection {s}->https://www.karatelabs.io:443
13:10:18.196 [main] DEBUG o.a.h.i.c.DefaultHttpClientConnectionOperator - Connecting to www.karatelabs.io/34.149.87.45:443
13:10:18.196 [main] DEBUG o.a.h.c.s.SSLConnectionSocketFactory - Connecting socket to www.karatelabs.io/34.149.87.45:443 with timeout 30000
13:10:18.206 [main] DEBUG o.a.h.c.s.SSLConnectionSocketFactory - Enabled protocols: [TLSv1.3, TLSv1.2]
13:10:18.206 [main] DEBUG o.a.h.c.s.SSLConnectionSocketFactory - Enabled cipher suites:[...]
13:10:18.206 [main] DEBUG o.a.h.c.s.SSLConnectionSocketFactory - Starting handshake
13:10:18.236 [main] DEBUG o.a.h.c.s.SSLConnectionSocketFactory - Secure session established
13:10:18.236 [main] DEBUG o.a.h.c.s.SSLConnectionSocketFactory - negotiated protocol: TLSv1.3
13:10:18.236 [main] DEBUG o.a.h.c.s.SSLConnectionSocketFactory - negotiated cipher suite: TLS_AES_256_GCM_SHA384
13:10:18.236 [main] DEBUG o.a.h.c.s.SSLConnectionSocketFactory - peer principal: CN=karatelabs.io
13:10:18.236 [main] DEBUG o.a.h.c.s.SSLConnectionSocketFactory - peer alternative names: [karatelabs.io, www.karatelabs.io]
13:10:18.236 [main] DEBUG o.a.h.c.s.SSLConnectionSocketFactory - issuer principal: CN=Sectigo RSA Domain Validation Secure Server CA, O=Sectigo Limited, L=Salford, ST=Greater Manchester, C=GB
13:10:18.236 [main] DEBUG o.a.h.i.c.DefaultHttpClientConnectionOperator - Connection established localIp<->serverIp
13:10:18.236 [main] DEBUG o.a.h.i.c.DefaultManagedHttpClientConnection - http-outgoing-1: set socket timeout to 120000
...
13:10:18.279 [main] DEBUG o.a.h.impl.execchain.MainClientExec - Connection can be kept alive indefinitely
...
...
13:10:18.609 [Finalizer] DEBUG o.a.h.i.c.PoolingHttpClientConnectionManager - Connection manager is shutting down
13:10:18.610 [Finalizer] DEBUG o.a.h.i.c.DefaultManagedHttpClientConnection - http-outgoing-1: Shutdown connection
13:10:18.611 [Finalizer] DEBUG o.a.h.i.c.PoolingHttpClientConnectionManager - Connection manager shut down
13:10:18.612 [Finalizer] DEBUG o.a.h.i.c.PoolingHttpClientConnectionManager - Connection manager is shutting down
13:10:18.612 [Finalizer] DEBUG o.a.h.i.c.DefaultManagedHttpClientConnection - http-outgoing-2: Shutdown connection
13:10:18.612 [Finalizer] DEBUG o.a.h.i.c.PoolingHttpClientConnectionManager - Connection manager shut down
13:10:18.612 [Finalizer] DEBUG o.a.h.i.c.PoolingHttpClientConnectionManager - Connection manager is shutting down
"Connecting to socket" and "handshake" indicate that karate is establishing a new connection instead of using an already opened one, even though I am sending a request to the same host.
On the other hand, on longer scenarios, I was seeing "http-outgoing-x: Shutdown connection" after about ~1s from opening it, in the middle of the run, despite having "karate.configure('readTimeout', 120000)" specified.
I don't think that was intentional, especially after seeing the "keep-alive" header and the "Connection can be kept alive indefinitely" in the log"
That being said, is there any way to force karate to use the same connection instead of establishing a new one each request?
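As an aside, a rough way to confirm this kind of connection churn from the client side (assuming a Linux host with ss available) is to count sockets towards the target while the suite runs:
# sockets currently stuck in TIME-WAIT; a large number points at heavy connection churn
ss -tan state time-wait | wc -l
# open connections per remote endpoint on port 443
ss -tan | grep ':443' | awk '{print $NF}' | sort | uniq -c | sort -rn | head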
As far as we know, we use the Apache HTTP Client API the right way.
But you never know. The best thing would be for you to dive into the code and see what we could be missing, or to provide a way to replicate this by following these instructions: https://github.com/karatelabs/karate/wiki/How-to-Submit-an-Issue

OpenStack nova-compute service is not recognized as a compute service on the controller node (probably a RabbitMQ connection problem)

I was following the tutorials on the OpenStack docs website for a minimal deployment of the Stein release (I am installing it on Ubuntu 18.04 instances). I installed the Nova services and made the required configuration. Then I reached this stage and ran the following command on the controller node:
openstack compute service list --service nova-compute
I was expecting to see a nova-compute service running on a compute host in the result, but I get nothing.
I looked at the nova-compute.log file on my compute node (/var/log/nova/nova-compute) and it contains this error:
...
2022-04-25 06:33:46.682 4015 ERROR oslo.messaging._drivers.impl_rabbit [req-313f4c65-0f63-4d8b-8682-6295770701af - - - - -] Connection failed: timed out (retrying in 32.0 seconds): socket.timeout: timed out
2022-04-25 06:34:23.745 4015 ERROR oslo.messaging._drivers.impl_rabbit [req-313f4c65-0f63-4d8b-8682-6295770701af - - - - -] Connection failed: timed out (retrying in 32.0 seconds): socket.timeout: timed out
2022-04-25 06:35:00.803 4015 ERROR oslo.messaging._drivers.impl_rabbit [req-313f4c65-0f63-4d8b-8682-6295770701af - - - - -] Connection failed: timed out (retrying in 32.0 seconds): socket.timeout: timed out
2022-04-25 06:35:37.860 4015 ERROR oslo.messaging._drivers.impl_rabbit [req-313f4c65-0f63-4d8b-8682-6295770701af - - - - -] Connection failed: timed out (retrying in 32.0 seconds): socket.timeout: timed out
2022-04-25 06:36:14.920 4015 ERROR oslo.messaging._drivers.impl_rabbit [req-313f4c65-0f63-4d8b-8682-6295770701af - - - - -] Connection failed: timed out (retrying in 32.0 seconds): socket.timeout: timed out
...
Apparently, it has a problem connecting to the RabbitMQ service. I searched a lot but couldn't find anything useful for my case. I've been stuck on this for quite some time now, so I'd be very happy if someone could give me an answer.
As Victor Lee suggested in the comments, I checked RabbitMQ's port and the service itself was running without problems. It turned out my firewall didn't allow incoming traffic on RabbitMQ's port, so I added a rule to allow it.
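A sketch of the checks and the firewall rule involved, assuming the default AMQP port 5672, ufw as the firewall and 'controller' as the RabbitMQ host (adjust to your deployment):
# on the controller: confirm RabbitMQ is listening
ss -tlnp | grep 5672
# from the compute node: confirm the port is reachable across the network
nc -zv controller 5672
# on the controller: open the port (ufw example; use the iptables/firewalld equivalent if applicable)
sudo ufw allow 5672/tcp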

WSO2 Carbon Initialization Failed (Fatal Error)

I have installed WSO2 Identity Server on my machine, which is running Windows 10. I am trying to start the server using the command wso2server.bat --run; however, I get the error "WSO2 Carbon initialization Failed". The following is the complete log from the terminal:
C:\Users\USER\Downloads\wso2is-5.4.0\bin>wso2server.bat --run
JAVA_HOME environment variable is set to C:\Program Files\Java\jdk1.8.0_152
CARBON_HOME environment variable is set to C:\Users\USER\Downloads\wso2is-5.4.0
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=256m; support was removed in 8.0
[2017-12-21 13:57:30,161] INFO {org.wso2.carbon.core.internal.CarbonCoreActivator} - Starting WSO2 Carbon...
[2017-12-21 13:57:30,161] INFO {org.wso2.carbon.core.internal.CarbonCoreActivator} - Operating System : Windows 10 10.0, amd64
[2017-12-21 13:57:30,161] INFO {org.wso2.carbon.core.internal.CarbonCoreActivator} - Java Home : C:\Program Files\Java\jdk1.8.0_152\jre
[2017-12-21 13:57:30,161] INFO {org.wso2.carbon.core.internal.CarbonCoreActivator} - Java Version : 1.8.0_152
[2017-12-21 13:57:30,161] INFO {org.wso2.carbon.core.internal.CarbonCoreActivator} - Java VM : Java HotSpot(TM) 64-Bit Server VM 25.152-b16,Oracle Corporation
[2017-12-21 13:57:30,161] INFO {org.wso2.carbon.core.internal.CarbonCoreActivator} - Carbon Home : C:\Users\USER\Downloads\wso2is-5.4.0
[2017-12-21 13:57:30,161] INFO {org.wso2.carbon.core.internal.CarbonCoreActivator} - Java Temp Dir : C:\Users\USER\Downloads\wso2is-5.4.0\tmp
[2017-12-21 13:57:30,161] INFO {org.wso2.carbon.core.internal.CarbonCoreActivator} - User : USER, en-US, Asia/Calcutta
[2017-12-21 13:57:31,067] INFO {org.wso2.carbon.event.output.adapter.kafka.internal.ds.KafkaEventAdapterServiceDS} - Successfully deployed the Kafka output event adaptor service
[2017-12-21 13:57:31,129] INFO {org.wso2.carbon.event.processor.manager.core.internal.util.ManagementModeConfigurationLoader} - CEP started in Single node mode
[2017-12-21 13:57:32,676] INFO {org.wso2.carbon.ldap.server.configuration.LDAPConfigurationBuilder} - KDC server is disabled.
[2017-12-21 13:57:47,505] INFO {org.wso2.carbon.mex.internal.Office365SupportMexComponent} - Office365Support MexServiceComponent bundle activated successfully..
[2017-12-21 13:57:47,505] INFO {org.wso2.carbon.mex2.internal.DynamicCRMCustomMexComponent} - DynamicCRMSupport MexServiceComponent bundle activated successfully.
[2017-12-21 13:57:49,897] INFO {org.wso2.carbon.user.core.ldap.ReadWriteLDAPUserStoreManager} - LDAP connection created successfully in read-write mode
[2017-12-21 13:57:53,287] INFO {org.wso2.carbon.registry.core.jdbc.EmbeddedRegistryService} - Configured Registry in 59ms
[2017-12-21 13:57:53,350] INFO {org.wso2.carbon.registry.core.internal.RegistryCoreServiceComponent} - Registry Mode : READ-WRITE
[2017-12-21 13:57:53,365] INFO {org.wso2.carbon.attachment.mgt.server.internal.AttachmentServiceComponent} - Initialising Attachment Server
[2017-12-21 13:57:53,738] INFO {org.wso2.carbon.attachment.mgt.core.dao.impl.jpa.AbstractJPAVendorAdapter} - [Attachment-Mgt OpenJPA] DB Dictionary: h2
[2017-12-21 13:57:53,738] INFO {org.wso2.carbon.attachment.mgt.core.dao.impl.jpa.AbstractJPAVendorAdapter} - [Attachment-Mgt OpenJPA] Generate DDL Enabled.
[2017-12-21 13:57:53,988] INFO {org.wso2.carbon.identity.authenticator.x509Certificate.internal.X509CertificateServiceComponent} - X509 Certificate Servlet activated successfully..
[2017-12-21 13:57:54,738] INFO {org.wso2.carbon.attachment.mgt.server.internal.AttachmentServiceComponent} - Registering AttachmentServerService
[2017-12-21 13:57:55,738] INFO {org.wso2.carbon.bpel.core.internal.BPELServiceComponent} - Initializing BPEL Engine........
[2017-12-21 13:57:55,847] INFO {org.wso2.carbon.bpel.core.ode.integration.BPELServerImpl} - Using DAO Connection Factory class: org.apache.ode.dao.jpa.BPELDAOConnectionFactoryImpl
[2017-12-21 13:57:56,035] INFO {org.wso2.carbon.bpel.core.ode.integration.BPELServerImpl} - Registering E4X Extension...
[2017-12-21 13:57:56,035] INFO {org.wso2.carbon.bpel.core.ode.integration.BPELServerImpl} - Registering B4P Extension...
[2017-12-21 13:57:56,035] INFO {org.wso2.carbon.bpel.core.ode.integration.BPELServerImpl} - Registering B4P Filter...
[2017-12-21 13:57:56,050] INFO {org.wso2.carbon.bpel.core.ode.integration.BPELServerImpl} - Registering MBeans
[2017-12-21 13:57:56,128] INFO {org.wso2.carbon.humantask.core.internal.HumanTaskServiceComponent} - Initialising HumanTask Server
[2017-12-21 13:57:56,160] INFO {org.wso2.carbon.humantask.core.dao.jpa.AbstractJPAVendorAdapter} - [HT OpenJPA] DB Dictionary: h2
[2017-12-21 13:57:56,160] INFO {org.wso2.carbon.humantask.core.dao.jpa.AbstractJPAVendorAdapter} - [HT OpenJPA] Generate DDL Enabled.
[2017-12-21 13:57:56,191] INFO {org.wso2.carbon.humantask.core.internal.HumanTaskServiceComponent} - Registering Axis2ConfigurationContextObserver
[2017-12-21 13:57:56,191] INFO {org.wso2.carbon.humantask.core.internal.HumanTaskServiceComponent} - Registering HT related MBeans
[2017-12-21 13:57:56,206] INFO {org.wso2.carbon.humantask.core.internal.HumanTaskServiceComponent} - MXBean for Human tasks registered successfully
[2017-12-21 13:57:56,347] INFO {org.wso2.carbon.metrics.impl.util.JmxReporterBuilder} - Creating JMX reporter for Metrics with domain 'org.wso2.carbon.metrics'
[2017-12-21 13:57:56,363] INFO {org.wso2.carbon.metrics.impl.util.JDBCReporterBuilder} - Creating JDBC reporter for Metrics with source 'Lenovo-PC', data source 'jdbc/WSO2MetricsDB' and 60 seconds polling period
[2017-12-21 13:57:56,378] INFO {org.wso2.carbon.metrics.impl.reporter.AbstractReporter} - Started JDBC reporter for Metrics
[2017-12-21 13:57:56,378] INFO {org.wso2.carbon.metrics.impl.reporter.AbstractReporter} - Started JMX reporter for Metrics
[2017-12-21 13:58:43,732] INFO {org.wso2.carbon.registry.indexing.solr.SolrClient} - Default Embedded Solr Server Initialized
[2017-12-21 13:58:44,076] INFO {org.wso2.carbon.user.core.internal.UserStoreMgtDSComponent} - Carbon UserStoreMgtDSComponent activated successfully.
[2017-12-21 13:58:45,232] INFO {org.wso2.carbon.identity.user.store.configuration.deployer.UserStoreConfigurationDeployer} - User Store Configuration Deployer initiated.
[2017-12-21 13:58:45,232] INFO {org.wso2.carbon.identity.user.store.configuration.deployer.UserStoreConfigurationDeployer} - User Store Configuration Deployer initiated.
[2017-12-21 13:58:45,263] INFO {org.wso2.carbon.bpel.deployer.BPELDeployer} - Initializing BPEL Deployer for tenant -1234.
[2017-12-21 13:58:45,263] INFO {org.wso2.carbon.humantask.deployer.HumanTaskDeployer} - Initializing HumanTask Deployer for tenant -1234.
[2017-12-21 13:58:46,935] FATAL {org.wso2.carbon.core.init.CarbonServerManager} - WSO2 Carbon initialization Failed
org.apache.axiom.om.OMException: com.ctc.wstx.exc.WstxIOException: Invalid UTF-8 middle byte 0x3f (at char #2621, byte #-1)
at org.apache.axiom.om.impl.builder.StAXOMBuilder.next(StAXOMBuilder.java:296)
at org.apache.axiom.om.impl.llom.OMDocumentImpl.getOMDocumentElement(OMDocumentImpl.java:109)
at org.apache.axiom.om.impl.builder.StAXOMBuilder.getDocumentElement(StAXOMBuilder.java:570)
at org.apache.axiom.om.impl.builder.StAXOMBuilder.getDocumentElement(StAXOMBuilder.java:566)
at org.apache.axis2.util.XMLUtils.toOM(XMLUtils.java:592)
at org.apache.axis2.util.XMLUtils.toOM(XMLUtils.java:575)
at org.apache.axis2.deployment.DescriptionBuilder.buildOM(DescriptionBuilder.java:97)
at org.apache.axis2.deployment.AxisConfigBuilder.populateConfig(AxisConfigBuilder.java:91)
at org.apache.axis2.deployment.DeploymentEngine.populateAxisConfiguration(DeploymentEngine.java:887)
at org.apache.axis2.deployment.FileSystemConfigurator.getAxisConfiguration(FileSystemConfigurator.java:116)
at org.apache.axis2.context.ConfigurationContextFactory.createConfigurationContext(ConfigurationContextFactory.java:64)
at org.apache.axis2.context.ConfigurationContextFactory.createConfigurationContextFromFileSystem(ConfigurationContextFactory.java:210)
at org.wso2.carbon.core.init.CarbonServerManager.getClientConfigurationContext(CarbonServerManager.java:573)
at org.wso2.carbon.core.init.CarbonServerManager.initializeCarbon(CarbonServerManager.java:458)
at org.wso2.carbon.core.init.CarbonServerManager.removePendingItem(CarbonServerManager.java:291)
at org.wso2.carbon.core.init.PreAxis2ConfigItemListener.bundleChanged(PreAxis2ConfigItemListener.java:118)
at org.eclipse.osgi.framework.internal.core.BundleContextImpl.dispatchEvent(BundleContextImpl.java:847)
at org.eclipse.osgi.framework.eventmgr.EventManager.dispatchEvent(EventManager.java:230)
at org.eclipse.osgi.framework.eventmgr.EventManager$EventThread.run(EventManager.java:340)
Caused by: com.ctc.wstx.exc.WstxIOException: Invalid UTF-8 middle byte 0x3f (at char #2621, byte #-1)
at com.ctc.wstx.sr.StreamScanner.constructFromIOE(StreamScanner.java:625)
at com.ctc.wstx.sr.StreamScanner.loadMore(StreamScanner.java:997)
at com.ctc.wstx.sr.StreamScanner.getNext(StreamScanner.java:754)
at com.ctc.wstx.sr.BasicStreamReader.nextFromProlog(BasicStreamReader.java:2000)
at com.ctc.wstx.sr.BasicStreamReader.next(BasicStreamReader.java:1134)
at org.apache.axiom.om.impl.builder.StAXOMBuilder.parserNext(StAXOMBuilder.java:681)
at org.apache.axiom.om.impl.builder.StAXOMBuilder.next(StAXOMBuilder.java:214)
... 18 more
Caused by: java.io.CharConversionException: Invalid UTF-8 middle byte 0x3f (at char #2621, byte #-1)
at com.ctc.wstx.io.UTF8Reader.reportInvalidOther(UTF8Reader.java:314)
at com.ctc.wstx.io.UTF8Reader.read(UTF8Reader.java:212)
at com.ctc.wstx.io.ReaderSource.readInto(ReaderSource.java:87)
at com.ctc.wstx.io.BranchingReaderSource.readInto(BranchingReaderSource.java:57)
at com.ctc.wstx.sr.StreamScanner.loadMore(StreamScanner.java:991)
... 23 more
I referred to the following question: wso2 app server (carbon) startup error; however, that did not help me much. Please advise me on how I should run the WSO2 server.
Try the following:
1. Stop the WSO2 server.
2. Add -Dfile.encoding=UTF8 to CMD_LINE_ARGS in the wso2server.bat file.
3. Restart the server.
Also note that JDK 8u152 has a known gzip bug that causes failures in WSO2 products; use 8u144 instead.
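For reference, the edit in step 2 amounts to appending the flag to the CMD_LINE_ARGS definition in <IS_HOME>\bin\wso2server.bat. A rough sketch (the existing arguments in your copy of the script will differ):
rem append the encoding flag to the already-defined CMD_LINE_ARGS variable
set CMD_LINE_ARGS=%CMD_LINE_ARGS% -Dfile.encoding=UTF8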

ZooKeeper node fails to communicate with the leader node

I have a ZooKeeper cluster with 3 nodes; the ZooKeeper config is shown below. When I restart a node it reports success, but the status check then reports failure.
zoo.cfg
dataDir=/ngs/app/<app>/zookeeper-3.4.6/zookeeperdata/1
clientPort=2181
initLimit=5
syncLimit=2
server.1=pr2-ligerp-lapp27.<domain.com>:2888:3888
server.2=pr2-ligerp-lapp28.<domain.com>:2889:3889
server.3=pr2-ligerp-lapp29.<domain.com>:2890:3890
Please find the logs below:
sh zkServer.sh start
JMX enabled by default
Using config: /ngs/app/ligerp/solr/zookeeper-3.4.6/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
-bash-4.1$
-bash-4.1$ cat zookeeper.out
2017-04-18 18:58:13,840 [myid:] - INFO [main:QuorumPeerConfig#103] - Reading configuration from: /ngs/app/ligerp/solr/zookeeper-3.4.6/bin/../conf/zoo.cfg
2017-04-18 18:58:13,843 [myid:] - INFO [main:QuorumPeerConfig#340] - Defaulting to majority quorums
2017-04-18 18:58:13,845 [myid:1] - INFO [main:DatadirCleanupManager#78] - autopurge.snapRetainCount set to 3
2017-04-18 18:58:13,845 [myid:1] - INFO [main:DatadirCleanupManager#79] - autopurge.purgeInterval set to 0
2017-04-18 18:58:13,846 [myid:1] - INFO [main:DatadirCleanupManager#101] - Purge task is not scheduled.
2017-04-18 18:58:13,854 [myid:1] - INFO [main:QuorumPeerMain#127] - Starting quorum peer
2017-04-18 18:58:13,861 [myid:1] - INFO [main:NIOServerCnxnFactory#94] - binding to port 0.0.0.0/0.0.0.0:2181
2017-04-18 18:58:13,875 [myid:1] - INFO [main:QuorumPeer#959] - tickTime set to 3000
2017-04-18 18:58:13,875 [myid:1] - INFO [main:QuorumPeer#979] - minSessionTimeout set to -1
2017-04-18 18:58:13,875 [myid:1] - INFO [main:QuorumPeer#990] - maxSessionTimeout set to -1
2017-04-18 18:58:13,875 [myid:1] - INFO [main:QuorumPeer#1005] - initLimit set to 5
2017-04-18 18:58:13,884 [myid:1] - INFO [main:FileSnap#83] - Reading snapshot /ngs/app/ligerp/solr/zookeeper-3.4.6/zookeeperdata/1/version-2/snapshot.1300000032
2017-04-18 18:58:13,954 [myid:1] - INFO [Thread-1:QuorumCnxManager$Listener#504] - My election bind port: pr2-ligerp-lapp27.<domain>/10.136.145.38:3888
2017-04-18 18:58:13,960 [myid:1] - INFO [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:QuorumPeer#714] - LOOKING
2017-04-18 18:58:13,961 [myid:1] - INFO [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:FastLeaderElection#815] - New election. My id = 1, proposed zxid=0x130000024b
2017-04-18 18:58:13,962 [myid:1] - INFO [WorkerReceiver[myid=1]:FastLeaderElection#597] - Notification: 1 (message format version), 1 (n.leader), 0x130000024b (n.zxid), 0x1 (n.round), LOOKING (n.state), 1 (n.sid), 0x13 (n.peerEpoch) LOOKING (my state)
2017-04-18 18:58:13,964 [myid:1] - INFO [WorkerSender[myid=1]:QuorumCnxManager#193] - Have smaller server identifier, so dropping the connection: (2, 1)
2017-04-18 18:58:13,964 [myid:1] - INFO [WorkerSender[myid=1]:QuorumCnxManager#193] - Have smaller server identifier, so dropping the connection: (3, 1)
2017-04-18 18:58:14,165 [myid:1] - INFO [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:QuorumCnxManager#193] - Have smaller server identifier, so dropping the connection: (2, 1)
2017-04-18 18:58:14,166 [myid:1] - INFO [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:QuorumCnxManager#193] - Have smaller server identifier, so dropping the connection: (3, 1)
2017-04-18 18:58:14,166 [myid:1] - INFO [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:FastLeaderElection#849] - Notification time out: 400
2017-04-18 18:58:15,566 [myid:1] - INFO [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:QuorumCnxManager#193] - Have smaller server identifier, so dropping the connection: (2, 1)
2017-04-18 18:58:15,567 [myid:1] - INFO [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:QuorumCnxManager#193] - Have smaller server identifier, so dropping the connection: (3, 1)
2017-04-18 18:58:15,567 [myid:1] - INFO [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:FastLeaderElection#849] - Notification time out: 800
2017-04-18 18:58:16,368 [myid:1] - INFO [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:QuorumCnxManager#193] - Have smaller server identifier, so dropping the connection: (2, 1)
2017-04-18 18:58:16,368 [myid:1] - INFO [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:QuorumCnxManager#193] - Have smaller server identifier, so dropping the connection: (3, 1)
2017-04-18 18:58:16,368 [myid:1] - INFO [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:FastLeaderElection#849] - Notification time out: 1600
2017-04-18 18:58:17,969 [myid:1] - INFO [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:QuorumCnxManager#193] - Have smaller server identifier, so dropping the connection: (2, 1)
2017-04-18 18:58:17,969 [myid:1] - INFO [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:QuorumCnxManager#193] - Have smaller server identifier, so dropping the connection: (3, 1)
2017-04-18 18:58:17,970 [myid:1] - INFO [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:FastLeaderElection#849] - Notification time out: 3200
But after checking the status, we found that it is not running, although we can still see the process ID. Could someone help fix this issue?
sh zkServer.sh status
JMX enabled by default
Using config: /ngs/app/ligerp/solr/zookeeper-3.4.6/bin/../conf/zoo.cfg
Error contacting service. It is probably not running.
This is the exception on the ZooKeeper leader node:
2017-04-18 18:25:32,634 [myid:3] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2183:NIOServerCnxnFactory#197] - Accepted socket connection from /127.0.0.1:47916
2017-04-18 18:25:32,635 [myid:3] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2183:NIOServerCnxn#827] - Processing srvr command from /127.0.0.1:47916
2017-04-18 18:25:32,635 [myid:3] - INFO [Thread-22:NIOServerCnxn#1007] - Closed socket connection for client /127.0.0.1:47916 (no session established for client)
2017-04-18 18:30:01,662 [myid:3] - WARN [RecvWorker:1:QuorumCnxManager$RecvWorker#780] - Connection broken for id 1, my id = 3, error =
java.io.EOFException
at java.io.DataInputStream.readInt(DataInputStream.java:392)
at org.apache.zookeeper.server.quorum.QuorumCnxManager$RecvWorker.run(QuorumCnxManager.java:765)
2017-04-18 18:30:01,663 [myid:3] - WARN [RecvWorker:1:QuorumCnxManager$RecvWorker#783] - Interrupting SendWorker
2017-04-18 18:30:01,662 [myid:3] - ERROR [LearnerHandler-/10.136.145.38:47656:LearnerHandler#633] - Unexpected exception causing shutdown while sock still open
java.io.EOFException
at java.io.DataInputStream.readInt(DataInputStream.java:392)
at org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
at org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:83)
at org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:103)
at org.apache.zookeeper.server.quorum.LearnerHandler.run(LearnerHandler.java:546)
2017-04-18 18:30:01,663 [myid:3] - WARN [SendWorker:1:QuorumCnxManager$SendWorker#697] - Interrupted while waiting for message on queue
java.lang.InterruptedException
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:2014)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2088)
at java.util.concurrent.ArrayBlockingQueue.poll(ArrayBlockingQueue.java:418)
at org.apache.zookeeper.server.quorum.QuorumCnxManager.pollSendQueue(QuorumCnxManager.java:849)
at org.apache.zookeeper.server.quorum.QuorumCnxManager.access$500(QuorumCnxManager.java:64)
at org.apache.zookeeper.server.quorum.QuorumCnxManager$SendWorker.run(QuorumCnxManager.java:685)
2017-04-18 18:30:01,663 [myid:3] - WARN [LearnerHandler-/10.136.145.38:47656:LearnerHandler#646] - ******* GOODBYE /10.136.145.38:47656 ********
2017-04-18 18:30:01,663 [myid:3] - WARN [SendWorker:1:QuorumCnxManager$SendWorker#706] - Send worker leaving thread
2017-04-18 18:39:40,076 [myid:3] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2183:NIOServerCnxnFactory#197] - Accepted socket connection from /10.136.145.38:58748
2017-04-18 18:39:40,077 [myid:3] - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2183:NIOServerCnxn#357] - caught end of stream exception
EndOfStreamException: Unable to read additional data from client sessionid 0x0, likely client has closed socket
at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228)
at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
at java.lang.Thread.run(Thread.java:745)
2017-04-18 18:39:40,078 [myid:3] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2183:NIOServerCnxn#1007] - Closed socket connection for client /10.136.145.38:58748 (no session established for client)
2017-04-18 18:42:46,516 [myid:3] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2183:NIOServerCnxnFactory#197] - Accepted socket connection from /127.0.0.1:47988
2017-04-18 18:42:46,516 [myid:3] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2183:NIOServerCnxn#827] - Processing srvr command from /127.0.0.1:47988
2017-04-18 18:42:46,517 [myid:3] - INFO [Thread-23:NIOServerCnxn#1007] - Closed socket connection for client /127.0.0.1:47988 (no session established for client)
Fixed the issue by making the following changes in zoo.cfg on all of the ZooKeeper nodes:
- changed the ZooKeeper hostnames to IP addresses
- started the ZooKeeper instances in descending order of server id
- increased initLimit from 5 to 100
A sample zoo.cfg looks like this:
dataDir=/ngs/app/ligerp/solr/zookeeper-3.4.6/zookeeperdata/1
clientPort=2181
initLimit=100
syncLimit=2
server.1=10.136.145.38:2888:3888
server.2=10.136.145.39:2889:3889
server.3=10.136.145.40:2890:3890
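After restarting all three nodes, each one's role can be verified with zkServer.sh status or with the srvr four-letter command (nc assumed available; use each node's own clientPort):
sh zkServer.sh status
echo srvr | nc 10.136.145.38 2181
The "Mode:" line in the srvr output should read leader on exactly one node and follower on the other two.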
In my case, it was simply a matter of insufficient CPU allocation for the process.
Try allocating additional CPU and see if this fixes the problem.