pgAgent: Caught unhandled unknown exception; terminating

I'm having trouble with constantly running jobs on a PostgreSQL database; they never finish.
I have tried the following actions to fix it:
apt-get update && apt-get upgrade (postgresql was updated to the latest version)
/etc/init.d/postgresql restart
postgresql.service - PostgreSQL RDBMS
Loaded: loaded (/lib/systemd/system/postgresql.service; enabled)
Active: active (exited) since Fri 2015-10-16 10:06:04 UTC; 9s ago
Process: 6787 ExecStart=/bin/true (code=exited, status=0/SUCCESS)
Main PID: 6787 (code=exited, status=0/SUCCESS)
CGroup: /system.slice/postgresql.service
/etc/init.d/pgagent restart
pgagent.service - Postgres Job Agent Daemon
Loaded: loaded (/lib/systemd/system/pgagent.service; enabled)
Active: active (running) since Fri 2015-10-16 10:06:04 UTC; 1min 48s ago
Main PID: 6793 (pgagent)
CGroup: /system.slice/pgagent.service
└─6793 /usr/bin/pgagent -f -l 2 -s /var/log/pgagent hostaddr=localhost dbname=postgres user=postgresext
Oct 16 10:07:00 m-t-db-01 pgagent[6793]: *** Caught unhandled unknown exception; terminating
Oct 16 10:07:50 m-t-db-01 pgagent[6793]: *** Caught unhandled unknown exception; terminating
Tried to enable debug mode for pgagent by editing /etc/default/pgagent:
EXTRA_OPTS="-f -l 2 -s /var/log/pgagent hostaddr=localhost dbname=postgres user=postgresext"
Tried rebooting the machine.
In the /var/log/pgagent log I see only:
ERROR: Failed to query jobs table!
DEBUG: Creating primary connection
DEBUG: Connection Information:
DEBUG: user : postgresext
DEBUG: port : 0
DEBUG: host : localhost
DEBUG: dbname : postgres
DEBUG: password :
DEBUG: conn timeout : 0
DEBUG: Connection Information:
DEBUG: user : postgresext
DEBUG: port : 0
DEBUG: host : localhost
DEBUG: dbname : postgres
DEBUG: password :
DEBUG: conn timeout : 0
DEBUG: Creating DB connection: user=postgresext host=localhost dbname=postgres
DEBUG: Database sanity check
DEBUG: Clearing zombies
DEBUG: Checking for jobs to run
DEBUG: Sleeping...
DEBUG: Clearing inactive connections
DEBUG: Connection stats: total - 1, free - 0, deleted - 0
DEBUG: Checking for jobs to run
DEBUG: Sleeping...
DEBUG: Creating primary connection
DEBUG: Connection Information:
DEBUG: user : postgresext
DEBUG: port : 0
DEBUG: host : localhost
DEBUG: dbname : postgres
DEBUG: password :
DEBUG: conn timeout : 0
DEBUG: Connection Information:
DEBUG: user : postgresext
DEBUG: port : 0
DEBUG: host : localhost
DEBUG: dbname : postgres
DEBUG: password :
DEBUG: conn timeout : 0
DEBUG: Creating DB connection: user=postgresext host=localhost dbname=postgres
DEBUG: Database sanity check
DEBUG: Clearing zombies
DEBUG: Checking for jobs to run
DEBUG: Sleeping...
DEBUG: Clearing inactive connections
DEBUG: Connection stats: total - 1, free - 0, deleted - 0
DEBUG: Checking for jobs to run
DEBUG: Sleeping...
DEBUG: Clearing inactive connections
DEBUG: Connection stats: total - 1, free - 0, deleted - 0
DEBUG: Checking for jobs to run
DEBUG: Creating job thread for job 8
DEBUG: Creating DB connection: user=postgresext host=localhost dbname=postgres
DEBUG: Allocating new connection to database postgres
DEBUG: Starting job: 8
DEBUG: Creating job thread for job 5
DEBUG: Creating DB connection: user=postgresext host=localhost dbname=postgres
DEBUG: Allocating new connection to database postgres
DEBUG: Starting job: 5
DEBUG: Creating DB connection: user=postgresext host=localhost dbname=postgres dbname=testdb
DEBUG: Sleeping...
DEBUG: Allocating new connection to database testdb
DEBUG: Executing SQL step 23 (part of job 8)
DEBUG: Creating DB connection: user=postgresext host=localhost dbname=postgres dbname=testdb
DEBUG: Allocating new connection to database testdb
DEBUG: Executing SQL step 15 (part of job 5)
DEBUG: Checking for jobs to run
DEBUG: Sleeping...
DEBUG: Clearing inactive connections
DEBUG: Connection stats: total - 5, free - 0, deleted - 0
DEBUG: Checking for jobs to run
DEBUG: Sleeping...
DEBUG: Destroying job thread for job 8
DEBUG: Clearing inactive connections
DEBUG: Connection stats: total - 5, free - 0, deleted - 0
DEBUG: Checking for jobs to run
DEBUG: Sleeping...
DEBUG: Clearing inactive connections
DEBUG: Connection stats: total - 5, free - 0, deleted - 0
DEBUG: Checking for jobs to run
DEBUG: Sleeping...
DEBUG: Clearing inactive connections
DEBUG: Connection stats: total - 5, free - 0, deleted - 0
DEBUG: Checking for jobs to run
DEBUG: Sleeping...
DEBUG: Clearing inactive connections
DEBUG: Connection stats: total - 5, free - 0, deleted - 0
DEBUG: Checking for jobs to run
DEBUG: Sleeping...
In /var/log/postgresql/postgresql-9.4-main.log I see only:
[unknown]#[unknown] LOG: incomplete startup packet
LOG: MultiXact member wraparound protections are now enabled
LOG: database system is ready to accept connections
LOG: autovacuum launcher started
postgresext#postgres LOG: could not receive data from client: Connection reset by peer
postgresext#testdb LOG: could not receive data from client: Connection reset by peer
postgresext#postgres LOG: could not receive data from client: Connection reset by peer
postgresext#postgres LOG: could not receive data from client: Connection reset by peer
postgresext#testdb LOG: could not receive data from client: Connection reset by peer
postgresext#postgres LOG: could not receive data from client: Connection reset by peer
postgresext#postgres LOG: could not receive data from client: Connection reset by peer
postgresext#testdb LOG: could not receive data from client: Connection reset by peer
postgresext#testdb LOG: could not receive data from client: Connection reset by peer
postgresext#testdb LOG: could not receive data from client: Connection reset by peer
postgresext#postgres LOG: could not receive data from client: Connection reset by peer
I cannot figure out what the actual problem is or how to fix it.
One more thing I found while investigating: the pgagent[6793]: *** Caught unhandled unknown exception; terminating message can be triggered by incorrect connection termination...
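Given the ERROR: Failed to query jobs table! line above, one thing worth ruling out is missing privileges on the pgagent schema. A minimal check, assuming the standard pgagent schema layout (run in the postgres database as a superuser; the grants are illustrative, adjust to your security policy):

-- Can the agent's role read the jobs table at all?
SET ROLE postgresext;
SELECT jobid, jobname, jobenabled FROM pgagent.pga_job;
RESET ROLE;
-- If that fails, grant access to the agent's role:
GRANT USAGE ON SCHEMA pgagent TO postgresext;
GRANT SELECT, INSERT, UPDATE, DELETE ON ALL TABLES IN SCHEMA pgagent TO postgresext;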

A little late answering this, but I ran into the same issues when using pgAgent, and it caused nothing but frustration. I even tried to dig into the source to fix it, but just getting it to compile on my server was a nightmare due to the dependencies it requires (for no good reason).
So with that in mind, I decided to write a new job scheduler for Postgres as a drop-in replacement for pgAgent. It's called jpgAgent, and I've been using it at my company for a couple of months now with no issues at all. Things are much more stable now that we're using jpgAgent: we don't have to worry about the agent randomly stopping on us and jobs not getting run as they should, whereas previously that was a common problem.
Anyway, I hope it helps you out.

Connections handling

I've been using Karate for a while now, and there is an issue we have been facing for over a year: org.apache.http.conn.ConnectTimeoutException.
Other threads mentioned that this connection timeout exception could be solved by specifying a proxy, but that did not help us.
After a lot of investigation, it turned out that our Azure SNAT ports were exhausted, meaning Karate was opening far too many connections.
To verify this, I enabled debug logging and used this feature:
Background:
* url "https://www.karatelabs.io/"
Scenario:
* method GET
* method GET
the logs then had following lines
13:10:17.868 [main] DEBUG com.intuit.karate - request:
1 > GET https://www.karatelabs.io/
1 > Host: www.karatelabs.io
1 > Connection: Keep-Alive
1 > User-Agent: Apache-HttpClient/4.5.13 (Java/17.0.4.1)
1 > Accept-Encoding: gzip,deflate
13:10:17.868 [main] DEBUG o.a.h.i.c.PoolingHttpClientConnectionManager - Connection request: [route: {s}->https://www.karatelabs.io:443][total available: 0; route allocated: 0 of 5; total allocated: 0 of 10]
13:10:17.874 [main] DEBUG o.a.h.i.c.PoolingHttpClientConnectionManager - Connection leased: [id: 0][route: {s}->https://www.karatelabs.io:443][total available: 0; route allocated: 1 of 5; total allocated: 1 of 10]
13:10:17.875 [main] DEBUG o.a.h.impl.execchain.MainClientExec - Opening connection {s}->https://www.karatelabs.io:443
13:10:17.883 [main] DEBUG o.a.h.i.c.DefaultHttpClientConnectionOperator - Connecting to www.karatelabs.io/34.149.87.45:443
13:10:17.883 [main] DEBUG o.a.h.c.s.SSLConnectionSocketFactory - Connecting socket to www.karatelabs.io/34.149.87.45:443 with timeout 30000
13:10:17.924 [main] DEBUG o.a.h.c.s.SSLConnectionSocketFactory - Enabled protocols: [TLSv1.3, TLSv1.2]
13:10:17.924 [main] DEBUG o.a.h.c.s.SSLConnectionSocketFactory - Enabled cipher suites:[...]
13:10:17.924 [main] DEBUG o.a.h.c.s.SSLConnectionSocketFactory - Starting handshake
13:10:18.012 [main] DEBUG o.a.h.c.s.SSLConnectionSocketFactory - Secure session established
13:10:18.012 [main] DEBUG o.a.h.c.s.SSLConnectionSocketFactory - negotiated protocol: TLSv1.3
13:10:18.012 [main] DEBUG o.a.h.c.s.SSLConnectionSocketFactory - negotiated cipher suite: TLS_AES_256_GCM_SHA384
13:10:18.012 [main] DEBUG o.a.h.c.s.SSLConnectionSocketFactory - peer principal: CN=karatelabs.io
13:10:18.012 [main] DEBUG o.a.h.c.s.SSLConnectionSocketFactory - peer alternative names: [karatelabs.io, www.karatelabs.io]
13:10:18.012 [main] DEBUG o.a.h.c.s.SSLConnectionSocketFactory - issuer principal: CN=Sectigo RSA Domain Validation Secure Server CA, O=Sectigo Limited, L=Salford, ST=Greater Manchester, C=GB
13:10:18.014 [main] DEBUG o.a.h.i.c.DefaultHttpClientConnectionOperator - Connection established localIp<->serverIp
13:10:18.015 [main] DEBUG o.a.h.i.c.DefaultManagedHttpClientConnection - http-outgoing-0: set socket timeout to 120000
13:10:18.015 [main] DEBUG o.a.h.impl.execchain.MainClientExec - Executing request GET / HTTP/1.1
...
13:10:18.066 [main] DEBUG o.a.h.impl.execchain.MainClientExec - Connection can be kept alive indefinitely
...
...
13:10:18.196 [main] DEBUG com.intuit.karate - request:
2 > GET https://www.karatelabs.io/
13:10:18.196 [main] DEBUG o.a.h.i.c.PoolingHttpClientConnectionManager - Connection request: [route: {s}->https://www.karatelabs.io:443][total available: 0; route allocated: 0 of 5; total allocated: 0 of 10]
13:10:18.196 [main] DEBUG o.a.h.i.c.PoolingHttpClientConnectionManager - Connection leased: [id: 1][route: {s}->https://www.karatelabs.io:443][total available: 0; route allocated: 1 of 5; total allocated: 1 of 10]
13:10:18.196 [main] DEBUG o.a.h.impl.execchain.MainClientExec - Opening connection {s}->https://www.karatelabs.io:443
13:10:18.196 [main] DEBUG o.a.h.i.c.DefaultHttpClientConnectionOperator - Connecting to www.karatelabs.io/34.149.87.45:443
13:10:18.196 [main] DEBUG o.a.h.c.s.SSLConnectionSocketFactory - Connecting socket to www.karatelabs.io/34.149.87.45:443 with timeout 30000
13:10:18.206 [main] DEBUG o.a.h.c.s.SSLConnectionSocketFactory - Enabled protocols: [TLSv1.3, TLSv1.2]
13:10:18.206 [main] DEBUG o.a.h.c.s.SSLConnectionSocketFactory - Enabled cipher suites:[...]
13:10:18.206 [main] DEBUG o.a.h.c.s.SSLConnectionSocketFactory - Starting handshake
13:10:18.236 [main] DEBUG o.a.h.c.s.SSLConnectionSocketFactory - Secure session established
13:10:18.236 [main] DEBUG o.a.h.c.s.SSLConnectionSocketFactory - negotiated protocol: TLSv1.3
13:10:18.236 [main] DEBUG o.a.h.c.s.SSLConnectionSocketFactory - negotiated cipher suite: TLS_AES_256_GCM_SHA384
13:10:18.236 [main] DEBUG o.a.h.c.s.SSLConnectionSocketFactory - peer principal: CN=karatelabs.io
13:10:18.236 [main] DEBUG o.a.h.c.s.SSLConnectionSocketFactory - peer alternative names: [karatelabs.io, www.karatelabs.io]
13:10:18.236 [main] DEBUG o.a.h.c.s.SSLConnectionSocketFactory - issuer principal: CN=Sectigo RSA Domain Validation Secure Server CA, O=Sectigo Limited, L=Salford, ST=Greater Manchester, C=GB
13:10:18.236 [main] DEBUG o.a.h.i.c.DefaultHttpClientConnectionOperator - Connection established localIp<->serverIp
13:10:18.236 [main] DEBUG o.a.h.i.c.DefaultManagedHttpClientConnection - http-outgoing-1: set socket timeout to 120000
...
13:10:18.279 [main] DEBUG o.a.h.impl.execchain.MainClientExec - Connection can be kept alive indefinitely
...
...
13:10:18.609 [Finalizer] DEBUG o.a.h.i.c.PoolingHttpClientConnectionManager - Connection manager is shutting down
13:10:18.610 [Finalizer] DEBUG o.a.h.i.c.DefaultManagedHttpClientConnection - http-outgoing-1: Shutdown connection
13:10:18.611 [Finalizer] DEBUG o.a.h.i.c.PoolingHttpClientConnectionManager - Connection manager shut down
13:10:18.612 [Finalizer] DEBUG o.a.h.i.c.PoolingHttpClientConnectionManager - Connection manager is shutting down
13:10:18.612 [Finalizer] DEBUG o.a.h.i.c.DefaultManagedHttpClientConnection - http-outgoing-2: Shutdown connection
13:10:18.612 [Finalizer] DEBUG o.a.h.i.c.PoolingHttpClientConnectionManager - Connection manager shut down
13:10:18.612 [Finalizer] DEBUG o.a.h.i.c.PoolingHttpClientConnectionManager - Connection manager is shutting down
"Connecting to socket" and "handshake" indicate that Karate is establishing a new connection instead of reusing an already opened one, even though I am sending a request to the same host.
On longer scenarios, I was also seeing "http-outgoing-x: Shutdown connection" about a second after the connection was opened, in the middle of the run, despite having karate.configure('readTimeout', 120000) specified.
I don't think that was intentional, especially after seeing the "keep-alive" header and "Connection can be kept alive indefinitely" in the log.
That being said, is there any way to force Karate to use the same connection instead of establishing a new one for each request?
As far as we know, we use the Apache HttpClient API the right way. But you never know. The best thing would be for you to dive into the code and see what we could be missing. Or you could provide a way to replicate it, following these instructions: https://github.com/karatelabs/karate/wiki/How-to-Submit-an-Issue
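For reference, connection reuse with the plain Apache HttpClient 4.x API (the client behind the logs above) looks like the following; a minimal standalone sketch, independent of Karate internals, with pool limits mirroring the "n of 5 / n of 10" figures in the log:

import org.apache.http.client.methods.CloseableHttpResponse;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;
import org.apache.http.impl.conn.PoolingHttpClientConnectionManager;
import org.apache.http.util.EntityUtils;

public class ConnectionReuseDemo {
    public static void main(String[] args) throws Exception {
        PoolingHttpClientConnectionManager cm = new PoolingHttpClientConnectionManager();
        cm.setMaxTotal(10);          // "total allocated: n of 10"
        cm.setDefaultMaxPerRoute(5); // "route allocated: n of 5"
        try (CloseableHttpClient client = HttpClients.custom()
                .setConnectionManager(cm)
                .build()) {
            for (int i = 0; i < 2; i++) {
                try (CloseableHttpResponse response =
                         client.execute(new HttpGet("https://www.karatelabs.io/"))) {
                    // Fully consuming the entity is what releases the
                    // connection back to the pool for reuse.
                    EntityUtils.consume(response.getEntity());
                }
            }
        }
    }
}

If the second request here triggers a fresh "Connecting to socket ... Starting handshake", something between the two requests is discarding the pooled connection, which is the behavior described in the question.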

SignalR server in console host not sending keep-alives

I have an ASP.NET Core application using a SignalR hub. When running via a console application (development mode), no keep-alive requests are sent by the server to the client. Consequently, the connection is re-established every 30 seconds or so.
However, when running the same application via Service Fabric, keep-alive requests are sent and everything works as expected.
Here are the server logs when running under the console app:
dbug: Microsoft.AspNetCore.Http.Connections.Internal.HttpConnectionManager[1]
New connection T2NKQg0jyrm7QAPz4p0ZWA created.
dbug: Microsoft.AspNetCore.Http.Connections.Internal.HttpConnectionDispatcher[4]
Establishing new connection.
dbug: Microsoft.AspNetCore.SignalR.HubConnectionHandler[5]
OnConnectedAsync started.
dbug: Microsoft.AspNetCore.Http.Connections.Internal.Transports.WebSocketsTransport[1]
Socket opened using Sub-Protocol: '(null)'.
trce: Microsoft.AspNetCore.Http.Connections.Internal.Transports.WebSocketsTransport[9]
Message received. Type: Text, size: 32, EndOfMessage: True.
dbug: Microsoft.AspNetCore.SignalR.Internal.DefaultHubProtocolResolver[2]
Found protocol implementation for requested protocol: json.
dbug: Microsoft.AspNetCore.SignalR.HubConnectionContext[1]
Completed connection handshake. Using HubProtocol 'json'.
trce: Microsoft.AspNetCore.Http.Connections.Internal.Transports.WebSocketsTransport[11]
Sending payload: 3 bytes.
trce: Microsoft.AspNetCore.Http.Connections.Internal.Transports.WebSocketsTransport[9]
Message received. Type: Text, size: 11, EndOfMessage: True.
trce: Microsoft.AspNetCore.Http.Connections.Internal.Transports.WebSocketsTransport[9]
Message received. Type: Text, size: 11, EndOfMessage: True.
dbug: Microsoft.AspNetCore.Http.Connections.Internal.Transports.WebSocketsTransport[4]
Waiting for the application to finish sending data.
dbug: Microsoft.AspNetCore.Http.Connections.Internal.Transports.WebSocketsTransport[2]
Socket closed.
trce: Microsoft.AspNetCore.Http.Connections.Internal.HttpConnectionContext[1]
Disposing connection T2NKQg0jyrm7QAPz4p0ZWA.
trce: Microsoft.AspNetCore.Http.Connections.Internal.HttpConnectionContext[2]
Waiting for application to complete.
dbug: Microsoft.AspNetCore.SignalR.HubConnectionHandler[6]
OnConnectedAsync ending.
trce: Microsoft.AspNetCore.Http.Connections.Internal.HttpConnectionContext[3]
Application complete.
dbug: Microsoft.AspNetCore.Http.Connections.Internal.HttpConnectionManager[2]
Removing connection T2NKQg0jyrm7QAPz4p0ZWA from the list of connections.
dbug: Microsoft.AspNetCore.Http.Connections.Internal.HttpConnectionManager[1]
New connection JW_1AnoGvhvNJ6MdWGb5RA created.
dbug: Microsoft.AspNetCore.Http.Connections.Internal.HttpConnectionDispatcher[4]
Establishing new connection.
dbug: Microsoft.AspNetCore.SignalR.HubConnectionHandler[5]
OnConnectedAsync started.
dbug: Microsoft.AspNetCore.Http.Connections.Internal.Transports.WebSocketsTransport[1]
Socket opened using Sub-Protocol: '(null)'.
trce: Microsoft.AspNetCore.Http.Connections.Internal.Transports.WebSocketsTransport[9]
Message received. Type: Text, size: 32, EndOfMessage: True.
dbug: Microsoft.AspNetCore.SignalR.Internal.DefaultHubProtocolResolver[2]
Found protocol implementation for requested protocol: json.
dbug: Microsoft.AspNetCore.SignalR.HubConnectionContext[1]
Completed connection handshake. Using HubProtocol 'json'.
trce: Microsoft.AspNetCore.Http.Connections.Internal.Transports.WebSocketsTransport[11]
Sending payload: 3 bytes.
trce: Microsoft.AspNetCore.Http.Connections.Internal.Transports.WebSocketsTransport[9]
Message received. Type: Text, size: 11, EndOfMessage: True.
And the client logs:
Microsoft.AspNetCore.Http.Connections.Client.HttpConnection: Debug: Transport 'WebSockets' started.
Microsoft.AspNetCore.Http.Connections.Client.HttpConnection: Information: HttpConnection Started.
Microsoft.AspNetCore.SignalR.Client.HubConnection: Information: Using HubProtocol 'json v1'.
Microsoft.AspNetCore.SignalR.Client.HubConnection: Debug: Sending Hub Handshake.
Microsoft.AspNetCore.Http.Connections.Client.Internal.WebSocketsTransport: Debug: Received message from application. Payload size: 32.
Microsoft.AspNetCore.Http.Connections.Client.Internal.WebSocketsTransport: Debug: Message received. Type: Text, size: 3, EndOfMessage: True.
Microsoft.AspNetCore.SignalR.Client.HubConnection: Debug: Handshake with server complete.
Microsoft.AspNetCore.SignalR.Client.HubConnection: Debug: Receive loop starting.
Microsoft.AspNetCore.SignalR.Client.HubConnection: Debug: Sending PingMessage message.
Microsoft.AspNetCore.Http.Connections.Client.Internal.WebSocketsTransport: Debug: Received message from application. Payload size: 11.
Microsoft.AspNetCore.SignalR.Client.HubConnection: Debug: Sending PingMessage message completed.
Microsoft.AspNetCore.SignalR.Client.HubConnection: Information: HubConnection started.
Microsoft.AspNetCore.SignalR.Client.HubConnection: Trace: The HubConnection is attempting to transition from the Connecting state to the Connected state.
Microsoft.AspNetCore.SignalR.Client.HubConnection: Trace: Releasing Connection Lock in StartAsyncInner (/_/src/SignalR/clients/csharp/Client.Core/src/HubConnection.cs:280).
The thread 0x8184 has exited with code 0 (0x0).
Microsoft.AspNetCore.SignalR.Client.HubConnection: Trace: Acquired the Connection Lock in order to ping the server.
Microsoft.AspNetCore.SignalR.Client.HubConnection: Debug: Sending PingMessage message.
Microsoft.AspNetCore.SignalR.Client.HubConnection: Debug: Sending PingMessage message completed.
Microsoft.AspNetCore.SignalR.Client.HubConnection: Trace: Releasing Connection Lock in RunTimerActions (/_/src/SignalR/clients/csharp/Client.Core/src/HubConnection.cs:1881).
Microsoft.AspNetCore.Http.Connections.Client.Internal.WebSocketsTransport: Debug: Received message from application. Payload size: 11.
Microsoft.AspNetCore.SignalR.Client.HubConnection: Trace: Waiting on Connection Lock in HandleConnectionClose (/_/src/SignalR/clients/csharp/Client.Core/src/HubConnection.cs:1279).
Microsoft.AspNetCore.Http.Connections.Client.HttpConnection: Debug: Disposing HttpConnection.
Microsoft.AspNetCore.Http.Connections.Client.Internal.WebSocketsTransport: Information: Transport is stopping.
Microsoft.AspNetCore.Http.Connections.Client.Internal.WebSocketsTransport: Debug: Send loop stopped.
Microsoft.AspNetCore.Http.Connections.Client.Internal.WebSocketsTransport: Debug: Transport stopped.
Microsoft.AspNetCore.Http.Connections.Client.HttpConnection: Information: HttpConnection Disposed.
Microsoft.AspNetCore.SignalR.Client.HubConnection: Debug: Canceling all outstanding invocations.
Microsoft.AspNetCore.Http.Connections.Client.Internal.WebSocketsTransport: Debug: Receive loop canceled.
Microsoft.AspNetCore.Http.Connections.Client.Internal.WebSocketsTransport: Debug: Receive loop stopped.
Microsoft.AspNetCore.SignalR.Client.HubConnection: Trace: The HubConnection is attempting to transition from the Connected state to the Reconnecting state.
Microsoft.AspNetCore.SignalR.Client.HubConnection: Error: HubConnection reconnecting due to an error.
I won't include them here, but the logs when running under Service Fabric show that the server is correctly sending keep-alives to the client ("Sent a ping message to the client").
It might seem obvious that there is some difference in configuration between my console and Service Fabric hosts, but I've gone through it carefully and cannot see anything that would explain this. In fact, the SignalR integration differed only in that the development host enabled detailed errors, but even with that removed, the behavior remains the same.
Short of running my own build of ASP.NET Core (something I'm perhaps lazily attempting to avoid only because it was looking far from trivial to build), is there anything I might be missing that would explain this situation?
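For what it's worth, the server-side ping cadence can be pinned explicitly rather than left at the defaults, which at least removes them as a variable between the two hosts; a minimal sketch using the standard HubOptions in Startup.ConfigureServices (the values are illustrative, not my actual config):

services.AddSignalR(options =>
{
    // Default keep-alive is 15 seconds; clients consider the server
    // unreachable after roughly twice this interval of silence.
    options.KeepAliveInterval = TimeSpan.FromSeconds(15);
    options.ClientTimeoutInterval = TimeSpan.FromSeconds(30);
});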

SCDF yarn connection error

I have deployed Spring Cloud Data Flow on a virtual YARN cluster. Starting the server with ./bin/dataflow-server-yarn executes correctly and returns:
2016-11-02 10:31:59.786 INFO 42493 --- [ main] o.s.i.endpoint.EventDrivenConsumer : Adding {logging-channel-adapter:_org.springframework.integration.errorLogger} as a subscriber to the 'errorChannel' channel
2016-11-02 10:31:59.787 INFO 42493 --- [ main] o.s.i.channel.PublishSubscribeChannel : Channel 'spring-cloud-dataflow-server-yarn:9393.errorChannel' has 1 subscriber(s).
2016-11-02 10:31:59.787 INFO 42493 --- [ main] o.s.i.endpoint.EventDrivenConsumer : started _org.springframework.integration.errorLogger
2016-11-02 10:31:59.896 INFO 42493 --- [ main] s.b.c.e.t.TomcatEmbeddedServletContainer : Tomcat started on port(s): 9393 (http)
2016-11-02 10:31:59.901 INFO 42493 --- [ main] o.s.c.d.server.yarn.YarnDataFlowServer : Started YarnDataFlowServer in 16.026 seconds (JVM running for 16.485)
I can then start ./bin/dataflow-shell; from there I can import apps and create and list streams without errors. However, deploying the created stream fails, as shown below.
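For concreteness, the shell sequence is roughly the following (standard SCDF shell commands; the stream definition is inferred from the ticktock names in the log):
dataflow:>stream create --name ticktock --definition "time | log"
dataflow:>stream deploy --name ticktock
The deploy step then produces this connection error: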
2016-11-02 10:52:58.275 INFO 42788 --- [nio-9393-exec-8] o.s.c.deployer.spi.yarn.YarnAppDeployer : Deploy request for org.springframework.cloud.deployer.spi.core.AppDeploymentRequest#23d59aea
2016-11-02 10:52:58.275 INFO 42788 --- [nio-9393-exec-8] o.s.c.deployer.spi.yarn.YarnAppDeployer : Deploying request for definition [AppDefinition#3350878c name = 'time', properties = map['spring.cloud.stream.bindings.output.producer.requiredGroups' -> 'ticktock', 'spring.cloud.stream.bindings.output.destination' -> 'ticktock.time']]
2016-11-02 10:52:58.275 INFO 42788 --- [nio-9393-exec-8] o.s.c.deployer.spi.yarn.YarnAppDeployer : Parameters for definition {spring.cloud.stream.bindings.output.producer.requiredGroups=ticktock, spring.cloud.stream.bindings.output.destination=ticktock.time}
2016-11-02 10:52:58.275 INFO 42788 --- [nio-9393-exec-8] o.s.c.deployer.spi.yarn.YarnAppDeployer : Deployment properties for request {spring.cloud.deployer.group=ticktock}
2016-11-02 10:52:58.276 INFO 42788 --- [rTaskExecutor-1] o.s.s.support.LifecycleObjectSupport : started org.springframework.statemachine.support.DefaultStateMachineExecutor#6d68eeb7
2016-11-02 10:52:58.276 INFO 42788 --- [rTaskExecutor-1] o.s.s.support.LifecycleObjectSupport : started RESOLVEINSTANCE WAITINSTANCE STARTCLUSTER CREATECLUSTER PUSHARTIFACT STARTINSTANCE CHECKINSTANCE PUSHAPP CHECKAPP WAITCHOICE STARTINSTANCECHOICE PUSHAPPCHOICE / / uuid=d7e5224f-c5f0-47c9-b2c2-066b117cc786 / id=null
2016-11-02 10:52:58.276 INFO 42788 --- [rTaskExecutor-1] o.s.c.d.s.y.DefaultYarnCloudAppService : Cachekey STREAMnull found YarnCloudAppServiceApplication org.springframework.cloud.deployer.spi.yarn.YarnCloudAppServiceApplication#79158163
2016-11-02 10:52:58.280 INFO 42788 --- [rTaskExecutor-1] o.s.c.d.s.y.DefaultYarnCloudAppService : Cachekey STREAMnull found YarnCloudAppServiceApplication org.springframework.cloud.deployer.spi.yarn.YarnCloudAppServiceApplication#79158163
2016-11-02 10:52:59.281 INFO 42788 --- [rTaskExecutor-1] org.apache.hadoop.ipc.Client : Retrying connect to server: localhost/192.168.137.135:8032. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2016-11-02 10:53:00.282 INFO 42788 --- [rTaskExecutor-1] org.apache.hadoop.ipc.Client : Retrying connect to server: localhost/192.168.137.135:8032. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2016-11-02 10:53:01.283 INFO 42788 --- [rTaskExecutor-1] org.apache.hadoop.ipc.Client : Retrying connect to server: localhost/192.168.137.135:8032. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2016-11-02 10:53:02.283 INFO 42788 --- [rTaskExecutor-1] org.apache.hadoop.ipc.Client : Retrying connect to server: localhost/192.168.137.135:8032. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2016-11-02 10:53:03.284 INFO 42788 --- [rTaskExecutor-1] org.apache.hadoop.ipc.Client : Retrying connect to server: localhost/192.168.137.135:8032. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2016-11-02 10:53:04.285 INFO 42788 --- [rTaskExecutor-1] org.apache.hadoop.ipc.Client : Retrying connect to server: localhost/192.168.137.135:8032. Already tried 5 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2016-11-02 10:53:05.286 INFO 42788 --- [rTaskExecutor-1] org.apache.hadoop.ipc.Client : Retrying connect to server: localhost/192.168.137.135:8032. Already tried 6 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2016-11-02 10:53:06.287 INFO 42788 --- [rTaskExecutor-1] org.apache.hadoop.ipc.Client : Retrying connect to server: localhost/192.168.137.135:8032. Already tried 7 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2016-11-02 10:53:07.288 INFO 42788 --- [rTaskExecutor-1] org.apache.hadoop.ipc.Client : Retrying connect to server: localhost/192.168.137.135:8032. Already tried 8 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2016-11-02 10:53:08.289 INFO 42788 --- [rTaskExecutor-1] org.apache.hadoop.ipc.Client : Retrying connect to server: localhost/192.168.137.135:8032. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2016-11-02 10:53:08.290 INFO 42788 --- [rTaskExecutor-1] o.s.c.d.s.y.DefaultYarnCloudAppService : Cachekey STREAMapp--spring.yarn.appName=scdstream:app:ticktock,--spring.yarn.client.launchcontext.arguments.--spring.cloud.deployer.yarn.appmaster.artifact=/dataflow//artifacts/cache/ found YarnCloudAppServiceApplication org.springframework.cloud.deployer.spi.yarn.YarnCloudAppServiceApplication#7d3bc3f0
2016-11-02 10:53:09.294 INFO 42788 --- [rTaskExecutor-1] org.apache.hadoop.ipc.Client : Retrying connect to server: localhost/192.168.137.135:8032. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2016-11-02 10:53:10.295 INFO 42788 --- [rTaskExecutor-1] org.apache.hadoop.ipc.Client : Retrying connect to server: localhost/192.168.137.135:8032. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2016-11-02 10:53:11.296 INFO 42788 --- [rTaskExecutor-1] org.apache.hadoop.ipc.Client : Retrying connect to server: localhost/192.168.137.135:8032. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2016-11-02 10:53:12.297 INFO 42788 --- [rTaskExecutor-1] org.apache.hadoop.ipc.Client : Retrying connect to server: localhost/192.168.137.135:8032. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2016-11-02 10:53:13.297 INFO 42788 --- [rTaskExecutor-1] org.apache.hadoop.ipc.Client : Retrying connect to server: localhost/192.168.137.135:8032. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2016-11-02 10:53:14.298 INFO 42788 --- [rTaskExecutor-1] org.apache.hadoop.ipc.Client : Retrying connect to server: localhost/192.168.137.135:8032. Already tried 5 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2016-11-02 10:53:15.299 INFO 42788 --- [rTaskExecutor-1] org.apache.hadoop.ipc.Client : Retrying connect to server: localhost/192.168.137.135:8032. Already tried 6 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2016-11-02 10:53:16.300 INFO 42788 --- [rTaskExecutor-1] org.apache.hadoop.ipc.Client : Retrying connect to server: localhost/192.168.137.135:8032. Already tried 7 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2016-11-02 10:53:17.301 INFO 42788 --- [rTaskExecutor-1] org.apache.hadoop.ipc.Client : Retrying connect to server: localhost/192.168.137.135:8032. Already tried 8 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2016-11-02 10:53:18.302 INFO 42788 --- [rTaskExecutor-1] org.apache.hadoop.ipc.Client : Retrying connect to server: localhost/192.168.137.135:8032. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2016-11-02 10:53:18.361 INFO 42788 --- [rTaskExecutor-1] org.apache.hadoop.fs.TrashPolicyDefault : Namenode trash configuration: Deletion interval = 0 minutes, Emptier interval = 0 minutes.
2016-11-02 10:53:18.747 INFO 42788 --- [rTaskExecutor-1] o.s.s.support.LifecycleObjectSupport : stopped org.springframework.statemachine.support.DefaultStateMachineExecutor#6d68eeb7
2016-11-02 10:53:18.747 INFO 42788 --- [rTaskExecutor-1] o.s.s.support.LifecycleObjectSupport : stopped RESOLVEINSTANCE WAITINSTANCE STARTCLUSTER CREATECLUSTER PUSHARTIFACT STARTINSTANCE CHECKINSTANCE PUSHAPP CHECKAPP WAITCHOICE STARTINSTANCECHOICE PUSHAPPCHOICE / / uuid=d7e5224f-c5f0-47c9-b2c2-066b117cc786 / id=null
2016-11-02 10:53:18.747 ERROR 42788 --- [rTaskExecutor-1] o.s.c.d.s.y.AbstractDeployerStateMachine : Passing through error state DefaultStateContext [stage=STATE_ENTRY, message=GenericMessage [payload=DEPLOY, headers={artifact=org.springframework.cloud.stream.app:time-source-kafka:jar:1.0.0.BUILD-SNAPSHOT, appVersion=app, groupId=ticktock, definitionParameters={spring.cloud.stream.bindings.output.producer.requiredGroups=ticktock, spring.cloud.stream.bindings.output.destination=ticktock.time}, count=1, clusterId=ticktock:time, id=f0f7b54e-e59d-69c0-a405-f0907fa46343, contextRunArgs=[--spring.yarn.appName=scdstream:app:ticktock, --spring.yarn.client.launchcontext.arguments.--spring.cloud.deployer.yarn.appmaster.artifact=/dataflow//artifacts/cache/], artifactDir=/dataflow//artifacts/cache/, timestamp=1478083978275}], messageHeaders={artifact=org.springframework.cloud.stream.app:time-source-kafka:jar:1.0.0.BUILD-SNAPSHOT, appVersion=app, groupId=ticktock, definitionParameters={spring.cloud.stream.bindings.output.producer.requiredGroups=ticktock, spring.cloud.stream.bindings.output.destination=ticktock.time}, count=1, clusterId=ticktock:time, id=f0f7b54e-e59d-69c0-a405-f0907fa46343, contextRunArgs=[--spring.yarn.appName=scdstream:app:ticktock, --spring.yarn.client.launchcontext.arguments.--spring.cloud.deployer.yarn.appmaster.artifact=/dataflow//artifacts/cache/], artifactDir=/dataflow//artifacts/cache/, timestamp=1478083978275}, extendedState=DefaultExtendedState [variables={artifact=org.springframework.cloud.stream.app:time-source-kafka:jar:1.0.0.BUILD-SNAPSHOT, appVersion=app, appname=scdstream:app:ticktock, definitionParameters={spring.cloud.stream.bindings.output.producer.requiredGroups=ticktock, spring.cloud.stream.bindings.output.destination=ticktock.time}, count=1, messageId=f0f7b54e-e59d-69c0-a405-f0907fa46343, clusterId=ticktock:time, error=org.springframework.yarn.YarnSystemException: Call From master/127.0.1.1 to localhost:8032 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused; nested exception is java.net.ConnectException: Call From master/127.0.1.1 to localhost:8032 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused}], transition=org.springframework.statemachine.transition.DefaultExternalTransition#77c2bbfc, stateMachine=UNDEPLOYMODULE DESTROYCLUSTER STOPCLUSTER DEPLOYMODULE RESOLVEINSTANCE WAITINSTANCE STARTCLUSTER CREATECLUSTER PUSHARTIFACT STARTINSTANCE CHECKINSTANCE PUSHAPP CHECKAPP WAITCHOICE STARTINSTANCECHOICE PUSHAPPCHOICE ERROR READY ERROR_JUNCTION UNDEPLOYEXIT DEPLOYEXIT / ERROR / uuid=436b73fe-991a-4c22-a418-334f895d41e5 / id=null, source=null, target=null, sources=null, targets=null, exception=null]
2016-11-02 10:53:18.748 WARN 42788 --- [nio-9393-exec-8] o.s.c.d.s.c.StreamDeploymentController : Exception when deploying the app StreamAppDefinition [streamName=ticktock, name=time, registeredAppName=time, properties={spring.cloud.stream.bindings.output.producer.requiredGroups=ticktock, spring.cloud.stream.bindings.output.destination=ticktock.time}]: java.util.concurrent.ExecutionException: org.springframework.yarn.YarnSystemException: Call From master/127.0.1.1 to localhost:8032 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused; nested exception is java.net.ConnectException: Call From master/127.0.1.1 to localhost:8032 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
Changing the IP address to localhost yielded the same results.
Here is my server.yml:
logging:
  level:
    org.apache.hadoop: INFO
    org.springframework.yarn: INFO

maven:
  remoteRepositories:
    springRepo:
      url: https://repo.spring.io/libs-snapshot

spring:
  main:
    show_banner: false
  # Configured for Hadoop single-node running on localhost. Replace with property values reflecting your
  # actual Hadoop cluster when running in a distributed environment.
  hadoop:
    fsUri: hdfs://192.168.137.135:8020
    resourceManagerHost: 192.168.137.135
    resourceManagerPort: 8032
    resourceManagerSchedulerAddress: 192.168.137.135:8030
  # Configured for Redis running on localhost. Replace at least host property when running in a
  # distributed environment.
  redis:
    port: 6379
    host: 192.168.137.135
  # Configured for an embedded in-memory H2 database. Replace the datasource configuration with properties
  # matching your preferred database to be used instead, if needed, or when running in a distributed environment.
  #rabbitmq:
  #  addresses: localhost:5672
  # for default embedded database
  datasource:
    url: jdbc:h2:tcp://localhost:19092/mem:dataflow
    username: sa
    password:
    driverClassName: org.h2.Driver
  # # for mysql/mariadb datasource
  # datasource:
  #   url: jdbc:mysql://localhost:3306/yourDB
  #   username: yourUsername
  #   password: yourPassword
  #   driverClassName: org.mariadb.jdbc.Driver
  # # for postgresql datasource
  # datasource:
  #   url: jdbc:postgresql://localhost:5432/yourDB
  #   username: yourUsername
  #   password: yourPassword
  #   driverClassName: org.postgresql.Driver
  cloud:
    stream:
      kafka:
        binder:
          brokers: localhost:9093
          zkNodes: localhost:2181
    config:
      enabled: false
      server:
        bootstrap: true
    deployer:
      yarn:
        app:
          baseDir: /dataflow
          streamappmaster:
            memory: 512m
            virtualCores: 1
            javaOpts: "-Xms512m -Xmx512m"
          streamcontainer:
            priority: 5
            memory: 256m
            virtualCores: 1
            javaOpts: "-Xms64m -Xmx256m"
          taskappmaster:
            memory: 512m
            virtualCores: 1
            javaOpts: "-Xms512m -Xmx512m"
          taskcontainer:
            priority: 10
            memory: 256m
            virtualCores: 1
            javaOpts: "-Xms64m -Xmx256m"
  # yarn:
  #   hostdiscovery:
  #     pointToPoint: false
  #     loopback: false
  #     preferInterface: ['eth', 'en']
  #     matchIpv4: 192.168.0.0/24
  #     matchInterface: eth\\d*
I found a workaround; it seems the port was the problem. I changed the port number in yarn-site.xml, made the matching change in server.yml, and voilà!
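The retries against localhost/192.168.137.135:8032 suggest the client was resolving the ResourceManager to an address it could not reach, so the relevant yarn-site.xml entries are presumably these (a sketch using this cluster's addresses; adjust the host and port to whatever you settled on):
<property>
  <name>yarn.resourcemanager.address</name>
  <value>192.168.137.135:8032</value>
</property>
<property>
  <name>yarn.resourcemanager.scheduler.address</name>
  <value>192.168.137.135:8030</value>
</property>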

Grinder agent fails to report data to console through ssh tunnel

I'm using Vagrant ssh to connect to a remote Grinder agent while running the console locally:
vagrant ssh agent01 -c "./startAgent.sh" -- -R 6372:localhost:6372
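Everything after the -- is handed straight to ssh, so this is roughly equivalent to the following (host details illustrative):
ssh vagrant@agent01 -R 6372:localhost:6372 "./startAgent.sh"
That is, the agent connects to port 6372 on the VM, and ssh tunnels that back to the console listening on 6372 locally.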
The console can talk to the agent, and start the agent's threads:
2015-07-24 10:12:54,391 INFO agent: The Grinder 3.11
2015-07-24 10:12:54,507 INFO agent: connected to console at /127.0.0.1:6372
2015-07-24 10:12:54,507 INFO agent: waiting for console signal
2015-07-24 10:12:57,869 INFO agent: received a start message
2015-07-24 10:12:57,887 INFO agent: Worker process command line: java '-javaagent:grinder-3.11/lib/grinder-dcr-agent-3.11.jar' -classpath 'grinder-3.11/lib/grinder.jar' net.grinder.engine.process.WorkerProcessEntryPoint
2015-07-24 10:12:57,967 INFO agent: worker agent01-0 started
2015-07-24 10:12:57,976 INFO agent: worker agent01-1 started
2015-07-24 10:12:58,007 INFO agent: worker agent01-2 started
2015-07-24 10:12:58,032 INFO agent: worker agent01-3 started
2015-07-24 10:12:58,059 INFO agent: worker agent01-4 started
2015-07-24 10:12:58,095 INFO agent: worker agent01-5 started
2015-07-24 10:12:58,156 INFO agent: worker agent01-6 started
2015-07-24 10:12:58,182 INFO agent: worker agent01-7 started
2015-07-24 10:12:58,214 INFO agent: worker agent01-8 started
2015-07-24 10:12:58,250 INFO agent: worker agent01-9 started
Shortly thereafter, though, the agents say:
2015-07-24 10:13:38,016 INFO agent01-2: Report to console failed
net.grinder.communication.CommunicationException: Exception whilst sending message
at net.grinder.communication.AbstractSender.send(AbstractSender.java:57) ~[grinder-core-3.11.jar:na]
at net.grinder.communication.QueuedSenderDecorator.flush(QueuedSenderDecorator.java:60) ~[grinder-core-3.11.jar:na]
at net.grinder.engine.process.GrinderProcess.sendStatusMessage(GrinderProcess.java:638) [grinder-core-3.11.jar:na]
at net.grinder.engine.process.GrinderProcess.access$1100(GrinderProcess.java:110) [grinder-core-3.11.jar:na]
at net.grinder.engine.process.GrinderProcess$ReportToConsoleTimerTask.run(GrinderProcess.java:615) ~[grinder-core-3.11.jar:na]
at net.grinder.engine.process.GrinderProcess.run(GrinderProcess.java:465) [grinder-core-3.11.jar:na]
at net.grinder.engine.process.WorkerProcessEntryPoint.run(WorkerProcessEntryPoint.java:86) [grinder-core-3.11.jar:na]
at net.grinder.engine.process.WorkerProcessEntryPoint.main(WorkerProcessEntryPoint.java:59) [grinder-core-3.11.jar:na]
Caused by: java.net.SocketException: Broken pipe
at java.net.SocketOutputStream.socketWrite0(Native Method) ~[na:1.7.0_75]
at java.net.SocketOutputStream.socketWrite(Unknown Source) ~[na:1.7.0_75]
at java.net.SocketOutputStream.write(Unknown Source) ~[na:1.7.0_75]
...
and never provide data to the console.
What causes this? Is there ssh configuration I need to tweak?
To fix this, I added this line to /etc/ssh/sshd_config on the grinder agent host:
MaxStartups 1000
and changed the grinder agent properties from:
grinder.processes = 10
grinder.threads = 100
to:
grinder.processes = 4
grinder.threads = 250
Both configurations simulate 1,000 threads in total, but fewer worker processes means fewer simultaneous connections back through the ssh tunnel.

Configuration of CloudAMQP Connection

I'm having difficulty configuring my connection to CloudAMQP in my deployed Grails application. I can run the application locally against a locally installed RabbitMQ instance, but I can't figure out how to configure the application to run on CloudBees using the CloudAMQP service.
In my Config.groovy, I'm defining my connection info and a queue:
rabbitmq {
connectionfactory {
username = 'USERNAME'
password = 'PASSWORD'
hostname = 'lemur.cloudamqp.com'
}
queues = {
testQueue autoDelete: false, durable: false, exclusive: false
}
}
When the application starts and tries to connect, I see the following log messages:
2013-08-23 21:29:59,195 [main] DEBUG listener.SimpleMessageListenerContainer - Starting Rabbit listener container.
2013-08-23 21:29:59,205 [SimpleAsyncTaskExecutor-1] DEBUG listener.BlockingQueueConsumer - Starting consumer Consumer: tag=[null], channel=null, acknowledgeMode=AUTO local queue size=0
2013-08-23 21:30:08,405 [SimpleAsyncTaskExecutor-1] WARN listener.SimpleMessageListenerContainer - Consumer raised exception, processing can restart if the connection factory supports it
org.springframework.amqp.AmqpIOException: java.io.IOException
at org.springframework.amqp.rabbit.connection.RabbitUtils.convertRabbitAccessException(RabbitUtils.java:112)
at org.springframework.amqp.rabbit.connection.AbstractConnectionFactory.createBareConnection(AbstractConnectionFactory.java:163)
at org.springframework.amqp.rabbit.connection.CachingConnectionFactory.createConnection(CachingConnectionFactory.java:228)
at org.springframework.amqp.rabbit.connection.ConnectionFactoryUtils$1.createConnection(ConnectionFactoryUtils.java:119)
at org.springframework.amqp.rabbit.connection.ConnectionFactoryUtils.doGetTransactionalResourceHolder(ConnectionFactoryUtils.java:163)
at org.springframework.amqp.rabbit.connection.ConnectionFactoryUtils.getTransactionalResourceHolder(ConnectionFactoryUtils.java:109)
at org.springframework.amqp.rabbit.listener.BlockingQueueConsumer.start(BlockingQueueConsumer.java:199)
at org.springframework.amqp.rabbit.listener.SimpleMessageListenerContainer$AsyncMessageProcessingConsumer.run(SimpleMessageListenerContainer.java:524)
at java.lang.Thread.run(Unknown Source)
Caused by: java.io.IOException
at com.rabbitmq.client.impl.AMQChannel.wrap(AMQChannel.java:106)
at com.rabbitmq.client.impl.AMQChannel.wrap(AMQChannel.java:102)
at com.rabbitmq.client.impl.AMQChannel.exnWrappingRpc(AMQChannel.java:124)
at com.rabbitmq.client.impl.AMQConnection.start(AMQConnection.java:381)
at com.rabbitmq.client.ConnectionFactory.newConnection(ConnectionFactory.java:516)
at com.rabbitmq.client.ConnectionFactory.newConnection(ConnectionFactory.java:545)
Caused by: com.rabbitmq.client.ShutdownSignalException: connection error; reason: java.net.SocketException: Connection reset
at com.rabbitmq.utility.ValueOrException.getValue(ValueOrException.java:67)
at com.rabbitmq.utility.BlockingValueOrException.uninterruptibleGetValue(BlockingValueOrException.java:33)
at com.rabbitmq.client.impl.AMQChannel$BlockingRpcContinuation.getReply(AMQChannel.java:343)
at com.rabbitmq.client.impl.AMQChannel.privateRpc(AMQChannel.java:216)
at com.rabbitmq.client.impl.AMQChannel.exnWrappingRpc(AMQChannel.java:118)
... 3 more
Caused by: java.net.SocketException: Connection reset
at com.rabbitmq.client.impl.Frame.readFrom(Frame.java:95)
at com.rabbitmq.client.impl.SocketFrameHandler.readFrame(SocketFrameHandler.java:131)
at com.rabbitmq.client.impl.AMQConnection$MainLoop.run(AMQConnection.java:508)
2013-08-23 21:30:08,406 [SimpleAsyncTaskExecutor-1] INFO listener.SimpleMessageListenerContainer - Restarting Consumer: tag=[null], channel=null, acknowledgeMode=AUTO local queue size=0
2013-08-23 21:30:08,406 [SimpleAsyncTaskExecutor-1] DEBUG listener.BlockingQueueConsumer - Closing Rabbit Channel: null
2013-08-23 21:30:08,407 [SimpleAsyncTaskExecutor-2] DEBUG listener.BlockingQueueConsumer - Starting consumer Consumer: tag=[null], channel=null, acknowledgeMode=AUTO local queue size=0
Aug 23, 2013 9:30:11 PM org.apache.catalina.core.ApplicationContext log
INFO: Initializing Spring FrameworkServlet 'grails'
Aug 23, 2013 9:30:11 PM org.apache.coyote.http11.Http11Protocol init
INFO: Initializing Coyote HTTP/1.1 on http-8634
Aug 23, 2013 9:30:11 PM org.apache.coyote.http11.Http11Protocol start
INFO: Starting Coyote HTTP/1.1 on http-8634
According to https://developer.cloudbees.com/bin/view/RUN/CloudAMQP, when you bind your CloudAMQP service to your app, some config params are provided following the pattern CLOUDAMQP_URL_; these are the values you need to put in your config files so they can be wired in when the app launches.
Make sure to specify the virtualHost for CloudAMQP connections. That worked for me.
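For illustration, one way to wire that in is to parse the bound CLOUDAMQP_URL (amqp://user:password@host/vhost) in Config.groovy instead of hard-coding credentials; a sketch, with the parsing and fallbacks entirely illustrative:
def amqpUri = new URI(System.getenv('CLOUDAMQP_URL') ?: 'amqp://guest:guest@localhost')
def amqpCreds = (amqpUri.userInfo ?: 'guest:guest').tokenize(':')
rabbitmq {
    connectionfactory {
        username = amqpCreds[0]
        password = amqpCreds[1]
        hostname = amqpUri.host
        // CloudAMQP shared plans use the username as the vhost; the URL
        // carries it in the path, so fall back to the username if absent.
        virtualHost = amqpUri.path ? amqpUri.path.substring(1) : amqpCreds[0]
    }
}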