SCDF yarn connection error - hadoop-yarn

I have deployed Spring Cloud Data Flow on a virtual YARN cluster. Starting the server with ./bin/dataflow-server-yarn executes correctly and returns:
2016-11-02 10:31:59.786 INFO 42493 --- [ main] o.s.i.endpoint.EventDrivenConsumer : Adding {logging-channel-adapter:_org.springframework.integration.errorLogger} as a subscriber to the 'errorChannel' channel
2016-11-02 10:31:59.787 INFO 42493 --- [ main] o.s.i.channel.PublishSubscribeChannel : Channel 'spring-cloud-dataflow-server-yarn:9393.errorChannel' has 1 subscriber(s).
2016-11-02 10:31:59.787 INFO 42493 --- [ main] o.s.i.endpoint.EventDrivenConsumer : started _org.springframework.integration.errorLogger
2016-11-02 10:31:59.896 INFO 42493 --- [ main] s.b.c.e.t.TomcatEmbeddedServletContainer : Tomcat started on port(s): 9393 (http)
2016-11-02 10:31:59.901 INFO 42493 --- [ main] o.s.c.d.server.yarn.YarnDataFlowServer : Started YarnDataFlowServer in 16.026 seconds (JVM running for 16.485)
I can then start ./bin/dataflow-shell; from there I can import apps and create and list streams without errors. However, if I try to deploy the created stream, the following connection error occurs:
2016-11-02 10:52:58.275 INFO 42788 --- [nio-9393-exec-8] o.s.c.deployer.spi.yarn.YarnAppDeployer : Deploy request for org.springframework.cloud.deployer.spi.core.AppDeploymentRequest@23d59aea
2016-11-02 10:52:58.275 INFO 42788 --- [nio-9393-exec-8] o.s.c.deployer.spi.yarn.YarnAppDeployer : Deploying request for definition [AppDefinition@3350878c name = 'time', properties = map['spring.cloud.stream.bindings.output.producer.requiredGroups' -> 'ticktock', 'spring.cloud.stream.bindings.output.destination' -> 'ticktock.time']]
2016-11-02 10:52:58.275 INFO 42788 --- [nio-9393-exec-8] o.s.c.deployer.spi.yarn.YarnAppDeployer : Parameters for definition {spring.cloud.stream.bindings.output.producer.requiredGroups=ticktock, spring.cloud.stream.bindings.output.destination=ticktock.time}
2016-11-02 10:52:58.275 INFO 42788 --- [nio-9393-exec-8] o.s.c.deployer.spi.yarn.YarnAppDeployer : Deployment properties for request {spring.cloud.deployer.group=ticktock}
2016-11-02 10:52:58.276 INFO 42788 --- [rTaskExecutor-1] o.s.s.support.LifecycleObjectSupport : started org.springframework.statemachine.support.DefaultStateMachineExecutor@6d68eeb7
2016-11-02 10:52:58.276 INFO 42788 --- [rTaskExecutor-1] o.s.s.support.LifecycleObjectSupport : started RESOLVEINSTANCE WAITINSTANCE STARTCLUSTER CREATECLUSTER PUSHARTIFACT STARTINSTANCE CHECKINSTANCE PUSHAPP CHECKAPP WAITCHOICE STARTINSTANCECHOICE PUSHAPPCHOICE / / uuid=d7e5224f-c5f0-47c9-b2c2-066b117cc786 / id=null
2016-11-02 10:52:58.276 INFO 42788 --- [rTaskExecutor-1] o.s.c.d.s.y.DefaultYarnCloudAppService : Cachekey STREAMnull found YarnCloudAppServiceApplication org.springframework.cloud.deployer.spi.yarn.YarnCloudAppServiceApplication@79158163
2016-11-02 10:52:58.280 INFO 42788 --- [rTaskExecutor-1] o.s.c.d.s.y.DefaultYarnCloudAppService : Cachekey STREAMnull found YarnCloudAppServiceApplication org.springframework.cloud.deployer.spi.yarn.YarnCloudAppServiceApplication@79158163
2016-11-02 10:52:59.281 INFO 42788 --- [rTaskExecutor-1] org.apache.hadoop.ipc.Client : Retrying connect to server: localhost/192.168.137.135:8032. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2016-11-02 10:53:00.282 INFO 42788 --- [rTaskExecutor-1] org.apache.hadoop.ipc.Client : Retrying connect to server: localhost/192.168.137.135:8032. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2016-11-02 10:53:01.283 INFO 42788 --- [rTaskExecutor-1] org.apache.hadoop.ipc.Client : Retrying connect to server: localhost/192.168.137.135:8032. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2016-11-02 10:53:02.283 INFO 42788 --- [rTaskExecutor-1] org.apache.hadoop.ipc.Client : Retrying connect to server: localhost/192.168.137.135:8032. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2016-11-02 10:53:03.284 INFO 42788 --- [rTaskExecutor-1] org.apache.hadoop.ipc.Client : Retrying connect to server: localhost/192.168.137.135:8032. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2016-11-02 10:53:04.285 INFO 42788 --- [rTaskExecutor-1] org.apache.hadoop.ipc.Client : Retrying connect to server: localhost/192.168.137.135:8032. Already tried 5 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2016-11-02 10:53:05.286 INFO 42788 --- [rTaskExecutor-1] org.apache.hadoop.ipc.Client : Retrying connect to server: localhost/192.168.137.135:8032. Already tried 6 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2016-11-02 10:53:06.287 INFO 42788 --- [rTaskExecutor-1] org.apache.hadoop.ipc.Client : Retrying connect to server: localhost/192.168.137.135:8032. Already tried 7 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2016-11-02 10:53:07.288 INFO 42788 --- [rTaskExecutor-1] org.apache.hadoop.ipc.Client : Retrying connect to server: localhost/192.168.137.135:8032. Already tried 8 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2016-11-02 10:53:08.289 INFO 42788 --- [rTaskExecutor-1] org.apache.hadoop.ipc.Client : Retrying connect to server: localhost/192.168.137.135:8032. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2016-11-02 10:53:08.290 INFO 42788 --- [rTaskExecutor-1] o.s.c.d.s.y.DefaultYarnCloudAppService : Cachekey STREAMapp--spring.yarn.appName=scdstream:app:ticktock,--spring.yarn.client.launchcontext.arguments.--spring.cloud.deployer.yarn.appmaster.artifact=/dataflow//artifacts/cache/ found YarnCloudAppServiceApplication org.springframework.cloud.deployer.spi.yarn.YarnCloudAppServiceApplication@7d3bc3f0
2016-11-02 10:53:09.294 INFO 42788 --- [rTaskExecutor-1] org.apache.hadoop.ipc.Client : Retrying connect to server: localhost/192.168.137.135:8032. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2016-11-02 10:53:10.295 INFO 42788 --- [rTaskExecutor-1] org.apache.hadoop.ipc.Client : Retrying connect to server: localhost/192.168.137.135:8032. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2016-11-02 10:53:11.296 INFO 42788 --- [rTaskExecutor-1] org.apache.hadoop.ipc.Client : Retrying connect to server: localhost/192.168.137.135:8032. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2016-11-02 10:53:12.297 INFO 42788 --- [rTaskExecutor-1] org.apache.hadoop.ipc.Client : Retrying connect to server: localhost/192.168.137.135:8032. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2016-11-02 10:53:13.297 INFO 42788 --- [rTaskExecutor-1] org.apache.hadoop.ipc.Client : Retrying connect to server: localhost/192.168.137.135:8032. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2016-11-02 10:53:14.298 INFO 42788 --- [rTaskExecutor-1] org.apache.hadoop.ipc.Client : Retrying connect to server: localhost/192.168.137.135:8032. Already tried 5 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2016-11-02 10:53:15.299 INFO 42788 --- [rTaskExecutor-1] org.apache.hadoop.ipc.Client : Retrying connect to server: localhost/192.168.137.135:8032. Already tried 6 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2016-11-02 10:53:16.300 INFO 42788 --- [rTaskExecutor-1] org.apache.hadoop.ipc.Client : Retrying connect to server: localhost/192.168.137.135:8032. Already tried 7 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2016-11-02 10:53:17.301 INFO 42788 --- [rTaskExecutor-1] org.apache.hadoop.ipc.Client : Retrying connect to server: localhost/192.168.137.135:8032. Already tried 8 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2016-11-02 10:53:18.302 INFO 42788 --- [rTaskExecutor-1] org.apache.hadoop.ipc.Client : Retrying connect to server: localhost/192.168.137.135:8032. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2016-11-02 10:53:18.361 INFO 42788 --- [rTaskExecutor-1] org.apache.hadoop.fs.TrashPolicyDefault : Namenode trash configuration: Deletion interval = 0 minutes, Emptier interval = 0 minutes.
2016-11-02 10:53:18.747 INFO 42788 --- [rTaskExecutor-1] o.s.s.support.LifecycleObjectSupport : stopped org.springframework.statemachine.support.DefaultStateMachineExecutor@6d68eeb7
2016-11-02 10:53:18.747 INFO 42788 --- [rTaskExecutor-1] o.s.s.support.LifecycleObjectSupport : stopped RESOLVEINSTANCE WAITINSTANCE STARTCLUSTER CREATECLUSTER PUSHARTIFACT STARTINSTANCE CHECKINSTANCE PUSHAPP CHECKAPP WAITCHOICE STARTINSTANCECHOICE PUSHAPPCHOICE / / uuid=d7e5224f-c5f0-47c9-b2c2-066b117cc786 / id=null
2016-11-02 10:53:18.747 ERROR 42788 --- [rTaskExecutor-1] o.s.c.d.s.y.AbstractDeployerStateMachine : Passing through error state DefaultStateContext [stage=STATE_ENTRY, message=GenericMessage [payload=DEPLOY, headers={artifact=org.springframework.cloud.stream.app:time-source-kafka:jar:1.0.0.BUILD-SNAPSHOT, appVersion=app, groupId=ticktock, definitionParameters={spring.cloud.stream.bindings.output.producer.requiredGroups=ticktock, spring.cloud.stream.bindings.output.destination=ticktock.time}, count=1, clusterId=ticktock:time, id=f0f7b54e-e59d-69c0-a405-f0907fa46343, contextRunArgs=[--spring.yarn.appName=scdstream:app:ticktock, --spring.yarn.client.launchcontext.arguments.--spring.cloud.deployer.yarn.appmaster.artifact=/dataflow//artifacts/cache/], artifactDir=/dataflow//artifacts/cache/, timestamp=1478083978275}], messageHeaders={artifact=org.springframework.cloud.stream.app:time-source-kafka:jar:1.0.0.BUILD-SNAPSHOT, appVersion=app, groupId=ticktock, definitionParameters={spring.cloud.stream.bindings.output.producer.requiredGroups=ticktock, spring.cloud.stream.bindings.output.destination=ticktock.time}, count=1, clusterId=ticktock:time, id=f0f7b54e-e59d-69c0-a405-f0907fa46343, contextRunArgs=[--spring.yarn.appName=scdstream:app:ticktock, --spring.yarn.client.launchcontext.arguments.--spring.cloud.deployer.yarn.appmaster.artifact=/dataflow//artifacts/cache/], artifactDir=/dataflow//artifacts/cache/, timestamp=1478083978275}, extendedState=DefaultExtendedState [variables={artifact=org.springframework.cloud.stream.app:time-source-kafka:jar:1.0.0.BUILD-SNAPSHOT, appVersion=app, appname=scdstream:app:ticktock, definitionParameters={spring.cloud.stream.bindings.output.producer.requiredGroups=ticktock, spring.cloud.stream.bindings.output.destination=ticktock.time}, count=1, messageId=f0f7b54e-e59d-69c0-a405-f0907fa46343, clusterId=ticktock:time, error=org.springframework.yarn.YarnSystemException: Call From master/127.0.1.1 to localhost:8032 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused; nested exception is java.net.ConnectException: Call From master/127.0.1.1 to localhost:8032 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused}], transition=org.springframework.statemachine.transition.DefaultExternalTransition@77c2bbfc, stateMachine=UNDEPLOYMODULE DESTROYCLUSTER STOPCLUSTER DEPLOYMODULE RESOLVEINSTANCE WAITINSTANCE STARTCLUSTER CREATECLUSTER PUSHARTIFACT STARTINSTANCE CHECKINSTANCE PUSHAPP CHECKAPP WAITCHOICE STARTINSTANCECHOICE PUSHAPPCHOICE ERROR READY ERROR_JUNCTION UNDEPLOYEXIT DEPLOYEXIT / ERROR / uuid=436b73fe-991a-4c22-a418-334f895d41e5 / id=null, source=null, target=null, sources=null, targets=null, exception=null]
2016-11-02 10:53:18.748 WARN 42788 --- [nio-9393-exec-8] o.s.c.d.s.c.StreamDeploymentController : Exception when deploying the app StreamAppDefinition [streamName=ticktock, name=time, registeredAppName=time, properties={spring.cloud.stream.bindings.output.producer.requiredGroups=ticktock, spring.cloud.stream.bindings.output.destination=ticktock.time}]: java.util.concurrent.ExecutionException: org.springframework.yarn.YarnSystemException: Call From master/127.0.1.1 to localhost:8032 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused; nested exception is java.net.ConnectException: Call From master/127.0.1.1 to localhost:8032 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
Changing the IP address to localhost yielded the same results.
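Before touching the config, it is worth confirming what the ResourceManager is actually bound to. A couple of quick checks (a sketch, assuming shell access on the YARN master node):
# Which interface is port 8032 bound to: 127.0.0.1 or the LAN address?
netstat -tlnp | grep 8032
# Does a plain YARN client reach the ResourceManager with the current HADOOP_CONF_DIR?
yarn node -list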
Here is my server.yml:
logging:
  level:
    org.apache.hadoop: INFO
    org.springframework.yarn: INFO
maven:
  remoteRepositories:
    springRepo:
      url: https://repo.spring.io/libs-snapshot
spring:
  main:
    show_banner: false
  # Configured for Hadoop single-node running on localhost. Replace with property values reflecting your
  # actual Hadoop cluster when running in a distributed environment.
  hadoop:
    fsUri: hdfs://192.168.137.135:8020
    resourceManagerHost: 192.168.137.135
    resourceManagerPort: 8032
    resourceManagerSchedulerAddress: 192.168.137.135:8030
  # Configured for Redis running on localhost. Replace at least host property when running in a
  # distributed environment.
  redis:
    port: 6379
    host: 192.168.137.135
  # Configured for an embedded in-memory H2 database. Replace the datasource configuration with properties
  # matching your preferred database to be used instead, if needed, or when running in a distributed environment.
  #rabbitmq:
  #  addresses: localhost:5672
  # for default embedded database
  datasource:
    url: jdbc:h2:tcp://localhost:19092/mem:dataflow
    username: sa
    password:
    driverClassName: org.h2.Driver
  # for mysql/mariadb datasource
  #datasource:
  #  url: jdbc:mysql://localhost:3306/yourDB
  #  username: yourUsername
  #  password: yourPassword
  #  driverClassName: org.mariadb.jdbc.Driver
  # for postgresql datasource
  #datasource:
  #  url: jdbc:postgresql://localhost:5432/yourDB
  #  username: yourUsername
  #  password: yourPassword
  #  driverClassName: org.postgresql.Driver
  cloud:
    stream:
      kafka:
        binder:
          brokers: localhost:9093
          zkNodes: localhost:2181
    config:
      enabled: false
      server:
        bootstrap: true
    deployer:
      yarn:
        app:
          baseDir: /dataflow
          streamappmaster:
            memory: 512m
            virtualCores: 1
            javaOpts: "-Xms512m -Xmx512m"
          streamcontainer:
            priority: 5
            memory: 256m
            virtualCores: 1
            javaOpts: "-Xms64m -Xmx256m"
          taskappmaster:
            memory: 512m
            virtualCores: 1
            javaOpts: "-Xms512m -Xmx512m"
          taskcontainer:
            priority: 10
            memory: 256m
            virtualCores: 1
            javaOpts: "-Xms64m -Xmx256m"
#  yarn:
#    hostdiscovery:
#      pointToPoint: false
#      loopback: false
#      preferInterface: ['eth', 'en']
#      matchIpv4: 192.168.0.0/24
#      matchInterface: eth\\d*

I found a workaround; the port was the problem. I changed the ResourceManager port number in yarn-site.xml, made the matching change in servers.yml, and voila!
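In other words, the two files just have to agree on the ResourceManager address. A minimal sketch of the yarn-site.xml side (these are standard YARN property names; the host and port values are illustrative and must match the spring.hadoop.* values above):
<property>
  <name>yarn.resourcemanager.hostname</name>
  <value>192.168.137.135</value>
</property>
<property>
  <name>yarn.resourcemanager.address</name>
  <value>192.168.137.135:8032</value>
</property>
<property>
  <name>yarn.resourcemanager.scheduler.address</name>
  <value>192.168.137.135:8030</value>
</property>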

Related

Connection handling

So I've been using Karate for a while now, and there is an issue we have been facing for over a year: org.apache.http.conn.ConnectTimeoutException
Other threads about that ConnectTimeoutException suggested it was solvable by specifying a proxy, but that did not help us.
After tons of investigation, it turned out that our Azure SNAT ports were exhausted, meaning Karate was opening far too many connections.
To verify this I enabled log debugging and used this feature
Background:
* url "https://www.karatelabs.io/"
Scenario:
* method GET
* method GET
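(For reference: the log debugging here means DEBUG on the Apache HttpClient internals. A logback-test.xml along these lines, where the file name and pattern are assumptions, produces output like the lines below.)
<configuration>
  <appender name="STDOUT" class="ch.qos.logback.core.ConsoleAppender">
    <encoder>
      <pattern>%d{HH:mm:ss.SSS} [%thread] %-5level %logger{36} - %msg%n</pattern>
    </encoder>
  </appender>
  <!-- DEBUG on the HttpClient internals surfaces pool leases, handshakes and shutdowns -->
  <logger name="org.apache.http" level="DEBUG"/>
  <root level="INFO">
    <appender-ref ref="STDOUT"/>
  </root>
</configuration>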
The logs then had the following lines:
13:10:17.868 [main] DEBUG com.intuit.karate - request:
1 > GET https://www.karatelabs.io/
1 > Host: www.karatelabs.io
1 > Connection: Keep-Alive
1 > User-Agent: Apache-HttpClient/4.5.13 (Java/17.0.4.1)
1 > Accept-Encoding: gzip,deflate
13:10:17.868 [main] DEBUG o.a.h.i.c.PoolingHttpClientConnectionManager - Connection request: [route: {s}->https://www.karatelabs.io:443][total available: 0; route allocated: 0 of 5; total allocated: 0 of 10]
13:10:17.874 [main] DEBUG o.a.h.i.c.PoolingHttpClientConnectionManager - Connection leased: [id: 0][route: {s}->https://www.karatelabs.io:443][total available: 0; route allocated: 1 of 5; total allocated: 1 of 10]
13:10:17.875 [main] DEBUG o.a.h.impl.execchain.MainClientExec - Opening connection {s}->https://www.karatelabs.io:443
13:10:17.883 [main] DEBUG o.a.h.i.c.DefaultHttpClientConnectionOperator - Connecting to www.karatelabs.io/34.149.87.45:443
13:10:17.883 [main] DEBUG o.a.h.c.s.SSLConnectionSocketFactory - Connecting socket to www.karatelabs.io/34.149.87.45:443 with timeout 30000
13:10:17.924 [main] DEBUG o.a.h.c.s.SSLConnectionSocketFactory - Enabled protocols: [TLSv1.3, TLSv1.2]
13:10:17.924 [main] DEBUG o.a.h.c.s.SSLConnectionSocketFactory - Enabled cipher suites:[...]
13:10:17.924 [main] DEBUG o.a.h.c.s.SSLConnectionSocketFactory - Starting handshake
13:10:18.012 [main] DEBUG o.a.h.c.s.SSLConnectionSocketFactory - Secure session established
13:10:18.012 [main] DEBUG o.a.h.c.s.SSLConnectionSocketFactory - negotiated protocol: TLSv1.3
13:10:18.012 [main] DEBUG o.a.h.c.s.SSLConnectionSocketFactory - negotiated cipher suite: TLS_AES_256_GCM_SHA384
13:10:18.012 [main] DEBUG o.a.h.c.s.SSLConnectionSocketFactory - peer principal: CN=karatelabs.io
13:10:18.012 [main] DEBUG o.a.h.c.s.SSLConnectionSocketFactory - peer alternative names: [karatelabs.io, www.karatelabs.io]
13:10:18.012 [main] DEBUG o.a.h.c.s.SSLConnectionSocketFactory - issuer principal: CN=Sectigo RSA Domain Validation Secure Server CA, O=Sectigo Limited, L=Salford, ST=Greater Manchester, C=GB
13:10:18.014 [main] DEBUG o.a.h.i.c.DefaultHttpClientConnectionOperator - Connection established localIp<->serverIp
13:10:18.015 [main] DEBUG o.a.h.i.c.DefaultManagedHttpClientConnection - http-outgoing-0: set socket timeout to 120000
13:10:18.015 [main] DEBUG o.a.h.impl.execchain.MainClientExec - Executing request GET / HTTP/1.1
...
13:10:18.066 [main] DEBUG o.a.h.impl.execchain.MainClientExec - Connection can be kept alive indefinitely
...
...
13:10:18.196 [main] DEBUG com.intuit.karate - request:
2 > GET https://www.karatelabs.io/
13:10:18.196 [main] DEBUG o.a.h.i.c.PoolingHttpClientConnectionManager - Connection request: [route: {s}->https://www.karatelabs.io:443][total available: 0; route allocated: 0 of 5; total allocated: 0 of 10]
13:10:18.196 [main] DEBUG o.a.h.i.c.PoolingHttpClientConnectionManager - Connection leased: [id: 1][route: {s}->https://www.karatelabs.io:443][total available: 0; route allocated: 1 of 5; total allocated: 1 of 10]
13:10:18.196 [main] DEBUG o.a.h.impl.execchain.MainClientExec - Opening connection {s}->https://www.karatelabs.io:443
13:10:18.196 [main] DEBUG o.a.h.i.c.DefaultHttpClientConnectionOperator - Connecting to www.karatelabs.io/34.149.87.45:443
13:10:18.196 [main] DEBUG o.a.h.c.s.SSLConnectionSocketFactory - Connecting socket to www.karatelabs.io/34.149.87.45:443 with timeout 30000
13:10:18.206 [main] DEBUG o.a.h.c.s.SSLConnectionSocketFactory - Enabled protocols: [TLSv1.3, TLSv1.2]
13:10:18.206 [main] DEBUG o.a.h.c.s.SSLConnectionSocketFactory - Enabled cipher suites:[...]
13:10:18.206 [main] DEBUG o.a.h.c.s.SSLConnectionSocketFactory - Starting handshake
13:10:18.236 [main] DEBUG o.a.h.c.s.SSLConnectionSocketFactory - Secure session established
13:10:18.236 [main] DEBUG o.a.h.c.s.SSLConnectionSocketFactory - negotiated protocol: TLSv1.3
13:10:18.236 [main] DEBUG o.a.h.c.s.SSLConnectionSocketFactory - negotiated cipher suite: TLS_AES_256_GCM_SHA384
13:10:18.236 [main] DEBUG o.a.h.c.s.SSLConnectionSocketFactory - peer principal: CN=karatelabs.io
13:10:18.236 [main] DEBUG o.a.h.c.s.SSLConnectionSocketFactory - peer alternative names: [karatelabs.io, www.karatelabs.io]
13:10:18.236 [main] DEBUG o.a.h.c.s.SSLConnectionSocketFactory - issuer principal: CN=Sectigo RSA Domain Validation Secure Server CA, O=Sectigo Limited, L=Salford, ST=Greater Manchester, C=GB
13:10:18.236 [main] DEBUG o.a.h.i.c.DefaultHttpClientConnectionOperator - Connection established localIp<->serverIp
13:10:18.236 [main] DEBUG o.a.h.i.c.DefaultManagedHttpClientConnection - http-outgoing-1: set socket timeout to 120000
...
13:10:18.279 [main] DEBUG o.a.h.impl.execchain.MainClientExec - Connection can be kept alive indefinitely
...
...
13:10:18.609 [Finalizer] DEBUG o.a.h.i.c.PoolingHttpClientConnectionManager - Connection manager is shutting down
13:10:18.610 [Finalizer] DEBUG o.a.h.i.c.DefaultManagedHttpClientConnection - http-outgoing-1: Shutdown connection
13:10:18.611 [Finalizer] DEBUG o.a.h.i.c.PoolingHttpClientConnectionManager - Connection manager shut down
13:10:18.612 [Finalizer] DEBUG o.a.h.i.c.PoolingHttpClientConnectionManager - Connection manager is shutting down
13:10:18.612 [Finalizer] DEBUG o.a.h.i.c.DefaultManagedHttpClientConnection - http-outgoing-2: Shutdown connection
13:10:18.612 [Finalizer] DEBUG o.a.h.i.c.PoolingHttpClientConnectionManager - Connection manager shut down
13:10:18.612 [Finalizer] DEBUG o.a.h.i.c.PoolingHttpClientConnectionManager - Connection manager is shutting down
"Connecting to socket" and "handshake" indicate that karate is establishing a new connection instead of using an already opened one, even though I am sending a request to the same host.
On the other hand, on longer scenarios, I was seeing "http-outgoing-x: Shutdown connection" after about ~1s from opening it, in the middle of the run, despite having "karate.configure('readTimeout', 120000)" specified.
I don't think that was intentional, especially after seeing the "keep-alive" header and the "Connection can be kept alive indefinitely" in the log"
That being said, is there any way to force karate to use the same connection instead of establishing a new one each request?
As far as we know, we use the Apache HTTP Client API the right way.
But you never know. The best thing is for you to dive into the code and see what we could be missing. Or you could provide a way to replicate following these instructions: https://github.com/karatelabs/karate/wiki/How-to-Submit-an-Issue

Let's Encrypt: SSL Certificate is valid for the Domain but not for a Specific Port [closed]

I am using a VPS (Amazon EC2) with Let's Encrypt as the SSL certificate provider (through Certbot).
I have seen a similar question, but the answer is not useful for my situation.
I have a domain api.example.com that is configured and fully functioning on an Ubuntu server. I used Certbot to configure the domain with HTTPS; however, I also have APIs configured to be accessed on a specific port of that domain, say 8443.
When I access api.example.com, I see the lock in the browser that says the site is secure, but whenever I try to access my API at api.example.com:8443/v1/someAPI, the API returns the appropriate result without being marked as secure. Because the main site is secure while the API location isn't, I am unable to make API calls, resulting in net::ERR_SSL_PROTOCOL_ERROR.
MyApplication.java:
package com.xxx.xxxx;

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.boot.autoconfigure.orm.jpa.HibernateJpaAutoConfiguration;

@SpringBootApplication(exclude = HibernateJpaAutoConfiguration.class)
public class MyApplication {

    public static void main(String[] args) {
        System.setProperty("file.encoding", "UTF-8");
        System.setProperty("server.port", "8080");
        SpringApplication.run(MyApplication.class, args);
    }
}
My application.properties:
# Database
db.driver: com.mysql.cj.jdbc.Driver
db.url: jdbc:mysql://123.123.123.123:123/ex?serverTimeZone=UTC&useSSL=false
db.username: xx
#db.password: xxx
db.password: xxxxxx
# Hibernate
hibernate.dialect: org.hibernate.dialect.MySQL5Dialect
hibernate.show_sql: false
hibernate.hbm2ddl.auto: validate
hibernate.format_sql = false
entitymanager.packagesToScan: com.example
# GZIP Server compression
server.compression.enabled: true
server.compression.min-response-size: 2048
server.compression.mime-types: application/json,application/xml,text/html,text/xml,text/plain
# File Path
file.path: /home/ec2-user/
file.report.path: /home/ec2-user/
jpa.repositories.enabled=false
multipart.enabled=true
multipart.max-file-size=50MB
multipart.max-request-size=50MB
spring.servlet.multipart.max-file-size=50MB
spring.servlet.multipart.max-request-size=50MB
# server base path
base.path: https://api.example.com:8443
# Origins to allow requests from
origins: *
#Error Page Configuration
server.error.whitelabel.enabled=false
spring.autoconfigure.exclude=org.springframework.boot.autoconfigure.web.ErrorMvcAutoConfiguration
reportUrl:https://example.com/report/
email=contact#example.com
emails=sales#contact#example.com
# SMTP Configuration
spring.mail.enabled=true
spring.mail.from=sales#contact#example.com
##Amazon SES SMTP config
spring.mail.host=email-smtp
spring.mail.username=fsdfskfjsldfjf
spring.mail.password=ffdfsfdsfdsfsdfdsf
spring.mail.port=123
eds.users: sales#marketsresearcher.biz
eds.host: smtp.gmail.com
eds.port: 123
eds.fromname==example
##SSL details
server.port:8443
security.require-ssl=true
server.ssl.key-store:classpath:abc.p12
server.ssl.key-store-password:abc
server.ssl.keyStoreType:PKCS12
server.ssl.keyAlias:abc
I have also added a rule for port 8443 in the AWS EC2 instance security group.
I am getting an error in the server log:
:: Spring Boot ::                (v2.4.1)
2021-08-28 15:47:04.463 INFO 4513 --- [ main] c.a.MarketResearcher.ApplicationWar : Starting ApplicationWar v0.0.1-SNAPSHOT using Java 1.8.0_302 on ip-172-31-17-203.ap-south-1.compute.internal with PID 4513 (/home/ec2-user/MarketResearcher-0.0.1-SNAPSHOT.jar started by root in /home/ec2-user)
2021-08-28 15:47:04.467 INFO 4513 --- [ main] c.a.MarketResearcher.ApplicationWar : The following profiles are active: prod
2021-08-28 15:47:06.924 INFO 4513 --- [ main] trationDelegate$BeanPostProcessorChecker : Bean 'org.springframework.ws.config.annotation.DelegatingWsConfiguration' of type [org.springframework.ws.config.annotation.DelegatingWsConfiguration$$EnhancerBySpringCGLIB$$b39d77f] is not eligible for getting processed by all BeanPostProcessors (for example: not eligible for auto-proxying)
2021-08-28 15:47:07.008 INFO 4513 --- [ main] .w.s.a.s.AnnotationActionEndpointMapping : Supporting [WS-Addressing August 2004, WS-Addressing 1.0]
2021-08-28 15:47:07.705 INFO 4513 --- [ main] o.s.b.w.embedded.tomcat.TomcatWebServer : Tomcat initialized with port(s): 8443 (https)
2021-08-28 15:47:07.729 INFO 4513 --- [ main] o.apache.catalina.core.StandardService : Starting service [Tomcat]
2021-08-28 15:47:07.730 INFO 4513 --- [ main] org.apache.catalina.core.StandardEngine : Starting Servlet engine: [Apache Tomcat/9.0.41]
2021-08-28 15:47:07.852 INFO 4513 --- [ main] o.a.c.c.C.[Tomcat].[localhost].[/] : Initializing Spring embedded WebApplicationContext
2021-08-28 15:47:07.852 INFO 4513 --- [ main] w.s.c.ServletWebServerApplicationContext : Root WebApplicationContext: initialization completed in 3164 ms
2021-08-28 15:47:08.432 INFO 4513 --- [ main] org.hibernate.Version : HHH000412: Hibernate ORM core version 5.4.25.Final
2021-08-28 15:47:08.894 INFO 4513 --- [ main] o.hibernate.annotations.common.Version : HCANN000001: Hibernate Commons Annotations {5.1.2.Final}
2021-08-28 15:47:09.462 INFO 4513 --- [ main] org.hibernate.dialect.Dialect : HHH000400: Using dialect: org.hibernate.dialect.MySQL5Dialect
2021-08-28 15:47:09.613 INFO 4513 --- [ main] o.h.e.boot.internal.EnversServiceImpl : Envers integration enabled? : true
2021-08-28 15:47:12.758 INFO 4513 --- [ main] o.h.e.t.j.p.i.JtaPlatformInitiator : HHH000490: Using JtaPlatform implementation: [org.hibernate.engine.transaction.jta.platform.internal.NoJtaPlatform]
2021-08-28 15:47:13.322 INFO 4513 --- [ main] o.s.s.concurrent.ThreadPoolTaskExecutor : Initializing ExecutorService 'applicationTaskExecutor'
2021-08-28 15:47:15.192 INFO 4513 --- [ main] o.s.b.w.embedded.tomcat.TomcatWebServer : Tomcat started on port(s): 8443 (https) with context path ''
2021-08-28 15:47:15.223 INFO 4513 --- [ main] c.a.MarketResearcher.ApplicationWar : Started ApplicationWar in 11.771 seconds (JVM running for 12.677)
2021-08-28 15:52:41.387 INFO 4513 --- [nio-8443-exec-6] o.a.c.c.C.[Tomcat].[localhost].[/] : Initializing Spring DispatcherServlet 'dispatcherServlet'
2021-08-28 15:52:41.388 INFO 4513 --- [nio-8443-exec-6] o.s.web.servlet.DispatcherServlet : Initializing Servlet 'dispatcherServlet'
2021-08-28 15:52:41.390 INFO 4513 --- [nio-8443-exec-6] o.s.web.servlet.DispatcherServlet : Completed initialization in 2 ms
2021-08-28 16:06:33.275 WARN 4513 --- [nio-8443-exec-4] org.hibernate.orm.deprecation : HHH90000022: Hibernate's legacy org.hibernate.Criteria API is deprecated; use the JPA javax.persistence.criteria.CriteriaQuery instead
2021-08-28 16:06:33.391 WARN 4513 --- [nio-8443-exec-4] org.hibernate.orm.deprecation : HHH90000022: Hibernate's legacy org.hibernate.Criteria API is deprecated; use the JPA javax.persistence.criteria.CriteriaQuery instead
2021-08-28 16:06:33.683 WARN 4513 --- [nio-8443-exec-4] org.hibernate.orm.deprecation : HHH90000022: Hibernate's legacy org.hibernate.Criteria API is deprecated; use the JPA javax.persistence.criteria.CriteriaQuery instead
2021-08-28 16:06:33.738 WARN 4513 --- [nio-8443-exec-1] org.hibernate.orm.deprecation : HHH90000022: Hibernate's legacy org.hibernate.Criteria API is deprecated; use the JPA javax.persistence.criteria.CriteriaQuery instead
2021-08-28 16:06:33.739 WARN 4513 --- [nio-8443-exec-3] org.hibernate.orm.deprecation : HHH90000022: Hibernate's legacy org.hibernate.Criteria API is deprecated; use the JPA javax.persistence.criteria.CriteriaQuery instead
2021-08-28 16:06:33.747 WARN 4513 --- [io-8443-exec-10] org.hibernate.orm.deprecation : HHH90000022: Hibernate's legacy org.hibernate.Criteria API is deprecated; use the JPA javax.persistence.criteria.CriteriaQuery instead
2021-08-28 16:06:33.845 WARN 4513 --- [nio-8443-exec-5] org.hibernate.orm.deprecation : HHH90000022: Hibernate's legacy org.hibernate.Criteria API is deprecated; use the JPA javax.persistence.criteria.CriteriaQuery instead
2021-08-28 16:06:33.866 WARN 4513 --- [nio-8443-exec-2] org.hibernate.orm.deprecation : HHH90000022: Hibernate's legacy org.hibernate.Criteria API is deprecated; use the JPA javax.persistence.criteria.CriteriaQuery instead
2021-08-28 16:06:34.021 WARN 4513 --- [nio-8443-exec-7] org.hibernate.orm.deprecation : HHH90000022: Hibernate's legacy org.hibernate.Criteria API is deprecated; use the JPA javax.persistence.criteria.CriteriaQuery instead
2021-08-29 19:08:38.141 INFO 4513 --- [nio-8443-exec-5] o.apache.coyote.http11.Http11Processor : Error parsing HTTP request header
Note: further occurrences of HTTP request parsing errors will be logged at DEBUG level.
java.lang.IllegalArgumentException: Invalid character found in the HTTP protocol [RTSP/1.00x0d0x0a0x0d...]
at org.apache.coyote.http11.Http11InputBuffer.parseRequestLine(Http11InputBuffer.java:559) ~[tomcat-embed-core-9.0.41.jar!/:9.0.41]
at org.apache.coyote.http11.Http11Processor.service(Http11Processor.java:261) ~[tomcat-embed-core-9.0.41.jar!/:9.0.41]
at org.apache.coyote.AbstractProcessorLight.process(AbstractProcessorLight.java:65) [tomcat-embed-core-9.0.41.jar!/:9.0.41]
at org.apache.coyote.AbstractProtocol$ConnectionHandler.process(AbstractProtocol.java:888) [tomcat-embed-core-9.0.41.jar!/:9.0.41]
at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1597) [tomcat-embed-core-9.0.41.jar!/:9.0.41]
at org.apache.tomcat.util.net.SocketProcessorBase.run(SocketProcessorBase.java:49) [tomcat-embed-core-9.0.41.jar!/:9.0.41]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [na:1.8.0_302]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [na:1.8.0_302]
at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61) [tomcat-embed-core-9.0.41.jar!/:9.0.41]
at java.lang.Thread.run(Thread.java:748) [na:1.8.0_302]
An SSL certificate is issued for a hostname, not for a port, so there is no separate certificate per port. When a user enters https://api.example.com, the browser translates it to api.example.com:443 and sends the request there; similarly, http://api.example.com is translated to api.example.com:80. The default port for HTTPS is 443 and for HTTP is 80.
In your case, the service listening on 8443 is evidently not presenting the valid certificate, which is why that port is flagged. Either configure the application on 8443 with the same certificate, or host everything on 443 and route the API internally.
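If you keep the application itself on 8443, one way to get the Let's Encrypt certificate into the PKCS12 keystore that Spring Boot reads is along these lines (a sketch: the paths assume Certbot's default layout, and the keystore name, alias and password are taken from the question):
openssl pkcs12 -export \
  -in /etc/letsencrypt/live/api.example.com/fullchain.pem \
  -inkey /etc/letsencrypt/live/api.example.com/privkey.pem \
  -out abc.p12 -name abc -passout pass:abc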
Create a virtual host for port 8443:
<VirtualHost *:8443>
    ServerName example.com
    ServerAlias www.example.com
    DocumentRoot /xxx/xxx/xxx
    <Directory /xxx/xxx/xxx>
        Require all granted
    </Directory>
    SSLEngine on
    SSLCertificateFile "/xxx/xxx/xxx/xxx.pem"
    SSLCertificateKeyFile "/xxx/xxx/xxx/xxx.pem"
    SSLCertificateChainFile "/xxx/xxx/xxx/xxx.pem"
</VirtualHost>
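Alternatively, following the earlier suggestion to host on 443 and route the API internally, a reverse-proxy sketch (this assumes mod_ssl and mod_proxy are enabled and that the application listens internally on plain HTTP on 8080):
<VirtualHost *:443>
    ServerName api.example.com
    SSLEngine on
    SSLCertificateFile "/etc/letsencrypt/live/api.example.com/fullchain.pem"
    SSLCertificateKeyFile "/etc/letsencrypt/live/api.example.com/privkey.pem"
    # Terminate TLS here and forward requests to the app
    ProxyPreserveHost On
    ProxyPass / http://127.0.0.1:8080/
    ProxyPassReverse / http://127.0.0.1:8080/
</VirtualHost>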

o.s.cloud.stream.binding.BindingService : Failed to create producer binding

I am trying to build a Spring Config client in a clustered environment, and to achieve this I am using Kafka to connect the clients. My client project works with a local Kafka server, but when I try to connect to my remote Kafka server, the application does not start up and throws the errors below:
2019-01-09 00:06:00.364 INFO 17892 --- [ main] o.a.kafka.common.utils.AppInfoParser : Kafka version : 2.0.1
2019-01-09 00:06:00.364 INFO 17892 --- [ main] o.a.kafka.common.utils.AppInfoParser : Kafka commitId : fa14705e51bd2ce5
2019-01-09 00:08:00.369 INFO 17892 --- [thread | client] o.a.k.c.a.i.AdminMetadataManager : [AdminClient clientId=client] Metadata update failed
org.apache.kafka.common.errors.TimeoutException: Timed out waiting to send the call.
2019-01-09 00:10:00.366 INFO 17892 --- [thread | client] o.a.k.c.a.i.AdminMetadataManager : [AdminClient clientId=client] Metadata update failed
org.apache.kafka.common.errors.TimeoutException: Timed out waiting to send the call.
2019-01-09 00:10:00.371 ERROR 17892 --- [ main] o.s.cloud.stream.binding.BindingService : Failed to create producer binding; retrying in 30 seconds
org.springframework.cloud.stream.provisioning.ProvisioningException: Provisioning exception; nested exception is java.util.concurrent.TimeoutException
In pom.xml:
<dependency>
    <groupId>org.springframework.cloud</groupId>
    <artifactId>spring-cloud-starter-bus-kafka</artifactId>
    <version>2.0.0.RELEASE</version>
</dependency>
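For context, the binder is pointed at the remote broker with properties along these lines (a sketch; the host name is a placeholder):
spring:
  cloud:
    stream:
      kafka:
        binder:
          brokers: remote-kafka-host:9092
          zkNodes: remote-kafka-host:2181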

Spring Cloud Config (Server) Monitor does not send events to clients via RabbitMQ

I'm using Spring Boot 2.1.0.RELEASE and Spring Cloud Dependencies Greenwich.M3. Consider the following projects, with the following dependencies:
https://github.com/dnijssen/configuration
https://github.com/dnijssen/configurationclient, dependencies:
spring-boot-starter-actuator
spring-boot-starter-web
spring-cloud-starter-bus-amqp
spring-cloud-starter-config
https://github.com/dnijssen/configurationserver, dependencies:
spring-boot-starter-actuator
spring-cloud-config-monitor
spring-cloud-starter-stream-rabbit
spring-cloud-config-server
So my configurationserver fetches the properties for the configurationclient from a Git repository (namely https://github.com/dnijssen/configuration). When a change is made, I trigger the /monitor endpoint on my configurationserver manually (just for testing), which is supposed to send an event via RabbitMQ, which I started up with the following Docker command:
docker run -d --hostname my-rabbit --name some-rabbit -p 5672:5672 -p 15672:15672 rabbitmq:3-management
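For reference, triggering the endpoint by hand looks roughly like this (a sketch mimicking a GitHub push webhook, which is the payload shape spring-cloud-config-monitor parses; the modified file name is illustrative):
curl -X POST http://localhost:8888/monitor \
  -H "Content-Type: application/json" \
  -H "X-Github-Event: push" \
  -d '{"commits": [{"modified": ["application.yml"]}]}'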
This gives me the following response: ["*"]
With the following console output on my configurationserver:
2018-11-22 09:19:48.527 INFO 19316 --- [nio-8888-exec-6] o.s.c.c.monitor.PropertyPathEndpoint : Refresh for: *
2018-11-22 09:19:48.543 INFO 19316 --- [nio-8888-exec-6] o.s.a.r.c.CachingConnectionFactory : Attempting to connect to: [localhost:5672]
2018-11-22 09:19:48.550 INFO 19316 --- [nio-8888-exec-6] o.s.a.r.c.CachingConnectionFactory : Created new connection: rabbitConnectionFactory.publisher#79a8318:0/SimpleConnection@34a9bacb [delegate=amqp://guest@127.0.0.1:5672/, localPort= 55205]
2018-11-22 09:19:48.553 INFO 19316 --- [nio-8888-exec-6] o.s.amqp.rabbit.core.RabbitAdmin : Auto-declaring a non-durable, auto-delete, or exclusive Queue (springCloudBusInput.anonymous.cT0DOhixRX6H82A1zBDF9g) durable:false, auto-delete:true, exclusive:true. It will be redeclared if the broker stops and is restarted while the connection factory is alive, but all messages will be lost.
2018-11-22 09:19:48.880 INFO 19316 --- [nio-8888-exec-6] trationDelegate$BeanPostProcessorChecker : Bean 'configurationPropertiesRebinderAutoConfiguration' of type [org.springframework.cloud.autoconfigure.ConfigurationPropertiesRebinderAutoConfiguration$$EnhancerBySpringCGLIB$$135b8961] is not eligible for getting processed by all BeanPostProcessors (for example: not eligible for auto-proxying)
2018-11-22 09:19:49.162 INFO 19316 --- [nio-8888-exec-6] o.s.boot.SpringApplication : No active profile set, falling back to default profiles: default
2018-11-22 09:19:49.165 INFO 19316 --- [nio-8888-exec-6] o.s.boot.SpringApplication : Started application in 0.6 seconds (JVM running for 29.712)
2018-11-22 09:19:49.228 INFO 19316 --- [nio-8888-exec-6] o.s.cloud.bus.event.RefreshListener : Received remote refresh request. Keys refreshed []
However, my configurationclient does not seem to receive any event, and is thus not refreshing itself.
What am I missing here?
Kind regards,
Dennis Nijssen

PGAgent : Caught unhandled unknown exception; terminating

I'm having trouble with constantly running jobs on a Postgres DB, which never finish.
I have tried the following actions to fix it:
apt-get update & upgrade (PostgreSQL was updated to the latest version)
/etc/init.d/postgresql restart
postgresql.service - PostgreSQL RDBMS
Loaded: loaded (/lib/systemd/system/postgresql.service; enabled)
Active: active (exited) since Fri 2015-10-16 10:06:04 UTC; 9s ago
Process: 6787 ExecStart=/bin/true (code=exited, status=0/SUCCESS)
Main PID: 6787 (code=exited, status=0/SUCCESS)
CGroup: /system.slice/postgresql.service
/etc/init.d/pgagent restart
pgagent.service - Postgres Job Agent Daemon
Loaded: loaded (/lib/systemd/system/pgagent.service; enabled)
Active: active (running) since Fri 2015-10-16 10:06:04 UTC; 1min 48s ago
Main PID: 6793 (pgagent)
CGroup: /system.slice/pgagent.service
└─6793 /usr/bin/pgagent -f -l 2 -s /var/log/pgagent hostaddr=localhost dbname=postgres user=postgresext
Oct 16 10:07:00 m-t-db-01 pgagent[6793]: *** Caught unhandled unknown exception; terminating
Oct 16 10:07:50 m-t-db-01 pgagent[6793]: *** Caught unhandled unknown exception; terminating
tried to enable debug mode on pgagent in /etc/default/pgagent:
EXTRA_OPTS="-f -l 2 -s /var/log/pgagent hostaddr=localhost dbname=postgres user=postgresext"
tried to reboot the machine
In the /var/log/pgagent log I see only:
ERROR: Failed to query jobs table!
DEBUG: Creating primary connection
DEBUG: Connection Information:
DEBUG: user : postgresext
DEBUG: port : 0
DEBUG: host : localhost
DEBUG: dbname : postgres
DEBUG: password :
DEBUG: conn timeout : 0
DEBUG: Connection Information:
DEBUG: user : postgresext
DEBUG: port : 0
DEBUG: host : localhost
DEBUG: dbname : postgres
DEBUG: password :
DEBUG: conn timeout : 0
DEBUG: Creating DB connection: user=postgresext host=localhost dbname=postgres
DEBUG: Database sanity check
DEBUG: Clearing zombies
DEBUG: Checking for jobs to run
DEBUG: Sleeping...
DEBUG: Clearing inactive connections
DEBUG: Connection stats: total - 1, free - 0, deleted - 0
DEBUG: Checking for jobs to run
DEBUG: Sleeping...
DEBUG: Creating primary connection
DEBUG: Connection Information:
DEBUG: user : postgresext
DEBUG: port : 0
DEBUG: host : localhost
DEBUG: dbname : postgres
DEBUG: password :
DEBUG: conn timeout : 0
DEBUG: Connection Information:
DEBUG: user : postgresext
DEBUG: port : 0
DEBUG: host : localhost
DEBUG: dbname : postgres
DEBUG: password :
DEBUG: conn timeout : 0
DEBUG: Creating DB connection: user=postgresext host=localhost dbname=postgres
DEBUG: Database sanity check
DEBUG: Clearing zombies
DEBUG: Checking for jobs to run
DEBUG: Sleeping...
DEBUG: Clearing inactive connections
DEBUG: Connection stats: total - 1, free - 0, deleted - 0
DEBUG: Checking for jobs to run
DEBUG: Sleeping...
DEBUG: Clearing inactive connections
DEBUG: Connection stats: total - 1, free - 0, deleted - 0
DEBUG: Checking for jobs to run
DEBUG: Creating job thread for job 8
DEBUG: Creating DB connection: user=postgresext host=localhost dbname=postgres
DEBUG: Allocating new connection to database postgres
DEBUG: Starting job: 8
DEBUG: Creating job thread for job 5
DEBUG: Creating DB connection: user=postgresext host=localhost dbname=postgres
DEBUG: Allocating new connection to database postgres
DEBUG: Starting job: 5
DEBUG: Creating DB connection: user=postgresext host=localhost dbname=postgres dbname=testdb
DEBUG: Sleeping...
DEBUG: Allocating new connection to database testdb
DEBUG: Executing SQL step 23 (part of job 8)
DEBUG: Creating DB connection: user=postgresext host=localhost dbname=postgres dbname=testdb
DEBUG: Allocating new connection to database testdb
DEBUG: Executing SQL step 15 (part of job 5)
DEBUG: Checking for jobs to run
DEBUG: Sleeping...
DEBUG: Clearing inactive connections
DEBUG: Connection stats: total - 5, free - 0, deleted - 0
DEBUG: Checking for jobs to run
DEBUG: Sleeping...
DEBUG: Destroying job thread for job 8
DEBUG: Clearing inactive connections
DEBUG: Connection stats: total - 5, free - 0, deleted - 0
DEBUG: Checking for jobs to run
DEBUG: Sleeping...
DEBUG: Clearing inactive connections
DEBUG: Connection stats: total - 5, free - 0, deleted - 0
DEBUG: Checking for jobs to run
DEBUG: Sleeping...
DEBUG: Clearing inactive connections
DEBUG: Connection stats: total - 5, free - 0, deleted - 0
DEBUG: Checking for jobs to run
DEBUG: Sleeping...
DEBUG: Clearing inactive connections
DEBUG: Connection stats: total - 5, free - 0, deleted - 0
DEBUG: Checking for jobs to run
DEBUG: Sleeping...
In /var/log/postgresql/postgresql-9.4-main.log I see only:
[unknown]@[unknown] LOG: incomplete startup packet
LOG: MultiXact member wraparound protections are now enabled
LOG: database system is ready to accept connections
LOG: autovacuum launcher started
postgresext@postgres LOG: could not receive data from client: Connection reset by peer
postgresext@testdb LOG: could not receive data from client: Connection reset by peer
postgresext@postgres LOG: could not receive data from client: Connection reset by peer
postgresext@postgres LOG: could not receive data from client: Connection reset by peer
postgresext@testdb LOG: could not receive data from client: Connection reset by peer
postgresext@postgres LOG: could not receive data from client: Connection reset by peer
postgresext@postgres LOG: could not receive data from client: Connection reset by peer
postgresext@testdb LOG: could not receive data from client: Connection reset by peer
postgresext@testdb LOG: could not receive data from client: Connection reset by peer
postgresext@testdb LOG: could not receive data from client: Connection reset by peer
postgresext@postgres LOG: could not receive data from client: Connection reset by peer
I cannot figure out what the actual problem is or how to fix it.
One more thing I have investigated: pgagent[6793]: *** Caught unhandled unknown exception; terminating can be caused by incorrect connection termination...
A little bit late answering this, but I ran into the same issues when using pgAgent, and it caused nothing but frustration. I even tried to dig into the source to fix it, but just trying to get it to compile on my server was a total nightmare due to the dependencies it requires (for no good reason).
So with that in mind, I decided to write a new job scheduler for Postgres as a drop-in replacement for pgAgent. It's called jpgAgent, and I've been using it at my company for a couple of months now with no issues at all. Things are much more stable now that we're using jpgAgent: we don't have to worry about the agent randomly stopping on us and jobs not getting run as they should, whereas previously that was a common problem.
Anyway, hope it helps you out.