opentelemetry Connection refused: localhost/0:0:0:0:0:0:0:1:4317 - jaeger

I am using OpenTelemetry for tracing; below is the command I run, but I am getting the error that follows it.
Can anyone suggest what I am doing wrong here:
java -Dotel.traces.exporter=jaeger -Dotel.exporter.jaeger.endpoint=host:14250 -Dotel.resource.attributes=service.name=app-name \
-javaagent:./opentelemetry-javaagent-all.jar -jar app-1.0.0.jar
[opentelemetry.auto.trace 2021-03-17 12:41:19:593 +0530] [IntervalMetricReader-1] WARN io.opentelemetry.exporter.otlp.metrics.OtlpGrpcMetricExporter - Failed to export metrics
io.grpc.StatusRuntimeException: UNAVAILABLE: io exception
at io.grpc.Status.asRuntimeException(Status.java:534)
at io.grpc.stub.ClientCalls$UnaryStreamToFuture.onClose(ClientCalls.java:533)
at io.grpc.internal.ClientCallImpl.closeObserver(ClientCallImpl.java:617)
at io.grpc.internal.ClientCallImpl.access$300(ClientCallImpl.java:70)
at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInternal(ClientCallImpl.java:803)
at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInContext(ClientCallImpl.java:782)
at io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
at io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:123)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:834)
Caused by: io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: localhost/0:0:0:0:0:0:0:1:4317
Caused by: java.net.ConnectException: Connection refused
at java.base/sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at java.base/sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:779)
at io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:330)
at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:334)
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:702)
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:650)
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:576)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:493)
at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)
at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
at java.base/java.lang.Thread.run(Thread.java:834)

As far as I understand, OTel has two modules - traces and metrics. The error complains about the export of metrics, while your traces might actually work fine. The agent tries to use the default metrics exporter (not Jaeger), which writes metrics to a local OTel Collector https://github.com/open-telemetry/opentelemetry-collector ( localhost/0:0:0:0:0:0:0:1:4317 , i.e. localhost:4317 ).
At the time of writing, metrics support is still marked as alpha, as in https://github.com/open-telemetry/opentelemetry-java/blob/v1.0.1/QUICKSTART.md
I have not used Jaeger; perhaps it has support for metrics as well. Try -Dotel.metrics.exporter=jaeger and see whether it works.
If you just want to remove that warning, consider adding the flag -Dotel.metrics.exporter=none, which should disable the export of metrics while traces should still function.
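For example, the full command with metrics export disabled would be the command from the question with only that one flag added:
java -Dotel.traces.exporter=jaeger -Dotel.exporter.jaeger.endpoint=host:14250 -Dotel.metrics.exporter=none -Dotel.resource.attributes=service.name=app-name \
-javaagent:./opentelemetry-javaagent-all.jar -jar app-1.0.0.jar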

Related

Performance issues with AWS S3 Java SDK

We have been using the AWS Java SDK S3 APIs to upload and download attachments. Of late we have been experiencing degraded performance while storing and retrieving files from the S3 bucket. Under high load, thread dumps showed an intermittent issue around the EC2CredentialsFetcher component of the S3 client SDK: the SDK blocks for a considerable amount of time looking up credentials from the EC2 instance.
We are using AWS Java SDK version 1.11.731, and below is the code snippet that creates the AmazonS3 client instance at application startup; the instance is reused for the lifetime of the application.
AmazonS3ClientBuilder builder = AmazonS3ClientBuilder.standard();
AmazonS3 client = builder
        .withClientConfiguration( clientConfigurationInstance )
        .withRegion( getRegion() )
        .withForceGlobalBucketAccessEnabled( true )
        .build();
An example of the stack trace given in the thread dump is shown below:
"http-apr-8080-exec-82" Id=103100 in BLOCKED on lock=com.amazonaws.auth.EC2CredentialsFetcher#34fe4624
owned by http-apr-8080-exec-32 Id=20565 BlockedCount : 343, BlockedTime : -1, WaitedCount : 42536, WaitedTime : -1
at com.amazonaws.auth.EC2CredentialsFetcher.fetchCredentials(EC2CredentialsFetcher.java:112)
at com.amazonaws.auth.EC2CredentialsFetcher.getCredentials(EC2CredentialsFetcher.java:82)
at com.amazonaws.auth.InstanceProfileCredentialsProvider.getCredentials(InstanceProfileCredentialsProvider.java:141)
at com.amazonaws.auth.EC2ContainerCredentialsProviderWrapper.getCredentials(EC2ContainerCredentialsProviderWrapper.java:51)
at com.amazonaws.auth.AWSCredentialsProviderChain.getCredentials(AWSCredentialsProviderChain.java:110)
at com.amazonaws.services.s3.S3CredentialsProviderChain.getCredentials(S3CredentialsProviderChain.java:35)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.getCredentialsFromContext(AmazonHttpClient.java:1119)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.runBeforeRequestHandlers(AmazonHttpClient.java:759)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:723)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:716)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:699)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:667)
at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:649)
at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:513)
at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4221)
at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4168)
at com.amazonaws.services.s3.AmazonS3Client.getObjectMetadata(AmazonS3Client.java:1249)
at com.amazonaws.services.s3.AmazonS3Client.getObjectMetadata(AmazonS3Client.java:1224)
As suggested by the AWS support team and by https://docs.aws.amazon.com/sdk-for-java/v1/developer-guide/setup-credentials.html#refresh-credentials , we modified the client creation code to use the asynchronous IMDS credential refresh mechanism, as below:
AmazonS3ClientBuilder builder = AmazonS3ClientBuilder.standard();
builder.setCredentials( InstanceProfileCredentialsProvider.createAsyncRefreshingProvider(true) );
AmazonS3 client = builder
        .withClientConfiguration( clientConfigurationInstance )
        .withRegion( getRegion() )
        .withForceGlobalBucketAccessEnabled( true )
        .build();
The idea is to refresh the credentials in a daemon thread so that the upload/download API consumer threads won't suffer the overhead of refreshing the credentials, in the hope that end users will not observe any performance issues.
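One lifecycle detail worth noting here (an aside, not from the AWS guidance above): the async provider owns the imds-credentials-refresh daemon thread, and InstanceProfileCredentialsProvider implements Closeable, so closing it on shutdown stops that thread cleanly. A minimal sketch, reusing clientConfigurationInstance and getRegion() from the snippets above:
InstanceProfileCredentialsProvider provider =
        InstanceProfileCredentialsProvider.createAsyncRefreshingProvider(true);
AmazonS3 client = AmazonS3ClientBuilder.standard()
        .withCredentials(provider)
        .withClientConfiguration(clientConfigurationInstance)
        .withRegion(getRegion())
        .withForceGlobalBucketAccessEnabled(true)
        .build();
// On application shutdown, stop the background refresh thread.
Runtime.getRuntime().addShutdownHook(new Thread(() -> {
    try {
        provider.close();
    } catch (java.io.IOException ignored) {
        // best effort during shutdown
    }
}));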
This change didn't result in any performance benefits; rather, it introduced other issues in our S3 integration. We are intermittently observing the below exception from the imds-credentials-refresh daemon thread under high load:
2020-04-24 01:34:59,466 [-credentials-refresh] [ ] [ ] [ ] (anceProfileCredentialsProvider) ERROR - Failed to connect to service endpoint:
com.amazonaws.SdkClientException: Failed to connect to service endpoint:
at com.amazonaws.internal.EC2ResourceFetcher.doReadResource(EC2ResourceFetcher.java:100) ~[aws-java-sdk-core-1.11.731.jar:?]
at com.amazonaws.internal.EC2ResourceFetcher.doReadResource(EC2ResourceFetcher.java:70) ~[aws-java-sdk-core-1.11.731.jar:?]
at com.amazonaws.internal.InstanceMetadataServiceResourceFetcher.readResource(InstanceMetadataServiceResourceFetcher.java:75) ~[aws-java-sdk-core-1.11.731.jar:?]
at com.amazonaws.internal.EC2ResourceFetcher.readResource(EC2ResourceFetcher.java:66) ~[aws-java-sdk-core-1.11.731.jar:?]
at com.amazonaws.auth.InstanceMetadataServiceCredentialsFetcher.getCredentialsResponse(InstanceMetadataServiceCredentialsFetcher.java:47) ~[aws-java-sdk-core-1.11.731.jar:?]
at com.amazonaws.auth.BaseCredentialsFetcher.fetchCredentials(BaseCredentialsFetcher.java:112) ~[aws-java-sdk-core-1.11.731.jar:?]
at com.amazonaws.auth.BaseCredentialsFetcher.getCredentials(BaseCredentialsFetcher.java:68) ~[aws-java-sdk-core-1.11.731.jar:?]
at com.amazonaws.auth.InstanceProfileCredentialsProvider$2.run(InstanceProfileCredentialsProvider.java:118) ~[aws-java-sdk-core-1.11.731.jar:?]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[?:1.8.0_171]
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) ~[?:1.8.0_171]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) ~[?:1.8.0_171]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) ~[?:1.8.0_171]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_171]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_171]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_171]
Caused by: java.net.SocketTimeoutException: Read timed out
at java.net.SocketInputStream.socketRead0(Native Method) ~[?:1.8.0_171]
at java.net.SocketInputStream.socketRead(SocketInputStream.java:116) ~[?:1.8.0_171]
at java.net.SocketInputStream.read(SocketInputStream.java:171) ~[?:1.8.0_171]
at java.net.SocketInputStream.read(SocketInputStream.java:141) ~[?:1.8.0_171]
at java.io.BufferedInputStream.fill(BufferedInputStream.java:246) ~[?:1.8.0_171]
at java.io.BufferedInputStream.read1(BufferedInputStream.java:286) ~[?:1.8.0_171]
at java.io.BufferedInputStream.read(BufferedInputStream.java:345) ~[?:1.8.0_171]
at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:735) ~[?:1.8.0_171]
at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:678) ~[?:1.8.0_171]
at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1587) ~[?:1.8.0_171]
at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1492) ~[?:1.8.0_171]
at java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:480) ~[?:1.8.0_171]
at com.amazonaws.internal.EC2ResourceFetcher.doReadResource(EC2ResourceFetcher.java:82) ~[aws-java-sdk-core-1.11.731.jar:?]
... 14 more
The exception has occurred at a relatively high frequency: 1,074 times across 5 instances over a weekend.
Q#1) Does it have any impact on uploading/downloading files using the Java SDK AmazonS3 client (mentioned above)? What could be the root cause behind these socket timeouts? What should our next steps to resolve it be? Is it an issue with the S3 SDK or with EC2/IMDS connectivity?
Also, intermittently under high load we are unable to upload/download files, and the below exception is observed while trying to download a file:
java.io.IOException: Cannot find S3 object at /Production/prod1/filestorage/attachments/email-content-1587156640090_MM-6248348.html
at com.myorg.internal.cloud.aws.s3.S3WrapperInputStream.startRead(S3WrapperInputStream.java:142)
at com.myorg.internal.cloud.aws.s3.S3WrapperInputStream.read(S3WrapperInputStream.java:116)
...
at java.lang.Thread.run(Thread.java:748)
Caused by: com.amazonaws.SdkClientException: Failed to connect to service endpoint:
at com.amazonaws.internal.EC2ResourceFetcher.doReadResource(EC2ResourceFetcher.java:100)
at com.amazonaws.internal.EC2ResourceFetcher.doReadResource(EC2ResourceFetcher.java:70)
at com.amazonaws.internal.InstanceMetadataServiceResourceFetcher.readResource(InstanceMetadataServiceResourceFetcher.java:75)
at com.amazonaws.internal.EC2ResourceFetcher.readResource(EC2ResourceFetcher.java:66)
at com.amazonaws.auth.InstanceMetadataServiceCredentialsFetcher.getCredentialsResponse(InstanceMetadataServiceCredentialsFetcher.java:47)
at com.amazonaws.auth.BaseCredentialsFetcher.fetchCredentials(BaseCredentialsFetcher.java:112)
at com.amazonaws.auth.BaseCredentialsFetcher.getCredentials(BaseCredentialsFetcher.java:68)
at com.amazonaws.auth.InstanceProfileCredentialsProvider.getCredentials(InstanceProfileCredentialsProvider.java:166)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.getCredentialsFromContext(AmazonHttpClient.java:1251)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1272)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1139)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:796)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:764)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:738)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:698)
at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:680)
at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:544)
at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:524)
at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5052)
at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4998)
at com.amazonaws.services.s3.AmazonS3Client.getObject(AmazonS3Client.java:1486)
at com.amazonaws.services.s3.AmazonS3Client.getObject(AmazonS3Client.java:1341)
at com.myorg.internal.cloud.aws.s3.S3WrapperInputStream.startRead(S3WrapperInputStream.java:134)
... 198 more
Caused by: java.net.SocketTimeoutException: Read timed out
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
at java.net.SocketInputStream.read(SocketInputStream.java:171)
at java.net.SocketInputStream.read(SocketInputStream.java:141)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
at java.io.BufferedInputStream.read1(BufferedInputStream.java:286)
at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:735)
at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:678)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1587)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1492)
at java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:480)
at com.amazonaws.internal.EC2ResourceFetcher.doReadResource(EC2ResourceFetcher.java:82)
... 220 more
Q#2) After enabling the asynchronous refresh of IMDS credentials, does every (or any) create/get file API call look up credentials from IMDS?
Q#3) How can we avoid the timeouts and achieve a consistently quicker turnaround for API calls?
Q#4) As per https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/instancedata-data-retrieval.html#instancedata-throttling , the instance metadata service is being throttled. How can we avoid this or address the associated issues?
Thanks,
Chandra

Apache Beam pipeline running on Dataflow failed to read from KafkaIO: SSL handshake failed

I'm building an Apache Beam pipeline to read from Kafka as an unbounded source.
I was able to run it locally using the direct runner.
However, the pipeline fails with the attached exception stack trace when run on the cloud using the Google Cloud Dataflow runner.
It seems it's ultimately the Conscrypt Java library that's throwing javax.net.ssl.SSLException: Unable to parse TLS packet header. I'm not really sure how to address this issue.
java.io.IOException: Failed to start reading from source: org.apache.beam.sdk.io.kafka.KafkaUnboundedSource@33b5ff70
com.google.cloud.dataflow.worker.WorkerCustomSources$UnboundedReaderIterator.start(WorkerCustomSources.java:783)
com.google.cloud.dataflow.worker.util.common.worker.ReadOperation$SynchronizedReaderIterator.start(ReadOperation.java:360)
com.google.cloud.dataflow.worker.util.common.worker.ReadOperation.runReadLoop(ReadOperation.java:193)
com.google.cloud.dataflow.worker.util.common.worker.ReadOperation.start(ReadOperation.java:158)
com.google.cloud.dataflow.worker.util.common.worker.MapTaskExecutor.execute(MapTaskExecutor.java:75)
com.google.cloud.dataflow.worker.StreamingDataflowWorker.process(StreamingDataflowWorker.java:1227)
com.google.cloud.dataflow.worker.StreamingDataflowWorker.access$1000(StreamingDataflowWorker.java:135)
com.google.cloud.dataflow.worker.StreamingDataflowWorker$6.run(StreamingDataflowWorker.java:966)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: java.util.concurrent.ExecutionException: org.apache.kafka.common.errors.SslAuthenticationException: SSL handshake failed
org.apache.beam.sdk.io.kafka.KafkaUnboundedReader.start(KafkaUnboundedReader.java:126)
com.google.cloud.dataflow.worker.WorkerCustomSources$UnboundedReaderIterator.start(WorkerCustomSources.java:778)
com.google.cloud.dataflow.worker.util.common.worker.ReadOperation$SynchronizedReaderIterator.start(ReadOperation.java:360)
com.google.cloud.dataflow.worker.util.common.worker.ReadOperation.runReadLoop(ReadOperation.java:193)
com.google.cloud.dataflow.worker.util.common.worker.ReadOperation.start(ReadOperation.java:158)
com.google.cloud.dataflow.worker.util.common.worker.MapTaskExecutor.execute(MapTaskExecutor.java:75)
com.google.cloud.dataflow.worker.StreamingDataflowWorker.process(StreamingDataflowWorker.java:1227)
com.google.cloud.dataflow.worker.StreamingDataflowWorker.access$1000(StreamingDataflowWorker.java:135)
com.google.cloud.dataflow.worker.StreamingDataflowWorker$6.run(StreamingDataflowWorker.java:966)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
java.lang.Thread.run(Thread.java:745)
Caused by: java.util.concurrent.ExecutionException: org.apache.kafka.common.errors.SslAuthenticationException: SSL handshake failed
java.util.concurrent.FutureTask.report(FutureTask.java:122)
java.util.concurrent.FutureTask.get(FutureTask.java:206)
org.apache.beam.sdk.io.kafka.KafkaUnboundedReader.start(KafkaUnboundedReader.java:112)
com.google.cloud.dataflow.worker.WorkerCustomSources$UnboundedReaderIterator.start(WorkerCustomSources.java:778)
com.google.cloud.dataflow.worker.util.common.worker.ReadOperation$SynchronizedReaderIterator.start(ReadOperation.java:360)
com.google.cloud.dataflow.worker.util.common.worker.ReadOperation.runReadLoop(ReadOperation.java:193)
com.google.cloud.dataflow.worker.util.common.worker.ReadOperation.start(ReadOperation.java:158)
com.google.cloud.dataflow.worker.util.common.worker.MapTaskExecutor.execute(MapTaskExecutor.java:75)
com.google.cloud.dataflow.worker.StreamingDataflowWorker.process(StreamingDataflowWorker.java:1227)
com.google.cloud.dataflow.worker.StreamingDataflowWorker.access$1000(StreamingDataflowWorker.java:135)
com.google.cloud.dataflow.worker.StreamingDataflowWorker$6.run(StreamingDataflowWorker.java:966)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.kafka.common.errors.SslAuthenticationException: SSL handshake failed
Caused by: javax.net.ssl.SSLException: Unable to parse TLS packet header
org.conscrypt.ConscryptEngine.unwrap(ConscryptEngine.java:782)
org.conscrypt.ConscryptEngine.unwrap(ConscryptEngine.java:723)
org.conscrypt.ConscryptEngine.unwrap(ConscryptEngine.java:688)
org.conscrypt.Java8EngineWrapper.unwrap(Java8EngineWrapper.java:236)
org.apache.kafka.common.network.SslTransportLayer.handshakeUnwrap(SslTransportLayer.java:464)
org.apache.kafka.common.network.SslTransportLayer.doHandshake(SslTransportLayer.java:328)
org.apache.kafka.common.network.SslTransportLayer.handshake(SslTransportLayer.java:255)
org.apache.kafka.common.network.KafkaChannel.prepare(KafkaChannel.java:79)
org.apache.kafka.common.network.Selector.pollSelectionKeys(Selector.java:460)
org.apache.kafka.common.network.Selector.poll(Selector.java:398)
org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:460)
org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:238)
org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:214)
org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:190)
org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureCoordinatorReady(AbstractCoordinator.java:219)
org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureCoordinatorReady(AbstractCoordinator.java:205)
org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.fetchCommittedOffsets(ConsumerCoordinator.java:468)
org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.refreshCommittedOffsetsIfNeeded(ConsumerCoordinator.java:450)
org.apache.kafka.clients.consumer.KafkaConsumer.updateFetchPositions(KafkaConsumer.java:1772)
org.apache.kafka.clients.consumer.KafkaConsumer.position(KafkaConsumer.java:1411)
org.apache.beam.sdk.io.kafka.KafkaUnboundedReader.setupInitialOffset(KafkaUnboundedReader.java:641)
org.apache.beam.sdk.io.kafka.KafkaUnboundedReader.lambda$start$0(KafkaUnboundedReader.java:106)
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
java.util.concurrent.FutureTask.run(FutureTask.java:266)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
java.lang.Thread.run(Thread.java:745)
Looks like Conscrypt causes SSL errors in many contexts like this. The Dataflow worker in Beam 2.9.0 has an option to disable it; please try --experiment=disable_conscrypt_security_provider. Alternately, you can try Beam 2.4.x, which does not enable Conscrypt.
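If you set pipeline options programmatically rather than on the command line, the experiment can be added in code. A minimal sketch, assuming Beam's DataflowPipelineOptions and its setExperiments setter (the flag name comes from the answer above; the class name and everything else is generic boilerplate):
import java.util.Collections;
import org.apache.beam.runners.dataflow.options.DataflowPipelineOptions;
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.options.PipelineOptionsFactory;

public class KafkaToDataflow {
    public static void main(String[] args) {
        DataflowPipelineOptions options = PipelineOptionsFactory
                .fromArgs(args).withValidation().as(DataflowPipelineOptions.class);
        // Same effect as passing the experiment flag on the command line.
        options.setExperiments(Collections.singletonList("disable_conscrypt_security_provider"));
        Pipeline p = Pipeline.create(options);
        // ... apply KafkaIO.read() and the rest of the pipeline as before ...
        p.run();
    }
}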

Unable to connect to Redis server when using Apache Beam SDK

So I have a Dataflow job doing:
p.apply(RedisIO.read()
        .withEndpoint(<public endpoint>, 6379)
        .withAuth(<password>)
        .withTimeout(60000)
        .withKeyPattern("UID*"))
 .apply(ParDo.of(new Format()))
 .apply(TextIO.write().to(options.getOutput()));
The Redis endpoint is public, authenticated with a password, and has no firewall restrictions. When I run the above, I get the following error.
[ERROR] Failed to execute goal org.codehaus.mojo:exec-maven-plugin:1.6.0:java (default-cli) on project word-count-beam: An exception occured while executing the Java class. org.apache.beam.sdk.util.UserCodeException: redis.clients.jedis.exceptions.JedisConnectionException: java.net.ConnectException: Connection refused -> [Help 1]
org.apache.maven.lifecycle.LifecycleExecutionException: Failed to execute goal org.codehaus.mojo:exec-maven-plugin:1.6.0:java (default-cli) on project word-count-beam: An exception occured while executing the Java class. org.apache.beam.sdk.util.UserCodeException: redis.clients.jedis.exceptions.JedisConnectionException: java.net.ConnectException: Connection refused
at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:212)
at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:153)
at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:145)
at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:116)
at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:80)
at org.apache.maven.lifecycle.internal.builder.singlethreaded.SingleThreadedBuilder.build(SingleThreadedBuilder.java:51)
at org.apache.maven.lifecycle.internal.LifecycleStarter.execute(LifecycleStarter.java:128)
at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:307)
at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:193)
at org.apache.maven.DefaultMaven.execute(DefaultMaven.java:106)
at org.apache.maven.cli.MavenCli.execute(MavenCli.java:863)
at org.apache.maven.cli.MavenCli.doMain(MavenCli.java:288)
at org.apache.maven.cli.MavenCli.main(MavenCli.java:199)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.codehaus.plexus.classworlds.launcher.Launcher.launchEnhanced(Launcher.java:289)
at org.codehaus.plexus.classworlds.launcher.Launcher.launch(Launcher.java:229)
at org.codehaus.plexus.classworlds.launcher.Launcher.mainWithExitCode(Launcher.java:415)
at org.codehaus.plexus.classworlds.launcher.Launcher.main(Launcher.java:356)
Caused by: org.apache.maven.plugin.MojoExecutionException: An exception occured while executing the Java class. org.apache.beam.sdk.util.UserCodeException: redis.clients.jedis.exceptions.JedisConnectionException: java.net.ConnectException: Connection refused
at org.codehaus.mojo.exec.ExecJavaMojo.execute(ExecJavaMojo.java:339)
at org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo(DefaultBuildPluginManager.java:134)
at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:207)
... 20 more
Caused by: org.apache.beam.runners.direct.repackaged.com.google.common.util.concurrent.UncheckedExecutionException: org.apache.beam.sdk.util.UserCodeException: redis.clients.jedis.exceptions.JedisConnectionException: java.net.ConnectException: Connection refused
So the connection is not getting established with the public Redis endpoint. I am getting the same error when running with the DirectRunner. Am I missing something here?
There is a known error in the Apache Beam source code for RedisIO where withEndpoint ignores the input host and attempts to use localhost instead. Attempting to connect to a Redis server on localhost when there is none gives the error you are seeing.
You can read more about the issue here, and see a pull request with a fix here.
Until that pull request gets merged, you should be able to resolve the problem by implementing the change yourself: copy RedisIO.java into your project and change
.setConnectionConfiguration(connectionConfiguration().withHost(host))
.setConnectionConfiguration(connectionConfiguration().withPort(port))
to
.setConnectionConfiguration(connectionConfiguration().withHost(host).withPort(port))
Note that this same error occurs 3 times in RedisIO, once each for Read (line 168), ReadAll (line 233), and Write (line 365).

How to configure Apache NiFi for a Kerberized Hadoop Cluster

I have Apache NiFi running standalone, and it's working fine. But when I try to set up Apache NiFi to access Hive or HDFS on a Kerberized Cloudera Hadoop cluster, I run into issues.
Can someone point me to documentation for setting up HDFS/Hive/HBase access (with Kerberos)?
Here is the configuration I gave in nifi.properties:
# kerberos #
nifi.kerberos.krb5.file=/etc/krb5.conf
nifi.kerberos.service.principal=pseeram@JUNIPER.COM
nifi.kerberos.keytab.location=/uhome/pseeram/learning/pseeram.keytab
nifi.kerberos.authentication.expiration=10 hours
I referenced various links like the ones below, but none of them were helpful.
(Since the first link said there were issues in the NiFi 0.7.1 version, I tried NiFi 1.1.0 and had the same bitter experience.)
https://community.hortonworks.com/questions/62014/nifi-hive-connection-pool-error.html
https://community.hortonworks.com/articles/4103/hiveserver2-jdbc-connection-url-examples.html
Here are the errors I am getting in the logs:
ERROR [Timer-Driven Process Thread-7] o.a.nifi.processors.hive.SelectHiveQL
org.apache.nifi.processor.exception.ProcessException: org.apache.commons.dbcp.SQLNestedException: Cannot create PoolableConnectionFactory (Could not open client transport with JDBC Uri: jdbc:hive2://ddas1106a:10000/innovate: Peer indicated failure: Unsupported mechanism type PLAIN)
at org.apache.nifi.dbcp.hive.HiveConnectionPool.getConnection(HiveConnectionPool.java:292) ~[nifi-hive-processors-1.1.0.jar:1.1.0]
at sun.reflect.GeneratedMethodAccessor191.invoke(Unknown Source) ~[na:na]
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:1.8.0_51]
at java.lang.reflect.Method.invoke(Method.java:497) ~[na:1.8.0_51]
at org.apache.nifi.controller.service.StandardControllerServiceProvider$1.invoke(StandardControllerServiceProvider.java:177) ~[na:na]
at com.sun.proxy.$Proxy83.getConnection(Unknown Source) ~[na:na]
at org.apache.nifi.processors.hive.SelectHiveQL.onTrigger(SelectHiveQL.java:158) ~[nifi-hive-processors-1.1.0.jar:1.1.0]
at org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27) [nifi-api-1.1.0.jar:1.1.0]
at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1099) [nifi-framework-core-1.1.0.jar:1.1.0]
at org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:136) [nifi-framework-core-1.1.0.jar:1.1.0]
at org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:47) [nifi-framework-core-1.1.0.jar:1.1.0]
at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:132) [nifi-framework-core-1.1.0.jar:1.1.0]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_51]
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) [na:1.8.0_51]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) [na:1.8.0_51]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) [na:1.8.0_51]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_51]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_51]
at java.lang.Thread.run(Thread.java:745) [na:1.8.0_51]
Caused by: org.apache.commons.dbcp.SQLNestedException: Cannot create PoolableConnectionFactory (Could not open client transport with JDBC Uri: jdbc:hive2://ddas1106a:10000/innovate: Peer indicated failure: Unsupported mechanism type PLAIN)
at org.apache.commons.dbcp.BasicDataSource.createPoolableConnectionFactory(BasicDataSource.java:1549) ~[commons-dbcp-1.4.jar:1.4]
at org.apache.commons.dbcp.BasicDataSource.createDataSource(BasicDataSource.java:1388) ~[commons-dbcp-1.4.jar:1.4]
at org.apache.commons.dbcp.BasicDataSource.getConnection(BasicDataSource.java:1044) ~[commons-dbcp-1.4.jar:1.4]
at org.apache.nifi.dbcp.hive.HiveConnectionPool.getConnection(HiveConnectionPool.java:288) ~[nifi-hive-processors-1.1.0.jar:1.1.0]
... 18 common frames omitted
Caused by: java.sql.SQLException: Could not open client transport with JDBC Uri: jdbc:hive2://ddas1106a:10000/innovate: Peer indicated failure: Unsupported mechanism type PLAIN
at org.apache.hive.jdbc.HiveConnection.openTransport(HiveConnection.java:231) ~[hive-jdbc-1.2.1.jar:1.2.1]
at org.apache.hive.jdbc.HiveConnection.<init>(HiveConnection.java:176) ~[hive-jdbc-1.2.1.jar:1.2.1]
at org.apache.hive.jdbc.HiveDriver.connect(HiveDriver.java:105) ~[hive-jdbc-1.2.1.jar:1.2.1]
at org.apache.commons.dbcp.DriverConnectionFactory.createConnection(DriverConnectionFactory.java:38) ~[commons-dbcp-1.4.jar:1.4]
at org.apache.commons.dbcp.PoolableConnectionFactory.makeObject(PoolableConnectionFactory.java:582) ~[commons-dbcp-1.4.jar:1.4]
at org.apache.commons.dbcp.BasicDataSource.validateConnectionFactory(BasicDataSource.java:1556) ~[commons-dbcp-1.4.jar:1.4]
at org.apache.commons.dbcp.BasicDataSource.createPoolableConnectionFactory(BasicDataSource.java:1545) ~[commons-dbcp-1.4.jar:1.4]
... 21 common frames omitted
Caused by: org.apache.thrift.transport.TTransportException: Peer indicated failure: Unsupported mechanism type PLAIN
at org.apache.thrift.transport.TSaslTransport.receiveSaslMessage(TSaslTransport.java:199) ~[hive-exec-1.2.1.jar:1.2.1]
at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:307) ~[hive-exec-1.2.1.jar:1.2.1]
at org.apache.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:37) ~[hive-exec-1.2.1.jar:1.2.1]
at org.apache.hive.jdbc.HiveConnection.openTransport(HiveConnection.java:204) ~[hive-jdbc-1.2.1.jar:1.2.1]
... 27 common frames omitted
WARN [NiFi Web Server-29] o.a.nifi.dbcp.hive.HiveConnectionPool HiveConnectionPool[id=278beb67-0159-1000-cffa-8c8534c285c8] Configuration does not have security enabled, Keytab and Principal will be ignored
What you've added in the nifi.properties file is used for Kerberizing the NiFi cluster itself. In order to access a Kerberized Hadoop cluster, you need to provide the appropriate config files and keytabs in NiFi's HDFS processors.
For example, if you are using PutHDFS to write to a Hadoop cluster, configure the processor with:
Hadoop Configuration Resources : paths to core-site.xml and hdfs-site.xml
Kerberos Principal : your principal for accessing the Hadoop cluster
Kerberos Keytab : path to the keytab generated using the krb5.conf of the Hadoop cluster. nifi.kerberos.krb5.file in nifi.properties must point to the appropriate krb5.conf file.
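For example, reusing the principal and keytab from the question (the core-site.xml/hdfs-site.xml paths are illustrative and depend on where you copied them from the cluster):
Hadoop Configuration Resources : /etc/hadoop/conf/core-site.xml,/etc/hadoop/conf/hdfs-site.xml
Kerberos Principal : pseeram@JUNIPER.COM
Kerberos Keytab : /uhome/pseeram/learning/pseeram.keytab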
Regardless of whether NiFi is inside the Kerberized Hadoop cluster or not, this post might be useful:
https://community.hortonworks.com/questions/84659/how-to-use-apache-nifi-on-kerberized-hdp-cluster-n.html

Why does the Ignite client open a port?

We have started using Apache Ignite and we are using TCP communication. What we are seeing is that the clients open a port for communication just like the servers.
My first assumption was that we don't need to open up from the server to the client, and everything seemed to be working fine. However, in some cases when the topology changes, we get stack traces in the logs indicating that the server is initiating communication with the client on this port and failing.
My question is: why is the server trying to communicate directly with the client? Do we need to let the servers communicate with the clients, or can we simply ignore the error messages?
Below is an example of the stack trace:
2016-07-04 16:02:32,298 ERROR [marshaller-cache-#67%PMCacheCluster%] [org.apache.ignite.internal.processors.cache.query.continuous.CacheContinuousQueryHandler] [NONE] - Failed to send event notification to node: ad8937b4-eb38-442a-8e06-9625c6246d7b
org.apache.ignite.IgniteCheckedException: Failed to send message (node may have left the grid or TCP connection cannot be established due to firewall issues) [node=TcpDiscoveryNode [id=ad8937b4-eb38-442a-8e06-9625c6246d7b, addrs=[xxx.xx.x.xxx], sockAddrs=[/xxx.xx.x.xxx:0, /xxx.xx.x.xxx:0], discPort=0, order=51, intOrder=29, lastExchangeTime=1467640045240, loc=false, ver=1.6.0#20160518-sha1:0b22c45b, isClient=true], topic=T4 [topic=TOPIC_CACHE, id1=ee261127-933b-36b7-b4ef-f5be9bb4bff2, id2=ad8937b4-eb38-442a-8e06-9625c6246d7b, id3=0], msg=GridContinuousMessage [type=MSG_EVT_NOTIFICATION, routineId=7107ffc5-9868-422f-8509-4739558869f7, data=null, futId=null], policy=2]
at org.apache.ignite.internal.managers.communication.GridIoManager.send(GridIoManager.java:1290)
at org.apache.ignite.internal.managers.communication.GridIoManager.sendOrderedMessage(GridIoManager.java:1508)
at org.apache.ignite.internal.processors.continuous.GridContinuousProcessor.sendWithRetries(GridContinuousProcessor.java:1229)
at org.apache.ignite.internal.processors.continuous.GridContinuousProcessor.sendWithRetries(GridContinuousProcessor.java:1200)
at org.apache.ignite.internal.processors.continuous.GridContinuousProcessor.sendWithRetries(GridContinuousProcessor.java:1182)
at org.apache.ignite.internal.processors.continuous.GridContinuousProcessor.sendNotification(GridContinuousProcessor.java:843)
at org.apache.ignite.internal.processors.continuous.GridContinuousProcessor.addNotification(GridContinuousProcessor.java:802)
at org.apache.ignite.internal.processors.cache.query.continuous.CacheContinuousQueryHandler.onEntryUpdate(CacheContinuousQueryHandler.java:787)
at org.apache.ignite.internal.processors.cache.query.continuous.CacheContinuousQueryHandler.access$700(CacheContinuousQueryHandler.java:91)
at org.apache.ignite.internal.processors.cache.query.continuous.CacheContinuousQueryHandler$1.onEntryUpdated(CacheContinuousQueryHandler.java:412)
at org.apache.ignite.internal.processors.cache.query.continuous.CacheContinuousQueryManager.onEntryUpdated(CacheContinuousQueryManager.java:343)
at org.apache.ignite.internal.processors.cache.GridCacheMapEntry.innerUpdate(GridCacheMapEntry.java:2522)
at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.updateSingle(GridDhtAtomicCache.java:2246)
at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.updateAllAsyncInternal0(GridDhtAtomicCache.java:1644)
at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.updateAllAsyncInternal(GridDhtAtomicCache.java:1484)
at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.processNearAtomicUpdateRequest(GridDhtAtomicCache.java:2940)
at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.access$600(GridDhtAtomicCache.java:129)
at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache$5.apply(GridDhtAtomicCache.java:260)
at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache$5.apply(GridDhtAtomicCache.java:258)
at org.apache.ignite.internal.processors.cache.GridCacheIoManager.processMessage(GridCacheIoManager.java:622)
at org.apache.ignite.internal.processors.cache.GridCacheIoManager.onMessage0(GridCacheIoManager.java:320)
at org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:244)
at org.apache.ignite.internal.processors.cache.GridCacheIoManager.access$000(GridCacheIoManager.java:81)
at org.apache.ignite.internal.processors.cache.GridCacheIoManager$1.onMessage(GridCacheIoManager.java:203)
at org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1219)
at org.apache.ignite.internal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:847)
at org.apache.ignite.internal.managers.communication.GridIoManager.access$1700(GridIoManager.java:105)
at org.apache.ignite.internal.managers.communication.GridIoManager$5.run(GridIoManager.java:810)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.ignite.spi.IgniteSpiException: Failed to send message to remote node: TcpDiscoveryNode [id=ad8937b4-eb38-442a-8e06-9625c6246d7b, addrs=[xxx.xx.x.xxx], sockAddrs=[/xxx.xx.x.xxx:0, /xxx.xx.x.xxx:0], discPort=0, order=51, intOrder=29, lastExchangeTime=1467640045240, loc=false, ver=1.6.0#20160518-sha1:0b22c45b, isClient=true]
at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.sendMessage0(TcpCommunicationSpi.java:1993)
at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.sendMessage(TcpCommunicationSpi.java:1933)
at org.apache.ignite.internal.managers.communication.GridIoManager.send(GridIoManager.java:1285)
... 30 common frames omitted
Caused by: org.apache.ignite.IgniteCheckedException: Failed to connect to node (is node still alive?). Make sure that each ComputeTask and GridCacheTransaction has a timeout set in order to prevent parties from waiting forever in case of network issues [nodeId=ad8937b4-eb38-442a-8e06-9625c6246d7b, addrs=[/xxx.xx.x.xxx:47100, /xxx.xx.x.xxx:47100]]
at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createTcpClient(TcpCommunicationSpi.java:2496)
at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createNioClient(TcpCommunicationSpi.java:2137)
at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.reserveClient(TcpCommunicationSpi.java:2031)
at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.sendMessage0(TcpCommunicationSpi.java:1967)
... 32 common frames omitted
Suppressed: org.apache.ignite.IgniteCheckedException: Failed to connect to address: /xxx.xx.x.xxx:47100
at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createTcpClient(TcpCommunicationSpi.java:2501)
... 35 common frames omitted
Caused by: java.net.NoRouteToHostException: No route to host
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at sun.nio.ch.SocketAdaptor.connect(SocketAdaptor.java:111)
at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createTcpClient(TcpCommunicationSpi.java:2360)
... 35 common frames omitted
Suppressed: org.apache.ignite.IgniteCheckedException: Failed to connect to address: /xxx.xx.x.xxx:47100
at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createTcpClient(TcpCommunicationSpi.java:2501)
... 35 common frames omitted
Caused by: java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at sun.nio.ch.SocketAdaptor.connect(SocketAdaptor.java:111)
at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createTcpClient(TcpCommunicationSpi.java:2360)
... 35 common frames omitted
2016-07-04 16:02:34,923 ERROR [marshaller-cache-#67%PMCacheCluster%] [org.apache.ignite.internal.processors.cache.query.continuous.CacheContinuousQueryHandler] [NONE] - Failed to send event notification to node: 95d9812d-4a16-4589-93a8-0bf2aa6b8413
Client nodes differ from server nodes mostly in that they don't hold cache data and don't execute computations.
Other than that, client nodes are first-class cluster citizens and participate in communication the same way servers do. So yes, they need to accept connections.
See https://apacheignite.readme.io/docs/clients-vs-servers
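If the servers must reach the clients through a firewall, it can help to pin the client's communication port instead of letting it pick one from the default range. A minimal sketch, assuming Ignite's TcpCommunicationSpi (the class name and port values are illustrative):
import org.apache.ignite.Ignite;
import org.apache.ignite.Ignition;
import org.apache.ignite.configuration.IgniteConfiguration;
import org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi;

public class ClientNodeStartup {
    public static void main(String[] args) {
        IgniteConfiguration cfg = new IgniteConfiguration();
        cfg.setClientMode(true); // a client node, but it still listens for communication

        // Pin the communication port so a firewall rule can allow server -> client traffic.
        TcpCommunicationSpi commSpi = new TcpCommunicationSpi();
        commSpi.setLocalPort(47100);  // illustrative port
        commSpi.setLocalPortRange(0); // 0 = bind exactly to the port above, fail otherwise
        cfg.setCommunicationSpi(commSpi);

        Ignite ignite = Ignition.start(cfg);
    }
}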