Unable to run s3 sink connector for persisting kafka data on Minio - amazon-s3

I have a Kubernetes cluster running with minikube. Inside the cluster there is one Kafka pod, one ZooKeeper pod, and one Minio pod, each with its own service. Everything looks to be working properly. I have a topic called minio-topic on Kafka, and Minio has one bucket called kafka-bucket. I have tried to run the S3 sink connector with these properties:
name=s3-sink
connector.class=io.confluent.connect.s3.S3SinkConnector
tasks.max=1
topics=minio_topic
s3.region=us-east-1
s3.bucket.name=kafka-bucket
s3.part.size=5242880
flush.size=3
store.url=http://l27.0.0.1:9000/
storage.class=io.confluent.connect.s3.storage.S3Storage
#format.class=io.confluent.connect.s3.format.avro.AvroFormat
schema.generator.class=io.confluent.connect.storage.hive.schema.DefaultSchemaGenerator
format.class=io.confluent.connect.s3.format.json.JsonFormat
partitioner.class=io.confluent.connect.storage.partitioner.DefaultPartitioner
schema.compatibility=NONE
After running the connector, I am getting this error:
[2020-05-06 18:00:46,238] ERROR WorkerSinkTask{id=s3-sink-0} Task threw an uncaught and unrecoverable exception (org.apache.kafka.connect.runtime.WorkerTask:179)
org.apache.kafka.connect.errors.ConnectException: java.lang.reflect.InvocationTargetException
at io.confluent.connect.storage.StorageFactory.createStorage(StorageFactory.java:55)
at io.confluent.connect.s3.S3SinkTask.start(S3SinkTask.java:99)
at org.apache.kafka.connect.runtime.WorkerSinkTask.initializeAndStart(WorkerSinkTask.java:301)
at org.apache.kafka.connect.runtime.WorkerSinkTask.execute(WorkerSinkTask.java:189)
at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:177)
at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:227)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at io.confluent.connect.storage.StorageFactory.createStorage(StorageFactory.java:50)
... 10 more
Caused by: java.lang.IllegalArgumentException: hostname cannot be null
at com.amazonaws.util.AwsHostNameUtils.parseRegion(AwsHostNameUtils.java:79)
at com.amazonaws.util.AwsHostNameUtils.parseRegionName(AwsHostNameUtils.java:59)
at com.amazonaws.AmazonWebServiceClient.computeSignerByURI(AmazonWebServiceClient.java:277)
at com.amazonaws.AmazonWebServiceClient.setEndpoint(AmazonWebServiceClient.java:229)
at com.amazonaws.services.s3.AmazonS3Client.setEndpoint(AmazonS3Client.java:688)
at com.amazonaws.client.builder.AwsClientBuilder.setRegion(AwsClientBuilder.java:362)
at com.amazonaws.client.builder.AwsClientBuilder.configureMutableProperties(AwsClientBuilder.java:337)
at com.amazonaws.client.builder.AwsSyncClientBuilder.build(AwsSyncClientBuilder.java:38)
at io.confluent.connect.s3.storage.S3Storage.newS3Client(S3Storage.java:96)
at io.confluent.connect.s3.storage.S3Storage.<init>(S3Storage.java:65)
... 15 more
The credentials are defined in .aws/credentials. Does anyone know what the mistake in the configuration could be?

hostname cannot be null at com.amazonaws.util.AwsHostNameUtils.parseRegion
I suggest you read up on the Minio blog on setting store.url correctly, and verify which region your Minio cluster thinks it's running in. Note also that your store.url contains what looks like a typo: l27.0.0.1 (with a lowercase L) instead of 127.0.0.1.
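For reference, a corrected set of connector properties might look like the following sketch. It assumes Minio is reachable from the Connect worker at 127.0.0.1:9000; inside Kubernetes you would normally point store.url at the Minio service address instead:
name=s3-sink
connector.class=io.confluent.connect.s3.S3SinkConnector
tasks.max=1
topics=minio_topic
s3.region=us-east-1
s3.bucket.name=kafka-bucket
s3.part.size=5242880
flush.size=3
store.url=http://127.0.0.1:9000
storage.class=io.confluent.connect.s3.storage.S3Storage
format.class=io.confluent.connect.s3.format.json.JsonFormat
partitioner.class=io.confluent.connect.storage.partitioner.DefaultPartitioner
schema.compatibility=NONE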

Related

Performance issues with AWS S3 Java SDK

We have been using the AWS Java SDK S3 APIs to upload and download attachments. Lately we have been experiencing degraded performance while storing and retrieving files from the S3 bucket. Under high load, the thread dumps showed an intermittent issue related to the EC2CredentialsFetcher component of the S3 client SDK: we observed that the SDK blocks for a considerable amount of time while looking up credentials from the EC2 instance.
We are using AWS Java SDK version 1.11.731. Below is the code snippet that creates the AmazonS3 client instance at application startup; the client is reused for the lifetime of the application.
AmazonS3ClientBuilder builder = AmazonS3ClientBuilder.standard();
AmazonS3 client = builder
.withClientConfiguration( clientConfigurationInstance )
.withRegion( getRegion() )
.withForceGlobalBucketAccessEnabled( true )
.build();
An example of the stack trace given in the thread dump is shown below:
"http-apr-8080-exec-82" Id=103100 in BLOCKED on lock=com.amazonaws.auth.EC2CredentialsFetcher#34fe4624
owned by http-apr-8080-exec-32 Id=20565 BlockedCount : 343, BlockedTime : -1, WaitedCount : 42536, WaitedTime : -1
at com.amazonaws.auth.EC2CredentialsFetcher.fetchCredentials(EC2CredentialsFetcher.java:112)
at com.amazonaws.auth.EC2CredentialsFetcher.getCredentials(EC2CredentialsFetcher.java:82)
at com.amazonaws.auth.InstanceProfileCredentialsProvider.getCredentials(InstanceProfileCredentialsProvider.java:141)
at com.amazonaws.auth.EC2ContainerCredentialsProviderWrapper.getCredentials(EC2ContainerCredentialsProviderWrapper.java:51)
at com.amazonaws.auth.AWSCredentialsProviderChain.getCredentials(AWSCredentialsProviderChain.java:110)
at com.amazonaws.services.s3.S3CredentialsProviderChain.getCredentials(S3CredentialsProviderChain.java:35)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.getCredentialsFromContext(AmazonHttpClient.java:1119)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.runBeforeRequestHandlers(AmazonHttpClient.java:759)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:723)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:716)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:699)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:667)
at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:649)
at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:513)
at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4221)
at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4168)
at com.amazonaws.services.s3.AmazonS3Client.getObjectMetadata(AmazonS3Client.java:1249)
at com.amazonaws.services.s3.AmazonS3Client.getObjectMetadata(AmazonS3Client.java:1224)
As suggested by the AWS support team and by https://docs.aws.amazon.com/sdk-for-java/v1/developer-guide/setup-credentials.html#refresh-credentials, we modified the client creation code to leverage the asynchronous IMDS credential-refresh mechanism, as below:
AmazonS3ClientBuilder builder = AmazonS3ClientBuilder.standard();
builder.setCredentials( InstanceProfileCredentialsProvider.createAsyncRefreshingProvider(true) );
AmazonS3 client = builder
.withClientConfiguration( clientConfigurationInstance )
.withRegion( getRegion() )
.withForceGlobalBucketAccessEnabled( true )
.build();
The idea is to refresh the credentials in a daemon thread, so that the file upload/download API consumer threads won't suffer the overhead of refreshing the credentials; the hope is that end users will not observe any performance issues. A minimal sketch of this wiring is shown below.
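To make the intent concrete, here is a minimal, hypothetical sketch of that wiring (not our production code): the extra getCredentials() call is our illustration of warming the provider up once at startup, so that the first request thread never pays for a synchronous IMDS fetch.
import com.amazonaws.auth.InstanceProfileCredentialsProvider;
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;

// Create the provider with a background (daemon) refresh thread.
InstanceProfileCredentialsProvider provider =
        InstanceProfileCredentialsProvider.createAsyncRefreshingProvider( true );

// Optional warm-up (our addition, for illustration): one blocking fetch at
// startup; afterwards the daemon thread keeps the cached credentials fresh.
provider.getCredentials();

AmazonS3 client = AmazonS3ClientBuilder.standard()
        .withCredentials( provider )
        .withClientConfiguration( clientConfigurationInstance )
        .withRegion( getRegion() )
        .withForceGlobalBucketAccessEnabled( true )
        .build();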
This change didn't yield any performance benefits; rather, it introduced other issues in the system's S3 integration. We are intermittently observing the below exception from the imds-credentials-refresh daemon thread under high load:
2020-04-24 01:34:59,466 [-credentials-refresh] [ ] [ ] [ ] (anceProfileCredentialsProvider) ERROR - Failed to connect to service endpoint:
com.amazonaws.SdkClientException: Failed to connect to service endpoint:
at com.amazonaws.internal.EC2ResourceFetcher.doReadResource(EC2ResourceFetcher.java:100) ~[aws-java-sdk-core-1.11.731.jar:?]
at com.amazonaws.internal.EC2ResourceFetcher.doReadResource(EC2ResourceFetcher.java:70) ~[aws-java-sdk-core-1.11.731.jar:?]
at com.amazonaws.internal.InstanceMetadataServiceResourceFetcher.readResource(InstanceMetadataServiceResourceFetcher.java:75) ~[aws-java-sdk-core-1.11.731.jar:?]
at com.amazonaws.internal.EC2ResourceFetcher.readResource(EC2ResourceFetcher.java:66) ~[aws-java-sdk-core-1.11.731.jar:?]
at com.amazonaws.auth.InstanceMetadataServiceCredentialsFetcher.getCredentialsResponse(InstanceMetadataServiceCredentialsFetcher.java:47) ~[aws-java-sdk-core-1.11.731.jar:?]
at com.amazonaws.auth.BaseCredentialsFetcher.fetchCredentials(BaseCredentialsFetcher.java:112) ~[aws-java-sdk-core-1.11.731.jar:?]
at com.amazonaws.auth.BaseCredentialsFetcher.getCredentials(BaseCredentialsFetcher.java:68) ~[aws-java-sdk-core-1.11.731.jar:?]
at com.amazonaws.auth.InstanceProfileCredentialsProvider$2.run(InstanceProfileCredentialsProvider.java:118) ~[aws-java-sdk-core-1.11.731.jar:?]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[?:1.8.0_171]
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) ~[?:1.8.0_171]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) ~[?:1.8.0_171]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) ~[?:1.8.0_171]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_171]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_171]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_171]
Caused by: java.net.SocketTimeoutException: Read timed out
at java.net.SocketInputStream.socketRead0(Native Method) ~[?:1.8.0_171]
at java.net.SocketInputStream.socketRead(SocketInputStream.java:116) ~[?:1.8.0_171]
at java.net.SocketInputStream.read(SocketInputStream.java:171) ~[?:1.8.0_171]
at java.net.SocketInputStream.read(SocketInputStream.java:141) ~[?:1.8.0_171]
at java.io.BufferedInputStream.fill(BufferedInputStream.java:246) ~[?:1.8.0_171]
at java.io.BufferedInputStream.read1(BufferedInputStream.java:286) ~[?:1.8.0_171]
at java.io.BufferedInputStream.read(BufferedInputStream.java:345) ~[?:1.8.0_171]
at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:735) ~[?:1.8.0_171]
at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:678) ~[?:1.8.0_171]
at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1587) ~[?:1.8.0_171]
at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1492) ~[?:1.8.0_171]
at java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:480) ~[?:1.8.0_171]
at com.amazonaws.internal.EC2ResourceFetcher.doReadResource(EC2ResourceFetcher.java:82) ~[aws-java-sdk-core-1.11.731.jar:?]
... 14 more
The exception has occurred at a relatively high frequency: 1,074 times across 5 instances over a weekend.
Q#1) Does it have any impact on uploading/downloading files using the Java SDK AmazonS3 client (mentioned above)? What could the root cause behind these socket timeouts be? What should our next steps to resolve it be? Is it an issue with the S3 SDK, or with EC2 and IMDS connectivity?
Also, intermittently under high load we are unable to upload/download files, and the below exception is observed while trying to download a file:
java.io.IOException: Cannot find S3 object at /Production/prod1/filestorage/attachments/email-content-1587156640090_MM-6248348.html
at com.myorg.internal.cloud.aws.s3.S3WrapperInputStream.startRead(S3WrapperInputStream.java:142)
at com.myorg.internal.cloud.aws.s3.S3WrapperInputStream.read(S3WrapperInputStream.java:116)
...
at java.lang.Thread.run(Thread.java:748)
Caused by: com.amazonaws.SdkClientException: Failed to connect to service endpoint:
at com.amazonaws.internal.EC2ResourceFetcher.doReadResource(EC2ResourceFetcher.java:100)
at com.amazonaws.internal.EC2ResourceFetcher.doReadResource(EC2ResourceFetcher.java:70)
at com.amazonaws.internal.InstanceMetadataServiceResourceFetcher.readResource(InstanceMetadataServiceResourceFetcher.java:75)
at com.amazonaws.internal.EC2ResourceFetcher.readResource(EC2ResourceFetcher.java:66)
at com.amazonaws.auth.InstanceMetadataServiceCredentialsFetcher.getCredentialsResponse(InstanceMetadataServiceCredentialsFetcher.java:47)
at com.amazonaws.auth.BaseCredentialsFetcher.fetchCredentials(BaseCredentialsFetcher.java:112)
at com.amazonaws.auth.BaseCredentialsFetcher.getCredentials(BaseCredentialsFetcher.java:68)
at com.amazonaws.auth.InstanceProfileCredentialsProvider.getCredentials(InstanceProfileCredentialsProvider.java:166)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.getCredentialsFromContext(AmazonHttpClient.java:1251)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1272)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1139)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:796)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:764)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:738)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:698)
at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:680)
at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:544)
at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:524)
at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5052)
at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4998)
at com.amazonaws.services.s3.AmazonS3Client.getObject(AmazonS3Client.java:1486)
at com.amazonaws.services.s3.AmazonS3Client.getObject(AmazonS3Client.java:1341)
at com.myorg.internal.cloud.aws.s3.S3WrapperInputStream.startRead(S3WrapperInputStream.java:134)
... 198 more
Caused by: java.net.SocketTimeoutException: Read timed out
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
at java.net.SocketInputStream.read(SocketInputStream.java:171)
at java.net.SocketInputStream.read(SocketInputStream.java:141)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
at java.io.BufferedInputStream.read1(BufferedInputStream.java:286)
at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:735)
at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:678)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1587)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1492)
at java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:480)
at com.amazonaws.internal.EC2ResourceFetcher.doReadResource(EC2ResourceFetcher.java:82)
... 220 more
Q#2) After enabling the asynchronous refresh of IMDS credentials, do any (or all) of the create/get file API calls still look up credentials from IMDS?
Q#3) How can we avoid the timeouts and consistently achieve a relatively quick turnaround for API calls?
Q#4) As per https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/instancedata-data-retrieval.html#instancedata-throttling, the instance metadata service is throttled. How can we avoid this, or otherwise address the associated issues?
Thanks,
Chandra

Failure on installing S3 connector for AEM 6.3

I am trying to connect the S3 data store following these instructions, and I am getting the exact error described in this SO question.
Steps:
Created a vanilla AEM 6.3 instance and was able to upload images to the DAM
Downloaded the S3 connector and copied all .jar files into the crx-quickstart/install folder
Copied the org.apache.jackrabbit.oak.segment.SegmentNodeStoreService.config file and set customBlobStore=B"true"
Copied the org.apache.jackrabbit.oak.plugins.blob.datastore.S3DataStore.config file, which looks like this:
accessKey="scribed" connectionTimeout="120000" maxConnections="40" maxErrorRetry="10" s3Bucket="myproj-s3bucket" s3Region="ap-southeast-1" s3EndPoint="https://scribed.signin.aws.amazon.com/console" secretKey="scribed" socketTimeout="120000" writeThreads="30" cacheSize="16GB" cachePurgeTrigFactory="1"
(the key and secret have been redacted as "scribed")
When I restart AEM, none of the consoles start. It throws:
HTTP ERROR: 503 Problem accessing /. Reason: AuthenticationSupport service missing. Cannot authenticate request.
This is the exception trace:
15.05.2017 07:42:56.156 *INFO* [FelixStartLevel] org.apache.jackrabbit.oak.blob.cloud.s3.Utils Configuring Amazon Client from property file.
15.05.2017 07:42:59.401 *INFO* [FelixStartLevel] org.apache.jackrabbit.oak.blob.cloud.s3.Utils S3 service endpoint [https://170564245278.signin.aws.amazon.com/console]
15.05.2017 07:43:04.292 *ERROR* [FelixStartLevel] org.apache.jackrabbit.oak-blob-cloud [org.apache.jackrabbit.oak.plugins.blob.datastore.S3DataStore(2946)] The activate method has thrown an exception (java.lang.NullPointerException: null value in entry: component.id=null)
java.lang.NullPointerException: null value in entry: component.id=null
at com.google.common.collect.CollectPreconditions.checkEntryNotNull(CollectPreconditions.java:33)
at com.google.common.collect.ImmutableMap.entryOf(ImmutableMap.java:135)
at com.google.common.collect.ImmutableMap$Builder.put(ImmutableMap.java:206)
at com.google.common.collect.Maps.fromProperties(Maps.java:1187)
at org.apache.jackrabbit.oak.blob.cloud.s3.S3Backend.init(S3Backend.java:166)
at org.apache.jackrabbit.oak.plugins.blob.AbstractSharedCachingDataStore.init(AbstractSharedCachingDataStore.java:163)
at org.apache.jackrabbit.oak.plugins.blob.datastore.AbstractDataStoreService.activate(AbstractDataStoreService.java:87)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.felix.scr.impl.inject.BaseMethod.invokeMethod(BaseMethod.java:224)
at org.apache.felix.scr.impl.inject.BaseMethod.access$500(BaseMethod.java:39)
at org.apache.felix.scr.impl.inject.BaseMethod$Resolved.invoke(BaseMethod.java:617)
at org.apache.felix.scr.impl.inject.BaseMethod.invoke(BaseMethod.java:501)
at org.apache.felix.scr.impl.inject.ActivateMethod.invoke(ActivateMethod.java:302)
at org.apache.felix.scr.impl.inject.ActivateMethod.invoke(ActivateMethod.java:294)
at org.apache.felix.scr.impl.manager.SingleComponentManager.createImplementationObject(SingleComponentManager.java:298)
at org.apache.felix.scr.impl.manager.SingleComponentManager.createComponent(SingleComponentManager.java:109)
at org.apache.felix.scr.impl.manager.SingleComponentManager.getService(SingleComponentManager.java:906)
at org.apache.felix.scr.impl.manager.SingleComponentManager.getServiceInternal(SingleComponentManager.java:879)
at org.apache.felix.scr.impl.manager.AbstractComponentManager.activateInternal(AbstractComponentManager.java:749)
at org.apache.felix.scr.impl.manager.AbstractComponentManager.enableInternal(AbstractComponentManager.java:675)
at org.apache.felix.scr.impl.manager.AbstractComponentManager.enable(AbstractComponentManager.java:430)
at org.apache.felix.scr.impl.manager.ConfigurableComponentHolder.enableComponents(ConfigurableComponentHolder.java:657)
at org.apache.felix.scr.impl.BundleComponentActivator.initialEnable(BundleComponentActivator.java:341)
at org.apache.felix.scr.impl.Activator.loadComponents(Activator.java:390)
at org.apache.felix.scr.impl.Activator.access$200(Activator.java:54)
at org.apache.felix.scr.impl.Activator$ScrExtension.start(Activator.java:265)
at org.apache.felix.utils.extender.AbstractExtender.createExtension(AbstractExtender.java:259)
at org.apache.felix.utils.extender.AbstractExtender.modifiedBundle(AbstractExtender.java:232)
at org.osgi.util.tracker.BundleTracker$Tracked.customizerModified(BundleTracker.java:482)
at org.osgi.util.tracker.BundleTracker$Tracked.customizerModified(BundleTracker.java:415)
at org.osgi.util.tracker.AbstractTracked.track(AbstractTracked.java:232)
at org.osgi.util.tracker.BundleTracker$Tracked.bundleChanged(BundleTracker.java:444)
at org.apache.felix.framework.util.EventDispatcher.invokeBundleListenerCallback(EventDispatcher.java:916)
at org.apache.felix.framework.util.EventDispatcher.fireEventImmediately(EventDispatcher.java:835)
at org.apache.felix.framework.util.EventDispatcher.fireBundleEvent(EventDispatcher.java:517)
at org.apache.felix.framework.Felix.fireBundleEvent(Felix.java:4542)
at org.apache.felix.framework.Felix.startBundle(Felix.java:2173)
at org.apache.felix.framework.Felix.setActiveStartLevel(Felix.java:1372)
at org.apache.felix.framework.FrameworkStartLevelImpl.run(FrameworkStartLevelImpl.java:308)
at java.lang.Thread.run(Thread.java:745)
15.05.2017 07:43:04.308 *INFO* [FelixStartLevel] com.day.cq.cq-compat-codeupgrade BundleEvent RESOLVED
15.05.2017 07:43:04.310 *INFO* [FelixStartLevel] com.day.cq.cq-compat-codeupgrade BundleEvent STARTING
15.05.2017 07:43:04.310 *INFO* [FelixStartLevel] com.day.cq.cq-compat-codeupgrade BundleEvent STARTED
Am I missing any steps or config? Please help out.
I got the answer to my question with the help of my lead, by comparing the working config against the failing one. This parameter was incorrect:
s3EndPoint="https://scribed.signin.aws.amazon.com/console"
This can be left blank, as the connector will rebuild the endpoint from s3Region; otherwise it should be the region's S3 endpoint (e.g. s3.ap-southeast-1.amazonaws.com), not the AWS console sign-in URL. Since the error logs were throwing irrelevant errors, I was misled. Removing this one parameter made the difference.
A second observation: while starting AEM, it does initially throw the error, but it eventually starts up; you need to wait 3-4 minutes. In the logs I see connection refused during startup, but on subsequent requests, once all the config is loaded, it is able to connect and upload successfully. A corrected config sketch is shown below.
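Based on the above, a corrected S3DataStore.config might look like the following sketch (credentials still redacted; s3EndPoint is simply omitted so the connector derives the endpoint from s3Region):
accessKey="scribed" connectionTimeout="120000" maxConnections="40" maxErrorRetry="10" s3Bucket="myproj-s3bucket" s3Region="ap-southeast-1" secretKey="scribed" socketTimeout="120000" writeThreads="30" cacheSize="16GB" cachePurgeTrigFactory="1"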

How to configure Apache NiFi for a Kerberized Hadoop Cluster

I have Apache NiFi running standalone, and it's working fine. But when I try to set up Apache NiFi to access a Kerberized Cloudera Hadoop cluster (Hive or HDFS), I run into issues.
Can someone point me to documentation for setting up HDFS/Hive/HBase access (with Kerberos)?
Here is the configuration I gave in nifi.properties:
# kerberos #
nifi.kerberos.krb5.file=/etc/krb5.conf
nifi.kerberos.service.principal=pseeram@JUNIPER.COM
nifi.kerberos.keytab.location=/uhome/pseeram/learning/pseeram.keytab
nifi.kerberos.authentication.expiration=10 hours
I referenced various links such as the ones below, but none of them were helpful.
(Since the first link said there were issues in the NiFi 0.7.1 version, I tried NiFi 1.1.0. I had the same bitter experience.)
https://community.hortonworks.com/questions/62014/nifi-hive-connection-pool-error.html
https://community.hortonworks.com/articles/4103/hiveserver2-jdbc-connection-url-examples.html
Here are the errors I am getting in the logs:
ERROR [Timer-Driven Process Thread-7] o.a.nifi.processors.hive.SelectHiveQL
org.apache.nifi.processor.exception.ProcessException: org.apache.commons.dbcp.SQLNestedException: Cannot create PoolableConnectionFactory (Could not open client transport with JDBC Uri: jdbc:hive2://ddas1106a:10000/innovate: Peer indicated failure: Unsupported mechanism type PLAIN)
at org.apache.nifi.dbcp.hive.HiveConnectionPool.getConnection(HiveConnectionPool.java:292) ~[nifi-hive-processors-1.1.0.jar:1.1.0]
at sun.reflect.GeneratedMethodAccessor191.invoke(Unknown Source) ~[na:na]
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:1.8.0_51]
at java.lang.reflect.Method.invoke(Method.java:497) ~[na:1.8.0_51]
at org.apache.nifi.controller.service.StandardControllerServiceProvider$1.invoke(StandardControllerServiceProvider.java:177) ~[na:na]
at com.sun.proxy.$Proxy83.getConnection(Unknown Source) ~[na:na]
at org.apache.nifi.processors.hive.SelectHiveQL.onTrigger(SelectHiveQL.java:158) ~[nifi-hive-processors-1.1.0.jar:1.1.0]
at org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27) [nifi-api-1.1.0.jar:1.1.0]
at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1099) [nifi-framework-core-1.1.0.jar:1.1.0]
at org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:136) [nifi-framework-core-1.1.0.jar:1.1.0]
at org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:47) [nifi-framework-core-1.1.0.jar:1.1.0]
at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:132) [nifi-framework-core-1.1.0.jar:1.1.0]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_51]
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) [na:1.8.0_51]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) [na:1.8.0_51]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) [na:1.8.0_51]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_51]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_51]
at java.lang.Thread.run(Thread.java:745) [na:1.8.0_51]
Caused by: org.apache.commons.dbcp.SQLNestedException: Cannot create PoolableConnectionFactory (Could not open client transport with JDBC Uri: jdbc:hive2://ddas1106a:10000/innovate: Peer indicated failure: Unsupported mechanism type PLAIN)
at org.apache.commons.dbcp.BasicDataSource.createPoolableConnectionFactory(BasicDataSource.java:1549) ~[commons-dbcp-1.4.jar:1.4]
at org.apache.commons.dbcp.BasicDataSource.createDataSource(BasicDataSource.java:1388) ~[commons-dbcp-1.4.jar:1.4]
at org.apache.commons.dbcp.BasicDataSource.getConnection(BasicDataSource.java:1044) ~[commons-dbcp-1.4.jar:1.4]
at org.apache.nifi.dbcp.hive.HiveConnectionPool.getConnection(HiveConnectionPool.java:288) ~[nifi-hive-processors-1.1.0.jar:1.1.0]
... 18 common frames omitted
Caused by: java.sql.SQLException: Could not open client transport with JDBC Uri: jdbc:hive2://ddas1106a:10000/innovate: Peer indicated failure: Unsupported mechanism type PLAIN
at org.apache.hive.jdbc.HiveConnection.openTransport(HiveConnection.java:231) ~[hive-jdbc-1.2.1.jar:1.2.1]
at org.apache.hive.jdbc.HiveConnection.<init>(HiveConnection.java:176) ~[hive-jdbc-1.2.1.jar:1.2.1]
at org.apache.hive.jdbc.HiveDriver.connect(HiveDriver.java:105) ~[hive-jdbc-1.2.1.jar:1.2.1]
at org.apache.commons.dbcp.DriverConnectionFactory.createConnection(DriverConnectionFactory.java:38) ~[commons-dbcp-1.4.jar:1.4]
at org.apache.commons.dbcp.PoolableConnectionFactory.makeObject(PoolableConnectionFactory.java:582) ~[commons-dbcp-1.4.jar:1.4]
at org.apache.commons.dbcp.BasicDataSource.validateConnectionFactory(BasicDataSource.java:1556) ~[commons-dbcp-1.4.jar:1.4]
at org.apache.commons.dbcp.BasicDataSource.createPoolableConnectionFactory(BasicDataSource.java:1545) ~[commons-dbcp-1.4.jar:1.4]
... 21 common frames omitted
Caused by: org.apache.thrift.transport.TTransportException: Peer indicated failure: Unsupported mechanism type PLAIN
at org.apache.thrift.transport.TSaslTransport.receiveSaslMessage(TSaslTransport.java:199) ~[hive-exec-1.2.1.jar:1.2.1]
at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:307) ~[hive-exec-1.2.1.jar:1.2.1]
at org.apache.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:37) ~[hive-exec-1.2.1.jar:1.2.1]
at org.apache.hive.jdbc.HiveConnection.openTransport(HiveConnection.java:204) ~[hive-jdbc-1.2.1.jar:1.2.1]
... 27 common frames omitted
WARN [NiFi Web Server-29] o.a.nifi.dbcp.hive.HiveConnectionPool HiveConnectionPool[id=278beb67-0159-1000-cffa-8c8534c285c8] Configuration does not have security enabled, Keytab and Principal will be ignored
What you've added in the nifi.properties file is used for Kerberizing the NiFi cluster itself. In order to access a Kerberized Hadoop cluster, you need to provide the appropriate config files and keytabs in NiFi's HDFS processors.
For example, if you are using PutHDFS to write to a Hadoop cluster, you would set (see the example after this list):
Hadoop Configuration Resources: paths to core-site.xml and hdfs-site.xml
Kerberos Principal: your principal for accessing the Hadoop cluster
Kerberos Keytab: path to the keytab generated using the krb5.conf of the Hadoop cluster; nifi.kerberos.krb5.file in nifi.properties must point to the appropriate krb5.conf file
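For instance, the processor properties might be filled in like this (the paths, principal, and realm below are hypothetical):
Hadoop Configuration Resources: /etc/hadoop/conf/core-site.xml,/etc/hadoop/conf/hdfs-site.xml
Kerberos Principal: nifi@EXAMPLE.COM
Kerberos Keytab: /etc/security/keytabs/nifi.keytab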
Regardless of whether NiFi is inside the Kerberized Hadoop cluster or not, this post might be useful:
https://community.hortonworks.com/questions/84659/how-to-use-apache-nifi-on-kerberized-hdp-cluster-n.html

WSO2 API Manager - ERROR - APIKeyMgtServiceComponent Error in initializing thrift transport

Can anyone help me with this problem?
Some information:
the IP address has been replaced with a fake one
there is no other service running on the server
MgtHostName and hostname are set to the machine's IP
running on AWS
security groups allow any traffic from anywhere (only for troubleshooting this)
[2015-11-02 03:52:46,251] INFO - CarbonUIServiceComponent API Store Default Context : http://52.52.52.52:9763/store
[2015-11-02 03:52:46,465] INFO - DefaultKeyValidationHandler org.wso2.carbon.apimgt.keymgt.handlers.DefaultKeyValidationHandler Initialised
[2015-11-02 03:52:46,465] INFO - APIKeyValidationService Initialised KeyValidationHandler instance successfully
[2015-11-02 03:52:46,472] ERROR - APIKeyMgtServiceComponent Error in initializing thrift transport
org.apache.thrift.transport.TTransportException: Could not bind to port 10397
at org.apache.thrift.transport.TSSLTransportFactory.createServer(TSSLTransportFactory.java:117)
at org.apache.thrift.transport.TSSLTransportFactory.getServerSocket(TSSLTransportFactory.java:103)
at org.wso2.carbon.apimgt.keymgt.internal.APIKeyMgtServiceComponent.startThriftService(APIKeyMgtServiceComponent.java:211)
at org.wso2.carbon.apimgt.keymgt.internal.APIKeyMgtServiceComponent.activate(APIKeyMgtServiceComponent.java:89)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.eclipse.equinox.internal.ds.model.ServiceComponent.activate(ServiceComponent.java:260)
at org.eclipse.equinox.internal.ds.model.ServiceComponentProp.activate(ServiceComponentProp.java:146)
at org.eclipse.equinox.internal.ds.model.ServiceComponentProp.build(ServiceComponentProp.java:347)
at org.eclipse.equinox.internal.ds.InstanceProcess.buildComponent(InstanceProcess.java:620)
at org.eclipse.equinox.internal.ds.InstanceProcess.buildComponents(InstanceProcess.java:197)
at org.eclipse.equinox.internal.ds.Resolver.getEligible(Resolver.java:343)
at org.eclipse.equinox.internal.ds.SCRManager.serviceChanged(SCRManager.java:222)
at org.eclipse.osgi.internal.serviceregistry.FilteredServiceListener.serviceChanged(FilteredServiceListener.java:107)
at org.eclipse.osgi.framework.internal.core.BundleContextImpl.dispatchEvent(BundleContextImpl.java:861)
at org.eclipse.osgi.framework.eventmgr.EventManager.dispatchEvent(EventManager.java:230)
at org.eclipse.osgi.framework.eventmgr.ListenerQueue.dispatchEventSynchronous(ListenerQueue.java:148)
at org.eclipse.osgi.internal.serviceregistry.ServiceRegistry.publishServiceEventPrivileged(ServiceRegistry.java:819)
at org.eclipse.osgi.internal.serviceregistry.ServiceRegistry.publishServiceEvent(ServiceRegistry.java:771)
at org.eclipse.osgi.internal.serviceregistry.ServiceRegistrationImpl.register(ServiceRegistrationImpl.java:130)
at org.eclipse.osgi.internal.serviceregistry.ServiceRegistry.registerService(ServiceRegistry.java:214)
at org.eclipse.osgi.framework.internal.core.BundleContextImpl.registerService(BundleContextImpl.java:433)
at org.eclipse.osgi.framework.internal.core.BundleContextImpl.registerService(BundleContextImpl.java:451)
at org.wso2.carbon.identity.thrift.authentication.internal.ThriftAuthenticationServiceComponent.activate(ThriftAuthenticationServiceComponent.java:69)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.eclipse.equinox.internal.ds.model.ServiceComponent.activate(ServiceComponent.java:260)
at org.eclipse.equinox.internal.ds.model.ServiceComponentProp.activate(ServiceComponentProp.java:146)
at org.eclipse.equinox.internal.ds.model.ServiceComponentProp.build(ServiceComponentProp.java:347)
at org.eclipse.equinox.internal.ds.InstanceProcess.buildComponent(InstanceProcess.java:620)
at org.eclipse.equinox.internal.ds.InstanceProcess.buildComponents(InstanceProcess.java:197)
at org.eclipse.equinox.internal.ds.Resolver.getEligible(Resolver.java:343)
at org.eclipse.equinox.internal.ds.SCRManager.serviceChanged(SCRManager.java:222)
at org.eclipse.osgi.internal.serviceregistry.FilteredServiceListener.serviceChanged(FilteredServiceListener.java:107)
at org.eclipse.osgi.framework.internal.core.BundleContextImpl.dispatchEvent(BundleContextImpl.java:861)
at org.eclipse.osgi.framework.eventmgr.EventManager.dispatchEvent(EventManager.java:230)
at org.eclipse.osgi.framework.eventmgr.ListenerQueue.dispatchEventSynchronous(ListenerQueue.java:148)
at org.eclipse.osgi.internal.serviceregistry.ServiceRegistry.publishServiceEventPrivileged(ServiceRegistry.java:819)
at org.eclipse.osgi.internal.serviceregistry.ServiceRegistry.publishServiceEvent(ServiceRegistry.java:771)
at org.eclipse.osgi.internal.serviceregistry.ServiceRegistrationImpl.register(ServiceRegistrationImpl.java:130)
at org.eclipse.osgi.internal.serviceregistry.ServiceRegistry.registerService(ServiceRegistry.java:214)
at org.eclipse.osgi.framework.internal.core.BundleContextImpl.registerService(BundleContextImpl.java:433)
at org.eclipse.equinox.http.servlet.internal.Activator.registerHttpService(Activator.java:81)
at org.eclipse.equinox.http.servlet.internal.Activator.addProxyServlet(Activator.java:60)
at org.eclipse.equinox.http.servlet.internal.ProxyServlet.init(ProxyServlet.java:40)
at org.wso2.carbon.tomcat.ext.servlet.DelegationServlet.init(DelegationServlet.java:38)
at org.apache.catalina.core.StandardWrapper.initServlet(StandardWrapper.java:1284)
at org.apache.catalina.core.StandardWrapper.loadServlet(StandardWrapper.java:1197)
at org.apache.catalina.core.StandardWrapper.load(StandardWrapper.java:1087)
at org.apache.catalina.core.StandardContext.loadOnStartup(StandardContext.java:5229)
at org.apache.catalina.core.StandardContext.startInternal(StandardContext.java:5516)
at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:150)
at org.apache.catalina.core.ContainerBase$StartChild.call(ContainerBase.java:1575)
at org.apache.catalina.core.ContainerBase$StartChild.call(ContainerBase.java:1565)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.net.BindException: Cannot assign requested address
at sun.security.ssl.SSLServerSocketFactoryImpl.createServerSocket(SSLServerSocketFactoryImpl.java:91)
at org.apache.thrift.transport.TSSLTransportFactory.createServer(TSSLTransportFactory.java:109)
[2015-11-02 03:52:46,475] ERROR - APIKeyMgtServiceComponent Failed to initialize key management service.
java.lang.Exception: Error in initializing thrift transport
at org.wso2.carbon.apimgt.keymgt.internal.APIKeyMgtServiceComponent.startThriftService(APIKeyMgtServiceComponent.java:236)
at org.wso2.carbon.apimgt.keymgt.internal.APIKeyMgtServiceComponent.activate(APIKeyMgtServiceComponent.java:89)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
I did not find out why I was getting this error.
I checked firewall settings, whether the port was being used by another process, and other things.
But I was able to solve the problem by changing the Thrift server host in the XML config file to point to my local IP instead of my public IP (since my machine is running on AWS). A sketch is shown below.
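For example (the private IP here is hypothetical), in the <APIM_HOME>/repository/conf/api-manager.xml file described in the answer below:
<ThriftServerHost>10.0.0.12</ThriftServerHost>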
The port is probably occupied by another process. You need to find that process and stop it, or else change the Thrift port number.
If you are going with the second option, here is the configuration.
The port offset specified earlier in carbon.xml does not affect the ports of the Thrift client and server, because Thrift runs as a separate server within WSO2 servers. Therefore, you must change the Thrift ports separately using the <ThriftClientPort> and <ThriftServerPort> elements in the <APIM_HOME>/repository/conf/api-manager.xml file. For example, the following configuration sets an offset of 2 from the default Thrift port, which is 10397:
<!--
Configurations related to enable thrift support for key-management related communication.
If you want to switch back to Web Service Client, change the value of "KeyValidatorClientType" to "WSClient".
In a distributed environment;
-If you are at the Gateway node, you need to point "ThriftClientPort" value to the "ThriftServerPort" value given at KeyManager node.
-If you need to start two API Manager instances in the same machine, you need to give different ports to "ThriftServerPort" value in two nodes.
-ThriftServerHost - Allows to configure a hostname for the thrift server. It uses the carbon hostname by default.
-->
<KeyValidatorClientType>ThriftClient</KeyValidatorClientType>
<ThriftClientPort>10399</ThriftClientPort>
<ThriftClientConnectionTimeOut>10000</ThriftClientConnectionTimeOut>
<ThriftServerPort>10399</ThriftServerPort>
<!--ThriftServerHost>localhost</ThriftServerHost-->
<EnableThriftServer>true</EnableThriftServer>

javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided

I am getting this error when I try to connect to the Hive metastore using Spark SQL's HiveContext.
I am running this on a standalone cluster, using the spark-submit command from my desktop, not from the Hadoop cluster.
Is it something to do with a security-related issue? Do I have to add something to hive-site.xml? Is there anything we need to update in the below entries?
<property>
<name>hive.metastore.sasl.enabled</name>
<value>true</value>
</property>
<property>
<name>hive.server2.authentication</name>
<value>kerberos</value>
</property>
The Spark version is 1.4.0, and hive-site.xml is placed under the conf folder.
Below is the error log.
15/08/25 18:27:15 INFO HiveContext: Initializing execution hive, version 0.13.1
15/08/25 18:27:16 INFO metastore: Trying to connect to metastore with URI thrift://metastore.com:9083
15/08/25 18:27:16 ERROR TSaslTransport: SASL negotiation failure
javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:211)
at org.apache.thrift.transport.TSaslClientTransport.handleSaslStartMessage(TSaslClientTransport.java:94)
at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:253)
at org.apache.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:37)
at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:52)
at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:49)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport.open(TUGIAssumingTransport.java:49)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient.java:336)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:214)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1410)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:62)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:72)
at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:2453)
at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:2465)
at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:340)
at org.apache.spark.sql.hive.client.ClientWrapper.<init>(ClientWrapper.scala:105)
at org.apache.spark.sql.hive.HiveContext.executionHive$lzycompute(HiveContext.scala:163)
at org.apache.spark.sql.hive.HiveContext.executionHive(HiveContext.scala:161)
at org.apache.spark.sql.hive.HiveContext.<init>(HiveContext.scala:167)
at com.cap1.ct.SparkSQLHive.main(SparkSQLHive.java:17)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:664)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:169)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:192)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:111)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)
at sun.security.jgss.krb5.Krb5InitCredential.getInstance(Krb5InitCredential.java:147)
at sun.security.jgss.krb5.Krb5MechFactory.getCredentialElement(Krb5MechFactory.java:122)
at sun.security.jgss.krb5.Krb5MechFactory.getMechanismContext(Krb5MechFactory.java:187)
at sun.security.jgss.GSSManagerImpl.getMechanismContext(GSSManagerImpl.java:224)
at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:212)
at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:179)
at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:192)
... 35 more
15/08/25 18:27:16 WARN metastore: Failed to connect to the MetaStore Server...
15/08/25 18:27:16 INFO metastore: Waiting 1 seconds before next connection attempt.
15/08/25 18:27:17 INFO metastore: Trying to connect to metastore with URI thrift://metastore.com:9083
15/08/25 18:27:17 ERROR TSaslTransport: SASL negotiation failure
javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:211)
at org.apache.thrift.transport.TSaslClientTransport.handleSaslStartMessage(TSaslClientTransport.java:94)
at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:253)
at org.apache.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:37)
at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:52)
at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:49)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport.open(TUGIAssumingTransport.java:49)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient.java:336)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:214)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1410)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:62)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:72)
at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:2453)
at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:2465)
at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:340)
at org.apache.spark.sql.hive.client.ClientWrapper.<init>(ClientWrapper.scala:105)
at org.apache.spark.sql.hive.HiveContext.executionHive$lzycompute(HiveContext.scala:163)
at org.apache.spark.sql.hive.HiveContext.executionHive(HiveContext.scala:161)
at org.apache.spark.sql.hive.HiveContext.<init>(HiveContext.scala:167)
at com.cap1.ct.SparkSQLHive.main(SparkSQLHive.java:17)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:664)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:169)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:192)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:111)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)
at sun.security.jgss.krb5.Krb5InitCredential.getInstance(Krb5InitCredential.java:147)
at sun.security.jgss.krb5.Krb5MechFactory.getCredentialElement(Krb5MechFactory.java:122)
at sun.security.jgss.krb5.Krb5MechFactory.getMechanismContext(Krb5MechFactory.java:187)
at sun.security.jgss.GSSManagerImpl.getMechanismContext(GSSManagerImpl.java:224)
at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:212)
at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:179)
at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:192)
... 35 more
Prerequisite: your hive-site.xml works for the Hive CLI with Kerberos enabled.
Spark with Hive needs one more property:
-Djavax.security.auth.useSubjectCredsOnly=false
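For example, with spark-submit the property can be passed to both the driver and the executors (the jar name below is hypothetical; the main class is taken from the stack trace above):
spark-submit \
  --conf "spark.driver.extraJavaOptions=-Djavax.security.auth.useSubjectCredsOnly=false" \
  --conf "spark.executor.extraJavaOptions=-Djavax.security.auth.useSubjectCredsOnly=false" \
  --class com.cap1.ct.SparkSQLHive sparksqlhive.jar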
A quote from the official troubleshooting documentation:
GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos Ticket)
Cause: This may occur if no valid Kerberos credentials are obtained. In particular, this occurs if you want the underlying mechanism to obtain credentials but you forgot to indicate this by setting the javax.security.auth.useSubjectCredsOnly system property value to false (for example via -Djavax.security.auth.useSubjectCredsOnly=false in your execution command).
This issue is inherently tied to the krb5.conf file, which lists the permissible servers. If it's not found, or doesn't contain an entry for the server's domain, you may run into this.