HiveMetaStoreClient fails to connect to a Kerberized cluster - hive

Kerberized HDP 2.6.3.0.
I have test code running on my local Windows 7 machine. Note the commented-out lines as well; enabling or disabling them makes no difference.
private static void connectHiveMetastore() throws MetaException, MalformedURLException {
    System.setProperty("hadoop.home.dir", "E:\\Development\\Software\\Virtualization");
    /*Start : Commented or un-commented, immaterial ...*/
    System.setProperty("javax.security.auth.useSubjectCredsOnly", "false");
    System.setProperty("java.security.auth.login.config", "E:\\Development\\lib\\hdp\\loginconf.ini");
    System.setProperty("java.security.krb5.conf", "E:\\Development\\lib\\hdp\\krb5.conf");
    /*End : Commented or un-commented, immaterial ...*/
    Configuration configuration = new Configuration();
    /*Start : Commented or un-commented, immaterial ...*/
    //configuration.addResource("E:\\Development\\lib\\hdp\\client_config\\HDFS_CLIENT\\core-site.xml");
    //configuration.addResource("E:\\Development\\lib\\hdp\\client_config\\HDFS_CLIENT\\hdfs-site.xml");
    //configuration.addResource("E:\\Development\\lib\\hdp\\client_config\\HIVE_CLIENT\\hive-site.xml");
    //configuration.set("hive.server2.authentication", "KERBEROS");
    //configuration.set("hadoop.security.authentication", "Kerberos");
    /*End : Commented or un-commented, immaterial ...*/
    HiveConf hiveConf = new HiveConf(configuration, Configuration.class);
    /*Start : Commented or un-commented, immaterial ...*/
    /*URL url = new File("E:\\Development\\lib\\hdp\\client_config\\HIVE_CLIENT\\hive-site.xml").toURI().toURL();
    HiveConf.setHiveSiteLocation(url);*/
    //hiveConf.setVar(HiveConf.ConfVars.HIVE_SERVER2_AUTHENTICATION, "KERBEROS");
    /*End : Commented or un-commented, immaterial ...*/
    hiveConf.setVar(HiveConf.ConfVars.METASTOREURIS, "thrift://l4283t.sss.se.com:9083,thrift://l4284t.sss.se.com:9083");
    HiveMetaStoreClient hiveMetaStoreClient = new HiveMetaStoreClient(hiveConf);
    System.out.println("Metastore client : " + hiveMetaStoreClient);
    System.out.println("Is local metastore ? " + hiveMetaStoreClient.isLocalMetaStore());
    System.out.println(hiveMetaStoreClient.getAllDatabases());
    hiveMetaStoreClient.close();
}
The output I receive on the local machine (note 'Is local metastore ? false'):
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/E:/Development/lib/hdp/hadoop/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/E:/Development/lib/hdp/hive2/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
2017-11-28 14:20:14,631 INFO [main] hive.metastore (HiveMetaStoreClient.java:open(443)) - Trying to connect to metastore with URI thrift://l4284t.sss.se.com:9083
2017-11-28 14:20:14,873 WARN [main] util.NativeCodeLoader (NativeCodeLoader.java:<clinit>(62)) - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2017-11-28 14:20:15,005 WARN [main] security.ShellBasedUnixGroupsMapping (ShellBasedUnixGroupsMapping.java:getUnixGroups(87)) - got exception trying to get groups for user ojoqcu: Incorrect command line arguments.
2017-11-28 14:20:15,042 WARN [main] hive.metastore (HiveMetaStoreClient.java:open(511)) - set_ugi() not successful, Likely cause: new client talking to old server. Continuing without it.
org.apache.thrift.transport.TTransportException
at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132)
at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86)
at org.apache.thrift.protocol.TBinaryProtocol.readStringBody(TBinaryProtocol.java:380)
at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:230)
at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:77)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_set_ugi(ThriftHiveMetastore.java:3802)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.set_ugi(ThriftHiveMetastore.java:3788)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient.java:503)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:282)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:188)
at MetaStoreClient.connectHiveMetastore(MetaStoreClient.java:47)
at MetaStoreClient.main(MetaStoreClient.java:14)
2017-11-28 14:20:15,045 INFO [main] hive.metastore (HiveMetaStoreClient.java:open(539)) - Connected to metastore.
Metastore client : org.apache.hadoop.hive.metastore.HiveMetaStoreClient@5fdcaa40
Is local metastore ? false
Exception in thread "main" MetaException(message:Got exception: org.apache.thrift.transport.TTransportException null)
at org.apache.hadoop.hive.metastore.MetaStoreUtils.logAndThrowMetaException(MetaStoreUtils.java:1256)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getAllDatabases(HiveMetaStoreClient.java:1129)
at MetaStoreClient.connectHiveMetastore(MetaStoreClient.java:51)
at MetaStoreClient.main(MetaStoreClient.java:14)
2017-11-28 14:20:15,057 ERROR [main] hive.log (MetaStoreUtils.java:logAndThrowMetaException(1254)) - Got exception: org.apache.thrift.transport.TTransportException null
org.apache.thrift.transport.TTransportException
at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132)
at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86)
at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:429)
at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:318)
at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:219)
at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:77)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_get_all_databases(ThriftHiveMetastore.java:771)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_all_databases(ThriftHiveMetastore.java:759)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getAllDatabases(HiveMetaStoreClient.java:1127)
at MetaStoreClient.connectHiveMetastore(MetaStoreClient.java:51)
at MetaStoreClient.main(MetaStoreClient.java:14)
2017-11-28 14:20:15,058 ERROR [main] hive.log (MetaStoreUtils.java:logAndThrowMetaException(1255)) - Converting exception to MetaException
Process finished with exit code 1
The metastore log throws:
java.lang.RuntimeException: org.apache.thrift.transport.TTransportException: Invalid status -128
at org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:219)
at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingTransportFactory$1.run(HadoopThriftAuthBridge.java:609)
at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingTransportFactory$1.run(HadoopThriftAuthBridge.java:606)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:360)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1846)
at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingTransportFactory.getTransport(HadoopThriftAuthBridge.java:606)
at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:269)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.thrift.transport.TTransportException: Invalid status -128
at org.apache.thrift.transport.TSaslTransport.sendAndThrowMessage(TSaslTransport.java:232)
at org.apache.thrift.transport.TSaslTransport.receiveSaslMessage(TSaslTransport.java:184)
at org.apache.thrift.transport.TSaslServerTransport.handleSaslStartMessage(TSaslServerTransport.java:125)
at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:271)
at org.apache.thrift.transport.TSaslServerTransport.open(TSaslServerTransport.java:41)
at org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:216)
The client configs (which I downloaded from the cluster via Ambari), as well as /etc/hive2/conf/hive-site.xml, /etc/hive/conf/conf.server/hive-site.xml and hive-metastore/conf/hive-site.xml,
all contain the same following properties:
<property>
<name>hive.metastore.sasl.enabled</name>
<value>true</value>
</property>
<property>
<name>hive.server2.thrift.sasl.qop</name>
<value>auth</value>
</property>
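Since the metastore has SASL enabled, the 'Invalid status -128' in the metastore log and the failing set_ugi() call look, as far as I understand, like what a SASL-enabled server typically reports when a client opens a plain (non-SASL) Thrift transport, i.e. the client never starts a Kerberos negotiation. For comparison, here is a minimal sketch of how a Kerberized metastore client is usually set up; the client principal and keytab path below are placeholders I made up, only the URIs, realm and service principal come from the configs in this question:
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hive.conf.HiveConf;
import org.apache.hadoop.hive.metastore.HiveMetaStoreClient;
import org.apache.hadoop.security.UserGroupInformation;

private static void connectKerberizedMetastore() throws Exception {
    // Tell the Hadoop client libraries this is a Kerberized cluster.
    Configuration configuration = new Configuration();
    configuration.set("hadoop.security.authentication", "kerberos");
    UserGroupInformation.setConfiguration(configuration);
    // Log in explicitly from a keytab (alternatively rely on an existing kinit ticket cache).
    // The principal and keytab path are placeholders.
    UserGroupInformation.loginUserFromKeytab("ojoqcu@GLOBAL.SCD.COM", "E:\\Development\\lib\\hdp\\ojoqcu.keytab");
    HiveConf hiveConf = new HiveConf(configuration, HiveConf.class);
    hiveConf.setVar(HiveConf.ConfVars.METASTOREURIS, "thrift://l4283t.sss.se.com:9083,thrift://l4284t.sss.se.com:9083");
    hiveConf.setBoolVar(HiveConf.ConfVars.METASTORE_USE_THRIFT_SASL, true); // hive.metastore.sasl.enabled
    hiveConf.setVar(HiveConf.ConfVars.METASTORE_KERBEROS_PRINCIPAL, "hive/_HOST@GLOBAL.SCD.COM"); // hive.metastore.kerberos.principal
    HiveMetaStoreClient hiveMetaStoreClient = new HiveMetaStoreClient(hiveConf);
    System.out.println(hiveMetaStoreClient.getAllDatabases());
    hiveMetaStoreClient.close();
}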
For reference, here are the relevant parts of the client-config hive-site.xml:
<configuration>
<property>
<name>ambari.hive.db.schema.name</name>
<value>hive_metastore</value>
</property>
<property>
<name>atlas.rest.address</name>
<value>https://l4284t.sss.se.com:21443</value>
</property>
<property>
<name>hive.server2.thrift.http.path</name>
<value>cliservice</value>
</property>
<property>
<name>hive.server2.thrift.http.port</name>
<value>10001</value>
</property>
<property>
<name>hive.server2.thrift.max.worker.threads</name>
<value>500</value>
</property>
<property>
<name>hive.server2.thrift.port</name>
<value>10000</value>
</property>
<property>
<name>hive.server2.thrift.sasl.qop</name>
<value>auth</value>
</property>
<property>
<name>hive.server2.transport.mode</name>
<value>http</value>
</property>
<property>
<name>hive.server2.use.SSL</name>
<value>false</value>
</property>
<property>
<name>hive.metastore.kerberos.keytab.file</name>
<value>/etc/security/keytabs/hive.service.keytab</value>
</property>
<property>
<name>hive.metastore.kerberos.principal</name>
<value>hive/_HOST@GLOBAL.SCD.COM</value>
</property>
<property>
<name>hive.metastore.pre.event.listeners</name>
<value>org.apache.hadoop.hive.ql.security.authorization.AuthorizationPreEventListener</value>
</property>
<property>
<name>hive.metastore.sasl.enabled</name>
<value>true</value>
</property>
<property>
<name>hive.metastore.uris</name>
<value>thrift://l4283t.sss.se.com:9083,thrift://l4284t.sss.se.com:9083</value>
</property>
<property>
<name>hive.security.metastore.authenticator.manager</name>
<value>org.apache.hadoop.hive.ql.security.HadoopDefaultMetastoreAuthenticator</value>
</property>
<property>
<name>hive.server2.authentication</name>
<value>KERBEROS</value>
</property>
<property>
<name>hive.server2.authentication.kerberos.keytab</name>
<value>/etc/security/keytabs/hive.service.keytab</value>
</property>
<property>
<name>hive.server2.authentication.kerberos.principal</name>
<value>hive/_HOST@GLOBAL.SCD.COM</value>
</property>
<property>
<name>hive.server2.authentication.spnego.keytab</name>
<value>/etc/security/keytabs/spnego.service.keytab</value>
</property>
<property>
<name>hive.server2.authentication.spnego.principal</name>
<value>HTTP/_HOST@GLOBAL.SCD.COM</value>
</property>
<property>
<name>hive.server2.enable.doAs</name>
<value>false</value>
</property>
<property>
<name>hive.server2.keystore.path</name>
<value>/etc/security/ssl/default-key.jks</value>
</property>
<property>
<name>hive.server2.logging.operation.enabled</name>
<value>true</value>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
</property>
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://l4373t.sss.se.com/hive_metastore</value>
</property>
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>hive</value>
</property>
</configuration>
Connecting to the Hive database via JDBC works when using the following lines of code:
System.setProperty("javax.security.auth.useSubjectCredsOnly","false");
System.setProperty("java.security.auth.login.config","E:\\Development\\lib\\hdp\\loginconf.ini");
System.setProperty("java.security.krb5.conf","E:\\Development\\lib\\hdp\\krb5.conf");
**********Edit-1**********
I created an 'uber' jar and executed the above class directly on one of the metastore servers (Linux machines), yet the error persists.

Related

HiveServer2 'Failed to connect to localhost:10000 Unknown HS2 problem when communicating with Thrift server. '

beeline -u "jdbc:hive2://namenode:10000/default"
Connecting to jdbc:hive2://namenode:10000/default
19/05/11 17:21:52 [main]: WARN jdbc.HiveConnection: Failed to connect to namenode:10000
Unknown HS2 problem when communicating with Thrift server.
Error: Could not open client transport with JDBC Uri: jdbc:hive2://namenode:10000/default: java.net.SocketException: Connection reset (state=08S01,code=0)
Beeline version 3.1.0 by Apache Hive
hive-site.xml
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://localhost:3306/metastore?createDatabaseIfNotExist=true</value>
</property>
<property>
<name>metastore.warehouse.dir</name>
<value>hdfs://namenode:9820/user/hive/warehouse</value>
<description>location of default database for the warehouse</description>
</property>
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>root</value>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>root</value>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.cj.jdbc.Driver</value>
</property>
<property>
<name>metastore.schema.verification</name>
<value>true</value>
</property>
<property>
<name>metastore.hmshandler.retry.attemp</name>
<value>10</value>
<description>The number of times to retry a HMSHandler call if there were a connection error.</description>
</property>
<property>
<name>metastore.thrift.uris</name>
<value>thrift://namenode:9083</value>
</property>
<property>
<name>metastore.thrift.port</name>
<value>9083</value>
</property>
<property>
<name>hive.server2.transport.mode</name>
<value>binary</value>
<description>The server transport mode. The value can be binary or http. Set to http to enable HTTP transport mode.
</description>
</property>
<property>
<name>hive.server2.thrift.port</name>
<value>10000</value>
<description>TCP port number to listen on</description>
</property>
<property>
<name>hiver.server2.thrift.bind.host</name>
<value>namenode</value>
<description>TCP interface to bind to</description>
</property>
<property>
<name>hive.server2.thrift.http.port</name>
<value>100002</value>
<description>HTTP port number to listen on</description>
</property>
<property>
<name>hive.server2.thrift.http.min.worker.threads</name>
<value>5</value>
<description>Maximum worker threads in the server pool</description>
</property>
<property>
<name>hive.server2.thrift.http.max.worker.threads</name>
<value>500</value>
<description>Maximum worker threads in the server pool</description>
</property>
<property>
<name>hive.server2.thrift.min.worker.threads</name>
<value>5</value>
<description>Minimum number of worker threads</description>
</property>
<property>
<name>hive.server2.thrift.max.worker.threads</name>
<value>500</value>
<description>Maximum number of worker threads</description>
</property>
<property>
<name>hive.server2.authentication</name>
<value>NOSASL</value>
<description>
Expects one of [nosasl, none, ldap, kerberos, pam, custom].
Client authentication types.
NONE: no authentication check
LDAP: LDAP/AD based authentication
KERBEROS: Kerberos/GSSAPI authentication
CUSTOM: Custom authentication provider
(Use with property hive.server2.custom.authentication.class)
PAM: Pluggable authentication module
NOSASL: Raw transport
</description>
</property>
<property>
<name>hive.server2.thrift.http.path</name>
<value>cliservice</value>
<description>The path component of the URL endpoint when in HTTP mode</description>
</property>
<property>
<name>hive.metastore.event.db.notification.api.auth</name>
<value>false</value>
<description>
Should metastore do authorization against database notification related APIs such as get_next_notification.
If set to true, then only the superusers in proxy settings have the permission
</description>
</property>
<property>
<name>hive.exec.scratchdir</name>
<value>hdfs://namenode:9820/tmp/hive</value>
<description>HDFS root scratch dir for Hive jobs which gets created with write all (733) permission.</description>
</property>
<property>
<name>hive.exec.local.scratchdir</name>
<value>/tmp/${user.name}</value>
<description>Local scratch space for Hive jobs</description>
</property>
<property>
<name>hive.downloaded.resources.dir</name>
<value>/tmp/${user.name}_resources</value>
<description>Temporary local directory for added resources in the remote file system.</description>
</property>
<property>
<name>hive.scratch.dir.permission</name>
<value>733</value>
<description>The permission for the user specific scratch directories that get created.</description>
</property>
Hello all,
the Hive server is listening on port 10000 (TCP) and 10002 (HTTP).
If I run beeline -u jdbc:hive2:// it works, but when I try to connect using the host IP or hostname, I get the errors above. Does anyone have an idea?
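One detail that often matters with this configuration: hive.server2.authentication is set to NOSASL above, and in that mode the client usually has to request the raw transport explicitly, otherwise the handshake tends to fail or the connection is reset. A hedged sketch of such a connection, with the host, port and database taken from the config above and everything else assumed:
import java.sql.Connection;
import java.sql.DriverManager;

public static void main(String[] args) throws Exception {
    // Hedged sketch: a NOSASL HiveServer2 normally needs auth=noSasl in the client URL.
    Class.forName("org.apache.hive.jdbc.HiveDriver");
    String url = "jdbc:hive2://namenode:10000/default;auth=noSasl";
    try (Connection conn = DriverManager.getConnection(url, "hive", "")) { // user is a placeholder
        System.out.println("Connected: " + !conn.isClosed());
    }
}
The beeline equivalent would pass the same URL, e.g. beeline -u "jdbc:hive2://namenode:10000/default;auth=noSasl".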

Apache Phoenix UDF not working on server side

1. I have created a jar with a custom UDF function and copied the jar into the dynamic jars directory, so when I use my UDF as part of a SELECT I get results without issues.
2. But when the function is part of a WHERE clause, I get an error saying that the class of my custom function is not found.
select PK FROM "my.custom.view" where MY_FUN(ARRAY["COLF"."COL1"], 'SOMEPARAM') limit 1;
Caused by: org.apache.hadoop.hbase.DoNotRetryIOException: BooleanExpressionFilter failed during reading: java.lang.ClassNotFoundException: com.myCompany.phoenix.MyCustomFunction
at org.apache.phoenix.util.ServerUtil.createIOException(ServerUtil.java:96)
at org.apache.phoenix.util.ServerUtil.throwIOException(ServerUtil.java:62)
at org.apache.phoenix.filter.BooleanExpressionFilter.readFields(BooleanExpressionFilter.java:109)
at org.apache.phoenix.filter.SingleKeyValueComparisonFilter.readFields(SingleKeyValueComparisonFilter.java:133)
at org.apache.hadoop.hbase.util.Writables.getWritable(Writables.java:131)
at org.apache.hadoop.hbase.util.Writables.getWritable(Writables.java:101)
at org.apache.phoenix.filter.SingleCQKeyValueComparisonFilter.parseFrom(SingleCQKeyValueComparisonFilter.java:50)
... 16 more
hbase-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>hbase.rootdir</name>
<value>hdfs://localhost:57000/user/hbase</value>
</property>
<property>
<name>hbase.zookeeper.quorum</name>
<value>localhost</value>
</property>
<property>
<name>hbase.zookeeper.property.clientPort</name>
<value>21081</value>
</property>
<property>
<name>hbase.client.keyvalue.maxsize</name>
<value>0</value>
</property>
<!-- SEP is basically replication, so enable it -->
<property>
<name>hbase.replication</name>
<value>true</value>
</property>
<property>
<name>hbase.mapreduce.bulkload.max.hfiles.perRegion.perFamily</name>
<value>128</value>
</property>
<property>
<name>hbase.fs.tmp.dir</name>
<value>/tmp/hbase</value>
</property>
<property>
<name>phoenix.functions.allowUserDefinedFunctions</name>
<value>true</value>
</property>
<property>
<name>hbase.dynamic.jars.dir</name>
<value>${hbase.rootdir}/lib/</value>
</property>
<property>
<name>fs.hdfs.impl</name>
<value>org.apache.hadoop.hdfs.DistributedFileSystem</value>
</property>
</configuration>
I add the jar manually with:
hdfs dfs -copyFromLocal -f /my.jar hdfs:///user/hbase/lib/my.jar
To create the function I use:
CREATE FUNCTION MY_FUN(BINARY[], VARCHAR) RETURNS BOOLEAN as 'com.myCompany.phoenix.MyCustomFunction' using jar 'hdfs://localhost:57000/user/hbase/lib/my.jar';
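For completeness, this is roughly how the function gets exercised from a client, with user-defined functions also enabled on the connection; the ZooKeeper quorum and port come from the hbase-site.xml above, the driver wiring is just a sketch:
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.util.Properties;

public static void main(String[] args) throws Exception {
    // Hedged sketch: calling the UDF through the Phoenix JDBC driver.
    Class.forName("org.apache.phoenix.jdbc.PhoenixDriver");
    Properties props = new Properties();
    props.setProperty("phoenix.functions.allowUserDefinedFunctions", "true"); // same flag as in hbase-site.xml
    try (Connection conn = DriverManager.getConnection("jdbc:phoenix:localhost:21081", props);
         ResultSet rs = conn.createStatement().executeQuery(
             "select PK from \"my.custom.view\" where MY_FUN(ARRAY[\"COLF\".\"COL1\"], 'SOMEPARAM') limit 1")) {
        while (rs.next()) {
            System.out.println(rs.getString(1));
        }
    }
}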
I ran into something similar when I upgraded from Phoenix 4.7 to 5.0. I got an exception stating that I now needed to place my UDF .jar into /apps/hbase/data/lib due to permission issues. In the old environment I was able to get away with using the /apps/hbase/lib directory. Maybe this is happening to you as well, but nothing is alerting you to the new path.

test to see if HiveServer2 metastore is working correctly

I recently upgraded our cluster's HiveServer to HiveServer2. I also set up the Hive Metastore in remote mode and moved away from the embedded mode we were previously running.
I want to test that things are properly configured and that the metadata is actually being stored in the remote metastore. What would be the easiest way to do this? Are there certain logs I could check to verify this behavior?
I am afraid things are not configured correctly and that I am still running my metastore in local mode, because when I query the PostgreSQL database on the machine hosting the metastore, there are no rows in the metastore DB (despite the fact that I have created test tables through beeline).
It might be worth mentioning that the end goal of this is to be able to query data stored in HDFS via Spark SQL. Do I need HiveServer2 to accomplish this? Apologies, I am new to a lot of this technology.
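For reference, one direct way to check is to talk to the remote metastore over Thrift and list what it knows about; if the tables created through beeline show up there, the remote metastore is really in use. A minimal sketch, reusing the hive.metastore.uris value from the hive-site.xml below (everything else is assumed):
import org.apache.hadoop.hive.conf.HiveConf;
import org.apache.hadoop.hive.metastore.HiveMetaStoreClient;

public static void main(String[] args) throws Exception {
    // Hedged sketch: connect straight to the remote metastore and list its contents.
    HiveConf hiveConf = new HiveConf();
    hiveConf.setVar(HiveConf.ConfVars.METASTOREURIS, "thrift://w7:9083"); // hive.metastore.uris from below
    HiveMetaStoreClient client = new HiveMetaStoreClient(hiveConf);
    System.out.println("Databases: " + client.getAllDatabases());
    System.out.println("Tables in default: " + client.getAllTables("default"));
    client.close();
}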
Here is my hive-site.xml:
<configuration>
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:postgresql://w7/metastore</value>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>org.postgresql.Driver</value>
</property>
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>hiveuser</value>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>password</value>
</property>
<property>
<name>datanucleus.autoCreateSchema</name>
<value>false</value>
</property>
<property>
<name>hive.metastore.uris</name>
<value>thrift://w7:9083</value>
<description>IP address (or fully-qualified domain name) and port of the metastore host</description>
</property>
<property>
<name>hive.metastore.schema.verification</name>
<value>true</value>
</property>
<property>
<name>hive.metastore.warehouse.dir</name>
<value>/user/hive/warehouse</value>
</property>
<property>
<name>hive.warehouse.subdir.inherit.perms</name>
<value>true</value>
</property>
<property>
<name>hive.zookeeper.quorum</name>
<value>mn</value>
</property>
<property>
<name>hive.zookeeper.client.port</name>
<value>2181</value>
</property>
<property>
<name>hbase.zookeeper.quorum</name>
<value>mn</value>
</property>
<property>
<name>hbase.zookeeper.property.clientPort</name>
<value>2181</value>
</property>
<property>
<name>hive.zookeeper.namespace</name>
<value>hive_zookeeper_namespace_hive</value>
</property>
<property>
<name>hive.cluster.delegation.token.store.class</name>
<value>org.apache.hadoop.hive.thrift.MemoryTokenStore</value>
</property>
<property>
<name>hive.server2.enable.doAs</name>
<value>true</value>
</property>
<property>
<name>hive.server2.use.SSL</name>
<value>false</value>
</property>
<property>
<name>hive.support.concurrency</name>
<description>Enable Hive's Table Lock Manager Service</description>
<value>true</value>
</property>
</configuration>

NoClassDefFoundError HBase with YARN

I know this is one of those topics that gets asked a lot. Still, after digging into all of the topics I could find (most of them about CLASSPATH), I can't solve mine.
Examples of the topics I found and tried:
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/hbase/HBaseConfiguration
java.lang.NoClassDefFoundError with HBase Scan
I'm using Hadoop 2.5.1 with HBase 0.98.11 on Ubuntu 14.04
I set up pseudo-distributed mode and ran Hadoop with HBase successfully. When I then try to set up fully-distributed mode, jobs fail with a NoClassDefFoundError. I tried adding "export HADOOP_CLASSPATH=/usr/local/hbase-0.98.11-hadoop2/bin/hbase classpath" to hadoop-env (and also yarn-env); it still doesn't work.
One thing I noticed: if I comment out the
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
then I can run the jobs SUCCESSFULLY. BUT it seems they run on a single node, not on multiple nodes.
Here are some of the configs:
mapred-site
<property>
<name>mapred.job.tracker</name>
<value>hadoop1:54311</value>
<description>The host and port that the MapReduce job tracker runs
at. If "local", then jobs are run in-process as a single map
and reduce task.
</description>
</property>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
hdfs-site
<property>
<name>dfs.replication</name>
<value>2</value>
<description>Default block replication.
The actual number of replications can be specified when the file is created.
The default is used if replication is not specified in create time.
</description>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/usr/local/hadoop_store/hdfs/namenode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/usr/local/hadoop_store/hdfs/datanode</value>
</property>
<property>
<name>dfs.datanode.use.datanode.hostname</name>
<value>false</value>
</property>
<property>
<name>dfs.namenode.datanode.registration.ip-hostname-check</name>
<value>false</value>
</property>
<property>
<name>dfs.permissions</name>
<value>false</value>
</property>
yarn-site
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
<description>shuffle service that needs to be set for Map Reduce to run
</description>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
yarn-env and hadoop-env are just the defaults, except for HADOOP_CLASSPATH (which doesn't change anything whether I add it or not).
Here is the error trace:
2015-04-25 23:29:25,143 FATAL [main] org.apache.hadoop.mapred.YarnChild: Error running child : java.lang.NoClassDefFoundError: org/apache/hadoop/hbase/HBaseConfiguration
at apriori2$FrequentItemsReduce.reduce(apriori2.java:550)
at apriori2$FrequentItemsReduce.reduce(apriori2.java:532)
at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:171)
at org.apache.hadoop.mapred.Task$NewCombinerRunner.combine(Task.java:1651)
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:1611)
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:1462)
at org.apache.hadoop.mapred.MapTask$NewOutputCollector.close(MapTask.java:700)
at org.apache.hadoop.mapred.MapTask.closeQuietly(MapTask.java:1990)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:774)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
Thanks a lot for any help.
With YARN, you need to set the "yarn.application.classpath" property to the classpath your MapReduce job needs; "export HADOOP_CLASSPATH" does not work with YARN.
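An alternative that is sometimes used instead of (or alongside) editing yarn.application.classpath is to ship the HBase jars with the job from the driver via TableMapReduceUtil.addDependencyJars. A hedged sketch, with the job class name taken from the stack trace above and everything else assumed:
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.mapreduce.Job;

public static void main(String[] args) throws Exception {
    // Hedged sketch: ship the HBase jars with the job so the YARN containers
    // can resolve classes like HBaseConfiguration at runtime.
    Configuration conf = HBaseConfiguration.create();
    Job job = Job.getInstance(conf, "apriori2");
    job.setJarByClass(apriori2.class);          // job class name taken from the stack trace above
    TableMapReduceUtil.addDependencyJars(job);  // adds the HBase jars to the job classpath via the distributed cache
    // ... set mapper/combiner/reducer as before, then:
    System.exit(job.waitForCompletion(true) ? 0 : 1);
}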

Cannot connect to Hiveserver2 on http mode

I am planning to apply Knox to our HDP 2.2 cluster, but I ran into a problem with the HiveServer2 HTTP-mode connection. I configured hive-site.xml as below:
<property>
<name>hive.server2.enable.doAs</name>
<value>true</value>
</property>
<property>
<name>fs.hdfs.impl.disable.cache</name>
<value>true</value>
</property>
<property>
<name>fs.file.impl.disable.cache</name>
<value>true</value>
</property>
<property>
<name>hive.server2.allow.user.substitution</name>
<value>true</value>
</property>
<property>
<name>hive.server2.thrift.http.path</name>
<value>cliservice</value>
<description>Path component of URL endpoint when in HTTP mode.</description>
</property>
<property>
<name>hive.server2.transport.mode</name>
<value>http</value>
</property>
<property>
<name>hive.server2.thrift.http.port</name>
<value>10010</value>
</property>
<property>
<name>hive.server2.thrift.http.min.worker.threads</name>
<value>5</value>
</property>
<property>
<name>hive.server2.thrift.http.max.worker.threads</name>
<value>100</value>
</property>
However, when I try to connect to the server via beeline, beeline hangs. There are no exceptions or error messages in the log file. If I change the transport mode to "binary", the problem goes away completely. I don't know why this happens.
What should I do to fix it?
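For reference, when HiveServer2 runs in HTTP mode the client has to say so in the URL; one common cause of exactly this kind of hang is a client URL that does not carry the HTTP parameters. With the port and path from the hive-site.xml above, a connection attempt would look roughly like this (host, database and user are placeholders):
import java.sql.Connection;
import java.sql.DriverManager;

public static void main(String[] args) throws Exception {
    // Hedged sketch: HTTP-mode HiveServer2 needs transportMode/httpPath in the JDBC URL.
    Class.forName("org.apache.hive.jdbc.HiveDriver");
    String url = "jdbc:hive2://hs2-host:10010/default;transportMode=http;httpPath=cliservice";
    // Older Hive clients (pre-0.14) spell the same parameters as
    // ";hive.server2.transport.mode=http;hive.server2.thrift.http.path=cliservice".
    try (Connection conn = DriverManager.getConnection(url, "hive", "")) {
        System.out.println("Connected over HTTP: " + !conn.isClosed());
    }
}
The beeline equivalent passes the same URL with -u.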