hue connect hive had an error - hive

Failed to open new session: java.lang.RuntimeException:
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.authorize.AuthorizationException):
User: hadoop is not allowed to impersonate cheng
User:hadoop is my hadoop install use,and cheng is ubuntu user.
I have already the following configuration in my core-site.xml:
<name>hadoop.proxyuser.hive.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.hive.groups</name>
<value>*</value>
</property>

the hive user is not exist before,so I change the hadoop.proxyuser.hive.groups to
hadoop.proxyuser.hadoop.group and so on.in hue config hue.ini,set the hue user.
so the problem is solution.

Related

Apache hive-metastore standalone showing error "Kerberos principal should have 3 parts:"

I am deploying hive-metastore-3.0.0 using kerberos over DC/OS I have generated principal and keytab correctly and verified the same but while providing repective settings in metastore-site.xml still server showing error "Kerberos principal should have 3 parts:" its by default pickup my user "nobody or root" by which I run service but not the principal.
Rquest you to please help is there any addition property I have to set ?
My metastore-site.xml is:
<name>hive.metastore.sasl.enabled</name>
<value>true</value>
<description>If true, the metastore thrift interface will be secured with SASL. Clients must authenticate with Kerberos.</description>
</property>
<property>
<name>hive.metastore.kerberos.keytab.file</name>
<value>hive-metastore.keytab</value>
<description>The path to the Kerberos Keytab file containing the metastore thrift server's service principal.</description>
</property>
<property>
<name>hive.metastore.kerberos.principal</name>
<value>hive-metastore/node-0-server.hive-metastore.autoip.dcos.thisdcos.directory#LOCAL</value>
<description>The service principal for the metastore thrift server. The special string _HOST will be replaced automatically with the correct host name.</description>
</property>
<property>
<name>hive.metastore.authentication</name>
<value>KERBEROS</value>
<description>authenticationtype</description>
</property>```

'hiveserver2 not listening on port 10000 and 10001'

When I run:
hive --service hiveserver2 --hiveconf hive.server2.thrift.port=10000 --hiveconf hive.root.logger=INFO,console
It shows
Starting HiveServer2
and nothing listens on port 10000 and 10001
The HiveServer2 service does not output error information, causing it hard to diagnostic the problem. You can try to start the metastore service provided by Hive, which listens on port 9083 and might give some information when your configuration is not properly set:
hive --service metastore # not detach from terminal to see logs
In my case, this service cannot be started, with error message:
MetaException(message:Hive Schema version 3.1.0 does not match metastore's schema
version 1.2.0 Metastoed or corrupt)
One of the direct solution to resolve this error is to ignore the version difference by setting the hive-site.xml if there is only one hive version in your machine (another solution is to modify the metastore_db version):
<property>
<name>hive.metastore.schema.verification</name>
<value>false</value>
</property>
After this problem is resolved, the HiveServer2 service can be running and listening on port 10000.
hive --service hiveserver2 > /dev/null 2>&1 &
If your HiveServer2 access metastore via Derby or MySQL JDBC driver, then the aforementioned metastore service is not needed for HiveServer2. However, if HiveServer2 access metastore via thrift protocol, as configed in conf/hive-site.xml like
<property>
<name>hive.metastore.uris</name>
<value>thrift://hadoop-master:9083</value>
<description>
Thrift URI for the remote metastore.
Used by metastore client to connect to remote metastore.
</description>
</property>
Then, the metastore service must be started at first.
I had a hard time to set up hive-3.1.2. I write this maybe it helps someone out. in order to diagnose the problem first try to launch metastore and hiveserver2 like this:
metastore:
hive --service metastore --hiveconf hive.root.logger=INFO,console
hiveserver2:
hive --service hiveserver2 --hiveconf hive.server2.thrift.port=10000 --hiveconf hive.root.logger=INFO,console
then carefully read the the exceptions were thrown.
my problem was user hive is not allowed to perform this api call
and to solve that I added the following property to hive-site.xml:
<property>
<name>hive.metastore.event.db.notification.api.auth</name>
<value>false</value>
<description>
Should metastore do authorization against database notification related APIs such as get_next_notification.
If set to true, then only the superusers in proxy settings have the permission
</description>
</property>
also I add my full hive-site.xml as a sample:
<configuration>
<property>
<name>datanucleus.schema.autoCreateTables</name>
<value>true</value>
</property>
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://server-2:3306/metastore?createDatabaseIfNotExist=true</value>
</property>
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>mysql_username</value>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>mysql_password</value>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
</property>
<property>
<name>hive.metastore.uris</name>
<value>thrift://server-2:9083</value>
</property>
<property>
<name>atanucleus.fixedDatastore</name>
<value>true</value>
</property>
<property>
<name>hive.server2.thrift.bind.host</name>
<value>server-2</value>
</property>
<property>
<name>hive.server2.transport.mode</name>
<value>binary</value>
</property>
<property>
<name>hive.server2.enable.doAs</name>
<value>false</value>
</property>
<property>
<name>hive.metastore.event.db.notification.api.auth</name>
<value>false</value>
</property>
</configuration>
Thanks. There is typo. It should hive.metastore not as shown below.
**metastore**.metastore.event.db.notification.api.auth
false

Configuration: Hiveserver2 & Beeline

I am trying to connect Beeline with HiveServer2 and i am getting the below alert.
Need help to connect Beeline with HiveServer2.
[hdpsysuser#hdpmaster bin]$ beeline
which: no hbase in (/usr/local/bin:/usr/local/sbin:/enter code here usr/bin:/usr/sbin:/bin:/sbin:/home/hdpuser/.local/bin:/home/hdpuser/bin:/home/hdpsysuser/.local/bin:/home/hdpsysuser/bin:/usr/hadoopsw/hadoop-2.7.3/sbin:/usr/hadoopsw/hadoop-2.7.3/bin:/usr/hadoopsw/hive/bin:/usr/hadoopsw/db-derby-10.13.1.1-bin/bin)
Beeline version 2.1.1 by Apache Hive
beeline> show tables;
No current connection
beeline> !connect jdbc:hive2://hdpmaster:10000
Connecting to jdbc:hive2://hdpmaster:10000
Enter username for jdbc:hive2://hdpmaster:10000: hdpsysuser
Enter password for jdbc:hive2://hdpmaster:10000: **********
17/05/09 01:51:20 [main]: WARN jdbc.HiveConnection: Failed to connect to
hdpmaster:10000
Error: Could not open client transport with JDBC Uri:
jdbc:hive2://hdpmaster:10000: Failed to open new session: java.lang.RuntimeException:
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.authorize.AuthorizationException): User: hdpsysuser is not allowed to impersonate hdpsysuser (state=08S01,code=0)
add below property in hive-site.xml in hive conf
<property>
<name>hive.server2.enable.doAs</name>
<value>true</value>
</property>
Also if you want user ABC to impersonate all(*), add below properties to your
core-site.xml
<property>
<name>hadoop.proxyuser.ABC.groups</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.ABC.hosts</name>
<value>*</value>
</property>

Accesing Hdfs from Spark gives TokenCache error Can't get Master Kerberos principal for use as renewer

I'm trying to run a test Spark script in order to connect Spark to hadoop.
The script is the following
from pyspark import SparkContext
sc = SparkContext("local", "Simple App")
file = sc.textFile("hdfs://hadoop_node.place:9000/errs.txt")
errors = file.filter(lambda line: "ERROR" in line)
errors.count()
When I run it with pyspark I get
py4j.protocol.Py4JJavaError: An error occurred while calling
o21.collect. : java.io.IOException: Can't get Master Kerberos
principal for use as renewer
at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:116)
at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:100)
at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodes(TokenCache.java:80)
at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:187)
at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:251)
at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:140)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:207)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:205)
at scala.Option.getOrElse(Option.scala:120)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:205)
at org.apache.spark.rdd.MappedRDD.getPartitions(MappedRDD.scala:28)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:207)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:205)
at scala.Option.getOrElse(Option.scala:120)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:205)
at org.apache.spark.api.python.PythonRDD.getPartitions(PythonRDD.scala:46)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:207)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:205)
at scala.Option.getOrElse(Option.scala:120)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:205)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:898)
at org.apache.spark.rdd.RDD.collect(RDD.scala:608)
at org.apache.spark.api.java.JavaRDDLike$class.collect(JavaRDDLike.scala:243)
at org.apache.spark.api.java.JavaRDD.collect(JavaRDD.scala:27)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:231)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:379)
at py4j.Gateway.invoke(Gateway.java:259)
at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
at py4j.commands.CallCommand.execute(CallCommand.java:79)
at py4j.GatewayConnection.run(GatewayConnection.java:207)
at java.lang.Thread.run(Thread.java:744)
This happens despite the facts that
I've done a kinit and a klist shows I have the correct tokens
when I issue a ./bin/hadoop fs -ls hdfs://hadoop_node.place:9000/errs.txt
it shows the file
Both the local hadoop client and spark have the same configuration file
The core-site.xml in the spark/conf and hadoop/conf folders is the following
(got it from one of the hadoop nodes)
<configuration>
<property>
<name>hadoop.security.auth_to_local</name>
<value>
RULE:[1:$1](.*#place)s/#place//
RULE:[2:$1/$2#$0](.*/node1.place#place)s/^([a-zA-Z]*).*/$1/
RULE:[2:$1/$2#$0](.*/node2.place#place)s/^([a-zA-Z]*).*/$1/
RULE:[2:$1/$2#$0](.*/node3.place#place)s/^([a-zA-Z]*).*/$1/
RULE:[2:$1/$2#$0](.*/node4.place#place)s/^([a-zA-Z]*).*/$1/
RULE:[2:$1/$2#$0](.*/node5.place#place)s/^([a-zA-Z]*).*/$1/
RULE:[2:$1/$2#$0](.*/node6.place#place)s/^([a-zA-Z]*).*/$1/
RULE:[2:$1/$2#$0](.*/node7.place#place)s/^([a-zA-Z]*).*/$1/
RULE:[2:nobody]
DEFAULT
</value>
</property>
<property>
<name>net.topology.node.switch.mapping.impl</name>
<value>org.apache.hadoop.net.TableMapping</value>
</property>
<property>
<name>net.topology.table.file.name</name>
<value>/etc/hadoop/conf/topology.table.file</value>
</property>
<property>
<name>fs.defaultFS</name>
<value>hdfs://server.place:9000/</value>
</property>
<property>
<name>hadoop.security.authentication</name>
<value>kerberos</value>
</property>
<property>
<name>hadoop.security.authorization</name>
<value>true</value>
</property>
<property>
<name>hadoop.proxyuser.hive.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.hive.groups</name>
<value>*</value>
</property>
</configuration>
Can someone point out what am I missing?
After creating my own hadoop cluster in order to better understand how hadoop works. I fixed it.
You have to provide Spark with a valid .keytab file which has been generated for an account which has at least read access to the hadoop cluster.
Also, you have to provide spark with the hdfs-site.xml of your hdfs cluster.
So for my case I had to create a keytab file which when you run
klist -k -e -t
on it you get entries like the following
host/fully.qualified.domain.name#REALM.COM
In my case the host was the literal word host and not a variable.
Also in your hdfs-site.xml you have to provide the path of the keytab file and say that
host/_HOST#REALM.COM
will be your account.
Cloudera has a pretty detailed writeup on how to do it.
Edit
after playing a little bit with different configurations I think the following should be noted.
You have to provide spark with the exact hdfs-site.xml and core-site.xml of your hadoop cluster. Otherwise it wont work

Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient

i just installed hive and mysql..
and copied the mysqlconnector to the hive_home/lib folder
but when i try show databases and create table commands in the hive> prompt giving me the error as below:
create database saty;
FAILED: Error in metadata: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask
and my hive_site.xml is
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://localhost:3306/hadoop?CreateDatabaseIfNotExist=true</value>
<description>JDBC connect string for a JDBC metastore</description>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
<description>Driver class name for a JDBC metastore</description>
</property>
<property>
<name>hive.metastore.warehouse.dir</name>
<value>/user/hive/warehouse</value>
<description>location of default database for the warehouse</description>
</property>
and i dont have a directory called /user/hive/warehouse in my file system.
i created these path with mkdir command.. and tried after reboot..
bout still getting the error..
regards,
satya
Try Specifying these two properties
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>username</value>
<description>Username to use against metastore database</description>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>password</value>
<description>Password to use against metastore database</description>
</property>
And Similarly the username of your mysql login should have the permission to access the db sepcified in the JDBC connection string. This Can be acheived using the following command
GRANT ALL ON Databasename.* TO username#'%' IDENTIFIED BY 'password';
The answer is located at http://www.cloudera.com/content/cloudera-content/cloudera-docs/CDH5/5.0/CDH5-Installation-Guide/cdh5ig_hive_schema_tool.html
To suppress the schema check and allow the metastore to implicitly modify the schema, you need to set the hive.metastore.schema.verification configuration property to false in hive-site.xml.
reconfigure your hive using -hiveconf hive.root.logger=warn,console than find the detail reason that why you could not instantiate your hive mate store client.
The problem i met was wrong mysql configuration, the error message is "Binary logging not possible. Message: Transaction level 'READ-COMMITTED' in InnoDB is not safe for binlog mode 'STATEMENT'". When i change binlog_format from "statement" to "binlog_format=mixed", then hive meta store client instantiate successfully.
Hope it work for you.
I had similar issue and the cause in my case was selinux where it prevented Postgres from proper running.
I inserted following line to the first line of /etc/rc3.d/S64postgresql -
echo 0 > /selinux/enforce # Disable selinux temporarily
and restarted the node with the hive metastore.
So generally you can check two things:
Check whether the DB for the metastore is running properly
Whether the user/passwd from the property is correct