Preamble: I'm new to Hadoop/Hive. I have installed standalone Hadoop and am now trying to get Hive to work. I keep getting an error about initializing the metastore and can't figure out how to resolve it. (Hadoop 2.7.2 and Hive 2.0)
HADOOP_HOME AND HIVE_HOME ARE SET
ubuntu15-laptop: ~ $>echo $HADOOP_HOME
/usr/hadoop/hadoop-2.7.2
ubuntu15-laptop: ~ $>echo $HIVE_HOME
/usr/hive
HDFS is working
ubuntu15-laptop: ~ $>hadoop fs -ls /
Found 2 items
drwxrwxr-x - testuser supergroup 0 2016-04-13 21:37 /tmp
drwxrwxr-x - testuser supergroup 0 2016-04-13 21:38 /user
ubuntu15-laptop: ~ $>hadoop fs -ls /user
Found 1 items
drwxrwxr-x - testuser supergroup 0 2016-04-13 21:38 /user/hive
ubuntu15-laptop: ~ $>hadoop fs -ls /user/hive
Found 1 items
drwxrwxr-x - testuser supergroup 0 2016-04-13 21:38 /user/hive/warehouse
ubuntu15-laptop: ~ $>groups
testuser adm cdrom sudo dip plugdev lpadmin sambashare
Hive is not working; it says I need to initialize my metastore
ubuntu15-laptop: ~ $>hive
Logging initialized using configuration in
jar:file:/usr/hive/lib/hive-common-2.0.0.jar!/hive-log4j2.properties
Exception in thread "main" java.lang.RuntimeException: Hive metastore database
is not initialized. Please use schematool (e.g. ./schematool -initSchema
-dbType ...) to create the schema. If needed, don't forget to include the
option to auto-create the underlying database in your JDBC connection string
(e.g. ?createDatabaseIfNotExist=true for mysql)
So I try to initialize it using Postgres, but schematool tries to use Derby
ubuntu15-laptop: ~ $>schematool -initSchema -dbType postgres
Metastore connection URL: jdbc:derby:;databaseName=metastore_db;create=true
Metastore Connection Driver : org.apache.derby.jdbc.EmbeddedDriver
Metastore connection User: APP
Starting metastore schema initialization to 2.0.0
Initialization script hive-schema-2.0.0.postgres.sql
Error: Syntax error: Encountered "statement_timeout" at line 1, column 5.
(state=42X01,code=30000)
org.apache.hadoop.hive.metastore.HiveMetaException: Schema initialization
FAILED! Metastore state would be inconsistent !!
*** schemaTool failed ***
So I change hive-site.xml to use the Postgres driver etc., but because I don't have the driver installed, it fails
ubuntu15-laptop: ~ $>cp /usr/hive/conf/hive-site.xml.templ /usr/hive/conf/hive-site.xml
ubuntu15-laptop: ~ $>schematool -initSchema -dbType postgres
Metastore connection URL: jdbc:postgresql://localhost:5432/hivedb
Metastore Connection Driver : org.postgresql.Driver
Metastore connection User: 123456
org.apache.hadoop.hive.metastore.HiveMetaException: Failed to load driver
*** schemaTool failed ***
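As an aside, I assume the "Failed to load driver" error just means the PostgreSQL JDBC jar isn't on Hive's classpath. Something along these lines would presumably fix that (the version and URL are only illustrative), but I didn't pursue the Postgres route:
wget https://jdbc.postgresql.org/download/postgresql-42.2.5.jar
cp postgresql-42.2.5.jar $HIVE_HOME/lib/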
So then I try to use Derby.
First I move hive-site.xml out of the way again so the default is Derby:
ubuntu15-laptop: ~ $>mv /usr/hive/conf/hive-site.xml /usr/hive/conf/hive-site.xml.templ
Then I try initializing again with Derby, but it appears to already be (partially) initialized, per the error "Error: FUNCTION 'NUCLEUS_ASCII' already exists"
ubuntu15-laptop: ~ $>schematool -initSchema -dbType derby
Metastore connection URL: jdbc:derby:;databaseName=metastore_db;create=true
Metastore Connection Driver : org.apache.derby.jdbc.EmbeddedDriver
Metastore connection User: APP
Starting metastore schema initialization to 2.0.0
Initialization script hive-schema-2.0.0.derby.sql
Error: FUNCTION 'NUCLEUS_ASCII' already exists. (state=X0Y68,code=30000)
org.apache.hadoop.hive.metastore.HiveMetaException: Schema initialization
FAILED! Metastore state would be inconsistent !!
*** schemaTool failed ***
I've been at this for two days. Any help would be very much appreciated.
So..
Here's what happened.
After installing hive, the first thing I did was run hive, which attempted to create/initialize the metastore_db, but apparently didn't get it right. On that initial run, I got this error:
Exception in thread "main" java.lang.RuntimeException: Hive metastore database is not initialized. Please use schematool (e.g. ./schematool -initSchema -dbType ...) to create the schema. If needed, don't forget to include the option to auto-create the underlying database in your JDBC connection string (e.g. ?createDatabaseIfNotExist=true for mysql)
Running hive, even though it failed, created a metastore_db directory in the directory from which I ran hive:
ubuntu15-laptop: ~ $>ls -l |grep meta
drwxrwxr-x 5 testuser testuser 4096 Apr 14 12:44 metastore_db
So when I then tried running
ubuntu15-laptop: ~ $>schematool -initSchema -dbType derby
The metastore already existed, but not in complete form.
Soooooo the answer is:
Before you run hive for the first time, run
schematool -initSchema -dbType derby
If you already ran hive and then tried to initSchema and it's failing:
mv metastore_db metastore_db.tmp
Re-run
schematool -initSchema -dbType derby
Run hive again
Also of note: if you change directories, the metastore_db created above won't be found! I'm sure there's a good reason for this that I don't know yet, because I'm literally trying to use Hive for the first time today. Ahhh, here's information on this: metastore_db created wherever I run Hive
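If you want the Derby metastore to live in one fixed place no matter where you launch hive from, one option (just a sketch, and the path below is an example, not something from my setup) is to point javax.jdo.option.ConnectionURL at an absolute databaseName in hive-site.xml:
<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:derby:;databaseName=/home/testuser/metastore_db;create=true</value>
</property>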
Related
I am trying to install Hive on my Ubuntu 19.10 machine.
I am using this doc: https://phoenixnap.com/kb/install-hive-on-ubuntu.
As mentioned in step 6, where I try to initialize the Derby database, I run the command from the right path: ~/apache-hive-3.1.2-bin/bin
schematool -initSchema -dbType derby
But I get this error:
schematool: command not found
How can I resolve this, please?
I had the same question before.
Maybe it's because of wrong configuration files, like hive-site.xml or hive-env.sh. A stray blank space in my configuration file caused this error.
The default path for schematool is $HIVE_HOME/bin/schematool (~/apache-hive-3.1.2-bin/bin/schematool in your case). Try adding HIVE_HOME to your .bashrc file; that worked for me:
# Hive
export HIVE_HOME=/<your hive path>
export PATH=$PATH:$HIVE_HOME/bin
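After adding those lines, reload the file and do a quick sanity check that the command now resolves:
source ~/.bashrc
which schematool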
Try this. Using this command, I resolved the issue:
hive --service schematool -dbType mysql -password hive -username hive -validate
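Note that -validate only checks an already-created schema; if the schema doesn't exist yet, the same invocation with -initSchema should create it (a sketch, reading the connection details from hive-site.xml rather than passing them on the command line):
hive --service schematool -dbType mysql -initSchema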
Run ./schematool -initSchema -dbType derby
Don't forget the ./
I am working with Spark SQL 2.3.1 and
I am trying to enable Hive support while creating a session, as below:
.enableHiveSupport()
.config("spark.sql.warehouse.dir", "c://tmp//hive")
I ran the below command:
C:\Software\hadoop\hadoop-2.7.1\bin>winutils.exe chmod 777 C:\tmp\hive
While running my program getting:
Caused by: java.lang.RuntimeException: java.lang.RuntimeException: The root scratch dir: /tmp/hive on HDFS should be writable. Current permissions are: rw-rw-rw-
at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:522)
How can I fix this issue and run this on my local Windows machine?
Try to use this command:
hadoop fs -chmod -R 777 /tmp/hive/
This is a Spark exception, not a Windows one. You need to set the correct permissions on the HDFS folder, not only on your local directory.
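If there is no real HDFS involved (pure local mode on Windows), the same idea applies to the local scratch directory through winutils; a sketch, assuming winutils.exe is on your PATH and C:\tmp\hive is the directory your config points at:
winutils.exe chmod -R 777 C:\tmp\hive
winutils.exe ls C:\tmp\hive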
I am beginning to learn big data with Hadoop and Hive.
I can't load local data into a Hive table.
The Hive command is:
load data local inpath '/usr/local/nhanvien/testHive.txt' into table nhanvien;
I get this error:
Loading data to table hivetest.nhanvien Failed with exception Unable
to move source file:/usr/local/nhanvien/testHive.txt to destination
hdfs://localhost:9000/user/hive/warehouse/hivetest.db/nhanvien/testHive_copy_3.txt
FAILED: Execution Error, return code 1 from
org.apache.hadoop.hive.ql.exec.MoveTask
I tried:
hadoop fs -chmod g+w /user/hive/warehouse
sudo chmod -R 777 /home/abc/employeedetails
but I still get this error.
Can someone give me a solution?
You can try with:
export HADOOP_USER_NAME=hdfs
hive -e "load data local inpath '/usr/local/nhanvien/testHive.txt' into table nhanvien;"
It's a permission issue. Try giving permissions to the local file and the directory where your file exists.
sudo chmod -R 777 /usr/local/nhanvien/testHive.txt
Then
Log in as $HDFS_USER and run the following commands:
hdfs dfs -chown -R $HIVE_USER:$HDFS_USER /user/hive
hdfs dfs -chmod -R 775 /user/hive
hdfs dfs -chmod -R 775 /user/hive/warehouse
You can also add this configuration to hdfs-site.xml:
<property>
<name>dfs.permissions</name>
<value>false</value>
</property>
This configuration disables permission checking on HDFS, so a regular user can perform operations on HDFS.
Hope this helps.
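Keep in mind that dfs.permissions is read by the NameNode at startup, so HDFS has to be restarted for the change to take effect, and you can verify the warehouse permissions afterwards; a sketch for a standalone setup:
$HADOOP_HOME/sbin/stop-dfs.sh
$HADOOP_HOME/sbin/start-dfs.sh
hdfs dfs -ls /user/hive/warehouse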
I'm seeing an issue with running the Hive CLI. When I run the CLI on an edge node I receive the following error regarding HDFS permissions:
c784gnj:~ # sudo hive
/usr/lib/hive/conf/hive-env.sh: line 5: /usr/lib/hive/lib/hive-hbase-handler-1.1.0-cdh5.5.2.jar,/usr/lib/hbase/hbase-common.jar,/usr/lib/hbase/lib/htrace-core4-4.0.1-incubating.jar,/usr/lib/hbase/lib/htrace-core-3.2.0-incubating.jar,/usr/lib/hbase/lib/htrace-core.jar,/usr/lib/hbase/hbase-hadoop2-compat.jar,/usr/lib/hbase/hbase-client.jar,/usr/lib/hbase/hbase-server.jar,/usr/lib/hbase/hbase-hadoop-compat.jar,/usr/lib/hbase/hbase-protocol.jar: No such file or directory
Java HotSpot(TM) 64-Bit Server VM warning: Using incremental CMS is deprecated and will likely be removed in a future release
16/10/11 10:35:49 WARN conf.HiveConf: HiveConf of name hive.metastore.local does not exist
Logging initialized using configuration in jar:file:/usr/lib/hive/lib/hive-common-1.1.0-cdh5.5.2.jar!/hive-log4j.properties
Exception in thread "main" java.lang.RuntimeException: org.apache.hadoop.security.AccessControlException: Permission denied: user=app1_K, access=WRITE, inode="/user":hdfs:supergroup:drwxr-xr-x
at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.checkFsPermission(DefaultAuthorizationProvider.java:257)
at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.check(DefaultAuthorizationProvider.java:238)
at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.check(DefaultAuthorizationProvider.java:216)
What is hive trying to write to in the /user directory in HDFS?
I can already see that /user/hive is created:
drwxrwxr-t - hive hive 0 2015-03-16 22:17 /user/hive
As you can see, I am behind Kerberos auth on Hadoop.
Thanks in advance!
The log says you need to grant permissions on the HDFS /user directory to user app1_K.
Command
hadoop fs -setfacl -m -R user:app1_K:rwx /user
Execute this command as a privileged user from the Hadoop bin directory.
If you get a similar permission error on any other HDFS directory, then you have to grant permissions on that directory.
Refer to the link below for more information.
https://hadoop.apache.org/docs/r2.7.2/hadoop-project-dist/hadoop-hdfs/HdfsPermissionsGuide.html#ACLs_Access_Control_Lists
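To confirm the ACL was actually applied (this assumes ACLs are enabled on the NameNode via dfs.namenode.acls.enabled, which the guide above describes), you can check with:
hadoop fs -getfacl /user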
Instead of disabling HDFS access privileges altogether, as suggested by @Kumar, you might simply create an HDFS home dir for every new user on the system, so that Hive/Spark/Pig/Sqoop jobs have a valid location to create temp files...
On a Kerberized cluster:
kinit hdfs@MY.REALM
hdfs dfs -mkdir /user/app1_k
hdfs dfs -chown app1_k:app1_k /user/app1_k
Otherwise:
export HADOOP_USER_NAME=hdfs
hdfs dfs -mkdir /user/app1_k
hdfs dfs -chown app1_k:app1_k /user/app1_k
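Either way, a quick listing should then show the new home directory owned by the right user:
hdfs dfs -ls /user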
I am using Hadoop 1.2.1 with 3 datanodes and one namenode. My HBase version is 0.94.14. I have configured Apache Hive 1.0 on the namenode machine.
I have to import HBase table data into Hive. When I run a query, it gives the following error in the log file:
ERROR org.apache.hadoop.hbase.mapreduce.TableInputFormatBase - Cannot resolve the host name for /192.168.3.9 because of javax.naming.NameNotFoundException: DNS name not found [response code 3]; remaining name '9.3.168.192.in-addr.arpa'
What is the problem in my setup? I have followed this tutorial for the Hadoop installation.
In the Hadoop namenode log file, the following warning appears when I run a query in Hive:
WARN org.apache.hadoop.hdfs.server.namenode.FSEditLog: Cannot roll edit log, edits.new files already exists in all healthy directories:
Does Hive need any information about how many datanodes Hadoop has?
Also, my HMaster is running on another machine, and I have configured Hive on the namenode machine.
Your Hadoop, ZooKeeper, HBase and Hive should all be running.
1) COPY THESE FILES TO THE HADOOP LIBRARY.
sudo cp /usr/lib/hive/lib/hive-common-0.7.0-cdh3u0.jar /usr/lib/hadoop/lib/
sudo cp /usr/lib/hive/lib/hbase-0.90.1-cdh3u0.jar /usr/lib/hadoop/lib/
2) STOP HBASE AND HADOOP USING THE FOLLOWING COMMANDS
/usr/lib/hadoop/bin/stop-all.sh
/usr/lib/hbase/bin/stop-hbase.sh
3) RESTART HBASE AND HADOOP USING THE FOLLOWING COMMANDS
/usr/lib/hadoop/bin/start-all.sh
/usr/lib/hbase/bin/start-hbase.sh