Failed to schematool -initSchema -dbType derby - hive

org.apache.hadoop.hive.metastore.HiveMetaException: Failed to get schema version.
Underlying cause: java.sql.SQLException : Failed to create database 'metastore_db', see the next exception for details.
SQL Error code: 40000
Use --verbose for detailed stacktrace.
* schemaTool failed *

FYI,
please check the permission on hive installation directory.
hive installation directory should be owned by the same user that is for hadoop.
that's how it worked for me.

Related

Manjaro update error: failed to update core (unable to lock database)

I am a noob on an i5 desktop. I could not update my system. Terminal reads as follows, after I type and enter "sudo pacman -Syu"
[sudo] password for user-name: ********
:: Synchronizing package databases...
error: failed to update core (unable to lock database)
error: failed to update extra (unable to lock database)
error: failed to update community (unable to lock database)
error: failed to update multilib (unable to lock database)
error: failed to synchronize all databases
Help!!
I'm sorry, this is already covered in the wiki...
sudo rm /var/lib/pacman/db.lck
pacman-ArchWiki: "Failed to init transaction (unable to lock database)" error
If you've already deleted /var/lib/pacman/db.lck, like OP posted, and still got the same problem, try this:
sudo rm /var/tmp/pamac/dbs/db.lck
This can be solved by quitting Software Update program from task manager.
This worked for me.

the file is not owned by hive and load data is also not ran as hive

I use HDP3.1. And I Ambari to deploy hadoop cluster and hive. I want to use only one user(hdfs) to run the all programs(such as hadoop, hive, sqoop, yarn...). So I change the users all to hdfs in set ACCOUNTS step when deploy hadoop cluster in ambari. After deployed, I run sqoop to import data from mysql to hive. I have the following issue.
19/02/20 18:44:13 INFO hive.HiveImport: ERROR : FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask. org.apache.hadoop.hive.ql.metadata.HiveException: Load Data failed for hdfs://datacenter1:8020/user/hdfs/person/part-m-00000 as the file is not owned by hive and load data is also not ran as hive
19/02/20 18:44:13 INFO hive.HiveImport: INFO : Completed executing command(queryId=hdfs_20190220184412_d61d8591-04fc-41a7-b412-d64935ddd046); Time taken: 0.235 seconds
19/02/20 18:44:13 INFO hive.HiveImport: Error: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask. org.apache.hadoop.hive.ql.metadata.HiveException: Load Data failed for hdfs://datacenter1:8020/user/hdfs/person/part-m-00000 as the file is not owned by hive and load data is also not ran as hive (state=08S01,code=1)
19/02/20 18:44:13 INFO hive.HiveImport: Closing: 0: jdbc:hive2://datacenter2:2181,datacenter1:2181,datacenter3:2181/default;password=hdfs;serviceDiscoveryMode=zooKeeper;user=hdfs;zooKeeperNamespace=hiveserver2
19/02/20 18:44:13 ERROR tool.ImportTool: Import failed: java.io.IOException: Hive exited with status 2
at org.apache.sqoop.hive.HiveImport.executeExternalHiveScript(HiveImport.java:299)
at org.apache.sqoop.hive.HiveImport.importTable(HiveImport.java:234)
at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:558)
at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:656)
at org.apache.sqoop.Sqoop.run(Sqoop.java:150)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:186)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:240)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:249)
at org.apache.sqoop.Sqoop.main(Sqoop.java:258)
This issue happened in reduce step. I don't why it need hive user. Does anyone know how to resolve it?
change config of hive in hive-site.xml.
Change value: hive.load.data.owner from hive to nifi.
Restart all hive service and check again.
Duonghb

Yarn Launch Container Failed with Privilege Issue

Stack: Ambari 2.4.2.0, HDP 2.5.3.0, CentOS 6.8, FreeIPA 3.0.0
When I try to use hdp user to submit a job on yarn, _000001 container can be created and launched successfully, but I got error when _000002 container is being launched after container created:
2018-11-27 22:13:35,919 WARN privileged.PrivilegedOperationExecutor (PrivilegedOperationExecutor.java:executePrivilegedOperation(170)) - Shell execution returned exit code: 255. Privileged Execution Operation Output:
main : command provided 1
main : run as user is hdp
main : requested yarn user is hdp
Getting exit code file...
Creating script paths...
Writing pid file...
Writing to tmp file /hadoop/yarn/local/nmPrivate/application_1543327888220_0001/container_e14_1543327888220_0001_01_000002/container_e14_1543327888220_0001_01_000002.pid.tmp
Writing to cgroup task files...
Creating local dirs...
Launching container...
Getting exit code file...
Creating script paths...
Full command array for failed execution:
[/usr/hdp/current/hadoop-yarn-nodemanager/bin/container-executor, hdp, hdp, 1, application_1543327888220_0001, container_e14_1543327888220_0001_01_000002, /hadoop/yarn/local/usercache/hdp/appcache/application_1543327888220_0001/container_e14_1543327888220_0001_01_000002, /hadoop/yarn/local/nmPrivate/application_1543327888220_0001/container_e14_1543327888220_0001_01_000002/launch_container.sh, /hadoop/yarn/local/nmPrivate/application_1543327888220_0001/container_e14_1543327888220_0001_01_000002/container_e14_1543327888220_0001_01_000002.tokens, /hadoop/yarn/local/nmPrivate/application_1543327888220_0001/container_e14_1543327888220_0001_01_000002/container_e14_1543327888220_0001_01_000002.pid, /hadoop/yarn/local, /hadoop/yarn/log, cgroups=none]
2018-11-27 22:13:35,921 WARN runtime.DefaultLinuxContainerRuntime (DefaultLinuxContainerRuntime.java:launchContainer(107)) - Launch container failed. Exception: org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationException: ExitCodeException exitCode=255:
at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor.executePrivilegedOperation(PrivilegedOperationExecutor.java:175)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DefaultLinuxContainerRuntime.launchContainer(DefaultLinuxContainerRuntime.java:103)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DelegatingLinuxContainerRuntime.launchContainer(DelegatingLinuxContainerRuntime.java:89)
at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.launchContainer(LinuxContainerExecutor.java:392)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:317)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:83)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:748)
Caused by: ExitCodeException exitCode=255:
at org.apache.hadoop.util.Shell.runCommand(Shell.java:933)
at org.apache.hadoop.util.Shell.run(Shell.java:844)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:1123)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor.executePrivilegedOperation(PrivilegedOperationExecutor.java:150)
... 9 more
There is no more log about Privilege, anybody has some idea?
Thanks in advance!
Problem resolved and the problem is submitted job itself not YARN/Privilege.
Suggestion is that you'd better try to find details in container log not resourcemanager/nodemanager log.

"No FileSystem for scheme: s3" when importing data from postgres to s3 using sqoop

I tried to import data from local postgres to s3 using sqoop. My command is
sqoop import --connect jdbc:postgresql://localhost/postgres --username username --password password --table table --driver org.postgresql.Driver --target-dir s3://xxxxxxxx/data/ -m 1```
I was able to import to a local directory, but failed for a s3 bucket.
The log is posted as below.
Warning: /usr/local/Cellar/sqoop/1.4.6/libexec/bin/../../hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
Warning: /usr/local/Cellar/sqoop/1.4.6/libexec/bin/../../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
Warning: /usr/local/Cellar/sqoop/1.4.6/libexec/bin/../../zookeeper does not exist! Accumulo imports will fail.
Please set $ZOOKEEPER_HOME to the root of your Zookeeper installation.
17/05/25 23:39:50 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6
17/05/25 23:39:50 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
17/05/25 23:39:51 WARN sqoop.ConnFactory: Parameter --driver is set to an explicit driver however appropriate connection manager is not being set (via --connection-manager). Sqoop is going to fall back to org.apache.sqoop.manager.GenericJdbcManager. Please specify explicitly which connection manager should be used next time.
17/05/25 23:39:51 INFO manager.SqlManager: Using default fetchSize of 1000
17/05/25 23:39:51 INFO tool.CodeGenTool: Beginning code generation
17/05/25 23:39:51 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM table AS t WHERE 1=0
17/05/25 23:39:51 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM table AS t WHERE 1=0
17/05/25 23:39:51 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /usr/local
Note: /tmp/sqoop-user/compile/faa4eb1b79a8e71f5c732c605f8968d8/table.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
17/05/25 23:40:00 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-user/compile/faa4eb1b79a8e71f5c732c605f8968d8/table.jar
17/05/25 23:40:00 INFO mapreduce.ImportJobBase: Beginning import of table
17/05/25 23:40:00 INFO Configuration.deprecation: mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
17/05/25 23:40:01 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
17/05/25 23:40:01 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
17/05/25 23:40:01 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM table AS t WHERE 1=0
17/05/25 23:40:01 ERROR sqoop.Sqoop: Got exception running Sqoop: java.lang.RuntimeException: java.io.IOException: No FileSystem for scheme: s3
java.lang.RuntimeException: java.io.IOException: No FileSystem for scheme: s3
at org.apache.hadoop.mapreduce.lib.output.FileOutputFormat.setOutputPath(FileOutputFormat.java:164)
at org.apache.sqoop.mapreduce.ImportJobBase.configureOutputFormat(ImportJobBase.java:156)
at org.apache.sqoop.mapreduce.ImportJobBase.runImport(ImportJobBase.java:259)
at org.apache.sqoop.manager.SqlManager.importTable(SqlManager.java:673)
at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:497)
at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:605)
at org.apache.sqoop.Sqoop.run(Sqoop.java:143)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:179)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:218)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:227)
at org.apache.sqoop.Sqoop.main(Sqoop.java:236)
Caused by: java.io.IOException: No FileSystem for scheme: s3
at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2798)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2809)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:100)
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2848)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2830)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:389)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:356)
at org.apache.hadoop.mapreduce.lib.output.FileOutputFormat.setOutputPath(FileOutputFormat.java:160)
... 11 more
Really not sure how to add details to the log, or how to make log not look like code so it will not ask for more details. sorry.

Unable to initialize hive with Derby from Brew install

It had been my understanding that Derby creates file(s) in the current directory. But there are none there.
So I had tried to do the hive initialization using Derby: but .. it seems there is a derby database already.
schematool --verbose -initSchema -dbType derby
Starting metastore schema initialization to 2.1.0
Initialization script hive-schema-2.1.0.derby.sql
Connecting to jdbc:derby:;databaseName=metastore_db;create=true
Connected to: Apache Derby (version 10.10.2.0 - (1582446))
Driver: Apache Derby Embedded JDBC Driver (version 10.10.2.0 - (1582446))
Transaction isolation: TRANSACTION_READ_COMMITTED
0: jdbc:derby:> !autocommit on
Autocommit status: true
0: jdbc:derby:> CREATE FUNCTION "APP"."NUCLEUS_ASCII" (C CHAR(1)) RETURNS INTEGER LANGUAGE JAVA PARAMETER STYLE JAVA READS SQL DATA CALLED ON NULL INPUT EXTERNAL NAME 'org.datanucleus.store.rdbms.adapter.DerbySQLFunction.ascii'
Error: FUNCTION 'NUCLEUS_ASCII' already exists. (state=X0Y68,code=30000)
Closing: 0: jdbc:derby:;databaseName=metastore_db;create=true
org.apache.hadoop.hive.metastore.HiveMetaException: Schema initialization FAILED! Metastore state would be inconsistent !!
Underlying cause: java.io.IOException : Schema script failed, errorcode 2
org.apache.hadoop.hive.metastore.HiveMetaException: Schema initialization FAILED! Metastore state would be inconsistent !!
at org.apache.hive.beeline.HiveSchemaTool.doInit(HiveSchemaTool.java:291)
at org.apache.hive.beeline.HiveSchemaTool.doInit(HiveSchemaTool.java:264)
at org.apache.hive.beeline.HiveSchemaTool.main(HiveSchemaTool.java:505)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: java.io.IOException: Schema script failed, errorcode 2
at org.apache.hive.beeline.HiveSchemaTool.runBeeLine(HiveSchemaTool.java:390)
at org.apache.hive.beeline.HiveSchemaTool.runBeeLine(HiveSchemaTool.java:347)
at org.apache.hive.beeline.HiveSchemaTool.doInit(HiveSchemaTool.java:287)
So .. where is it?
Update I have reinstalled hive from scratch using
brew reinstall hive
And the same error occurs.
Another update Given the new direction of this error it now is answered by within another question:
An answer to a non-os/x - but similar otherwise - question was found that can serve here:
https://stackoverflow.com/a/40017753/1056563
I installed hive with HomeBrew(MacOS) at /usr/local/Cellar/hive and afer running schematool -dbType derby -initSchema I get the following error message:
Starting metastore schema initialization to 2.0.0 Initialization script hive-schema-2.0.0.derby.sql Error: FUNCTION 'NUCLEUS_ASCII' already exists. (state=X0Y68,code=30000) org.apache.hadoop.hive.metastore.HiveMetaException: Schema initialization FAILED! Metastore state would be inconsistent !!
However, I can't find either metastore_db or metastore_db.tmp folder under install path, so I tried:
find /usr/ -name hive-schema-2.0.0.derby.sql
vi /usr/local/Cellar/hive/2.0.1/libexec/scripts/metastore/upgrade/derby/hive-schema-2.0.0.derby.sql
comment the 'NUCLEUS_ASCII' function and 'NUCLEUS_MATCHES' function
rerun schematool -dbType derby -initSchema, then everything goes well!
Homebrew installs Hive (version 2.3.1) unconfigured. The default settings are to use in-process Derby database (Hive already includes the required lib).
The only thing you have to do (immediatelly after brew install hive) is to initialize the database:
schematool -initSchema -dbType derby
and then you can run hive, and it will work. However, if you tried to run hive before initializing the database, Hive will actually semi-create an incomplete database and will fail to work:
show tables;
FAILED: SemanticException org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
Since the database is semi-created, schematool will now fail as well:
Error: FUNCTION 'NUCLEUS_ASCII' already exists. (state=X0Y68,code=30000)
org.apache.hadoop.hive.metastore.HiveMetaException: Schema initialization FAILED! Metastore state would be inconsistent !!
To fix that, you will have to delete the database:
rm -Rf metastore_db
and run the initilization command again.
Noticed that I deleted the metastore_db from current directory? This is another problem: Hive is configured to create and use the Derby database in current working dir. This is because it has the following default value for ‘javax.jdo.option.ConnectionURL’:
jdbc:derby:;databaseName=metastore_db;create=true
To fix that, create file /usr/local/opt/hive/libexec/conf/hive-site.xml as
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:derby:/usr/local/var/hive/metastore_db;create=true</value>
</property>
</configuration>
and recreate the database like before. Now the database is in /usr/local/var/hive, so in case you again accidentally ran hive before initializing the DB, delete it with:
rm -Rf /usr/local/var/hive
You might have to look at the hive configuration file. That should tell you where it is being initialized.