org.apache.hadoop.hive.metastore.HiveMetaException: Unknown version specified for initialization: 3.1.0 - hive

After resolving a number of other issues, I ran into the following one, and I have no idea why it occurs. If anyone can spot the problem, please let me know.
$ schematool -dbType derby -initSchema
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/C:/apache/apache-hive-3.1.3-bin/lib/log4j-slf4j-impl-2.17.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/C:/hadoop-3.3.4/share/hadoop/common/lib/slf4j-reload4j-1.7.36.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Metastore connection URL: jdbc:derby:;databaseName=metastore_db;create=true
Metastore Connection Driver : org.apache.derby.jdbc.EmbeddedDriver
Metastore connection User: APP
org.apache.hadoop.hive.metastore.HiveMetaException: Failed to get schema version.
Underlying cause: java.sql.SQLException : Failed to start database 'metastore_db' with class loader sun.misc.Launcher$AppClassLoader@6a5fc7f7, see the next exception for details.
SQL Error code: 40000
Use --verbose for detailed stacktrace.
*** schemaTool failed ***
With --verbose:
$ schematool -dbType derby -initSchema --verbose
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/C:/apache/apache-hive-3.1.3-bin/lib/log4j-slf4j-impl-2.17.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/C:/hadoop-3.3.4/share/hadoop/common/lib/slf4j-reload4j-1.7.36.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Metastore connection URL: jdbc:derby://localhost:1527/metastore_db;create=true
Metastore Connection Driver : org.apache.derby.jdbc.ClientDriver
Metastore connection User: APP
Starting metastore schema initialization to 3.1.0
org.apache.hadoop.hive.metastore.HiveMetaException: Unknown version specified for initialization: 3.1.0
org.apache.hadoop.hive.metastore.HiveMetaException: Unknown version specified for initialization: 3.1.0
at org.apache.hadoop.hive.metastore.MetaStoreSchemaInfo.generateInitFileName(MetaStoreSchemaInfo.java:137)
at org.apache.hive.beeline.HiveSchemaTool.doInit(HiveSchemaTool.java:585)
at org.apache.hive.beeline.HiveSchemaTool.doInit(HiveSchemaTool.java:567)
at org.apache.hive.beeline.HiveSchemaTool.main(HiveSchemaTool.java:1517)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:323)
at org.apache.hadoop.util.RunJar.main(RunJar.java:236)
*** schemaTool failed ***
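One thing worth checking, offered as a rough sketch rather than a confirmed diagnosis: the trace fails in MetaStoreSchemaInfo.generateInitFileName, which suggests schematool could not locate an init script for version 3.1.0. The helper below assumes the scripts sit under <HIVE_HOME>/scripts/metastore/upgrade/<dbType>/, which is the layout visible in the Homebrew answer further down. The name check_init_script is made up for illustration.

```shell
# Hypothetical helper: check whether the schema init script that schematool
# would need is present under a given Hive home.
# Usage: check_init_script <hive_home> <schema_version> <db_type>
check_init_script() {
  hive_home="$1"
  version="$2"
  db_type="$3"
  # Assumed layout: scripts/metastore/upgrade/<db_type>/hive-schema-<version>.<db_type>.sql
  script="$hive_home/scripts/metastore/upgrade/$db_type/hive-schema-$version.$db_type.sql"
  if [ -f "$script" ]; then
    echo "found: $script"
  else
    echo "missing: $script"
    return 1
  fi
}
```

It could be called as check_init_script "$HIVE_HOME" 3.1.0 derby. If it reports the file as missing, HIVE_HOME may be unset, may point at the wrong directory, or may be in a form the shell cannot resolve.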

Related

Flink submit task failed

I am using Flink 1.6.1 and Hadoop 2.7.5. First I start a Flink YARN session:
bin/yarn-session.sh -n 2 -jm 1024 -tm 1024 -d
Then I submit a job:
./bin/flink run ./examples/batch/WordCount.jar -input hdfs://CS-201:9000/LICENSE -output hdfs://CS-201:9000/wordcount-result.txt
I got an error:
[root@CS-201 flink-1.6.1]# ./bin/flink run ./examples/batch/WordCount.jar -input hdfs://CS-201:9000/LICENSE -output hdfs://CS-201:9000/wordcount-result.txt
2019-05-19 15:31:11,357 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli - Found Yarn properties file under /tmp/.yarn-properties-root.
2019-05-19 15:31:11,357 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli - Found Yarn properties file under /tmp/.yarn-properties-root.
2019-05-19 15:31:11,737 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli - YARN properties set default parallelism to 2
2019-05-19 15:31:11,737 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli - YARN properties set default parallelism to 2
YARN properties set default parallelism to 2
2019-05-19 15:31:11,777 INFO org.apache.hadoop.yarn.client.RMProxy - Connecting to ResourceManager at CS-201/192.168.1.201:8032
2019-05-19 15:31:11,887 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli - No path for the flink jar passed. Using the location of class org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
2019-05-19 15:31:11,887 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli - No path for the flink jar passed. Using the location of class org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
2019-05-19 15:31:11,891 WARN org.apache.flink.yarn.AbstractYarnClusterDescriptor - Neither the HADOOP_CONF_DIR nor the YARN_CONF_DIR environment variable is set.The Flink YARN Client needs one of these to be set to properly load the Hadoop configuration for accessing YARN.
2019-05-19 15:31:11,979 INFO org.apache.flink.yarn.AbstractYarnClusterDescriptor - Found application JobManager host name 'cs-202' and port '52389' from supplied application id 'application_1558248666499_0003'
Starting execution of program
------------------------------------------------------------
The program finished with the following exception:
org.apache.flink.client.program.ProgramInvocationException: Could not retrieve the execution result. (JobID: 471f0c2d047aba74ea621c5bfe782cbf)
at org.apache.flink.client.program.rest.RestClusterClient.submitJob(RestClusterClient.java:260)
at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:486)
at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:474)
at org.apache.flink.client.program.ContextEnvironment.execute(ContextEnvironment.java:62)
at org.apache.flink.examples.java.wordcount.WordCount.main(WordCount.java:85)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:529)
at org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:421)
at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:426)
at org.apache.flink.client.cli.CliFrontend.executeProgram(CliFrontend.java:804)
at org.apache.flink.client.cli.CliFrontend.runProgram(CliFrontend.java:280)
at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:215)
at org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:1044)
at org.apache.flink.client.cli.CliFrontend.lambda$main$11(CliFrontend.java:1120)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1754)
at org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1120)
Caused by: org.apache.flink.runtime.client.JobSubmissionException: Failed to submit JobGraph.
at org.apache.flink.client.program.rest.RestClusterClient.lambda$submitJob$8(RestClusterClient.java:379)
at java.util.concurrent.CompletableFuture.uniExceptionally(CompletableFuture.java:870)
at java.util.concurrent.CompletableFuture$UniExceptionally.tryFire(CompletableFuture.java:852)
at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1977)
at org.apache.flink.runtime.concurrent.FutureUtils.lambda$retryOperationWithDelay$5(FutureUtils.java:213)
at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:760)
at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:736)
at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
at java.util.concurrent.CompletableFuture.postFire(CompletableFuture.java:561)
at java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:929)
at java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:442)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.util.concurrent.CompletionException: org.apache.flink.runtime.concurrent.FutureUtils$RetryException: Could not complete the operation. Exception is not retryable.
at java.util.concurrent.CompletableFuture.encodeRelay(CompletableFuture.java:326)
at java.util.concurrent.CompletableFuture.completeRelay(CompletableFuture.java:338)
at java.util.concurrent.CompletableFuture.uniRelay(CompletableFuture.java:911)
at java.util.concurrent.CompletableFuture$UniRelay.tryFire(CompletableFuture.java:899)
... 12 more
Caused by: org.apache.flink.runtime.concurrent.FutureUtils$RetryException: Could not complete the operation. Exception is not retryable.
... 10 more
Caused by: java.util.concurrent.CompletionException: org.apache.flink.runtime.rest.util.RestClientException: [Job submission failed.]
at java.util.concurrent.CompletableFuture.encodeRelay(CompletableFuture.java:326)
at java.util.concurrent.CompletableFuture.completeRelay(CompletableFuture.java:338)
at java.util.concurrent.CompletableFuture.uniRelay(CompletableFuture.java:911)
at java.util.concurrent.CompletableFuture.uniCompose(CompletableFuture.java:953)
at java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:926)
... 4 more
Caused by: org.apache.flink.runtime.rest.util.RestClientException: [Job submission failed.]
at org.apache.flink.runtime.rest.RestClient.parseResponse(RestClient.java:310)
at org.apache.flink.runtime.rest.RestClient.lambda$submitRequest$3(RestClient.java:294)
at java.util.concurrent.CompletableFuture.uniCompose(CompletableFuture.java:952)
... 5 more
Why does this happen, and how can I fix it?
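The WARN line in the log above says that neither HADOOP_CONF_DIR nor YARN_CONF_DIR is set. Whether that is what makes the submission fail is not established here, but it is cheap to rule out. A minimal sketch (the function name hadoop_conf_status is hypothetical) that reports which of the two variables, if any, points at an existing directory:

```shell
# Hypothetical helper: report which Hadoop config variable is usable.
# Succeeds if HADOOP_CONF_DIR or YARN_CONF_DIR points at an existing directory.
hadoop_conf_status() {
  for name in HADOOP_CONF_DIR YARN_CONF_DIR; do
    # Indirect lookup of the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -n "$value" ] && [ -d "$value" ]; then
      echo "$name=$value"
      return 0
    fi
  done
  echo "neither HADOOP_CONF_DIR nor YARN_CONF_DIR points at a directory"
  return 1
}
```

If the check fails, exporting HADOOP_CONF_DIR before running bin/flink is the obvious next step. The directory is usually the one holding core-site.xml and yarn-site.xml, often $HADOOP_HOME/etc/hadoop, though that path is an assumption about the install.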

Apache flink - Timeout after submitting job on hadoop / yarn cluster

I am trying to upgrade our job from flink 1.4.2 to 1.7.1 but I keep running into timeouts after submitting the job. The flink job runs on our hadoop cluster (version 2.7) with Yarn.
I've seen the following behavior:
Using the same flink-conf.yaml as in 1.4.2: versions 1.5.6, 1.6.3, and 1.7.1 all time out, while 1.4.2 works.
Using 1.5.6 with "mode: legacy" (to switch off flip-6) works.
Using 1.7.1 with "mode: legacy" gives a timeout (I assume this option was removed but the documentation is outdated? https://ci.apache.org/projects/flink/flink-docs-stable/ops/config.html#legacy)
When the timeout happens I get the following stacktrace:
INFO class java.time.Instant does not contain a getter for field seconds
INFO class com.bol.fin_hdp.cm1.domain.Cm1Transportable does not contain a getter for field globalId
INFO Submitting job 5af931bcef395a78b5af2b97e92dcffe (detached: false).
INFO ------------------------------------------------------------
INFO The program finished with the following exception:
INFO org.apache.flink.client.program.ProgramInvocationException: The main method caused an error.
INFO at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:545)
INFO at org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:420)
INFO at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:404)
INFO at org.apache.flink.client.cli.CliFrontend.executeProgram(CliFrontend.java:798)
INFO at org.apache.flink.client.cli.CliFrontend.runProgram(CliFrontend.java:289)
INFO at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:215)
INFO at org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:1035)
INFO at org.apache.flink.client.cli.CliFrontend.lambda$main$9(CliFrontend.java:1111)
INFO at java.security.AccessController.doPrivileged(Native Method)
INFO at javax.security.auth.Subject.doAs(Subject.java:422)
INFO at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1754)
INFO at org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
INFO at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1111)
INFO Caused by: java.lang.RuntimeException: org.apache.flink.client.program.ProgramInvocationException: Could not retrieve the execution result.
INFO at com.bol.fin_hdp.job.starter.IntervalJobStarter.startJob(IntervalJobStarter.java:43)
INFO at com.bol.fin_hdp.job.starter.IntervalJobStarter.startJobWithConfig(IntervalJobStarter.java:32)
INFO at com.bol.fin_hdp.Main.main(Main.java:8)
INFO at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
INFO at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
INFO at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
INFO at java.lang.reflect.Method.invoke(Method.java:498)
INFO at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:528)
INFO ... 12 more
INFO Caused by: org.apache.flink.client.program.ProgramInvocationException: Could not retrieve the execution result.
INFO at org.apache.flink.client.program.rest.RestClusterClient.submitJob(RestClusterClient.java:258)
INFO at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:464)
INFO at org.apache.flink.streaming.api.environment.StreamContextEnvironment.execute(StreamContextEnvironment.java:66)
INFO at com.bol.fin_hdp.cm1.job.Job.execute(Job.java:54)
INFO at com.bol.fin_hdp.job.starter.IntervalJobStarter.startJob(IntervalJobStarter.java:41)
INFO ... 19 more
INFO Caused by: org.apache.flink.runtime.client.JobSubmissionException: Failed to submit JobGraph.
INFO at org.apache.flink.client.program.rest.RestClusterClient.lambda$submitJob$8(RestClusterClient.java:371)
INFO at java.util.concurrent.CompletableFuture.uniExceptionally(CompletableFuture.java:870)
INFO at java.util.concurrent.CompletableFuture$UniExceptionally.tryFire(CompletableFuture.java:852)
INFO at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
INFO at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1977)
INFO at org.apache.flink.runtime.concurrent.FutureUtils.lambda$retryOperationWithDelay$5(FutureUtils.java:216)
INFO at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:760)
INFO at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:736)
INFO at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
INFO at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1977)
INFO at org.apache.flink.runtime.rest.RestClient.lambda$submitRequest$1(RestClient.java:301)
INFO at org.apache.flink.shaded.netty4.io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:680)
INFO at org.apache.flink.shaded.netty4.io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:603)
INFO at org.apache.flink.shaded.netty4.io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:563)
INFO at org.apache.flink.shaded.netty4.io.netty.util.concurrent.DefaultPromise.tryFailure(DefaultPromise.java:424)
INFO at org.apache.flink.shaded.netty4.io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe$1.run(AbstractNioChannel.java:214)
INFO at org.apache.flink.shaded.netty4.io.netty.util.concurrent.PromiseTask$RunnableAdapter.call(PromiseTask.java:38)
INFO at org.apache.flink.shaded.netty4.io.netty.util.concurrent.ScheduledFutureTask.run(ScheduledFutureTask.java:120)
INFO at org.apache.flink.shaded.netty4.io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:357)
INFO at org.apache.flink.shaded.netty4.io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:357)
INFO at org.apache.flink.shaded.netty4.io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
INFO at org.apache.flink.shaded.netty4.io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:137)
INFO at java.lang.Thread.run(Thread.java:748)
INFO Caused by: org.apache.flink.runtime.concurrent.FutureUtils$RetryException: Could not complete the operation. Number of retries has been exhausted.
INFO at org.apache.flink.runtime.concurrent.FutureUtils.lambda$retryOperationWithDelay$5(FutureUtils.java:213)
INFO ... 17 more
INFO Caused by: java.util.concurrent.CompletionException: org.apache.flink.shaded.netty4.io.netty.channel.ConnectTimeoutException: connection timed out: shd-hdp-b-slave-01...
INFO at java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:292)
INFO at java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:308)
INFO at java.util.concurrent.CompletableFuture.uniCompose(CompletableFuture.java:943)
INFO at java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:926)
INFO ... 15 more
INFO Caused by: org.apache.flink.shaded.netty4.io.netty.channel.ConnectTimeoutException: connection timed out: shd-hdp-b-slave-017.example.com/some.ip.address:46500
INFO at org.apache.flink.shaded.netty4.io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe$1.run(AbstractNioChannel.java:212)
INFO ... 7 more
What changed in flip-6 that might cause this behavior and how can I fix this?
For our jobs on YARN with Flink 1.6, we had to raise the web.timeout setting via -yD web.timeout=100000.
In our case, there was a firewall between the machine submitting the job and our Hadoop cluster.
In newer Flink versions (1.7 and up) Flink uses REST to submit jobs. The port number for this REST service is random on yarn setups and could not be set.
Flink 1.8.0 introduced a config option to set this to a port or port range using:
rest.bind-port: 55520-55530
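A sketch of applying that setting, assuming Flink 1.8.0 or later and a flink-conf.yaml with one flat "key: value" entry per line. The helper name set_flink_conf is made up for illustration; the key and the port range are the ones from the answer above.

```shell
# Hypothetical helper: set one flat "key: value" entry in a flink-conf.yaml,
# replacing any existing entry for the same key.
# Usage: set_flink_conf <conf_file> <key> <value>
set_flink_conf() {
  conf_file="$1"
  key="$2"
  value="$3"
  tmp_file="$conf_file.tmp"
  if [ -f "$conf_file" ]; then
    # Keep every line that does not start with "<key>:" (literal prefix match).
    awk -v k="$key:" 'index($0, k) != 1' "$conf_file" > "$tmp_file"
  else
    : > "$tmp_file"
  fi
  # Append the new setting and swap the file into place.
  printf '%s: %s\n' "$key" "$value" >> "$tmp_file"
  mv "$tmp_file" "$conf_file"
}
```

For example: set_flink_conf "$FLINK_HOME/conf/flink-conf.yaml" rest.bind-port 55520-55530, where FLINK_HOME is an assumed variable pointing at the Flink install. The same range then has to be opened in the firewall between the submitting machine and the cluster.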

"No FileSystem for scheme: s3" when importing data from postgres to s3 using sqoop

I tried to import data from local postgres to s3 using sqoop. My command is
sqoop import --connect jdbc:postgresql://localhost/postgres --username username --password password --table table --driver org.postgresql.Driver --target-dir s3://xxxxxxxx/data/ -m 1
I was able to import to a local directory, but the import failed for an S3 bucket.
The log is posted below.
Warning: /usr/local/Cellar/sqoop/1.4.6/libexec/bin/../../hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
Warning: /usr/local/Cellar/sqoop/1.4.6/libexec/bin/../../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
Warning: /usr/local/Cellar/sqoop/1.4.6/libexec/bin/../../zookeeper does not exist! Accumulo imports will fail.
Please set $ZOOKEEPER_HOME to the root of your Zookeeper installation.
17/05/25 23:39:50 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6
17/05/25 23:39:50 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
17/05/25 23:39:51 WARN sqoop.ConnFactory: Parameter --driver is set to an explicit driver however appropriate connection manager is not being set (via --connection-manager). Sqoop is going to fall back to org.apache.sqoop.manager.GenericJdbcManager. Please specify explicitly which connection manager should be used next time.
17/05/25 23:39:51 INFO manager.SqlManager: Using default fetchSize of 1000
17/05/25 23:39:51 INFO tool.CodeGenTool: Beginning code generation
17/05/25 23:39:51 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM table AS t WHERE 1=0
17/05/25 23:39:51 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM table AS t WHERE 1=0
17/05/25 23:39:51 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /usr/local
Note: /tmp/sqoop-user/compile/faa4eb1b79a8e71f5c732c605f8968d8/table.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
17/05/25 23:40:00 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-user/compile/faa4eb1b79a8e71f5c732c605f8968d8/table.jar
17/05/25 23:40:00 INFO mapreduce.ImportJobBase: Beginning import of table
17/05/25 23:40:00 INFO Configuration.deprecation: mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
17/05/25 23:40:01 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
17/05/25 23:40:01 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
17/05/25 23:40:01 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM table AS t WHERE 1=0
17/05/25 23:40:01 ERROR sqoop.Sqoop: Got exception running Sqoop: java.lang.RuntimeException: java.io.IOException: No FileSystem for scheme: s3
java.lang.RuntimeException: java.io.IOException: No FileSystem for scheme: s3
at org.apache.hadoop.mapreduce.lib.output.FileOutputFormat.setOutputPath(FileOutputFormat.java:164)
at org.apache.sqoop.mapreduce.ImportJobBase.configureOutputFormat(ImportJobBase.java:156)
at org.apache.sqoop.mapreduce.ImportJobBase.runImport(ImportJobBase.java:259)
at org.apache.sqoop.manager.SqlManager.importTable(SqlManager.java:673)
at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:497)
at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:605)
at org.apache.sqoop.Sqoop.run(Sqoop.java:143)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:179)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:218)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:227)
at org.apache.sqoop.Sqoop.main(Sqoop.java:236)
Caused by: java.io.IOException: No FileSystem for scheme: s3
at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2798)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2809)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:100)
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2848)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2830)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:389)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:356)
at org.apache.hadoop.mapreduce.lib.output.FileOutputFormat.setOutputPath(FileOutputFormat.java:160)
... 11 more
I'm not sure how to add details to the log, or how to make the log not look like code so that the site stops asking for more details. Sorry.

Unable to initialize hive with Derby from Brew install

My understanding was that Derby creates its files in the current directory, but there are none there.
So I tried to initialize Hive using Derby, but it seems a Derby database already exists.
schematool --verbose -initSchema -dbType derby
Starting metastore schema initialization to 2.1.0
Initialization script hive-schema-2.1.0.derby.sql
Connecting to jdbc:derby:;databaseName=metastore_db;create=true
Connected to: Apache Derby (version 10.10.2.0 - (1582446))
Driver: Apache Derby Embedded JDBC Driver (version 10.10.2.0 - (1582446))
Transaction isolation: TRANSACTION_READ_COMMITTED
0: jdbc:derby:> !autocommit on
Autocommit status: true
0: jdbc:derby:> CREATE FUNCTION "APP"."NUCLEUS_ASCII" (C CHAR(1)) RETURNS INTEGER LANGUAGE JAVA PARAMETER STYLE JAVA READS SQL DATA CALLED ON NULL INPUT EXTERNAL NAME 'org.datanucleus.store.rdbms.adapter.DerbySQLFunction.ascii'
Error: FUNCTION 'NUCLEUS_ASCII' already exists. (state=X0Y68,code=30000)
Closing: 0: jdbc:derby:;databaseName=metastore_db;create=true
org.apache.hadoop.hive.metastore.HiveMetaException: Schema initialization FAILED! Metastore state would be inconsistent !!
Underlying cause: java.io.IOException : Schema script failed, errorcode 2
org.apache.hadoop.hive.metastore.HiveMetaException: Schema initialization FAILED! Metastore state would be inconsistent !!
at org.apache.hive.beeline.HiveSchemaTool.doInit(HiveSchemaTool.java:291)
at org.apache.hive.beeline.HiveSchemaTool.doInit(HiveSchemaTool.java:264)
at org.apache.hive.beeline.HiveSchemaTool.main(HiveSchemaTool.java:505)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: java.io.IOException: Schema script failed, errorcode 2
at org.apache.hive.beeline.HiveSchemaTool.runBeeLine(HiveSchemaTool.java:390)
at org.apache.hive.beeline.HiveSchemaTool.runBeeLine(HiveSchemaTool.java:347)
at org.apache.hive.beeline.HiveSchemaTool.doInit(HiveSchemaTool.java:287)
So where is it?
Update: I have reinstalled Hive from scratch using
brew reinstall hive
And the same error occurs.
Another update: given the new direction of this error, it is now answered within another question.
An answer to a similar, though non-macOS, question can serve here:
https://stackoverflow.com/a/40017753/1056563
I installed Hive with Homebrew (macOS) at /usr/local/Cellar/hive, and after running schematool -dbType derby -initSchema I got the following error message:
Starting metastore schema initialization to 2.0.0
Initialization script hive-schema-2.0.0.derby.sql
Error: FUNCTION 'NUCLEUS_ASCII' already exists. (state=X0Y68,code=30000)
org.apache.hadoop.hive.metastore.HiveMetaException: Schema initialization FAILED! Metastore state would be inconsistent !!
However, I couldn't find either a metastore_db or a metastore_db.tmp folder under the install path, so I tried the following:
find /usr/ -name hive-schema-2.0.0.derby.sql
vi /usr/local/Cellar/hive/2.0.1/libexec/scripts/metastore/upgrade/derby/hive-schema-2.0.0.derby.sql
In that file, comment out the 'NUCLEUS_ASCII' and 'NUCLEUS_MATCHES' function definitions.
Rerun schematool -dbType derby -initSchema, and everything goes well!
Homebrew installs Hive (version 2.3.1) unconfigured. The default settings use an in-process Derby database (Hive already includes the required library).
The only thing you have to do (immediately after brew install hive) is to initialize the database:
schematool -initSchema -dbType derby
and then you can run hive, and it will work. However, if you tried to run hive before initializing the database, Hive will actually semi-create an incomplete database and will fail to work:
show tables;
FAILED: SemanticException org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
Since the database is semi-created, schematool will now fail as well:
Error: FUNCTION 'NUCLEUS_ASCII' already exists. (state=X0Y68,code=30000)
org.apache.hadoop.hive.metastore.HiveMetaException: Schema initialization FAILED! Metastore state would be inconsistent !!
To fix that, you will have to delete the database:
rm -Rf metastore_db
and run the initialization command again.
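The cleanup step above can be sketched as follows. This is a cautious variant rather than the exact method described: it moves the half-created database aside instead of deleting it. The function name and the backup suffix are illustrative.

```shell
# Hypothetical helper: move a half-created Derby metastore out of the way.
# Usage: set_aside_metastore [dir]   (defaults to ./metastore_db)
set_aside_metastore() {
  db_dir="${1:-metastore_db}"
  if [ -d "$db_dir" ]; then
    # Timestamped backup name so repeated runs do not collide.
    backup="$db_dir.bak.$(date +%Y%m%d%H%M%S)"
    mv "$db_dir" "$backup"
    echo "moved $db_dir to $backup"
  else
    echo "no $db_dir here, nothing to move"
  fi
}
```

Run it from the directory in which hive was started, then rerun schematool -initSchema -dbType derby. Once the metastore works, the backup directory can be removed.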
Notice that I deleted metastore_db from the current directory? This points to another problem: Hive is configured to create and use the Derby database in the current working directory, because 'javax.jdo.option.ConnectionURL' has the following default value:
jdbc:derby:;databaseName=metastore_db;create=true
To fix that, create file /usr/local/opt/hive/libexec/conf/hive-site.xml as
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:derby:/usr/local/var/hive/metastore_db;create=true</value>
</property>
</configuration>
and recreate the database as before. Now the database is in /usr/local/var/hive, so if you accidentally run hive again before initializing the DB, delete it with:
rm -Rf /usr/local/var/hive
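Writing that configuration file can also be scripted. A minimal sketch, assuming the Homebrew paths used above (they will differ on other installs). The name write_hive_site is hypothetical, and the function overwrites any existing hive-site.xml in the target directory.

```shell
# Hypothetical helper: write a minimal hive-site.xml that pins the Derby
# metastore to a fixed directory. Overwrites an existing file.
# Usage: write_hive_site <conf_dir> <metastore_dir>
write_hive_site() {
  conf_dir="$1"
  metastore_dir="$2"
  mkdir -p "$conf_dir"
  # Unquoted heredoc delimiter so $metastore_dir is expanded into the URL.
  cat > "$conf_dir/hive-site.xml" <<EOF
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:derby:$metastore_dir;create=true</value>
  </property>
</configuration>
EOF
}
```

With the paths from this answer the call would be write_hive_site /usr/local/opt/hive/libexec/conf /usr/local/var/hive/metastore_db, followed by schematool -initSchema -dbType derby.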
You might have to look at the hive configuration file. That should tell you where it is being initialized.

Failed to schematool -initSchema -dbType derby

org.apache.hadoop.hive.metastore.HiveMetaException: Failed to get schema version.
Underlying cause: java.sql.SQLException : Failed to create database 'metastore_db', see the next exception for details.
SQL Error code: 40000
Use --verbose for detailed stacktrace.
*** schemaTool failed ***
FYI, please check the permissions on the Hive installation directory.
It should be owned by the same user that runs Hadoop.
That is how it worked for me.
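A small sketch for checking that ownership. The helper name owned_by is hypothetical, and which account counts as the Hadoop user depends on the setup.

```shell
# Hypothetical helper: succeed if <dir> itself is owned by <user>
# (<user> may be a login name or a numeric UID).
# Usage: owned_by <dir> <user>
owned_by() {
  dir="$1"
  user="$2"
  # find prints the directory only when its owner matches;
  # -maxdepth 0 keeps it from descending into the contents.
  [ -n "$(find "$dir" -maxdepth 0 -user "$user" 2>/dev/null)" ]
}
```

For example, owned_by "$HIVE_HOME" hadoop || echo "owner differs", where the user name hadoop is only an assumption. If the owner differs, changing it with chown (or running schematool as the owning user) is the fix this answer points to.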