SparkSQL error: Table Not Found

I converted an RDD[myClass] to a DataFrame and then registered it as a
SQL table:
my_rdd.toDF().registerTempTable("my_rdd")
The table can be queried, as the following command demonstrates:
%sql
SELECT * from my_rdd limit 5
But the next step fails with the error Table Not Found: my_rdd:
val my_df = sqlContext.sql("SELECT * from my_rdd limit 5")
I am quite new to Spark and do not understand why this is happening. Can anyone help me with this?
java.lang.RuntimeException: Table Not Found: my_rdd
at scala.sys.package$.error(package.scala:27)
at org.apache.spark.sql.catalyst.analysis.SimpleCatalog$$anonfun$1.apply(Catalog.scala:111)
at org.apache.spark.sql.catalyst.analysis.SimpleCatalog$$anonfun$1.apply(Catalog.scala:111)
at scala.collection.MapLike$class.getOrElse(MapLike.scala:128)
at scala.collection.AbstractMap.getOrElse(Map.scala:58)
at org.apache.spark.sql.catalyst.analysis.SimpleCatalog.lookupRelation(Catalog.scala:111)
at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$.getTable(Analyzer.scala:175)
at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$$anonfun$apply$6.applyOrElse(Analyzer.scala:187)
at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$$anonfun$apply$6.applyOrElse(Analyzer.scala:182)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$3.apply(TreeNode.scala:187)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$3.apply(TreeNode.scala:187)
at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:50)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:186)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:207)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
at scala.collection.Iterator$class.foreach(Iterator.scala:727)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103)
at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:47)
at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:273)
at scala.collection.AbstractIterator.to(Iterator.scala:1157)
at scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:265)
at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1157)
at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:252)
at scala.collection.AbstractIterator.toArray(Iterator.scala:1157)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformChildrenDown(TreeNode.scala:236)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:192)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:207)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
at scala.collection.Iterator$class.foreach(Iterator.scala:727)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103)
at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:47)
at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:273)
at scala.collection.AbstractIterator.to(Iterator.scala:1157)
at scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:265)
at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1157)
at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:252)
at scala.collection.AbstractIterator.toArray(Iterator.scala:1157)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformChildrenDown(TreeNode.scala:236)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:192)
at org.apache.spark.sql.catalyst.trees.TreeNode.transform(TreeNode.scala:177)
at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$.apply(Analyzer.scala:182)
at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$.apply(Analyzer.scala:172)
at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$apply$1$$anonfun$apply$2.apply(RuleExecutor.scala:61)
at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$apply$1$$anonfun$apply$2.apply(RuleExecutor.scala:59)
at scala.collection.LinearSeqOptimized$class.foldLeft(LinearSeqOptimized.scala:111)
at scala.collection.immutable.List.foldLeft(List.scala:84)
at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$apply$1.apply(RuleExecutor.scala:59)
at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$apply$1.apply(RuleExecutor.scala:51)
at scala.collection.immutable.List.foreach(List.scala:318)
at org.apache.spark.sql.catalyst.rules.RuleExecutor.apply(RuleExecutor.scala:51)
at org.apache.spark.sql.SQLContext$QueryExecution.analyzed$lzycompute(SQLContext.scala:1071)
at org.apache.spark.sql.SQLContext$QueryExecution.analyzed(SQLContext.scala:1071)
at org.apache.spark.sql.SQLContext$QueryExecution.assertAnalyzed(SQLContext.scala:1069)
at org.apache.spark.sql.DataFrame.<init>(DataFrame.scala:133)
at org.apache.spark.sql.DataFrame$.apply(DataFrame.scala:51)
at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:915)
at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:68)
at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:73)
at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:75)
at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:77)
at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:79)
at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:81)
at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:83)
at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:85)
at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:87)
at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:89)
at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:91)
at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:93)
at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:95)
at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:97)
at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:99)
at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:101)
at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:103)
at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:105)
at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:107)
at $iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:109)
at $iwC$$iwC$$iwC$$iwC.<init>(<console>:111)
at $iwC$$iwC$$iwC.<init>(<console>:113)
at $iwC$$iwC.<init>(<console>:115)
at $iwC.<init>(<console>:117)
at <init>(<console>:119)
at .<init>(<console>:123)
at .<clinit>(<console>)
at .<init>(<console>:7)
at .<clinit>(<console>)
at $print(<console>)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:483)
at org.apache.spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:1065)
at org.apache.spark.repl.SparkIMain$Request.loadAndRun(SparkIMain.scala:1338)
at org.apache.spark.repl.SparkIMain.loadAndRunReq$1(SparkIMain.scala:840)
at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:871)
at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:819)
at org.apache.zeppelin.spark.SparkInterpreter.interpretInput(SparkInterpreter.java:556)
at org.apache.zeppelin.spark.SparkInterpreter.interpret(SparkInterpreter.java:532)
at org.apache.zeppelin.spark.SparkInterpreter.interpret(SparkInterpreter.java:525)
at org.apache.zeppelin.interpreter.ClassloaderInterpreter.interpret(ClassloaderInterpreter.java:57)
at org.apache.zeppelin.interpreter.LazyOpenInterpreter.interpret(LazyOpenInterpreter.java:93)
at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:264)
at org.apache.zeppelin.scheduler.Job.run(Job.java:170)
at org.apache.zeppelin.scheduler.FIFOScheduler$1.run(FIFOScheduler.java:118)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)

Make sure to import the implicits._ from the same SQLContext. Temporary tables are kept in-memory in one specific SQLContext.
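import org.apache.spark.sql.SQLContext
// Create one SQLContext and use it for both registering and querying
val sqlContext = new SQLContext(sc)
// toDF() is provided by the implicits of this same context
import sqlContext.implicits._
my_rdd.toDF().registerTempTable("my_rdd")
val my_df = sqlContext.sql("SELECT * from my_rdd LIMIT 5")
my_df.collect().foreach(println)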
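To illustrate why the context matters, here is a toy model in plain Java. It is not Spark's API: the hypothetical ToyContext class simply keeps its own map of temporary table names, so a table registered through one instance cannot be found through another. That matches the map lookup visible in the SimpleCatalog.lookupRelation frames of the stack trace.

```java
import java.util.HashMap;
import java.util.Map;

// Toy stand-in for a SQL context: NOT Spark's API, just a per-instance table registry.
final class ToyContext {
    private final Map<String, String> tempTables = new HashMap<>();

    // Register a temporary table name in THIS context only.
    void registerTempTable(String name, String data) {
        tempTables.put(name, data);
    }

    // Look the table up in this context's own registry; fail if it is absent.
    String table(String name) {
        String data = tempTables.get(name);
        if (data == null) {
            throw new RuntimeException("Table Not Found: " + name);
        }
        return data;
    }

    public static void main(String[] args) {
        ToyContext first = new ToyContext();
        ToyContext second = new ToyContext();
        first.registerTempTable("my_rdd", "rows");

        // Found: same context that registered the table.
        System.out.println(first.table("my_rdd"));
        try {
            // Not found: a different context has its own, empty registry.
            second.table("my_rdd");
        } catch (RuntimeException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

If the notebook's %sql paragraph and the Scala paragraph end up holding different context objects, the second lookup is the situation you would be in.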

I found it easy to cause problems with temp tables when more than one Zeppelin session is open, either in your own browser or from someone else using the same server. The variable sqlContext is shared across those sessions, and it's easy to overwrite its value.

To solve this problem, I copied the core-site.xml, hive-site.xml and hdfs-site.xml files into the conf directory.

I faced a similar problem. I was loading a table that was not present in the warehouse folder, even though the Hive console was showing me the table name. You can check the detailed description of the table you are loading using describe formatted table_name. You don't need to copy any file to the spark/conf folder. It is already integrated.

I hit the same error in a different case and solved it by using the same context throughout. If you use hiveContext, make sure that you use it all the time. For example, if you first run sqlContext.sql("load data input XXX") and then hiveContext.sql("select * from XXX"), you will run into this problem.
Every context has its own lifecycle, so do not use two contexts with the same DataFrame.

Related

Failed to import file

I have an RDF file that can be imported without any issues into another RDF store (Stardog) but keeps failing in GraphDB with this error:
15:58:18.900 [import-task-3-thread-1] ERROR c.o.f.i.MultipartFileImportRunnableTask - Could not import file
java.lang.NullPointerException: null
at java.util.concurrent.ConcurrentHashMap.get(ConcurrentHashMap.java:936)
at org.eclipse.rdf4j.common.lang.service.ServiceRegistry.get(ServiceRegistry.java:95)
at org.eclipse.rdf4j.rio.Rio.createParser(Rio.java:100)
at org.eclipse.rdf4j.rio.Rio.createParser(Rio.java:118)
at org.eclipse.rdf4j.repository.util.RDFLoader.loadInputStreamOrReader(RDFLoader.java:279)
at org.eclipse.rdf4j.repository.util.RDFLoader.load(RDFLoader.java:197)
at org.eclipse.rdf4j.repository.base.AbstractRepositoryConnection.add(AbstractRepositoryConnection.java:329)
at com.ontotext.trree.monitorRepository.MonitorRepositoryConnection.add(MonitorRepositoryConnection.java:159)
at com.ontotext.trree.parallel.ParallelRDFLoader.add(ParallelRDFLoader.java:125)
at com.ontotext.forest.impex.ParallelAwareImporter.lambda$add$3(ParallelAwareImporter.java:48)
at com.ontotext.forest.impex.ParallelAwareImporter.wrapInBeginCommit(ParallelAwareImporter.java:66)
at com.ontotext.forest.impex.ParallelAwareImporter.add(ParallelAwareImporter.java:48)
at com.ontotext.forest.impex.MultipartFileImportRunnableTask.load(MultipartFileImportRunnableTask.java:38)
at com.ontotext.forest.impex.ImportRunnableTask.run(ImportRunnableTask.java:80)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
This file can be found here : http://boetik-artistik.be/humidity_by_city.owls
All referenced ontologies are resolvable from my machine.
Thanks for your help.
Kind regards,
Johan,
I have just tried this out myself on GraphDB 8.3.1. I got a similar error when I allowed GraphDB to auto detect the import format. However, when I selected the format as "RDF/XML", it imported without a problem.
The problem is with the file extension. It should be .rdf rather than .owls.
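The top of the stack trace is consistent with this explanation: ConcurrentHashMap.get throws a NullPointerException when the key is null, which is what a registry lookup would receive if no format could be detected from the file name. Below is a minimal Java sketch of that failure mode. The detectFormat and parserFor helpers and the one-entry extension table are hypothetical stand-ins, not RDF4J or GraphDB code.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class FormatLookupSketch {
    // Hypothetical extension table; real stores recognise more extensions than this.
    private static final Map<String, String> FORMAT_BY_EXTENSION = new HashMap<>();
    // Registry keyed by format name; ConcurrentHashMap rejects null keys.
    private static final Map<String, String> PARSER_BY_FORMAT = new ConcurrentHashMap<>();
    static {
        FORMAT_BY_EXTENSION.put("rdf", "RDF/XML");
        PARSER_BY_FORMAT.put("RDF/XML", "rdfxml-parser");
    }

    // Returns the format name for a file, or null when the extension is unknown.
    static String detectFormat(String fileName) {
        int dot = fileName.lastIndexOf('.');
        String ext = dot < 0 ? "" : fileName.substring(dot + 1);
        return FORMAT_BY_EXTENSION.get(ext);
    }

    // Throws NullPointerException when detectFormat returned null.
    static String parserFor(String fileName) {
        return PARSER_BY_FORMAT.get(detectFormat(fileName));
    }

    public static void main(String[] args) {
        System.out.println(parserFor("humidity_by_city.rdf"));
        try {
            parserFor("humidity_by_city.owls");
        } catch (NullPointerException e) {
            System.out.println("NullPointerException for unrecognised extension");
        }
    }
}
```

Choosing the format explicitly, as described above, sidesteps the detection step entirely, which is why the import then succeeds.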

Configuration values for hive.exec.max.dynamic.partitions and hive.exec.max.dynamic.partitions.pernode in HIVE

I am trying to add data to an external table using apache-hive. I am getting the following error in the hive logs
2015-06-15 17:27:44,614 ERROR [LocalJobRunner Map Task Executor #0]: mr.ExecMapper (ExecMapper.java:map(171)) - org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"transactiondate":"05-01-2015 08:26:21","transactiontype":"CASHOUT","transactionid":144590889,"sourcenumber":null,"destnumber":null,"amount":19000,"assumedfield1":880,"customerid":33394093,"transactionstatus":"COMPLETED","assumedfield2":325,"assumedfield3":175870}
at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:518)
at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:163)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveFatalException: [Error 20004]: Fatal error occurred when node tried to create too many dynamic partitions. The maximum number of dynamic partitions is controlled by hive.exec.max.dynamic.partitions and hive.exec.max.dynamic.partitions.pernode. Maximum was set to: 256
at org.apache.hadoop.hive.ql.exec.FileSinkOperator.getDynOutPaths(FileSinkOperator.java:933)
at org.apache.hadoop.hive.ql.exec.FileSinkOperator.process(FileSinkOperator.java:709)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837)
at org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:88)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837)
at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:97)
at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:162)
at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:508)
... 10 more
I googled this error and came across this link, which says that we must raise the values of the hive.exec.max.dynamic.partitions and hive.exec.max.dynamic.partitions.pernode variables. What are the optimum settings for these variables on a single-node Hadoop installation? The values below are not working for me. Please help.
set hive.exec.max.dynamic.partitions=1000;
set hive.exec.max.dynamic.partitions.pernode=250;
Please do not try to raise the Hive partition limits to a higher value.
Doing so may cause the NameNode to crash. If possible, change the partition column and build the new logic on top of it.
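One way to act on this advice is to count how many distinct partitions a candidate column would create before running the insert. The Java sketch below assumes, purely for illustration, that the partition key is derived from transactiondate; the question does not say which column is actually used. It compares a full-timestamp key with a date-only key against the limit of 256 reported in the error. The sample values are made up.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;

final class PartitionCountSketch {
    // Limit reported in the error message above.
    static final int MAX_DYNAMIC_PARTITIONS = 256;

    // Counts the distinct partition keys that keyFn derives from the rows.
    static int distinctPartitions(List<String> rows, Function<String, String> keyFn) {
        Set<String> keys = new HashSet<>();
        for (String row : rows) {
            keys.add(keyFn.apply(row));
        }
        return keys.size();
    }

    // Keeps only the date part of a value whose date and time are separated by a space.
    static String dateOnly(String timestamp) {
        return timestamp.substring(0, timestamp.indexOf(' '));
    }

    public static void main(String[] args) {
        // Made-up sample values in the same layout as the logged row.
        List<String> rows = Arrays.asList(
                "05-01-2015 08:26:21",
                "05-01-2015 08:27:02",
                "05-01-2015 09:00:00",
                "06-01-2015 10:15:30");

        int byTimestamp = distinctPartitions(rows, Function.identity());
        int byDate = distinctPartitions(rows, PartitionCountSketch::dateOnly);

        System.out.println("partitions by full timestamp: " + byTimestamp);
        System.out.println("partitions by date only: " + byDate);
        System.out.println("date-only key within limit: " + (byDate <= MAX_DYNAMIC_PARTITIONS));
    }
}
```

A key with one value per row grows with the data, while a coarser key stays bounded, which is the kind of change the answer suggests instead of raising the limits.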

Array in output schema caused exception

I am following this WordCount example using the Google BigQuery-Hadoop connector:
https://developers.google.com/hadoop/writing-with-bigquery-connector#completecode
The example works fine as it is.
To test an array in the output schema, I altered just one line in the code, adding an array object definition to the output schema:
String outputTableSchema = "[{'name': 'Word','type': 'STRING'},{'name': 'Number','type': 'INTEGER'},{'name':'Persons','mode':'REPEATED','type':'RECORD','fields':[{'name': 'name','type': 'STRING'},{'name': 'age','type': 'INTEGER'}]}]";
Now when I run the WordCount example, it gives this exception:
java.lang.IllegalStateException
at com.google.gson.JsonArray.getAsString(JsonArray.java:133)
at com.google.cloud.hadoop.io.bigquery.BigQueryUtils.getSchemaFromString(BigQueryUtils.java:97)
at com.google.cloud.hadoop.io.bigquery.BigQueryOutputFormat.getRecordWriter(BigQueryOutputFormat.java:121)
at org.apache.hadoop.mapred.ReduceTask$NewTrackingRecordWriter.<init>(ReduceTask.java:568)
at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:637)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:418)
at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
at org.apache.hadoop.mapred.Child.main(Child.java:249)
Does anyone know what the issue is?
Thank you
This is actually a bug in the current version of the BigQuery connector which prevents it from supporting inner records with more than 1 field.
We have a fix internally and it's slated to go out with the next release (0.4.3) which may still be a couple weeks out; if you'd like to help try out a staging build, feel free to reach out to gcp-hadoop-contact#google.com and we can provide instructions.

Error when running Hive 0.9.0: Exception in thread "main" java.lang.NoSuchFieldError: type

Really sorry for the stupid question, but I am struggling to find an answer. I am trying to start up Hive on my 3-node Hadoop cluster. HDFS runs OK, as do Pig and HBase, but for the life of me I cannot get Hive to run properly.
This is the classpath output:
:/home/hduser/hive-0.9.0/conf:/home/hduser/hive-0.9.0/lib/antlr-runtime-3.0.1.jar:/home/hduser/hive-0.9.0/lib/commons-cli-1.2.jar:/home/hduser/hive-0.9.0/lib/commons-codec-1.3.jar:/home/hduser/hive-0.9.0/lib/commons-collections-3.2.1.jar:/home/hduser/hive-0.9.0/lib/commons-dbcp-1.4.jar:/home/hduser/hive-0.9.0/lib/commons-lang-2.4.jar:/home/hduser/hive-0.9.0/lib/commons-logging-1.0.4.jar:/home/hduser/hive-0.9.0/lib/commons-logging-api-1.0.4.jar:/home/hduser/hive-0.9.0/lib/commons-pool-1.5.4.jar:/home/hduser/hive-0.9.0/lib/datanucleus-connectionpool-2.0.3.jar:/home/hduser/hive-0.9.0/lib/datanucleus-core-2.0.3.jar:/home/hduser/hive-0.9.0/lib/datanucleus-enhancer-2.0.3.jar:/home/hduser/hive-0.9.0/lib/datanucleus-rdbms-2.0.3.jar:/home/hduser/hive-0.9.0/lib/derby-10.4.2.0.jar:/home/hduser/hive-0.9.0/lib/guava-r09.jar:/home/hduser/hive-0.9.0/lib/hadoop-0.20.2-core.jar:/home/hduser/hive-0.9.0/lib/hbase-0.92.0.jar:/home/hduser/hive-0.9.0/lib/hbase-0.92.0-tests.jar:/home/hduser/hive-0.9.0/lib/hive-builtins-0.9.0.jar:/home/hduser/hive-0.9.0/lib/hive-cli-0.9.0.jar:/home/hduser/hive-0.9.0/lib/hive-common-0.9.0.jar:/home/hduser/hive-0.9.0/lib/hive-contrib-0.9.0.jar:/home/hduser/hive-0.9.0/lib/hive_contrib.jar:/home/hduser/hive-0.9.0/lib/hive-exec-0.9.0.jar:/home/hduser/hive-0.9.0/lib/hive-hbase-handler-0.9.0.jar:/home/hduser/hive-0.9.0/lib/hive-hwi-0.9.0.jar:/home/hduser/hive-0.9.0/lib/hive-jdbc-0.9.0.jar:/home/hduser/hive-0.9.0/lib/hive-metastore-0.9.0.jar:/home/hduser/hive-0.9.0/lib/hive-pdk-0.9.0.jar:/home/hduser/hive-0.9.0/lib/hive-serde-0.9.0.jar:/home/hduser/hive-0.9.0/lib/hive-service-0.9.0.jar:/home/hduser/hive-0.9.0/lib/hive-shims-0.9.0.jar:/home/hduser/hive-0.9.0/lib/jackson-core-asl-1.8.8.jar:/home/hduser/hive-0.9.0/lib/jackson-jaxrs-1.8.8.jar:/home/hduser/hive-0.9.0/lib/jackson-mapper-asl-1.8.8.jar:/home/hduser/hive-0.9.0/lib/jackson-xc-1.8.8.jar:/home/hduser/hive-0.9.0/lib/JavaEWAH-0.3.2.jar:/home/hduser/hive-0.9.0/lib/jdo2-api-2.3-ec.jar:/home/hduser/hive-0.9.0/lib/
jline-0.9.94.jar:/home/hduser/hive-0.9.0/lib/json-20090211.jar:/home/hduser/hive-0.9.0/lib/libfb303-0.7.0.jar:/home/hduser/hive-0.9.0/lib/libfb303.jar:/home/hduser/hive-0.9.0/lib/libthrift-0.7.0.jar:/home/hduser/hive-0.9.0/lib/libthrift.jar:/home/hduser/hive-0.9.0/lib/log4j-1.2.16.jar:/home/hduser/hive-0.9.0/lib/slf4j-api-1.6.1.jar:/home/hduser/hive-0.9.0/lib/slf4j-log4j12-1.6.1.jar:/home/hduser/hive-0.9.0/lib/stringtemplate-3.1-b1.jar:/home/hduser/hive-0.9.0/lib/zookeeper-3.4.3.jar:
Logging initialized using configuration in file:/home/hduser/hive-0.9.0/conf/hive-log4j.properties
Hive history file=/tmp/hduser/hive_job_log_hduser_201212181716_326152902.txt
Then, from the Hive command line, I run this:
hive> CREATE TABLE pokes (foo INT, bar STRING);
However, I get the following error:
Exception in thread "main" java.lang.NoSuchFieldError: type
at org.apache.hadoop.hive.ql.parse.HiveLexer.mKW_CREATE(HiveLexer.java:1602)
at org.apache.hadoop.hive.ql.parse.HiveLexer.mTokens(HiveLexer.java:6380)
at org.antlr.runtime.Lexer.nextToken(Lexer.java:89)
at org.antlr.runtime.BufferedTokenStream.fetch(BufferedTokenStream.java:133)
at org.antlr.runtime.BufferedTokenStream.sync(BufferedTokenStream.java:127)
at org.antlr.runtime.CommonTokenStream.setup(CommonTokenStream.java:132)
at org.antlr.runtime.CommonTokenStream.LT(CommonTokenStream.java:91)
at org.apache.hadoop.hive.ql.parse.HiveParser.statement(HiveParser.java:547)
at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:438)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:416)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:336)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:909)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:258)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:215)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:406)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:689)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:557)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:616)
at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
Make sure you don't have any antlr-*.jar in your classpath other than the one in the HIVE_HOME/lib folder. If it still doesn't work, download the latest version from the ANTLR website, put it into the HIVE_HOME/lib folder, and give it a try.
HTH
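A quick way to check for stray ANTLR jars is to split the classpath and list every entry whose file name starts with antlr. This is a small Java sketch, not part of Hive. The sample string in main is shortened from the classpath in the question, with one made-up conflicting entry added; in practice you would pass the real value, for example System.getProperty("java.class.path").

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

final class AntlrJarScan {
    // Returns classpath entries whose file name starts with "antlr" and ends with ".jar".
    static List<String> antlrJars(String classpath, String separator) {
        List<String> hits = new ArrayList<>();
        for (String entry : classpath.split(Pattern.quote(separator))) {
            String name = entry.substring(entry.lastIndexOf('/') + 1);
            if (name.startsWith("antlr") && name.endsWith(".jar")) {
                hits.add(entry);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        String cp = ":/home/hduser/hive-0.9.0/conf"
                + ":/home/hduser/hive-0.9.0/lib/antlr-runtime-3.0.1.jar"
                + ":/some/other/lib/antlr-2.7.7.jar" // hypothetical conflicting jar
                + ":/home/hduser/hive-0.9.0/lib/commons-cli-1.2.jar";
        List<String> hits = antlrJars(cp, ":");
        hits.forEach(System.out::println);
        if (hits.size() > 1) {
            System.out.println("More than one ANTLR jar on the classpath");
        }
    }
}
```

More than one hit means two ANTLR versions may be competing at class-load time, which is the situation the answer above warns about.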
Just remove all the files with the prefix jackson from the Hadoop lib folder
and copy the newer jackson files over from Hive:
rm /opt/hadoop/hadoop/lib/jackson*
cp /opt/hive/hive/lib/jackson* /opt/hadoop/hadoop/lib
I did it this way, and it worked perfectly fine!
I hope it helps.

Exporting an HSQLDB to XML using DBUnit results in null pointer errors

I'm trying to export the entire contents of my database, an HSQLDB, to XML using DBUnit, and I'm getting null pointer errors that I can't understand. I'm following the example in the FAQ:
IDatabaseConnection xmlConnection = new DatabaseConnection(conn);
IDataSet allTables = xmlConnection.createDataSet();
XmlDataSet.write(allTables, new FileOutputStream(DATABASE_PATH + ".xml"));
The null pointer error occurs on the last line. conn and DATABASE_PATH aren't null as they're both checked for that and used later in the program without a problem (exporting the database into CSV using OpenCSV, which works perfectly and exactly as expected).
The stacktrace is as follows:
org.dbunit.dataset.DataSetException: java.sql.SQLException: java.lang.NullPointerException java.lang.NullPointerException
at org.dbunit.database.DatabaseDataSet.initialize(DatabaseDataSet.java:243)
at org.dbunit.database.DatabaseDataSet.getTableNames(DatabaseDataSet.java:272)
at org.dbunit.database.DatabaseDataSet.createIterator(DatabaseDataSet.java:258)
at org.dbunit.dataset.AbstractDataSet.iterator(AbstractDataSet.java:189)
at org.dbunit.dataset.stream.DataSetProducerAdapter.<init>(DataSetProducerAdapter.java:63)
at org.dbunit.dataset.xml.XmlDataSetWriter.write(XmlDataSetWriter.java:128)
at org.dbunit.dataset.xml.XmlDataSet.write(XmlDataSet.java:104)
at org.dbunit.dataset.xml.XmlDataSet.write(XmlDataSet.java:91)
at pms.DatabaseExporter.exportToXML(DatabaseExporter.java:181)
at pms.DatabaseExporter.main(DatabaseExporter.java:301)
Caused by: java.sql.SQLException: java.lang.NullPointerException java.lang.NullPointerException
at org.hsqldb.jdbc.Util.sqlException(Util.java:224)
at org.hsqldb.jdbc.JDBCStatement.fetchResult(JDBCStatement.java:1830)
at org.hsqldb.jdbc.JDBCStatement.executeQuery(JDBCStatement.java:181)
at org.hsqldb.jdbc.JDBCDatabaseMetaData.execute(JDBCDatabaseMetaData.java:6150)
at org.hsqldb.jdbc.JDBCDatabaseMetaData.getTables(JDBCDatabaseMetaData.java:3170)
at org.dbunit.database.DefaultMetadataHandler.getTables(DefaultMetadataHandler.java:137)
at org.dbunit.database.DatabaseDataSet.initialize(DatabaseDataSet.java:199)
... 9 more
Caused by: org.hsqldb.HsqlException: java.lang.NullPointerException
at org.hsqldb.error.Error.error(Error.java:108)
at org.hsqldb.result.Result.newErrorResult(Result.java:1069)
at org.hsqldb.StatementDMQL.execute(StatementDMQL.java:192)
at org.hsqldb.Session.executeCompiledStatement(Session.java:1315)
at org.hsqldb.Session.executeDirectStatement(Session.java:1206)
at org.hsqldb.Session.execute(Session.java:990)
at org.hsqldb.jdbc.JDBCStatement.fetchResult(JDBCStatement.java:1822)
... 14 more
Caused by: java.lang.NullPointerException
at org.hsqldb.types.CharacterType.compare(CharacterType.java:418)
at org.hsqldb.index.IndexAVL.compareRowForInsertOrDelete(IndexAVL.java:617)
at org.hsqldb.index.IndexAVLMemory.insert(IndexAVLMemory.java:214)
at org.hsqldb.persist.RowStoreAVL.indexRow(RowStoreAVL.java:171)
at org.hsqldb.persist.RowStoreAVLHybridExtended.indexRow(RowStoreAVLHybridExtended.java:99)
at org.hsqldb.Table.insertSys(Table.java:2625)
at org.hsqldb.dbinfo.DatabaseInformationMain.SYSTEM_TABLES(DatabaseInformationMain.java:2353)
at org.hsqldb.dbinfo.DatabaseInformationMain.generateTable(DatabaseInformationMain.java:348)
at org.hsqldb.dbinfo.DatabaseInformationFull.generateTable(DatabaseInformationFull.java:379)
at org.hsqldb.dbinfo.DatabaseInformationMain.setStore(DatabaseInformationMain.java:507)
at org.hsqldb.persist.PersistentStoreCollectionSession.getStore(PersistentStoreCollectionSession.java:138)
at org.hsqldb.Table.getRowStore(Table.java:2817)
at org.hsqldb.RangeVariable$RangeIteratorMain.<init>(RangeVariable.java:939)
at org.hsqldb.RangeVariable$RangeIteratorMain.<init>(RangeVariable.java:917)
at org.hsqldb.RangeVariable.getIterator(RangeVariable.java:770)
at org.hsqldb.QuerySpecification.buildResult(QuerySpecification.java:1293)
at org.hsqldb.QuerySpecification.getSingleResult(QuerySpecification.java:1245)
at org.hsqldb.QuerySpecification.getResult(QuerySpecification.java:1235)
at org.hsqldb.StatementQuery.getResult(StatementQuery.java:66)
at org.hsqldb.StatementDMQL.execute(StatementDMQL.java:190)
... 18 more
I've googled and couldn't find anything relating to this kind of error during export. I'm not that experienced with SQL or JDBC, so I'm hoping there's enough info in the stack trace for someone more knowledgeable to tell me what's going wrong. If some other library would better suit my needs, I have no problem switching. The only thing I need right now is export/import with XML, so I'm not using DBUnit for anything else. If anybody can tell me what's going wrong, or whether I ought to be using something else, I'd really appreciate it.
This is a bug in the system table creation of that particular HSQLDB version, which was spotted and corrected recently. You can try the updated HSQLDB jar from http://hsqldb.org/support/