Impala server is not getting started - impala

I am currently using Hadoop 2.2.0, Hive 0.12.0, and Impala 1.2.3. When I try to start the Impala server, it does not start. When I checked the log directory, I found the following error.
Any help is highly appreciated.
Failed on local exception: com.google.protobuf.InvalidProtocolBufferException: Message missing required fields: callId, status;
Host Details : local host is: "XXXX/[IP-ADDRESS]"; destination host is: "hadoop-master":9000;
E0219 13:15:16.223870 22635 impala-server.cc:403] Aborting Impala Server startup due to improper configuration

Hadoop 2.2 uses protobuf 2.5, while Impala is built against protobuf 2.4.0a.
Unfortunately, code generated with protobuf 2.5 is incompatible with older protobuf libraries.
You can check the JIRA issue HADOOP-9845 for the background on the decision to upgrade protobuf in Hadoop.
SOLUTION
Remove the older protobuf.
Install protobuf 2.5.
Rebuild Impala (a sketch of these steps follows below).
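A minimal sketch of those steps, assuming a source build on a typical Linux host; the download URL, package names, and the Impala source path are assumptions rather than values from the answer.
# remove the older protobuf (package names may differ per distro)
sudo yum remove -y protobuf protobuf-compiler
# build and install protobuf 2.5.0 from source
wget https://github.com/protocolbuffers/protobuf/releases/download/v2.5.0/protobuf-2.5.0.tar.gz
tar xzf protobuf-2.5.0.tar.gz && cd protobuf-2.5.0
./configure && make && sudo make install
protoc --version            # should now report libprotoc 2.5.0
# rebuild Impala against the new protobuf
cd /path/to/Impala
./buildall.sh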

Related

Some streams terminated before this command could finish error

I am trying to read streaming data into Azure Databricks. This is the code I've been using:
And it's giving me an error saying:
My Databricks Runtime is 6.4 Extended Support (includes Apache Spark 2.4.5, Scala 2.11),
and I installed the package com.microsoft.azure:azure-eventhubs-spark_2.11:2.3.17,
but I still get the same error.

Issue in accessing ORC transactional table through Apache Drill

Issue while accessing an ORC transactional Hive table through Apache Drill.
Apache Drill 1.10.0
Hive 1.2.1
Below is the error that appears while accessing data from the mentioned table through Apache Drill.
Query Failed: An Error Occurred
org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: NumberFormatException: For input string: "0000112_0000" [Error Id: ad9b4243-d48d-43c7-9755-388202d7c54d on inbbrdssvm16.india.tcs.com:31010]
Please help me resolve the issue.
I suggest you move to the latest Drill and Hive versions.
This issue was resolved in Apache Drill 1.13.0:
https://issues.apache.org/jira/browse/DRILL-5978

Hive on Spark Issue

I am trying to configure Hive on Spark, but even after trying for 5 days I have not found a solution.
Steps followed:
1. After installing Spark, going into the Hive console and setting the properties below:
set hive.execution.engine=spark;
set spark.master=spark://INBBRDSSVM294:7077;
set spark.executor.memory=2g;
set spark.serializer=org.apache.spark.serializer.KryoSerializer;
2. Added the spark-assembly jar to Hive's lib directory.
3. When running select count(*) from table_name I get the error below:
2016-08-08 15:17:30,207 ERROR [main]: spark.SparkTask (SparkTask.java:execute(131))
- Failed to execute spark task, with exception
'org.apache.hadoop.hive.ql.metadata.HiveException (Failed to create spark client.)'
Hive version: 1.2.1
Spark version: tried with 1.6.1, 1.3.1, and 2.0.0
I would appreciate it if anyone could suggest something.
You can download the spark-1.3.1 source from the Spark download website and build Spark 1.3.1 without the Hive dependency using:
./make-distribution.sh --name "hadoop2-without-hive" --tgz "-Pyarn,hadoop-provided,hadoop-2.4" -Dhadoop.version=2.7.1 -Dyarn.version=2.7.1 -DskipTests
Then copy spark-assembly-1.3.1-hadoop2.7.1.jar to the hive/lib folder (a sketch of this step follows below).
And follow https://cwiki.apache.org/confluence/display/Hive/Hive+on+Spark%3A+Getting+Started#HiveonSpark:GettingStarted-SparkInstallation to set the necessary properties.
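For reference, the copy step might look like the sketch below; the source and Hive paths are placeholders for your own layout, not values from the answer.
# paths are assumptions; adjust to your environment
SPARK_SRC=/path/to/spark-1.3.1        # where make-distribution.sh was run
HIVE_HOME=/usr/local/hive             # your Hive 1.2.1 install
cp $SPARK_SRC/dist/lib/spark-assembly-1.3.1-hadoop2.7.1.jar $HIVE_HOME/lib/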
First of all, you need to pay attention to which versions are compatible. If you choose Hive 1.2.1, I advise you to use Spark 1.3.1. You can see the version compatibility list here.
The error you are seeing is a generic one. You need to start Spark and see what errors the Spark workers report. Also, have you already copied hive-site.xml to spark/conf?
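A minimal sketch of those two checks, assuming a standalone Spark setup where SPARK_HOME and HIVE_HOME (both placeholders) point at your installs:
# start the standalone master and workers, then watch their logs for the real error
$SPARK_HOME/sbin/start-all.sh
ls $SPARK_HOME/logs/
# make Hive's configuration visible to Spark
cp $HIVE_HOME/conf/hive-site.xml $SPARK_HOME/conf/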

Cannot Load Hive Table into Pig via HCatalog

I am currently configuring a Cloudera HDP dev image using this tutorial on CentOS 6.5, installing the base and then adding the different components as I need them. Currently, I am installing / testing HCatalog using this section of the tutorial linked above.
I have successfully installed the package and am now testing HCatalog integration with Pig with the following script:
A = LOAD 'groups' USING org.apache.hcatalog.pig.HCatLoader();
DESCRIBE A;
I have previously created and populated a 'groups' table in Hive before running the command. When I run the script with the command pig -useHCatalog test.pig I get an exception rather than the expected output. Below is the initial part of the stacktrace:
Pig Stack Trace
---------------
ERROR 2245: Cannot get schema from loadFunc org.apache.hcatalog.pig.HCatLoader
org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1000: Error during parsing. Cannot get schema from loadFunc org.apache.hcatalog.pig.HCatLoader
at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1608)
at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1547)
at org.apache.pig.PigServer.registerQuery(PigServer.java:518)
at org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:991)
at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:412)
...
Has anyone encountered this error before? Any help would be much appreciated. I would be happy to provide more information if you need it.
The error was caused by HBase's Thrift server not being properly configured. I installed/configured Thrift and added the following to my hive-site.xml with the proper server information:
<property>
  <name>hive.metastore.uris</name>
  <value>thrift://<!--URL of Your Server-->:9083</value>
  <description>IP address (or fully-qualified domain name) and port of the metastore host</description>
</property>
I thought the snippet above was not required since I am running Cloudera HDP in pseudo-distributed mode. It turns out that it, and the HBase Thrift server, are required to use HCatalog with Pig.
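For completeness, a hedged sketch of the services this answer relies on; the metastore port (9083) matches the property above and test.pig comes from the question, while everything else is an assumption about a pseudo-distributed setup.
# start the Hive metastore Thrift service that hive.metastore.uris points at
hive --service metastore &
# start the HBase Thrift server mentioned in the answer above
hbase thrift start &
# re-run the Pig script with HCatalog support
pig -useHCatalog test.pig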

Sqoop - Could not find or load main class org.apache.sqoop.Sqoop

I installed Hadoop, Hive, HBase, Sqoop and added them to the PATH.
When I try to execute sqoop command, I'm getting this error:
Error: Could not find or load main class org.apache.sqoop.Sqoop
Development Environment:
OS : Ubuntu 12.04 64-bit
Hadoop Version: 1.0.4
Hive Version: 0.9.0
Hbase Version: 0.94.5
Sqoop Version: 1.4.3
Make sure you have sqoop-1.4.3.jar under your SQOOP_HOME directory.
Note: it may be that you downloaded the wrong Sqoop distribution.
I resolved this issue on CentOS 6.3.
I have Hadoop-1.0.4, hbase-0.94.6, hive-0.10.0, pig-0.11.1, sqoop-1.4.3.bin__hadoop-1.0.0, and zookeeper-3.4.5 installed.
I was also running into the same problem with Sqoop: Error - Could not find the main class: org.apache.sqoop.Sqoop.
To resolve this issue I copied the jar file sqoop-1.4.3.jar from $SQOOP_HOME/ into $HADOOP_HOME/lib/.
Hope this helps someone struggling to get Sqoop working with Hadoop.
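In command form, that fix is a one-liner; SQOOP_HOME and HADOOP_HOME are assumed to point at the installs listed above.
# copy the Sqoop jar onto Hadoop's classpath
cp $SQOOP_HOME/sqoop-1.4.3.jar $HADOOP_HOME/lib/
sqoop version   # should no longer fail with the missing main class error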
Unfortunately, I didn't find a complete answer for my problem. The Sqoop version I used was 1.4.6. I am not sure whether one has to compile the source code for sqoop-1.4.6.tar.gz, but I was able to get past the same error (Error - Could not find the main class: org.apache.sqoop.Sqoop) using the following instructions:
Instead, I downloaded sqoop-1.4.6.bin__hadoop-2.0.4-alpha.tar.gz from Apache Sqoop, installed it at /home/ubuntu/SQOOP/, and renamed sqoop-1.4.6.bin__hadoop-2.0.4-alpha to sqoop. I wanted to use it with YARN.
Then export and set $SQOOP_HOME.
I used this:
export SQOOP_HOME=/home/ubuntu/SQOOP/sqoop/
export PATH=$PATH:$SQOOP_HOME/bin
Now if you go to $SQOOP_HOME/bin and try
./sqoop help
It should work without any issue.
The problem in my case was that the hadoop-env.sh file had this line in it:
export HADOOP_CLASSPATH=${JAVA_HOME}/lib/tools.jar
It seems that when you call sqoop it internally calls configure-sqoop, which sets HADOOP_CLASSPATH correctly, but then when it (sqoop) calls hadoop, hadoop ignores that variable and resets it back to what is in hadoop-env.sh.
The fix was to change the hadoop-env.sh to have this line instead:
export HADOOP_CLASSPATH="${JAVA_HOME}/lib/tools.jar:$HADOOP_CLASSPATH"
@user225003's solution magically worked, and after looking into some of the files, here is what happens under the hood when you execute the "sqoop" script.
The "sqoop" script essentially executes the "hadoop" script from the $HADOOP_COMMON_HOME/bin/ directory. While configuring Sqoop, in "sqoop-env.sh" we set $HADOOP_COMMON_HOME to the Hadoop installation directory. If your Sqoop and Hadoop installations are not in the regular location /usr/local, I believe sqoop-x.x.x.jar is not on the hadoop script's classpath; a hedged sketch of the relevant settings follows below.