Pig Hcatalog error - apache-pig

I am trying to run pig script in local mode on a single node cluster as given below.
hduser#ubuntu:~$ pig -x local -f "/home/hduser/ddsoft/pigscript/FirstUDF.pig"
But I am getting below error.
[main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 101: file
'/home/hduser/ddsoft/hive-0.13.1-bin/hcatalog/share/hcatalog/hcatalog-core-0.13.1.jar'
does not exist.
how do I register the jar file mentioned in the error message. I tried updating the .bashrc, but it didn’t fix the error.

Related

Lynis wont work, 'error: Unknown option 'ssl-certificate-paths-to-ignore' found (with value: /etc/letsencrypt/archive:)

How do I correct/get rid of the following error? It prevents me from executing any lynis commands.
Output of "Sudo lynis show version" or any lynis command:
"Error: Unknown option 'ssl-certificate-paths-to-ignore' found (with value: /etc/letsencrypt/archive:)".
Background: Installed lynis, but couldn't figure out how to use it - got stuck on changing the parameters following a guide. Tried using another guide (https://adamtheautomator.com/lynis/), which included installing lynis again, and got this error
Tried purging and removing lynis, but got this error again after reinstalling, because thought the error had occured when I had first tried to install and configure lynis.
During the purge got:
"Purging configuration files for lynis (3.0.8-100) ...
dpkg: warning: while removing lynis, directory '/etc/lynis' not empty so not removed
Purging configuration files for menu (2.1.47ubuntu4) ...".

fatal error: 'bpf_helper_defs.h' file not found

When I run make kselftest-install, it shows the following error:
../../../lib/bpf/bpf_helpers.h:11:10: fatal error: 'bpf_helper_defs.h'
file not found
How to solve this issue? I'm using Ubuntu 20.04.
First, try to find the missing file in your kernel source code root directory.
$ find -name bpf_helper_defs.h
./tools/bpf/resolve_btfids/libbpf/include/bpf/bpf_helper_defs.h
./tools/bpf/resolve_btfids/libbpf/bpf_helper_defs.h
It shows that bpf_helper_defs.h exists, but in a different directory than bpf_helpers.h locates. So a simple way to solve this issue is to copy the missing file to the same directory.
$ cp ./tools/bpf/resolve_btfids/libbpf/include/bpf/bpf_helper_defs.h ./tools/lib/bpf/
Run make kselftest-install again, the error disappears.

Hiveserver2 does not start after installing HDP 2.6.4.0-91 using cloudbreak on AWS

Hiveserver2 does not start after installing HDP 2.6.4.0-91 using cloudbreak on AWS.
Start the hiveserver2 in the Ambari UI and check the contents of /var/log/hive/hiveserver2.log.
Below is the error log.
Any help would be appreciated.
Contents of hiveserver2.log
2018-03-08 04:41:53,345 WARN [main-EventThread]: server.HiveServer2 (HiveServer2.java:process(343)) - This instance of HiveServer2 has been removed from the list of server instances available for dynamic service discovery. The last client session has ended - will shutdown now.
2018-03-08 04:41:53,347 INFO [main]: zookeeper.ZooKeeper (ZooKeeper.java:close(684)) - Session: 0x16203aad5af0040 closed
2018-03-08 04:41:53,347 INFO [main]: server.HiveServer2 (HiveServer2.java:removeServerInstanceFromZooKeeper(361)) - Server instance removed from ZooKeeper.
2018-03-08 04:41:53,348 INFO [main-EventThread]: server.HiveServer2 (HiveServer2.java:stop(405)) - Shutting down HiveServer2
2018-03-08 04:41:53,348 INFO [main-EventThread]: server.HiveServer2 (HiveServer2.java:removeServerInstanceFromZooKeeper(361)) - Server instance removed from ZooKeeper.
2018-03-08 04:41:53,348 INFO [main-EventThread]: zookeeper.ClientCnxn (ClientCnxn.java:run(524)) - EventThread shut down
2018-03-08 04:41:53,348 WARN [main]: server.HiveServer2 (HiveServer2.java:startHiveServer2(508)) - Error starting HiveServer2 on attempt 1, will retry in 60 seconds
org.apache.tez.dag.api.SessionNotRunning: TezSession has already shutdown. Application application_1520480101488_0046 failed 2 times due to AM Container for appattempt_1520480101488_0046_000002 exited with exitCode: -1000
For more detailed output, check the application tracking page: http://ip-10-0-91-7.ap-northeast-2.compute.internal:8088/cluster/app/application_1520480101488_0046 Then click on links to logs of each attempt.
Diagnostics: ExitCodeException exitCode=2: tar: Removing leading `/' from member names
tar: Skipping to next header
gzip: /hadoopfs/fs1/yarn/nodemanager/filecache/60_tmp/tmp_tez.tar.gz: invalid compressed data--format violated
tar: Exiting with failure status due to previous errors
Failing this attempt. Failing the application.
at org.apache.tez.client.TezClient.waitTillReady(TezClient.java:699)
at org.apache.hadoop.hive.ql.exec.tez.TezSessionState.open(TezSessionState.java:218)
at org.apache.hadoop.hive.ql.exec.tez.TezSessionState.open(TezSessionState.java:116)
at org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolManager.startPool(TezSessionPoolManager.java:76)
at org.apache.hive.service.server.HiveServer2.startHiveServer2(HiveServer2.java:488)
at org.apache.hive.service.server.HiveServer2.access$700(HiveServer2.java:87)
at org.apache.hive.service.server.HiveServer2$StartOptionExecutor.execute(HiveServer2.java:720)
at org.apache.hive.service.server.HiveServer2.main(HiveServer2.java:593)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:233)
at org.apache.hadoop.util.RunJar.main(RunJar.java:148)
I had exactly the same issue with HDP on AWS. FYI, In my case the issue was with HDP version 2.6.4.5-2. I'm going to show how I fixed using this version because it is the latest at this time.
As the error log shows the problem is with tez.tar.gz that is corrupted then YARN is unable to decompress it in the YARN container.
This tez.tar.gz file is copied from the hdfs:///hdp/apps/<hdp_version>/tez/tez.tar.gz.
To reproduce the error and confirm that this file is corrupted, you can run the following command:
sudo su
su hdfs
hdfs dfs -get /hdp/apps/2.6.4.5-2/tez.tar.gz
tar -xvzf tez.tar.gz
You will get the following error:
gzip: stdin: invalid compressed data--format violated
tar: Unexpected EOF in archive
tar: Unexpected EOF in archive
tar: Error is not recoverable: exiting now
The fix is pretty simple, you must just replace the HDFS file with the one that you have on your local file-system running the following command:
hdfs dfs -rm /hdp/apps/2.6.4.5-2/tez/tez.tar.gz
hdfs dfs -put /usr/hdp/current/tez-client/lib/tez.tar.gz /hdp/apps/2.6.4.5-2/tez/tez.tar.gz
Now restart Hive Server 2 service and done!
NOTE: If something similar happens with other services you can do the same thing. Please check the following link that has more details: https://community.hortonworks.com/articles/30096/foxing-broken-targz-and-jar-files-in-hdp-24.html
Hope this helps!

ERROR 2998: Unhandled internal error. Found interface org.apache.hadoop.mapreduce.JobContext, but class was expected

I am new to hadoop. I was trying to integrate PIG with hive using Hcatalog but getting the below error during dump. Please let me know if any of you can help me out:
A = load 'logs' using org.apache.hcatalog.pig.HCatLoader();
dump A
ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2998: Unhandled
internal error.
Found interface org.apache.hadoop.mapreduce.JobContext, but class was expected
load and describe work fines but dump gives above error
Details:
hadoop-2.6.0
pig-0.14.0
hive-0.12.0
Compiled piggybank using
$ "ant -Dhadoopversion=23 clean jar"
export PIG_OPTS=-Dhive.metastore.uris=thrift://localhost:10000
export PIG_CLASSPATH=$HCAT_HOME/share/hcatalog/*:$HIVE_HOME/lib/*
$ REGISTER hadoop-2.6.0/share/hadoop/common/hadoop-common-2.6.0.jar;
$ REGISTER hadoop-2.6.0/share/hadoop/mapreduce/hadoop-mapreduce-client-core-2.6.0.jar;
$ REGISTER hive-0.12.0/hcatalog/share/hcatalog/hcatalog-core-0.12.0.jar;
$ REGISTER hive-0.12.0/lib/hive-exec-0.12.0.jar;
$ REGISTER hive-0.12.0/lib/hive-metastore-0.12.0.jar;
ran hive server using "hive --service hiveserver"
hive-site.xml:(and mysql related stuff )
hive.metastore.local
true
Please let me know if anything else needs to be configured

ERROR 1066: Unable to open iterator for alias

I am getting an error Unable to open iterator for alias for a simple LOAD and DUMP operation with Pig. I already took a look at the answers at:
ERROR 1066: Unable to open iterator for alias - Pig
But they don't help me.
My environment:
OS: Windows 7
Pig version: 0.13.0
Mode: Local
It shows in the error that the exception is 'Caused by' a failure to change
permission of a file in the TMP directory. But when I checked the TMP directory, there's no such file (probably it got deleted after the command is finished?).
Logs below (with -v and -w options) :
'D:\H\HADOOP-2.6.0\bin\hadoop-config.cmd' is not recognized as an internal or external command,
operable program or batch file.
15/01/24 09:20:22 INFO pig.ExecTypeProvider: Trying ExecType : LOCAL
15/01/24 09:20:22 INFO pig.ExecTypeProvider: Picked LOCAL as the ExecType
2015-01-24 09:20:22,909 [main] INFO org.apache.pig.Main - Apache Pig version 0.13.0 (r1606446) compiled Jun 29 2014, 02:29:34
2015-01-24 09:20:22,909 [main] INFO org.apache.pig.Main - Logging error messages to: d:\Pig\pig-0.13.0\bin\
2015-01-24 09:20:24,267 [main] INFO org.apache.pig.impl.util.Utils - Default bootup file C:\Users\Venkat/.pigbootup not found
2015-01-24 09:20:24,438 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: file:///
2015-01-24 09:20:26,205 [main] WARN org.apache.hadoop.util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2015-01-24 09:20:26,236 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1066: Unable to open iterator for alias a
2015-01-24 09:20:26,236 [main] ERROR org.apache.pig.tools.grunt.Grunt - org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1066: Unable to open iterator for alias a
at org.apache.pig.PigServer.openIterator(PigServer.java:912)
at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:752)
at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:372)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:228)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:203)
at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:81)
at org.apache.pig.Main.run(Main.java:479)
at org.apache.pig.Main.main(Main.java:156)
Caused by: org.apache.pig.backend.datastorage.DataStorageException: ERROR 0: java.io.IOException: Failed to set permissions of path: \tmp\temp946561981 to 0700
at org.apache.pig.impl.io.FileLocalizer.relativeRoot(FileLocalizer.java:484)
at org.apache.pig.impl.io.FileLocalizer.getTemporaryPath(FileLocalizer.java:515)
at org.apache.pig.impl.io.FileLocalizer.getTemporaryPath(FileLocalizer.java:511)
at org.apache.pig.PigServer.openIterator(PigServer.java:887)
... 7 more
Caused by: java.io.IOException: Failed to set permissions of path: \tmp\temp946561981 to 0700
at org.apache.hadoop.fs.FileUtil.checkReturnValue(FileUtil.java:689)
at org.apache.hadoop.fs.FileUtil.setPermission(FileUtil.java:662)
at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:509)
at org.apache.hadoop.fs.FilterFileSystem.setPermission(FilterFileSystem.java:286)
at org.apache.pig.backend.hadoop.datastorage.HPath.setPermission(HPath.java:122)
at org.apache.pig.impl.io.FileLocalizer.createRelativeRoot(FileLocalizer.java:495)
at org.apache.pig.impl.io.FileLocalizer.relativeRoot(FileLocalizer.java:481)
... 10 more
Details also at logfile: D:\Pig\pig-0.13.0\bin\data-1.txt1422071424329.log
Pig Stack Trace
ERROR 1066: Unable to open iterator for alias a
org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1066: Unable to open iterator for alias a
at org.apache.pig.PigServer.openIterator(PigServer.java:912)
at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:752)
at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:372)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:228)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:203)
at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:81)
at org.apache.pig.Main.run(Main.java:479)
at org.apache.pig.Main.main(Main.java:156)
Caused by: org.apache.pig.backend.datastorage.DataStorageException: ERROR 0: java.io.IOException: Failed to set permissions of path: \tmp\temp946561981 to 0700
at org.apache.pig.impl.io.FileLocalizer.relativeRoot(FileLocalizer.java:484)
at org.apache.pig.impl.io.FileLocalizer.getTemporaryPath(FileLocalizer.java:515)
at org.apache.pig.impl.io.FileLocalizer.getTemporaryPath(FileLocalizer.java:511)
at org.apache.pig.PigServer.openIterator(PigServer.java:887)
... 7 more
Caused by: java.io.IOException: Failed to set permissions of path: \tmp\temp946561981 to 0700
at org.apache.hadoop.fs.FileUtil.checkReturnValue(FileUtil.java:689)
at org.apache.hadoop.fs.FileUtil.setPermission(FileUtil.java:662)
at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:509)
at org.apache.hadoop.fs.FilterFileSystem.setPermission(FilterFileSystem.java:286)
at org.apache.pig.backend.hadoop.datastorage.HPath.setPermission(HPath.java:122)
at org.apache.pig.impl.io.FileLocalizer.createRelativeRoot(FileLocalizer.java:495)
at org.apache.pig.impl.io.FileLocalizer.relativeRoot(FileLocalizer.java:481)
... 10 more
Pig file contents:
a = LOAD 's.csv' AS (NAME:chararray,COUNTRY:chararray,YEAR:int,SPORT:chararray,GOLD:int,SILVER:int,BRONZE:int,TOTAL:int);
DUMP a;
Contents of s.csv:
Yang Yilin China 2008 Gymnastics 1 0 2 3
Leisel Jones Australia 2000 Swimming 0 2 0 2
Is there anything wrong in the syntax of the LOAD statement?
Are there any environment variables that need to be set specifically other
than JAVA and JAVA_HOME?
first check your hadoop permission status if its root change it too user.
$sudo chown -R testuser:testuser /(path of hadoop folder)
permission problem will be solve.
I hope this will work
Generally we will get this error as "ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1066: Unable to open iterator for alias" due to incorrect path or not able to access the file path.
check the file location is correct or not.
check are you able to access the file.
I got this issue with hadoop namenode safemode is on.
So I did below command to OFF safemode:
sudo -u hdfs hadoop dfsadmin -safemode leave
Then it works for me.