Cannot run Oozie 4.3.0 on Apache Hadoop 2.7.3

I completed the setup for Oozie 4.3.0 on an Apache Hadoop single-node cluster. When I try to run any of the standard example workflow.xml files that ship with Oozie, it throws the error below.
WARN ActionStartXCommand:523 - SERVER[data01.teg.io] USER[hadoop] GROUP[-] TOKEN[] APP[map-reduce-wf] JOB[0000000-161215143751620-oozie-hado-W] ACTION[0000000-161215143751620-oozie-hado-W#mr-node] Error starting action [mr-node]. ErrorType [TRANSIENT], ErrorCode [JA009], Message [JA009: Cannot initialize Cluster. Please check your configuration for mapreduce.framework.name and the correspond server addresses.]
I checked the parameter "mapreduce.framework.name", and it is set to yarn in all config files. I verified that the sharelib was created properly and can see it when I query with the shareliblist command, so I don't see where exactly the problem is. I tried every solution that came up on Google and could not solve it, even after struggling with it for 2 days.
I can start and stop the Oozie daemon without any problem.
Any insights would be greatly appreciated.

I figured out the solution. Unlike Oozie versions before 4.x.x, 4.3.0 does not generate the hadoop-libs.jar file when we run the build command.
In the beginning, I copied jar files only from my Hadoop's /srv/hadoop-2.7.3/share/hadoop/common to Oozie's libext folder. After I copied the jar files from all the paths below to Oozie's libext folder, I was able to set up Oozie successfully.
/srv/hadoop-2.7.3/share/hadoop/common/*.jar
/srv/hadoop-2.7.3/share/hadoop/common/lib/*.jar
/srv/hadoop-2.7.3/share/hadoop/hdfs/*.jar
/srv/hadoop-2.7.3/share/hadoop/hdfs/lib/*.jar
/srv/hadoop-2.7.3/share/hadoop/mapreduce/*.jar
/srv/hadoop-2.7.3/share/hadoop/mapreduce/lib/*.jar
/srv/hadoop-2.7.3/share/hadoop/yarn/*.jar
/srv/hadoop-2.7.3/share/hadoop/yarn/lib/*.jar
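The copy step above could be scripted roughly like this. It is only a minimal sketch, assuming the share/hadoop layout listed above; the function name copy_hadoop_jars and its two arguments are made up for illustration and are not part of Oozie or Hadoop.

```shell
# Hypothetical helper: copy every Hadoop jar from the share/hadoop
# subdirectories listed above into Oozie's libext folder.
# $1 = Hadoop install dir (e.g. /srv/hadoop-2.7.3), $2 = Oozie libext dir.
copy_hadoop_jars() {
    hadoop_home=$1
    libext=$2
    mkdir -p "$libext" || return 1
    for component in common hdfs mapreduce yarn; do
        for dir in "$hadoop_home/share/hadoop/$component" \
                   "$hadoop_home/share/hadoop/$component/lib"; do
            for jar in "$dir"/*.jar; do
                # Skip the unexpanded pattern when a directory has no jars.
                [ -e "$jar" ] || continue
                cp "$jar" "$libext/" || return 1
            done
        done
    done
}
```

It would be called as copy_hadoop_jars /srv/hadoop-2.7.3 /path/to/oozie/libext, where the second path is a placeholder for wherever your Oozie libext folder lives.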

s3-dist-cp fails with "mapreduce_shuffle does not exist"

When I run the command below,
s3-dist-cp --src s3://test/9.19 --dest hdfs:///user/hadoop/test
I get an error about auxService.
20/02/03 07:52:13 INFO mapreduce.Job: Task Id : attempt_1580716305878_0001_m_000000_2, Status : FAILED
Container launch failed for container_1580716305878_0001_01_000004 : org.apache.hadoop.yarn.exceptions.InvalidAuxServiceException: The auxService:mapreduce_shuffle does not exist
In many Q&A threads, I found a solution like this link.
But there is no nodemanager process.
[hadoop@ip-172-31-37-115 ~]$ initctl list | grep yarn
hadoop-yarn-timelineserver start/running, process 8149
hadoop-yarn-resourcemanager start/running, process 17331
hadoop-yarn-proxyserver start/running, process 8147
My EMR cluster was created from the quick menu with emr-5.28.0.
Does anyone know about this problem?
Thanks!
I'm sure there's some way to update the configs, but what I did was create a cluster using the 'advanced' setup and chose these software packages:
Ganglia
Hive
Hue
Mahout
Pig
Tez
Spark
Hadoop
(8 in total)
Most of those, except Spark, are installed with the default settings (the first radio button for software packages in quick setup). One of these software packages, or something related to it, is what causes s3-dist-cp to be installed, and I was able to use it with no problems with that setup.

Start Node in Ignite

I want to start an Ignite node with a configuration file named example-igfs.xml. I have altered this configuration to use IGFS as a cache layer for HDFS. But when I execute the command to start the Ignite node, I encounter this error:
java.lang.NoClassDefFoundError: com/google/common/base/Preconditions
at org.apache.hadoop.conf.Configuration$DeprecationDelta.(Configuration.java:361)
at org.apache.hadoop.conf.Configuration$DeprecationDelta.(Configuration.java:374)
at org.apache.hadoop.conf.Configuration.(Configuration.java:456)
at org.apache.ignite.internal.processors.hadoop.impl.HadoopUtils.safeCreateConfiguration(HadoopUtils.java:334)
at org.apache.ignite.internal.processors.hadoop.impl.delegate.HadoopBasicFileSystemFactoryDelegate.start(HadoopBasicFileSystemFactoryDelegate.java:129)
A java.lang.NoClassDefFoundError usually occurs when Ignite can't find the required libraries (JARs).
In your case, you have to move the JARs to the $IGNITE_HOME\libs folder.
Create a folder in the libs directory, say hadoop-libs, and move all the required JARs into it.
I am not a Hadoop expert, but it seems that you are missing the Hadoop client and the Google Guava library it depends on.
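The missing class com.google.common.base.Preconditions comes from Guava, so a quick way to confirm the diagnosis is to look for a Guava jar under the libs folder. Below is a rough sketch for a Unix-like shell; the helper names list_matching_jars and has_jar are hypothetical, and the guava-*.jar naming is an assumption about how the jar file is named on your system.

```shell
# Hypothetical helper: list jars under a directory whose file name matches a
# pattern, e.g. to confirm that a Guava jar is present under Ignite's libs.
# $1 = directory to search (e.g. "$IGNITE_HOME/libs"), $2 = file name pattern.
list_matching_jars() {
    find "$1" -type f -name "$2" 2>/dev/null
}

# Succeeds only if at least one matching jar exists under the directory.
has_jar() {
    [ -n "$(list_matching_jars "$1" "$2")" ]
}
```

For example, has_jar "$IGNITE_HOME/libs" 'guava-*.jar' || echo "no Guava jar found" would report whether the jar still needs to be copied into the hadoop-libs folder.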

starting ignite cluster from command line

I am trying to start an Ignite cluster from the command line on Windows.
This is what I did:
Downloaded the Ignite binary distribution and put it on the C: drive.
Set the environment variable IGNITE_HOME to that folder location.
In the command line I opened the directory:
C:\apache-ignite-fabric-2.2.0-bin\bin
Then, from that directory, I ran:
C:\apache-ignite-fabric-2.2.0-bin\bin>sh ignite.sh examples/config/example-ignite.xml
I am getting the following error:
Failed to create Ignite component (consider adding ignite-spring module to classpath) [component=SPRING, cls=org.apache.ignite.internal.processors.spring.IgniteSpringProcessorImpl]
What can be the reason for this error?
I found the solution:
I need to run the .bat file and not the .sh file:
C:\apache-ignite-fabric-2.2.0-bin\bin>ignite.bat examples/config/example-ignite.xml
If you're on Windows, I imagine you should try ignite.bat.
ignite.sh might have problems with the classpath when run on Windows, which would explain it.

Apache oozie sharedlib is showing a blank list

I am relatively new to Apache Oozie and did an installation on Ubuntu 14.04 with Hadoop 2.6.0 and JDK 1.8. I was able to install Oozie, and the web console is visible on port 11000 of my server.
When I copied the examples bundled with Oozie and tried to run them, I ran into an error saying that no sharelib exists.
I installed the sharelib as below:
bin/oozie-setup.sh sharelib create -fs hdfs://localhost:54310
(my NameNode is running on localhost:54310 and the JobTracker on localhost:54311)
hadoop fs -ls /user/hduser/share/lib shows that the shared library was created as per the oozie-site.xml file. However, when I check the shared library using the command
oozie admin -oozie http://localhost:11000/oozie -shareliblist
the list is blank, and jobs are failing for the same reason.
Any clues on how I should approach this problem?
Thanks.
The sharelib create command looks fine.
If you haven't done so already, copy the core-site.xml from your Hadoop installation folder into $OOZIE_HOME/conf/hadoop-conf/.
There might already be a "placeholder" core-site.xml in the hadoop-conf folder; delete or rename that one. Oozie doesn't get its Hadoop configuration directly from your Hadoop install (as Hive does, for example) but from the core-site.xml you place in that hadoop-conf folder.
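The rename-then-copy step could look something like the sketch below. The function name install_core_site and the .bak suffix are my own choices for illustration; adjust the two directory arguments to your installation.

```shell
# Hypothetical helper: install Hadoop's core-site.xml into Oozie's hadoop-conf
# folder, renaming any existing placeholder instead of deleting it.
# $1 = Hadoop config dir, $2 = Oozie hadoop-conf dir.
install_core_site() {
    src="$1/core-site.xml"
    dest_dir=$2
    [ -f "$src" ] || { echo "missing $src" >&2; return 1; }
    mkdir -p "$dest_dir" || return 1
    if [ -f "$dest_dir/core-site.xml" ]; then
        # Keep the placeholder around under a different name.
        mv "$dest_dir/core-site.xml" "$dest_dir/core-site.xml.bak" || return 1
    fi
    cp "$src" "$dest_dir/core-site.xml"
}
```

Usage would be along the lines of install_core_site /path/to/hadoop/conf "$OOZIE_HOME/conf/hadoop-conf", where the first path is a placeholder for your Hadoop configuration folder.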
Okay, I got a solution for this.
When I created the sharedlib directory, it was created on HDFS, but when running the job a local path was being referred to. So I extracted the oozie-sharedlib tar.gz file into my local /user/hduser/share/lib directory, and it is working now.
But I did not find the reason, so it's still an open question.
I encountered the same issue, and it turned out that Oozie was not able to communicate with HDFS because it could not find the location of core-site.xml or any other Hadoop configuration, which has to be declared inside oozie-site.xml.
The corresponding property in oozie-site.xml is oozie.service.HadoopAccessorService.hadoop.configurations.
This property was defined incorrectly in my case.
I changed it to point to where my Hadoop configuration XMLs are located; Oozie then started communicating with HDFS and was able to locate the sharelib on HDFS.
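To sanity-check the value of that property, something like the following sketch may help. It assumes the value is a comma-separated list of AUTHORITY=HADOOP_CONF_DIR entries (for example *=/etc/hadoop/conf, which is only an example path) and that the directories are absolute paths; the function name check_hadoop_conf_dirs is hypothetical.

```shell
# Hypothetical helper: given the value of
# oozie.service.HadoopAccessorService.hadoop.configurations, check that every
# referenced directory actually contains a core-site.xml.
# $1 = property value, e.g. "*=/etc/hadoop/conf" (absolute paths assumed).
check_hadoop_conf_dirs() {
    value=$1
    status=0
    # Walk the comma-separated entries with parameter expansion only, so the
    # '*' wildcard authority is never expanded as a file glob.
    while [ -n "$value" ]; do
        entry=${value%%,*}
        if [ "$entry" = "$value" ]; then
            value=
        else
            value=${value#*,}
        fi
        # Everything after the first '=' is the configuration directory.
        dir=${entry#*=}
        if [ -f "$dir/core-site.xml" ]; then
            echo "ok: $dir"
        else
            echo "no core-site.xml in: $dir" >&2
            status=1
        fi
    done
    return $status
}
```

A non-zero return status means at least one entry points at a folder without a core-site.xml, which would match the symptom described above.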

Sqoop - Could not find or load main class org.apache.sqoop.Sqoop

I installed Hadoop, Hive, HBase, Sqoop and added them to the PATH.
When I try to execute the sqoop command, I get this error:
Error: Could not find or load main class org.apache.sqoop.Sqoop
Development Environment:
OS : Ubuntu 12.04 64-bit
Hadoop Version: 1.0.4
Hive Version: 0.9.0
Hbase Version: 0.94.5
Sqoop Version: 1.4.3
Make sure you have sqoop-1.4.3.jar under your Sqoop home directory.
Note: this may be because you downloaded the wrong Sqoop distribution.
I have resolved this issue on CentOS 6.3.
I have Hadoop-1.0.4, hbase-0.94.6, hive-0.10.0, pig-0.11.1, sqoop-1.4.3.bin__hadoop-1.0.0, zookeeper-3.4.5 installed.
I was running into the same problem with Sqoop: Error - Could not find the main class: org.apache.sqoop.Sqoop.
To resolve this issue, I copied the jar file sqoop-1.4.3.jar from $SQOOP_HOME/ into $HADOOP_HOME/lib/.
I hope this helps someone who is struggling to get Sqoop working with Hadoop.
Unfortunately, I didn't find a complete answer for my problem. The Sqoop version I used was 1.4.6. I am not sure whether one has to compile the source code in sqoop-1.4.6.tar.gz. I was able to beat the same error, Error - Could not find the main class: org.apache.sqoop.Sqoop, using the following instructions:
Instead, I downloaded sqoop-1.4.6.bin__hadoop-2.0.4-alpha.tar.gz from Apache Sqoop, installed it at /home/ubuntu/SQOOP/, and renamed sqoop-1.4.6.bin__hadoop-2.0.4-alpha to sqoop. I wanted to use it with YARN.
Then export and set $SQOOP_HOME.
I used this:
export SQOOP_HOME=/home/ubuntu/SQOOP/sqoop/
export PATH=$PATH:$SQOOP_HOME/bin
Now if you go to $SQOOP_HOME/bin and try
./sqoop help
It should work without any issue.
The problem in my case was that the hadoop-env.sh file had this line in it:
export HADOOP_CLASSPATH=${JAVA_HOME}/lib/tools.jar
It seems that when you call sqoop, it internally calls configure-sqoop, which sets HADOOP_CLASSPATH correctly; but then when sqoop calls hadoop, hadoop ignores that variable and resets it to what is in hadoop-env.sh.
The fix was to change hadoop-env.sh to have this line instead:
export HADOOP_CLASSPATH="${JAVA_HOME}/lib/tools.jar:$HADOOP_CLASSPATH"
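The difference between the two lines can be seen with a small simulation. This is only an illustrative sketch, not what the hadoop script literally does; the function name simulate_hadoop_env and the example paths are made up.

```shell
# Hypothetical simulation: show what is left of a classpath set by the caller
# (as configure-sqoop does) after an env script assigns the variable.
# $1 = JAVA_HOME, $2 = classpath set by the caller, $3 = "overwrite" or "prepend".
simulate_hadoop_env() {
    tools_jar="$1/lib/tools.jar"
    classpath=$2
    case $3 in
        # Original line: the caller's entries are thrown away.
        overwrite) classpath=$tools_jar ;;
        # Fixed line: tools.jar is added in front of the caller's entries.
        prepend)   classpath="$tools_jar:$classpath" ;;
        *) echo "unknown mode: $3" >&2; return 1 ;;
    esac
    echo "$classpath"
}
```

With the overwrite mode the Sqoop jar passed in by the caller disappears from the result, while the prepend mode keeps it after tools.jar, which is why the second form of the line lets hadoop find org.apache.sqoop.Sqoop.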
@user225003's solution magically worked. I looked into some of the files, and here is what happens under the hood when you execute the "sqoop" script.
The "sqoop" script essentially executes the "hadoop" script from the $HADOOP_COMMON_HOME/bin/ directory. While configuring Sqoop, we set $HADOOP_COMMON_HOME in "sqoop-env.sh" to the Hadoop installation directory. If your Sqoop and Hadoop installations are not in the regular location /usr/local, I believe sqoop-x.x.x.jar is not on the hadoop script's classpath.