I'm trying to get Scalding working on Zeppelin while using YARN. I followed the steps in the docs here to build the interpreter and set up the classpath override. When I run in local mode, code executes properly. However when I run on my cluster via YARN my jobs fail with:
Error: java.lang.ClassNotFoundException: cascading.CascadingException
or
Error: java.lang.ClassNotFoundException: cascading.tuple.TupleException
What is even stranger to me is that I can go into Zeppelin and execute:
import cascading.tuple.TupleException
import cascading.CascadingException
Both imports appear to find those classes without any problem. It is only when I actually use Scalding (on YARN), for example loading data into a typed pipe and dumping it, that I get the ClassNotFoundException. Any ideas on how to debug this or what to fix?
It looks like the cascading jars are not distributed to the YARN cluster. Please add "zeppelin/interpreter/scalding/*" to the args.string property of the scalding interpreter.
Here's the args.string we use:
-libjars /home/zeppelin-user/zeppelin/interpreter/scalding/,/home/zeppelin-user/deploy-bundle-201608111417/libs/ -Dscalding.reducer.estimator.classes=com.twitter.scalding.reducer_estimation.InputSizeReducerEstimator -Delephantbird.use.combine.input.format=true -Delephantbird.combine.split.size=134217728 --hdfs --repl
tmpjars contains jars that are distributed to the YARN cluster. You can see its contents with the command below:
%scalding
mode.asInstanceOf[Hdfs].conf.get("tmpjars").split(",").foreach(println)
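To check the printed list for the Cascading jars, the tmpjars value can be pasted into a small script. This is a minimal Python sketch, not part of Scalding or Zeppelin; the function name missing_jars, the jar-name prefixes, and the sample paths are assumptions made for illustration.

```python
import posixpath

def missing_jars(tmpjars, required_prefixes):
    """Return the prefixes that match no jar file name in a comma-separated tmpjars value."""
    names = [posixpath.basename(p.strip()) for p in tmpjars.split(",") if p.strip()]
    return [prefix for prefix in required_prefixes
            if not any(n.startswith(prefix) and n.endswith(".jar") for n in names)]

if __name__ == "__main__":
    # Hypothetical tmpjars value; paste the real one printed in Zeppelin.
    sample = ("file:/tmp/libs/scalding-core_2.11-0.16.0.jar,"
              "file:/tmp/libs/cascading-core-2.6.1.jar")
    # Assumed jar-name prefixes; adjust them to the jars in your interpreter directory.
    print(missing_jars(sample, ["cascading-core", "cascading-hadoop"]))
```

Any prefix that is printed has no matching jar in tmpjars, which would suggest that jar is not being shipped to the cluster.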
Every time I run the simple-example project downloaded from the JavaLite examples on GitHub, I get this error: Exception in thread "main" org.javalite.activejdbc.InitException: Failed to connect to JDBC URL: jdbc:mysql:mysql://localhost/movies with user: root. The README also says to run the mvn db-migrator:create command, but that gives me an error too. Why is that? I have installed Maven, and mvn process-classes built successfully. I am also interested in more JavaLite material; there is very little on the internet.
There is a README.md file in this example project:
https://github.com/javalite/javalite-examples/blob/master/simple-example/README.md
It has exact instructions to run this example. One of the steps is:
Adjust your database connection parameters in file:
src/main/resources/database.properties
This will ensure you have a correct configuration for your database.
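The URL in the error message, jdbc:mysql:mysql://localhost/movies, repeats the mysql: prefix, so the URL configured in database.properties may be the first thing worth checking. The following is a minimal Python sketch, not part of JavaLite; the function name is hypothetical and the pattern only covers the basic jdbc:mysql://host[:port]/database form.

```python
import re

# Assumed shape of a basic MySQL JDBC URL: jdbc:mysql://host[:port]/database[?options]
MYSQL_URL = re.compile(r"jdbc:mysql://[^/\s:]+(:\d+)?/[^\s/?]+(\?\S*)?")

def is_plain_mysql_url(url):
    """Return True if url has the basic jdbc:mysql://host[:port]/database form."""
    return MYSQL_URL.fullmatch(url) is not None

if __name__ == "__main__":
    # The second URL is the one quoted in the error message.
    for url in ("jdbc:mysql://localhost/movies",
                "jdbc:mysql:mysql://localhost/movies"):
        print(url, is_plain_mysql_url(url))
```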
When I run the command below,
s3-dist-cp --src s3://test/9.19 --dest hdfs:///user/hadoop/test
I get an error about auxService.
20/02/03 07:52:13 INFO mapreduce.Job: Task Id : attempt_1580716305878_0001_m_000000_2, Status : FAILED
Container launch failed for container_1580716305878_0001_01_000004 : org.apache.hadoop.yarn.exceptions.InvalidAuxServiceException: The auxService:mapreduce_shuffle does not exist
In many Q&A threads I found a solution like this (link).
But there is no nodemanager process running:
[hadoop@ip-172-31-37-115 ~]$ initctl list | grep yarn
hadoop-yarn-timelineserver start/running, process 8149
hadoop-yarn-resourcemanager start/running, process 17331
hadoop-yarn-proxyserver start/running, process 8147
My EMR cluster was created from the quick setup menu with emr-5.28.0.
Does anyone know about this problem?
Thanks!
I'm sure there's some way to update the configs, but what I did was create a cluster using the 'advanced' setup and chose these software packages:
Ganglia
Hive
Hue
Mahout
Pig
Tez
Spark
Hadoop
(8 in total)
Most of those, except Spark, are installed with the default settings (the first radio button for software packages in quick setup). One of these packages, or something related to it, is what causes s3-dist-cp to be installed, and I was able to use it with no problems with that setup.
I want to start an Ignite node with a configuration file named example-igfs.xml. I have altered this configuration to use IGFS as a cache layer for HDFS, but when I execute the command to start the Ignite node, I get this error:
java.lang.NoClassDefFoundError: com/google/common/base/Preconditions
at org.apache.hadoop.conf.Configuration$DeprecationDelta.(Configuration.java:361)
at org.apache.hadoop.conf.Configuration$DeprecationDelta.(Configuration.java:374)
at org.apache.hadoop.conf.Configuration.(Configuration.java:456)
at org.apache.ignite.internal.processors.hadoop.impl.HadoopUtils.safeCreateConfiguration(HadoopUtils.java:334)
at org.apache.ignite.internal.processors.hadoop.impl.delegate.HadoopBasicFileSystemFactoryDelegate.start(HadoopBasicFileSystemFactoryDelegate.java:129)
A java.lang.NoClassDefFoundError usually occurs when Ignite can't find the required libraries (JARs).
In your case, you have to move the JARs to the $IGNITE_HOME\libs folder.
Create a folder in the libs directory, say hadoop-libs, and move all required JARs to this folder.
I am not a Hadoop expert, but it seems that you are missing the Hadoop client and the Google Guava library it depends on.
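To confirm whether any JAR under the libs folder actually provides the missing class, the JARs can be scanned as zip archives. This is a minimal Python sketch, not an Ignite tool; the function name jars_containing is hypothetical, and it simply lists every JAR below the given directory that contains the class file.

```python
import os
import zipfile
from pathlib import Path

def jars_containing(libs_dir, class_name):
    """Return the jars under libs_dir (searched recursively) that contain class_name."""
    root = Path(libs_dir)
    if not root.is_dir():
        return []
    entry = class_name.replace(".", "/") + ".class"
    hits = []
    for jar in sorted(root.rglob("*.jar")):
        try:
            with zipfile.ZipFile(jar) as zf:
                if entry in zf.namelist():
                    hits.append(str(jar))
        except zipfile.BadZipFile:
            continue  # skip files that are not valid zip archives
    return hits

if __name__ == "__main__":
    # IGNITE_HOME is read from the environment; "." is only a fallback for this demo.
    libs = os.path.join(os.environ.get("IGNITE_HOME", "."), "libs")
    for jar in jars_containing(libs, "com.google.common.base.Preconditions"):
        print(jar)
```

If nothing is printed, no JAR below libs contains com.google.common.base.Preconditions, which is consistent with the error in the question.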
I am trying to start an Ignite cluster from the command line on Windows.
This is what I did:
Downloaded the Ignite binary distribution and put it on the C drive.
Set the environment variable IGNITE_HOME to that folder location.
In the command line, I opened the directory:
C:\apache-ignite-fabric-2.2.0-bin\bin
Then, from that directory, I ran:
C:\apache-ignite-fabric-2.2.0-bin\bin>sh ignite.sh examples/config/example-ignite.xml
I am getting the following error:
Failed to create Ignite component (consider adding ignite-spring module to classpath) [component=SPRING, cls=org.apache.ignite.internal.processors.spring.IgniteSpringProcessorImpl]
What can be the reason for this error?
Found the solution:
It needs to be run with the .bat file, not the .sh file:
C:\apache-ignite-fabric-2.2.0-bin\bin>ignite.bat examples/config/example-ignite.xml
If you're on Windows I imagine you should try ignite.bat?
ignite.sh might have problems with classpath when run on Windows, that would explain it.
(I'm running on CentOS 5.8.) I've been following the directions for a Clustered (Multi-Server) ZooKeeper setup, but I'm getting an error when I try to start my server. When I run the command as described in the documentation:
java -cp zookeeper-3.4.6.jar:lib/log4j-1.2.16.jar:conf \ org.apache.zookeeper.server.quorum.QuorumPeerMain conf/zoo.cfg
I get the error:
Error: Could not find or load main class org.apache.zookeeper.server.quorum.QuorumPeerMain
I have my files location as such and am running from the ~/zookeeper-3.4.6 directory:
~/zookeeper-3.4.6/zookeeper-3.4.6.jar
~/zookeeper-3.4.6/conf/zoo.cfg
~/zookeeper-3.4.6/data/myid
~/zookeeper-3.4.6/lib/log4j-1.2.16.jar
~/zookeeper-3.4.6/bin/zkServer.sh
Does anyone know why this error is happening? I don't quite understand the arguments that are being passed, so it is hard for me to debug the path issue. As a side note, I've tried running ./zookeeper-3.4.6/bin/zkServer.sh start, which did successfully work, but the documentation seems to indicate that command is meant for a single-node instance.
Edit:
I was able to make some progress by modifying the command and taking out the :conf \ part, so now I'm running:
java -cp zookeeper-3.4.6.jar:lib/log4j-1.2.16.jar: org.apache.zookeeper.server.quorum.QuorumPeerMain conf/zoo.cfg
I get a new error, but this is progress...
Exception in thread "main" java.lang.NoClassDefFoundError: org/slf4j/LoggerFactory
at org.apache.zookeeper.server.quorum.QuorumPeerMain.<clinit>(QuorumPeerMain.java:64)
Caused by: java.lang.ClassNotFoundException: org.slf4j.LoggerFactory
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
... 1 more
which corresponds to lines 63 and 64 from QuorumPeerMain
public class QuorumPeerMain {
private static final Logger LOG = LoggerFactory.getLogger(QuorumPeerMain.class);
I got the error "Could not find or load main class org.apache.zookeeper.server.quorum.QuorumPeerMain" because I had downloaded the apache-zookeeper-X.X.X.tar.gz file and not the apache-zookeeper-X.X.X-bin.tar.gz file. Downloading, untarring, and using the -bin.tar.gz file fixed it for me.
You can also build the binaries from the apache-zookeeper-X.X.X.tar.gz file; see the answer by @vincent.
This happens when you download and use apache-zookeeper-X.X.X.tar.gz; you should use apache-zookeeper-X.X.X-bin.tar.gz instead. That should solve the issue.
The issue can be solved by untarring apache-zookeeper-3.5.6-bin.tar.gz. Initially, I encountered the same error after I untarred and installed apache-zookeeper-3.5.6.tar.gz and executed /bin/zkServer.sh start.
Please make sure you have downloaded apache-zookeeper-3.5.6-bin.tar.gz.
I got the same error. I solved it by checking the README.md file in apache-zookeeper-[version].tar.gz. Just type:
mvn clean install -DskipTests
Then start ZooKeeper. That should solve the error for you, too.
You should be able to run zkServer.sh to get a clustered setup. It will use the same conf/zoo.cfg that you are providing manually, which will contain the cluster endpoints.
The best way to check what you are missing from your classpath (and see the proper java command) is to run the zkServer.sh you said worked for you.
When it starts up, check the actual command used like this:
ps -ef | grep zookeeper
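Once the working command line is visible in the ps output, its classpath can be compared with the one passed by hand. This is a minimal Python sketch of that comparison; the function names are hypothetical, and the "working" command line in the demo (paths and the slf4j jar version) is an assumption for illustration, not real output.

```python
import posixpath

def classpath_entries(command_line, sep=":"):
    """Return the entries of the -cp/-classpath argument in a java command line."""
    args = command_line.split()
    for i, arg in enumerate(args[:-1]):
        if arg in ("-cp", "-classpath"):
            return [entry for entry in args[i + 1].split(sep) if entry]
    return []

def missing_entries(manual, working):
    """Return entries of `working` whose file name does not appear in `manual`."""
    have = {posixpath.basename(e.rstrip("/")) for e in manual}
    return [e for e in working if posixpath.basename(e.rstrip("/")) not in have]

if __name__ == "__main__":
    # The manual command from the question.
    manual = ("java -cp zookeeper-3.4.6.jar:lib/log4j-1.2.16.jar: "
              "org.apache.zookeeper.server.quorum.QuorumPeerMain conf/zoo.cfg")
    # Hypothetical command line as it might appear in the ps output.
    working = ("java -cp /opt/zk/zookeeper-3.4.6.jar:/opt/zk/lib/slf4j-api-1.6.1.jar:"
               "/opt/zk/lib/log4j-1.2.16.jar:/opt/zk/conf "
               "org.apache.zookeeper.server.quorum.QuorumPeerMain /opt/zk/conf/zoo.cfg")
    for entry in missing_entries(classpath_entries(manual), classpath_entries(working)):
        print(entry)
```

The entries printed are candidates to add to the manual -cp argument.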
I also get this error when trying to run Apache ZooKeeper v3.5.5 on Windows:
Error: Could not find or load main class org.apache.zookeeper.server.quorum.QuorumPeerMain
As @Onnonymous said, I fixed my problem by downloading the -bin.tar.gz version (here) instead of the .tar.gz version.
I am using apache-zookeeper-3.8.0-bin.tar.gz (Ubuntu 20.04).
This has all the necessary files to successfully start the ZooKeeper process.
Specifying the other nodes in the zoo.cfg file and starting ZooKeeper automatically adds the other nodes and starts the election process.
DigitalOcean has a good setup guide here: setup zookeeper cluster
Here's the zoo.cfg for reference:
tickTime=2000
dataDir=/data/zookeeper
clientPort=2181
maxClientCnxns=60
initLimit=10
syncLimit=5
server.1=server1hostname/ip:2888:3888
server.2=server2hostname/ip:2888:3888
server.3=server3hostname/ip:2888:3888
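A zoo.cfg like this can be sanity-checked before starting the ensemble, for example to confirm that each server.N line points to a different address. This is a minimal Python sketch, not a ZooKeeper tool; the function names and the host names in the demo are hypothetical. Note also that each node needs a myid file in its dataDir containing its own N.

```python
def parse_zoo_cfg(text):
    """Split zoo.cfg text into plain settings and a {server_id: address} mapping."""
    settings, servers = {}, {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        if key.startswith("server."):
            servers[int(key[len("server."):])] = value
        else:
            settings[key] = value
    return settings, servers

def duplicate_servers(servers):
    """Return {address: [ids]} for every address used by more than one server id."""
    seen = {}
    for server_id, address in sorted(servers.items()):
        seen.setdefault(address, []).append(server_id)
    return {address: ids for address, ids in seen.items() if len(ids) > 1}

if __name__ == "__main__":
    # Hypothetical host names, used only for this demo.
    sample = """tickTime=2000
dataDir=/data/zookeeper
clientPort=2181
server.1=zk1.example:2888:3888
server.2=zk1.example:2888:3888
server.3=zk3.example:2888:3888
"""
    settings, servers = parse_zoo_cfg(sample)
    print(settings["dataDir"], sorted(servers))
    print(duplicate_servers(servers))
```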