Need Help Setting Up Apache Hadoop on Apache Mesos

I'm trying to set up Hadoop on Mesos using the document below:
https://docs.mesosphere.com/tutorials/run-hadoop-on-mesos/
I'm facing a problem at step 9:
sudo -u mapred ./hadoop-2.0.0-mr1-cdh4.2.1/bin/hadoop dfs -rm -f /hadoop-2.0.0-mr1-cdh4.2.1.tgz
sudo -u mapred /usr/bin/hadoop dfs -copyFromLocal ./hadoop-2.0.0-mr1-cdh4.2.1.tgz /
I am still new to this concept; I set up the Mesos cluster using the following tutorial:
https://www.digitalocean.com/community/tutorials/how-to-configure-a-production-ready-mesosphere-cluster-on-ubuntu-14-04
Now I'm getting errors while performing dfs commands:
root@station1:~# sudo -u mapred ./hadoop-2.0.0-mr1-cdh4.2.1/bin/hadoop dfs -rm -f /hadoop-2.0.0-mr1-cdh4.2.1.tgz
-rm: Expected authority at index 7: hdfs://
Usage: hadoop fs [generic options] -rm [-f] [-r|-R] [-skipTrash] <src> ...

This tutorial assumes you have HDFS already installed on your cluster. You can do this by manually installing HDFS on each node, or you can try out the new HDFS framework: https://github.com/mesosphere/hdfs
Does hadoop fs -ls hdfs:// work on its own? If not, you'll need to install and configure HDFS appropriately.
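For reference, the "Expected authority at index 7: hdfs://" message usually means the client's fs.default.name (fs.defaultFS on newer releases) is set to a bare hdfs:// URI with no namenode host. Once HDFS is up, a core-site.xml entry along these lines should let the dfs commands resolve the namenode (namenode-host and 8020 are placeholders for your cluster):
<property>
<name>fs.default.name</name>
<value>hdfs://namenode-host:8020</value>
</property>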

Related

hdfsBuilderConnect error while using TF Serving to load a model from HDFS

Here is my environment info:
TensorFlow Serving version 1.14
OS: macOS 10.15.7
I want to load a model from HDFS using TF Serving.
When I build a tensorflow-serving:hadoop Docker image, like this:
FROM tensorflow/serving:2.2.0
RUN apt update && apt install -y openjdk-8-jre
RUN mkdir /opt/hadoop-2.8.2
COPY /hadoop-2.8.2 /opt/hadoop-2.8.2
ENV JAVA_HOME /usr/lib/jvm/java-8-openjdk-amd64
ENV HADOOP_HDFS_HOME /opt/hadoop-2.8.2
ENV HADOOP_HOME /opt/hadoop-2.8.2
ENV LD_LIBRARY_PATH ${LD_LIBRARY_PATH}:${JAVA_HOME}/jre/lib/amd64/server
# ENV PATH $PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin
RUN echo '#!/bin/bash \n\n\
CLASSPATH=$(${HADOOP_HDFS_HOME}/bin/hadoop classpath --glob) \
tensorflow_model_server --port=8500 --rest_api_port=9000 \
--model_name=${MODEL_NAME} --model_base_path=${MODEL_BASE_PATH}/${MODEL_NAME} \
"$@"' > /usr/bin/tf_serving_entrypoint.sh \
&& chmod +x /usr/bin/tf_serving_entrypoint.sh
EXPOSE 8500
EXPOSE 9000
ENTRYPOINT ["/usr/bin/tf_serving_entrypoint.sh"]
and then run:
docker run -p 9001:9000 --name tensorflow-serving-11 -e MODEL_NAME=tfrest -e MODEL_BASE_PATH=hdfs://ip:port/user/cess2_test/workspace/cess/models -t tensorflow_serving:1.14-hadoop-2.8.2
I hit this problem. PS: I have already modified the Hadoop config in hadoop-2.8.2.
hdfsBuilderConnect(forceNewInstance=0, nn=ip:port, port=0, kerbTicketCachePath=(NULL), userName=(NULL))
error:(unable to get stack trace for java.lang.NoClassDefFoundError exception: ExceptionUtils::getStackTrace error.)
Are there any suggestions on how to solve this problem?
Thanks.
I solved this problem by adding the absolute Hadoop path to the classpath.
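In case it helps others: the NoClassDefFoundError from libhdfs usually means the Hadoop jars are not on CLASSPATH inside the container when tensorflow_model_server starts. A minimal sketch of an entrypoint that exports the expanded, absolute jar paths before launching the server (paths and the 9000 REST port follow the Dockerfile above; adjust to your own layout):
#!/bin/bash
# Expand the Hadoop classpath into absolute jar paths so libhdfs can find the HDFS client classes.
export CLASSPATH=$(${HADOOP_HDFS_HOME}/bin/hadoop classpath --glob)
tensorflow_model_server --port=8500 --rest_api_port=9000 \
  --model_name=${MODEL_NAME} \
  --model_base_path=${MODEL_BASE_PATH}/${MODEL_NAME} \
  "$@"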

Hive table loading: Unable to move source file

I'm beginning to learn Big Data with Hadoop and Hive.
I can't load local data into a Hive table.
The Hive command is:
load data local inpath '/usr/local/nhanvien/testHive.txt' into table nhanvien;
I get this error:
Loading data to table hivetest.nhanvien Failed with exception Unable
to move source file:/usr/local/nhanvien/testHive.txt to destination
hdfs://localhost:9000/user/hive/warehouse/hivetest.db/nhanvien/testHive_copy_3.txt
FAILED: Execution Error, return code 1 from
org.apache.hadoop.hive.ql.exec.MoveTask
I tried:
hadoop fs -chmod g+w /user/hive/warehouse
sudo chmod -R 777 /home/abc/employeedetails
but I still get this error.
Can someone give me a solution?
You can try with:
export HADOOP_USER_NAME=hdfs
hive -e "load data local inpath '/usr/local/nhanvien/testHive.txt' into table nhanvien;"
It's a permission issue. Try granting permissions on the local file and the directory where your file exists.
sudo chmod -R 777 /usr/local/nhanvien/testHive.txt
Then log in as $HDFS_USER and run the following commands:
hdfs dfs -chown -R $HIVE_USER:$HDFS_USER /user/hive
hdfs dfs -chmod -R 775 /user/hive
hdfs dfs -chmod -R 775 /user/hive/warehouse
You can also configure hdfs-site.xml as follows:
<property>
<name>dfs.permissions</name>
<value>false</value>
</property>
This configuration disables permission checking on HDFS, so a regular user can perform operations on HDFS.
Hope this helps.
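Note that a change to dfs.permissions in hdfs-site.xml only takes effect after the NameNode is restarted; assuming a standard setup managed by the bundled sbin scripts, something like:
$HADOOP_HOME/sbin/stop-dfs.sh
$HADOOP_HOME/sbin/start-dfs.sh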

What permissions are required to run the Hive CLI

I'm seeing an issue with running the Hive CLI. When I run the CLI on an edge node I receive the following error regarding HDFS permissions:
c784gnj:~ # sudo hive
/usr/lib/hive/conf/hive-env.sh: line 5: /usr/lib/hive/lib/hive-hbase-handler-1.1.0-cdh5.5.2.jar,/usr/lib/hbase/hbase-common.jar,/usr/lib/hbase/lib/htrace-core4-4.0.1-incubating.jar,/usr/lib/hbase/lib/htrace-core-3.2.0-incubating.jar,/usr/lib/hbase/lib/htrace-core.jar,/usr/lib/hbase/hbase-hadoop2-compat.jar,/usr/lib/hbase/hbase-client.jar,/usr/lib/hbase/hbase-server.jar,/usr/lib/hbase/hbase-hadoop-compat.jar,/usr/lib/hbase/hbase-protocol.jar: No such file or directory
Java HotSpot(TM) 64-Bit Server VM warning: Using incremental CMS is deprecated and will likely be removed in a future release
16/10/11 10:35:49 WARN conf.HiveConf: HiveConf of name hive.metastore.local does not exist
Logging initialized using configuration in jar:file:/usr/lib/hive/lib/hive-common-1.1.0-cdh5.5.2.jar!/hive-log4j.properties
Exception in thread "main" java.lang.RuntimeException: org.apache.hadoop.security.AccessControlException: Permission denied: user=app1_K, access=WRITE, inode="/user":hdfs:supergroup:drwxr-xr-x
at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.checkFsPermission(DefaultAuthorizationProvider.java:257)
at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.check(DefaultAuthorizationProvider.java:238)
at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.check(DefaultAuthorizationProvider.java:216)
What is hive trying to write to in the /user directory in HDFS?
I can already see that /user/hive is created:
drwxrwxr-t - hive hive 0 2015-03-16 22:17 /user/hive
As you can see, I am behind Kerberos auth on Hadoop.
Thanks in advance!
The log says you need to grant permission on the HDFS /user directory to user app1_K.
Command:
hadoop fs -setfacl -m -R user:app1_K:rwx /user
Execute this command as a privileged user from the Hadoop bin directory.
If you get a similar permission error on any other HDFS directory, you will have to grant permission on that directory as well.
Refer to the link below for more information.
https://hadoop.apache.org/docs/r2.7.2/hadoop-project-dist/hadoop-hdfs/HdfsPermissionsGuide.html#ACLs_Access_Control_Lists
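To verify that the ACL was applied as expected, you can inspect the directory afterwards with the standard HDFS CLI:
hdfs dfs -getfacl /user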
Instead of disabling HDFS access privileges altogether, as suggested by @Kumar, you might simply create an HDFS home dir for every new user on the system, so that Hive/Spark/Pig/Sqoop jobs have a valid location to create temp files...
On a Kerberized cluster:
kinit hdfs@MY.REALM
hdfs dfs -mkdir /user/app1_k
hdfs dfs -chown app1_k:app1_k /user/app1_k
Otherwise:
export HADOOP_USER_NAME=hdfs
hdfs dfs -mkdir /user/app1_k
hdfs dfs -chown app1_k:app1_k /user/app1_k

Apache Flume stuck after exec flume-ng

I need help.
I've downloaded Apache Flume and installed it outside Hadoop; I just want to try netcat logging through the console.
I'm using version 1.6.0.
Here's my conf https://gist.github.com/ans-4175/297e2b4fc0a67d826b4b
Here's how I started it
bin/flume-ng agent -c conf -f conf/netcat.conf Dflume.root.logger=DEBUG,console -n Agent1
But it gets stuck after printing only this output:
Info: Sourcing environment configuration script /root/apache-flume/conf/flume-env.sh
Info: Including Hive libraries found via () for Hive access
+ exec /usr/lib/jvm/java-1.7.0-openjdk-amd64/bin/java -Xms100m -Xmx2000m -cp '/root/apache-flume/conf:/root/apache-flume/lib/*:/root/apache-flume/lib/*:/lib/*' -Djava.library.path= org.apache.flume.node.Application -f conf/netcat.conf Dflume.root.logger=DEBUG,console -n Agent1
Any suggestions for a simple start and install?
Thanks
Dumb me, it should be:
bin/flume-ng agent -c conf -f conf/netcat.conf -Dflume.root.logger=DEBUG,console -n Agent1
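For completeness, once the -D flag is fixed as above, a minimal netcat.conf along these lines works with that command (the agent name Agent1 matches the -n flag; the source/channel/sink names and port 44444 are just illustrative):
Agent1.sources = netcat-source
Agent1.channels = memory-channel
Agent1.sinks = logger-sink
Agent1.sources.netcat-source.type = netcat
Agent1.sources.netcat-source.bind = localhost
Agent1.sources.netcat-source.port = 44444
Agent1.sources.netcat-source.channels = memory-channel
Agent1.channels.memory-channel.type = memory
Agent1.sinks.logger-sink.type = logger
Agent1.sinks.logger-sink.channel = memory-channel
With this config, connecting via telnet localhost 44444 and typing a line should show the event in the console logger.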

How to work with Mahout?

I have tried to work with Mahout using this link:
http://girlincomputerscience.blogspot.in/2010/11/apache-mahout.html
When I execute the command:
anand#ubuntu:~/Downloads/mahout-distribution-0.9$ bin/mahout recommenditembased --input mydata.dat --usersFile user.dat --numRecommendations 2 --output output/ --similarityClassname SIMILARITY_PEARSON_CORRELATION
it shows:
hadoop binary is not in PATH,HADOOP_HOME/bin,HADOOP_PREFIX/bin, running locally
Error occurred during initialization of VM
Could not reserve enough space for object heap
Could not create the Java virtual machine.
What may be the possible reasons for the errors?
You need to set up Apache Hadoop first (also described here):
$ wget -c http://mirror.metrocast.net/apache/hadoop/common/hadoop-1.2.1/hadoop-1.2.1-bin.tar.gz
$ tar zxf hadoop-1.2.1-bin.tar.gz
$ cd hadoop-1.2.1
$ export PATH=`pwd`/bin:$PATH
Then try to set up Apache Mahout:
Apache Mahout setup
$ wget -c http://archive.apache.org/dist/mahout/0.9/mahout-distribution-0.9.tar.gz
$ tar zxf mahout-distribution-0.9.tar.gz
$ cd mahout-distribution-0.9
Set up the environment variables:
export HADOOP_HOME=/path/to/hadoop-1.2.1
export PATH=$HADOOP_HOME/bin:$PATH
export MAHOUT_HOME=/path/to/mahout-distribution-0.9
export PATH=$MAHOUT_HOME/bin:$PATH
Now your Mahout installation should work fine (read here for more).
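As for the "Could not reserve enough space for object heap" part: the bin/mahout launcher honors a MAHOUT_HEAPSIZE variable (in MB), so if the JVM still refuses to start you can try capping the heap to something your machine can actually allocate before re-running, for example:
export MAHOUT_HEAPSIZE=1024
bin/mahout recommenditembased --input mydata.dat --usersFile user.dat --numRecommendations 2 --output output/ --similarityClassname SIMILARITY_PEARSON_CORRELATION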