org.apache.hadoop.security.authentication.util.KerberosName$NoMatchingRule: No rules applied - hadoop-yarn

I am receiving the error below. I have properly listed my core-site.xml properties for hadoop.security.auth_to_local. Any help is appreciated.
Caused by: org.apache.hadoop.security.authentication.util.KerberosName$NoMatchingRule: No rules applied to yarn/second-m.us-east1-c.c.golden-shine-351000.internal@US-EAST1-C.C.GOLDEN-SHINE-351000.INTERNAL
core-site.xml
-------------
<property>
<name>hadoop.security.auth_to_local</name>
<value>RULE:[1:$1](.*)s/(.*)/$1/g
RULE:[2:$1](.*)s/(.*)/$1/g
RULE:[2:$1/$2@$0]([ndj]n/.*@US-EAST1-C\.C\.GOLDEN-SHINE-351000\.INTERNAL)s/.*/hdfs/
RULE:[2:$1/$2@$0]([rn]m/.*@US-EAST1-C\.C\.GOLDEN-SHINE-351000\.INTERNAL)s/.*/yarn/
RULE:[2:$1/$2@$0](jhs/.*@US-EAST1-C\.C\.GOLDEN-SHINE-351000\.INTERNAL)s/.*/mapred/
DEFAULT</value>
</property>
kalyanchintanippu@second-m:~$ hadoop org.apache.hadoop.security.HadoopKerberosName yarn/second-m.us-east1-c.c.golden-shine-351000.internal@US-EAST1-C.C.GOLDEN-SHINE-351000.INTERNAL
Name: yarn/second-m.us-east1-c.c.golden-shine-351000.internal@US-EAST1-C.C.GOLDEN-SHINE-351000.INTERNAL to yarn
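Since the mapping resolves correctly with the command-line test above, it is worth confirming that the YARN daemons actually read the same core-site.xml. A hedged sanity check; the second command just repeats the mapping test for another service principal and is only an example:
hdfs getconf -confKey hadoop.security.auth_to_local
hadoop org.apache.hadoop.security.HadoopKerberosName nn/second-m.us-east1-c.c.golden-shine-351000.internal@US-EAST1-C.C.GOLDEN-SHINE-351000.INTERNAL
The first command prints the auth_to_local rules that the client configuration actually exposes, which can be compared against the value in core-site.xml above.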

Related

Hive returning error while running insert query

I am trying to run an insert query using MapReduce and I get the following error:
Application application_1609169302439_0001 failed 2 times due to AM Container for appattempt_1609169302439_0001_000002 exited with exitCode: 1 Failing this attempt.
Diagnostics: [2020-12-28 16:29:05.332] Exception from container-launch.
Container id: container_1609169302439_0001_02_000001
Exit code: 1
[2020-12-28 16:29:05.335] Container exited with a non-zero exit code 1. Error file: prelaunch.err.
Last 4096 bytes of prelaunch.err :
Last 4096 bytes of stderr :
Error: Could not find or load main class org.apache.hadoop.mapreduce.v2.app.MRAppMaster
Please check whether your <HADOOP_HOME>/etc/hadoop/mapred-site.xml contains the below configuration:
yarn.app.mapreduce.am.env    HADOOP_MAPRED_HOME=${full path of your hadoop distribution directory}
mapreduce.map.env            HADOOP_MAPRED_HOME=${full path of your hadoop distribution directory}
mapreduce.reduce.env         HADOOP_MAPRED_HOME=${full path of your hadoop distribution directory}
Here is what my mapred-site.xml config file looks like:
<configuration>
<property>
<name>mapreduce.jobtracker.address</name>
<value>localhost:54311</value>
</property>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>yarn.app.mapreduce.am.env</name>
<value>HADOOP_MAPRED_HOME=$HADOOP_MAPRED_HOME</value>
</property>
<property>
<name>mapreduce.map.env</name>
<value>HADOOP_MAPRED_HOME=$HADOOP_MAPRED_HOME</value>
</property>
<property>
<name>mapreduce.reduce.env</name>
<value>HADOOP_MAPRED_HOME=$HADOOP_MAPRED_HOME</value>
</property>
</configuration>
My understanding is that it is a configuration issue, but I cannot find a clear and simple answer for what is missing on my system.
I had installed Tez before, but it wasn't working either.
Any help or guidance would be appreciated. I browsed the site and could find similar issues reported, but wasn't able to fix mine based on the solutions provided.
Best
After some research, I updated my configuration and I am now able to run MapReduce jobs from the Hive CLI.
Here's the updated part of mapred-site.xml:
<property>
<name>yarn.app.mapreduce.am.env</name>
<value>HADOOP_MAPRED_HOME=$HADOOP_HOME</value>
</property>
<property>
<name>mapreduce.map.env</name>
<value>HADOOP_MAPRED_HOME=$HADOOP_HOME</value>
</property>
<property>
<name>mapreduce.reduce.env</name>
<value>HADOOP_MAPRED_HOME=$HADOOP_HOME</value>
</property>
<property>
<name>mapreduce.application.classpath</name>
<value>$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*,$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*,$HADOOP_MAPRED_HOME/share/hadoop/common/*,$HADOOP_MAPRED_HOME/share/hadoop/common/lib/*,$HADOOP_MAPRED_HOME/share/hadoop/yarn/*,$HADOOP_MAPRED_HOME/share/hadoop/yarn/lib/*,$HADOOP_MAPRED_HOME/share/hadoop/hdfs/*,$HADOOP_MAPRED_HOME/share/hadoop/hdfs/lib/*</value>
</property>
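If the classpath value above does not match your installation layout, a hedged way to see what the local Hadoop distribution actually puts on the classpath (assuming the hadoop and mapred binaries are on the PATH) is:
hadoop classpath
mapred classpath
The output can then be compared against mapreduce.application.classpath.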
yarn-site.xml also contains the path to jars.
<property>
<name>yarn.application.classpath</name>
<value>
%HADOOP_HOME%\etc\hadoop,
%HADOOP_HOME%\share\hadoop\common\*,
%HADOOP_HOME%\share\hadoop\common\lib\*,
%HADOOP_HOME%\share\hadoop\hdfs\*,
%HADOOP_HOME%\share\hadoop\hdfs\lib\*,
%HADOOP_HOME%\share\hadoop\mapreduce\*,
%HADOOP_HOME%\share\hadoop\mapreduce\lib\*,
%HADOOP_HOME%\share\hadoop\yarn\*,
%HADOOP_HOME%\share\hadoop\yarn\lib\*
</value>
</property>
After these updates and restarting the cluster, I am able to run jobs and follow their progress on port 8088.
Best

Hive action failing with SLF4J error : SLF4J: Class path contains multiple SLF4J bindings

I am trying to create a simple workflow with a hive action. I'm using Cloudera Quickstart VM (CDH 5.12). The following are the components of my workflow:
1) top_n_products.hql
create table instacart.top_n as
(
select * from
(
select row_number() over (order by no_of_times_ordered desc)as num_rank, product_id, product_name, no_of_times_ordered
from
(
select A.product_id, B.product_name, count(*) as no_of_times_ordered from
instacart.order_products__train as A
left outer join
instacart.products as B
on A.product_id=B.product_id
group by A.product_id, B.product_name
)C
)D
where num_rank <= ${N}
);
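As a hedged side note, the parameterized script can be smoke-tested outside Oozie with the Hive CLI by passing the variable explicitly (N=10 here is just an example value):
hive --hivevar N=10 -f top_n_products.hql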
2) hive-config.xml
I have basically copied the default hive-site.xml from /etc/hive/conf into my workflow workspace folder and renamed it to hive-config.xml
<configuration>
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://127.0.0.1/metastore?createDatabaseIfNotExist=true</value>
<description>JDBC connect string for a JDBC metastore</description>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
<description>Driver class name for a JDBC metastore</description>
</property>
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>hive</value>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>cloudera</value>
</property>
<property>
<name>hive.hwi.war.file</name>
<value>/usr/lib/hive/lib/hive-hwi-0.8.1-cdh4.0.0.jar</value>
<description>This is the WAR file with the jsp content for Hive Web Interface</description>
</property>
<property>
<name>datanucleus.fixedDatastore</name>
<value>true</value>
</property>
<property>
<name>datanucleus.autoCreateSchema</name>
<value>false</value>
</property>
<property>
<name>hive.metastore.uris</name>
<value>thrift://127.0.0.1:9083</value>
<description>IP address (or fully-qualified domain name) and port of the metastore host</description>
</property>
</configuration>
3) Workflow properties
In the hive action, I set the following:
- set HIVE XML, Job XML paths to my hive-config.xml
- Also added hive-config.xml to Files
- In the workflow properties, set the path to my workspace
- Defined the parameter N in my query
Screenshot of my Hive Action properties
When I try to run the workflow it fails, and stderr shows the following error:
Log Type: stderr
Log Upload Time: Mon Nov 20 19:49:04 -0800 2017
Log Length: 2759
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/lib/zookeeper/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/var/lib/hadoop-yarn/cache/yarn/nm-local-dir/filecache/130/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
Nov 20, 2017 7:47:34 PM com.google.inject.servlet.InternalServletModule$BackwardsCompatibleServletContextProvider get
WARNING: You are attempting to use a deprecated API (specifically, attempting to @Inject ServletContext inside an eagerly created singleton. While we allow this for backwards compatibility, be warned that this MAY have unexpected behavior if you have more than one injector (with ServletModule) running in the same JVM. Please consult the Guice documentation at http://code.google.com/p/google-guice/wiki/Servlets for more information.
Nov 20, 2017 7:47:35 PM com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory register
...
INFO: Binding org.apache.hadoop.mapreduce.v2.app.webapp.AMWebServices to GuiceManagedComponentProvider with the scope "PerRequest"
log4j:WARN No appenders could be found for logger (org.apache.hadoop.mapreduce.v2.app.MRAppMaster).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
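The SLF4J lines are typically only warnings about duplicate bindings rather than the failure itself; a hedged way to see where the competing slf4j-log4j12 jars come from on the node (the search paths are taken from the log above) is:
find /usr/lib /var/lib/hadoop-yarn -name 'slf4j-log4j12*.jar' 2>/dev/null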
Below are the workflow.xml and job.properties that are generated:
1) Workflow XML:
<workflow-app name="Top_N_Products" xmlns="uri:oozie:workflow:0.5">
<global>
<job-xml>hive-config.xml</job-xml>
</global>
<start to="hive-87ac"/>
<kill name="Kill">
<message>Action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
</kill>
<action name="hive-87ac" cred="hcat">
<hive xmlns="uri:oozie:hive-action:0.2">
<job-tracker>${jobTracker}</job-tracker>
<name-node>${nameNode}</name-node>
<job-xml>hive-config.xml</job-xml>
<script>top_n_products.hql</script>
<param>N={N}</param>
<file>hive-config.xml#hive-config.xml</file>
</hive>
<ok to="End"/>
<error to="Kill"/>
</action>
<end name="End"/>
</workflow-app>
2) job.properties
security_enabled=False
send_email=False
dryrun=False
nameNode=hdfs://quickstart.cloudera:8020
jobTracker=localhost:8032
N=10
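For what it's worth, the generated workflow can also be re-submitted outside Hue with the Oozie CLI; the server URL below is an assumption based on the default Oozie port on the Quickstart VM:
oozie job -oozie http://quickstart.cloudera:11000/oozie -config job.properties -run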
Please note that the hive query runs perfectly fine through the Hive query editor. Am I missing something while configuring the workflow? Any help is appreciated!
Thanks,
Deb

Not able to start hwi in hive 2.0.0

I have my hive-hwi-2.0.0.war file in the lib folder, right where it should be.
In my hive-site.xml, I have overridden these configurations:
<property>
<name>hive.hwi.listen.host</name>
<value>localhost</value>
</property>
<property>
<name>hive.hwi.listen.port</name>
<value>9998</value>
</property>
<property>
<name>hive.hwi.war.file</name>
<value>/lib/hive-hwi-2.0.0.war</value>
<description>This sets the path to the HWI war file, relative to ${HIVE_HOME}. </description>
</property>
But it just does not find it.
When I run the command hive --service hwi, it throws an error:
HWI war file not found at .... location.
What could really be the issue here?
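Since hive.hwi.war.file is described as relative to ${HIVE_HOME}, a hedged first check (assuming HIVE_HOME is exported in the shell) is whether the configured path actually resolves to the file placed in lib:
echo $HIVE_HOME
ls -l $HIVE_HOME/lib/hive-hwi-2.0.0.war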

HBase with YARN throws ERROR

I'm using Hadoop 2.5.1 with HBase 0.98.11 on Ubuntu 14.04
I could run it in pseudo-distributed mode. Now I want to run it in fully distributed mode. I followed the instructions from several sites and ended up with a runtime error, "Error: org/apache/hadoop/hbase/HBaseConfiguration" (while there is no error when I compile the code).
After trying a few things, I found that if I comment out mapreduce.framework.name in mapred-site.xml and also the related properties in yarn-site.xml, I am able to run Hadoop successfully.
But I think that is then a single-node run (I have no idea, just guessing by comparing the running time to what I saw in pseudo-distributed mode, and the fact that there is no MR process in the slave node's jps when running the job on the master).
Here are some of my conf:
hdfs-site
<property>
<name>dfs.replication</name>
<value>2</value>
<description>Default block replication.
The actual number of replications can be specified when the file is created.
The default is used if replication is not specified in create time.
</description>
</property>
<!-- <property>
<name>dfs.namenode.name.dir</name>
<value>file:/usr/local/hadoop_store/hdfs/namenode</value>
</property>-->
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/usr/local/hadoop_store/hdfs/datanode</value>
</property>
<property>
<name>dfs.datanode.use.datanode.hostname</name>
<value>false</value>
</property>
<property>
<name>dfs.permissions</name>
<value>false</value>
</property>
mapred-site
<property>
<name>mapred.job.tracker</name>
<value>localhost:54311</value>
<description>The host and port that the MapReduce job tracker runs
at. If "local", then jobs are run in-process as a single map
and reduce task.
</description>
</property>
<!--<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>-->
yarn-site
<!-- Site specific YARN configuration properties -->
<!--<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
<name>yarn.resourcemanager.address</name>
<value>10.1.1.177:8032</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>10.1.1.177:8030</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>10.1.1.177:8031</value>
</property>-->
Thank you so much for any help.
UPDATE: I tried making some changes to yarn-site.xml by adding yarn.application.classpath like this:
https://dl-web.dropbox.com/get/Public/yarn.png?_subject_uid=51053996&w=AABeDJfRp_D31RiVHqBWn0r9naQR_lFVJXIlwvCwjdhCAQ
The error changed to an exit-code failure.
https://dl-web.dropbox.com/get/Public/exitcode.jpg?_subject_uid=51053996&w=AAAQ-bYoRSrQV3yFq36vEDPnAB9aIHnyOQfnvt2cUHn5IQ
UPDATE 2: In the syslog of the application logs it says
2015-04-24 20:34:59,164 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Created MRAppMaster for application appattempt_1429792550440_0035_000002
2015-04-24 20:34:59,589 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval; Ignoring.
2015-04-24 20:34:59,610 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts; Ignoring.
2015-04-24 20:34:59,616 FATAL [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Error starting MRAppMaster
java.lang.NoSuchMethodError: org.apache.hadoop.http.HttpConfig.setPolicy(Lorg/apache/hadoop/http/HttpConfig$Policy;)V
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1364)
2015-04-24 20:34:59,621 INFO [Thread-1] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: MRAppMaster received a signal. Signaling RMCommunicator and JobHistoryEventHandler.
Any suggestions, please?
I guess that you didn't set up your Hadoop cluster correctly. Please follow these steps:
Hadoop Configuration:
step 1: edit hadoop-env.sh as follows:
# The java implementation to use. Required.
export JAVA_HOME=/usr/lib/jvm/java-6-sun
step 2: now create a directory and set the required ownership and permissions
$ sudo mkdir -p /app/hadoop/tmp
$ sudo chown hduser:hadoop /app/hadoop/tmp
# ...and if you want to tighten up security, chmod from 755 to 750...
$ sudo chmod 750 /app/hadoop/tmp
step 3: edit core-site.xml
<property>
<name>hadoop.tmp.dir</name>
<value>/app/hadoop/tmp</value>
</property>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:54310</value>
</property>
step 4: edit mapred-site.xml
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
step 5: edit hdfs-site.xml
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.name.dir</name>
<value>file:///home/hduser/hadoopdata/hdfs/namenode</value>
</property>
<property>
<name>dfs.data.dir</name>
<value>file:///home/hduser/hadoop/hadoopdata/hdfs/datanode</value>
</property>
step 6: edit yarn-site.xml
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
Finally, format your HDFS (you need to do this only the first time you set up a Hadoop cluster):
$ /usr/local/hadoop/bin/hadoop namenode -format
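After formatting, the daemons are typically started before moving on to HBase; a hedged example, assuming Hadoop's sbin scripts are on the PATH:
start-dfs.sh
start-yarn.sh
jps
On this single-node layout, jps should then list NameNode, DataNode, ResourceManager and NodeManager.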
Hbase Configuration:
Edit your hbase-site.xml:
<property>
<name>hbase.rootdir</name>
<value>hdfs://localhost:54310/hbase</value>
</property>
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
</property>
<property>
<name>hbase.zookeeper.quorum</name>
<value>localhost</value>
</property>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>hbase.zookeeper.property.clientPort</name>
<value>2181</value>
</property>
<property>
<name>hbase.zookeeper.property.dataDir</name>
<value>/usr/local/hbase/zookeeper</value>
</property>
Hope this helps you
After sticking with the problem for more than 3 days (maybe it came from my misunderstanding of the concept), I was able to fix it by adding HADOOP_CLASSPATH into yarn-env.sh (like what I did when setting up pseudo-distributed mode in hadoop-env.sh).
I don't know the details, but hope this may help someone in the future.
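For concreteness, a hedged sketch of what such an addition to yarn-env.sh could look like; using the hbase classpath helper is an assumption and requires the hbase binary to be on the PATH:
export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:$(hbase classpath)
This appends the HBase jars so that containers can resolve org.apache.hadoop.hbase.HBaseConfiguration.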
Cheers.
I was using Spark on YARN and was getting the same error. Actually, the Spark jar had an internal dependency on hadoop-client and hadoop-mapreduce-client-* jars pointing to older 2.2.0 versions. So, I included these entries in my POM with the Hadoop version that I was running and did a clean build.
This resolved the issue for me. Hope this helps someone.
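A hedged way to spot such version mismatches in a Maven build (assuming mvn is available for the project) is to inspect the dependency tree for the transitive Hadoop artifacts:
mvn dependency:tree | grep -i hadoop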

hive-site.xml path in hive0.13.1

I'm a newbie. I would like to know the locations of the hive-site.xml and hive-default.xml files in hive-0.13.1.
I have downloaded the hive-0.13.1-bin release from the location below.
http://apache.mirrors.pair.com/hive/hive-0.13.1/
I extracted it and then configured the Hive environment variables. I'm able to run commands (create table, show, load data, query table, ...).
But in the conf directory (/hive/hive-0.13-1/conf), I do not see the hive-site.xml and hive-default.xml files. Where are these files located in hive-0.13.1?
Follow these steps:
1) Extract the folder.
2) Go to /apache-hive-0.13.1-bin/conf and make a copy of hive-default.xml.template; it will look like hive-default.xml (copy).template.
3) Rename hive-default.xml (copy).template to hive-site.xml.
4) Make a copy of hive-env.sh.template as hive-env.sh.
Add the following in hive-env.sh:
export HADOOP_HEAPSIZE=1024
# Set HADOOP_HOME to point to a specific hadoop install directory
export HADOOP_HOME=/home/user17/BigData/hadoop
#hive
export HIVE_HOME=/home/user17/BigData/hive
# Hive Configuration Directory can be controlled by:
export HIVE_CONF_DIR=$HIVE_HOME/conf
5) Export the Hadoop and Hive paths in your .bashrc file:
export HADOOP_HOME=/home/user17/BigData/hadoop
export PATH=$PATH:$HADOOP_HOME/bin
export HIVE_HOME=/home/user17/BigData/hive
export PATH=$PATH:$HIVE_HOME/bin
Start your Hadoop with:
start-all.sh
Enjoy your Hive. Give the Hadoop and Hive paths in the export commands according to your system. Let me know if it doesn't work.
You can find the hive-site.xml.template file in the conf directory.
You should copy it to hive-site.xml and add the following configuration:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
this work for additional information regarding copyright ownership.
The ASF licenses this file to You under the Apache License, Version 2.0
(the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
<configuration>
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://localhost/metastore?createDatabaseIfNotExist=true</value>
<description>the URL of the MySQL database</description>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
</property>
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>root</value>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>root</value>
</property>
<property>
<name>hive.hwi.listen.host</name>
<value>0.0.0.0</value>
</property>
<property>
<name>hive.hwi.listen.port</name>
<value>9999</value>
</property>
<property>
<name>hive.hwi.war.file</name>
<value>lib/hive-hwi-0.12.0.war</value>
</property>
<!--
<property>
<name>hive.metastore.local</name>
<value>true</value>
</property>
-->
</configuration>
And create the metastore database in MySQL using the usual commands if it does not already exist in the chosen DB (MySQL).
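A hedged example of doing that from the MySQL client; the credentials must match javax.jdo.option.ConnectionUserName/ConnectionPassword configured above:
mysql -u root -p -e "CREATE DATABASE IF NOT EXISTS metastore;"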