Naming hive session cannot work sometimes - hive

I ran sqls on hive tez by hive -f xxx.sql --hiveconf hive.session.id=sessionName
but on the yarn resourcemanager displays like this
HIVE-f4ea6c3f-f4cf-4db3-8801-da6f94e20237
HIVE-d920c434-d2e6-4c1c-a506-d69b580960f7
sometimes it displays correctly..
How can solve this problem

The thing is Tez can reuse containers. AM container reuse = session reuse. controlled by this parameter: tez.am.container.reuse.enabled=true
One yarn AM container can be reused for different Tez sessions. This is the reason why yarn name is different.
BTW there is one more parameter added in this JIRA HIVE-12357, you can set name for each DAG:
hive.query.name

Related

No mr job execution info in beeline

In beeline, I could not see the job execution info (like job progress), I have already set the following properties in hive-site.xml. Could anyone help to figure out how to diagnose such issues ? How can I check whether hive server2 take the correct configuration ?
hive.server2.logging.operation.level VERBOSE
I only see the following log in beeline
WARNING: Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
did you try
set hive.root.logger=INFO,console;
I forget to set hive.async.log.enabled to be false, after this setting it works

Issue with executing spark sql job using oozie action

Facing a weird issue, trying to execute a spark-sql(Spark2) job using oozie action but the behavior of execution is quite weird, at times it executes fine but sometimes it continues to be in "Running" state forever, on checking the logs got the below issue.
WARN org.apache.spark.scheduler.cluster.YarnClusterScheduler` - Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
The strange thing is that we have already provided sufficient resources, the same can be seen from spark environment variables as well and as well under the cluster resources(cluster has sufficient cores and RAM).
<spark-opts>--executor-memory 10G --num-executors 7 --executor-cores 3 --driver-memory 8G --driver-cores 2</spark-opts>
With the same configuration sometimes it is executing fine as well. Are we missing something?
The issue was related to jar conflict,following are the suggestions to identify the same.
a)Check the maven dependency tree to make sure there is no transitive dependency conflict.
b)While spark job is running check the environment variables being used using Spark UI.
c)Resolve the conflict and run a maven clean package.

Flink job started from another program on YARN fails with "JobClientActor seems to have died"

I'm new flink user and I have the following problem.
I use flink on YARN cluster to transfer related data extracted from RDBMS to HBase.
I write flink batch application on java with multiple ExecutionEnvironments (one per RDB table to transfer table rows in parrallel) to transfer table by table sequentially (because call of env.execute() is blocking).
I start YARN session like this
export YARN_CONF_DIR=/etc/hadoop/conf
export FLINK_HOME=/opt/flink-1.3.1
export FLINK_CONF_DIR=$FLINK_HOME/conf
$FLINK_HOME/bin/yarn-session.sh -n 1 -s 4 -d -jm 2048 -tm 8096
Then I run my application on YARN session started via shell script transfer.sh. Its content is here
#!/bin/bash
export YARN_CONF_DIR=/etc/hadoop/conf
export FLINK_HOME=/opt/flink-1.3.1
export FLINK_CONF_DIR=$FLINK_HOME/conf
$FLINK_HOME/bin/flink run -p 4 transfer.jar
When I start this script from command line manually it works fine - jobs are submitted to YARN session one by one without errors.
Now I should be able to run this script from another java program.
For this aim I use
Runtime.exec("transfer.sh");
(maybe are there better ways to do this? I have seen at REST API but there are some difficulties because job manager is proxied by YARN).
At the beginning is works as usually - first several jobs are submitted to session and finished successfully. But the following jobs are not submitted to YARN session.
In /opt/flink-1.3.1/log/flink-tsvetkoff-client-hadoop-dev1.log I see error (and no another errors found in DEBUG level)
The program execution failed: JobClientActor seems to have died before the JobExecutionResult could be retrieved.
I have tried to analyse this problem by myself and found out that this error has occurred in JobClient class while sending ping request with timeout to JobClientActor (i.e. YARN cluster).
I tried to increase multiple heartbeat and timeout options like akka.*.timeout, akka.watch.heartbeat.* and yarn.heartbeat-delay options but it doesn't solve the problem - new jobs are not submit to YARN session from CliFrontend.
The environment for both case (manual call and call from another program) is the same. When I call
$ ps axu | grep transfer
it will give me output
/usr/lib/jvm/java-8-oracle/bin/java -Dlog.file=/opt/flink-1.3.1/log/flink-tsvetkoff-client-hadoop-dev1.log -Dlog4j.configuration=file:/opt/flink-1.3.1/conf/log4j-cli.properties -Dlogback.configurationFile=file:/opt/flink-1.3.1/conf/logback.xml -classpath /opt/flink-1.3.1/lib/flink-metrics-graphite-1.3.1.jar:/opt/flink-1.3.1/lib/flink-python_2.11-1.3.1.jar:/opt/flink-1.3.1/lib/flink-shaded-hadoop2-uber-1.3.1.jar:/opt/flink-1.3.1/lib/log4j-1.2.17.jar:/opt/flink-1.3.1/lib/slf4j-log4j12-1.7.7.jar:/opt/flink-1.3.1/lib/flink-dist_2.11-1.3.1.jar:::/etc/hadoop/conf org.apache.flink.client.CliFrontend run -p 4 transfer.jar
I also tried to update flink to 1.4.0 release or change parallelism of job (even to -p 1) but error has still occurred.
I have no idea what could be different? Is any workaround by the way?
Thank you for any help.
Finally I find out how to resolve that error
Just replace Runtime.exec(...) with new ProcessBuilder(...).inheritIO().start().
I really don't know why the call of inheritIO helps in that case because as I understand it just redirects IO streams from child process to parent process.
But I have checked that if I comment out this line of code the program begins to fall again.

Hive - How can I stop logs displaying in console?

I have been trying to omit logs from console while querying in hive, but still it is showing up.
If you are opening the hive console by typing
> hive
in your terminal and then write queries, you can solve this by simply using
> hive -S
This basically means that you are starting hive in silent mode.
Hope that helps.
You could increase the polling interval to minutes or hours:
SET hive.exec.counters.pull.interval=[millis];
The default is 1000 milliseconds, but you can increase it to anything you like. That should decrease the number of logs written to stdout.
If you don't want any logs on the console while starting the shell you can set the hive.root.logger property
$HIVE_HOME/bin/hive --config hive.root.logger=INFO,DRFA
hive.root.logger specifies the logging level as well as the log
destination. Specifying console as the target sends the logs to the
standard error (instead of the log file).
If you want to see ERROR messages on console you can set this command
$HIVE_HOME/bin/hive --config hive.root.logger=ERROR,console
Start hive in silent mode using
$ hive -S
then Set logger level to Error, which will avoid Warnings/Info from printing.
hive> set logger.PerfLogger.level = ERROR;
If there is "SLF4J: Class path contains multiple SLF4J bindings." in your log, it means that there are multiple log4j jars (different versions, different behaviors) in the class path
I don't know the principle of log4j, but according to the Hadoop configuration file, perform the following steps:
cd $HIVE_HOME/conf
cat > log4j.properties <<EOL
log4j.rootLogger=WARN, CA
log4j.appender.CA=org.apache.log4j.ConsoleAppender
log4j.appender.CA.layout=org.apache.log4j.PatternLayout
log4j.appender.CA.layout.ConversionPattern=%-4r [%t] %-5p %c %x - %m%n
EOL
After starting hive (Hive 3.1.2 Apache), the log is set to WARN level, which may not necessarily work, but you can try it.

In what mode is Hive installed?

Does hive installation have any specific mode?
Like for example, Hadoop installation has 3 modes: standalone, pseudo-distributed and fully distributed.
Similarly does Hive has any specific type of distribution?
Can Hive be installed in distributed mode?
Hive actually provides you the option to run queries in 2 modes :
1- Map-Reduce mode
2- Local mode
Normally Hive compiler generates map-reduce jobs for most queries under the hood. These jobs are then submitted to the Map-Reduce cluster indicated by the variable:
mapred.job.tracker
While this usually points to a map-reduce cluster with multiple nodes, Hadoop also provided you the ability to run map-reduce jobs locally on the your standalone workstation. In order to run Hive queries in local mode you need to do this :
hive> SET mapred.job.tracker=local;
Details can be found here.