Logs for Hive query executed via Beeline - hive

I am running the Hive command below from Beeline. Can someone please tell me where I can see the MapReduce logs for it?
0: jdbc:hive2://<servername>:10003/> select a.offr_id offerID , a.offr_nm offerNm , b.disp_strt_ts dispStartDt , b.disp_end_ts dispEndDt , vld_strt_ts validStartDt, vld_end_ts validEndDt from gcor_offr a, gcor_offr_dur b where a.offr_id = b.offr_id and b.disp_end_ts > '2016-09-13 00:00:00';

When using Beeline, the MapReduce logs are part of the HiveServer2 log4j logs.
If your Hive install was configured by Cloudera Manager (CM), they will typically be in /var/log/hive/hadoop-cmf-HIVE-1-HIVESERVER2-*.out on the node where HiveServer2 is running (which may or may not be the node you run Beeline from).
A few other scenarios:
Your Hive install was not configured by CM? You will need to create the log4j config file manually:
Create a hive-log4j.properties config file in the directory specified by the HIVE_CONF_DIR environment variable. (This makes it accessible on the HiveServer2 JVM classpath.)
In this file, the log location is specified by hive.log.dir and hive.log.file. See conf/hive-log4j.properties.template in your distribution for an example template for this file.
You run Beeline in "embedded HS2 mode" (i.e. beeline -u jdbc:hive2:// user password)?
In that case you customize the Beeline log4j config (as opposed to the HiveServer2 log4j config).
The Beeline log4j properties file must be named beeline-log4j2.properties (in versions prior to Hive 2.0 it is called beeline-log4j.properties). It needs to be created and made accessible to the Beeline JVM classpath via HIVE_CONF_DIR. See HIVE-10502 and HIVE-12020 for further discussion.
You want to customize which HiveServer2 logs get printed on Beeline stdout?
This can be configured at the HiveServer2 level using the hive.server2.logging.operation.enabled and hive.server2.logging.operation.level settings, as in the sketch below.
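For illustration, a minimal hive-site.xml sketch on the HiveServer2 host (VERBOSE is one of the documented levels; adjust to taste):
<property>
  <name>hive.server2.logging.operation.enabled</name>
  <value>true</value>
</property>
<property>
  <name>hive.server2.logging.operation.level</name>
  <value>VERBOSE</value>
</property>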

Hive uses log4j for logging. These logs are not emitted to the standard output by default but are instead captured to a log file specified by Hive's log4j properties file. By default, Hive will use hive-log4j.default in the conf/ directory of the Hive installation which writes out logs to /tmp/<userid>/hive.log and uses the WARN level.
It is often desirable to emit the logs to the standard output and/or change the logging level for debugging purposes. These can be done from the command line as follows:
$HIVE_HOME/bin/hive --hiveconf hive.root.logger=INFO,console
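The same switch accepts any other log4j level if you need more detail; for example, to get DEBUG output on the console:
$HIVE_HOME/bin/hive --hiveconf hive.root.logger=DEBUG,console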

Hive 2.x and later enable asynchronous logging by default; disabling it makes logging synchronous, which can help when correlating log output with queries:
set hive.async.log.enabled=false

Related

Not able to start hiveserver2 for Apache Hive

Could anyone help resolve the problem below? I am trying to start HiveServer2. I configured hive-site.xml and the configuration file for the Hadoop directory path as well, and the jar file hive-service-rpc-2.1.1.jar is also available in the lib directory. I am able to start Hive, but not HiveServer2:
$ hive --service hiveserver2
Exception in thread "main" java.lang.ClassNotFoundException: /home/directory/Hadoop/Hive/apache-hive-2/1/1-bin/lib/hive-service-rpc-2/1/1/jar
I am glad I solved the problem. In my case I had two Hive versions installed: my command was using 1.2.1, but it was finding its jars from 2.1.1. Changing
export HIVE_HOME=/usr/local/hive-1.2.1/
to
export HIVE_HOME=/usr/local/hive-2.1.1
resolved the version mismatch. You can use the command "which hiveserver2" to find where your command is coming from.
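A quick way to check for this kind of mismatch from the shell (output paths will of course differ per install):
which hive
which hiveserver2
echo $HIVE_HOME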

Hive - How can I stop logs displaying in console?

I have been trying to suppress logs in the console while querying in Hive, but they still show up.
If you are opening the hive console by typing
> hive
in your terminal and then write queries, you can solve this by simply using
> hive -S
This basically means that you are starting hive in silent mode.
Hope that helps.
You could increase the polling interval to minutes or hours:
SET hive.exec.counters.pull.interval=[millis];
The default is 1000 milliseconds, but you can increase it to anything you like. That should decrease the number of logs written to stdout.
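For example, a hypothetical ten-minute interval (the value is in milliseconds) would be:
SET hive.exec.counters.pull.interval=600000;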
If you don't want any logs on the console while starting the shell, you can set the hive.root.logger property:
$HIVE_HOME/bin/hive --hiveconf hive.root.logger=INFO,DRFA
hive.root.logger specifies the logging level as well as the log
destination. Specifying console as the target sends the logs to the
standard error (instead of the log file).
If you want to see only ERROR messages on the console, you can use:
$HIVE_HOME/bin/hive --hiveconf hive.root.logger=ERROR,console
Start hive in silent mode using
$ hive -S
then set the logger level to ERROR, which prevents warnings and info messages from printing:
hive> set logger.PerfLogger.level = ERROR;
If there is "SLF4J: Class path contains multiple SLF4J bindings." in your log, it means that there are multiple log4j jars (different versions, different behaviors) in the class path
I don't know the principle of log4j, but according to the Hadoop configuration file, perform the following steps:
cd $HIVE_HOME/conf
cat > log4j.properties <<EOL
log4j.rootLogger=WARN, CA
log4j.appender.CA=org.apache.log4j.ConsoleAppender
log4j.appender.CA.layout=org.apache.log4j.PatternLayout
log4j.appender.CA.layout.ConversionPattern=%-4r [%t] %-5p %c %x - %m%n
EOL
After restarting Hive (Apache Hive 3.1.2), logging is set to the WARN level. This may not work in every environment, but it is worth trying.
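Note that Hive 2.x and 3.x use log4j2, so a plain log4j 1.x properties file may be ignored. A minimal log4j2-style sketch, modeled on the bundled hive-log4j2.properties.template (file name and keys assumed from that template; adjust to your install), would be:
cd $HIVE_HOME/conf
cat > hive-log4j2.properties <<EOL
status = warn
appender.console.type = Console
appender.console.name = console
appender.console.layout.type = PatternLayout
appender.console.layout.pattern = %-4r [%t] %-5p %c %x - %m%n
rootLogger.level = WARN
rootLogger.appenderRef.console.ref = console
EOL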

Cannot connect Impala-Kudu to Apache Kudu (without Cloudera Manager): Get TTransportException Error

I have successfully installed Kudu on Ubuntu (Trusty) as per the official Kudu documentation (see http://kudu.apache.org/docs/installation.html ). The setup has one node running the master and a tablet server, and another node running a tablet server only. I am having issues installing impala-kudu without Cloudera Manager on the node running the Kudu master. I have followed the CDH installation instructions on this page ( http://www.cloudera.com/documentation/enterprise/latest/topics/cdh_ig_cdh5_install.html ) until Step 3. I have avoided installing CDH with YARN and MRv1 as I don't need to run any MapReduce jobs and will not be using Hadoop. impala-kudu and impala-kudu-shell installed without errors. When I launch the impala-shell it returns:
Starting Impala Shell without Kerberos authentication
Error connecting: TTransportException, Could not connect to kudu_test:21000
***********************************************************************************
Welcome to the Impala shell. Copyright (c) 2015 Cloudera, Inc. All rights reserved.
(Impala Shell v2.7.0-cdh5-IMPALA_KUDU-cdh5 (48f1ad3) built on Thu Aug 18 12:15:44 PDT 2016)
Want to know what version of Impala you're connected to? Run the VERSION command to
find out!
***********************************************************************************
[Not connected] >
I have tried to use the CONNECT option to connect to the kudu-master node without success. Both impala-kudu and Kudu are running on the same machine. Are there additional configuration settings which need to be changed, or are Hadoop and YARN a strict requirement to make impala-kudu work?
After running ps -ef | grep -i impalad I can confirm the impala daemon is not running. After navigating to the Impala logs at ~/var/log/impala I found a few error and warning files. Here is the output of impalad.ERROR:
Log file created at: 2016/09/13 13:26:24
Running on machine: kudu_test
Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
E0913 13:26:24.084389 3021 logging.cc:118] stderr will be logged to this file.
E0913 13:26:25.406966 3021 impala-server.cc:249] Currently configured default filesystem: LocalFileSystem. fs.defaultFS (file:///) is not supported.
ERROR: block location tracking is not properly enabled because
- dfs.datanode.hdfs-blocks-metadata.enabled is not enabled.
- dfs.client.file-block-storage-locations.timeout.millis is too low. It should be at least 10 seconds.
E0913 13:26:25.406990 3021 impala-server.cc:252] Aborting Impala Server startup due to improper configuration. Impalad exiting.
Maybe I need to revisit HDFS and the Hive Metastore to ensure I have these services configured properly?
According to the log, impalad quits because the default filesystem is configured to be LocalFileSystem, which is not supported. You have to set a distributed filesystem, such as HDFS, as the default.
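For reference, the default filesystem is set via fs.defaultFS in core-site.xml; a minimal sketch (the NameNode host and port below are placeholders for your cluster):
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://namenode-host:8020</value>
</property>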
Although Kudu is a separate storage system and does not rely on HDFS, Impala still seems to require a non-local default FS even when used with Kudu. The Impala_Kudu documentation explicitly lists the following requirement:
Before installing Impala_Kudu, you must have already installed and configured services for HDFS (though it is not used by Kudu), the Hive Metastore (where Impala stores its metadata), and Kudu.
I can even imagine that HDFS may not really be needed for any other reason than to make Impala happy, but this is just speculation from my side. Update: Found IMPALA-1850 which confirms my suspicion that HDFS should not be needed for Impala any more, but it's not just a single check that has to be removed.

How to connect Spark-Notebook to Hive metastore?

This is a cluster with Hadoop 2.5.0, Spark 1.2.0, Scala 2.10, provided by CDH 5.3.2. I used a compiled spark-notebook distro.
It seems Spark-Notebook cannot find the Hive metastore by default.
How to specify the location of hive-site.xml for spark-notebook so that it can load the Hive metastore?
Here is what I tried:
link all files from /etc/hive/conf, with hive-site.xml included, to the current directory
specify SPARK_CONF_DIR variable in bash
When you start the notebook, set the environment variable EXTRA_CLASSPATH to the path where you have located hive-site.xml. This works for me:
EXTRA_CLASSPATH=/path_of_my_mysql_connector/mysql-connector-java.jar:/my_hive_site.xml_directory/conf ./bin/spark-notebook
I have also passed the jar of my MySQL connector because I use Hive with MySQL.
I have found some info from this link: https://github.com/andypetrella/spark-notebook/issues/351
Using the CDH 5.5.0 Quickstart VM, the solution is the following: you need to make hive-site.xml, which provides the access information for the Hive metastore, visible to the notebook. By default, spark-notebook uses an internal metastore.
You can define the following environment variable in ~/.bash_profile:
HADOOP_CONF_DIR=$HADOOP_CONF_DIR:/etc/hive/conf.cloudera.hive/
export HADOOP_CONF_DIR
(Make sure you execute source ~/.bash_profile if you do not open a new terminal.)
(The solution is given here: https://github.com/andypetrella/spark-notebook/issues/351)

Setting hive configuration properties through hive --service jar

Can somebody tell me if I can pass hiveconf properties through the CLI? Actually I want to run a jar using hive --service jar, and in this command I want to set some properties. I have tried the following commands but they did not work:
hive --service jar myjar.jar my.example.jar.MyMainClass -hiveconf x=y
hive --service jar myjar.jar my.example.jar.MyMainClass HIVE_OPTS x=y
Thanks in advance
No. As far as I have seen, it does not work with hive --service jar. --hiveconf can be used to set configuration when launching the CLI or the Thrift server.
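For illustration, these are the forms where --hiveconf is accepted (x=y is just a placeholder property):
# launch the Hive CLI with a property set
hive --hiveconf x=y
# launch HiveServer2 (the Thrift server) with a property set
hive --service hiveserver2 --hiveconf x=y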