Hive CLI giving problems when starting it - hive

When I run the hive command, it is only able to start from the bin folder, because the metastore was created there. If I run it from my home directory, it is not able to start and shows an error.
I have added these lines to my .bashrc file for Hive:
# HIVE env variables
export HIVE_HOME=/opt/hadoop/hive/apache-hive-2.3.4-bin
export PATH=$HIVE_HOME/bin:$PATH

Can you try to set up the PATH as mentioned below and retry?
user@ubuntu:~$ sudo gedit ~/.bashrc
Copy and paste the following lines at the end of the file:
# Set HIVE_HOME
export HIVE_HOME="/opt/hadoop/hive/apache-hive-2.3.4-bin"
PATH=$PATH:$HIVE_HOME/bin
export PATH
But my suggestion here is: instead of using the Hive command prompt, try the recommended way, which is the Beeline client. If you have HiveServer2 configured, you can connect using the Beeline client and query Hive.
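For example, a minimal sketch of a Beeline connection, assuming HiveServer2 is running locally on its default port 10000 (the host, port, and user name below are assumptions, adjust them for your cluster):

```shell
# Hypothetical HiveServer2 endpoint -- adjust host/port/user for your setup.
HS2_HOST=localhost
HS2_PORT=10000
JDBC_URL="jdbc:hive2://${HS2_HOST}:${HS2_PORT}/default"
echo "$JDBC_URL"
# With HiveServer2 up, you would then connect and run a query:
# beeline -u "$JDBC_URL" -n hive -e 'SHOW DATABASES;'
```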

Related

SSH into Hadoop cluster using Paramiko, and then executing dependent commands

I am implementing a Python script which uses Paramiko to connect to a Hadoop cluster. My problem is that I can SSH in only as the root user, and from there I have to switch user to hdfs to execute my command.
Now I need something to automate this switch to the hdfs user, then cd into /tmp/, and then execute the command from there. I have tried invoke_shell(), but it hangs; I have also tried && inside exec_command, and that doesn't work either.
I am getting a permission denied exception:
java.io.FileNotFoundException: file.txt (Permission denied)
There are two workflows that I have thought of:
1st one:
1. sudo -u hdfs -s
2. cd /tmp/
3. <execute the command> <outputDir>
2nd one:
sudo -u hdfs <execution command> /tmp/<outputDir>
The first one doesn't give the above error, but the second one throws it. I was trying the second one just to avoid the dependent-command issue.
Any help or suggestions will be appreciated.
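One common workaround for the dependent-command issue is to collapse the user switch, the cd, and the command into a single string, so a single exec_command() call is enough. A minimal sketch of the quoting (the hdfs user and /tmp path are from the question; the inner ls is a placeholder, and the sudo line is built but not executed here, since sudo may additionally need a TTY or a NOPASSWD rule):

```shell
# The dependent steps, joined with && so later steps run only on success.
CMD="cd /tmp && ls"
# What you would pass to a single exec_command() / ssh invocation;
# sh -c keeps the cd and the command in the same subshell.
WRAPPED="sudo -u hdfs sh -c '$CMD'"
echo "$WRAPPED"
# The chaining itself can be verified locally without sudo:
sh -c "$CMD" > /dev/null && echo "chain ok"
```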

Apache Airflow command not found with SSHOperator

I am trying to use the SSHOperator to SSH into a remote machine and run an external application through the command line. I have set up the SSH connection via the admin page.
This section of code is used to define the commands and the SSH connection to the external machine.
sshHook = SSHHook(ssh_conn_id='remote_comp')
command_1 = """
cd /files/232-065/Rans
bash run.sh
"""
Where 'run.sh' runs the shell script:
#!/bin/sh
starccm+ -batch run_export.java Rans_Model.sim
Which simply runs the commercial software starccm+ with some options I have specified.
This section defines the task:
inlet_profile = SSHOperator(
    task_id='inlet_profile',
    ssh_hook=sshHook,
    command=command_1
)
I have confirmed the SSH connection works by giving a simple 'ls' command and checking the output.
The error that I get is:
bash run.sh, error: run.sh: line 2: starccm+: command not found
The command in 'run.sh' works when I am logged into the machine (it does not require a GUI). This makes me think that there is a problem with the SSH session, and that it is not the same as the one Apache Airflow logs into, but I am not sure how to solve this problem.
Does anyone have any experience with this?
There is no issue with the SSH connection (at least judging from the error message). The issue is with the starccm+ installation path.
Please check the installation path of starccm+.
Check whether the installation path is part of the $PATH environment variable:
$ echo $PATH
If not, then install it in a standard location like /bin or /usr/bin (provided those are included in $PATH), or export the installation directory into the PATH variable like this:
$ export PATH=$PATH:/<absolute_path>
It is not ideal, but if you struggle with setting the PATH variable, you can run starccm+ by stating the full path, like:
/directory/where/star/is/installed/starccm+ -batch run_export.java Rans_Model.sim
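For example, run.sh could bake in the absolute path; the install directory below is purely a hypothetical placeholder, replace it with whatever `which starccm+` reports in an interactive login on the remote machine:

```shell
#!/bin/sh
# Hypothetical install location -- substitute the real directory on your host.
STARCCM=/opt/Siemens/starccm/bin/starccm+
CMD="$STARCCM -batch run_export.java Rans_Model.sim"
echo "$CMD"
# exec $CMD   # uncomment once STARCCM points at the real binary
```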

schematool: command not found

I am trying to install Hive on my Ubuntu 19.10 machine.
I am using this doc: https://phoenixnap.com/kb/install-hive-on-ubuntu.
As mentioned in step 6, where I am trying to initiate the Derby database, I run the command in the right path (~/apache-hive-3.1.2-bin/bin):
schematool -initSchema -dbType derby
But I get this error :
schematool: command not found.
How can I resolve this please ?
I had the same question before.
Maybe it is because of wrong configuration files, like hive-site.xml or hive-env.sh. In my case, a blank space in a configuration file caused this error.
The default path for schematool is $HIVE_HOME/bin/schematool (~/apache-hive-3.1.2-bin/bin/schematool in your case). Try adding this HIVE_HOME to your .bashrc file; that worked for me.
# Hive
export HIVE_HOME=/<your hive path>
export PATH=$PATH:$HIVE_HOME/bin
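After sourcing ~/.bashrc you can sanity-check that the bin directory actually ended up on PATH; this sketch uses the Hive path from the question:

```shell
# HIVE_HOME value from the question -- adjust if yours differs.
export HIVE_HOME="$HOME/apache-hive-3.1.2-bin"
export PATH="$PATH:$HIVE_HOME/bin"
# Check PATH membership without needing Hive itself installed:
case ":$PATH:" in
  *":$HIVE_HOME/bin:"*) PATH_OK=yes ;;
  *)                    PATH_OK=no  ;;
esac
echo "hive bin on PATH: $PATH_OK"
```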
Try this. Using this command, I resolved the issue:
hive --service schematool -dbType mysql -password hive -username hive -validate
Run ./schematool -initSchema -dbType derby
Don't forget the ./
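The leading ./ matters because the shell only searches the directories listed in $PATH, and the current directory is normally not one of them. A small self-contained demonstration with a throwaway script (toytool is a made-up name):

```shell
# Create a toy executable in a temp directory that is NOT on PATH.
DEMO_DIR=$(mktemp -d)
printf '#!/bin/sh\necho hi\n' > "$DEMO_DIR/toytool"
chmod +x "$DEMO_DIR/toytool"
cd "$DEMO_DIR"
OUT=$(./toytool)     # explicit path: works
echo "$OUT"
# A bare `toytool` would fail with "command not found" here,
# exactly like `schematool` without ./ in the Hive bin directory.
```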

Why is an error thrown when importing table data from an Oracle DB to Hive using Sqoop?

I am getting the below error while importing data from an Oracle DB to Hive using Sqoop:
ERROR tool.ImportTool: Encountered IOException running import job: java.io.IOException: Cannot run program "hive": error=2, No such file or directory
Below is the command which I am executing:
sqoop import --connect jdbc:oracle:thin:@host:port/xe --username sa --password sa --table SA.SHIVAMSAMPLE --hive-import -m 1
The data is getting created inside HDFS, but the Hive tables are not getting created; i.e., a folder gets created inside the default directory (bin/hdfs dfs -ls).
Only when I explicitly give the warehouse path does it store data in the warehouse directory, like user/hive/warehouse, and even then the table is not created and no data is loaded.
I installed Hadoop in Amit/hadoop-2.6.5, Hive in Amit/apache-hive-1.2.1-bin, and Sqoop in Amit/sqoop-1.4.5-cdh5.3.2. In .bashrc I set the Hadoop path only.
Is it required for Hive and Sqoop as well?
When I set the Hive home in the sqoop-env.sh file, the above command runs fine, but the table is not created, and a file is created inside HDFS at /user/hive/warehouse/shivamsample.
Can you please tell me what extra configuration is required to resolve this issue?
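Since the error is Cannot run program "hive", Sqoop is shelling out to the hive binary and cannot find it on PATH, so yes, adding the Hive (and Sqoop) bin directories to .bashrc should help. A sketch of the additions, using the directory names from the question:

```shell
# Paths taken from the question -- adjust if your layout differs.
export HIVE_HOME="$HOME/Amit/apache-hive-1.2.1-bin"
export SQOOP_HOME="$HOME/Amit/sqoop-1.4.5-cdh5.3.2"
export PATH="$PATH:$HIVE_HOME/bin:$SQOOP_HOME/bin"
# Sanity check: the `hive` lookup that Sqoop performs will succeed once
# the real binary exists at $HIVE_HOME/bin/hive and bin is on PATH.
case ":$PATH:" in
  *":$HIVE_HOME/bin:"*) HIVE_ON_PATH=yes ;;
  *)                    HIVE_ON_PATH=no  ;;
esac
echo "hive bin on PATH: $HIVE_ON_PATH"
```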

Hive script not running in crontab, with "hadoop must be in the path" error

After setting the Hadoop home path and prefix path in .bashrc and /etc/profile, I am still getting the same error: Cannot find hadoop installation: $HADOOP_HOME or $HADOOP_PREFIX must be set or hadoop must be in the path
I face this error only if I run the script from crontab; from the hive> prompt it works fine.
Please help with how to solve this.
Set $HADOOP_HOME in $HIVE_HOME/conf/hive-env.sh
Try loading the user's bash profile in the script, as below:
. ~/.bash_profile
The user's bash_profile contains user-specific configuration (such as PATH entries), which cron does not load by default.
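The underlying reason is that cron starts jobs with a near-empty environment, so variables exported only in .bashrc or .bash_profile (such as HADOOP_HOME) are invisible to it. A quick demonstration, using env -i to simulate cron's bare environment:

```shell
export HADOOP_HOME=/opt/hadoop            # visible in this interactive shell
# env -i launches a child with an empty environment, roughly what cron sees:
FROM_CRONLIKE=$(env -i sh -c 'echo "${HADOOP_HOME:-unset}"')
echo "in shell:  $HADOOP_HOME"
echo "cron-like: $FROM_CRONLIKE"
# Sourcing ~/.bash_profile at the top of the script restores those exports.
```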
see the similar question Hbase commands not working in script executed via crontab