I'm trying to load a tab-separated file into a Hive text-file table using hiveconf parameters, as below:
load data local inpath '${hiveconf:TEXT_FILE}' into table ${hiveconf:HIVE_TABLE};
But when I run this .hql file as below
hive -hiveconf DB=$DB TEXT_FILE="$text_file_name" HIVE_TABLE=$HIVE_TABLE -f file_load.hql
I get the below error -
NoViableAltException(16#[202:1: tableName : (db= identifier DOT tab= identifier -> ^( TOK_TABNAME $db $tab) |tab= identifier -> ^( TOK_TABNAME $tab) );])
at org.antlr.runtime.DFA.noViableAlt(DFA.java:158)
......
......
FAILED: ParseException line X:YY cannot recognize input near '$' '{' 'hiveconf' in table name
I searched on Google and understood that it's due to a Hive keyword, but I have already created the table successfully, and when I load the file by hardcoding the file name and table name, the data gets loaded. Please help me here!
Thank you!
You are passing the variables incorrectly. -hiveconf must come before each variable:
hive -hiveconf DB=$DB -hiveconf TEXT_FILE="$text_file_name" -hiveconf HIVE_TABLE=$HIVE_TABLE -f file_load.hql
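If the list of variables grows, one way to avoid forgetting a flag is to let a small shell function add -hiveconf in front of every NAME=VALUE pair. This is only a sketch: run_hive_file and the HIVE_BIN override are names made up for illustration (they are not part of Hive), and the values are placeholders.

```shell
# Hypothetical helper: run "hive" with -hiveconf placed before every NAME=VALUE pair.
# HIVE_BIN is an illustrative override (not a Hive setting) that allows a dry run.
run_hive_file() {
  script=$1
  shift
  n=$#
  # Append "-hiveconf PAIR" for each original argument, then drop the originals.
  for pair in "$@"; do
    set -- "$@" -hiveconf "$pair"
  done
  shift "$n"
  "${HIVE_BIN:-hive}" "$@" -f "$script"
}

# Dry run: with HIVE_BIN=echo the arguments are printed instead of being passed to Hive.
HIVE_BIN=echo run_hive_file file_load.hql DB=mydb TEXT_FILE=/tmp/data.tsv HIVE_TABLE=my_table
```

Without the HIVE_BIN override the same call runs the real hive binary, and values containing spaces stay intact because every pair is passed as a single argument.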
Related
I have one file with many load data local statements, some of which may cause errors in the Hive CLI.
After an error, the CLI stops processing the remaining statements.
How can I ignore these errors and continue with the rest?
set hive.cli.errors.ignore=true;
Demo
hive -f <(echo 'select x;select 1+1 as x')
FAILED: SemanticException [Error 10004]: Line 1:7 Invalid table alias
or column reference 'x': (possible column names are: )
hive --hiveconf hive.cli.errors.ignore=true -f <(echo 'select x;select 1+1 as x')
FAILED: SemanticException [Error 10004]: Line 1:7 Invalid table alias
or column reference 'x': (possible column names are: )
OK
2
I've got a param which is like "This is a param", and I'm going to pass it to below hiveQL:
hive -hivevar sys_nm="This is a param" -e 'select * from rd_sys where rd_sys_nm=${hivevar:sys_nm}'
But Hive returned below error message:
Logging initialized using configuration in jar:file:/opt/mapr/hive/hive-0.13/lib/hive-common-0.13.0-mapr-1409.jar!/hive-log4j.properties
FAILED: ParseException line 1:49 missing EOF at 'is' near 'This'
g4t7491_[mgr#g4t7491 ~]$
Does anyone know how to pass it normally?
Hive variables (hivevar) don't work like hiveconf, where you need to write "hiveconf:something" in the code.
When you use a hivevar, just reference it by name, like this: ${var_name}
for example:
through command line:
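hive -hivevar MONTH_VAR='11' -e 'select * from table where month=${MONTH_VAR};'
(The query is in single quotes so that the shell does not expand ${MONTH_VAR} itself before Hive sees it.)
You can also declare it in the script: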
set hivevar:MONTH_VAR=11;
-- so query would look like this (no hiveconf):
set hivevar:MONTH_VAR=11;
SELECT * from table where month=${MONTH_VAR};
You need to put the string in single quotes for it to parse correctly as a string inside the sql after interpolation.
hive -hivevar sys_nm="'This is a param'" -e 'select * from rd_sys where rd_sys_nm=${hivevar:sys_nm}'
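If the value comes from a shell variable, the extra single quotes can be added by a small helper before the value is handed to -hivevar. A minimal sketch: hive_string_literal is a made-up name, and it simply rejects values that already contain a single quote rather than trying to escape them.

```shell
# Hypothetical helper: wrap a value in single quotes so that, after substitution,
# Hive parses it as a string literal.
# Assumption: values containing a single quote are rejected, not escaped.
hive_string_literal() {
  case $1 in
    *"'"*)
      echo "value must not contain a single quote" >&2
      return 1
      ;;
  esac
  printf "'%s'" "$1"
}

sys_nm=$(hive_string_literal "This is a param")
printf 'sys_nm=%s\n' "$sys_nm"

# The quoted value would then be passed as in the command above:
# hive -hivevar sys_nm="$sys_nm" -e 'select * from rd_sys where rd_sys_nm=${hivevar:sys_nm}'
```

Hive only does text substitution here, so whatever ends up inside ${hivevar:sys_nm} has to be valid SQL on its own, which is why the quotes travel with the value.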
I want to create a hive script that uses as database one of two given parameters, whichever is not null.
My hive-test.sql is this:
set db_name = coalesce(${hiveconf:dbOne}, ${hiveconf:dbTwo});
use ${hiveconf:db_name};
show tables;
and I run it with:
hive -hiveconf dbOne=my_database -f hive-test.sql
and I am getting:
FAILED: ParseException line 2:12 missing EOF at '(' near 'coalesce'
I should note that if I change the first line in script to:
set db_name = my_database;
it works.
I can't figure out what I did wrong. Your assistance is appreciated.
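This feature is not available in Hive: set stores the right-hand side as literal text and does not evaluate functions such as coalesce.
Do the variable assignment in the shell instead (see, for example, setting-a-shell-variable-in-a-null-coalescing-fashion) and pass the result to Hive.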
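As a rough sketch of the shell-side approach, assuming the two candidate names arrive as shell variables dbOne and dbTwo (first_non_empty is a made-up helper name):

```shell
# Hypothetical helper: print the first non-empty argument; fail if all are empty.
first_non_empty() {
  for candidate in "$@"; do
    if [ -n "$candidate" ]; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  return 1
}

# Placeholder values: dbTwo is left empty on purpose.
dbOne="my_database"
dbTwo=""
db_name=$(first_non_empty "$dbOne" "$dbTwo")
echo "db_name=$db_name"

# The chosen name would then be passed to Hive, with the first line of the script removed:
# hive -hiveconf db_name="$db_name" -f hive-test.sql
```

For just two variables, the parameter expansion db_name=${dbOne:-$dbTwo} does the same thing in one line. Either way hive-test.sql only needs use ${hiveconf:db_name}; followed by show tables;.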
Is it possible to run something like this in Hive CLI?
I am trying to pass file contents as a variable to another query.
set column_list=!cat /home/user/filename.lst ;
create table tabname as select $column_list from ...
If you have a query file, pass the variables with -hiveconf:
hive -hiveconf var1=abcd -f file.txt
Or you can construct your query in the shell and pass it to the Hive CLI using -e:
hive -e "create table ..."
Suppose filename.lst contains a single line:
line
Create a file test.sh:
temp=$(cat /home/user/filename.lst)
hive -f test.hql -hiveconf var="$temp"
Create another file, test.hql:
create table test(${hiveconf:var} string);
Then, in a terminal:
sh -x test.sh
This passes the line to test.hql and creates a table with that line as its column name.
Note: all files should be in the same directory. This script passes only one variable.
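For the column-list case from the question, the file usually has one column name per line, so the lines need to be joined with commas before they are passed on. A sketch under that assumption; column_list_from_file and create_table.hql are made-up names, and the temporary file only stands in for /home/user/filename.lst.

```shell
# Hypothetical helper: join the non-blank lines of a file with commas.
column_list_from_file() {
  grep -v '^[[:space:]]*$' "$1" | paste -sd, -
}

# Stand-in for /home/user/filename.lst, with one column name per line.
list_file=$(mktemp)
printf '%s\n' id name amount > "$list_file"

column_list=$(column_list_from_file "$list_file")
rm -f "$list_file"
echo "$column_list"   # prints: id,name,amount

# The list would then be passed to a query file (create_table.hql is hypothetical):
# hive -hiveconf column_list="$column_list" -f create_table.hql
```

Inside the query file the list would be referenced as select ${hiveconf:column_list} from ..., the same way test.hql references ${hiveconf:var}.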
While loading an XML data file into a Hive table, I got the following error message:
FAILED: SemanticException 7:9 Input format must implement InputFormat. Error encountered near token 'StoresXml'.
This is how I am loading the XML file:
Create a table StoresXml:
CREATE EXTERNAL TABLE StoresXml (storexml string)
STORED AS INPUTFORMAT 'org.apache.mahout.classifier.bayes.XmlInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION '/user/hive/warehouse/stores';
The location /user/hive/warehouse/stores is in HDFS.
load data inpath <local path where the xml file is stored> into table StoresXml;
Now the problem is that when I select any column from table StoresXml, the above error comes up.
Please help me with it. Where am I going wrong?
1) First, create a single-column table, for example:
CREATE TABLE xmlsample(xml string);
2) Then load the data from the local file system or HDFS into the Hive table:
LOAD DATA INPATH '---------' INTO TABLE XMLSAMPLE;
3) Next, extract the values with the XPath UDFs, such as xpath, xpath_string and xpath_int.
I have just loaded this transactions.xml file into a Hive table using xpath.
For the XML file:
Bring each record of the XML file onto one line:
terminal> cat /home/cloudera/Desktop/Test/Transactions_xml.xml | tr -d '&' | tr '\n' ' ' | tr '\r' ' ' | sed 's|</record>|</record>\n|g' | grep -v '^\s*$' > /home/cloudera/Desktop/trx_xml.xml
terminal> hadoop fs -put /home/cloudera/Desktop/trx_xml.xml /user/cloudera/DataTest/Transactions_xml
hive> create table Transactions_xml1(xmldata string);
hive> load data inpath '/user/cloudera/DataTest/Transactions_xml' overwrite into table Transactions_xml1;
hive> create table Transactions_xml(trx_id int,account int,amount int);
hive> insert overwrite table Transactions_xml select xpath_int(xmldata,'record/Tid'),
xpath_int(xmldata,'record/AccounID'),
xpath_int(xmldata,'record/Amount') from Transactions_xml1;
I hope this will help you. Let me know the result.
I have developed a tool to generate Hive scripts from a CSV file. Following are a few examples of how the files are generated.
Tool -- https://sourceforge.net/projects/csvtohive/?source=directory
Select a CSV file using Browse and set the Hadoop root directory, e.g. /user/bigdataproject/
The tool generates a Hadoop script covering all the CSV files. Following is a sample of a
generated Hadoop script that loads the CSV files into Hadoop:
#!/bin/bash -v
hadoop fs -put ./AllstarFull.csv /user/bigdataproject/AllstarFull.csv
hive -f ./AllstarFull.hive
hadoop fs -put ./Appearances.csv /user/bigdataproject/Appearances.csv
hive -f ./Appearances.hive
hadoop fs -put ./AwardsManagers.csv /user/bigdataproject/AwardsManagers.csv
hive -f ./AwardsManagers.hive
Sample of generated Hive scripts
CREATE DATABASE IF NOT EXISTS lahman;
USE lahman;
CREATE TABLE AllstarFull (playerID string,yearID string,gameNum string,gameID string,teamID string,lgID string,GP string,startingPos string) row format delimited fields terminated by ',' stored as textfile;
LOAD DATA INPATH '/user/bigdataproject/AllstarFull.csv' OVERWRITE INTO TABLE AllstarFull;
SELECT * FROM AllstarFull;
Thanks
Vijay