How to check whether a partition exists with hive

I have a HiveQL script that performs some operations on a Hive table. Before running these operations, I want to check whether the required partition exists and, if it does not, terminate the script. How can I achieve this?

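Using shell:
table_name="schema.table"
partition_spec="key=value"
# grep -F treats the partition spec as a literal string rather than a regular expression
partition_exists=$(hive -e "show partitions $table_name" | grep -F "$partition_spec")
# check partition_exists; add "exit 1" to the first branch to terminate the script
if [ -z "$partition_exists" ]; then echo "not exists"; else echo "exists"; fi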
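A plain grep is a substring match, so a spec such as key=1 would also match a partition named key=10. A minimal sketch of a stricter check follows, assuming the listing has one partition per line with levels separated by "/" (the usual show partitions output). The function name partition_in_list is hypothetical, not part of Hive.

```shell
# Hypothetical helper: succeed if the partition listing on stdin contains
# the given key=value as a complete path component.
partition_in_list() {
    spec=$1
    # Split each line on "/" so multi-level partitions (k1=v1/k2=v2) are
    # checked one component at a time; -F is literal, -x is whole-line.
    tr '/' '\n' | grep -Fxq -e "$spec"
}

# Assumed usage with the variables from the answer above:
# hive -e "show partitions $table_name" | partition_in_list "$partition_spec" || exit 1
```

Because the function only reads standard input, it can be tried against a saved listing before wiring it to hive.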

Related

How to replace SQL with bash variable of SQL command output

I am working on a bash program that checks for valid JSON files before loading them into a table. The process first runs f_check_valid_json to verify the JSON. That function calls f_exe_sql_stmnt(), which returns a column of bad file IDs that is stored in the variable bad_fl_list. I would like to use bad_fl_list in the WHERE clause of the update and delete sections of the function.
Right now, the SQL fails when there is more than one JSON file ID in bad_fl_list:
f_exe_sql_stmnt(){
    db=$1
    sql_str=$2
    psql -d "$db" -Atc "$sql_str"
    if [ $? -gt 0 ]
    then
        echo "======================================================================="
        echo "***Error: Database error while executing sql statement($sql_str)..."
        exit 123
    fi
}
f_check_valid_json() {
    echo "*** checking for valid JSON format***"
    sql_stmnt="Select json_fl_id from json_stgng where is_valid_json(json_datarec_fl) = false;"
    bad_fl_list=$(f_exe_sql_stmnt "$t_db" "${sql_stmnt}")
    echo "BAD FILE LIST: ${bad_fl_list}"
    if [ ! -z "$bad_fl_list" ]
    then
        echo "BAD JSON LIST IS NOT EMPTY"
        echo "*** updating balancing table to reflect bad file ***"
        updt_bal_log_str="UPDATE ${bal_log_tbl} SET trgt_load_stus_cd ='F' where json_fl_id in ($bad_fl_list);"
        f_exe_sql_stmnt "$DB" "$updt_bal_log_str"
        echo "*** deleting bad JSON file record from staging with file ID: ${bad_fl_list}"
        delete_stmnt="delete from ${stg_tbl} where json_fl_id in ($bad_fl_list);"
        f_exe_sql_stmnt "$t_db" "${delete_stmnt}"
    fi
}
Here is some example output from the logs:
+ psql -d dedw -Atc 'UPDATE json_load_bal_dtl_log SET trgt_load_stus_cd ='\''F'\'' where json_fl_id in (O21181043417
O21181043417
O21181003641);'
ERROR: syntax error at or near "O21181043417"
LINE 2: O21181043417
^
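The log shows the IDs reaching the IN (...) clause one per line, unquoted and without commas, which is what psql -At prints for a single column. One possible fix is to convert that output into a quoted, comma-separated list before building the statement. The sketch below assumes json_fl_id is a text column (the IDs start with a letter); the helper name to_sql_in_list is hypothetical.

```shell
# Hypothetical helper: turn newline-separated IDs on stdin into 'a','b','c'.
to_sql_in_list() {
    # Drop blank lines, double any embedded single quotes, wrap each ID in
    # single quotes, then join all lines with commas.
    sed -e '/^$/d' -e "s/'/''/g" -e "s/.*/'&'/" | paste -s -d, -
}

# Stand-in for the psql output captured in bad_fl_list
bad_fl_list=$(printf 'O21181043417\nO21181003641\n' | to_sql_in_list)
echo "json_fl_id in ($bad_fl_list)"
# prints: json_fl_id in ('O21181043417','O21181003641')
```

In the script from the question, this would mean piping the result of f_exe_sql_stmnt through the helper before it is interpolated into the UPDATE and DELETE statements.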

When beeline only partially executes a list of commands, how do I get the exit code status?

I run beeline with a file (-f) named "some.sql" that contains multiple queries to be executed. If one of them fails, does beeline return 0 or a non-zero value? I would like to capture and handle this situation.

The return code will be non-zero if at least one of the queries in the file fails. Beeline will not execute any queries that come after the failed one in the script. It is better to have one query per file.
A sample bash script:
#!/bin/bash
beeline -u "$url" -f queries.sql
rc=$?
if [ $rc -ne 0 ]
then
    echo "return code is $rc. One or more queries in the file failed"
else
    echo "return code is $rc. All queries executed successfully"
fi
You can also add printf statements after each query in the queries file to see which queries executed successfully.
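The one-query-per-file advice can be sketched as a small loop that stops at the first failing file and reports which one it was. The names run_each and run_beeline are hypothetical, and url is assumed to be set as in the sample script.

```shell
# Hypothetical helper: run a command once per file, stop at the first failure.
# $1 is the name of the runner command; the remaining arguments are files.
run_each() {
    runner=$1
    shift
    for f in "$@"; do
        "$runner" "$f"
        rc=$?
        if [ "$rc" -ne 0 ]; then
            echo "return code is $rc. Stopped at $f" >&2
            return "$rc"
        fi
        echo "ok: $f"
    done
}

# Assumed wrapper around the beeline call from the sample script
run_beeline() { beeline -u "$url" -f "$1"; }

# Assumed usage: run_each run_beeline q1.sql q2.sql q3.sql
```

Passing the runner as an argument keeps the loop independent of beeline, so it can be exercised with a stand-in command first.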

Hive to pass parameters in where clause

In Hive, can we pass a parameter in a WHERE clause?
If yes, could you please explain it with an example scenario?
For example, in SQL:
select * from mytable where col= ?

Yes, you can.
Here are several examples:
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+VariableSubstitution
Here is a specific example, using a shell script, which is a very common place to use variable substitution:
#!/usr/bin/env bash
if [ "$#" -eq 1 ]; then
    WHEREVAR=$1
    hive -e "SELECT * FROM myDB.myTable where myFirstField=${WHEREVAR};"
else
    echo "Illegal number of parameters"
fi
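The script above lets the shell substitute the value into the query text. The linked page also describes Hive's own substitution through --hivevar. Here is a minimal sketch of that variant, with myDB.myTable and myFirstField taken from the example above. HIVE_BIN and run_filtered_query are hypothetical names, added only so the command line can be dry-run with echo.

```shell
# HIVE_BIN is a hypothetical override; it defaults to the real hive CLI.
HIVE_BIN=${HIVE_BIN:-hive}

run_filtered_query() {
    # Single quotes keep ${hivevar:val} literal so that Hive, not the
    # shell, performs the substitution.
    query='SELECT * FROM myDB.myTable WHERE myFirstField = ${hivevar:val};'
    "$HIVE_BIN" --hivevar "val=$1" -e "$query"
}

# Assumed usage: run_filtered_query 42
```

The substitution is textual, so a string value would still need quotes around ${hivevar:val} inside the query.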

Connect to sqlplus only once without writing to a file in a loop

I have a requirement for which I need to write a ksh script that reads command line parameters into arrays and creates DML statements to insert records into an oracle database. I've created a script as below to achieve this. However, the user invoking the script doesn't have permission to write into the directory where the script has to run. So, is there a way we can fire multiple inserts on the database without connecting to sqlplus multiple times within the loop and at the same time, NOT create temp sql file as below? Any ideas are highly appreciated. Thanks in advance!
i=0
while (( i<$src_tbl_cnt ))
do
    echo "insert into temp_table values ('${src_tbl_arr[$i]}', ${ins_row_arr[$i]}, ${rej_row_arr[$i]});" >> temp_scrpt.sql
    (( i+=1 ))
done
echo "commit; disc; quit" >> temp_scrpt.sql
sqlplus user/pass@db @temp_scrpt.sql

Just use the /tmp directory.
The /tmp directory is guaranteed to be present on any Unix-family server. It is there precisely for needs like this. Definitely add something like the current process ID to the file name so that multiple users don't step on each other. The full name is then something like /tmp/temp_$$_scrpt.sql (in the shell, $$ expands to the current process ID).
When done, be sure to delete that file as well, say, in a line right after the sqlplus call. To make that reliable, store the file name in a variable and delete what's in that variable.
It should go without saying, but in a well-run shop: 1) The admins should have put more than enough space in /tmp. 2) Users should not delete each other's files in /tmp or overload it so that it runs out of space. 3) The admins should set up a job that deletes files from /tmp after a certain age, so that if your script fails before it deletes the temporary file, the file won't stay there forever.
So really, this answer is more about /tmp and managing it effectively--but that really is what you need. Using temporary files is a powerful technique, so your design is good. And the reality that users often won't have rights in a directory is common, so /tmp is your answer.
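The /tmp advice can be sketched with mktemp and a cleanup trap. This is a swapped-in technique rather than the process-ID naming described above, and it assumes mktemp -d is available, as it is on most current Unix-like systems. The variable names and the sample insert values are illustrative only.

```shell
# Create a private scratch directory under /tmp (or $TMPDIR if set).
tmp_dir=$(mktemp -d "${TMPDIR:-/tmp}/temp_scrpt.XXXXXX") || exit 1

# Remove the directory when the script exits, even after a failure.
trap 'rm -rf "$tmp_dir"' EXIT

# Keep the .sql suffix so the file can be run as a script by name.
tmp_sql="$tmp_dir/temp_scrpt.sql"

# Illustrative statements; the real script would write its inserts here.
printf '%s\n' "insert into temp_table values ('t1', 10, 0);" "commit;" > "$tmp_sql"

# Assumed invocation, commented out because it needs a database:
# sqlplus -S user/pass@db @"$tmp_sql"
```

The trap covers the case where the script dies before reaching an explicit delete line, so cleanup does not rely only on an admin job.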

Instead of creating a temporary file, you can pipe the output of a block that generates the input directly into sqlplus in your shell script.
Example:
{
    echo 'set auto off;'
    for ((i=0; i<100; i++)); do
        echo "insert into itest(i) values ($i);"
    done
    # echo 'rollback;' # for testing
    echo 'commit;'
} | sqlplus -S juser/secret@db > /dev/null
This works with Ksh 93 and Bash (perhaps even with Ksh 88 modulo the (( expression syntax).
The corresponding DDL statement for the test table:
create table itest ( i number(36) ) ;
PS: Btw, even when creating a temporary file is preferred - redirecting the output is way more efficient than doing an append-style redirect for each line, e.g.:
{ for ((i=0; i<100; i++)); do echo "line $i"; done; echo end; } > foo.tmp

Will the piece of code below keep connecting to SQL*Plus multiple times, or will it connect only once?
{
    echo 'set auto off;'
    for ((i=0; i<100; i++)); do
        echo "insert into itest(i) values ($i);"
    done
    echo 'rollback;' # for testing
    echo 'commit;'
} | sqlplus -S juser/secret@db > /dev/null

Running oracle script as oracle user from a shell script that runs as root

I have a shell script that runs as root. I want the script to switch to oracle user, run sqlplus and run some .sql files.
I am trying the following:
su - oracle << -EOF1 2>&1
sqlplus $user/$password << -EOF2
@oracle.sql;
@quartz.sql;
EOF2
EOF1
First of all, I get "stty: standard input: Inappropriate ioctl for device". What does it mean?
Second, can someone explain to me how the redirect should work in this case?
Thanks

Use:
if [ "$(id -un)" = "root" ]; then
    exec su - oracle -c "$0"
fi
sqlplus <<EOF
blablabla
EOF
If your script potentially takes arguments, the solution will differ.
What this does is check whether the script is currently running as root. If so, it re-executes the script ($0) as the oracle user instead.
But BTW, why does the script run as root in the first place?

su - oracle -c " echo 'select 1 from dual;
select 2 from dual;'| sqlplus / as sysdba "
If the SQL contains single quotes ('), use the following instead:
su - oracle -c " echo \"select 1 from dual;
select 2 from dual;\" | sqlplus / as sysdba "
