hive condition in shell script not working as expected - sql

Condition within Hive query in shell script not working properly
I wrote a shell script that sends an email alert based on the result of a query, but only the second branch (the else part) ever gets sent, whatever the value of the variable. Could you please help me check it? The script is below:
#!/bin/sh
strata=$(impala connection string -q "SELECT calendar, COUNT(*) row_count FROM TABLE
WHERE calendar = CAST(from_unixtime(unix_timestamp(now() - interval 1 days), 'yyyyMMdd') AS INT)
GROUP BY calendar
ORDER BY calendar DESC;")
if [ $strata -eq 0 ] ;then
echo -e 'The table HAS NOT been refreshed today, kindly hold' | mailx -s 'Alerting:Refresh_Status' -c email.address -- email.address
else
echo -e The number of records is $strata | mailx -s 'Alerting: Refresh_Status' -c email.address -- email.address
fi
The output of the variable will either be 0 or the number of records in the table, and email will be sent based on that. But the else part is the only one that gets sent regardless of the result.

Correct me if I'm wrong, but I think the strata variable contains the whole result of your query, that is, both the calendar value and the row count. It is therefore never equal to 0, and the if statement in your script always falls through to the else branch.
I think the query should look like this:
SELECT COUNT(*) row_count FROM TABLE
WHERE calendar = CAST(from_unixtime(unix_timestamp(now() - interval 1 days), 'yyyyMMdd') AS INT)
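You just need to exclude calendar from your SELECT list (and drop the GROUP BY and ORDER BY with it), so that the query returns only the count. Without GROUP BY, COUNT(*) returns a single row containing 0 when no rows match.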
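A minimal sketch of the check once the query returns a single number. Here run_count_query is a hypothetical stand-in for the real Impala call (the connection string is not shown in the question), and the mailx calls are replaced by echo so the logic can be tried on its own:

```shell
#!/bin/sh
# Hypothetical stand-in for the real Impala call; it is assumed to print
# only the COUNT(*) value, possibly padded with whitespace.
run_count_query() {
    printf ' 0 \n'
}

# Print the alert text for a raw query result; fail if it is not a number.
refresh_message() {
    # Trim leading and trailing whitespace before testing the value.
    count=$(printf '%s' "$1" | sed 's/^[[:space:]]*//; s/[[:space:]]*$//')
    case $count in
        ''|*[!0-9]*)
            echo "Unexpected query output: $1"
            return 1
            ;;
    esac
    if [ "$count" -eq 0 ]; then
        echo "The table HAS NOT been refreshed today, kindly hold"
    else
        echo "The number of records is $count"
    fi
}

strata=$(run_count_query)
refresh_message "$strata"
```

Validating the value first means that a two-column result such as "20160701 42" is reported as unexpected instead of silently taking the else branch.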

Related

How to use case statement in select with termsql?

I'm getting an unexpected error when using a case statement with a query using termsql. Is there something I'm doing wrong with it, perhaps invalid sql, or might this be an issue with the termsql parser (sqlparse version 0.4.2)?
Here's a simplified example of what I'm seeing (with newlines to help readability):
echo "Fred|10\nJulie|15\nFatima|42\nJulie|10\nFatima|18" | \
termsql -c 'name,activities' -t test -d'|' \
"select case name when 'Fatima' then 'Primary' else 'Secondary' end primary_normalized,sum(activities) from test group by primary_normalized"
That yields an error of:
Error: near line 1: near "'Fatima'": syntax error
The goal there is to group things by "Primary" and "Secondary" based on a chosen name.
This same query will do it, but the output isn't quite what I want:
echo "Fred|10\nJulie|15\nFatima|42\nJulie|10\nFatima|18" | \
termsql -c 'name,activities' -t test -d'|' \
"select name,sum(activities) from test group by case name when 'Fatima' then 'Primary' else 'Secondary' end"
The output from that is
Fatima|60
Fred|35
...which is kind of weird since the select is operating on the name field, but the group is operating on the case statement output. Not unexpected, but not desirable, where I'd really like the first option to work.

KSH-update : Get rows affected

Hello, I'm querying a Teradata database like this:
for var in `db2 -x "$other_query"`;
do
query_update_date="update test SET date =Null WHERE
name_test='$var '"
db2 -v "$query_update_date"
done
My query is executed, but I would like to print query_update_date only when one or more rows are affected (changed) by the update.
Example :
If I have
First query of loop :
query_update_date="update test SET date =Null WHERE
name_test='John'"
and second query of the loop :
query_update_date="update test SET date =Null WHERE
name_test='Jeff'"
and in my table before the query :
name_test date
Jeff 01/07/2016
John Null
After the query
name_test date
Jeff Null
John Null
The date for John was already null, so that row was not affected by the update.
And
db2 -v "$query_update_date"
prints all of my queries. What I want for the previous example is to print in my logs only
query_update_date="update test SET date =Null WHERE
name_test='Jeff'"
Take snapshots of the table before and after you execute the query. Use whatever tools you like: for example (assuming SQL), SPOOL a "SELECT * FROM T" query into one file beforehand and into another file afterwards. Then use the UNIX diff command to compare the two files and count the lines of its output:
LINES_OUT=$(diff oldResults newResults | wc -l)
if [[ $LINES_OUT -ne 0 ]]
then
# the table changed: log the query, however you do that
echo "$query_update_date"
fi
If $LINES_OUT is nonzero, the update changed at least one row, so log the query; otherwise, don't.
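A small sketch of the snapshot comparison, logging the query only when the two snapshots differ. It relies on diff's exit status rather than counting output lines, and it uses temporary files with made-up contents in place of the spooled query results:

```shell
#!/bin/sh
# Succeeds when the two snapshot files differ, fails when they are identical.
snapshots_differ() {
    ! diff "$1" "$2" > /dev/null
}

# Temporary files stand in for the spooled "before" and "after" results.
workdir=$(mktemp -d)
printf 'Jeff 01/07/2016\nJohn Null\n' > "$workdir/oldResults"
printf 'Jeff Null\nJohn Null\n'       > "$workdir/newResults"

query_update_date="update test SET date = Null WHERE name_test = 'Jeff'"
if snapshots_differ "$workdir/oldResults" "$workdir/newResults"; then
    # The table changed, so this update affected at least one row.
    echo "$query_update_date"
fi
rm -rf "$workdir"
```

Note that a full-table snapshot per update can be expensive on a large table, so this is best suited to small tables or to a SELECT restricted to the rows being updated.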

executing HIVE query in background

How can I execute a Hive query in the background when the query looks like the one below?
Select count(1) from table1 where column1='value1';
I am trying to run it from a script like the one below:
#!/usr/bin/ksh
exec 1> /home/koushik/Logs/`basename $0 | cut -d"." -f1 | sed 's/\.sh//g'`_$(date +"%Y%m%d_%H%M%S").log 2>&1
ST_TIME=`date +%s`
cd $HIVE_HOME/bin
./hive -e 'SELECT COUNT(1) FROM TABLE1 WHERE COLUMN1 = ''value1'';'
END_TIME=`date +%s`
TT_SECS=$(( END_TIME - ST_TIME))
TT_HRS=$(( TT_SECS / 3600 ))
TT_REM_MS=$(( TT_SECS % 3600 ))
TT_MINS=$(( TT_REM_MS / 60 ))
TT_REM_SECS=$(( TT_REM_MS % 60 ))
printf "\n"
printf "Total time taken to execute the script="$TT_HRS:$TT_MINS:$TT_REM_SECS HH:MM:SS
printf "\n"
but I get an error like this:
FAILED: SemanticException [Error 10004]: Line 1:77 Invalid table alias or column reference 'value1'
Please let me know exactly where I am making a mistake.
Create a document named example
vi example
Enter the query in the document and save it.
create table sample as
Select count(1) from table1 where column1='value1';
Now run the file using the following command, which sends the query result to example.output and Hive's log and progress messages to example.error:
hive -f example 1>example.output 2>example.error &
You will get the result as
[1]
Now disown the process:
disown
Now the process will run in the background. If you want to follow the progress of the query, you may use
tail -f example.error
True, @Koushik! Glad that you found the issue.
In the original command, the shell was unable to form the Hive query as intended because of the nested single quotes: each inner pair '' simply closes and reopens the quoted string, so Hive receives Column1 = value1 without any quotes and treats value1 as a column reference.
Though SELECT COUNT(1) FROM Table1 WHERE Column1 = 'Value1' is valid in Hive,
hive -e 'SELECT COUNT(1) FROM Table1 WHERE Column1 = 'Value1';' is not valid.
The best solution would be to use double quotes around Value1, as in
hive -e 'SELECT COUNT(1) FROM Table1 WHERE Column1 = "Value1";'
or use a quick and dirty solution that keeps the single quotes by wrapping each of them in double quotes:
hive -e 'SELECT COUNT(1) FROM Table1 WHERE Column1 = '"'"'Value1'"'"';'
This makes sure that the Hive query is properly formed and then executed accordingly. I wouldn't suggest this approach unless you desperately need a single quote ;)
I was able to resolve it by replacing the single quotes with double quotes. The modified statement now looks like this:
./hive -e 'SELECT COUNT(1) FROM Table1 WHERE Column1 = "Value1";'
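To see what the shell actually hands to hive -e, the quoting variants can be tried with a small helper that just prints its argument. show_arg is a name made up for this sketch, and the WHERE clause is shortened for readability:

```shell
#!/bin/sh
# Print the argument exactly as a command such as `hive -e` would receive it.
show_arg() {
    printf '%s\n' "$1"
}

# Adjacent single quotes close and reopen the string, so they vanish.
show_arg 'WHERE Column1 = ''Value1'';'        # WHERE Column1 = Value1;

# Double quotes inside single quotes are passed through untouched.
show_arg 'WHERE Column1 = "Value1";'          # WHERE Column1 = "Value1";

# Each '"'"' sequence yields one literal single quote.
show_arg 'WHERE Column1 = '"'"'Value1'"'"';'  # WHERE Column1 = 'Value1';

# Double-quoting the whole query keeps single quotes literal as well.
show_arg "WHERE Column1 = 'Value1';"          # WHERE Column1 = 'Value1';
```

The last form is usually the easiest to read, as long as the query contains nothing the shell would expand inside double quotes (such as $ or backticks).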

how to assign a query result to a shell variable

I have an sql query that returns a date.
I call this query from a shell script and would like to assign this value to the variable called datestart (and use it later). Here is my code. Without the datestart assignment the query works fine.
#!/bin/sh
firstname="-Upgsql"
dbname="statcoll"
portname="-p5438"
datestart=(psql $firstname $portname $dbname<< EOF
SELECT MIN(latestrefdate) FROM (SELECT MAX(referencedate) AS latestrefdate FROM statistics WHERE transactionname IN(SELECT DISTINCT transactionname FROM statistics WHERE platform = 'Smarties')GROUP BY transactionname) as earliest;
EOF
)
echo $datestart
but the result is this :
Syntax error: word unexpected (expecting ")").
I have no idea where should I insert that closing bracket. Any hint is appreciated.
Instead of plain brackets in the variable assignment, you need command substitution: either $(...), which works in bash and in any POSIX sh, or the older `...` form.
Try this:
#!/bin/sh
firstname="-Upgsql"
dbname="statcoll"
portname="-p5438"
datestart=`psql -t --pset="footer=off" $firstname $portname -d "$dbname"<<EOF
SELECT MIN(latestrefdate) FROM (SELECT MAX(referencedate) AS latestrefdate FROM statistics WHERE transactionname IN (SELECT DISTINCT transactionname FROM statistics WHERE platform = 'Smarties') GROUP BY transactionname) as earliest;
EOF
`
echo "$datestart"
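The same capture can be sketched with $(...) and a here-document. Since no database is available here, fake_psql is a hypothetical stand-in that ignores the SQL on its standard input and prints one padded value, roughly like tuples-only output; the date it prints is invented for the example:

```shell
#!/bin/sh
# Hypothetical stand-in for `psql -t`: discards the SQL read from stdin and
# prints a single value with leading padding.
fake_psql() {
    cat > /dev/null
    printf ' 2016-07-01\n'
}

# Command substitution captures everything the command writes to stdout.
datestart=$(fake_psql <<'EOF'
SELECT 1;
EOF
)

# Trim the padding so the value can be compared or reused safely.
datestart=$(printf '%s' "$datestart" | sed 's/^[[:space:]]*//; s/[[:space:]]*$//')
echo "$datestart"
```

Quoting the here-document delimiter ('EOF') stops the shell from expanding $ and backticks inside the SQL text.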

checking if a variable value is null on bash script

I have this script that retrieves a parameter (a number) from a SQL query and assigns it to a variable.
There are two options: either the SQL query finds a value and then the script performs
echo "the billcycle number is $v_bc", or it doesn't find a value and it is supposed to
echo "no billcycle parameter found".
I'm having a problem with the if condition.
This is what I came up with:
#!/bin/bash
v_bc=`sqlplus -s /@bscsprod <<EOF
set pagesize 0
select billcycle from bc_run
where billcycle not in (50,16)
and control_group_ind is null
and billseqno=6043;
EOF`
if [ -z "$v_bc" ]; then echo no billcycle parameter found
else echo "the billcycle parameter is $v_bc"
fi
When billseqno=6043, v_bc=25, and when I run the script, the result is:
"the billcycle parameter is 25", which is what I meant it to do.
When I set billseqno=6042, according to the above SQL query, v_bc gets no value, so what I want it to do is echo "no billcycle parameter found".
Instead I get
"the billcycle parameter is
no rows were selected".
Any suggestions?
Thanks very much,
Assaf.
Your code is correctly checking for an empty value -- v_bc is just not empty, even with -s.
You may either:
parse the output of sqlplus when no rows are returned, so add this:
if [[ "$v_bc" == *"no rows were selected"* ]]; then v_bc=""; fi
This uses the bash [[ ]] command, where == with unquoted * wildcards does pattern matching, so we don't need to worry about leading/trailing whitespace. This is not as robust as the SET FEEDBACK OFF approach from the other answer, since it's entirely possible for "no rows were selected" to be valid data.
write a better query which always returns data, like this:
http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:594023455752#followup-221571500346463844
The trick being to reformulate your query and use select/union with a fallback query that conditionally provides output when your query is empty:
with data as
(select billcycle from bc_run where [...])
select to_char(billcycle) from data
union all
select 'NA' from dual where not exists (select null from data);
(see also What is the dual table in Oracle? )
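For scripts that must run under plain sh, the first option can be sketched with a case pattern instead of [[ ]]. normalise_billcycle is a name invented for this example, and the pattern assumes the feedback text contains "no rows" followed by "selected":

```shell
#!/bin/sh
# Turn raw sqlplus output into either a trimmed value or an empty string.
normalise_billcycle() {
    case $1 in
        # Feedback message instead of data: report "no value".
        *"no rows"*"selected"*) printf '' ;;
        # Anything else: strip leading and trailing whitespace.
        *) printf '%s' "$1" | sed 's/^[[:space:]]*//; s/[[:space:]]*$//' ;;
    esac
}

v_bc=$(normalise_billcycle '  no rows were selected ')
if [ -z "$v_bc" ]; then
    echo "no billcycle parameter found"
else
    echo "the billcycle parameter is $v_bc"
fi
```

The same caveat applies as above: if the message text could ever appear as real data, turning feedback off in sqlplus is the safer route.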
Try turning feedback off in sqlplus so that you don't get any output if no rows are selected:
v_bc=`sqlplus -s /@bscsprod <<EOF
SET FEEDBACK OFF
set pagesize 0
select billcycle from bc_run
where billcycle not in (50,16)
and control_group_ind is null
and billseqno=6043;
EOF`
Try something like
if [ "${v_bc:-SHOULDNTHAPPEN}" = "SHOULDNTHAPPEN" ]; then
echo no billcycle parameter found
else......
The idiomatic way to assign a value to a variable if that variable is empty is with an = in a parameter expansion. In other words, this:
: "${v_bc:=no billcycle parameter found}"
is equivalent to:
test -z "$v_bc" && v_bc='no billcycle parameter found'
In your case, it would also be easy to do:
echo "${v_bc:-no billcycle parameter found}"
but the question asked in the title is not really the problem in your case. (Since the problem is not that v_bc is the empty string, but rather that it is a value you do not expect.)
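The difference between the two expansions can be seen in a short sketch; describe_billcycle is a helper name made up for the example:

```shell
#!/bin/sh
# ${var:-text} substitutes the fallback but leaves the variable unchanged.
describe_billcycle() {
    echo "${1:-no billcycle parameter found}"
}

describe_billcycle ""     # prints the fallback text
describe_billcycle 25     # prints 25

# ${var:=text} substitutes the fallback AND assigns it to the variable.
v_bc=""
: "${v_bc:=no billcycle parameter found}"
echo "$v_bc"              # v_bc now holds the fallback text
```

Both forms treat an unset variable and an empty one the same way because of the colon; without it (${var-text}), only an unset variable triggers the fallback.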