I have an sql query that returns a date.
I call this query from a shell script and would like to assign this value to the variable called datestart (and use it later). Here is my code. Without the datestart assignment the query works fine.
#!/bin/sh
firstname="-Upgsql"
dbname="statcoll"
portname="-p5438"
datestart=(psql $firstname $portname $dbname<< EOF
SELECT MIN(latestrefdate) FROM (SELECT MAX(referencedate) AS latestrefdate FROM statistics WHERE transactionname IN(SELECT DISTINCT transactionname FROM statistics WHERE platform = 'Smarties')GROUP BY transactionname) as earliest;
EOF
)
echo $datestart
but the result is this :
Syntax error: word unexpected (expecting ")").
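I have no idea where I should insert that closing bracket. Any hint is appreciated.
Plain parentheses in an assignment do not capture a command's output. You need command substitution instead: $(...), which works in bash and in any POSIX sh, or backticks `...`, which also work in older Bourne shells.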
Try this:
#!/bin/sh
firstname="pgsql"
dbname="statcoll"
portname="5438"
datestart=`psql -t --pset="footer=off" --username="$firstname" --port="$portname" -d "$dbname" <<EOF
SELECT MIN(latestrefdate) FROM (SELECT MAX(referencedate) AS latestrefdate FROM statistics WHERE transactionname IN (SELECT DISTINCT transactionname FROM statistics WHERE platform = 'Smarties') GROUP BY transactionname) as earliest;
EOF
`
echo "$datestart"
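To see the capture mechanics without a database, here is a minimal sketch in which cat stands in for psql (an assumption made purely so the example runs anywhere; the date is a sample value, not real query output). It uses the $(...) form with a here-document, which behaves the same way as the backtick version above.

```shell
#!/bin/sh
# `cat` is a stand-in for psql: it simply echoes the here-document back.
# The date below is a made-up sample value.
datestart=$(cat <<EOF
2019-01-15
EOF
)

# Quote the variable so whitespace in the result is preserved.
echo "$datestart"
```

Command substitution strips trailing newlines, so the variable holds just the one line.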
Condition within Hive query in shell script not working properly
I wrote a shell script that sends an email alert based on the outcome of a query, but whatever the value of the variable, only the second branch (the else part) ever gets sent. Please help me check it. Below is the script:
#!/bin/sh
strata=$(impala connection string -q "SELECT calendar, COUNT(*) row_count FROM TABLE
WHERE calendar = CAST(from_unixtime(unix_timestamp(now() - interval 1 days), 'yyyyMMdd') AS INT)
GROUP BY calendar
ORDER BY calendar DESC;")
if [ $strata -eq 0 ] ;then
echo -e 'The table HAS NOT been refreshed today, kindly hold' | mailx -s 'Alerting:Refresh_Status' -c email.address -- email.address
else
echo -e The number of records is $strata | mailx -s 'Alerting: Refresh_Status' -c email.address -- email.address
fi
The output of the variable will either be 0 or the number of records in the table, and email will be sent based on that. But the else part is the only one that gets sent regardless of the result.
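Correct me if I'm wrong, but I think the strata variable holds the entire result of your query, that is, both the calendar value and the row count. That is never a single integer equal to 0, so the test [ $strata -eq 0 ] cannot succeed and the script always falls through to the else branch.
I think the query should return only the count, like this: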
SELECT COUNT(*) row_count FROM TABLE
WHERE calendar = CAST(from_unixtime(unix_timestamp(now() - interval 1 days), 'yyyyMMdd') AS INT)
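You just need to exclude the calendar column (and with it the GROUP BY and ORDER BY) from your select statement. Without a GROUP BY, COUNT(*) always returns exactly one row, with 0 when nothing matches.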
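Because -eq only works on a single integer, it can help to validate the captured value before comparing it. The following is a minimal sketch, not the asker's script: run_query is a hypothetical stand-in for the real impala call and prints a made-up two-column result like the one the original query would produce.

```shell
#!/bin/sh
# run_query is a hypothetical stand-in for the real impala call.
# It prints a sample two-column result (calendar, row_count).
run_query() { echo "20190114 0"; }

strata=$(run_query)

# Accept only a bare non-negative integer before branching on it.
case "$strata" in
  ''|*[!0-9]*) status="unexpected output: $strata" ;;
  0)           status="not refreshed" ;;
  *)           status="refreshed: $strata rows" ;;
esac

echo "$status"
```

With the two-column sample the guard reports the unexpected output instead of silently taking the else branch.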
I have a column in a Hive table like this:
testing_time
2018-12-31 14:45:55
2018-12-31 15:50:58
Now I want to get the distinct values into a variable so I can use them in another query.
I have done it like this:
abc=`hive -e "select collect_set(testing_time) from db.tbl";`
echo $abc
["2018-12-31 14:45:55","2018-12-31 15:50:58"]
xyz=${abc:1:-1}
when I do
hive -e "select * from db.tbl where testing_time in ($xyz)"
I get below error
Arguments for IN should be the same type! Types are {timestamp IN (string, string)
What mistake am I making?
What is the correct way to achieve my result?
Note: I know I can use a subquery for this scenario, but I would like to use a variable to achieve my result.
The problem is that you're comparing a timestamp (column testing_time) with a string (e.g. "2018-12-31 14:45:55"), so you need to convert each string to a timestamp, which you can do via TIMESTAMP(string).
Here's a bash script that adds the conversion:
RES=""                              # the resulting SQL fragment is built here
IFS=","
read -ra ITEMS <<< "$xyz"           # split the timestamps into an array
for ITEM in "${ITEMS[@]}"; do
    RES="${RES}TIMESTAMP($ITEM),"   # add the timestamp to RES,
                                    # wrapped in TIMESTAMP(x)
done
unset IFS
RES="${RES%?}"                      # delete the trailing comma
Then you can run the constructed SQL query:
hive -e "select * from db.tbl where testing_time in ($RES)"
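As an alternative sketch (not from the answer above), the same wrapping can be done in one sed substitution, which also works in plain sh. It assumes, as in the question, that every item in xyz is a double-quoted string containing no embedded double quotes.

```shell
#!/bin/sh
# Sample value of xyz, taken from the question's output with the brackets removed.
xyz='"2018-12-31 14:45:55","2018-12-31 15:50:58"'

# Wrap every double-quoted item in TIMESTAMP(...); & is the matched text.
RES=$(printf '%s' "$xyz" | sed 's/"[^"]*"/TIMESTAMP(&)/g')

echo "$RES"
```

The commas between items are left untouched, so there is no trailing comma to remove.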
I want to store current_day - 1 in a variable in Hive. I know there are previous threads on this topic, but the solutions provided there recommend first defining the variable outside Hive, in a shell environment, and then using that variable inside Hive.
Storing result of query in hive variable
I first got the current_Date - 1 using
select date_sub(FROM_UNIXTIME(UNIX_TIMESTAMP(),'yyyy-MM-dd'),1);
Then I tried two approaches:
1. set date1 = ( select date_sub(FROM_UNIXTIME(UNIX_TIMESTAMP(),'yyyy-MM-dd'),1);
and
2. set hivevar:date1 = ( select date_sub(FROM_UNIXTIME(UNIX_TIMESTAMP(),'yyyy-MM-dd'),1);
Both the approaches are throwing an error:
"ParseException line 1:82 cannot recognize input near 'select' 'date_sub' '(' in expression specification"
When I print the variable from approach (1), it holds the text of the select query instead of yesterday's date. Approach (2) throws "{hivevar:dt_chk} is undefined".
I am new to Hive, would appreciate any help. Thanks.
Hive doesn't support a straightforward way to store a query result in a variable. You have to capture the result in the shell and pass it in with hiveconf. Note that the shell assignment must not have spaces around the = sign.
date1=$(hive -e "set hive.cli.print.header=false; select date_sub(from_unixtime(unix_timestamp(),'yyyy-MM-dd'),1);")
hive --hiveconf date1="$date1" -f hive_script.hql
Then in your script you can reference the newly created variable date1:
select '${hiveconf:date1}'
After lots of research, this is probably the best way to achieve setting a variable as an output of an SQL:
INSERT OVERWRITE LOCAL DIRECTORY '<home path>/config/date1'
select CONCAT('set hivevar:date1=',date_sub(FROM_UNIXTIME(UNIX_TIMESTAMP(),'yyyy-MM-dd'),1)) from <some table> limit 1;
source <home path>/config/date1/000000_0;
You will then be able to use ${date1} in your subsequent SQL statements.
Here we had to use <some table> limit 1 because Hive has a bug in insert overwrite when no table name is specified.
How can I execute a Hive query in the background when the query looks like this?
Select count(1) from table1 where column1='value1';
I am trying to run it from a script like the one below:
#!/usr/bin/ksh
exec 1> /home/koushik/Logs/`basename $0 | cut -d"." -f1 | sed 's/\.sh//g'`_$(date +"%Y%m%d_%H%M%S").log 2>&1
ST_TIME=`date +%s`
cd $HIVE_HOME/bin
./hive -e 'SELECT COUNT(1) FROM TABLE1 WHERE COLUMN1 = ''value1'';'
END_TIME=`date +%s`
TT_SECS=$(( END_TIME - ST_TIME))
TT_HRS=$(( TT_SECS / 3600 ))
TT_REM_MS=$(( TT_SECS % 3600 ))
TT_MINS=$(( TT_REM_MS / 60 ))
TT_REM_SECS=$(( TT_REM_MS % 60 ))
printf "\n"
printf "Total time taken to execute the script="$TT_HRS:$TT_MINS:$TT_REM_SECS HH:MM:SS
printf "\n"
but I am getting an error like this:
FAILED: SemanticException [Error 10004]: Line 1:77 Invalid table alias or column reference 'value1'
Please let me know exactly where I am making a mistake.
Create a file named example:
vi example
Enter the query in the file and save it:
create table sample as
Select count(1) from table1 where column1='value1';
Now run the file using the following command:
hive -f example 1>example.error 2>example.output &
The shell prints the job number, for example:
[1]
Now disown the process:
disown
The process will now keep running in the background. If you want to follow its progress, you may use
tail -f example.output
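The background-and-wait pattern can be tried without Hive. In this minimal sketch a subshell with sleep stands in for hive -f example (an assumption so the example runs anywhere), the value 42 is made-up output, and the file names under the temporary directory are illustrative.

```shell
#!/bin/sh
# A subshell with `sleep` stands in for `hive -f example`; 42 is made-up output.
workdir=$(mktemp -d)

# Start the job in the background, sending stdout and stderr to separate files.
( sleep 1; echo "42" ) >"$workdir/job.out" 2>"$workdir/job.err" &
pid=$!            # PID of the background job

wait "$pid"       # block until the job finishes
status=$?         # exit status of the job

result=$(cat "$workdir/job.out")
echo "exit status: $status, result: $result"

rm -rf "$workdir"
```

Using wait instead of disown suits a script that needs the result before it continues; disown suits a job that should outlive the interactive shell.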
True, @Koushik! Glad that you found the issue.
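In the original command, the shell could not form the Hive query because of the nested single quotes: the inner quotes end and restart the shell string, so they are removed and Hive receives Column1 = Value1, where Value1 is read as a column reference.
Though SELECT COUNT(1) FROM Table1 WHERE Column1 = 'Value1' is valid in Hive,
hive -e 'SELECT COUNT(1) FROM Table1 WHERE Column1 = 'Value1';' is not valid.
The best solution would be to use double quotes around Value1: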
hive -e 'SELECT COUNT(1) FROM Table1 WHERE Column1 = "Value1";'
or, as a quick and dirty solution, keep the single quotes by placing them inside a double-quoted segment:
hive -e 'SELECT COUNT(1) FROM Table1 WHERE Column1 = '"'Value1'"';'
This would make sure that the hive query is properly formed and then executed accordingly. I'd not suggest this approach unless you've a desperate ask for a single quote ;)
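A quick way to check what the shell actually hands to hive -e is to print the argument. In this sketch show_arg is a hypothetical helper that prints the single argument it receives; no Hive installation is needed.

```shell
#!/bin/sh
# show_arg is a hypothetical helper: it prints the one argument it receives,
# which is the query text `hive -e` would see.
show_arg() { printf '%s\n' "$1"; }

# Nested single quotes: the shell removes them.
broken=$(show_arg 'SELECT COUNT(1) FROM Table1 WHERE Column1 = 'Value1';')

# Double quotes inside single quotes: passed through unchanged.
doubled=$(show_arg 'SELECT COUNT(1) FROM Table1 WHERE Column1 = "Value1";')

# Single quotes spliced in through a double-quoted segment.
spliced=$(show_arg 'SELECT COUNT(1) FROM Table1 WHERE Column1 = '"'Value1'"';')

echo "$broken"
echo "$doubled"
echo "$spliced"
```

The first line printed has no quotes around Value1, which is why Hive reports an invalid column reference; the other two keep their quotes.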
I was able to resolve it by replacing the single quotes with double quotes. The modified statement now looks like this:
./hive -e 'SELECT COUNT(1) FROM Table1 WHERE Column1 = "Value1";'
I have this script that retrieves a parameter (a number) from a SQL query and assigns it to a variable.
There are two options: either the SQL query finds a value and the script performs
echo "the billcycle number is $v_bc", or it doesn't find a value and it is supposed to
echo "no billcycle parameter found".
I'm having a problem with the if condition.
This is what I came up with:
#!/bin/bash
v_bc=`sqlplus -s /@bscsprod <<EOF
set pagesize 0
select billcycle from bc_run
where billcycle not in (50,16)
and control_group_ind is null
and billseqno=6043;
EOF`
if [ -z "$v_bc" ]; then echo no billcycle parameter found
else echo "the billcycle parameter is $v_bc"
fi
When billseqno=6043, v_bc=25, and when I run the script the result is:
"the billcycle parameter is 25", which is what I meant it to do.
When I set billseqno=6042, the SQL query above finds nothing, so v_bc should get no value and I want the script to echo "no billcycle parameter found".
Instead I get
"the billcycle parameter is
no rows were selected".
Any suggestions?
Thanks very much,
Assaf.
Your code is correctly checking for an empty value -- v_bc is just not empty, even with -s.
You may either:
1. Parse the output of sqlplus when no rows are returned, by adding this:
if [[ "$v_bc" == *"no rows were selected"* ]]; then v_bc=""; fi
This uses the bash [[ ]] command with "==" for pattern matching; the unquoted * on each side means we don't need to worry about leading/trailing whitespace. This is not as robust as dogbane's SET FEEDBACK OFF since it's entirely possible for "no rows were selected" to be valid data.
2. Write a better query which always returns data, like this:
http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:594023455752#followup-221571500346463844
The trick is to reformulate your query and use select/union with a fallback query that conditionally provides output when your query is empty:
with data as
  (select to_char(billcycle) as billcycle from bc_run where [...])
select * from data
union all
select 'NA' from dual where not exists (select null from data);
(see also What is the dual table in Oracle? )
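The pattern-matching check from the first option can also be written with a POSIX case statement, which does not require bash. This is a minimal sketch: the sample value of v_bc imitates the message reported in the question, with stray newlines around it, and msg is an illustrative variable name.

```shell
#!/bin/sh
# Sample value imitating the message from the question, with stray newlines.
v_bc='
no rows were selected
'

# Treat the "no rows" message as an empty result; * matches surrounding whitespace.
case "$v_bc" in
  *"no rows were selected"*) v_bc="" ;;
esac

if [ -z "$v_bc" ]; then
  msg="no billcycle parameter found"
else
  msg="the billcycle parameter is $v_bc"
fi

echo "$msg"
```

The same caveat applies as for the [[ ]] version: if the message text could ever be valid data, turning feedback off in sqlplus is the safer route.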
Try turning feedback off in sqlplus so that you don't get any output if no rows are selected:
v_bc=`sqlplus -s /@bscsprod <<EOF
SET FEEDBACK OFF
set pagesize 0
select billcycle from bc_run
where billcycle not in (50,16)
and control_group_ind is null
and billseqno=6043;
EOF`
Try something like
if [ "${v_bc:-SHOULDNTHAPPEN}" = "SHOULDNTHAPPEN" ]; then
echo no billcycle parameter found
else......
The idiomatic way to assign a value to a variable if that variable is empty is with an = in a parameter expansion. In other words, this:
: ${v_bc:=no billcycle parameter found}
is equivalent to:
test -z "$v_bc" && v_bc='no billcycle parameter found'
In your case, it would also be easy to do:
echo "${v_bc:-no billcycle parameter found}"
but the question asked in the title is not really the problem in your case. (Since the problem is not that v_bc is the empty string, but rather that it is a value you do not expect.)
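The difference between the two expansions is easiest to see side by side. In this short sketch the names shown, after_dash and after_equals are illustrative only.

```shell
#!/bin/sh
v_bc=""

# :- substitutes the default for this one expansion; v_bc itself stays empty.
shown=${v_bc:-no billcycle parameter found}
after_dash=$v_bc

# := substitutes the default and also assigns it to v_bc.
: "${v_bc:=no billcycle parameter found}"
after_equals=$v_bc

echo "shown=$shown"
echo "after_dash=[$after_dash]"
echo "after_equals=$after_equals"
```

Use :- when the fallback is only for display, and := when later code should see the default as the variable's value.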