SQL queries in BASH script - sql

I am using Hadoop to execute my queries.
What I want is using BASH variables within my query. Here is an example :
export month="date +%m"
export year="date +%Y"
beeline -u jdbc:hive2://clustername.azurehdinsight.net:443/tab'
-n myname -e "select * from mytable where month = '$month' and
year = '$year';"
But the query is empty so that in reality, it's not the case within Hive.
select * from mytable where month = '$month' and
year = '$year';
is not an empty query in Hive.
Is there a problem in my bash script ?

You need execute date command using $(), change
export month="date +%m"
export year="date +%Y"
with
export month=$(date +%m)
export year=$(date +%Y)
You can use hivevar arguments with beeline
beeline -u jdbc:hive2://clustername.azurehdinsight.net:443/tab \
-n myname \
--hivevar month=$month \
--hivevar year=$year \
-e "select * from mytable where month = '${hivevar:month}' and year = '${hivevar:year}';"

Related

How to pass unix variable in where condition of query?

I have a file whose filename I am storing in a shell variable and I wish to pass that variable in the WHERE condition of my SQL select query. How can I achieve this ?
my code
cd /path/to/folder
var =$(ls tail)
id_var=$(echo "$var" | cut -f 1 -d '.')
...
...
sqlplus -s user/pwd#db < mysql.sql > output.txt
cat mysql.sql
select * from Records where "GlobalId"='$id_var'
From this answer:
cd /path/to/folder
var =$(ls tail)
id_var=$(echo "$var" | cut -f 1 -d '.')
sqlplus -s user/pwd#db #mysql.sql "${id_var}" > output.txt
Then in mysql.sql use &1 to substitute the first start argument:
select * from Records where "GlobalId"='&1'
Note: &1 is a substitution variable (and not a bind variable) so you will need to make sure that the value passed in does not perform any SQL injection attacks.
You can export the variable
export id_var
Then use envsubst command
envsubst < mysql.sql
This will substitute your variable.

query errors out if ran from shell script

I can run this query fine
CREATE TABLE db.table1 STORED AS PARQUET as
SELECT * FROM db.table WHERE UPPER(executing) = 'TRUE';
Unless I run it from bash shell script. I get this error
#!/bin/bash
bash -c 'impala-shell -k -q "CREATE TABLE db.table1 STORED AS PARQUET as
SELECT * FROM db.table WHERE UPPER(executing) = 'TRUE';"'
ERROR: AnalysisException: operands of type STRING and BOOLEAN are not
comparable: upper(executing) = TRUE
I have tried using double quotes, no quotes and lower case with no luck
Single quotes cannot be included in a single-quoted string in shell. The single quotes around TRUE aren't included in the SQL command passed to impala-shell; the first closes the initial ', and the second starts a new quoted string, so your script is equivalent to
bash -c "impala-shell -k -q \"CREATE TABLE db.table1 STORED AS PARQUET as
SELECT * from db.table WHERE UPPER(executing) = TRUE;\""
One solution is to use double quotes as I have above, which allow you to include the single quotes that SQL requires.
bash -c "impala-shell -k -q \"CREATE TABLE db.table1 STORED AS PARQUET as
SELECT * from db.table WHERE UPPER(executing) = 'TRUE';\""
Alternatively, use $'...' to quote the argument to -c, in which case you can include properly escaped single quotes in the string.
bash -c $'impala-shell -k -q "CREATE TABLE db.table1 STORED AS PARQUET as
SELECT * from db.table WHERE UPPER(executing) = \'TRUE\';"'
However it's not clear why you are using bash -c at all instead of just running impala-shell directly as:
impala-shell -k -q "CREATE ... WHERE UPPER(executing) = 'TRUE';"

How to pass parameter into SQL file from UNIX script?

I'm looking to pass in a parameter into a SQL file from my UNIX script. Unfortunately having problems with it.
Please see UNIX script below:
#!/bin/ksh
############
# Functions
_usage() {
SCRIPT_NAME=XXX
-eq 1 -o "$1" = "" -o "$1" = help -o "$1" = Help -o "$1" = HELP ]; then
echo "Usage: $SCRIPT_NAME [ cCode ]"
echo " - For example : $SCRIPT_NAME GH\n"
exit 1
fi
}
_initialise() {
cCode=$1
echo $cCode
}
# Set Variables
_usage $#
_initialise $1
# Main Processing
sql $DBNAME < test.sql $cCode > $PVNUM_LOGFILE
RETCODE=$?
# Check for errors within log file
if [[ $RETCODE != 0 ]] || grep 'E_' $PVNUM_LOGFILE
then
echo "Error - 50 - running test.sql. Please see $PVNUM_LOGFILE"
exit 50
fi
Please see SQL script (test.sql):
SELECT DISTINCT v1.*
FROM data_latest v1
JOIN temp_table t
ON v1.number = t.id
WHERE v1.code = '&1'
The error I am receiving when running my UNIX script is:
INGRES TERMINAL MONITOR Copyright 2008 Ingres Corporation
E_US0022 Either the flag format or one of the flags is incorrect,
or the parameters are not in proper order.
Anyone have any idea what I'm doing wrong?
Thanks!
NOTE: While I don't work with the sql command, I do routinely pass UNIX parameters into SQL template/script files when using the isql command line tool, so fwiw ...
The first thing you'll want to do is replace the &1 string with the value in the cCode variable; one typical method is to use sed to do a global search and replace of &1 with ${cCode} , eg:
$ cCode=XYZ
$ sed "s/\&1/${cCode}/g" test.sql
SELECT DISTINCT v1.*
FROM data_latest v1
JOIN temp_table t
ON v1.number = t.id
WHERE v1.code = 'XYZ' <=== &1 replaced with XYZ
NOTE: You'll need to wrap the sed code in double quotes so that the value of the cCode variable can be referenced.
Now, to get this passed into sql there are a couple options ... capture the sed output to a new file and submit that file to sql or ... [and I'm guessing this is doable with sql], pipe the sed output into sql, eg:
sed "s/\&1/${cCode}/g" test.sql | sql $DBNAME > $PVNUM_LOGFILE
You may need '\p\g' around your SQL in the text file?
I personally tend to code in the SQL to the script itself, as in
#!/bin/ksh
var=01.01.2018
db=database_name
OUTLOG=/path/log.txt
sql $db <<_END_ > $OUTLOG
set autocommit on;
\p\g
set lockmode session where readlock = nolock;
\p\g
SELECT *
FROM table
WHERE date > '${var}' ;
\p\g
_END_
exit 0

Print variable with each line of while-read command

I'm trying to set up a monitoring script that would take all the databases we have, showed tables and done some arithmetics on it.
I have this command:
impala-shell -i impalad -q " show databases;" -B | while read a; do impala-shell -q "show tables in ${a}" -B -i impalad; done
That produces following output:
Query: show tables in database1
table1
table2
How should I format the output to display the database name($a) with each table? I tried echoing it or || but this only prints the database name after displaying all the tables. Or is there a way how to pass the variable to awk?
Desired output would look like this:
database1.table1
database1.table2
It looks like the output of the show tables ... command will have a 1-line header, followed by the list of table names.
You could skip the first line by piping to tail -n +2,
and then use another while loop to echo the database name and table name pairs in the desired format:
impala-shell -i impalad -q " show databases;" -B | while read a; do
impala-shell -q "show tables in ${a}" -B -i impalad | tail -n +2 | while read table; do
echo $a.$table
done
done
You could also do
impala-shell -q ... | awk -v db="$a" 'NR > 1 {print db "." $0}'

How to export query to file

I need export a query to a file. I'm trying with
(SELECT A.*
FROM dfs.ff.`filea.json` A
LEFT JOIN dfs.ff.`fileb.json` B ON (A.quote = B.quote)
WHERE B.C IS NULL) ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
STORED AS TEXTFILE LOCATION dfs.ff.`result.csv`;
But throws me a error
Use BCP utility
bcp "SELECT A.*
FROM dfs.ff.`filea.json` A
LEFT JOIN dfs.ff.`fileb.json` B ON (A.quote = B.quote)
WHERE B.C IS NULL" queryout "D:\MyTable.csv" -c -t , -S SERVERNAME -T
The -c argument specifies character output, as opposed to SQL's native binary format; this defaults to tab-separated values, but -t , changes the field terminator to commas. -T specifies Windows authentication ("trusted connection"), otherwise use -U MyUserName -P MyPassword.
This doesn't export column headers by default. You need to use a UNION ALL for headers
OR
Use SQLCMD
SQLCMD -S SERVERNAME -E -Q "SELECT A.*
FROM dfs.ff.`filea.json` A
LEFT JOIN dfs.ff.`fileb.json` B ON (A.quote = B.quote)
WHERE B.C IS NULL"
-s "," -o "D:\MyData.csv"
Also
http://www.egenconsulting.com/output-sql-to-csv/
http://solvedstack.com/questions/is-there-a-select-into-outfile-equivalent-in-sql-server-management-studio
Use the Drill shell command !record to record all output to a specified file. http://drill.apache.org/docs/configuring-the-drill-shell/
Use CTAS for the purpose. Don't forget to define store.format (default is parquet). Doc ref: https://drill.apache.org/docs/create-table-as-ctas/.