Query errors out if run from a shell script - sql

I can run this query fine
CREATE TABLE db.table1 STORED AS PARQUET as
SELECT * FROM db.table WHERE UPPER(executing) = 'TRUE';
It fails when I run it from a bash shell script, shown here followed by the error I get:
#!/bin/bash
bash -c 'impala-shell -k -q "CREATE TABLE db.table1 STORED AS PARQUET as
SELECT * FROM db.table WHERE UPPER(executing) = 'TRUE';"'
ERROR: AnalysisException: operands of type STRING and BOOLEAN are not
comparable: upper(executing) = TRUE
I have tried using double quotes, no quotes, and lower case, with no luck.

Single quotes cannot be included in a single-quoted string in shell. The single quotes around TRUE aren't included in the SQL command passed to impala-shell; the first closes the initial ', and the second starts a new quoted string, so your script is equivalent to
bash -c "impala-shell -k -q \"CREATE TABLE db.table1 STORED AS PARQUET as
SELECT * from db.table WHERE UPPER(executing) = TRUE;\""
One solution is to use double quotes as I have above, which allow you to include the single quotes that SQL requires.
bash -c "impala-shell -k -q \"CREATE TABLE db.table1 STORED AS PARQUET as
SELECT * from db.table WHERE UPPER(executing) = 'TRUE';\""
Alternatively, use $'...' to quote the argument to -c, in which case you can include properly escaped single quotes in the string.
bash -c $'impala-shell -k -q "CREATE TABLE db.table1 STORED AS PARQUET as
SELECT * from db.table WHERE UPPER(executing) = \'TRUE\';"'
However, it's not clear why you are using bash -c at all instead of just running impala-shell directly:
impala-shell -k -q "CREATE ... WHERE UPPER(executing) = 'TRUE';"
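To see what the shell actually hands to the program, it can help to swap impala-shell for something that just prints its argument. The sketch below is illustrative only: show_query is a hypothetical helper (not part of impala-shell), and the bash -c layer is left out to keep the quoting easy to follow.

```shell
#!/bin/bash
# show_query is a hypothetical stand-in for `impala-shell -k -q`:
# it prints its first argument exactly as the shell delivered it.
show_query() { printf '%s\n' "$1"; }

# Single quotes inside a single-quoted string: the inner quotes vanish,
# because each one ends or starts a quoted segment.
broken=$(show_query 'SELECT * FROM db.table WHERE UPPER(executing) = 'TRUE';')

# Double quotes outside: the single quotes that SQL needs survive.
fixed=$(show_query "SELECT * FROM db.table WHERE UPPER(executing) = 'TRUE';")

echo "broken: $broken"
echo "fixed:  $fixed"
```

The first line printed ends in = TRUE; without quotes, which is the comparison the AnalysisException complains about.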

Related

SQL queries in BASH script

I am using Hadoop to execute my queries.
What I want is using BASH variables within my query. Here is an example :
export month="date +%m"
export year="date +%Y"
beeline -u jdbc:hive2://clustername.azurehdinsight.net:443/tab'
-n myname -e "select * from mytable where month = '$month' and
year = '$year';"
But the query returns no rows, although running it directly in Hive,
select * from mytable where month = '$month' and
year = '$year';
does return rows.
Is there a problem in my bash script?
You need to execute the date command with command substitution, $(). Replace
export month="date +%m"
export year="date +%Y"
with
export month=$(date +%m)
export year=$(date +%Y)
You can also pass the values to beeline with --hivevar arguments:
beeline -u jdbc:hive2://clustername.azurehdinsight.net:443/tab \
-n myname \
--hivevar month=$month \
--hivevar year=$year \
-e "select * from mytable where month = '${hivevar:month}' and year = '${hivevar:year}';"
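The difference between the two kinds of assignment can be checked without Hive. This is a minimal sketch; the variable name literal is only for illustration, and the query string is built but not sent anywhere.

```shell
#!/bin/bash
# Assigning a quoted string stores the text of the command, not its output.
literal="date +%m"

# Command substitution runs the command and stores what it prints.
month=$(date +%m)
year=$(date +%Y)

echo "literal: $literal"        # the text of the command itself
echo "month=$month year=$year"  # a two-digit month and a four-digit year

# The query as beeline would receive it (table and columns from the question).
query="select * from mytable where month = '$month' and year = '$year';"
echo "$query"
```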

How to pass parameter into SQL file from UNIX script?

I'm looking to pass a parameter into a SQL file from my UNIX script. Unfortunately, I'm having problems with it.
Please see UNIX script below:
#!/bin/ksh
############
# Functions
_usage() {
SCRIPT_NAME=XXX
-eq 1 -o "$1" = "" -o "$1" = help -o "$1" = Help -o "$1" = HELP ]; then
echo "Usage: $SCRIPT_NAME [ cCode ]"
echo " - For example : $SCRIPT_NAME GH\n"
exit 1
fi
}
_initialise() {
cCode=$1
echo $cCode
}
# Set Variables
_usage $#
_initialise $1
# Main Processing
sql $DBNAME < test.sql $cCode > $PVNUM_LOGFILE
RETCODE=$?
# Check for errors within log file
if [[ $RETCODE != 0 ]] || grep 'E_' $PVNUM_LOGFILE
then
echo "Error - 50 - running test.sql. Please see $PVNUM_LOGFILE"
exit 50
fi
Please see SQL script (test.sql):
SELECT DISTINCT v1.*
FROM data_latest v1
JOIN temp_table t
ON v1.number = t.id
WHERE v1.code = '&1'
The error I am receiving when running my UNIX script is:
INGRES TERMINAL MONITOR Copyright 2008 Ingres Corporation
E_US0022 Either the flag format or one of the flags is incorrect,
or the parameters are not in proper order.
Anyone have any idea what I'm doing wrong?
Thanks!
NOTE: While I don't work with the sql command, I do routinely pass UNIX parameters into SQL template/script files when using the isql command line tool, so fwiw ...
The first thing you'll want to do is replace the &1 string with the value in the cCode variable; one typical method is to use sed to do a global search and replace of &1 with ${cCode} , eg:
$ cCode=XYZ
$ sed "s/\&1/${cCode}/g" test.sql
SELECT DISTINCT v1.*
FROM data_latest v1
JOIN temp_table t
ON v1.number = t.id
WHERE v1.code = 'XYZ' <=== &1 replaced with XYZ
NOTE: You'll need to wrap the sed code in double quotes so that the value of the cCode variable can be referenced.
Now, to get this passed into sql there are a couple options ... capture the sed output to a new file and submit that file to sql or ... [and I'm guessing this is doable with sql], pipe the sed output into sql, eg:
sed "s/\&1/${cCode}/g" test.sql | sql $DBNAME > $PVNUM_LOGFILE
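The substitution step can be tried on its own before involving the database. The sketch below recreates test.sql in a temporary file so that it is self-contained. The escaping step is an addition that is not in the answer above: it assumes the value might contain characters that are special in a sed replacement.

```shell
#!/bin/bash
# Create a throwaway copy of the template so the sketch is self-contained.
tmpl=$(mktemp)
cat > "$tmpl" <<'EOF'
SELECT DISTINCT v1.*
FROM data_latest v1
JOIN temp_table t
ON v1.number = t.id
WHERE v1.code = '&1'
EOF

cCode=GH

# Escape characters that are special in a sed replacement: \, & and the / delimiter.
safe=$(printf '%s' "$cCode" | sed 's/[\\&/]/\\&/g')

# Replace every literal &1 with the escaped value.
rendered=$(sed "s/&1/${safe}/g" "$tmpl")
echo "$rendered"

rm -f "$tmpl"
```

In the real script the rendered text would be piped into sql $DBNAME as shown above, rather than echoed.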
You may need '\p\g' around your SQL in the text file?
I personally tend to put the SQL in the script itself, as in
#!/bin/ksh
var=01.01.2018
db=database_name
OUTLOG=/path/log.txt
sql $db <<_END_ > $OUTLOG
set autocommit on;
\p\g
set lockmode session where readlock = nolock;
\p\g
SELECT *
FROM table
WHERE date > '${var}' ;
\p\g
_END_
exit 0

executing HIVE query in background

How can I execute a Hive query in the background when the query looks like the one below?
Select count(1) from table1 where column1='value1';
I am trying to run it with a script like the one below:
#!/usr/bin/ksh
exec 1> /home/koushik/Logs/`basename $0 | cut -d"." -f1 | sed 's/\.sh//g'`_$(date +"%Y%m%d_%H%M%S").log 2>&1
ST_TIME=`date +%s`
cd $HIVE_HOME/bin
./hive -e 'SELECT COUNT(1) FROM TABLE1 WHERE COLUMN1 = ''value1'';'
END_TIME=`date +%s`
TT_SECS=$(( END_TIME - ST_TIME))
TT_HRS=$(( TT_SECS / 3600 ))
TT_REM_MS=$(( TT_SECS % 3600 ))
TT_MINS=$(( TT_REM_MS / 60 ))
TT_REM_SECS=$(( TT_REM_MS % 60 ))
printf "\n"
printf "Total time taken to execute the script="$TT_HRS:$TT_MINS:$TT_REM_SECS HH:MM:SS
printf "\n"
but I am getting an error like
FAILED: SemanticException [Error 10004]: Line 1:77 Invalid table alias or column reference 'value1'
Please let me know exactly where I am making a mistake.
Create a document named example
vi example
Enter the query in the document and save it.
create table sample as
Select count(1) from table1 where column1='value1';
Now run the document using the following command:
hive -f example 1>example.error 2>example.output &
The shell will respond with the job number, such as
[1]
Now disown the process:
disown
Now the process will run in the background. If you want to know the status of the output, you may use
tail -f example.output
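The background pattern can be tried without Hive. In this sketch fake_hive is a hypothetical stand-in for hive -f example, and the temporary file names are assumptions. It uses wait rather than disown, a variation that suits a script that needs the exit status; a disowned job can no longer be waited for.

```shell
#!/bin/bash
# fake_hive is a hypothetical stand-in for `hive -f example`:
# it writes a progress message to stderr and a result to stdout.
fake_hive() {
  echo "progress: job started" >&2
  sleep 1
  echo "42"
}

out=$(mktemp)   # receives stdout (the query result)
log=$(mktemp)   # receives stderr (progress and errors)

# Run in the background, with each stream going to its own file.
fake_hive >"$out" 2>"$log" &
pid=$!
echo "started background job with PID $pid"

# Other work could happen here; wait blocks until the job ends
# and returns its exit status.
wait "$pid"
status=$?

echo "exit status: $status"
echo "result: $(cat "$out")"
```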
True, @Koushik! Glad that you found the issue.
In the query, bash was unable to form the hive query due to ambiguous single quotes.
Though SELECT COUNT(1) FROM Table1 WHERE Column1 = 'Value1' is valid in hive,
$hive -e 'SELECT COUNT(1) FROM Table1 WHERE Column1 = 'Value1';' is not valid.
The best solution would be to use double quotes around Value1:
hive -e 'SELECT COUNT(1) FROM Table1 WHERE Column1 = "Value1";'
or use a quick and dirty solution: close the single-quoted string and put the single quotes inside double quotes.
hive -e 'SELECT COUNT(1) FROM Table1 WHERE Column1 = '"'Value1'"';'
This makes sure that the hive query is properly formed and then executed accordingly. I'd not suggest this approach unless you desperately need a single quote ;)
I was able to resolve it by replacing the single quotes with double quotes. The modified statement now looks like
./hive -e 'SELECT COUNT(1) FROM Table1 WHERE Column1 = "Value1";'
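If the single quotes do have to reach Hive, there are a few standard shell idioms for it. This is a sketch only: the strings are built and printed rather than passed to hive -e, and the variable names q1 to q3 are made up for illustration.

```shell
#!/bin/bash
# Three ways to build the same argument; printf stands in for `hive -e`.

# 1. Double quotes outside, single quotes inside.
q1="SELECT COUNT(1) FROM Table1 WHERE Column1 = 'Value1';"

# 2. Close the single-quoted string, add an escaped quote, reopen it.
q2='SELECT COUNT(1) FROM Table1 WHERE Column1 = '\''Value1'\'';'

# 3. A quoted here-document: the shell interprets nothing inside it.
q3=$(cat <<'EOF'
SELECT COUNT(1) FROM Table1 WHERE Column1 = 'Value1';
EOF
)

printf '%s\n' "$q1" "$q2" "$q3"
```

All three lines printed are identical, so whichever form is easiest to read in the script can be used.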

How to export query to file

I need to export the result of a query to a file. I'm trying with
(SELECT A.*
FROM dfs.ff.`filea.json` A
LEFT JOIN dfs.ff.`fileb.json` B ON (A.quote = B.quote)
WHERE B.C IS NULL) ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
STORED AS TEXTFILE LOCATION dfs.ff.`result.csv`;
But it throws an error.
Use BCP utility
bcp "SELECT A.*
FROM dfs.ff.`filea.json` A
LEFT JOIN dfs.ff.`fileb.json` B ON (A.quote = B.quote)
WHERE B.C IS NULL" queryout "D:\MyTable.csv" -c -t , -S SERVERNAME -T
The -c argument specifies character output, as opposed to SQL's native binary format; this defaults to tab-separated values, but -t , changes the field terminator to commas. -T specifies Windows authentication ("trusted connection"), otherwise use -U MyUserName -P MyPassword.
This doesn't export column headers by default; you need to add them with a UNION ALL.
OR
Use SQLCMD
SQLCMD -S SERVERNAME -E -Q "SELECT A.*
FROM dfs.ff.`filea.json` A
LEFT JOIN dfs.ff.`fileb.json` B ON (A.quote = B.quote)
WHERE B.C IS NULL" -s "," -o "D:\MyData.csv"
Also
http://www.egenconsulting.com/output-sql-to-csv/
http://solvedstack.com/questions/is-there-a-select-into-outfile-equivalent-in-sql-server-management-studio
Use the Drill shell command !record to record all output to a specified file. http://drill.apache.org/docs/configuring-the-drill-shell/
Use CTAS for the purpose. Don't forget to define store.format (default is parquet). Doc ref: https://drill.apache.org/docs/create-table-as-ctas/.
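A minimal sketch of the CTAS approach, written as a shell script that only generates the SQL file. The output table name result_csv and the temporary file are assumptions, and the dfs.ff workspace from the question would need to be configured as writable for the CTAS to succeed. Running the file against Drill is left out.

```shell
#!/bin/bash
# Write the Drill statements to a file. The quoted 'SQL' delimiter keeps
# the backticks literal instead of letting the shell treat them as commands.
sqlfile=$(mktemp)
cat > "$sqlfile" <<'SQL'
ALTER SESSION SET `store.format` = 'csv';
CREATE TABLE dfs.ff.`result_csv` AS
SELECT A.*
FROM dfs.ff.`filea.json` A
LEFT JOIN dfs.ff.`fileb.json` B ON (A.quote = B.quote)
WHERE B.C IS NULL;
SQL

cat "$sqlfile"
```

Drill writes a CTAS table as a directory of files, so the CSV output would be found inside a result_csv directory in the workspace rather than in a single result.csv file.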

psql shortcut for frequently used queries? (like Unix "alias")

Is it possible to somehow create aliases (like Unix alias command) in psql?
I mean, not SQL FUNCTION, but local aliases to ease manual queries?
I don't know of any way to do this directly. There is only a workaround based on psql variables, but it has a lot of limits: using parameters with these queries is difficult.
postgres=# \set whoami 'SELECT CURRENT_USER;'
postgres=# :whoami
current_user
--------------
pavel
(1 row)
Pavel's answer is almost correct, except that you can use parameters in another way.
After
\set s 'select * from '
\set l ' limit 10;'
The following command
:s agent :l
is equivalent to
select * from agent limit 10;
According to http://www.postgresql.org/docs/9.0/static/app-psql.html
If an unquoted argument begins with a colon (:), it is taken as a psql
variable and the value of the variable is used as the argument
instead. If the variable name is surrounded by single quotes (e.g.
:'var'), it will be escaped as an SQL literal and the result will be
used as the argument. If the variable name is surrounded by double
quotes, it will be escaped as an SQL identifier and the result will be
used as the argument.
You can also use backquotes to run a shell command:
Arguments that are enclosed in backquotes (`) are taken as a command
line that is passed to the shell. The output of the command (with any
trailing newline removed) is taken as the argument value. The above
escape sequences also apply in backquotes.
How about using UDFs? You can create a UDF that returns a table (a set of rows), then query it like this: select * from udf();
It is not as clean, but it is better than nothing and it is portable. UDFs can take parameters too.
Why not use a view? Maybe views will help in your case.
This might help if you need to run frequent queries from the command line (not from the psql CLI).
Add this to .bash_profile or .bashrc:
POSTGRES_BIN=~/Postgres/bin
B_RED='\033[1;31m'
RESET='\033[0m'
psqlcommand="$POSTGRES_BIN/psql -U vignesh usersdb -q -c"
function psqlselectrows()
{
[ -z "$1" ] && echo -e "${B_RED}Argument 1 missing: Need table name${RESET}" ||
$psqlcommand "SELECT * from $1"
}
The above function selects all rows from the table passed as its argument.
Note:
Change the database name, as required.
The schema by default is public. To have another default schema, add the following line in ~/.psqlrc file.
SET SEARCH_PATH TO <schema_name>;
If the database is password protected, refer to this and make use of the secure method.
I have made some commands for my own use, in case they help.
psqlselectrows - To select rows from a table
psqlgettablecount - To get row count of a table
psqltruncatetable - To truncate a table, on prompt
psqlgettablesize - To get the size of a table
psqlgetvacuumdetails - To get vacuum details of a table
psqlsettings - To get default and modified settings configured for Postgres.
(All the above commands need table name as first argument)
#Colors
B_RED='\033[1;31m'
B_GREEN='\033[1;32m'
B_YELLOW='\033[1;33m'
RESET='\033[0m'
#Postgres Command With Params
psqlcommand="$POSTGRES_BIN/psql -U vignesh usersdb -q -c"
function psqlgettablesize()
{
[ -z "$1" ] && echo -e "${B_RED}Argument 1 missing: Need table name${RESET}" ||
$psqlcommand "select pg_size_pretty(pg_total_relation_size('$1')) as total_table_size, pg_size_pretty(pg_relation_size('$1')) as table_size, pg_size_pretty(pg_indexes_size('$1')) as index_size;";
}
function psqlgettablecount()
{
[ -z "$1" ] && echo -e "${B_RED}Argument 1 missing: Need table name${RESET}" ||
$psqlcommand "select count(*) from $1;"
}
function psqlgetvacuumdetails()
{
[ -z "$1" ] && echo -e "${B_RED}Argument 1 missing: Need table name${RESET}" ||
$psqlcommand "SELECT relname, n_live_tup, n_dead_tup, last_analyze::timestamp, analyze_count, last_autoanalyze::timestamp, autoanalyze_count, last_vacuum::timestamp, vacuum_count, last_autovacuum::timestamp, autovacuum_count FROM pg_stat_user_tables where relname='$1' and schemaname = current_schema();"
}
function psqltruncatetable()
{
[ -z "$1" ] && echo -e "${B_RED}Argument 1 missing: Need table name${RESET}" ||
{
read -p "$(echo -e ${B_YELLOW}"Are you sure to truncate table '$1' (y/n)? "${RESET})" choice
case "$choice" in
y|Y ) $psqlcommand "TRUNCATE $1;";;
n|N ) echo -e "${B_GREEN}Table '$1' not truncated${RESET}";;
* ) echo -e "${B_RED}Invalid option${RESET}";;
esac
}
}
function psqlsettings()
{
query="select * from pg_settings"
if [ "$1" != "" ]; then
query="$query where category like '%$1%'"
fi
query="$query ;"
$psqlcommand "$query"
if [ -z "$1" ]; then
echo -e "${B_YELLOW}Passing Category as first argument will filter the related settings.${RESET}"
fi
}
function psqlselectrows()
{
[ -z "$1" ] && echo -e "${B_RED}Argument 1 missing: Need table name${RESET}" ||
$psqlcommand "SELECT * from $1"
}