Execute an Impala query and get query time - impala

I want to be able to execute a number of Impala queries and return the time it took for each query to execute. Using the Impala shell, I can do this with the following command:
impl -q "select count(*) from database.table;"
This gives me the output
Using service name 'impala'
SSL is enabled. Impala server certificates will NOT be verified (set --ca_cert to change)
Connected to *****.************:21000
Server version: impalad version 2.6.0-cdh5.8.3 RELEASE (build c644f476b774db9db87a619628f7a6ecc5f843e0)
Query: select count(*) from database.table
+----------+
| count(*) |
+----------+
| 1130976 |
+----------+
Fetched 1 row(s) in 0.86s
I want to be able to fetch that last line and extract the time. It doesn't really matter how, which is why I haven't tagged a language. I have tried using grep like this:
impl -q "select count(*) from database.table" | grep -Po "\d+\.\d+"
But that does nothing except remove the table from the output. Running the query from a Python script via subprocess failed because impl could not be found as a command, and the same happened in Scala.

Oddly, impala-shell writes those messages to stderr rather than to stdout, so to capture the last line you have to append 2>&1, which redirects stderr to stdout:
impala-shell -q "query string" 2>&1 | grep -Po "\d+\.\d+(?=s)"
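Note that the positive lookahead (?=s) restricts matches to numbers immediately followed by an s, which avoids capturing the version numbers in the connection banner (such as 2.6 in 2.6.0-cdh5.8.3).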
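For running several queries, the same idea can be wrapped in a small function. This is a minimal sketch, not the answerer's code: extract_query_time is a hypothetical helper name, it uses sed instead of grep -P so it does not depend on PCRE support, and the sample text stands in for real impala-shell output so the block runs without a cluster.

```shell
# Hypothetical helper: print the elapsed seconds from a
# "Fetched N row(s) in X.XXs" line read on stdin.
extract_query_time() {
  sed -n 's/^Fetched [0-9]* row(s) in \([0-9][0-9.]*\)s$/\1/p'
}

# Stand-in for `impala-shell -q "..." 2>&1` (lines taken from the transcript above)
sample_output='Server version: impalad version 2.6.0-cdh5.8.3 RELEASE
Query: select count(*) from database.table
| 1130976 |
Fetched 1 row(s) in 0.86s'

elapsed=$(printf '%s\n' "$sample_output" | extract_query_time)
echo "query took ${elapsed}s"

# Against a real server the call would look like this (note the 2>&1):
# elapsed=$(impala-shell -q "select count(*) from database.table" 2>&1 | extract_query_time)
```

Anchoring the pattern on the whole "Fetched ..." line means the version banner cannot match, so no lookahead is needed in this variant.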

Related

Query working fine on PGAdmin but not on Terminal

I am planning to create a cron job that will change the password of my site on the 15th day of every month. This is my .sh file:
psql -U postgres -h (sample-host) (sample-db) -p (sample-port) -c "UPDATE web_user SET password_hash = '$2a$12$ohJ0j2Y9lRkO6Ld9MaiLuu7Q4hzYSr1IsM5SfY1SxAGk6fgn20aj2' WHERE email = 'email#email.com'"
When I run the query UPDATE web_user SET password_hash ='$2a$12$ohJ0j2Y9lRkO6Ld9MaiLuu7Q4hzYSr1IsM5SfY1SxAGk6fgn20aj2' WHERE email = 'email#email.com'; in pgAdmin, everything is fine: the update succeeds and the password is right. But when I run my .sh file on my machine (Ubuntu 18.04), or even run the command manually in the terminal, the stored value is just a.
There are no error messages or anything like that. Is there something that I missed? BTW, the PostgreSQL version is 13.4.
Update: I just found out that special characters are causing the problem. It seems that the psql command does not allow special characters. The problem is that I can't find any resources about this.
For the special characters you need a workaround.
I tried this in PowerShell; it may also work in a Linux terminal.
Before:
postgres=# select * from web_user;
pass | email
------+-------
(0 rows)
I used echo to output the password_hash, since it contains special characters:
psql -U postgres -h localhost -d postgres -p 5432 -c "UPDATE web_user SET pass = '$(echo '$2a$12$ohJ0j2Y9lRkO6Ld9MaiLuu7Q4hzYSr1IsM5SfY1SxAGk6fgn20aj2')' WHERE email = 'email#email.com'"
Output:
postgres=# select * from web_user;
pass | email
--------------------------------------------------------------+-----------------
$2a$12$ohJ0j2Y9lRkO6Ld9MaiLuu7Q4hzYSr1IsM5SfY1SxAGk6fgn20aj2 | email#email.com
(1 row)
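The underlying cause in a POSIX shell is that inside double quotes the shell expands $2, $1 and $ohJ0... as parameters before psql ever sees the string, so it is the shell, not psql, that drops the characters. A minimal sketch of an alternative to the echo workaround follows, assuming bash or sh; the variable names hash and sql are illustrative, and the psql call is left commented out so the block runs without a server.

```shell
# Single quotes keep every $ literal, so the shell never treats
# $2, $1 or $ohJ0... as parameters to expand.
hash='$2a$12$ohJ0j2Y9lRkO6Ld9MaiLuu7Q4hzYSr1IsM5SfY1SxAGk6fgn20aj2'

# Expanding ${hash} inside double quotes inserts its value verbatim;
# the inserted value is not scanned again for further $ expansions.
sql="UPDATE web_user SET password_hash = '${hash}' WHERE email = 'email#email.com'"
printf '%s\n' "$sql"

# With a reachable server, the statement could then be passed unchanged:
# psql -U postgres -h localhost -d postgres -p 5432 -c "$sql"
```

Escaping each dollar sign as \$ inside the original double-quoted string would have the same effect; the variable form is simply easier to read.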

Store my "Sybase" query result /output into a script variable

I need a variable to hold the result retrieved from a (Sybase) query that's run in a script.
I have built the following script; it works fine and I get the desired result when I run it.
Script: EXECUTE_DAILY:
isql -U database_dba -P password <<EOF!
select the_name from table_name where m_num="NUMB912" and date="17/01/2019"
go
quit
EOF!
echo "All Done"
Output:
"EXECUTE_DAILY" 97 lines, 293 characters
user#zp01$ ./EXECUTE_DAILY
the_name
-----------------------------------
NAME912
(1 row affected)
But now I would like to keep the output (the_name: NAME912) in a variable.
So far this is basically what I'm trying, with no success:
variable=$(isql -U database_dba -P password -se "select the_name from table_name where m_num="NUMB912" and date="17/01/2019" ")
But it is not working; I can't save NAME912 in a variable.
You need to parse the output for the desired string/piece-of-data that you wish to store in your variable. I tend to make my life a bit easier by making sure I can easily/quickly search/parse out what I want.
Keeping a few issues in mind ...
I tend to use isql -s"|" -w10000 to ensure (most of the time) that a) the result set has all columns delimited with the pipe ('|') and b) a single row of data does not span multiple rows; the pipe delimiter makes it easier to parse out columns that may contain white space; obviously (?) use a different delimiter if a pipe may be part of your actual data
to make parsing of the isql output a bit easier I tend to add a unique, grep-able (literal) string to the rows that I'm looking to search/parse
some databases (eg, SQLAnywhere, Oracle) tend to mimic a literal value as the column header if said literal string has not been assigned an explicit alias/header; this means that if you do a simple search on your literal string then you'll get a match for the result set header as well as the actual data row
I tend to capture all isql output to a temporary file; this allows for easier follow-on processing, eg, error checking, data parsing, dumping contents to a logfile, etc
So, with the above in mind my code typically looks something like:
$ outfile=/tmp/.$$.isql.outfile
$ isql -s"|" -w10000 -U database_dba -P password <<-EOF > ${outfile} 2>&1
-- 'GREP'||'ME' ensures that 'GREPME' only shows up in the data row
select 'GREP'||'ME',the_name
from table_name
where m_num = "NUMB912"
and date = "17/01/2019"
go
EOF
$ cat ${outfile}
... snip ...
|'GREP'||'ME'|the_name | # notice the default column header = 'GREP'||'ME' which won't match my search for 'GREPME'
|------------|----------|
|GREPME |NAME912 | # this is the line I want to search/parse
... snip ...
$ read -r namevar < <(egrep GREPME ${outfile} | awk -F"|" '{print $3}')
$ echo ${namevar}
NAME912
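The parsing step can be tried without a Sybase server. The sketch below is an illustration under assumptions: the here-document is a hand-written stand-in for captured isql -s"|" output, and the awk filter is a slightly stricter variant of the egrep/awk pipeline above that compares the second field exactly and trims the column padding.

```shell
# Stand-in for captured isql output (hypothetical sample, no server needed)
outfile=$(mktemp)
cat > "$outfile" <<'EOF'
|'GREP'||'ME'|the_name  |
|------------|----------|
|GREPME      |NAME912   |

(1 row affected)
EOF

# Keep only rows whose 2nd pipe-delimited field is exactly GREPME
# (after trimming padding), then print the trimmed 3rd field.
namevar=$(awk -F'|' '
  { gsub(/^ +| +$/, "", $2) }
  $2 == "GREPME" { gsub(/^ +| +$/, "", $3); print $3 }
' "$outfile")

rm -f "$outfile"
echo "$namevar"
```

The header row splits into 'GREP', an empty field and 'ME', so it never equals GREPME and only the data row is returned.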

How To Send Output To Terminal Window with Hive Script

I am familiar with storing output/results for a Hive Query to file, but what command do I use in the script to display the results of the HQL to the terminal?
Normally Hive prints query results to stdout; if stdout is not redirected, they are displayed on the console. You do not need any special command for this.
If you want to display the results on the console and store them in a file at the same time, use the tee command:
hive -e "use mydb; select * from test_t" | tee ./results.txt
OK
123 {"value(B)":"Bye"}
123 {"value(G)":"Jet"}
Time taken: 1.322 seconds, Fetched: 2 row(s)
Check that the file contains the results:
cat ./results.txt
123 {"value(B)":"Bye"}
123 {"value(G)":"Jet"}
See here: https://ru.wikipedia.org/wiki/Tee
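In the transcript above, the OK and Time taken lines show up on the console but not in results.txt, which is consistent with Hive writing them to stderr while tee copies only stdout. A minimal sketch of that behaviour, using a hypothetical fake_hive function in place of a real hive -e call:

```shell
# Hypothetical stand-in for `hive -e "..."`:
# result rows go to stdout, status lines go to stderr.
fake_hive() {
  echo "OK" >&2
  printf '123\t{"value(B)":"Bye"}\n'
  printf '123\t{"value(G)":"Jet"}\n'
  echo "Time taken: 1.322 seconds, Fetched: 2 row(s)" >&2
}

results=$(mktemp)

# tee copies stdout to the file and passes it on to the terminal;
# stderr bypasses the pipe and is only shown on the terminal.
fake_hive | tee "$results"

echo "lines saved: $(wc -l < "$results" | tr -d ' ')"
```

To keep the status lines in the file as well, stderr would have to be merged first, for example with 2>&1 before the pipe.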
This was my output:
There was no output, because I had yet to properly use the LOAD DATA INPATH command to my hdfs. After loading, I received output from the SELECT statement in the script.

Why doesn't chaining multiple "find" commands work with sc query

I wrote the command line
sc query PlugPlay | FIND "SERVICE_NAME" | FIND "STATE"
to list only the service name and its status, but it's not giving any output.
Please tell me how to list only the service name and its STATE (running or stopped).
You can do this with Windows' built-in findstr command. If you give it multiple words to find, separated by spaces, it will print lines that match any word (i.e. findstr "a b" is equivalent to grep -E 'a|b').
sc query plugplay | findstr "SERVICE_NAME STATE"
Running two pipes like that is not an "or" operation, it is an "and" operation. It will only output lines that include both SERVICE_NAME and STATE (which will be none, so no output is correct). If you run just the first find it gives
C:\>sc query PlugPlay | FIND "SERVICE_NAME"
SERVICE_NAME: PlugPlay
C:\>
and thus the STATE information is already removed.
The Windows find command is too simple and limited to do what you want, but it can be achieved using the Unix grep command, for instance from Cygwin:
$ sc query PlugPlay | grep -E 'SERVICE_NAME|STATE'
SERVICE_NAME: PlugPlay
STATE : 4 RUNNING
$
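The difference between the two forms can be reproduced on any system with grep. This is a small sketch under assumptions: the sc_output text is an abbreviated, hand-written sample of sc query output, not captured from a real machine.

```shell
# Abbreviated, hand-written sample of `sc query PlugPlay` output
sc_output='SERVICE_NAME: PlugPlay
        TYPE               : 20  WIN32_SHARE_PROCESS
        STATE              : 4  RUNNING'

# Chained filters act as AND: a line must contain both words to survive.
# `|| true` keeps the script going when grep finds nothing.
and_result=$(printf '%s\n' "$sc_output" | grep 'SERVICE_NAME' | grep 'STATE' || true)

# One filter with alternation acts as OR: either word is enough.
or_result=$(printf '%s\n' "$sc_output" | grep -E 'SERVICE_NAME|STATE')

echo "AND matched: [${and_result}]"
echo "OR matched:"
printf '%s\n' "$or_result"
```

No single line holds both words, so the chained form prints nothing, while the alternation keeps the SERVICE_NAME and STATE lines.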

How to delete last row in output file generated by nzsql

I am trying to delete the last row in the file generated by nzsql. Please find the query below.
nzsql -A -c "SELECT * FROM AM_MAS_DIVISION_DIM" > abc.out
When I execute this query, the output is generated and stored in abc.out. It includes both the header columns and some time information at the bottom. I don't need the metadata at the bottom and want to keep only the header columns and data. How can I do this using only nzsql? Please help. Thanks in advance.
Use the -r flag in the nzsql command to avoid getting that row [assuming the metadata referred to in the question is the row count summary line, e.g. (3 rows)]:
-r Suppresses the row count that is displayed at the end of the SQL output.
reference: http://pic.dhe.ibm.com/infocenter/ntz/v7r0m3/index.jsp?topic=%2Fcom.ibm.nz.adm.doc%2Fr_sysadm_nzsql_command.html
Why don't you just pipe the output to a unix command to remove it? I think something like this will work:
nzsql -A -c "SELECT * FROM AM_MAS_DIVISION_DIM" | sed '$d' > abc.out
This seems to be a commonly recommended way to remove the last line (although ed, gawk, and other tools can handle it too).
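The effect of sed '$d' can be checked on sample text before running it against real output. In this sketch the column names and rows are made up for illustration; only the trailing (2 rows) line mimics the row count summary discussed above.

```shell
# Hypothetical sample of unaligned (-A) output ending in a row-count line
sample='DIVISION_ID|DIVISION_NAME
1|North
2|South
(2 rows)'

# In sed, the address $ selects the last input line and d deletes it.
trimmed=$(printf '%s\n' "$sample" | sed '$d')

printf '%s\n' "$trimmed"
```

Because sed '$d' removes whatever the final line happens to be, it would also drop a data row if the summary line were already suppressed with -r; the two approaches should not be combined.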