KSH-update : Get rows affected - sql

Hello, I'm querying a Teradata database like this:
for var in `db2 -x "$other_query"`;
do
query_update_date="update test SET date =Null WHERE
name_test='$var '"
db2 -v "$query_update_date"
done
My query is executed, but I would like to print query_update_date only when one or more rows are actually affected (changed) by the update.
Example:
Suppose the first query of the loop is:
query_update_date="update test SET date =Null WHERE
name_test='John'"
and the second query of the loop is:
query_update_date="update test SET date =Null WHERE
name_test='Jeff'"
and my table looks like this before the queries:
name_test date
Jeff 01/07/2016
John Null
After the queries:
name_test date
Jeff Null
John Null
The date for John was already null, so that row was not changed by the update.
However,
db2 -v "$query_update_date"
prints all of my queries. For the example above, I want my logs to contain only
query_update_date="update test SET date =Null WHERE
name_test='Jeff'"

Take snapshots of the table before and after you execute the query. Use whatever tools you like; for example (assuming SQL), SPOOL a "SELECT * FROM T" query into one file beforehand and into another file afterwards. Then use the UNIX diff command to compare the two files and count the lines of its output:
LINES_OUT=$(diff oldResults newResults | wc -l)
if [[ $LINES_OUT -ne 0 ]]
then
# at least one row changed: log the query, however you do that
fi
If $LINES_OUT is non-zero, the update changed something, so log the query; otherwise, don't.
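The snapshot comparison described above could be wrapped in two small helpers. This is only a sketch: the function names snapshots_differ and log_if_changed are made up for illustration, it relies on the exit status of diff rather than counting output lines, and how the two snapshot files are produced is left open.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of any tool.

# snapshots_differ OLD NEW
# Returns 0 when diff reports a difference (or cannot read a file),
# and 1 when the two snapshot files are identical.
snapshots_differ() {
    if diff "$1" "$2" >/dev/null 2>&1; then
        return 1
    fi
    return 0
}

# log_if_changed OLD NEW QUERY
# Prints QUERY only when the two snapshots differ.
log_if_changed() {
    if snapshots_differ "$1" "$2"; then
        printf '%s\n' "$3"
    fi
}
```

Inside the loop, one would write a snapshot before and after the update and then call log_if_changed oldResults newResults "$query_update_date". Note that this costs two full reads of the table per update, so it is only reasonable for small tables.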

Related

hive condition in shell script not working as expected

Condition within Hive query in shell script not working properly
I wrote a shell script to send an email alert based on the outcome of a query, but only the second part (the else branch) ever gets sent, regardless of the value of the variable. Please help me check it. Below is the script:
#!bin/sh
strata=$(impala connection string -q "SELECT calendar, COUNT(*) row_count FROM TABLE
WHERE calendar = CAST(from_unixtime(unix_timestamp(now() - interval 1 days), 'yyyyMMdd') AS INT)
GROUP BY calendar
ORDER BY calendar DESC;")
if [ $strata -eq 0 ] ;then
echo -e 'The table HAS NOT been refreshed today, kindly hold' | mailx -s 'Alerting:Refresh_Status' -c email.address -- email.address
else
echo -e The number of records is $strata | mailx -s 'Alerting: Refresh_Status' -c email.address -- email.address
fi
The value of the variable should be either 0 or the number of records in the table, and the email should be sent based on that. But the else part is the only one that gets sent, regardless of the result.
Correct me if I'm wrong, but I think the strata variable holds the whole result of your query (the calendar value as well as the count), so it is never equal to 0 and the if statement always falls through to the else branch.
I think the query should look like this:
SELECT COUNT(*) row_count FROM TABLE
WHERE calendar = CAST(from_unixtime(unix_timestamp(now() - interval 1 days), 'yyyyMMdd') AS INT)
You just need to exclude the calendar from your select statement.
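Because [ $strata -eq 0 ] only behaves as intended when the variable holds a single integer, it can help to check the captured value before comparing it. The following is a minimal sketch; classify_count is a hypothetical helper name, and it assumes the client has been run in a mode that prints only the bare count (no header or table borders).

```shell
#!/bin/sh
# Sketch only: classify_count is a hypothetical helper.

# classify_count VALUE
# Prints "invalid" when VALUE is empty or contains anything but digits,
# "empty" when it is 0, and "refreshed" when it is a positive integer.
classify_count() {
    case "$1" in
        ''|*[!0-9]*)
            echo invalid
            ;;
        *)
            if [ "$1" -eq 0 ]; then
                echo empty
            else
                echo refreshed
            fi
            ;;
    esac
}
```

The script could then branch on "$(classify_count "$strata")" and treat "invalid" as a sign that the query returned more than one column, extra formatting, or nothing at all.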

assign date value dynamically to hive query

I have a partitioned table in Hive and I want to assign the value of the date column dynamically (yesterday's date). Below is my current query, but it's not working.
ALTER TABLE db1.table1 ADD IF NOT EXISTS PARTITION (loaddate="date_sub(FROM_UNIXTIME(UNIX_TIMESTAMP(),'yyyy-MM-dd') , 1)") LOCATION "hdfs://location1/abc/rawdata/externalhivetables/downloading/data";
Instead of the date value, the partition column contains the complete expression as a string:
select downloading.loaddate From downloading limit 3;
+------------------------------------------------------------+
| downloading.loaddate |
+------------------------------------------------------------+
| date_sub(FROM_UNIXTIME(UNIX_TIMESTAMP(),'yyyy-MM-dd') , 1) |
| date_sub(FROM_UNIXTIME(UNIX_TIMESTAMP(),'yyyy-MM-dd') , 1) |
| date_sub(FROM_UNIXTIME(UNIX_TIMESTAMP(),'yyyy-MM-dd') , 1) |
The Hive shell cannot yet assign a variable from the result of a query, so two steps are needed:
First, use a shell script to execute the query and store the result in a variable.
Then, initialize the Hive shell/script with that variable.
bash$ var=`hive -S -e "select date_sub(FROM_UNIXTIME(UNIX_TIMESTAMP(),'yyyy-MM-dd') , 1);"`
bash$ echo $var
Now initialize the hive/beeline shell with the value of var:
bash$ hive -hiveconf dd=$var
hive> ALTER TABLE db1.table1 ADD IF NOT EXISTS PARTITION (loaddate='${hiveconf:dd}') LOCATION "hdfs://location1/abc/rawdata/externalhivetables/downloading/data";
Alternatively, use the shell to calculate the date and substitute it using shell variable substitution:
bash$ dt=$(date -d '-1 day' +%Y-%m-%d)
bash$ hive -e "ALTER TABLE db1.table1 ADD IF NOT EXISTS PARTITION (loaddate='$dt') LOCATION 'hdfs://location1/abc/rawdata/externalhivetables/downloading/data'"
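The shell-substitution approach can be made a little safer by building the statement in a function that first checks the shape of the date. This is a sketch under assumptions: build_partition_sql is a hypothetical helper, and the pattern only checks for the form YYYY-MM-DD, not that the date exists in the calendar.

```shell
#!/bin/sh
# Sketch only: build_partition_sql is a hypothetical helper.

# build_partition_sql TABLE LOADDATE LOCATION
# Prints an ALTER TABLE ... ADD PARTITION statement, or returns 1
# when LOADDATE does not look like YYYY-MM-DD.
build_partition_sql() {
    case "$2" in
        [0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9])
            ;;
        *)
            echo "unexpected date format: $2" >&2
            return 1
            ;;
    esac
    printf "ALTER TABLE %s ADD IF NOT EXISTS PARTITION (loaddate='%s') LOCATION '%s'\n" \
        "$1" "$2" "$3"
}
```

With GNU date, the result could be passed on as hive -e "$(build_partition_sql db1.table1 "$(date -d '-1 day' +%Y-%m-%d)" "$location")", where location is a shell variable holding the HDFS path.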

How to set intervalstyle = iso_8601 and then run a select query in golang

I have a table with an interval column, something like this.
CREATE TABLE validity (
window INTERVAL NOT NULL
);
Assume the value stored is 'P3DT1H' which is in iso_8601 format.
When I try to read the value, it comes in regular postgres format.
3 days 01:00:00
However I want the value in iso_8601 format. How can I achieve it?
so=# CREATE TABLE validity (
w INTERVAL NOT NULL
);
CREATE TABLE
so=# insert into validity values ('3 days 01:00:00');
INSERT 0 1
You are probably looking for intervalstyle:
so=# set intervalstyle to iso_8601;
SET
so=# select w From validity;
w
--------
P3DT1H
(1 row)
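It can of course be set per transaction, session, role, database, or cluster.
You can run a SET intervalstyle query to set the style to iso_8601. Then, when you output the results, they will be in ISO 8601 format.
// Note: database/sql pools connections, so the SET and the SELECT must run on
// the same connection (e.g. one obtained with s.db.Conn(ctx), or inside a transaction).
_, err := s.db.Exec("SET intervalstyle='iso_8601'")
res, err := s.db.Query("select interval '1d1m'")
// res contains a row with P1DT1M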
If you are looking for a way to change intervalstyle for all sessions on a server level, you can update it in your configuration file:
-- connect to your psql using whatever client, e.g. cli and run
SHOW config_file;
-- in my case: /usr/local/var/postgres/postgresql.conf
Edit this file and add the following line:
intervalstyle = 'iso_8601'
In my case the file already had a commented out line with intervalstyle, and its value was postgres. You should change it and restart the service.
That way you won't have to change the style from golang each time you run a query.

Create table name in Hive using variable substitution

I'd like to create a table name in Hive using variable substitution.
E.g.
SET market = "AUS";
create table ${hiveconf:market_cd}_active as ... ;
But it fails. Any idea how it can be achieved?
You should use backticks (``) around the name for that, like:
SET market=AUS;
CREATE TABLE `${hiveconf:market}_active` AS SELECT 1;
DESCRIBE `${hiveconf:market}_active`;
Example run of script.sql from beeline:
$ beeline -u jdbc:hive2://localhost:10000/ -n hadoop -f script.sql
Connecting to jdbc:hive2://localhost:10000/
...
0: jdbc:hive2://localhost:10000/> SET market=AUS;
No rows affected (0.057 seconds)
0: jdbc:hive2://localhost:10000/> CREATE TABLE `${hiveconf:market}_active` AS SELECT 1;
...
INFO : Dag name: CREATE TABLE `AUS_active` AS SELECT 1(Stage-1)
...
INFO : OK
No rows affected (12.402 seconds)
0: jdbc:hive2://localhost:10000/> DESCRIBE `${hiveconf:market}_active`;
...
INFO : Executing command(queryId=hive_20190801194250_1a57e6ec-25e7-474d-b31d-24026f171089): DESCRIBE `AUS_active`
...
INFO : OK
+-----------+------------+----------+
| col_name | data_type | comment |
+-----------+------------+----------+
| _c0 | int | |
+-----------+------------+----------+
1 row selected (0.132 seconds)
0: jdbc:hive2://localhost:10000/> Closing: 0: jdbc:hive2://localhost:10000/
Markovitz's criticisms are correct, but do not produce a correct solution. In summary, you can use variable substitution for things like string comparisons, but NOT for things like naming variables and tables. If you know much about language compilers and parsers, you get a sense of why this would be true. You could construct such behavior in a language like Java, but SQL is just too crude.
Running that code produces an error, "cannot recognize input near '$' '{' 'hiveconf' in table name". (I am running Hortonworks, Hive 1.2.1000.2.5.3.0-37.)
I spent a couple hours Googling and experimenting with different combinations of punctuation, different tools ranging from command line, Ambari, and DB Visualizer, etc., and I never found any way to construct a table name or a field name with a variable value. I think you're stuck with using variables in places where you need a string literal, like comparisons, but you cannot use them in place of reserved words or existing data structures, if that makes sense. By example:
--works
drop table if exists user_rgksp0.foo;
-- Does NOT work:
set MY_FILE_NAME=user_rgksp0.foo;
--drop table if exists ${hiveconf:MY_FILE_NAME};
-- Works
set REPORT_YEAR=2018;
select count(1) as stationary_event_count, day, zip_code, route_id from aaetl_dms_pub.dms_stationary_events_pub
where part_year = '${hiveconf:REPORT_YEAR}'
-- Does NOT Work:
set MY_VAR_NAME='zip_code'
select count(1) as stationary_event_count, day, '${hiveconf:MY_VAR_NAME}', route_id from aaetl_dms_pub.dms_stationary_events_pub
where part_year = 2018
The quotes should be removed, and you're using the wrong variable name (market_cd instead of market):
SET market=AUS; create table ${hiveconf:market}_active as select 1;
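If hiveconf substitution in the table name does not work on a given Hive version, as one answer above reports, another option is to build the table name in the shell before the statement reaches Hive. A minimal sketch, assuming the market code should consist only of letters, digits, and underscores; build_ctas is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch only: build_ctas is a hypothetical helper.

# build_ctas MARKET
# Prints a CREATE TABLE statement whose table name is built in the shell.
# Rejects anything that is not a plain identifier so that the value
# cannot inject extra SQL.
build_ctas() {
    case "$1" in
        ''|*[!A-Za-z0-9_]*)
            echo "invalid market code: $1" >&2
            return 1
            ;;
    esac
    printf 'CREATE TABLE `%s_active` AS SELECT 1;\n' "$1"
}
```

The output could then be passed to the client, for example hive -e "$(build_ctas AUS)".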

how to assign a query result to a shell variable

I have an sql query that returns a date.
I call this query from a shell script and would like to assign this value to the variable called datestart (and use it later). Here is my code. Without the datestart assignment the query works fine.
#!/bin/sh
firstname="-Upgsql"
dbname="statcoll"
portname="-p5438"
datestart=(psql $firstname $portname $dbname<< EOF
SELECT MIN(latestrefdate) FROM (SELECT MAX(referencedate) AS latestrefdate FROM statistics WHERE transactionname IN(SELECT DISTINCT transactionname FROM statistics WHERE platform = 'Smarties')GROUP BY transactionname) as earliest;
EOF
)
echo $datestart
But the result is this:
Syntax error: word unexpected (expecting ")").
I have no idea where I should insert that closing bracket. Any hint is appreciated.
Plain parentheses do not run a command in a variable assignment; you need command substitution, either $(...) (POSIX, works in both sh and bash) or backticks `...`.
Try this:
#!/bin/sh
firstname="-Upgsql"
dbname="statcoll"
portname="-p5438"
# firstname and portname already include the -U and -p flags, so they are passed as-is (unquoted)
datestart=`psql -t --pset="footer=off" $firstname $portname -d "$dbname" <<EOF
SELECT MIN(latestrefdate) FROM (SELECT MAX(referencedate) AS latestrefdate FROM statistics WHERE transactionname IN (SELECT DISTINCT transactionname FROM statistics WHERE platform = 'Smarties') GROUP BY transactionname) as earliest;
EOF
`
echo "$datestart"
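In its default aligned output mode, psql -t typically pads the value with whitespace, so the captured variable may start with a space. A small sketch of trimming it before use follows; trim is a hypothetical helper, not part of psql.

```shell
#!/bin/sh
# Sketch only: trim is a hypothetical helper.

# trim VALUE
# Prints VALUE with leading and trailing whitespace removed.
trim() {
    printf '%s' "$1" | sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//'
}
```

After the assignment above, datestart=$(trim "$datestart") leaves only the date itself. Adding psql's -A (unaligned output) flag next to -t is another way to avoid the padding.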