Xargs, sqlplus and quote nightmare? - sql

I have one big file containing data, for example :
123;test/x/COD_ACT_008510/descr="R08-Ballon d''eau"
456;test/x/COD_ACT_008510/descr="R08-Ballon d''eau"
In reality there are many more columns, but I have simplified here.
I want to process each line and run some sqlplus statements with it.
Let's say I have one table with two columns, like this:
ID | CONTENT
123 | test/x/COD_ACT_333/descr="Test 1"
456 | test/x/COD_ACT_444/descr="Test 2"
Let's say I want to update the CONTENT value of both rows to get this:
ID | CONTENT
123 | test/x/COD_ACT_008510/descr="R08-Ballon d''eau"
456 | test/x/COD_ACT_008510/descr="R08-Ballon d''eau"
In reality I have a lot of data and complex statements to execute, so I have to use sqlplus rather than a tool like SQL*Loader.
So I process the input file with 5 parallel processes, one line at a time, and set "\n" as the delimiter to avoid quote conflicts:
cat input_file.txt | xargs -n 1 -P 5 -d '\n' ./my_script.sh &
In "my_script.sh" I have :
#!/bin/bash
line="$1"
sim_id=$(echo "$line" | cut -d';' -f1)
content=$(echo "$line" | cut -d';' -f2)
sqlplus -s $DBUSER/$DBPASSWORD@$DBHOST:$DBPORT/$DBSCHEMA @updateRequest.sql "$sim_id" "'"$content"'"
And in the updateRequest.sql file (just containing a test) :
set heading off
set feed off
set pages 0
set verify off
update T_TABLE SET CONTENT = '&2' where ID = '&1';
commit;
As a result, I get:
01740: missing double quote in identifier
If I set the "verify" parameter to on in the SQL script, I can see:
old 1: select '&2' from dual
new 1: select 'test/BVAL/COD_ACT_008510/descr="R08-Ballon d'eau"' from dual
It seems that one of the two single quotes (used to escape the apostrophe) goes missing...
I have tried everything, but each time I get an error with a single or double quote, either on the bash side or on the SQL side... it's endless :/
I need the double quotes for the "descr" part, and I need to handle the apostrophe (single quote) in the content.
For info, the input file is generated automatically, but I can change its format.

With GNU Parallel it looks like this:
dburl=oracle://$DBUSER:$DBPASSWORD@$DBHOST:$DBPORT/$DBSCHEMA
cat big |
parallel -j5 -v --colsep ';' -q sql $dburl "update T_TABLE SET CONTENT = '{=2 s/'/''/g=}' where ID = '{1}'; commit;"
But this only works if the values contain no ; because --colsep ';' splits on every semicolon. Given this input it will do the wrong thing:
456;test/x/COD_ACT_008510/descr="semicolon;in;value"
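One way around the semicolon limitation is to split each line only at the first ; and to double the apostrophes in the shell before any SQL is built. The following is a minimal sketch, not a tested production script: build_update is a hypothetical helper name, and it assumes the input file carries raw apostrophes (d'eau rather than the already doubled d''eau), which is possible since the file format can be changed. It also assumes the ID never contains a quote. The generated text is meant to be piped to sqlplus on standard input, so nothing passes through &1/&2 substitution; set define off keeps SQL*Plus from treating & in the data as a substitution variable.

```shell
# build_update LINE: hypothetical helper that prints SQL for one "id;content" line.
build_update() {
    bu_id=${1%%;*}       # text before the first ';'
    bu_content=${1#*;}   # everything after the first ';', later semicolons included
    # Double every single quote so the value is a valid SQL string literal.
    bu_content=$(printf '%s' "$bu_content" | sed "s/'/''/g")
    printf 'set define off\n'
    printf "update T_TABLE set CONTENT = '%s' where ID = '%s';\n" "$bu_content" "$bu_id"
    printf 'commit;\n'
}

# Example call; prints the three generated lines instead of running them.
build_update "456;test/x/descr=\"semi;colon d'eau\""
```

In my_script.sh the output could then be piped to sqlplus -s with the usual connect string instead of passing the values as script arguments.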

Related

Print variable with each line of while-read command

I'm trying to set up a monitoring script that would take all the databases we have, show their tables, and do some arithmetic on them.
I have this command:
impala-shell -i impalad -q " show databases;" -B | while read a; do impala-shell -q "show tables in ${a}" -B -i impalad; done
That produces following output:
Query: show tables in database1
table1
table2
How should I format the output to display the database name ($a) with each table? I tried echoing it or using ||, but this only prints the database name after all the tables have been displayed. Or is there a way to pass the variable to awk?
Desired output would look like this:
database1.table1
database1.table2
It looks like the output of the show tables ... command will have a 1-line header, followed by the list of table names.
You could skip the first line by piping to tail -n +2,
and then use another while loop to echo the database name and table name pairs in the desired format:
impala-shell -i impalad -q " show databases;" -B | while read -r a; do
    # skip the 1-line header, then prefix each table with the database name
    impala-shell -q "show tables in ${a}" -B -i impalad | tail -n +2 | while read -r table; do
        echo "$a.$table"
    done
done
You could also do
impala-shell -q ... | awk -v db="$a" 'NR > 1 {print db "." $0}'
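To see the awk variant work without a running Impala instance, here is a small sketch on simulated output. prefix_with_db is a hypothetical wrapper name, and the sample text only mimics the output shown in the question (one header line followed by table names).

```shell
# prefix_with_db DB: read "show tables" output on stdin, skip the header line,
# and print each remaining line as DB.table.
prefix_with_db() {
    awk -v db="$1" 'NR > 1 { print db "." $0 }'
}

# Simulated impala-shell output, copied from the question.
printf 'Query: show tables in database1\ntable1\ntable2\n' | prefix_with_db database1
```

Inside the outer loop this would be used as: impala-shell -q "show tables in ${a}" -B -i impalad | prefix_with_db "$a"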

Remove header from query result in bq command line

I have a query $(bq query --format=csv "select value from $BQConfig where parameter = 'Columnwidth'") .
The output of the query in csv format is :
value
3 4 6 8
Here I want to get only the result 3 4 6 8, not value, which is just a header.
I have gone through the Google documentation and found that --noprint_header works only for bq extract. I didn't find anything for bq query.
If you are on a bash shell, you could use sed or awk to skip the first lines:
bq query --format=csv "SELECT 1 x" | sed "2 d"
Or:
bq query --format=csv "SELECT 1 x" | awk 'NR>2'
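The two commands above are not equivalent, which a run on simulated output makes visible. This sketch assumes (it is not verified against bq here) that the CSV output starts with one blank line followed by the header; sample_output is a hypothetical stand-in for the bq query call.

```shell
# Simulated query output: assumes a blank line, then the header, then the data.
sample_output() {
    printf '\nvalue\n3 4 6 8\n'
}

sample_output | sed '2d'     # deletes only line 2 (the header); the blank line 1 stays
sample_output | awk 'NR>2'   # drops lines 1 and 2, leaving only the data
sample_output | tail -n +3   # same result as the awk version
```

If the real output has no leading blank line, the line numbers shift by one (sed '1d', awk 'NR>1', tail -n +2).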
You can use the --skip_leading_rows argument (source : Create a table from a file)

SQL query over multiple tables in one database

I have the following problem:
I have a bunch of Tables in a Vertica Database say:
+------------+
| Tablenames |
+------------+
| a_1 |
| a_2 |
| a_34 |
| b_1 |
| b_4 |
+------------+
The tables are not exactly the same but have mostly similar entries. And now I want to make one query over all tables that start with a_ (a_1 a_2 a_34).
Is there a way to search through all the tables for the string a_ in their name, output some sort of list, and then either use a for loop or a join operation with the generated list?
Once I get the new table (let's call it temp_table) that has all the table names that start with a_, I would like to run one query over all of them, something like this (Matlab syntax):
for ii=1:length(temp_table)
Data{ii}=SELECT * FROM temp_table(ii) WHERE paste_condition_here
end
So Data should be a new table that appends the new rows with each iteration.
@Nirjihar - there is no information_schema (you need v_catalog); you are confusing it with MySQL.
select TABLE_NAME from v_catalog.TABLES where TABLE_NAME like 'a_%';
This will return all tables matching the pattern 'a_%'. Note that _ is itself a single-character wildcard in LIKE, so this pattern also matches names such as ab1.
Just as a complement:
In Vertica you won't have loops! For this you have to use a UDP (user-defined procedure), which can be written in the language of your choice (shell, Java, R, C++).
I will go ahead and post one model here for you:
1 - Shell proc - to be created in the procedures folder
#!/bin/bash
. /home/dbadmin/.profile
/opt/vertica/bin/vsql -U $username -w $password -t -o /tmp/query.sql -c"
SELECT
' select * from '
||TABLE_SCHEMA
||'.'
||TABLE_NAME
||';'
FROM
v_catalog.TABLES where TABLE_NAME like '%$1%'
"
/opt/vertica/bin/vsql -U $username -w $password -F $'|' -At -o /tmp/query_output.csv -f /tmp/query.sql
2 - change sh file privs
chmod 4750 query_table.sh
3 - make sure you have the .profile file populated accordingly
. /home/dbadmin/.profile
#!/bin/bash
username=dbadmin
password=secrectpasswd
export username
export password
Note: this avoids repeating the password in every script and keeps the plain-text password in a single place.
4 - Register the UDP with Vertica Catalog
. /home/dbadmin/.profile
admintools -t install_procedure -f /vertica/catalog//procedures/query_table.sh -d -p $password
5 -Create the UDP inside the database
. /home/dbadmin/.profile
/opt/vertica/bin/vsql -U $username -w $password -c "CREATE PROCEDURE dba.query_table(table_name varchar) AS 'query_table.sh' LANGUAGE 'external' USER 'dbadmin';"
6 - execute the proc
select dba.query_table('you possible table name here');
7 - check results
a - you will get a file with the query
b - one file with the exported data(csv '|' delimited).
I have a similar post here:
http://www.aodba.com/create-vertica-schema-fly/
To get all the tables whose names start with a_:
select TABLE_NAME from INFORMATION_SCHEMA.TABLES where TABLE_NAME like 'a_%'
Then you can alias this and join the list or each table as you like.
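The "loop over the list of tables" idea from the question can also be sketched purely on the client side: take the table names returned by the catalog query and generate one SELECT per table. gen_queries is a hypothetical helper name, the sample names come from the question, and "x > 0" is only a placeholder condition. Names are interpolated as-is, so this is only reasonable for trusted catalog output.

```shell
# gen_queries CONDITION: read table names on stdin, print one SELECT per table.
gen_queries() {
    while IFS= read -r tbl; do
        [ -n "$tbl" ] || continue   # ignore empty lines
        printf 'select * from %s where %s;\n' "$tbl" "$1"
    done
}

# Table names as they might come back from the catalog query.
printf 'a_1\na_2\na_34\n' | gen_queries "x > 0"
```

The generated statements can be written to a file and run with vsql -f, in the same way the shell procedure above runs /tmp/query.sql.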

How to display only the db2 query result via shell script and not the query?

There is probably a very simple solution here, but I am probably not using the right search terms. I have a SQL query running in a shell script. I get the results I am looking for; however, I am also getting the SQL query as part of the result. How can I suppress this and just show the result?
My script:
#!/usr/bin/sh
db2 connect to MYDB >/dev/null 2>&1;
db2 -x -v "select A, B, C from MYTABLE";
db2 connect reset >/dev/null 2>&1;
And my output looks like this:
select A, B, C from MYTABLE
AAA BBB CCC
AAA BBB CCC
I would like to get rid of the first row and just show the result. What am I missing?
Thanks in advance for your help!
The -v option for the DB2 command line processor causes the current statement being executed to be printed in the output.
Remove the -v from your command and you'll get only the results of the query.
If you just want to skip the first row of your output, you could:
yourscript.sh | tail -n +2
Test with seq:
kent$ seq 5|tail -n +2
2
3
4
5
Try this
db2 -o query
for more info. http://www.ibm.com/developerworks/data/library/techarticle/adamache/0109adamache.html

regex to split name=value,* into csv of name,* and value,*

I would like to split a line such as:
name1=value1,name2=value2, .....,namen=valuen
to produce two lines as follows:
name1,name2, .....,namen
value1,value2, .....,valuen
the goal being to construct an sql insert along the lines of:
input="name1=value1,name2=value2, .....,namen=valuen"
namescsv=$( echo $input | sed 's/=[^,]*//g' )
valuescsv=$( echo $input | ?????? )
INSERT INTO table_name ( $namescsv ) VALUES ( $valuescsv )
I'd like to do this as simply as possible - perl, awk, or multiple piping to tr, cut etc. seems too complicated. Given that the names part seems simple enough, I figure there must be something similar for the values, but I can't work it out.
You can just inverse your character match :
echo $input | sed 's/[^,]*=//g'
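Put together with the names expression from the question, a worked example on a three-pair input looks like this. It is a sketch that assumes neither names nor values contain , or = themselves.

```shell
input="name1=value1,name2=value2,namen=valuen"

# Names: delete every "=value" part (an '=' followed by non-comma characters).
namescsv=$(echo "$input" | sed 's/=[^,]*//g')

# Values: delete every "name=" part (non-comma characters up to an '=').
valuescsv=$(echo "$input" | sed 's/[^,]*=//g')

echo "$namescsv"    # name1,name2,namen
echo "$valuescsv"   # value1,value2,valuen
```

The values are not quoted or escaped here, so they still need that treatment before being placed in an INSERT statement.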
I think your best bet is still sed -re 's/[^=,]*=([^,]*)/\1/g', though I guess the input would have to match your table exactly.
Note that in some RDBMS you can use the following syntax:
INSERT INTO table_name SET name=value, name2=value2, ...;
http://dev.mysql.com/doc/refman/5.5/en/insert.html
The following shell script does what you are asking for and takes care of escaping (not only because of injection, but you may want to insert values with quotes in them):
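# escape_sql is not defined here; it must be provided separately.
_IFS="$IFS"; IFS=","
line="name1=value1,name2=value2,namen=valuen"
for pair in $line; do
    names="$names,${pair%%=*}"                      # text before the first '='
    values="$values,'$(escape_sql "${pair#*=}")'"   # text after the first '=', escaped and quoted
done
IFS="$_IFS"
echo "INSERT INTO table_name ( ${names#,} ) VALUES ( ${values#,} )"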
Output:
INSERT INTO table_name ( name1,name2,namen ) VALUES ( 'value1','value2','valuen' )
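The script above calls escape_sql without defining it. A minimal sketch of such a helper, assuming the target database uses standard SQL string literals where a single quote is escaped by doubling it, could look like this. The name and behaviour are assumptions made for illustration; a database that also treats backslashes as escape characters would need more than this.

```shell
# escape_sql VALUE: hypothetical helper that doubles every single quote
# so VALUE can be placed between '...' in a SQL statement.
escape_sql() {
    printf '%s' "$1" | sed "s/'/''/g"
}

escape_sql "R08-Ballon d'eau"; echo
```

Where the client library supports it, bound parameters are generally a safer choice than building the statement text by hand.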