Bash while read : output issue - sql

Updated:
Initial issue:
Having a while read loop printing every line that is read.
Answer: put a done <<< "$var"
Subsequent issue:
I may need some explanations about some shell code.
I have this:
temp_ip=$($mysql --skip-column-names -h $db_address -u $db_user -p$db_passwd $db_name -e "select ip_routeur,code_site from $db_vtiger_table where $db_vtiger_table.ip_routeur NOT IN (select ip from $db_erreur_table);")
That gets results looking like this:
<ip1> <site1>
<ip2> <site2>
<ip3> <site3>
<ip4> <site4>
up to 5,000 IP addresses.
I did a "while loop":
while [ `find $proc_dir -name snmpproc* | wc -l` -ge "$max_proc_snmpget" ];do
{
echo "sleeping, fping in progress";
sleep 1;
}
done
temp_ip=$($mysql --skip-column-names -h $db_address -u $db_user -p$db_passwd $db_name -e "select ip_routeur,code_site from $db_vtiger_table where $db_vtiger_table.ip_routeur NOT IN (select ip from $db_erreur_table);")
while read ip codesite;do
{
sendSNMPGET $ip $snmp_community $code_site &
}
done<<<"$temp_ip"
And the sendSNMPGET function is:
sendSNMPGET() {
touch $procdir/snmpproc.$$
hostname=`snmpget -v1 -c $2 $1 sysName.0`
if [ "$hostname" != "" ]
then
echo "hi test"
fi
rm -f $procdir/snmpproc.$$
}
The $max_proc_snmpget is set to 30
At execution time the read is OK (no more printing on screen), but the child processes seem to be disoriented:
hi
hi
hi
hi
hi
hi
hi
hi
hi
hi
hi
hi
./scan-snmp.sh: fork: Resource temporarily unavailable
./scan-snmp.sh: fork: Resource temporarily unavailable
./scan-snmp.sh: fork: Resource temporarily unavailable
./scan-snmp.sh: fork: Resource temporarily unavailable
Why can't it handle this?

If temp_ip contains the name of a file that you want to read, then use:
done<"$temp_ip"
In your case, it appears that temp_ip is not a file name but contains the actual data that you want. In that case, use:
done<<<"$temp_ip"
Take care that the variable is placed inside double-quotes. That protects the data against the shell's word splitting which would result in the replacement of new line characters with spaces.
More details
In bash, an expression like <"$temp_ip" is called a redirection. In this case it means that the while loop will get its standard input from the file whose name is stored in $temp_ip.
The expression <<<"$temp_ip" is called a here string. In this case, it means that the while loop will get its standard input from the data in the variable $temp_ip.
More information on both redirection and here strings in man bash.
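A minimal, self-contained illustration of the here-string form (the variable name and sample data are invented for the demo):

```shell
#!/bin/bash
# Two-column data in a variable, like the mysql output above
data=$'10.0.0.1 siteA\n10.0.0.2 siteB'

# read splits each line on whitespace into the two variables
while read -r ip codesite; do
    echo "ip=$ip site=$codesite"
done <<< "$data"
```

The -r flag keeps read from mangling any backslashes in the input; it is a good habit in every while read loop.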

Or you can parse the output of your initial command directly:
$mysql --skip-column-names -h $db_address -u $db_user -p$db_passwd $db_name -e "select ip_routeur,code_site from $db_vtiger_table where $db_vtiger_table.ip_routeur NOT IN (select ip from $db_erreur_table);" | \
while read ip codesite
do
...
done
If you want to improve the performance and run some of the 5,000 SNMPGETs in parallel, I would recommend using GNU Parallel, like this:
$mysql --skip-column-names -h $db_address -u $db_user -p$db_passwd $db_name -e "select ip_routeur,code_site from $db_vtiger_table where $db_vtiger_table.ip_routeur NOT IN (select ip from $db_erreur_table);" | parallel -k -j 20 -N 2 sendSNMPGET {1} $snmp_community {2}
The -k keeps the parallel output in order. The -j 20 runs up to 20 SNMPGETs in parallel at a time. The -N 2 means take two parameters from the mysql output per job (i.e. ip and codesite). {1} and {2} are your ip and codesite parameters.
http://www.gnu.org/software/parallel/
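If GNU Parallel cannot be installed, a rough stand-in (not what the answer above proposes, and without its -k ordering guarantee) is xargs -P, which can also consume two fields per job; the IPs and sites below are placeholders:

```shell
#!/bin/bash
# -n 2: two arguments (ip, codesite) per invocation
# -P 4: at most 4 jobs running at a time (parallel's -j equivalent)
# Inside sh -c, the appended arguments arrive as $0 and $1.
printf '%s\n' 10.0.0.1 siteA 10.0.0.2 siteB |
    xargs -n 2 -P 4 sh -c 'echo "snmpget $0 for $1"'
```

Because -P runs jobs concurrently, the output order is not guaranteed.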

I propose not to store the result in a variable but to use it directly:
while read ip codesite
do
sendSNMPGET "$ip" "$snmp_community" "$codesite" &
done < <(
"$mysql" --skip-column-names -h "$db_address" -u "$db_user" -p"$db_passwd" "$db_name" \
-e "select ip_routeur,code_site from $db_vtiger_table where $db_vtiger_table.ip_routeur NOT IN (select ip from $db_erreur_table);")
This way you start the mysql command in a subshell and use its output as input to the while loop (similar to piping, which is also an option here).
But I see a problem with that code: if you really start each sendSNMPGET command in the background, you will very quickly put a massive load on your computer. For each line you read, another background process is started. This can slow down your machine to the point where it is rendered useless.
I propose not to run more than 20 background processes at a time.

As you don't seem to have liked my answer with GNU Parallel, I'll show you a very simplistic way of doing it in parallel without needing to install that...
#!/bin/bash
MAX=8
j=0
while read ip code
do
(sleep 5; echo $ip $code) & # Replace this with your SNMPGET
((j++))
if [ $j -eq $MAX ]; then
echo -n Pausing with $MAX processes...
j=0
wait
fi
done < file
wait
This starts up to 8 processes (you can change the number) and then waits for them to complete before starting another 8. Other respondents have already shown you how to feed your mysql output into the loop; that replaces the done < file in the second-to-last line of the script.
The key to this is the wait which will wait for all started processes to complete.
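The effect of wait can be seen with a small timing sketch (sleep stands in for the SNMP work):

```shell
#!/bin/bash
start=$SECONDS
for i in 1 2 3 4; do
    sleep 1 &           # four background jobs started at once
done
wait                    # blocks until all four have finished
elapsed=$((SECONDS - start))
echo "$elapsed"         # roughly 1, not 4: the jobs ran concurrently
```

A bare wait with no arguments waits for every child of the current shell, which is exactly what the batching loop above relies on.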

Related

How to place quotes mark in ansible task with grep, awk, sed

My task searches for the config in the CMD column to find out the directory of the application config and also the PID.
---
- hosts: all
pre_tasks:
- name: Check if process is running
become: yes
shell: 'ps -e --format="pid cmd" | grep process.cfg | sed -e "s/[[:space:]]\+/ /g"| grep -v color'
register: proces_out
The output looks like this after this command:
32423 /var/local/bin/application -c /var/local/etc/process.cfg
But I think Ansible has trouble with 2 greps in 1 command. I need them both because if I don't use the inverted "grep -v color", this annoying line appears: "grep --color=auto". Then I can't cut out the PID that I need in another task (which kills the process), because the real process is on the second line.
My second idea was to use awk, which I think would be the best tool for this case, but if I use double quotation marks in the --format parameter and in the sed command, and single quotation marks in the awk parameters, they don't want to cooperate. Even if I keep them balanced, they interfere with each other.
AWK idea:
shell: 'ps -e --format="pid cmd" | grep process.cfg | sed -e "s/[[:space:]]\+/ /g"| awk 'FNR == 2''
I want to ask for a hint: what would be the best way to avoid the quoting conflicts in the code and still be able to use the output in a variable afterwards?
## PID
{{ proces_out.stdout.split(' ')[0] }}
## application
{{ proces_out.stdout.split(' ')[1] }}
## config
{{ proces_out.stdout.split(' ')[3] }}
But I think Ansible has trouble with 2 greps in 1 command
That is for sure not true.
if I don't use the inverted "grep -v color", this annoying line appears: "grep --color=auto". Then I can't cut out the PID that I need in another task (which kills the process), because the real process is on the second line.
You are running into the classic case of the grep process matching its own regex, as happens in a lot of "simple" cases. What you want is a regex that matches your string but does not match itself. In the example above it would be:
shell: 'ps -e --format="pid cmd" | grep process[.]cfg | sed -e "s/[[:space:]]\+/ /g"'
because process[.]cfg matches process.cfg but does not match the text process[.]cfg. I also fixed your regex: in a regex, the . means "any character", which doesn't appear to be what you really wanted.
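The bracket trick can be checked without ps, using two literal lines that mimic the process list (the text is invented for the demo):

```shell
#!/bin/bash
# Line 1 mimics the real process; line 2 mimics grep's own entry.
# The [.] still matches a literal dot, so line 1 matches, but line 2
# contains no literal "process.cfg", so grep no longer matches itself.
printf '%s\n' \
    '123 /var/local/bin/application -c /var/local/etc/process.cfg' \
    '456 grep --color=auto process[.]cfg' |
    grep -c 'process[.]cfg'    # prints 1, not 2
```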
With regard to that --color bit, you can likely side-step that nonsense by using the full path to grep, which will cause bash to really execute the binary rather than some alias that uses --color=auto. I actually wouldn't have expected the colors to show up in an Ansible run, because it's not the right $TERM, but systems are weird.
Thank you Matthew for that solution, but I found a different option to avoid the unnecessary output.
The syntax is almost the same, but I added an additional parameter to --format: ppid, the parent process ID. In most cases I believe the parent process always has the number 1 in the output, which helps to sort it the way I want.
It looks like this:
shell: >
ps -e --format="ppid pid cmd" |
grep process.cfg |
sed -e "s/[[:space:]]\+/ /g"
register: output_process
And the output looks like this:
1 54345 /var/local/bin/application -c /var/local/etc/process.cfg
6435 6577 grep --color=auto process.cfg
Now it's easy; we can use Ansible modules to sort it:
- name: Kill process
become: yes
shell: "kill {{ output_process.stdout_lines[0].split(' ')[2] }}"
What does it do? It selects line 0, which is the first line, splits the output on spaces, and selects the 3rd field. In the output there is a space before ppid; that's why the PID is the 3rd field.
Thank you again for your solution Matthew; it might be helpful in another case.

Optimize informix update

I have a bash script that will update a table based on a file. The way I have it, it opens and closes the database for every line in the file, and I would like to understand how to open it once, perform all the updates, and then close it. It is fine for a few updates, but if it ever requires more than a few hundred it could be really taxing on the system.
#!/bin/bash
file=/export/home/dncs/tmp/file.csv
dateFormat=$(date +"%m-%d-%y-%T")
LOGFILE=/export/home/dncs/tmp/Log_${dateFormat}.log
echo "${dateFormat} : Starting work" >> $LOGFILE 2>&1
while IFS="," read mac loc; do
if [[ "$mac" =~ ^([0-9a-fA-F]{2}:){5}[0-9a-fA-F]{2}$ ]]; then
dbaccess thedb <<EndOfUpdate >> $LOGFILE 2>&1
UPDATE profile
SET local_code= '$loc'
WHERE mac_address = '$mac';
EndOfUpdate
else
echo "Error: $mac not valid format" >> $LOGFILE 2>&1
fi
IIH -i $mac >> $LOGFILE 2>&1
done <"$file"
Source file:
12:BF:20:04:BB:30,POR-4
12:BF:21:1C:02:B1,POR-10
12:BF:20:04:72:FD,POR-4
12:BF:20:01:5B:4F,POR-10
12:BF:20:C2:71:42,POR-7
This is more or less what I'd do:
#!/bin/bash
fmt_date() { date +"%Y-%m-%d.%T"; }
file=/export/home/dncs/tmp/file.csv
dateFormat=$(fmt_date)
LOGFILE="/export/home/dncs/tmp/Log_${dateFormat}.log"
exec >> $LOGFILE 2>&1
echo "${dateFormat} : Starting work"
valid_mac='/^\(\([0-9a-fA-F]\{2\}:\)\{5\}[0-9a-fA-F]\{2\}\),\([^,]*\)$/'
update_stmt="UPDATE profile SET local_code = '\3' WHERE mac_address = '\1';"
sed -n -e "$valid_mac s//$update_stmt/p" "$file" |
dbaccess thedb -
sed -n -e "$valid_mac d; s/.*/Error: invalid format: &/p" "$file"
sed -n -e "$valid_mac s//IIH -i \1/p" "$file" | sh
echo "$(fmt_date) : Finished work"
I changed the date format to a variant of ISO 8601; it is easier to parse. You can stick with your Y2K-non-compliant US-ish format if you prefer. The exec line arranges for standard output and standard error from here onwards to go to the log file.
The sed commands all use the same structure, and all use the same pattern match stored in a variable. This makes consistency easier. The first sed script converts the data into UPDATE statements (which are fed to dbaccess). The second script identifies invalid MAC addresses; it deletes valid ones and maps the invalid lines into error messages. The third script ignores invalid MAC addresses but generates an IIH command for each valid one.
The script records an end time, which will allow you to assess how long the processing takes. Again, repetition is avoided by creating and using the fmt_date function.
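The shared-pattern idiom used above (an empty // in s//…/ reuses the last regex, here supplied from a variable) can be tried on its own with made-up data:

```shell
#!/bin/bash
# Pattern captures an id and a name; the empty // in s// reuses it
valid='/^\([0-9]\{1,\}\),\(.*\)$/'
printf '%s\n' '42,foo' 'garbage' |
    sed -n -e "$valid s//id=\1 name=\2/p"   # prints: id=42 name=foo
```

The non-matching line is silently dropped by -n, which is what lets the three sed scripts in the answer share one definition of a valid line.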
Be cautious about testing this. I had a file data containing:
87:36:E6:5E:AC:41,loc-OYNK
B2:4D:65:70:32:26,loc-DQLO
ZD:D9:BA:34:FD:97,loc-PLBI
04:EB:71:0D:29:D0,loc-LMEE
DA:67:53:4B:EC:C4,loc-SFUU
I replaced the dbaccess with cat, and the sh with cat. The log file I relocated to the current directory — leading to:
#!/bin/bash
fmt_date() { date +"%Y-%m-%d.%T"; }
#file=/export/home/dncs/tmp/file.csv
file=data
dateFormat=$(fmt_date)
#LOGFILE="/export/home/dncs/tmp/Log_${dateFormat}.log"
LOGFILE="Log-${dateFormat}.log"
exec >> $LOGFILE 2>&1
echo "${dateFormat} : Starting work"
valid_mac='/^\(\([0-9a-fA-F]\{2\}:\)\{5\}[0-9a-fA-F]\{2\}\),\([^,]*\)$/'
update_stmt="UPDATE profile SET local_code = '\3' WHERE mac_address = '\1';"
sed -n -e "$valid_mac s//$update_stmt/p" "$file" |
cat
#dbaccess thedb -
sed -n -e "$valid_mac d; s/.*/Error: invalid format: &/p" "$file"
#sed -n -e "$valid_mac s//IIH -i \1/p" "$file" | sh
sed -n -e "$valid_mac s//IIH -i \1/p" "$file" | cat
echo "$(fmt_date) : Finished work"
After I ran it, the log file contained:
2017-04-27.14:58:20 : Starting work
UPDATE profile SET local_code = 'loc-OYNK' WHERE mac_address = '87:36:E6:5E:AC:41';
UPDATE profile SET local_code = 'loc-DQLO' WHERE mac_address = 'B2:4D:65:70:32:26';
UPDATE profile SET local_code = 'loc-LMEE' WHERE mac_address = '04:EB:71:0D:29:D0';
UPDATE profile SET local_code = 'loc-SFUU' WHERE mac_address = 'DA:67:53:4B:EC:C4';
Error: invalid format: ZD:D9:BA:34:FD:97,loc-PLBI
IIH -i 87:36:E6:5E:AC:41
IIH -i B2:4D:65:70:32:26
IIH -i 04:EB:71:0D:29:D0
IIH -i DA:67:53:4B:EC:C4
2017-04-27.14:58:20 : Finished work
The UPDATE statements would have gone to DB-Access. The bogus MAC address was identified. The correct IIH commands would have been run.
Note that piping the output into sh requires confidence that the data you generate (the IIH commands) will be clean.

Retrieving process id using sshcmd on unix

I want to retrieve the process ID when my code successfully starts the job, but it's returning null.
I am starting the job using sshcmd, creating a log of the sshcmd output, and then trying to retrieve the process ID into new_process_id using sshcmd. If I get new_process_id I will show new_process_id; otherwise I will show the output collected in the log file. But I am getting null in new_process_id.
remote_command="nohup J2EEServer/config/AMSS/scripts/${batch_job} & "
sshcmd -q -u ${login_user} -s ${QA_HOST} "$remote_command" > /tmp/nohup_${batch_job} 2>&1
remote_command=$(ps -ef | grep ${login_user} | grep $batch_job | grep -v grep | awk '{print $2}');
new_process_id=`sshcmd -q -u ${login_user} -s ${QA_HOST} "$remote_command"`
runstatus=`grep Synchronized. /tmp/nohup_${batch_job}`
if [[ $runstatus != "" ]]
then
new_process_id=`cat /tmp/nohup_${batch_job}`
fi
echo $new_process_id
The second assignment to remote_command stores the output of that ps pipeline run on your local machine, not on the remote host.
Some other hints: if you are making a second, unrelated variable, give it another name. It will avoid unnecessary confusion.
What you are attempting to do next with runstatus, overwriting an already existing but not yet used variable, is totally unclear to me.
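The local-versus-remote distinction can be demonstrated without sshcmd at all (sshcmd is a site-specific wrapper, so an ordinary shell mimics the remote side here, and the echoed text is a placeholder):

```shell
#!/bin/bash
# $(...) runs the pipeline immediately, on the local machine:
ran_locally=$(echo remote-pid-lookup)

# A quoted string stays plain text until some shell executes it:
remote_command='echo remote-pid-lookup'

echo "$ran_locally"          # remote-pid-lookup (already executed locally)
echo "$remote_command"       # echo remote-pid-lookup (still just text)
bash -c "$remote_command"    # text finally executed, as a remote shell would
```

To run the ps pipeline remotely, the fix is therefore to assign it as a quoted string (escaping $2 so awk sees it) and hand that string to sshcmd.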

awk doesn't work in hadoop's mapper

This is my hadoop job:
hadoop streaming \
-D mapred.map.tasks=1\
-D mapred.reduce.tasks=1\
-mapper "awk '{if(\$0<3)print}'" \ # doesn't work
-reducer "cat" \
-input "/user/***/input/" \
-output "/user/***/out/"
this job always fails, with an error saying:
sh: -c: line 0: syntax error near unexpected token `('
sh: -c: line 0: `export TMPDIR='..../work/tmp'; /bin/awk { if ($0 < 3) print } '
But if I change the -mapper into this:
-mapper "awk '{print}'"
it works without any error. What's the problem with the if(..) ?
UPDATE:
Thanks @paxdiablo for your detailed answer.
What I really want to do is filter out the rows whose 1st column is greater than x before piping the input data to my custom binary, so the -mapper actually looks like this:
-mapper "awk -v x=$x{if($0<x)print} | ./bin"
Is there any other way to achieve that?
The problem's not with the if per se, it's to do with the fact that the quotes have been stripped from your awk command.
You'll realise this when you look at the error output:
sh: -c: line 0: `export TMPDIR='..../work/tmp'; /bin/awk { if ($0 < 3) print } '
and when you try to execute that quote-stripped command directly:
pax> echo hello | awk {if($0<3)print}
bash: syntax error near unexpected token `('
pax> echo hello | awk {print}
hello
The reason the {print} one works is that it doesn't contain the shell-special ( character.
One thing you might want to try is to escape the special characters to ensure the shell doesn't try to interpret them:
{if\(\$0\<3\)print}
It may take some effort to get the correctly escaped string, but you can look at the error output to see what is generated. I've had to escape the () since they're the shell's sub-shell creation characters, the $ to prevent variable expansion, and the < to prevent input redirection.
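The escaped form can be verified directly in a shell before handing it to hadoop streaming (the numbers are sample input only):

```shell
#!/bin/bash
# The backslashes make (, $, < and ) literal, so awk receives the
# single-word program {if($0<3)print} without any quotes at all:
echo 2 | awk {if\(\$0\<3\)print}    # prints 2  (2 < 3)
echo 5 | awk {if\(\$0\<3\)print}    # prints nothing (5 >= 3)
```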
Also keep in mind that there may be other ways to filter depending on your needs, ways that can avoid shell-special characters. If you specify what your needs are, we can possibly help further.
For example, you could create a shell script (e.g., pax.sh) to do the actual awk work for you:
#!/bin/bash
awk -v x="$1" '$1 < x {print}'
then use that shell script in the mapper without any special shell characters:
hadoop streaming \
-D mapred.map.tasks=1 -D mapred.reduce.tasks=1 \
-mapper "pax.sh 3" -reducer "cat" \
-input "/user/***/input/" -output "/user/***/out/"
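A quick check of the filter pax.sh is meant to apply, inlined here so nothing needs to be installed on the cluster (sample input only):

```shell
#!/bin/bash
# Same idea as pax.sh 3: keep rows whose 1st column is < 3
printf '%s\n' 1 2 5 | awk -v x=3 '$1 < x {print}'   # prints 1 then 2
```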

Extra slash in redis output

In the following example, why do I get an extra slash \ at the end of the string.
[root@server src]# echo 'testme one more word new line' | ./redis-cli -x set mytest
OK
[root@server src]# ./redis-cli
redis> get mytest
"testme one more word new line\"
In the above example, I do not want the \ in "line\". It is not there in the original echo statement.
What you're getting is not a backslash, but a line break (backslash + n, i.e. a newline).
It is added by the echo command. You can use echo -n to avoid that extra line break:
$ echo -n 'testme one more word new line' | ./src/redis-cli -x set mytest
OK
$ ./src/redis-cli get mytest
"testme one more word new line"