Retain backslashes with a while read loop in multiple shells

I have the following code:
#!/bin/sh
while read line; do
    printf "%s\n" "$line"
done < input.txt
input.txt contains the following lines:
one\two
eight\nine
The output is as follows:
onetwo
eightnine
The "standard" solutions to retain the slashes would be to use read -r.
However, I have the following limitations:
must run under #!/bin/sh for reasons of portability/POSIX compliance
not all systems will support the -r switch to read under sh
the input file format cannot be changed
Therefore, I am looking for another way to retain the backslash after reading in the line. I have come up with one working solution: use sed to replace the \ with some other value (e.g. ||) in a temporary file (thus bypassing my last requirement above), then, after reading each line in, use sed again to transform it back. Like so:
#!/bin/sh
sed -e 's/\\/||/g' input.txt > tempfile.txt
while read line; do
    printf "%s\n" "$line" | sed -e 's/||/\\/g'
done < tempfile.txt
I'm thinking there has to be a more "graceful" way of doing this.
Some ideas:
1) Use command substitution to store this in a variable instead of a file. Problem: I'm not sure command substitution will be portable here either, and my attempts at using a variable instead of a file were unsuccessful. Regardless, file or variable, the base solution is really the same (two substitutions).
2) Use IFS somehow? I've investigated a little, but I'm not sure it can help with this issue.
3) ???
What are some better ways to handle this given my constraints?
Thanks
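Edit: one more idea that occurred to me while writing this up, which gets it down to a single substitution with no temporary file: double every backslash before read sees it, so that read's own backslash processing collapses each pair back to a single backslash. A sketch (only lightly tested, and I haven't verified it on every old /bin/sh):
#!/bin/sh
# 's/\\/\\\\/g' turns each \ into \\; read then collapses \\ back to \
sed 's/\\/\\\\/g' input.txt | while read line; do
    printf "%s\n" "$line"
done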

Your constraints seem a little strict. Here's a piece of code I jotted down (I'm not too sure how valuable your while loop is for the other stuff you would like to do, so I removed it just for ease). I don't guarantee this code to be robust, but the logic should give you hints about the direction you may wish to proceed in. (temp.dat is the input file)
#!/bin/sh
var1="$(cut -d\\ -f1 temp.dat)"
var2="$(cut -d\\ -f2 temp.dat)"
iter=1
set -- $var2
for x in $var1; do
    if [ "$iter" -eq 1 ]; then
        echo $x "\\" $1
    else
        echo $x "\\" $2
    fi
    iter=$((iter+1))
done

As Larry Wall once said, it is easier to port a shell than a shell script.
perl -lne 'print $_' input.txt
The simplest possible Perl script is simpler still, but I imagine you'll want to do something with $_ before printing it.
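For instance, a hypothetical transformation (upper-casing each line, invented purely for illustration) slots in before the print; the backslashes survive untouched:
perl -lne 'tr/a-z/A-Z/; print' input.txt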


How to search/replace a single line in place with sed/awk? [duplicate]

This question already has answers here: Save modifications in place with awk.
I have a lot of files where I would like to edit only those lines that start with private.
In principle I want to run
gawk '/private/{gsub(/\//, "_"); gsub(/-/, "_"); print}' filename
but this only prints out the modified part of the file, and not everything.
Question
Does gawk have a way similar to sed -i inplace?
Or is there a much simpler way to do the above with either sed or gawk?
Just move the final print outside of the filtered pattern, e.g.:
gawk '/private/{gsub(/\//, "_"); gsub(/-/, "_")} {print}'
usually, that is simplified to:
gawk '/private/{gsub(/\//, "_"); gsub(/-/, "_")}1'
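The trailing 1 is a condition that is always true with no associated action, so awk's default action (print the current line) runs for every line. On a hypothetical sample file:
$ cat sample.txt
private a/b-c
public x/y-z
$ gawk '/private/{gsub(/\//, "_"); gsub(/-/, "_")}1' sample.txt
private a_b_c
public x/y-z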
You really, really, really (emphasis on "really") do not want to use something like sed -i to edit the files "in place". (I put "in place" in quotes, because GNU sed does not edit the files in place, but creates new files with the same name.) Doing so is a recipe for data corruption, and if you have a lot of files you don't want to take that risk. Just write the files into a new directory tree. It will make recovery much simpler.
e.g.:
d="backup/$(dirname "$filename")"
mkdir -p "$d"
awk '...' "$filename" > "backup/$filename"
Consider if you used something like -i which puts backup files in the same directory structure. If you're modifying files in bulk and the process is stopped half-way through, how do you recover? If you are putting output into a separate tree, recovery is trivial. Your original files are untouched and pristine, and there are no concerns if your filtering process is terminated prematurely or inadvertently run multiple times. sed -i is a plague on humanity and should never be used. Don't spread the plague.
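A hypothetical bulk run over a tree of files using that approach (the file pattern and the awk script here are invented for illustration; the ! -path test keeps find from descending into the backup tree itself):
find . -type f -name '*.txt' ! -path './backup/*' | while IFS= read -r filename; do
    # recreate the directory structure under backup/, then write there
    d="backup/$(dirname "$filename")"
    mkdir -p "$d"
    gawk '/^private/{gsub(/\//, "_"); gsub(/-/, "_")}1' "$filename" > "backup/$filename"
done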
GNU awk 4.1.0 and later has the in-place editing ability.
And you should put the print outside the regex-match block.
Try this:
gawk '/^private/{gsub(/[/-]/, "_");} 1' filename
or, after making sure you have backed up the file:
gawk -i inplace '/^private/{gsub(/[/-]/, "_");} 1' filename
You forgot the ^ to denote the start of the line; you need it to change only lines starting with private, otherwise all lines containing private will be modified.
And yeah, you can combine the two gsubs into a single one.
The sed command to do the same would be:
sed '/^private/{s/[/-]/_/g;}' filename
Add the -i option when you're done testing it.
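For example, on a hypothetical two-line file:
$ cat file
private a/b-c
public a/b-c
$ sed '/^private/{s/[/-]/_/g;}' file
private a_b_c
public a/b-c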

Using a literal string with gawk

I think I'm too close to the problem already to solve it on my own, although I'm sure it's easy to solve.
I'm working on a NAS with a shell script for my Raspberry Pi which automatically collects data and distributes it over my other devices. I decided to include a delete option, since otherwise it would be a pain in the ass to delete a file: the Raspberry would always copy it right back from the other devices. While the script runs it creates a file del_tmp_$ip.txt, which lists the directories and files to delete from del_$ip.txt (not del_tmp_$ip.txt itself).
It looks like this:
test/delete_me.txt
test/hello/hello.txt
pi.txt
I tried to delete the lines via awk, and this is how far I've got:
while read r; do
    gawk -i inplace '!/^'$r'$/' del_$ip.txt
done <del_tmp_$ip.txt
If the line from del_tmp_$ip.txt tells gawk to delete pi.txt it works without problems, but if the string includes a slash like test/delete_me.txt it doesn't work:
"unexpected newline or end of string"
and it points to the last slash then.
I can't escape the forward slashes with backslashes manually, since I don't know whether or how many slashes there will be; that depends on the line of the file which contains the information to be deleted.
I hope you can help me!
Never allow a shell variable to expand to become part of the awk script text before awk evaluates it (which is what you're doing with '!/^'$r'$/'), and always quote your shell variables (so the correct shell syntax would have been '!/^'"$r"'$/' IF it hadn't been the wrong approach anyway). The correct syntax to write that command would have been:
awk -v r="$r" '$0 !~ "^"r"$"' file
but you said you wanted a string comparison, not regexp so then it'd be simply:
awk -v r="$r" '$0 != r' file
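For example, with a hypothetical r containing a slash (no escaping needed, since this is a literal string comparison rather than a regexp match):
$ r='test/delete_me.txt'
$ printf 'test/delete_me.txt\npi.txt\n' | awk -v r="$r" '$0 != r'
pi.txt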
and of course you don't need a shell loop at all; instead of:
while read r; do
    gawk -i inplace '!/^'$r'$/' del_$ip.txt
done <del_tmp_$ip.txt
you just need 1 awk command:
gawk -i inplace 'NR==FNR{skip[$0];print;next} !($0 in skip)' "del_tmp_$ip.txt" "del_$ip.txt"
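With hypothetical sample files, and minus the -i inplace so the result prints to the screen (the print in the NR==FNR block above is only there so that -i inplace rewrites the first file unchanged, so it is dropped here):
$ cat del_tmp.txt
test/delete_me.txt
$ cat del.txt
test/delete_me.txt
pi.txt
$ gawk 'NR==FNR{skip[$0];next} !($0 in skip)' del_tmp.txt del.txt
pi.txt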

How to assign the output from an Awk command to a shell variable?

I've tried to assign the output of an Awk command to a variable but I receive an error. I would like to assign it and then echo the result.
count = `awk '$0 ~ /Reason code "68"/' ladb.log | wc -l`
I've enclosed the statement in backticks and receive the error below:
/lsf9/db/dict/=: Unable to open dictionary: No such file or directory
DataArea = does not exist
Your main problem is your usage of spaces: you can't have spaces around the = in a shell assignment.
Backticks may be harmful to your code, but I haven't used IBM AIX in a very long time, so they may be essential for your POSIX shell (though guides covering $(…) vs `…` probably don't suggest a problem here). One thing you can try is running the code in ksh or bash instead.
The following code assumes a standards-compliant POSIX shell. If it doesn't work, try replacing the "$(…)" notation with "`…`" notation. Since it's just a number being returned, you technically don't need the surrounding double quotes, but they're good practice.
count="$(awk '$0 ~ /Reason code "68"/' ladb.log | wc -l)"
The above should work, but it could be written more cleanly as
count="$(awk '/Reason code "68"/ {L++} END { print L }' ladb.log)"
As noted in the comments to the question, grep -c may be faster than awk, but if you know the location of that text, awk can be faster still. Let's say it begins a line:
count="$(awk '$1$2$3 == "Reasoncode\"68\"" {L++} END { print L }' ladb.log)"
Yes, a POSIX shell is capable of understanding that double quotes inside a "$(…)" are not related to the outer double quotes, so only the inner double quotes within that awk string need to be escaped.
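A quick check of the counting variant with an invented two-line log:
$ printf 'xx Reason code "68" yy\nother line\n' > ladb.log
$ count="$(awk '/Reason code "68"/ {L++} END { print L }' ladb.log)"
$ echo "$count"
1
(If there were no matches, L would be unset and the END block would print an empty line; print L+0 would force a 0 in that case.)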

Find a word in a text file and replace it with the filename

I have a lot of text files in which I would like to find the word 'CASE' and replace it with the related filename.
I tried
find . -type f | while read file
do
awk '{gsub(/CASE/,print "FILENAME",$0)}' $file >$file.$$
mv $file.$$ >$file
done
but I got the following error
awk: syntax error at source line 1 context is >>> {gsub(/CASE/,print <<< "CASE",$0)}
awk: illegal statement at source line 1
I also tried
for i in $(ls *);
do
awk '{gsub(/CASE/,${i},$0)}' ${i} > file.txt;
done
which produced empty output and
awk: syntax error at source line 1 context is >>> {gsub(/CASE/,${ <<<
awk: illegal statement at source line 1
Why awk? sed is what you want:
while read -r file; do
sed -i "s/CASE/${file##*/}/g" "$file"
done < <( find . -type f )
or
while read -r file; do
sed -i.bak "s/CASE/${file##*/}/g" "$file"
done < <( find . -type f )
To create a backup of the original.
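The ${file##*/} expansion strips everything up to the last /, leaving just the basename, so the replacement text can't contain a slash that would collide with sed's s/// delimiter. For example:
$ file=./docs/notes.txt
$ echo "${file##*/}"
notes.txt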
You didn't post any sample input and expected output so this is a guess but maybe this is what you want:
find . -type f |
while IFS= read -r file
do
awk '{gsub(/CASE/,FILENAME)} 1' "$file" > "${file}.$$" &&
mv "${file}.$$" "$file"
done
Every change I made to the shell code is important so if you don't understand why I changed any part of it, ask the question.
btw if after making the changes you are still getting the error message:
awk: syntax error at source line 1
awk: illegal statement at source line 1
then you are using old, broken awk (/usr/bin/awk on Solaris). Never use that awk. On Solaris use /usr/xpg4/bin/awk (or nawk if you must).
Caveats: the above will fail if your file name contains newlines or ampersands (&) or escaped digits (e.g. \1). See Is it possible to escape regex metacharacters reliably with sed for details. If any of that is a problem, post some representative sample input and expected output.
The print in that first script is the error.
The second argument to gsub is the replacement string, not a command.
You want just FILENAME. (Note: not "FILENAME", which is a literal string, but FILENAME the variable.)
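A hypothetical one-file check of the difference:
$ echo 'CASE closed' > demo.txt
$ awk '{gsub(/CASE/,FILENAME)} 1' demo.txt
demo.txt closed
$ awk '{gsub(/CASE/,"FILENAME")} 1' demo.txt
FILENAME closed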
find . -type f -print0 | while IFS= read -d '' file
do
awk '{gsub(/CASE/,FILENAME,$0)} 7' "$file" >"$file.$$"
mv "$file.$$" "$file"
done
Note that I quoted all your variables and fixed your find | read pipeline to work correctly for files with odd characters in the names (see Bash FAQ 001 for more about that). I also fixed the erroneous > in the mv command.
See the answers on this question for how to properly escape the original filename to make it safe to use in the replacement portion of gsub.
Also note that recent (4.1+ I believe) versions of GNU awk have the -i inplace argument.
To fix the second script you need to add the quotes you removed from the first script (plus a trailing 1 so each line actually gets printed):
for i in *; do awk '{gsub(/CASE/,"'"${i}"'",$0)}1' "${i}" > file.txt; done
Note that I got rid of the worse-than-useless use of ls (worse than useless because it actively breaks for files with spaces or shell metacharacters in their names; see Parsing ls for more on that).
That command though is somewhat ugly and unsafe for filenames with various characters in them, and would be better written as the following:
for i in *; do awk -v fname="$i" '{gsub(/CASE/,fname,$0)}1' "${i}" > file.txt; done
since it will correctly handle filenames with double quotes etc. in their names, whereas the direct variable expansion version will not.
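For instance, with a hypothetical filename containing both a space and a double quote:
$ i='my "odd" file.txt'
$ echo 'CASE here' > "$i"
$ awk -v fname="$i" '{gsub(/CASE/,fname,$0)}1' "$i"
my "odd" file.txt here
(An & in the filename would still be expanded specially by gsub even when passed via -v; see the escaping link mentioned above for that case.)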
That being said the corrected first script is the right answer.

Awk Greater Than Less Than

I am using this command
num1=2.2
num2=4.5
result=$(awk 'BEGIN{print ($num2>$num1)?1:0}')
This always returns 0, whether num2>num1 or num1>num2.
But when I put in the actual numbers, as such:
result=$(awk 'BEGIN{print (4.5>2.2)?1:0}')
I get a return value of 1, which is correct.
What can I do to make this work?
The reason it fails when you use variables is that the awk script enclosed by single quotes is evaluated by awk and not by bash: so if you'd like to pass variables from bash to awk, you have to specify them with the -v option as follows:
num1=2.2
num2=4.5
result=$(awk -v n1="$num1" -v n2="$num2" 'BEGIN{print (n2>n1)?1:0}')
Note that awk variables used inside the awk script must not be prefixed with $.
Try doing this:
result=$(awk -v num1=2.2 -v num2=4.5 'BEGIN{print (num2 > num1) ? 1 : 0}')
See:
man awk | less +/'^ *-v'
Because $num1 and $num2 are not expanded by bash -- you are using single quotes. The following will work, though:
result=$(awk "BEGIN{print ($num2>$num1)?1:0}")
Note, however, as others have pointed out, that this is poor coding style, mixing bash and awk. Personally, I don't mind such constructs; but in general, especially for complex things, and if you don't remember what gets evaluated by bash inside double quotes, turn to the other answers to this question.
EDIT: Actually, instead of awk, I would use bc:
num1=2.2
num2=4.5
result=$( echo "$num2 > $num1" | bc )
Why? Because it is just a bit clearer... and lighter.
Or with Perl (because it is shorter, because I like Perl more than awk, and because I like backticks more than $()):
result=`perl -e "print(( $num2 > $num1 ) ? 1 : 0);"`
Or, to be fancy (and probably inefficient):
if [ `echo -e "$num1\n$num2" | sort -n | head -1` != "$num1" ] ; then result=0 ; else result=1 ; fi
(Yes, I know)
I had a brief, intensive, 3-year-long exposure to awk, in prehistoric times. Nowadays bash is everywhere and can do loads of stuff (I had only sh/csh back then), so it can often be used instead of awk, while computers are fast enough for Perl to be used in ad hoc command lines instead of awk. Just sayin'.
This might work for you:
result=$(awk 'BEGIN{print ('$num2'>'$num1')?1:0}')
Think of the quotes as poking holes through the awk command to the underlying bash shell.
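With num1=2.2 and num2=4.5 set as above, the shell closes and reopens the quotes around each expansion, so awk receives this literal program:
BEGIN{print (4.5>2.2)?1:0}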