How to assign the output from an Awk command to a shell variable? - awk

I've tried to assign the output of an Awk command to a variable but I receive an error. I would like to assign the result to the variable and then echo it.
count = `awk '$0 ~ /Reason code "68"/' ladb.log | wc -l`
I've enclosed the statement in backticks and receive this error below
/lsf9/db/dict/=: Unable to open dictionary: No such file or directory
DataArea = does not exist

Your main problem is your usage of spaces. You can't have a spaced assignment in shell scripts.
Backticks may be harmful to your code, but I haven't used IBM AIX in a very long time, so it may be essential to your Posix shell (though this guide and its coverage of $(…) vs `…` probably don't suggest a problem here). One thing you can try is running the code in ksh or bash instead.
The following code assumes a standards-compliant Posix shell. If it doesn't work, try replacing the "$(…)" notation with "`…`" notation. Since only a number is returned, you technically don't need the surrounding double quotes, but they're good practice.
count="$(awk '$0 ~ /Reason code "68"/' ladb.log | wc -l)"
The above should work, but it could be written more cleanly as
count="$(awk '/Reason code "68"/ {L++} END { print L+0 }' ladb.log)"
As noted in the comments to the question, grep -c may be faster than awk, but if you know the location of that text, awk can be faster still. Let's say it begins a line:
count="$(awk '$1$2$3 == "Reasoncode\"68\"" {L++} END { print L+0 }' ladb.log)"
Yes, a Posix shell understands that double quotes inside a "$(…)" are unrelated to the outer double quotes, so only the inner double quotes within that awk string need to be escaped.
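A minimal sketch of the corrected assignment, run against a throwaway sample log (the log contents are made up for illustration; only the pattern comes from the question):

```shell
# Create a small sample log; its contents are hypothetical.
log=$(mktemp)
printf '%s\n' 'Reason code "68" first' 'Reason code "12" other' 'Reason code "68" second' > "$log"

# No spaces around "=", and $(...) captures what awk prints.
# L+0 forces a numeric 0 when nothing matches.
count="$(awk '/Reason code "68"/ {L++} END { print L+0 }' "$log")"
echo "$count"

rm -f "$log"
```

With spaces around the "=", the shell would instead try to run a command named count, which is where the odd error in the question comes from.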

Related

using literal string for gawk

I think I'm too close to the problem already to solve it on my own, although I'm sure it's easy to solve.
I'm working on a NAS with a shell script for my Raspberry Pi which automatically collects data and distributes it to my other devices. I decided to include a delete option, since otherwise it would be a pain in the ass to delete a file: the Raspberry Pi would always copy it right back from the other devices. While the script runs, it creates a file, del_tmp_$ip.txt, listing the directories and files to delete from del_$ip.txt (not del_TMP_$ip.txt).
It looks like this:
test/delete_me.txt
test/hello/hello.txt
pi.txt
I tried to delete the lines via awk, and this is how far I've got so far:
while read r; do
gawk -i inplace '!/^'$r'$/' del_$ip.txt
done <del_tmp_$ip.txt
If the line from del_tmp_$ip.txt tells gawk to delete pi.txt it works without problems, but if the string includes a slash like test/delete_me.txt it doesn't work:
"unexpected newline or end of string"
and it points to the last slash then.
I can't escape the forward slashes with backslashes manually, since I don't know whether there will be any, or how many; that depends on the line of the file that names the thing to delete.
I hope you can help me!
Never allow a shell variable to expand to become part of the awk script text before awk evaluates it (which is what you're doing with '!/^'$r'$/') and always quote your shell variables (so the correct shell syntax would have been '!/^'"$r"'$/' IF it hadn't been the wrong approach anyway). The correct syntax to write that command would have been
awk -v r="$r" '$0 !~ "^"r"$"' file
but you said you wanted a string comparison, not regexp so then it'd be simply:
awk -v r="$r" '$0 != r' file
and of course you don't need a shell loop at all. Instead of:
while read r; do
gawk -i inplace '!/^'$r'$/' del_$ip.txt
done <del_tmp_$ip.txt
you just need 1 awk command:
gawk -i inplace 'NR==FNR{skip[$0];print;next} !($0 in skip)' "del_tmp_$ip.txt" "del_$ip.txt"
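To watch the string comparison cope with slashes, here is a sketch of the same two-file idea without -i inplace, so it prints the surviving lines instead of rewriting the file (the file names and the keep/me.txt line are invented for the demo):

```shell
dir=$(mktemp -d)
printf '%s\n' 'test/delete_me.txt' 'pi.txt' > "$dir/del_tmp.txt"
printf '%s\n' 'test/delete_me.txt' 'keep/me.txt' 'pi.txt' > "$dir/del.txt"

# First file: remember every line as an array key (plain strings, no regexp).
# Second file: print only the lines that were not remembered.
kept=$(awk 'NR==FNR {skip[$0]; next} !($0 in skip)' "$dir/del_tmp.txt" "$dir/del.txt")
printf '%s\n' "$kept"

rm -rf "$dir"
```

Because the lines are used as array keys and never as regular expressions, slashes, dots and other metacharacters need no escaping.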

Retain backslashes with while read loop in multiple shells

I have the following code:
#!/bin/sh
while read line; do
printf "%s\n" $line
done < input.txt
Input.txt has the following lines:
one\two
eight\nine
The output is as follows
onetwo
eightnine
The "standard" solutions to retain the slashes would be to use read -r.
However, I have the following limitations:
must run under #!/bin/sh for reasons of portability/posix compliance.
not all systems will support the -r switch to read under /sh
The input file format cannot be changed
Therefore, I am looking for another way to retain the backslash after reading in the line. I have come up with one working solution, which is to use sed to replace the \ with some other value (e.g.||) into a temporary file (thus bypassing my last requirement above) then, after reading them in use sed again to transform it back. Like so:
#!/bin/sh
sed -e 's/[\/&]/||/g' input.txt > tempfile.txt
while read line; do
printf "%s\n" $line | sed -e 's/||/\\/g'
done < tempfile.txt
I'm thinking there has to be a more "graceful" way of doing this.
Some ideas:
1) Use command substitution to store this into a variable instead of a file. Problem - I'm not sure command substitution will be portable here either and my attempts at using a variable instead of a file were unsuccessful. Regardless, file or variable the base solution is really the same (two substitutions).
2) Use IFS somehow? I've investigated a little, but not sure that can help in this issue.
3) ???
What are some better ways to handle this given my constraints?
Thanks
Your constraints seem a little strict. Here's a piece of code I jotted down (I'm not sure how much your while loop matters for the other things you want to do, so I removed it for simplicity). I don't guarantee that this code is robust, but the logic should give you a hint about the direction you may wish to take. (temp.dat is the input file.)
#!/bin/sh
var1="$(cut -d\\ -f1 temp.dat)"
var2="$(cut -d\\ -f2 temp.dat)"
iter=1
set -- $var2
for x in $var1;do
if [ "$iter" -eq 1 ];then
echo $x "\\" $1
else
echo $x "\\" $2
fi
iter=$((iter+1))
done
As Larry Wall once said, writing a portable shell is easier than writing a portable shell script.
perl -lne 'print $_' input.txt
The simplest possible Perl script is simpler still, but I imagine you'll want to do something with $_ before printing it.
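Another sketch that stays within plain /bin/sh and needs no temporary file: double every backslash with sed and pipe the result into the loop, so that read (without -r) collapses each pair back into one. This is not taken from the answers above, and it assumes sed and a pipeline are acceptable; the sample file below merely stands in for input.txt.

```shell
# Stand-in for input.txt from the question.
input=$(mktemp)
printf '%s\n' 'one\two' 'eight\nine' > "$input"

# sed turns each \ into \\ ; read without -r then reduces \\ back to \ .
out=$(sed 's/\\/\\\\/g' "$input" | while IFS= read line; do
    printf '%s\n' "$line"
done)
printf '%s\n' "$out"

rm -f "$input"
```

Note that in most shells the loop runs in a subshell because of the pipe, so variables set inside it are not visible once the loop ends.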

In awk, how can I use a file containing multiple format strings with printf?

I have a case where I want to use input from a file as the format for printf() in awk. My formatting works when I set it in a string within the code, but it doesn't work when I load it from input.
Here's a tiny example of the problem:
$ # putting the format in a variable works just fine:
$ echo "" | awk -vs="hello:\t%s\n\tfoo" '{printf(s "bar\n", "world");}'
hello: world
foobar
$ # But getting the format from an input file does not.
$ echo "hello:\t%s\n\tfoo" | awk '{s=$0; printf(s "bar\n", "world");}'
hello:\tworld\n\tfoobar
$
So ... format substitutions work ("%s"), but not special characters like tab and newline. Any idea why this is happening? And is there a way to "do something" to input data to make it usable as a format string?
UPDATE #1:
As a further example, consider the following using a bash here-string:
[me@here ~]$ awk -vs="hello: %s\nworld: %s\n" '{printf(s, "foo", "bar");}' <<<""
hello: foo
world: bar
[me@here ~]$ awk '{s=$0; printf(s, "foo", "bar");}' <<<"hello: %s\nworld: %s\n"
hello: foo\nworld: bar\n[me@here ~]$
As far as I can see, the same thing happens with multiple different awk interpreters, and I haven't been able to locate any documentation that explains why.
UPDATE #2:
The code I'm trying to replace currently looks something like this, with nested loops in shell. At present, awk is only being used for its printf, and could be replaced with a shell-based printf:
#!/bin/sh
while read -r fmtid fmt; do
while read cid name addy; do
awk -vfmt="$fmt" -vcid="$cid" -vname="$name" -vaddy="$addy" \
'BEGIN{printf(fmt,cid,name,addy)}' > /path/$fmtid/$cid
done < /path/to/sampledata
done < /path/to/fmtstrings
Example input would be:
## fmtstrings:
1 ID:%04d Name:%s\nAddress: %s\n\n
2 CustomerID:\t%-4d\t\tName: %s\n\t\t\t\tAddress: %s\n
3 Customer: %d / %s (%s)\n
## sampledata:
5 Companyname 123 Somewhere Street
12 Othercompany 234 Elsewhere
My hope was that I'd be able to construct something like this to do the entire thing with a single call to awk, instead of having nested loops in shell:
awk '
NR==FNR { fmts[$1]=$2; next; }
{
for(fmtid in fmts) {
outputfile=sprintf("/path/%d/%d", fmtid, custid);
printf(fmts[fmtid], $1, $2) > outputfile;
}
}
' /path/to/fmtstrings /path/to/sampledata
Obviously, this doesn't work, both because of the actual topic of this question and because I haven't yet figured out how to elegantly make awk join $2..$n into a single variable. (But that's the topic of a possible future question.)
FWIW, I'm using FreeBSD 9.2 with its built-in awk, but I'm open to using gawk if a solution can be found with that.
Why so lengthy and complicated an example? This demonstrates the problem:
$ echo "" | awk '{s="a\t%s"; printf s"\n","b"}'
a b
$ echo "a\t%s" | awk '{s=$0; printf s"\n","b"}'
a\tb
In the first case, the string "a\t%s" is a string literal and so is interpreted twice - once when the script is read by awk and then again when it is executed, so the \t is expanded on the first pass and then at execution awk has a literal tab char in the formatting string.
In the second case awk still has the characters backslash and t in the formatting string - hence the different behavior.
You need something to interpret those escaped chars and one way to do that is to call the shell's printf and read the results (corrected per @EtanReiser's excellent observation that I was using double quotes where I should have had single quotes, implemented here by \047, to avoid shell expansion):
$ echo 'a\t%s' | awk '{"printf \047" $0 "\047 " "b" | getline s; print s}'
a b
If you don't need the result in a variable, you can just call system().
If you just wanted the escape chars expanded so you don't need to provide the %s args in the shell printf call, you'd just need to escape all the %s (watching out for already-escaped %s).
You could call awk instead of the shell printf if you prefer.
Note that this approach, while clumsy, is much safer than calling an eval which might just execute an input line like rm -rf /*.*!
With help from Arnold Robbins (the maintainer of gawk), and Manuel Collado (another noted awk expert), here is a script which will expand single-character escape sequences:
$ cat tst2.awk
function expandEscapes(old, segs, segNr, escs, idx, new) {
split(old,segs,/\\./,escs)
for (segNr=1; segNr in segs; segNr++) {
if ( idx = index( "abfnrtv", substr(escs[segNr],2,1) ) )
escs[segNr] = substr("\a\b\f\n\r\t\v", idx, 1)
new = new segs[segNr] escs[segNr]
}
return new
}
{
s = expandEscapes($0)
printf s, "foo", "bar"
}
$ awk -f tst2.awk <<<"hello: %s\nworld: %s\n"
hello: foo
world: bar
Alternatively, this should be functionally equivalent but not gawk-specific:
function expandEscapes(tail, head, esc, idx) {
head = ""
while ( match(tail, /\\./) ) {
esc = substr( tail, RSTART + 1, 1 )
head = head substr( tail, 1, RSTART-1 )
tail = substr( tail, RSTART + 2 )
idx = index( "abfnrtv", esc )
if ( idx )
esc = substr( "\a\b\f\n\r\t\v", idx, 1 )
head = head esc
}
return (head tail)
}
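As a usage sketch, the portable function can be dropped into a one-off awk call from the shell. The main rule and the sample format line are illustrative additions; the function body is the one shown above.

```shell
out=$(printf '%s\n' 'hello: %s\nworld: %s\n' | awk '
function expandEscapes(tail,   head, esc, idx) {
    head = ""
    while ( match(tail, /\\./) ) {
        esc  = substr( tail, RSTART + 1, 1 )        # character after the backslash
        head = head substr( tail, 1, RSTART - 1 )   # text before the escape
        tail = substr( tail, RSTART + 2 )           # text after the escape
        idx  = index( "abfnrtv", esc )
        if ( idx )
            esc = substr( "\a\b\f\n\r\t\v", idx, 1 )
        head = head esc
    }
    return (head tail)
}
# Use each input line as a printf format once its escapes are expanded.
{ printf expandEscapes($0), "foo", "bar" }
')
printf '%s\n' "$out"
```

The extra spaces before head in the parameter list are the usual awk convention for marking local variables.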
If you care to, you can expand the concept to octal and hex escape sequences by changing the split() RE to
/\\(x[0-9a-fA-F]*|[0-7]{1,3}|.)/
and for a hex value after the \\:
c = sprintf("%c", strtonum("0x" rest_of_str))
and for an octal value:
c = sprintf("%c", strtonum("0" rest_of_str))
Since the question explicitly asks for an awk solution, here's one which works on all the awks I know of. It's a proof-of-concept; error handling is abysmal. I've tried to indicate places where that could be improved.
The key, as has been noted by various commentators, is that awk's printf -- like the C standard function it is based on -- does not interpret backslash-escapes in the format string. However, awk does interpret them in command-line assignment arguments.
awk 'BEGIN {if(ARGC!=3)exit(1);
fn=ARGV[2];ARGC=2}
NR==FNR{ARGV[ARGC++]="fmt="substr($0,length($1)+2);
ARGV[ARGC++]="fmtid="$1;
ARGV[ARGC++]=fn;
next}
{match($0,/^ *[^ ]+[ ]+[^ ]+[ ]+/);
printf fmt,$1,$2,substr($0,RLENGTH+1) > ("data/"fmtid"/"$1)
}' fmtfile sampledata
What's going on here is that the 'FNR==NR' clause (which executes only on the first file) adds the values (fmtid, fmt) from each line of the first file as command-line assignments, and then inserts the data file name as a command-line argument. In awk, assignments as command line arguments are simply executed as though they were assignments from a string constant with implicit quotes, including backslash-escape processing (except that if the last character in the argument is a backslash, it doesn't escape the implicit closing double-quote). This behaviour is mandated by Posix, as is the order in which arguments are processed which makes it possible to add arguments as you go.
As written, the script must be provided with exactly two arguments: the formats and the data (in that order). There is some room for improvement, obviously.
The snippet also shows two ways of concatenating trailing fields.
In the format file, I assume that the lines are well behaved (no leading spaces; exactly one space after the format id). With those constraints, substr($0, length($1)+2) is precisely the part of the line after the first field and a single space.
Processing the datafile, it may be necessary to do this with fewer constraints. First, the builtin match function is called with the regular expression /^ *[^ ]+[ ]+[^ ]+[ ]+/ which matches leading spaces (if any) and two space-separated fields, along with the following spaces. (It would be better to allow tabs, as well.) Once the regex matches (and matching shouldn't be assumed, so there's another thing to fix), the variables RSTART and RLENGTH are set, so substr($0, RLENGTH+1) picks up everything starting with the third field. (Again, this is all Posix-standard behaviour.)
Honestly, I'd use the shell printf for this problem, and I don't understand why you feel that solution is somehow sub-optimal. The shell printf interprets backslash escapes in formats, and the shell read -r will do the line splitting the way you want. So there's no reason for awk at all, as far as I can see.
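The shell-only route suggested here might look like the following sketch. It assumes the same two-file layout as the question; the temporary paths stand in for the elided /path locations, and the sample lines are copied from the question's example input.

```shell
outdir=$(mktemp -d)
fmts=$(mktemp)
data=$(mktemp)
printf '%s\n' '1 ID:%04d Name:%s\nAddress: %s\n' '3 Customer: %d / %s (%s)\n' > "$fmts"
printf '%s\n' '5 Companyname 123 Somewhere Street' '12 Othercompany 234 Elsewhere' > "$data"

while read -r fmtid fmt; do
    mkdir -p "$outdir/$fmtid"
    # The last variable of read soaks up the rest of the line, spaces included.
    while read -r cid name addy; do
        # The shell printf expands \n, \t and friends in the format itself.
        printf "$fmt" "$cid" "$name" "$addy" > "$outdir/$fmtid/$cid"
    done < "$data"
done < "$fmts"

cat "$outdir/1/5"
result=$(cat "$outdir/3/12")
rm -rf "$outdir" "$fmts" "$data"
```

Using the file contents as a printf format is only reasonable when the format file is trusted, which the question implies.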
Ed Morton shows the problem clearly (edit: and it's now complete, so just go accept it): awk's string literal processing handled the escapes, and file I/O code isn't a lexical analyzer.
It's an easy fix: decide what escapes you want to support, and support them. Here's a one-liner form if you're doing special-purpose work that doesn't need to handle escaped backslashes:
awk '{ gsub(/\\n/,"\n"); gsub(/\\t/,"\t"); printf($0 "bar\n", "world"); }' <<\EOD
hello:\t%s\n\tfoo
EOD
but for doit-and-forgetit peace of mind just use the full form in the linked answer.
@Ed Morton's answer explains the problem well.
A simple workaround is to pass the contents of the format-string file via an awk variable, using command substitution, assuming that the file is not too large to be read into memory in full.
Using GNU awk or mawk:
awk -v formats="$(tr '\n' '\3' <fmtStrings)" '
# Initialize: Split the formats into array elements.
BEGIN {n=split(formats, aFormats, "\3")}
# For each data line, loop over all formats and print.
{ for(i=1;i<n;++i) {printf aFormats[i] "\n", $1, $2, $3} }
' sampleData
Note:
The advantage of this solution is that it works generically - you don't need to anticipate specific escape sequences and handle them specially.
On FreeBSD awk, this almost works, but - sadly - split() still splits by newlines, despite being given an explicit separator - this smells like a bug. Observed on versions 20070501 (OS X 10.9.4) and 20121220 (FreeBSD 10.0).
The above solves the core problem (for brevity, it omits stripping the ID from the front of the format strings and omits the output-file creation logic).
Explanation:
tr '\n' '\3' <fmtStrings replaces actual newlines in the format-strings file with \3 (0x3) characters, so as to be able to later distinguish them from the \n escape sequences embedded in the lines, which awk turns into actual newlines when assigning to variable formats (as desired).
\3 (0x3) - the ASCII end-of-text char. - was arbitrarily chosen as an auxiliary separator that is assumed not to be present in the input file.
Note that using \0 (NUL) is NOT an option, because awk interprets that as an empty string, causing split() to split the string into individual characters.
Inside the BEGIN block of the awk script, split(formats, aFormats, "\3") then splits the combined format strings back into individual format strings.
I had to create another answer to start clean, I believe I've come to a good solution, again with perl:
echo '%10s\t:\t%10s\r\n' | perl -lne 's/((?:\\[a-zA-Z\\])+)/qq[qq[$1]]/eeg; printf "$_","hi","hello"'
hi : hello
That bad boy s/((?:\\[a-zA-Z\\])+)/qq[qq[$1]]/eeg will translate any meta character I can think of, let us take a look with cat -A :
echo '%10s\t:\t%10s\r\n' | perl -lne 's/((?:\\[a-zA-Z\\])+)/qq[qq[$1]]/eeg; printf "$_","hi","hello"' | cat -A
hi^I:^I hello^M$
PS. I didn't create that regex, I googled unquote meta and found here
What you are trying to do is called templating. I would suggest that shell tools are not the best tools for this job. A safe way to go would be to use a templating library such as Template Toolkit for Perl, or Jinja2 for Python.
The problem lies in the non-interpretation of the special characters \t and \n by echo: it makes sure that they are understood as as-is strings, and not as tabulations and newlines. This behavior can be controlled by the -e flag you give to echo, without changing your awk script at all:
echo -e "hello:\t%s\n\tfoo" | awk '{s=$0; printf(s "bar\n", "world");}'
tada!! :)
EDIT:
Ok, so after the point rightfully raised by Chrono, we can devise this other answer corresponding to the original request to have the pattern read from a file:
echo "hello:\t%s\n\tfoo" > myfile
awk 'BEGIN {s="'"$(cat myfile)"'" ; printf(s "bar\n", "world")}'
Of course in the above we have to be careful with the quoting, as the $(cat myfile) is not seen by awk but interpreted by the shell.
This looks extremely ugly, but it works for this particular problem:
s=$0;
gsub(/'/, "'\\''", s);
gsub(/\\n/, "\\\\\\\\n", s);
"printf '%b' '" s "'" | getline s;
gsub(/\\\\n/, "\n", s);
gsub(/\\n/, "\n", s);
printf(s " bar\n", "world");
Replace all single quotes with shell-escaped single quotes ('\'').
Replace all escaped newline sequences that appear normally as \n with the sequence that appears as \\\\n. It would suffice to use \\\\n as the actual replacement string (meaning \\n would print if you printed it), but the version of gawk I have messes things up in POSIX mode.
Invoke the shell to execute printf '%b' 'escape'\''d format' and use awk's getline statement to retrieve the line.
Unescape \\n to yield a newline. This step wouldn't be necessary if gawk in POSIX mode played nicely.
Unescape \n to yield a newline.
Otherwise you're left to call the gsub function for each possible escape sequence, which is terrible for \001, \002, etc.
Graham,
Ed Morton's solution is the best (and perhaps only) one available.
I'm including this answer for a better explanation of WHY you're seeing what you're seeing.
A string is a string. The confusing part here is WHERE awk does the translation of \t to a tab, \n to a newline, etc. It appears NOT to be the case that the backslash and t get translated when used in a printf format. Instead, the translation happens at assignment, so that awk stores the tab as part of the format rather than translating when it runs the printf.
And this is why Ed's function works. When read from stdin or a file, no assignment is performed that will implement the translation of special characters. Once you run the command s="a\tb"; in awk, you have a three character string containing no backslash or t.
Evidence:
$ echo "a\tb\n" | awk '{ s=$0; for (i=1;i<=length(s);i++) {printf("%d\t%c\n",i,substr(s,i,1));} }'
1 a
2 \
3 t
4 b
5 \
6 n
vs
$ awk 'BEGIN{s="a\tb\n"; for (i=1;i<=length(s);i++) {printf("%d\t%c\n",i,substr(s,i,1));} }'
1 a
2
3 b
4
And there you go.
As I say, Ed's answer provides an excellent function for what you need. But if you can predict what your input will look like, you can probably get away with a simpler solution. Knowing how this stuff gets parsed, if you have a limited set of characters you need to translate, you may be able to survive with something simple like:
s=$0;
gsub(/\\t/,"\t",s);
gsub(/\\n/,"\n",s);
That's a cool question, I don't know the answer in awk, but in perl you can use eval :
echo '%10s\t:\t%-10s\n' | perl -ne ' chomp; eval "printf (\"$_\", \"hi\", \"hello\")"'
hi : hello
PS. Be aware of the code-injection danger when you use eval in any language. And not just eval: no system call should be made blindly.
Example in Awk:
echo '$(whoami)' | awk '{"printf \"" $0 "\" " "b" | getline s; print s}'
tiago
What if the input was $(rm -rf /)? You can guess what would happen :)
ikegami adds:
Why would you even think of using eval to convert \n to newlines and \t to tabs?
echo '%10s\t:\t%-10s\n' | perl -e'
my %repl = (
n => "\n",
t => "\t",
);
while (<>) {
chomp;
s{\\(?:(\w)|(\W))}{
if (defined($2)) {
$2
}
elsif (exists($repl{$1})) {
$repl{$1}
}
else {
warn("Unrecognized escape \\$1.\n");
$1
}
}eg;
printf($_, "hi", "hello");
}
'
Short version:
echo '%10s\t:\t%-10s\n' | perl -nle'
s/\\(?:(n)|(t)|(.))/$1?"\n":$2?"\t":$3/seg;
printf($_, "hi", "hello");
'

Awk Greater Than Less Than

I am using this command
num1=2.2
num2=4.5
result=$(awk 'BEGIN{print ($num2>$num1)?1:0}')
This always returns 0, whether num2>num1 or num1>num2.
But when I put the actual numbers as such
result=$(awk 'BEGIN{print (4.5>2.2)?1:0}')
I would get a return value of 1. Which is correct.
What can I do to make this work?
The reason it fails when you use variables is because the awk script enclosed by single quotes is evaluated by awk and not bash: so if you'd like to pass variables you are using from bash to awk, you'll have to specify it with the -v option as follows:
num1=2.2
num2=4.5
result=$(awk -v n1="$num1" -v n2="$num2" 'BEGIN{print (n2>n1)?1:0}')
Note that program variables used inside the awk script must not be prefixed with $
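A small sketch of the -v approach with values chosen to show that the comparison is numeric rather than textual (9 and 10 are illustrative, not from the question):

```shell
num1=9
num2=10

# Values passed with -v that look like numbers are compared numerically,
# so 10 > 9 holds even though "10" sorts before "9" as a string.
result=$(awk -v n1="$num1" -v n2="$num2" 'BEGIN { print ((n2 > n1) ? 1 : 0) }')
echo "$result"
```

The outer parentheses around the ternary are only there to keep the print statement unambiguous across awk implementations.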
Try doing this :
result=$(awk -v num1=2.2 -v num2=4.5 'BEGIN{print (num2 > num1) ? 1 : 0}')
See :
man awk | less +/'^ *-v'
Because $num1 and $num2 are not expanded by bash -- you are using single quotes. The following will work, though:
result=$(awk "BEGIN{print ($num2>$num1)?1:0}")
Note, however, as pointed out in the comments that this is poor coding style and mixing bash and awk. Personally, I don't mind such constructs; but in general, especially for complex things and if you don't remember what things will get evaluated by bash when in double quotes, turn to the other answers to this question.
See the excellent example from @EdMorton below in the comments.
EDIT: Actually, instead of awk, I would use bc:
num1=2.2
num2=4.5
result=$( echo "$num2 > $num1" | bc )
Why? Because it is just a bit clearer... and lighter.
Or with Perl (because it is shorter and because I like Perl more than awk and because I like backticks more than $()):
result=`perl -e "print( ($num2 > $num1) ? 1 : 0 );"`
Or, to be fancy (and probably inefficient):
if [ `echo -e "$num1\n$num2" | sort -n | head -1` != "$num1" ] ; then result=0 ; else result=1 ; fi
(Yes, I know)
I had a brief, intensive, 3-year long exposure to awk, in prehistoric times. Nowadays bash is everywhere and can do loads of stuff (I had sh/csh only at that time) so often it can be used instead of awk, while computers are fast enough for Perl to be used in ad hoc command lines instead of awk. Just sayin'.
This might work for you:
result=$(awk 'BEGIN{print ('$num2'>'$num1')?1:0}')
Think of the ''s as like poking holes through the awk command to the underlying bash shell.

AWK - Transmission of a variable with getline to system ()?

I have a theoretical question:
1) How do I pass a variable filled by getline to system()?
awk 'BEGIN{var="ls"; var | getline var; system("echo $var")}'
2) How do I assign the output of system("ls") to a variable and print the result in awk?
awk 'BEGIN{var="system("ls")"; print '$var'}'
3) Can you assign a variable inside system() (var = "ls") and print the result in awk?
awk 'BEGIN{system(var="ls"); print "'"$var"'"}'
Thank you for the information.
EDIT:
torek: Thank you for your response.
I understand that in the first example, you can do this:
awk 'BEGIN { while ("ls -l" | getline var) system("echo " var );}'
So in this case, can you not assign the output of system() to a variable? As in this example:
awk 'BEGIN {var="ls -l"; system(var); print var}'
You're looking at this the wrong way, I think. Awk's system just takes any old string, so give it one, e.g.:
system("echo " var); # see side note below
(remember that in awk, strings are concatenated by adjacency). Moreover, system just runs a command; to capture its output, you need to use getline, similar to your question #1.
If you want to read all the output of ls you need to loop over the result from getline:
awk 'BEGIN { while ("ls" | getline var) print "I got: " var; }'
Since this defines only a BEGIN action, awk will start up, run ls, collect each output line and print it, and then exit.
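If the goal is to end up with the whole output in one awk variable, the same loop can append each line instead of printing it. A sketch, assuming a scratch directory with two made-up files; the names d, cmd, line and out are illustrative:

```shell
dir=$(mktemp -d)
touch "$dir/a.txt" "$dir/b.txt"

listing=$(awk -v d="$dir" 'BEGIN {
    cmd = "ls \"" d "\""                 # the command is just a string
    while ((cmd | getline line) > 0)     # stops on end of output (0) and on error (-1)
        out = out line "\n"              # collect every line in one variable
    close(cmd)                           # finish with the command
    printf "%s", out
}')
printf '%s\n' "$listing"

rm -rf "$dir"
```

Comparing the getline result with > 0 avoids an endless loop if the command cannot be run.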
Side note: be very careful with variables passed to a shell (this includes both calls to system and items on the left hand side of | getline, plus some other cases in modern varieties of awk—anything that runs a command). Backquotes, $(command), and semicolons can all allow users to invoke arbitrary commands. For instance, in the system("echo " var) example above, if var contains ; rm -rf $HOME the command becomes echo ; rm -rf $HOME, which is almost certainly not something you want to have happen.
You can check for "bad" characters and either object, or quote them. Modern 8-bit-clean shells should only require quoting quotes themselves (for syntactic validity), $, <, >, |, and `. If you use single quotes to quote arguments (to make them appear as a single "word"), you need only escape the single quotes. See this unix.stackexchange.com answer for more details.
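The single-quote approach can be sketched as follows. The variable names and the hostile-looking sample value are made up for the demo, and \047 is the octal escape for a single quote, used so the awk program can itself sit inside shell single quotes:

```shell
out=$(awk 'BEGIN {
    var = "it\047s; echo INJECTED"            # value containing a quote and a semicolon
    quoted = var
    gsub(/\047/, "\047\\\047\047", quoted)    # each quote becomes: quote, backslash, quote, quote
    cmd = "printf \"%s\\n\" \047" quoted "\047"
    cmd | getline result                      # the shell sees one quoted word
    close(cmd)
    print result
}')
printf '%s\n' "$out"
```

Since the value reaches the shell as a single quoted word, the text after the semicolon is printed as data rather than executed as a second command.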
One other side note: I tend to add "unnecessary" semicolons to my awk scripts, making them look more like C syntactically. Old habit from decades ago.