Keep set -e setting inside || or && - error-handling

I have a simple script with a simple function which can lead to an error. Let's define this function, and make it broken:
brokenFunction () {
ls "non-existing-folder"
}
If we execute this function in a block detecting if it is broken, it works well:
brokenFunction || printf "It is broken\n"
prints "It is broken"
Now, let's make the function a bit more complex, by adding a correct command at the end :
#!/bin/sh
brokenFunction () {
ls "non-existing-folder"
printf "End of function\n"
}
brokenFunction || printf "It is broken\n"
This script prints :
$ ./script.sh
ls: cannot access 'non-existing-folder': No such file or directory
End of function
while I expected the function to stop before the printf statement, and the next block to display "It is broken".
And indeed, if I check the exit status code of brokenFunction, it is 0.
I tried adding set -e to the top of the script. The behavior is still the same, but the exit code of brokenFunction if called without || now becomes 2. If called with it, the status code is still 0.
Is there any way to keep the set -e setting inside a function called with ||?
EDIT: I just realized that the function in the example was useless. I encounter the same issue with a simple block and a condition.
#!/bin/sh
set -e
{
ls "non-existing-dir"
printf "End of block\n"
} || {
printf "It is broken\n"
}
prints
$ ./script.sh
ls: cannot access 'non-existing-dir': No such file or directory
End of block

As written in man bash, set -e is ignored in some contexts. A command before || or && is such a context.
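To see this rule in action, the same failing command can be run in both contexts. This is only a sketch, assuming an sh that implements set -e as POSIX describes; the echoed strings are purely illustrative.

```shell
#!/bin/sh
# Left side of ||: set -e is ignored, so the block keeps going after false
# and finishes with status 0; the right side never runs.
in_list=$(sh -c 'set -e; { false; echo "still running"; } || echo "handler ran"')

# Plain context: set -e aborts the inner shell right after false.
plain_status=0
plain=$(sh -c 'set -e; false; echo "not reached"') || plain_status=$?

printf 'inside ||: %s\n' "$in_list"
printf 'plain: output=[%s] status=%s\n' "$plain" "$plain_status"
```

The first line of output shows that the block ran to its end despite the failure, while the second shows an empty output and a non-zero status.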
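trap looks like a possible solution here. Note that the ERR pseudo-signal is a bash extension rather than part of POSIX sh, so the script has to run under bash. A working alternative to the last script using trap would look like this:
#!/bin/bash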
abort () {
printf "It is broken\n"
}
trap 'abort' ERR
(
set -e
false
printf "End of block\n"
)
trap - ERR
A few things are worth noting here:
trap 'abort' ERR binds the abort function to commands that fail in the main shell (under the same conditions in which set -e would act), which here includes the sub-shell when it exits with an error;
the broken block is executed in a sub-shell for 2 reasons. The first is to keep the set -e setting inside the block and limit the side effects. The second is to exit only this sub-shell on error (the set -e effect), and not the whole script;
trap - ERR at the end resets the trap binding, meaning the following part of the script is executed as before.
To test the side effects, we can add the previously non-working part:
#!/bin/bash
abort () {
printf "It is broken\n"
}
trap 'abort' ERR
(
set -e
false
printf "End of block\n"
)
trap - ERR
{
false
printf "End of second block\n"
} || {
printf "It is broken too\n"
}
prints:
It is broken
End of second block
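Where ERR traps are not available (they are not part of POSIX sh), a similar effect can probably be had by running the block as a plain command and inspecting $? afterwards. This is a minimal sketch, assuming set -e is not enabled in the calling shell; risky_block is a hypothetical name, not something from the scripts above.

```shell
#!/bin/sh
# errexit must be off in the caller, otherwise the failing sub-shell
# would terminate the whole script before $? can be inspected.
set +e

# The function body is a sub-shell, so set -e stays confined to it.
risky_block() (
    set -e
    false
    printf 'End of block\n'
)

# Called as a plain command (not before || or &&, not in an if test),
# so set -e is honoured inside the sub-shell.
output=$(risky_block)
status=$?

if [ "$status" -ne 0 ]; then
    printf 'It is broken (status %s)\n' "$status"
fi
```

The trade-off is that the caller has to check the status explicitly; writing risky_block || handler would put the block back into a context where set -e is ignored.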

Endless recursion in gawk-script

Please pardon me in advance for posting such a big part of my problem, but I just can't put my finger on the part that fails...
I got input-files like this (abas-FO if you care to know):
.fo U|xiininputfile = whatever
.type text U|xigibsgarnich
.assign U|xigibsgarnich
..
..Comment
.copy U|xigibswohl = Spaß
.ein "ow1/UWEDEFTEST.FOP"
.in "ow1/UWEINPUT2"
.continue BOTTOM
.read "SOemthing" U|xttmp
!BOTTOM
..
..
Now I want to recursively follow each .in[put]/.ein[gabe]-statement, parse the mentioned file and, if I don't know it yet, add it to an array. My code looks like this:
#!/bin/awk -f
function getFopMap(inputregex, infile, mandantdir, infiles){
while(getline f < infile){
#printf "*"
#don't match if there is a '
if(f ~ inputregex "[^']"){
#remove .input-part
sub(inputregex, "", f)
#trim right
sub(/[[:blank:]]+$/, "", f)
#remove leading and trailing "
gsub(/(^\"|\"$)/,"" ,f)
if(!(f in infiles)){
infiles[f] = "found"
}
}
}
close(infile)
for (i in infiles){
if(infiles[i] == "found"){
infiles[i] = "parsed"
cmd = "test -f \"" i "\""
if(system(cmd) == 0){
close(cmd)
getFopMap(inputregex, f, mandantdir, infiles)
}
}
}
}
BEGIN{
#Matches something like [.input myfile] or [.ein "ow1/myfile"]
inputregex = "^\\.(in|ein)[^[:blank:]]*[[:blank:]]+"
#Get absolute path of infile
cmd = "python -c 'import os;print os.path.abspath(\"" ARGV[1] "\")'"
cmd | getline rootfile
close(cmd)
infiles[rootfile] = "parsed"
getFopMap(inputregex, rootfile, mandantdir, infiles)
#output result
for(infile in infiles) print infile
exit
}
I call the script (in the same directory the paths are relative to) like this:
./script ow1/UWEDEFTEST.FOP
I get no output. It just hangs up. If I remove the comment before the printf "*" command, I'm seeing stars, without end.
I'd appreciate any help and hints on how to do it better.
My awk:
gawk Version 3.1.7
I don't know if it's your only problem, but you're calling getline incorrectly and consequently will go into an infinite loop in some scenarios. Make sure you fully understand all of the caveats at http://awk.info/?tip/getline and you might want to use the recursion example there as the starting point for your code.
The most important item initially for your code is that getline returns a negative value when it fails (for example, when the file cannot be opened). while(getline f < infile) then becomes an infinite loop: the failing getline keeps returning non-zero, so it is called again and fails again. You need to use while ( (getline f < infile) > 0) instead.
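The difference between the two loop conditions can be checked with a file that does not exist. This is a small sketch; the path is a hypothetical one assumed to be missing on the machine running it.

```shell
#!/bin/sh
# The path below is a hypothetical file that is assumed not to exist.
result=$(awk 'BEGIN {
    missing = "/nonexistent/hypothetical-input-file"

    # A failed open makes getline return -1, which is "true" in a bare test.
    rc = (getline line < missing)

    # Comparing against 0 ends the loop on both EOF (0) and error (-1).
    n = 0
    while ((getline line < missing) > 0)
        n++
    close(missing)

    print rc, n
}')
printf 'return code and lines read: %s\n' "$result"
```

With the bare while(getline line < missing) form, the same program would never leave the loop.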

How can I check if a GNU awk coprocess is open, or force it to open without writing to it?

I have a gawk program that uses a coprocess. However, sometimes I don't have any data to write to the coprocess, and my original script hangs while waiting for the output of the coprocess.
The code below reads from STDIN, writes each line to a "cat" program, running as a coprocess. Then it reads the coprocess output back in and writes it to STDOUT. If we change the if condition to be 1==0, nothing gets written to the coprocess, and the program hangs at the while loop.
From the manual, it seems that the coprocess and the two-way communication channels are only started the first time there is an IO operation with the |& operator. Perhaps we can start things without actually writing anything (e.g. writing an empty string)? Or is there a way to check if the coprocess ever started?
#!/usr/bin/awk -f
BEGIN {
cmd = "cat"
## print "" |& cmd
}
{
if (1 == 1) {
print |& cmd
}
}
END {
close (cmd, "to")
while ((cmd |& getline line)>0) {
print line
}
close(cmd)
}
Great question, +1 for that!
Just test the return code of the close(cmd, "to") - it will be zero if the pipe was open, -1 (or some other value) otherwise. e.g.:
if (close(cmd, "to") == 0) {
while ((cmd |& getline line)>0) {
print line
}
close(cmd)
}

How do I get the exit status of a command in a getline pipeline?

In POSIX awk, how do I get the exit status (return code) from command after processing its output via command | getline var? I want my awk script to exit 1 if command exited with a non-zero exit status.
For example, suppose I had an awk script named foo.awk that looks like this:
function close_and_get_exit_status(cmd) {
# magic goes here...
}
BEGIN {
cmd = "echo foo; echo bar; echo baz; false"
while ((cmd | getline line) > 0)
print "got a line of text: " line
if (close_and_get_exit_status(cmd) != 0) {
print "ERROR: command '" cmd "' failed" | "cat >&2"
exit 1
}
print "command '" cmd "' was successful"
}
then I want the following to happen:
$ awk -f foo.awk
got a line of text: foo
got a line of text: bar
got a line of text: baz
ERROR: command 'echo foo; echo bar; echo baz; false' failed
$ echo $?
1
According to the POSIX specification for awk, command | getline returns 1 for successful input, zero for end-of-file, and -1 for an error. It's not an error if command exits with a non-zero exit status, so this can't be used to see if command is done and has failed.
Similarly, close() can't be used for this purpose: close() returns non-zero only if the close fails, not if the associated command returns a non-zero exit status. (In gawk, close(command) returns the exit status of command. This is the behavior I'd like, but I think it violates the POSIX spec and not all implementations of awk behave this way.)
The awk system() function returns the exit status of the command, but as far as I can tell there's no way to use getline with it.
The simplest thing to do is just echo the exit status from shell after the command executes and then read that with getline. e.g.
$ cat tst.awk
BEGIN {
cmd = "echo foo; echo bar; echo baz; false"
mod = cmd "; echo \"$?\""
while ((mod | getline line) > 0) {
if (numLines++)
print "got a line of text: " prev
prev = line
}
status = line
close(mod)
if (status != 0) {
print "ERROR: command '" cmd "' failed" | "cat >&2"
exit 1
}
print "command '" cmd "' was successful"
}
$ awk -f tst.awk
got a line of text: foo
got a line of text: bar
got a line of text: baz
ERROR: command 'echo foo; echo bar; echo baz; false' failed
$ echo $?
1
In case anyone's reading this and considering using getline, make sure you read http://awk.freeshell.org/AllAboutGetline and FULLY understand all the caveats and implications of doing so first.
Not an ideal solution, but you can do:
"command || echo failure" | getline var; ... if( var == "failure" ) exit;
There is some ambiguity in that you have to select the string "failure" in such a way that command can never generate the same string, but perhaps this is an adequate workaround.
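Spelled out as a loop, the sentinel idea might look like the sketch below. The marker __CMD_FAILED__ is an arbitrary choice, assumed never to appear in the command's real output, and the command is wrapped in { ...; } so that || tests the status of the whole list.

```shell
#!/bin/sh
# Sketch of the sentinel approach; __CMD_FAILED__ is an arbitrary marker
# assumed never to appear in the real output of the command.
status=0
out=$(awk 'BEGIN {
    cmd = "echo foo; false"
    sentinel = "__CMD_FAILED__"
    # Group the command so that || sees the status of the whole list.
    wrapped = "{ " cmd "; } || echo " sentinel

    failed = 0
    while ((wrapped | getline line) > 0) {
        if (line == sentinel) {
            failed = 1
            break
        }
        print "got a line of text: " line
    }
    close(wrapped)
    exit failed
}') || status=$?

printf '%s\n' "$out"
printf 'awk exit status: %s\n' "$status"
```

Unlike the echo "$?" variant, this loses the actual exit code and only reports success or failure.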
The following is horrifically complicated, but it:
is POSIX conformant (mostly -- fflush() isn't yet in the POSIX standard, but it will be and it's widely available)
is general (it works no matter what kind of output is emitted by the command)
does not introduce any processing delay. The accepted answer to this question makes a line available only after the next line has been printed by the command. If the command slowly outputs lines and responsiveness is important (e.g., occasional events printed by an IDS system that should trigger a firewall change or email notification), this answer might be more appropriate than the accepted answer.
The basic approach is to echo the exit status/return value after the command completes. If this last line is non-zero, exit the awk script with an error. To prevent the code from mistaking a line of text output by the command for the exit status, each line of text output by the command is prepended with a letter that is later stripped off.
function stderr(msg) { print msg | "cat >&2"; }
function error(msg) { stderr("ERROR: " msg); }
function fatal(msg) { error(msg); exit 1; }
# Wrap cmd so that each output line of cmd is prefixed with "d".
# After cmd is done, an additional line of the format "r<ret>" is
# printed where "<ret>" is the integer return code/exit status of the
# command.
function safe_cmd_getline_wrap(cmd) {
return \
"exec 3>&1;" \
"ret=$(" \
" exec 4>&1;" \
" { ( "cmd" ) 4>&-; echo $? >&4; } 3>&- |" \
" awk '{print\"d\"$0;fflush()}' >&3 4>&-;" \
");" \
"exec 3>&-;" \
"echo r${ret};"
}
# like "cmd | getline line" except:
# * if getline fails, the awk script exits with an error
# * if cmd fails (returns non-zero), the awk script exits with an
# error
# * safe_cmd_getline_close(cmd) must be used instead of close(cmd)
function safe_cmd_getline(cmd, wrapped_cmd,ret,type) {
wrapped_cmd = safe_cmd_getline_wrap(cmd)
ret = (wrapped_cmd | getline line)
if (ret == -1) fatal("failed to read line from command: " cmd)
if (ret == 0) return 0
type = substr(line, 1, 1)
line = substr(line, 2)
if (type == "d") return 1
if (line != "0") fatal("command '" cmd "' failed")
return 0
}
function safe_cmd_getline_close(cmd) {
if (close(safe_cmd_getline_wrap(cmd))) fatal("failed to close " cmd)
}
You use the above like this:
cmd = "ls no-such-file"
while (safe_cmd_getline(cmd)) {
print "got a line of text: " line
}
safe_cmd_getline_close(cmd)
If you have the mktemp command, you could store the exit status in a temporary file:
#!/bin/sh
set -e
file=$(mktemp)
finish() {
rm -f "$file"
}
trap 'finish' EXIT
trap 'finish; trap - INT; kill -s INT $$' INT
trap 'finish; trap - TERM; kill $$' TERM
awk -v file="$file" 'BEGIN{
o_cmd="echo foo; echo bar; echo baz; false"
cmd = "("o_cmd "); echo $? >\""file"\""
print cmd
while ((cmd | getline) > 0) {
print "got a line of text: " $0
}
close(cmd)
getline ecode <file; close(file)
print "exit status:", ecode
if(ecode)exit 1
}'

Awk iterating without a loop construct

I was reading a tutorial on awk scripting and observed this strange behaviour: why does this awk script repeatedly ask for a number when executed, even without a loop construct like while or for? If we enter CTRL+D (EOF), it stops prompting for another number.
#!/bin/awk -f
BEGIN {
print "type a number";
}
{
print "The square of ", $1, " is ", $1*$1;
print "type another number";
}
END {
print "Done"
}
Please explain this behaviour of the above awk script
awk keeps processing lines until the end of the input is reached. In this case the input (STDIN) never ends as long as you keep entering numbers or hitting enter, so the main block runs again for each line, which looks like an endless loop.
When you hit CTRL+D you signal to the awk script that EOF has been reached, thereby exiting the loop.
Try this and enter 0 to exit:
BEGIN {
print "type a number";
}
{
if($1==0)
exit;
print "The square of ", $1, " is ", $1*$1;
print "type another number";
}
END {
print "Done"
}
From the famous The AWK Programming Language:
If you don't provide a input file to the awk script on the command line, awk will apply the program to whatever you type next on your terminal until you type an end-of-file signal (control-d on Unix systems).
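The implicit loop is easier to see when the input is finite. In this sketch, two lines are piped in instead of typed at the terminal, and each kind of block reports when it runs; the printed labels are only illustrative.

```shell
#!/bin/sh
out=$(printf '2\n3\n' | awk '
BEGIN { print "start" }                      # runs once, before any input
      { print "square: " $1 * $1 }           # runs once per input line
END   { print "done after " NR " lines" }    # runs once, at end of input
')
printf '%s\n' "$out"
```

The middle block is executed twice, once for each input line, even though the program contains no while or for.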

Are there any AWK syntax checkers?

Are there any AWK syntax checkers? I'm interested in both minimal checkers that only flag syntax errors and more extensive checkers along the lines of lint.
It should be a static checker only, not dependent on running the script.
If you prefix your Awk script with BEGIN { exit(0) } END { exit(0) }, you're guaranteed that none of your code will run. Exiting during BEGIN and END prevents other BEGIN and END blocks from running. If Awk returns 0, your script was fine; otherwise there was a syntax error.
If you put the code snippet in a separate argument, you'll get good line numbers in the error messages. This invocation...
gawk --source 'BEGIN { exit(0) } END { exit(0) }' --file syntax-test.awk
Gives error messages like this:
gawk: syntax-test.awk:3: x = f(
gawk: syntax-test.awk:3: ^ unexpected newline or end of string
GNU Awk's --lint can spot things like shadowed global variables and undefined functions:
gawk: syntax-test.awk:5: warning: function `g': parameter `x' shadows global variable
gawk: warning: function `f' called but never defined
And GNU Awk's --posix option can spot some compatibility problems:
gawk: syntax-test.awk:2: error: `delete array' is a gawk extension
Update: BEGIN and END
Although the END { exit(0) } block seems redundant, compare the subtle differences between these three invocations:
$ echo | awk '
BEGIN { print("at begin") }
/.*/ { print("found match") }
END { print("at end") }'
at begin
found match
at end
$ echo | awk '
BEGIN { exit(0) }
BEGIN { print("at begin") }
/.*/ { print("found match") }
END { print("at end") }'
at end
$ echo | awk '
BEGIN { exit(0) } END { exit(0) }
BEGIN { print("at begin") }
/.*/ { print("found match") }
END { print("at end") }'
In Awk, exiting during BEGIN will cancel all other begin blocks, and will prevent matching against any input. Exiting during END is the only way to prevent all other event blocks from running; that's why the third invocation above shows that no print statements were executed. The GNU Awk User's Guide has a section on the exit statement.
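For awks without --source, the same guard can be supplied as a separate -f file, since POSIX awk accepts several -f options. The sketch below assumes mktemp -d is available (it is not in POSIX) and uses throwaway example files; the file names are made up for illustration.

```shell
#!/bin/sh
# Sketch of a portable wrapper around the guard-block trick; the file
# contents below are throwaway examples.
tmpdir=$(mktemp -d)
printf 'BEGIN { exit(0) } END { exit(0) }\n' > "$tmpdir/guard.awk"
printf 'BEGIN { print "would run" }\n'       > "$tmpdir/good.awk"
printf 'BEGIN { x = f( }\n'                  > "$tmpdir/bad.awk"

# The guard file is listed first so its BEGIN/END blocks exit before any others run.
good_status=0
good_out=$(awk -f "$tmpdir/guard.awk" -f "$tmpdir/good.awk" < /dev/null) || good_status=$?

bad_status=0
awk -f "$tmpdir/guard.awk" -f "$tmpdir/bad.awk" < /dev/null 2> /dev/null || bad_status=$?

rm -rf "$tmpdir"
printf 'good: status=%s output=[%s]\n' "$good_status" "$good_out"
printf 'bad: status=%s\n' "$bad_status"
```

The valid script should report status 0 with empty output, and the broken one a non-zero status; the exact non-zero value differs between awk implementations.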
GNU awk appears to have a --lint option.
For a minimal syntax checker, which stops at the first error, try awk -f prog < /dev/null. Note that this still runs any BEGIN and END blocks in prog, so it is not purely static.