Awk syntax error due to improper placement of the function definition

Awk syntax error due to improper placement of the function definition - awk

It is embarrassing, but could someone briefly explain why this gives syntax error?
echo why this fails? | gawk '{
function why(fail) {
print fail
}
why($0)
}'

The function definition has to be at the top level. You have it inside the {...}.
echo This works | gawk '
function why(fail) {
print fail
}
{
why($0)
}'

Related

How to write awk function

I have a quick file file.awk with
function attr(attrname,str, a) {
if (!str) str=$0
match(str,"#" attrname "=([^,/]*)",a)
return a[1]
}
I am getting an error
awk: file.awk: line 139: syntax error at or near ,
where line 139 is the line with match()
Any idea whats wrong with the syntax?

You're trying to use the non-POSIX 3rd arg to match() but not using GNU awk which supports it. See https://www.gnu.org/software/gawk/manual/gawk.html#String-Functions.

As always, Ed Morton has it correct. The only reason I'm posting is that I can share with MacOS users that the awk man page includes the answer.
man awk yields:
match(s, r)
the position in s where the regular expression r occurs, or 0 if it does not. The variables RSTART and RLENGTH are set to the position and
length of the matched string.
So it is very clear that your call has an extra parameter.
I tried runnning awk with your function on MacOS and got this error:
awk: syntax error at source line 1 in function attr source file
context is
match(str,"#" attrname >>> "=([^,/]*)", <<<
The "<<<" is indicating awk doesn't like that comma. This is similar to your error message:
syntax error at or near ,

You absolutely don't need the match() function's capture array to get the attribute value you're seeking :
echo 'april#token=sha256hmackey/0.ts' \
\
| mawk 'function attr(___,_,__,____) { ____="\32\21"
if(_=="") {
_=$(_<_) }
return \
sub("[#]"(___)"=[^,\\/]*",____"&"____,_) \
\
? substr(_=__[split(_,__,____)-\
(_~_)],index(_,"=")+(_~"")) : ""
} {
print attr("token") }'
sha256hmackey

Trap or evaluate bad regular expression string at runtime in awk script

How can I trap an error if a dynamic regular expression evaluation is bad like:
var='lazy dog'
# a fixed Regex here, but original is coming from ouside the script
Regex='*.'
#try and failed
if (var ~ Regex) foo
The goal is to manage this error as I cannot test the regex itself (it comes from external source). Using POSIX awk (AIX)

Something like this?
$ echo 'foo' |
awk -v re='*.' '
BEGIN {
cmd="awk --posix \047/" re "/\047 2>&1"
cmd | getline rslt
print "rslt="rslt
close(cmd)
}
{ print "got " $0 " but re was bad" }
'
rslt=awk: cmd. line:1: error: Invalid preceding regular expression: /*./
got foo but re was bad
I use gawk so I had to add --posix to make it not just accept that regexp as a literal * followed by any char. You'll probably have to change the awk command being called in cmd to behave sensibly for your needs with both valid and invalid regexps but you get the idea - to do something like an eval in awk you need to have awk call itself via system() or a pipe to getline. Massage to suit...
Oh, and I don't think you can get the exit status of cmd with the above syntax and you can't capture the output of a system() call within awk so you may need to test the re twice - first with system() to find out if it fails but redirecting it's output to /dev/null, and then on a failure run it again with getline to capture the error message.
Something like:
awk -v re='*.' '
BEGIN {
cmd="awk --posix \047/" re "/\047 2>&1"
if ( system(cmd " > /dev/null") ) {
close(cmd " > /dev/null")
cmd | getline rslt
print "rslt="rslt
close(cmd)
}
}
{ print "got " $0 " but re was bad" }
'

reproducing grep "my pattern" myfile.log | sort | uniq | wc -l in awk

If I perform this grep on my target file I get eg 275 as result.
But I want to learn awk so tried this in awk:
awk 'BEGIN { count=0 } /my pattern/ { count++ } END { print count }' myfile.log
And this prints the 275 as expected.
So getting ambitious I created an awk script like this:
BEGIN {
print "Log File Analysis";
message=0;
events=0;
}
{
/message/ { messages++; }
/event/ { events++; }
}
END {
print "messages:\t" messages;
print "events:\t" events;
}
I get a syntax error,
$ awk -f test_learn.awk test_log.log
awk: test_learn.awk:16: /message/ { messages++; }
awk: test_learn.awk:16: ^ syntax error
What am I doing wrong?
I am using awk from MinGW shell on windows 7.

try
awk 'BEGIN { count=0 }; /my pattern/{count++ }; END { print count }' myfile.log
OR
awk 'BEGIN { count=0}; { if ($0 ~ /my pattern/) count++ }; END { print count };' myfile.log
Better yet, as variables are initialized as zero by default, you don't need the BEGIN block, so
awk '/my pattern/{count++ }; END { print count };' myfile.log
You can either have a default loop applied to all lines in a file, as in 2d example with the if, or you can have multiple blocks, "filtered" by pattern, as above, and in your edited addition.
When doing one-liners have you have, some awks required the semi-colon to separate the BEGIN and END blocks from the main loop block.
Edit
Same Idea with your 2nd issue, and integrating Ed Morton's improvments (thanks)
/message/ { messages++ }
/event/ { events++ }
END {
print "Log File Analysis"
print "messages:\t" messages
print "events:\t" events
}
IHTH

Using a variable defined inside AWK

I got this piece of script working. This is what i wanted:
input
3.76023 0.783649 0.307724 8766.26
3.76022 0.764265 0.307646 8777.46
3.7602 0.733251 0.30752 8821.29
3.76021 0.752635 0.307598 8783.33
3.76023 0.79528 0.307771 8729.82
3.76024 0.814664 0.307849 8650.2
3.76026 0.845679 0.307978 8802.97
3.76025 0.826293 0.307897 8690.43
with script
!/bin/bash
awk -F ', ' '
{
for (i=3; i<=10; i++) {
if (i==NR) {
npc1[i]=sprintf("%s", $1);
npc2[i]=sprintf("%s", $2);
npc3[i]=sprintf("%s", $3);
npRs[i]=sprintf("%s", $4);
print npc1[i],npc2[i],\
npc3[i], npc4[i];
}
}
} ' p_walls.raw
echo "${npc1[100]}"
But now I can't use those arrays npc1[i], outside awk. That last echo prints nothing. Isnt it possible or am I missing something?

AWK is a separate process, after it finishes all internal data is gone. This is true for all external processes/commands. Bash only sees what bash builtins touch.
i is never 100, so why do you want to access npc1[100]?
What are you really trying to do? If you rewrite the question we might be able to help...

(Cherry on the cake is always good!)
Sorry, but all of #yi_H 's answer and comments above are correct.
But there's really no problem loading 2 sets of data into 2 separate arrays in awk, ie.
awk '{
if (FILENAME == "file1") arr1[i++]=$0 ;
#same for file2; }
END {
f1max=++i; f2max=++j;
for (i=1;i<f1max;i++) {
arr1[i]
# put what you need here for arr1 processing
#
# dont forget that you can do things like
if (arr1[i] in arr2) { print arr1[i]"=arr2[arr1["i"]=" arr2[arr1[i]] }
}
for j=1;j<f2max;j++) {
arr2[j]
# and here for arr2
}
}' file1 file2
You'll have to fill the actual processing for arr1[i] and arr2[j].
Also, get an awk book for the weekend and be up and running by Monday. It's easy. You can probably figure it out from grymoire.com/Unix/awk.html
I hope this helps.

Are there any AWK syntax checkers?

Are there any AWK syntax checkers? I'm interested in both minimal checkers that only flag syntax errors and more extensive checkers along the lines of lint.
It should be a static checker only, not dependent on running the script.

If you prefix your Awk script with BEGIN { exit(0) } END { exit(0) }, you're guaranteed that none of your of code will run. Exiting during BEGIN and END prevents other begin and exit blocks from running. If Awk returns 0, your script was fine; otherwise there was a syntax error.
If you put the code snippet in a separate argument, you'll get good line numbers in the error messages. This invocation...
gawk --source 'BEGIN { exit(0) } END { exit(0) }' --file syntax-test.awk
Gives error messages like this:
gawk: syntax-test.awk:3: x = f(
gawk: syntax-test.awk:3: ^ unexpected newline or end of string
GNU Awk's --lint can spot things like global variables and undefined functions:
gawk: syntax-test.awk:5: warning: function `g': parameter `x' shadows global variable
gawk: warning: function `f' called but never defined
And GNU Awk's --posix option can spot some compatibility problems:
gawk: syntax-test.awk:2: error: `delete array' is a gawk extension
Update: BEGIN and END
Although the END { exit(0) } block seems redundant, compare the subtle differences between these three invocations:
$ echo | awk '
BEGIN { print("at begin") }
/.*/ { print("found match") }
END { print("at end") }'
at begin
found match
at end
$ echo | awk '
BEGIN { exit(0) }
BEGIN { print("at begin") }
/.*/ { print("found match") }
END { print("at end") }'
at end
$ echo | awk '
BEGIN { exit(0) } END { exit(0) }
BEGIN { print("at begin") }
/.*/ { print("found match") }
END { print("at end") }'
In Awk, exiting during BEGIN will cancel all other begin blocks, and will prevent matching against any input. Exiting during END is the only way to prevent all other event blocks from running; that's why the third invocation above shows that no print statements were executed. The GNU Awk User's Guide has a section on the exit statement.

GNU awk appears to have a --lint option.

For a minimal syntax checker, which stops at the first error, try awk -f prog < /dev/null.

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas

Awk syntax error due to improper placement of the function definition - awk

It is embarrassing, but could someone briefly explain why this gives syntax error? echo why this fails? | gawk '{ function why(fail) { print fail } why($0) }'

The function definition has to be at the top level. You have it inside the {...}. echo This works | gawk ' function why(fail) { print fail } { why($0) }'

Related

How to write awk function

Trap or evaluate bad regular expression string at runtime in awk script

reproducing grep "my pattern" myfile.log | sort | uniq | wc -l in awk

Using a variable defined inside AWK

Are there any AWK syntax checkers?

Categories

Resources