Are there any AWK syntax checkers? - awk

Are there any AWK syntax checkers? I'm interested in both minimal checkers that only flag syntax errors and more extensive checkers along the lines of lint.
It should be a static checker only, not dependent on running the script.

If you prefix your Awk script with BEGIN { exit(0) } END { exit(0) }, you're guaranteed that none of your code will run. Exiting during BEGIN skips the remaining BEGIN blocks and all input processing, and exiting during END skips the remaining END blocks. If Awk returns 0, your script parsed cleanly; otherwise there was a syntax error.
If you put the code snippet in a separate argument, you'll get good line numbers in the error messages. This invocation...
gawk --source 'BEGIN { exit(0) } END { exit(0) }' --file syntax-test.awk
Gives error messages like this:
gawk: syntax-test.awk:3: x = f(
gawk: syntax-test.awk:3: ^ unexpected newline or end of string
GNU Awk's --lint can spot things like parameters that shadow global variables and calls to undefined functions:
gawk: syntax-test.awk:5: warning: function `g': parameter `x' shadows global variable
gawk: warning: function `f' called but never defined
And GNU Awk's --posix option can spot some compatibility problems:
gawk: syntax-test.awk:2: error: `delete array' is a gawk extension
Update: BEGIN and END
Although the END { exit(0) } block seems redundant, compare the subtle differences between these three invocations:
$ echo | awk '
BEGIN { print("at begin") }
/.*/ { print("found match") }
END { print("at end") }'
at begin
found match
at end
$ echo | awk '
BEGIN { exit(0) }
BEGIN { print("at begin") }
/.*/ { print("found match") }
END { print("at end") }'
at end
$ echo | awk '
BEGIN { exit(0) } END { exit(0) }
BEGIN { print("at begin") }
/.*/ { print("found match") }
END { print("at end") }'
In Awk, exiting during BEGIN skips all remaining BEGIN blocks and prevents matching against any input, but END blocks still run. Exiting during END as well is the only way to prevent every other block from running; that's why the third invocation above executes no print statements at all. The GNU Awk User's Guide has a section on the exit statement.
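The guard trick can be wrapped in a small shell function. This is only a sketch: the name awk_syntax_ok is made up for illustration, and instead of gawk's --source it passes the guard as a first -f file, relying on awk accepting several -f options (which POSIX specifies), so it should not be tied to GNU Awk.

```shell
# Hypothetical helper (not from the original answer): succeeds only if the
# awk program in file "$1" parses. The guard rules are loaded first, so no
# rule from "$1" is ever executed.
awk_syntax_ok() {
    _guard=$(mktemp) || return 2
    printf 'BEGIN { exit(0) } END { exit(0) }\n' > "$_guard"
    _status=0
    # Input comes from /dev/null; the guard exits before any record is read.
    awk -f "$_guard" -f "$1" < /dev/null || _status=$?
    rm -f "$_guard"
    return "$_status"
}
```

A call such as awk_syntax_ok syntax-test.awk && echo "syntax OK" then prints the message only when the file parses; any error messages from awk go to stderr as usual.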

GNU awk appears to have a --lint option.

For a minimal syntax checker, which stops at the first error, try awk -f prog < /dev/null. Note that any BEGIN and END blocks in prog still execute, so this is not a purely static check.

Related

How can I store the length of a line into a var within an awk script?

I have this simple awk script with which I attempt to check the number of characters in the first line.
If the first line has more or fewer than 10 characters, I want to store the number
of characters in a var.
Somehow the first print statement works, but storing that result in a var doesn't.
Please help.
I tried removing the dollar sign, "thelength=(length($0))",
and removing the parentheses, "thelength=length($0)", but it doesn't print anything...
Thanks!
#!/bin/ksh
awk ' BEGIN {FS=";"}
{
if (NR==1)
if(length($0)!=10)
{
print(length($0))
thelength=$(length($0))
print "The length of the first line is: ",$thelength;
exit 1;
}
}
END { print "STOP" }' $1
Two issues, both from mixing ksh and awk syntax:
$(...) is command substitution in ksh, but in awk $(expr) is a field reference, so $(length($0)) reads the field whose number is the line length; use thelength=length($0)
awk variables are not prefixed with $ when referenced ($thelength would again be a field reference); use print ... ,thelength
So your code becomes:
#!/bin/ksh
awk ' BEGIN {FS=";"}
{
if (NR==1)
if(length($0)!=10)
{
print(length($0))
thelength=length($0)
print "The length of the first line is: ",thelength;
exit 1;
}
}
END { print "STOP" }' "$1"
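To see the difference between an awk variable and a field reference in isolation, here is a minimal sketch. The sample input and the variable names report and n are assumptions made up for the demonstration; only FS=";" is taken from the script above.

```shell
# Assumed sample input: the first line is 11 characters long and has two
# ";"-separated fields.
report=$(printf 'hello;world\nsecond line\n' | awk '
BEGIN { FS = ";" }
NR == 1 {
    thelength = length($0)   # plain awk variable: no $ on assignment or use
    n = 2
    print thelength, $n      # $n is field number n, not the value of n
    exit
}')
printf '%s\n' "$report"
```

This prints the stored length followed by the second field, which shows that putting $ in front of a variable name selects a field instead of reading the variable.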

Keep set -e setting inside || or &&

I have a simple script with a simple function which can lead to an error. Let's define this function, and make it broken:
brokenFunction () {
ls "non-existing-folder"
}
If we execute this function in a block detecting if it is broken, it works well:
brokenFunction || printf "It is broken\n"
prints "It is broken"
Now, let's make the function a bit more complex by adding a correct command at the end:
#!/bin/sh
brokenFunction () {
ls "non-existing-folder"
printf "End of function\n"
}
brokenFunction || printf "It is broken\n"
This script prints:
$ ./script.sh
ls: cannot access 'non-existing-folder': No such file or directory
End of function
while I expected the function to stop before the printf statement, and the next block to display "It is broken".
And indeed, if I check the exit status code of brokenFunction, it is 0.
I tried adding set -e to the top of the script. The behavior is still the same, but the exit code of brokenFunction if called without || now becomes 2. If called with it, the status code is still 0.
Is there any way to keep the set -e setting inside a function called with ||?
EDIT: I just realized that the function in the example was useless. I encounter the same issue with a simple block and a condition.
#!/bin/sh
set -e
{
ls "non-existing-dir"
printf "End of block\n"
} || {
printf "It is broken\n"
}
prints
$ ./script.sh
ls: cannot access 'non-existing-dir': No such file or directory
End of block
As written in man bash, set -e is ignored in some contexts. A command before || or && is such a context.
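trap looks like a possible solution here. Note that the ERR trap is a bash (and ksh) extension rather than part of POSIX sh, so the script needs a bash shebang. A working alternative to the last script using trap would look like this:
#!/bin/bash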
abort () {
printf "It is broken\n"
}
trap 'abort' ERR
(
set -e
false
printf "End of block\n"
)
trap - ERR
A few things to note here:
trap 'abort' ERR binds the abort function to any command that fails;
the broken block is executed in a sub-shell for two reasons. The first is to keep the set -e setting inside the block and limit side effects. The second is to exit only this sub-shell on error (the set -e effect), and not the whole script;
trap - ERR at the end resets the trap binding, so the rest of the script runs as before.
To test the side effects, we can add the previously non-working part:
#!/bin/bash
abort () {
printf "It is broken\n"
}
trap 'abort' ERR
(
set -e
false
printf "End of block\n"
)
trap - ERR
{
false
printf "End of second block\n"
} || {
printf "It is broken too\n"
}
prints:
It is broken
End of second block
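Another option that stays within POSIX sh is to run the block in a separate shell process started with -e. This is a sketch under the assumption that the block does not need functions or unexported variables from the calling script, because the child shell cannot see them. The "errexit is ignored on the left of ||" rule applies inside the calling shell only, so the child still stops at the first failing command.

```shell
# The block runs in its own sh process with -e enabled, so it aborts at
# "false" and never reaches the printf. The || in the calling shell then
# sees the child's non-zero exit status.
result=$(
    sh -e -c '
        false
        printf "End of block\n"
    ' || printf 'It is broken\n'
)
printf '%s\n' "$result"
```

The command substitution is only there to capture the output for inspection; the sh -e -c '...' || ... line works the same on its own.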

Awk iterating without a loop construct

I was reading a tutorial on awk scripting and observed this strange behaviour. Why does this awk script ask for a number repeatedly when executed, even without a loop construct like while or for? If we enter CTRL+D (EOF), it stops prompting for another number.
#!/bin/awk -f
BEGIN {
print "type a number";
}
{
print "The square of ", $1, " is ", $1*$1;
print "type another number";
}
END {
print "Done"
}
Please explain this behaviour of the above awk script
awk keeps processing lines until end of file is reached. In this case the input (STDIN) never ends as long as you keep entering numbers or hitting Enter, so the main block runs again for every line, which looks like an endless loop.
When you hit CTRL+D you signal to the awk script that EOF has been reached, thereby exiting the loop.
Try this and enter 0 to exit:
BEGIN {
print "type a number";
}
{
if($1==0)
exit;
print "The square of ", $1, " is ", $1*$1;
print "type another number";
}
END {
print "Done"
}
From the famous The AWK Programming Language:
If you don't provide an input file to the awk script on the command line, awk will apply the program to whatever you type next on your terminal until you type an end-of-file signal (control-d on Unix systems).
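The implicit loop is easier to see when the input is piped in instead of typed. In this small sketch the two input lines are made up; the main block runs once per line and END runs when the input is exhausted, which is the role CTRL+D plays at a terminal.

```shell
# Two records in, so the main block runs twice; END runs once at end of input.
squares=$(printf '2\n3\n' | awk '
{ print $1 * $1 }
END { print "Done" }')
printf '%s\n' "$squares"
```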

reproducing grep "my pattern" myfile.log | sort | uniq | wc -l in awk

If I perform this grep on my target file I get, e.g., 275 as the result.
But I want to learn awk, so I tried this in awk:
awk 'BEGIN { count=0 } /my pattern/ { count++ } END { print count }' myfile.log
And this prints 275 as expected.
So, getting ambitious, I created an awk script like this:
BEGIN {
print "Log File Analysis";
message=0;
events=0;
}
{
/message/ { messages++; }
/event/ { events++; }
}
END {
print "messages:\t" messages;
print "events:\t" events;
}
I get a syntax error,
$ awk -f test_learn.awk test_log.log
awk: test_learn.awk:16: /message/ { messages++; }
awk: test_learn.awk:16: ^ syntax error
What am I doing wrong?
I am using awk from the MinGW shell on Windows 7.
Try
awk 'BEGIN { count=0 }; /my pattern/{count++ }; END { print count }' myfile.log
OR
awk 'BEGIN { count=0}; { if ($0 ~ /my pattern/) count++ }; END { print count };' myfile.log
Better yet, as variables are initialized as zero by default, you don't need the BEGIN block, so
awk '/my pattern/{count++ }; END { print count };' myfile.log
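You can either have a single block applied to all lines in a file, as in the second example with the if, or you can have multiple blocks, each "filtered" by a pattern, as above and in your edited addition. A pattern { action } rule must sit at the top level of the program; it cannot be nested inside another { ... } block, which is what causes your syntax error.
When writing one-liners as you have, some awks require a semicolon to separate the BEGIN and END blocks from the main block.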
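Note that the pipeline in the title includes sort | uniq, so it counts distinct matching lines, while the one-liners above count every matching line. If the log contains duplicate lines, a sketch along the following lines should give the distinct count. The array name seen and the sample input are assumptions made up for the demonstration.

```shell
# Assumed sample input: three matching lines, two of them identical.
unique_count=$(printf 'my pattern a\nmy pattern a\nmy pattern b\nother\n' |
    awk '
    # seen[$0]++ is 0 (false) the first time a line appears, so the rule
    # fires once per distinct matching line.
    /my pattern/ && !seen[$0]++ { count++ }
    END { print count + 0 }   # + 0 forces "0" instead of an empty string
    ')
printf '%s\n' "$unique_count"
```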
Edit
Same idea with your second issue, integrating Ed Morton's improvements (thanks):
/message/ { messages++ }
/event/ { events++ }
END {
print "Log File Analysis"
print "messages:\t" messages
print "events:\t" events
}
IHTH

awk 1 unexpected character '.' suddenly appeared

The script was working. I added some comments, renamed it, and then submitted it. Today my instructor told me it doesn't work and gives the error: awk 1 unexpected character '.'
The script is supposed to read a name from the command line and return the student information for that name.
I just checked it and, surprisingly, it does give me the error.
I should run it with a command like this:
scriptName -v name="aname" -f filename
What is this problem, and which part of my code causes it?
#!/usr/bin/awk
BEGIN{
tmp=name;
nameIsValid;
if (name && tolower(name) eq ~/^[a-z]+$/ )
{
inputName=tolower(name)
nameIsValid++;
}
else
{
print "you have not entered the student name"
printf "Enter the student's name: "
getline inputName < "-"
tmp=inputName;
if (tolower(inputName) eq ~/^[a-z]+$/)
{
tmpName=inputName
nameIsValid++
}
else
{
print "Enter a valid name!"
exit
}
}
inputName=tolower(inputName)
FS=":"
}
{
if($1=="Student Number")
{
split ($0,header,FS)
}
if ($1 ~/^[0-9]+$/ && length($1)==8)
{
split($2,names," ")
if (tolower(names[1]) == inputName || tolower(names[2])==inputName )
{
counter++
for (i=1;i<=NF;i++)
{
printf"%s:%s ",header[i], $i
}
printf "\n"
}
}
}
END{
if (counter == 0 && nameIsValid)
{
printf "There is no record for the %-10s\n" , tmp
}
}
Here are the steps to fix the script:
Get rid of all those spurious NULL statements (trailing semi-colons at the end of lines).
Get rid of the unset variable eq (it is NOT an equality operator!) from all of your comparisons.
Clean up the indenting.
Get rid of that first non-functional nameIsValid; statement.
Change printf "\n" to the simpler print "".
Get rid of the useless ,FS arg to split().
Change name && tolower(name) ~ /^[a-z]+$/ to just the second part of that condition since if that matches then of course name is populated.
Get rid of all of those tolower()s and use character classes instead of explicit a-z ranges.
Get rid of the tmp variable.
Simplify your BEGIN logic.
Get rid of the unnecessary nameIsValid variable completely.
Make the awk body a bit more awk-like
And here's the result (untested since no sample input/output posted):
BEGIN {
if (name !~ /^[[:alpha:]]+$/ ) {
print "you have not entered the student name"
printf "Enter the student's name: "
getline name < "-"
}
if (name ~ /^[[:alpha:]]+$/) {
inputName=tolower(name)
FS=":"
}
else {
print "Enter a valid name!"
exit
}
}
$1=="Student Number" { split ($0,header) }
$1 ~ /^[[:digit:]]+$/ && length($1)==8 {
split(tolower($2),names," ")
if (names[1]==inputName || names[2]==inputName ) {
counter++
for (i=1;i<=NF;i++) {
printf "%s:%s ",header[i], $i
}
print ""
}
}
END {
if (counter == 0 && inputName) {
printf "There is no record for the %-10s\n" , name
}
}
I changed the shebang line to:
#!/usr/bin/awk -f
and then didn't use -f on the command line. It is working now.
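The reason the shebang matters: for an executable script, the kernel appends the script path to the interpreter line. With #!/usr/bin/awk and no -f, awk receives the path itself (for example ./scriptName) as the program text, and the leading '.' is a syntax error on line 1. With -f, the path is read as a program file and later options such as -v are still honoured. The sketch below imitates that second case by spelling out the equivalent command; the script body and the name greeting are made up for illustration.

```shell
script=$(mktemp)
cat > "$script" <<'EOF'
#!/usr/bin/awk -f
BEGIN { print "hello, " name }
EOF
# Roughly what the kernel runs for:  ./script -v name="world"
# The shebang line itself is just a comment as far as awk is concerned.
greeting=$(awk -f "$script" -v name="world")
rm -f "$script"
printf '%s\n' "$greeting"
```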
Run the script in the following way:
awk -f script_name.awk input_file.txt
This seems to suppress the warnings and errors.
In my case, the problem was setting the IFS variable to IFS="," as suggested in this answer for splitting a string into an array. So I reset the IFS variable afterwards and got my code to work.
IFS=', '
read -r -a array <<< "$string"
IFS=$' \t\n' # reset IFS to its default (space, tab, newline)
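A way to avoid the reset altogether is to put the IFS assignment on the same line as read, so that it applies to that one command only. The sketch below uses plain variables and a here-document rather than the bash-only array and <<< forms, so that it also runs under a POSIX sh; the sample string and variable names are made up for illustration.

```shell
string="alpha,beta,gamma"
saved_ifs=$IFS   # kept only to show afterwards that IFS was not changed
# The assignment prefix is in effect for this read command alone.
IFS=, read -r first second third <<EOF
$string
EOF
printf '%s\n' "$second"
```

In bash the same prefix form works with the array version: IFS=', ' read -r -a array <<< "$string".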