Awk syntax error pattern matching - awk

I want to match every empty line, and when it is an empty line, i want to go to the next line.
The problem is when i type
/^$/ { next }
it always gives me a syntaxt error. It always refers to the first '{'. I thought this was the correct syntax. Can anyone help me please?
My script:
BEGIN{
FS=" "
matching=0
num=0
}
{
/^$/ { next }
if(matching==1 && num>=NF){
print("")
matching=0
}
if(match($NF,type)>0){
for(i=1;i<=NF;i++){
printf($i)
}
printf("\n")
num=NF
matching=1
next
}
if(matching==1){
for(i=1;i<=NF-num;i++){
printf(" ")
}
printf($NF)
printf("\n")
}
}
END{
}
This is my script

You are trying to use the pattern {action} syntax from within an {action} block there.
You need to move your /^$/ {next} line outside of the action block you are starting on line 6 (by moving it above that opening { to get what you want. Or use the if style matching in the action block.
Your script:
BEGIN{
FS=" "
matching=0
num=0
}
{ # <---- This starts an action block.
/^$/ { next } # <----- This is a pattern/action pair.
if(matching==1 && num>=NF){
print("")
matching=0
}
if(match($NF,type)>0){
for(i=1;i<=NF;i++){
printf($i)
}
printf("\n")
num=NF
matching=1
next
}
if(matching==1){
for(i=1;i<=NF-num;i++){
printf(" ")
}
printf($NF)
printf("\n")
}
}
END {
}

Your script has several issues, commented below:
BEGIN{
FS=" "
matching=0 # No need to init variables to zero, this is default behavior.
num=0 # Ditto.
}
{
/^$/ { next } # You can't just use "condition { action }" when you're already
# inside an awk action block. Move this outside of the action
# block or change it to "if (/^$/) { next }"
if(matching==1 && num>=NF){
print("") # print is a builtin not a function. Just do print "".
matching=0
}
if(match($NF,type)>0){
for(i=1;i<=NF;i++){
printf($i) # printf is a builtin, not a function and NEVER put input data
# where the printf formatting string should be. Change this to
# printf "%s", $i
}
printf("\n") # print ""
num=NF
matching=1
next
}
if(matching==1){
for(i=1;i<=NF-num;i++){
printf(" ") # printf " "
}
printf($NF) # printf "%s", $NF
printf("\n") # print ""
}
}
END{ # unused and unnecessary, remove this section.
}
I suspect if you posted some sample input and expected output we could help you write a better (more concise and more idomatic) script.

Related

Error trying to redirect output of awk script to a new file

I am working on the following code in an awk script and I need the output to be redirected to another file within the same script.
BEGIN { FS=OFS="," }
NR==1 {print; next}
{ $9 = sprintf("%0.2f", $9) }
{ a[$0]++ }
BEGIN { FS=OFS="," }
{ gsub(/\r/,"") }
FNR==1 { $10="Survival Percentage" }
FNR > 1 && ($5+0==$5 && $6+0==$6 && $3+0==$3){
$10=sprintf("%0.2f",(($5-$6)/$3)*100)
}1
END {
if (i>0){
for (i in a){
print "i" > nj.csv
}}}
This is my code and just by executing it I get an error pointing to the point between nj and csv (nj.csv). Any idea to solve it?
gsub(/\r/,"") is almost always the wrong thing to do, you probably meant sub(/\r$/,""), and print "i" > nj.csv should be print i > "nj.csv" but idk why you have 2 identical BEGIN sections or what the overall purpose of the script is as it doesn't seem to make any sense.

In AWK, skip the rest of the current action?

Thanks for looking.
I have an AWK script with something like this;
/^test/{
if ($2 == "2") {
# What goes here?
}
# Do some more stuff with lines that match test, but $2 != "2".
}
NR>1 {
print $0
}
I'd like to skip the rest of the action, but process the rest of the patterns/actions on the same line.
I've tried return but this isn't a function.
I've tried next but that skips the rest of the patterns/actions for the current line.
For now I've wrapped the rest of the ^test action in the if statement's else, but I was wondering if there was a better approach.
Not sure this matters but I am using gawk on OSX, installed via brew (for better compatibility with my target OS).
Update (w/solution):
Edits: Expanded code sample based on #karakfa's answer.
BEGIN{
keepLastLine = 1;
}
/^test/ && !keepLastLine{
printLine = 1;
print $0;
next;
}
/^test/ && keepLastLine{
printLine = 0;
next;
}
/^foo/{
# This is where I have the rest of my logic (approx 100 lines),
# including updates to printLine and keepLastLine
}
NR>1 {
if (printLine) {
print $0
}
}
This will work for me, I even like it better that what I was thinking of.
However I do wonder what if my keepLastLine condition was only accessible in a for loop?
I gather from what #karakfa has said, there isn't a control structure for exiting only an action, and continuing with other patterns, so that would have to be implemented with a flag of some sort (not unlike #RavinderSingh13's answer).
If I got it correct could you please try following. I am creating a variable named flag here which will be chedked if condition inside test block for checking if 2nd field is 2 is TRUE then it will be SET. When it is SET so rest of statements in test BLOCK will NOT be executed. Also resetting flag's value before read starts for a line too.
awk '
{
found=""
}
/^test/{
if ($2 == "2") {
# What goes here?
found=1
}
if(!found){
# Do some more stuff with lines that match test, but $2 != "2".
}
}
NR>1 {
print $0
}' Input_file
Testing of code here:
Let's say following is the Input_file:
cat Input_file
file
test 2 file
test
abcd
After running code following we will get following output, where if any line is having test keyword and NOT having $2==2 then also it will execute statements outside of test condition.
awk '
{
found=""
}
/^test/{
if ($2 == "2") {
print "# What goes here?"
found=1
}
if(!found){
print "Do some more stuff with lines that match test, but $2 != 2"
}
}
NR>1 {
print $0
}' Input_file
# What goes here?
test 2 file
Do some more stuff with lines that match test, but $2 != 2
test
abcd
the magic keyword you're looking for is else
/^test/{ if($2==2) { } # do something
else { } # do something else
}
NR>1 # {print $0} is implied.
for some reason if you don't want to use else just move up condition one up (flatten the hierarchy)
/^test/ && $2==2 { } # do something
/^test/ && $2!=2 { } # do something else
# other action{statement}s

reproducing grep "my pattern" myfile.log | sort | uniq | wc -l in awk

If I perform this grep on my target file I get eg 275 as result.
But I want to learn awk so tried this in awk:
awk 'BEGIN { count=0 } /my pattern/ { count++ } END { print count }' myfile.log
And this prints the 275 as expected.
So getting ambitious I created an awk script like this:
BEGIN {
print "Log File Analysis";
message=0;
events=0;
}
{
/message/ { messages++; }
/event/ { events++; }
}
END {
print "messages:\t" messages;
print "events:\t" events;
}
I get a syntax error,
$ awk -f test_learn.awk test_log.log
awk: test_learn.awk:16: /message/ { messages++; }
awk: test_learn.awk:16: ^ syntax error
What am I doing wrong?
I am using awk from MinGW shell on windows 7.
try
awk 'BEGIN { count=0 }; /my pattern/{count++ }; END { print count }' myfile.log
OR
awk 'BEGIN { count=0}; { if ($0 ~ /my pattern/) count++ }; END { print count };' myfile.log
Better yet, as variables are initialized as zero by default, you don't need the BEGIN block, so
awk '/my pattern/{count++ }; END { print count };' myfile.log
You can either have a default loop applied to all lines in a file, as in 2d example with the if, or you can have multiple blocks, "filtered" by pattern, as above, and in your edited addition.
When doing one-liners have you have, some awks required the semi-colon to separate the BEGIN and END blocks from the main loop block.
Edit
Same Idea with your 2nd issue, and integrating Ed Morton's improvments (thanks)
/message/ { messages++ }
/event/ { events++ }
END {
print "Log File Analysis"
print "messages:\t" messages
print "events:\t" events
}
IHTH

awk 1 unexpected character '.' suddenly appeared

the script was working. I added some comments and renamed it then submitted it. today my instructor told me it doesnt work and give me the error of awk 1 unexpected character '.'
the script is supposed to read a name in command line and return the student information for the name back.
right now I checked it and surprisingly it gives me the error.
I should run it by the command like this:
scriptName -v name="aname" -f filename
what is this problem and which part of my code make it?
#!/usr/bin/awk
BEGIN{
tmp=name;
nameIsValid;
if (name && tolower(name) eq ~/^[a-z]+$/ )
{
inputName=tolower(name)
nameIsValid++;
}
else
{
print "you have not entered the student name"
printf "Enter the student's name: "
getline inputName < "-"
tmp=inputName;
if (tolower(inputName) eq ~/^[a-z]+$/)
{
tmpName=inputName
nameIsValid++
}
else
{
print "Enter a valid name!"
exit
}
}
inputName=tolower(inputName)
FS=":"
}
{
if($1=="Student Number")
{
split ($0,header,FS)
}
if ($1 ~/^[0-9]+$/ && length($1)==8)
{
split($2,names," ")
if (tolower(names[1]) == inputName || tolower(names[2])==inputName )
{
counter++
for (i=1;i<=NF;i++)
{
printf"%s:%s ",header[i], $i
}
printf "\n"
}
}
}
END{
if (counter == 0 && nameIsValid)
{
printf "There is no record for the %-10s\n" , tmp
}
}
Here are the steps to fix the script:
Get rid of all those spurious NULL statements (trailing semi-colons at the end of lines).
Get rid of the unset variable eq (it is NOT an equality operator!) from all of your comparions.
Cleanup the indenting.
Get rid of that first non-functional nameIsValid; statement.
Change printf "\n" to the simpler print "".
Get rid of the useless ,FS arg to split().
Change name && tolower(name) ~ /^[a-z]+$/ to just the second part of that condition since if that matches then of course name is populated.
Get rid of all of those tolower()s and use character classes instead of explicit a-z ranges.
Get rid of the tmp variable.
Simplify your BEGIN logic.
Get rid of the unnecessary nameIsValid variable completely.
Make the awk body a bit more awk-like
And here's the result (untested since no sample input/output posted):
BEGIN {
if (name !~ /^[[:alpha:]]+$/ ) {
print "you have not entered the student name"
printf "Enter the student's name: "
getline name < "-"
}
if (name ~ /^[[:alpha:]]+$/) {
inputName=tolower(name)
FS=":"
}
else {
print "Enter a valid name!"
exit
}
}
$1=="Student Number" { split ($0,header) }
$1 ~ /^[[:digit:]]+$/ && length($1)==8 {
split(tolower($2),names," ")
if (names[1]==inputName || names[2]==inputName ) {
counter++
for (i=1;i<=NF;i++) {
printf "%s:%s ",header[i], $i
}
print ""
}
}
}
END {
if (counter == 0 && inputName) {
printf "There is no record for the %-10s\n" , name
}
}
I changed the shebang line to:
#!/usr/bin/awk -f
and then in command line didnt use -f. It is working now
Run the script in the following way:
awk -f script_name.awk input_file.txt
This seems to suppress the warnings and errors.
In my case, the problem was resetting the IFS variable to be IFS="," as suggested in this answer for splitting string into an array. So I resetted the IFS variable and got my code to work.
IFS=', '
read -r -a array <<< "$string"
IFS=' ' # reset IFS

Extracting multiple columns without loop

I am writing an awk script that will take the output of grep and nicely format that into an HTML table. The delimiter is the ":" character; the problem I'm running into is that that character can appear in the text as well. So if I just use $1, $2, and $3 for the filename, line number, and comment respectively, I lose anything after the first : in the comment
Is there a way to say $1, $2, and then $3..NR without explicitly looping over the columns and concatenating them together?
Here's the script so far:
`
#!/usr/bin/awk
BEGIN {
FS=":"
print "<html><body>"
print "<table>"
print "<tr><td>File name</td><td>Line number</td><td>Comment</td></tr>"
}
{
print "<tr><td>" $1 "</td><td>" $2 "</td><td>" $3 "</td></tr>"
}
END {
print "</table>"
print "</body></html>"
}`
And some sample input:
./mysql-connector-java-5.0.8/src/com/mysql/jdbc/BlobFromLocator.java:177: // TODO: Make fetch size configurable
./mysql-connector-java-5.0.8/src/com/mysql/jdbc/CallableStatement.java:243: // TODO Auto-generated method stub
./mysql-connector-java-5.0.8/src/com/mysql/jdbc/CallableStatement.java:836: // TODO: Do this with less memory allocation
BEGIN { FS=":"; OFS=":" }
{ name=$1; number=$2; $1=""; $2=""; comment=substr($0,3); }
{ print gensub(/^[^:]*:[^:]*:/,"","g") }