Getting output for every line of input file. Only need output once - awk

This should be pretty simple but I'm having an issue with the flow of an awk script. I run the following script and it prints the output over and over again (if I had to guess I would say that it's printing once for every line of the input file). As requested, here is some fake input:
[30000] (03/20 00:00:02.950):{0x2D90} Pattern1 5.0.3.57
[30000] (03/20 00:00:03.911):{0x2D90} Pattern2 5.0.3.57
[30000] (03/20 00:00:02.950):{0x2D90} Pattern3 5.0.3.16
[30000] (03/20 00:00:03.911):{0x2D90} Pattern4 5.0.3.16
Here is the script:
/Pattern1/ {
gsub(/\./,"");
agtver=$5;
}
/Pattern2/ {
gsub(/\./,"");
ctrver=$5;
}
{
if (agtver ~ 50357 && ctrver ~ 50357) {
print "Blamo!";
}
else print "No blamo. :("
}
And here is the output that I'm getting:
[chawkins@chawkins-DT Devel]$ ./fakeawk.awk < fake.txt
No blamo. :(
Blamo!
Blamo!
Blamo!
The output that I expect is a single Blamo! if the patterns match and a single No blamo. :( if they don't match.
The problem seems to be that there are three separate { ... } sections, but I need these to be able to process two patterns... unless there is a way to condense this.

If you never see Pattern1 and Pattern2 again after the first time, agtver and ctrver remain set, so the test keeps succeeding on every later line. You have to clear them again.
Edit: added debug output (commented out below), so you should be able to see where the logic is failing.
Tested with your data, thanks for adding that!
/Pattern1/ { gsub(/\./,""); agtver=$5;}
/Pattern2/ { gsub(/\./,""); ctrver=$5;}
{
#dbg print "\n#dbg: $5=" $5 "xx\tagtver=" agtver "xx\tctrver=" ctrver "xxx\t$0=" $0
if (agtver ~ 50357 && ctrver ~ 50357) {
print "Blamo!";
agtver="" ; ctrver=""
}
else print "No blamo. :("
}
./fakeawk.awk < fake.txt
output
No blamo. :(
Blamo!
No blamo. :(
No blamo. :(
I hope this helps.
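If the goal is exactly one line of output for the whole file, another option is to move the comparison into an END block, which awk runs once after the last input line. This is a minimal sketch, not the answerer's method: check_versions is a name made up for the illustration, and it compares the version strings directly instead of stripping the dots first.

```shell
# check_versions is a hypothetical wrapper; it reads log lines on stdin.
check_versions() {
    awk '
        /Pattern1/ { agtver = $5 }   # remember the version on the Pattern1 line
        /Pattern2/ { ctrver = $5 }   # remember the version on the Pattern2 line
        END {
            # END runs once, after all input has been read
            if (agtver == "5.0.3.57" && ctrver == "5.0.3.57")
                print "Blamo!"
            else
                print "No blamo. :("
        }
    '
}

# Two of the sample lines from the question, fed in on stdin.
printf '%s\n' \
    '[30000] (03/20 00:00:02.950):{0x2D90} Pattern1 5.0.3.57' \
    '[30000] (03/20 00:00:03.911):{0x2D90} Pattern2 5.0.3.57' | check_versions
```

With the real file the call would be check_versions < fake.txt.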

TXR:
@(gather :vars (agtver ctrver))
@ (skip :greedy) @/Pattern1/ @{agtver /5\.0\.3\.57/}
@ (skip :greedy) @/Pattern2/ @{ctrver /5\.0\.3\.57/}
@(end)
@(do (put-string "Blamo!\n"))
Output:
$ txr fake.txr fake.log
Blamo!
$ echo "junk" | txr fake.txr -
false
The @(gather) directive is perfect for this. It matches material that can appear in any order, and :vars (agtver ctrver) adds the constraint that bindings must be found for both of these variables, or else a failure occurs.
We can then express the two independent conditions we are looking for as a pair of independent whole-line pattern matches which bind two different variables.
The logic may be read as "please scan the input to gather bindings for the variables agtver and ctrver, or else fail". The rules for gathering the variables are then specified, one per line.
We don't really need the side effect of printing Blamo!: the successful or failed termination of the program tells us everything.

Related

Else syntax error when nesting array formula

I am receiving a syntax error on "else" for this shell:
{for (i=8;i<=NF;i+=3)
{if ($0~"=>") # if-else statement designed to flag file / directory transfers
print "=> flag,"$1"," $2","$3","$4 ","$5","$6","$7"," $(i)","$(i+1)","$(i+2);
{split ($(i+2), array, "/");
for (x in array)
{j++;
a[j] =j;
printf (array[x] ",");}
printf ("%s\n", "");}
else
print "no => flag,"$1"," $2","$3","$4 ","$5","$6","$7"," $(i)","$(i+1)","$(i+2)
}
}
Can't figure out why. If I delete the array block (starting with split()), all is well. But I need to scan the contents of $(i+2), so cutting it does me no good.
Also, if anyone has guidance on a good list of how to interpret error messages, that would be great.
Thanks for your advice.
EDIT: here is the above script laid out with sensible formatting:
{
for (i=8;i<=NF;i+=3) {
if ($0~"=>") # if-else statement designed to flag file / directory transfers
print "=> flag,"$1"," $2","$3","$4 ","$5","$6","$7"," $(i)","$(i+1)","$(i+2);
{
split ($(i+2), array, "/");
for (x in array) {
j++;
a[j] =j;
printf (array[x] ",");
}
printf ("%s\n", "");
}
else
print "no => flag,"$1"," $2","$3","$4 ","$5","$6","$7"," $(i)","$(i+1)","$(i+2)
}
}
First things first: you didn't post any sample input or expected output, so I haven't tested this at all. Could you please try the following? I assume you are running it as a .awk script file. These are mostly syntax/cosmetic changes, not changes to the logic, since no background on the problem was given.
BEGIN{
OFS=","
}
{
for (i=8;i<=NF;i+=3){
if ($0~/=>/){
print "=> flag,"$1,$2,$3,$4,$5,$6,$7,$(i),$(i+1),$(i+2)
split ($(i+2), array, "/");
for(x in array){
j++;
a[j] =j;
printf (array[x] ",")
}
printf ("%s\n", "")
}
else{
print "no => flag",$1,$2,$3,$4,$5,$6,$7,$(i),$(i+1),$(i+2)
}
}
}
Problems fixed in the OP's attempt:
Opening curly braces { now sit at the end of the for and if lines instead of on the following line, so it is easy to see which statements each one controls. In the original, the if had no braces, so only the first print belonged to it; the { split ... } block was a separate statement, which left the else without a matching if.
Since you are matching against a regexp, I changed $0~"=>" to $0~/=>/.
Added a BEGIN section that sets OFS (the output field separator) to "," so that a plain , between the print arguments produces the comma and you no longer need to print "," yourself.
Fixed the indentation so it is clear where each loop and condition closes.
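To see the braced if/else shape in isolation, here is a stripped-down sketch. The input lines and the name flag_demo are made up for the illustration, and it walks the split result by numeric index rather than with for (x in array), whose order awk does not guarantee.

```shell
# flag_demo is a hypothetical name; it reads lines on stdin.
flag_demo() {
    awk '{
        if ($0 ~ /=>/) {                 # braces keep every statement under the if
            print "=> flag"
            n = split($NF, parts, "/")   # split returns the number of pieces
            for (k = 1; k <= n; k++)
                printf "%s,", parts[k]
            printf "\n"
        } else {                         # the else now has an if to attach to
            print "no => flag"
        }
    }'
}

echo "copy => a/b/c" | flag_demo
echo "plain line" | flag_demo
```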

In AWK, skip the rest of the current action?

Thanks for looking.
I have an AWK script with something like this;
/^test/{
if ($2 == "2") {
# What goes here?
}
# Do some more stuff with lines that match test, but $2 != "2".
}
NR>1 {
print $0
}
I'd like to skip the rest of the action, but process the rest of the patterns/actions on the same line.
I've tried return but this isn't a function.
I've tried next but that skips the rest of the patterns/actions for the current line.
For now I've wrapped the rest of the ^test action in the if statement's else, but I was wondering if there was a better approach.
Not sure this matters but I am using gawk on OSX, installed via brew (for better compatibility with my target OS).
Update (w/solution):
Edits: Expanded code sample based on @karakfa's answer.
BEGIN{
keepLastLine = 1;
}
/^test/ && !keepLastLine{
printLine = 1;
print $0;
next;
}
/^test/ && keepLastLine{
printLine = 0;
next;
}
/^foo/{
# This is where I have the rest of my logic (approx 100 lines),
# including updates to printLine and keepLastLine
}
NR>1 {
if (printLine) {
print $0
}
}
This will work for me; I even like it better than what I was thinking of.
However, I do wonder: what if my keepLastLine condition were only accessible inside a for loop?
I gather from what @karakfa has said that there isn't a control structure for exiting only an action and continuing with the other patterns, so that would have to be implemented with a flag of some sort (not unlike @RavinderSingh13's answer).
If I understood correctly, could you please try the following? I create a variable named found, which is set when the condition inside the test block (the 2nd field being 2) is true. Once it is set, the rest of the statements in the test block are not executed. The variable is also reset before each line is processed.
awk '
{
found=""
}
/^test/{
if ($2 == "2") {
# What goes here?
found=1
}
if(!found){
# Do some more stuff with lines that match test, but $2 != "2".
}
}
NR>1 {
print $0
}' Input_file
Testing of code here:
Let's say following is the Input_file:
cat Input_file
file
test 2 file
test
abcd
After running the following code we get the output below. The statements outside the test block (the NR>1 print) still run for every line, whether or not a test line has $2==2.
awk '
{
found=""
}
/^test/{
if ($2 == "2") {
print "# What goes here?"
found=1
}
if(!found){
print "Do some more stuff with lines that match test, but $2 != 2"
}
}
NR>1 {
print $0
}' Input_file
# What goes here?
test 2 file
Do some more stuff with lines that match test, but $2 != 2
test
abcd
the magic keyword you're looking for is else
/^test/{ if($2==2) { } # do something
else { } # do something else
}
NR>1 # {print $0} is implied.
If for some reason you don't want to use else, just move the condition up one level (flatten the hierarchy):
/^test/ && $2==2 { } # do something
/^test/ && $2!=2 { } # do something else
# other action{statement}s
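Since return only works inside a function, one more option is to move the body of the action into a user-defined function and return early from it; the remaining pattern/action pairs still run for the same line. This is a minimal sketch: handle_test and skip_demo are hypothetical names, and the printed text only marks which branch ran.

```shell
# skip_demo is a hypothetical wrapper; it reads lines on stdin.
skip_demo() {
    awk '
        function handle_test() {
            if ($2 == "2")
                return                  # leaves this action only, unlike next
            print "more work for: " $0
        }
        /^test/ { handle_test() }
        NR > 1                          # still evaluated for every line
    '
}

printf '%s\n' file 'test 2 file' test abcd | skip_demo
```

The early return also works from inside a for loop within the function, which covers the case where the condition is only known inside a loop.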

Endless recursion in gawk-script

Please pardon me in advance for posting such a big part of my problem, but I just can't put my finger on the part that fails...
I got input-files like this (abas-FO if you care to know):
.fo U|xiininputfile = whatever
.type text U|xigibsgarnich
.assign U|xigibsgarnich
..
..Comment
.copy U|xigibswohl = Spaß
.ein "ow1/UWEDEFTEST.FOP"
.in "ow1/UWEINPUT2"
.continue BOTTOM
.read "SOemthing" U|xttmp
!BOTTOM
..
..
Now I want to recursively follow each .in[put]/.ein[gabe] statement, parse the mentioned file and, if I don't know it yet, add it to an array. My code looks like this:
#!/bin/awk -f
function getFopMap(inputregex, infile, mandantdir, infiles){
while(getline f < infile){
#printf "*"
#don't match if there is a '
if(f ~ inputregex "[^']"){
#remove .input-part
sub(inputregex, "", f)
#trim right
sub(/[[:blank:]]+$/, "", f)
#remove leading and trailing "
gsub(/(^\"|\"$)/,"" ,f)
if(!(f in infiles)){
infiles[f] = "found"
}
}
}
close(infile)
for (i in infiles){
if(infiles[i] == "found"){
infiles[i] = "parsed"
cmd = "test -f \"" i "\""
if(system(cmd) == 0){
close(cmd)
getFopMap(inputregex, f, mandantdir, infiles)
}
}
}
}
BEGIN{
#Matches something like [.input myfile] or [.ein "ow1/myfile"]
inputregex = "^\\.(in|ein)[^[:blank:]]*[[:blank:]]+"
#Get absolute path of infile
cmd = "python -c 'import os;print os.path.abspath(\"" ARGV[1] "\")'"
cmd | getline rootfile
close(cmd)
infiles[rootfile] = "parsed"
getFopMap(inputregex, rootfile, mandantdir, infiles)
#output result
for(infile in infiles) print infile
exit
}
I call the script (in the same directory the paths are relative to) like this:
./script ow1/UWEDEFTEST.FOP
I get no output. It just hangs up. If I remove the comment before the printf "*" command, I'm seeing stars, without end.
I appreciate any help and hints on how to do it better.
My awk:
gawk Version 3.1.7
I don't know if it's your only problem, but you're calling getline incorrectly and consequently will go into an infinite loop in some scenarios. Make sure you fully understand all of the caveats at http://awk.info/?tip/getline and you might want to use the recursion example there as the starting point for your code.
The most important point for your code is that getline returns a negative value when it fails, so while(getline f < infile) becomes an infinite loop: the failing getline keeps returning non-zero, so it keeps being called and keeps failing. You need to use while ( (getline f < infile) > 0) instead.
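A small sketch of the safer loop, using only standard awk features. count_lines is a hypothetical helper, and the missing file name is made up to trigger the failure case.

```shell
# count_lines is a hypothetical helper: prints how many lines awk could read from $1.
count_lines() {
    awk -v infile="$1" '
        BEGIN {
            n = 0
            # getline returns 1 on success, 0 at end of file and -1 on error,
            # so only a result greater than 0 means a line was actually read
            while ((getline line < infile) > 0)
                n++
            close(infile)
            print n
        }
    '
}

count_lines /nonexistent/file    # prints 0 instead of looping forever
```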

awk: "default" action if no pattern was matched?

I have an awk script which checks for a lot of possible patterns, doing something for each pattern. I want something to be done in case none of the patterns was matched. i.e. something like this:
/pattern 1/ {action 1}
/pattern 2/ {action 2}
...
/pattern n/ {action n}
DEFAULT {default action}
Where of course, the "DEFAULT" line is not awk syntax, and I wish to know if there is such a syntax (like there usually is in switch/case statements in many programming languages).
Of course, I can always add a "next" command after each action, but this is tedious in case I have many actions, and more importantly, it prevents me from matching the line to two or more patterns.
You could invert the match using the negation operator ! so something like:
!/pattern 1|pattern 2|pattern/{default action}
But that's pretty nasty for n>2. Alternatively you could use a flag:
{f=0}
/pattern 1/ {action 1;f=1}
/pattern 2/ {action 2;f=1}
...
/pattern n/ {action n;f=1}
f==0{default action}
GNU awk has switch statements:
$ cat tst1.awk
{
switch($0)
{
case /a/:
print "found a"
break
case /c/:
print "found c"
break
default:
print "hit the default"
break
}
}
$ cat file
a
b
c
d
$ gawk -f tst1.awk file
found a
hit the default
found c
hit the default
Alternatively with any awk:
$ cat tst2.awk
/a/ {
print "found a"
next
}
/c/ {
print "found c"
next
}
{
print "hit the default"
}
$ awk -f tst2.awk file
found a
hit the default
found c
hit the default
Use the "break" or "next" as/when you want to, just like in other programming languages.
Or, if you like using a flag:
$ cat tst3.awk
{ DEFAULT = 1 }
/a/ {
print "found a"
DEFAULT = 0
}
/c/ {
print "found c"
DEFAULT = 0
}
DEFAULT {
print "hit the default"
}
$ gawk -f tst3.awk file
found a
hit the default
found c
hit the default
It's not exactly the same semantics as a true "default" though, so its usage like that could be misleading. I wouldn't normally advocate using all-upper-case variable names, but lower case "default" would clash with the gawk keyword, so the script wouldn't be portable to gawk in future.
As mentioned above by tue, my understanding of the standard approach in Awk is to put next at each alternative and then have a final action without a pattern.
/pattern1/ { action1; next }
/pattern2/ { action2; next }
{ default-action }
The next statement will guarantee that no more patterns are considered for the line in question. And the default-action will always happen if the previous ones don't happen (thanks to all the next statements).
There is no "maintenance free" solution for a DEFAULT branch in awk.
The first possibility I would suggest is to end each branch of a pattern match with a 'next' statement, so that it acts like a break statement. Add a final action at the end that matches everything; that is the DEFAULT branch.
The other possibility would be:
Set a flag in each branch that has a pattern match (i.e. your non-default branches),
e.g. start your actions with NONDEFAULT=1;
Add a last action at the end (the default branch) with the condition NONDEFAULT==0 instead of a regular expression match, and reset NONDEFAULT to 0 at the start of each line.
A fairly clean, portable workaround is using an if statement:
Instead of:
pattern1 { action1 }
pattern2 { action2 }
...
one could use the following:
{
if ( pattern1 ) { action1 }
else if ( pattern2 ) { action2 }
else { here is your default action }
}
As mentioned above, GNU awk has switch statements, but other awk implementations don't, so using switch would not be portable.

Using a variable defined inside AWK

I got this piece of script working. This is what I wanted:
input
3.76023 0.783649 0.307724 8766.26
3.76022 0.764265 0.307646 8777.46
3.7602 0.733251 0.30752 8821.29
3.76021 0.752635 0.307598 8783.33
3.76023 0.79528 0.307771 8729.82
3.76024 0.814664 0.307849 8650.2
3.76026 0.845679 0.307978 8802.97
3.76025 0.826293 0.307897 8690.43
with script
#!/bin/bash
awk -F ', ' '
{
for (i=3; i<=10; i++) {
if (i==NR) {
npc1[i]=sprintf("%s", $1);
npc2[i]=sprintf("%s", $2);
npc3[i]=sprintf("%s", $3);
npRs[i]=sprintf("%s", $4);
print npc1[i],npc2[i],\
npc3[i], npc4[i];
}
}
} ' p_walls.raw
echo "${npc1[100]}"
But now I can't use those arrays, npc1[i] and so on, outside awk. That last echo prints nothing. Isn't it possible, or am I missing something?
AWK is a separate process, after it finishes all internal data is gone. This is true for all external processes/commands. Bash only sees what bash builtins touch.
i is never 100, so why do you want to access npc1[100]?
What are you really trying to do? If you rewrite the question we might be able to help...
(Cherry on the cake is always good!)
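Because awk runs as a separate process, the usual way to get its values back into the shell is to have awk print them and let the shell capture that output. This is a minimal POSIX sh sketch: first_column is a hypothetical helper, and the two inline rows stand in for p_walls.raw.

```shell
# first_column is a hypothetical helper: prints column 1 of whitespace-separated input.
first_column() {
    awk '{ print $1 }'
}

# Command substitution captures everything awk printed as one string.
npc1=$(printf '%s\n' '3.76023 0.783649' '3.76022 0.764265' | first_column)

# Unquoted expansion splits the string on whitespace into $1, $2, ...
set -- $npc1
echo "$2"    # column 1 of the second row
```

In bash 4 or later, the mapfile builtin can read the same output into a real array instead of the positional parameters.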
Sorry, but all of @yi_H's answer and comments above are correct.
But there's really no problem loading 2 sets of data into 2 separate arrays in awk, ie.
awk '{
# store each line under a 1-based index, counting the lines per file
if (FILENAME == "file1") arr1[++f1max]=$0 ;
else arr2[++f2max]=$0 ;   # same for file2
}
END {
for (i=1;i<=f1max;i++) {
# arr1[i] holds line i of file1
# put what you need here for arr1 processing
#
# dont forget that you can do things like
if (arr1[i] in arr2) { print arr1[i] "=arr2[arr1[" i "]]=" arr2[arr1[i]] }
}
for (j=1;j<=f2max;j++) {
# arr2[j] holds line j of file2
# and here for arr2
}
}' file1 file2
You'll have to fill the actual processing for arr1[i] and arr2[j].
Also, get an awk book for the weekend and be up and running by Monday. It's easy. You can probably figure it out from grymoire.com/Unix/awk.html
I hope this helps.