AWK: How to suppress default print

The following awk condition block always prints $0. How can I stop it from doing so?
( nodeComplete && count )
{
#print $0
#print count;
for (i = 0; i < count; i++) {print array1[i];};
nodeComplete=0;
count=0;
}

Welcome to SO. Try moving the opening brace { onto the same line as the condition, and let me know if this helps.
( nodeComplete && count ){
#print $0
#print count;
for (i = 0; i < count; i++) {print array1[i];};
nodeComplete=0;
count=0;
}
Explanation of the above change:
The logic is simple. A { on the same line as the condition makes the
block that condition's action, so its statements run only when the
condition is true. If the { is on the next line, awk sees two separate
rules: a condition with no action, and a block with no condition. A
condition with no action defaults to printing $0, so whenever the
condition is TRUE the complete line is printed, and the separate block
runs for every line regardless of the condition.
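
To make the difference concrete, here is a minimal sketch that runs both layouts on the same two-line input. The condition NR == 2 and the "action:" label are stand-ins chosen for illustration; they are not taken from the question.

```shell
# Brace on the next line: awk reads this as two independent rules.
# Rule 1 (NR == 2, no action) prints $0; rule 2 (no pattern) runs on every line.
out_split=$(printf 'a\nb\n' | awk '
NR == 2
{ print "action:", $0 }')

# Brace on the same line: one rule, so the action runs only when NR == 2.
out_joined=$(printf 'a\nb\n' | awk '
NR == 2 {
    print "action:", $0
}')

printf 'split:\n%s\n\njoined:\n%s\n' "$out_split" "$out_joined"
```

In the split layout the bare line b shows up in the output in addition to the action lines, which is the unwanted default print from the question.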

Related

Awk how to negate a condition

I'm trying to compute some stuff in awk, and at the end print the result in the order of the input. For each line, I check if it has not been already seen. If not, I add it to the array and also store it in an order array.
{
if (! $0 in seen) {
seen[$0] = 1
order[o++] = $0
}
} END {
for (i=0; i<o; i++)
printf "%s\n", order[i]
}
You can try it with
printf 'a\nb\na\nc\nb\na\n' | awk script_above
It prints nothing. If I print the variable o at the end, it shows that its value is still 0. What am I doing wrong?
You just need to add parens to get the right operator precedence*:
# a.awk
{
if (!($0 in seen)) {
seen[$0] = 1
order[o++] = $0
}
}
END {
for (i=0; i<o; i++)
printf "%s\n", order[i]
}
Test:
$ awk -f a.awk file
a
b
c
* (The unary ! binds more tightly than the in operator: https://www.gnu.org/software/gawk/manual/html_node/Precedence.html)
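
A small sketch of the precedence difference, assuming an empty seen array and a hypothetical input value "a" held in a variable named line (both chosen for illustration):

```shell
out=$(awk 'BEGIN {
    line = "a"                           # hypothetical input; seen[] is still empty
    unparenthesized = (! line in seen)   # parsed as (!line) in seen, i.e. "0" in seen
    parenthesized   = (!(line in seen))  # the intended test: line is not yet in seen
    print unparenthesized, parenthesized
}')
echo "$out"
```

This should print 0 1: without the parentheses the test is false even for a value that has never been seen, which is why o stayed at 0 in the question.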
awk has a more idiomatic way to check whether a value has already been seen, without an explicit if inside the block. Try the following:
printf 'a\nb\na\nc\nb\na\n' | awk '
!seen[$0]++ {
order[o++] = $0
}
END {
for (i=0; i<o; i++)
printf "%s\n", order[i]
}'
Here !seen[$0]++ tests whether the current line has not yet been counted in the array named seen. On the first occurrence seen[$0] is still unset (0), so the negated condition is TRUE and awk goes inside the block. The post-increment ++ then raises that element's counter to 1, so the next time the same line appears, !seen[$0]++ is NOT TRUE and the block is skipped.

Awk program creating very large negative number when trying to add up values from csv file

Here's my program, with everything unrelated to the issue taken out:
BEGIN{
count = 0
total = 0
FS = ","
}
{
for(i=1; i<10; i++)
count += $i;
total += count
count = 0
}
END{ print(total) }
The total that gets printed comes out as a very large negative number:
-2519999999999999782145076764868608
when I'm expecting a positive number.
How would I go about fixing this? I don't think it's a concatenation issue because there are more values in the csv than in the printed out number.
Okay, I got it.
Instead of adding count to total with +=, I'm now doing:
for(i=0; i<count; i++)
total ++
It may not be the prettiest, but it gets the right answer!

count occurrence of value in multiple fields independently (awk)

I have seen numerous posts that achieve this for individual fields, but I am struggling to apply it to multiple fields separately.
input:
group1|apple|orange|lemon
group1|apple|kiwi|banana
group1|orange|cherry|lemon
group1|apple|orange|pear
(The real file has many more fields, so I need to use a loop to process each field.)
output:
Field|Fruit|Count
2|apple|3
2|orange|1
3|orange|2
3|kiwi|1
3|cherry|1
4|lemon|2
4|banana|1
4|pear|1
What I have tried so far returns the combined count across all fields instead of a count per field:
awk '
BEGIN{FS=OFS="|"; print "Field|Fruit|Count"}
{
for(i=2; i<=NF; i++){
a[$i]=$i
count[$i]++
}
}
END{
for(j in count) print j OFS count[j]
}'
Use the field number as part of the key in the count array.
awk '
BEGIN{FS=OFS="|"; print "Field|Fruit|Count"}
{
for (i = 2; i <= NF; i++) {
count[i OFS $i]++;
}
}
END {
for (j in count) {
print j, count[j];
}
}'

Parsing errors in awk blocks

awk 'BEGIN
{
INPUTFILE ='XXX'; iterator =0;
requestIterator =0;
storageFlag =T;
printFlag =F;
currentIteration =F;
recordCount =1;
while (getline < "'"$INPUTFILE"'")
{
requestArray[requestIterator]++;
requestIterator++;
}
}
if ($1 ~ /RequestId/)
{
FS = "=";
if($2 in requestArray)
{
storage[iterator] =$0;
printFlag =T;
next
}
else
{
storageFlag =F;
next
}
}
else
{
if((storageFlag =='T' && $0 != "EOE"))
{
storage[iterator]=$0; iterator++;
}
else {if(storageFlag == 'F')
{
next
}
else
{
if(printFlag == 'T')
{
for(details in storage)
{
print storage[details] >> FILE1;
delete storage[details];
}
printFlag =F;
storageFlag =T;
next
}
}'
I am getting syntax errors with the above code. Could you please help me?
awk: BEGIN{INPUTFILE =XXXX;iterator =0;requestIterator =0;storageFlag =T;printFlag =F;currentIteration =F;recordCount =1;while (getline < ""){requestArray[requestIterator]++;requestIterator++;}}if ($1 ~ /RequestId/){FS = "=";if($2 in requestArray){storage[iterator] =$0;printFlag =T;next}else{storageFlag =F;next}}else{if((storageFlag ==T && $0 != EOE)){storage[iterator]=$0;iterator++;}else{if(storageFlag == F){next}else{if(printFlag == T){for(details in storage){print storage[details] >> XXXX;delete storage[details];}printFlag = F;storageFlag =T;next}}}}
awk: ^ syntax error
awk: ^ syntax error
Quotes are the problem. The first single quote in INPUTFILE ='XXX' is parsed by the shell as closing the one before BEGIN, and from then on all the parsing is broken.
Either keep single quotes out of the script or put the awk program into a separate file rather than "inline".
# STARTING POINT - known bad
awk 'BEGIN { INPUTFILE ='XXX'; iterator =0; ... '
This has to be rewritten to remove all of the single quotes inside the outer pair. awk string literals use double quotes anyway, so 'XXX', 'T' and 'F' need to become "XXX", "T" and "F":
awk 'BEGIN { INPUTFILE ="XXX"; iterator =0; ... '
Wrapping the whole program in double quotes instead is not a good alternative: awk has no single-quoted strings, and the shell would expand $0, $1 and $2 inside the script before awk ever sees them.
If you really need a literal single quote inside a single-quoted script, a backslash alone does not work, because the shell does not process escapes inside single quotes. Close the quote, add an escaped quote and reopen it ('\''), or use the octal escape \047 inside an awk string:
awk 'BEGIN { print "it\047s" }'
All of these problems go away if you put the awk script into a separate file and run it with awk -f rather than inlining it in the shell. You can then use whatever quotes you like and the shell will not care.
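
As a sketch of how a shell value can reach awk without splicing quotes into the program, the -v option assigns it to an awk variable before the script runs. The file contents, the variable names input_file, inputfile and wanted, and the simplified RequestId matching below are assumptions made for illustration, not the asker's real data or logic.

```shell
# Hypothetical list of request IDs to keep, one per line.
input_file=$(mktemp)
printf 'id1\nid2\n' > "$input_file"

# -v passes the shell value into awk; the program itself stays in plain single quotes.
result=$(printf 'RequestId=id1\nRequestId=id9\n' |
    awk -F= -v inputfile="$input_file" '
BEGIN {
    # Load every line of the list file as a key of wanted[].
    while ((getline line < inputfile) > 0)
        wanted[line]
    close(inputfile)
}
# Print only the request IDs that appear in the list file.
$1 == "RequestId" && ($2 in wanted) { print $2 }')

echo "$result"
rm -f "$input_file"
```

Because the file name arrives as an ordinary awk variable, the script contains only double-quoted awk strings and no nested shell quoting.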

Awk Iterate through several Arrays in a for loop

I have created an awk program to go through the columns of a file and count each distinct word and then output totals into separate files
awk -F"$delim" {Field_Arr1[$1]++; Field_Arr2[$2]++; Field_Arr3[$3]++; Field_Arr4[$4]++};
END{\
# output fields
out_field1="top_field1"
out_field2="top_field2"
out_field3="top_field3"
out_field4="top_field4"
for( i=1; i <= NF; i++)
{
for (element in Field_Arr$i)
{
print element"\t"Field_Arr$i[element] >>out_field$i;
}
}
}' inputfile
but I don't know the appropriate syntax, so that the for loop will iterate through Field_Arr1, Field_Arr2, Field_Arr3, Field_Arr4?
I have tried using: i, $i, ${i}, {i}, "$i", and "i".
Am I trying the wrong approach or is there a way to change Field_Arr$i to Field_Arr1..4?
Thanks for the advice.
awk variables don't work that way; you'll have to do them individually by name, or use fake multidimensional arrays and parse out the components, something along the lines of:
{Field_Arr[1, $1]++; Field_Arr[2, $2]++; Field_Arr[3, $3]++; Field_Arr[4, $4]++}
END {
for (elt in Field_Arr) {
split(elt, ec, SUBSEP)
print ec[2] "\t" Field_Arr[elt] >> ("top_field" ec[1])
}
}
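
For a quick look at how the composite keys behave, here is a minimal sketch that prints to standard output instead of the top_field files. The two-line input is made up, and the result is piped through sort because the order of for (key in array) is unspecified.

```shell
out=$(printf 'x y\nx z\n' | awk '
# Count each value per column, keyed by (column number, value).
{ for (i = 1; i <= NF; i++) count[i, $i]++ }
END {
    for (key in count) {
        split(key, part, SUBSEP)      # part[1] = column number, part[2] = value
        print part[1], part[2], count[key]
    }
}' | sort)
echo "$out"
```

Each output line carries the column number, the value, and its count, so the same loop can route lines to per-column files as in the answer above.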
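To count the frequencies for each column (3 in my example), try this:
# Print list of word frequencies
# (the extra parameter i is only there to make the loop variable local)
function p_array(t, a,    i) {
print t
for (i in a) {
print i, a[i]
}
}
{
# one array per column, each keyed by that column's value
c1[$1]++
c2[$2]++
c3[$3]++
}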
END {
p_array("1st col",c1)
p_array("2nd col",c2)
p_array("3rd col",c3)
}