How do I pass a variable into AWK FNR? - awk

#!/bin/bash
export num=50
echo $num
awk -v awk_num=$num 'FNR==2, FNR==$awknum {print $1;}' big_report > short_report
I have a big_report file. The desired output is to print column 1 of rows 2 to 50 of big_report into short_report. However, when I run the above, the result in short_report includes column 1 of every line instead of only the specified rows 2-50.
I would really appreciate it if anyone could help! Thanks!!!

Like this:
awk -v awk_num=$num 'FNR==2, FNR==awk_num {print $1}' big_report > short_report
Inside the single-quoted awk program the shell never expands $awknum; awk sees an uninitialized variable named awknum (which also doesn't match the awk_num you passed with -v), so $awknum becomes $0 (the whole line) and the end condition FNR==$0 effectively never matches, which is why the output does not stop at row 50. Refer to an awk variable without a leading $, and use the same name you gave to -v.
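A quick way to convince yourself, using seq to fake a numbered report (just a sketch; the numbers and names here are placeholders, not your real data):
num=5
seq 10 | awk -v awk_num="$num" 'FNR==2, FNR==awk_num {print $1}'
This prints 2 through 5, i.e. rows 2 up to $num, which is exactly what the range is meant to do.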

Related

Replace value in particular columns in csv file

I would like to replace the values which are greater than 20 in columns 5 and 7 with AAA.
input file
9179,22.4,-0.1,22.4,2.6,0.1,2.6,39179
9179,98.1,-1.7,98.11,1.9,1.7,2.55,39179
9179,-48.8,0.5,48.8,-1.2,-0.5,1.3,39179
6121,25,0,25,50,0,50,36121
6123,50,0,50,50,0,50,36123
6125,75,0,75,50,0,50,36125
output desired
9179,22.4,-0.1,22.4,2.6,0.1,2.6,39179
9179,98.1,-1.7,98.11,1.9,1.7,2.55,39179
9179,-48.8,0.5,48.8,-1.2,-0.5,1.3,39179
6121,25,0,25,AAA,0,AAA,36121
6123,50,0,50,AAA,0,AAA,36123
6125,75,0,75,AAA,0,AAA,36125
I tried
With this command I can replace the values in column 5; how do I do it for column 7 too?
awk -F ',' -v OFS=',' '$1 { if ($5>20) $5="AAA"; print}' file
Thanks in advance
Here is another take that makes the set of columns configurable:
$ awk -v cols="5,7" 'BEGIN {FS=OFS=","; split(cols,a)}
{for(i in a) if($a[i]>20) $a[i]="AAA"}1' file
9179,22.4,-0.1,22.4,2.6,0.1,2.6,39179
9179,98.1,-1.7,98.11,1.9,1.7,2.55,39179
9179,-48.8,0.5,48.8,-1.2,-0.5,1.3,39179
6121,25,0,25,AAA,0,AAA,36121
6123,50,0,50,AAA,0,AAA,36123
6125,75,0,75,AAA,0,AAA,36125
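Because the column list is just a -v string, the same script can be pointed at other columns without editing it; for example (a sketch reusing the script above, not a new requirement from the question):
awk -v cols="2,4" 'BEGIN {FS=OFS=","; split(cols,a)}
{for(i in a) if($a[i]>20) $a[i]="AAA"}1' file
split(cols,a) turns the string into an array, and $a[i] is awk's dynamic field access, so any comma-separated list of column numbers works.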
awk 'BEGIN{FS=OFS=","} $5>20{$5="AAA"} $7>20{$7="AAA"}1' file
9179,22.4,-0.1,22.4,2.6,0.1,2.6,39179
9179,98.1,-1.7,98.11,1.9,1.7,2.55,39179
9179,-48.8,0.5,48.8,-1.2,-0.5,1.3,39179
6121,25,0,25,AAA,0,AAA,36121
6123,50,0,50,AAA,0,AAA,36123
6125,75,0,75,AAA,0,AAA,36125
You can use multiple pattern{action} pairs, one per check and action.
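If the threshold of 20 should not be hard-coded either, the same two-rule layout takes a -v variable (a sketch; thr is just an illustrative name):
awk -v thr=20 'BEGIN{FS=OFS=","} $5>thr{$5="AAA"} $7>thr{$7="AAA"} 1' file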

awk: print each column of a file into separate files

I have a file with 100 columns of data. I want to print the first column and the i-th column into 99 separate files. I am trying to use
for i in {2..99}; do awk '{print $1" " $i }' input.txt > data${i}; done
But I am getting this error:
awk: illegal field $(), name "i"
input record number 1, file input.txt
source line number 1
How do I correctly use $i inside {print}?
The following single awk may help you here too:
awk -v start=2 -v end=99 '{for(i=start;i<=end;i++){print $1,$i >> ("file" i); close("file" i)}}' Input_file
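A quick check on a tiny input (a sketch; printf fakes a two-record Input_file, and only columns 2 and 3 exist here):
$ printf '11 12 13\n21 22 23\n' > Input_file
$ awk -v start=2 -v end=3 '{for(i=start;i<=end;i++){print $1,$i >> ("file" i); close("file" i)}}' Input_file
$ cat file2
11 12
21 22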
An all-awk solution. First, some test data:
$ cat foo
11 12 13
21 22 23
Then the awk:
$ awk '{for(i=2;i<=NF;i++) print $1,$i > ("data" i)}' foo
and results:
$ ls data*
data2 data3
$ cat data2
11 12
21 22
The for loop iterates from 2 to the last field. If there are more fields than you want to process, change NF to the number you'd like. If, for some reason, a hundred open files would be a problem on your system, put the print into a block and add a close() call:
$ awk '{for(i=2;i<=NF;i++){f=("data" i); print $1,$i >> f; close(f)}}' foo
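And if the real file has 100 columns but you only want files for columns 2-99 as in your loop, cap the loop instead of relying on NF (a sketch against the question's input.txt):
$ awk '{for(i=2;i<=99 && i<=NF;i++){f=("data" i); print $1,$i >> f; close(f)}}' input.txt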
If you want to fix what you were trying to do:
for i in {2..99}; do
awk -v x=$i '{print $1" " $x }' input.txt > data${i}
done
Note:
the -v switch of awk passes shell variables into awk
$x is the n-th column, where n is the value you stored in the awk variable x (see the quick check below)
Note 2: this is not the fastest solution (a single awk call is fastest), but it corrects your logic. Ideally, take the time to understand awk; it's never time wasted.
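To see what $x resolves to, a one-record check is enough (a throwaway sketch):
$ echo 'a b c' | awk -v x=3 '{print $x}'
c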

Awk splitting a string and comparison

I have a string like AS|REQ|XYZ|value=12 which I am splitting with:
awk -F\| '{print $4}' | awk -F"=" '{print $2}'
This gives the value 12.
But for the string DF|REG|EXP|value=, it comes back blank.
What I need is: if the value after value= in the fourth column is blank, throw an error. Can this be done in an awk command?
Thanks
@JamesBrown has the right answer to your question as asked, but given the input you posted, all you need to produce the output you want is:
awk -F'=' '{print ($NF=="" ? "Error" : $NF)}' file
If that's NOT all you need then edit your question to show some more truly representative sample input and expected output.
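Run against the foo test file shown in the next answer, that one-liner gives (a sketch of the expected output):
$ awk -F'=' '{print ($NF=="" ? "Error" : $NF)}' foo
12
Error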
You could be more specific about what you mean by throwing an error. If you want the program to exit with a non-zero exit code, use if and exit with a value:
$ awk 'BEGIN{exit}'
$ echo $?
0
$ awk 'BEGIN{exit 1}'
$ echo $?
1
$ awk -F\| '{split($4,a,"="); if(a[2]=="") exit 1; else print a[2]}' foo
12
$ echo $?
1
or just print an error message and continue execution:
$ awk -F\| '{split($4,a,"="); print (a[2]==""?"ERROR":a[2])}' foo
12
ERROR
Test data used above:
$ cat foo
AS|REQ|XYZ|value=12
DF|REG|EXP|value=
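If you want both behaviours at once, i.e. report every bad line but keep reading and still finish with a non-zero status, a flag checked in END works (a sketch; err is just an illustrative variable name):
$ awk -F\| '{split($4,a,"="); if(a[2]=="") {print "ERROR at line " NR; err=1} else print a[2]} END{exit err}' foo
12
ERROR at line 2
$ echo $?
1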
Something like this perhaps?
awk -F\| '{print $4}' | awk -F"=" '{if ($2 == "") print "ERROR: Empty Value"; else print $2}'
Hope this command works for you. The command below behaves as expected: if there is any value in the value field, it just prints that value; if it is blank, it prints "error". The strings were placed in test.txt.
awk -F\| '{if($4!="value=") {gsub("value=","",$4);print $4} else print "error" }' test.txt
Something like this -
cat f
AS|REQ|XYZ|value=12
AS|REQ|XYZ|value=
awk -F'[|=]' '{if($4 == "value" && $5 == "") {print ("Error Found at Line: ",NR)} else {print $0}}' f
AS|REQ|XYZ|value=12
Error Found at Line: 2
It searches for value in the 4th column and a blank in the 5th column.

How to sum first 100 rows of a specific column using Awk?

How to sum first 100 rows of a specific column using Awk? I wrote
awk 'BEGIN{FS="|"} NR<=100 {x+=$5}END {print x}' temp.txt
But this is taking a lot of time to process; is there any other way that gives the result more quickly?
Just exit after the required first 100 records:
awk -v iwant=100 '{x+=$5} NR==iwant{exit} END{print x+0}' test.in
Take it out for a spin:
$ for i in {1..1000}; do echo 1 >> test.in ; done # a thousand records
$ awk -v iwant=100 '{x+=$1} NR==iwant{exit} END{print x+0}' test.in
100
You can always trim the input and use the same script:
head -100 file | awk ... your script here ...
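For the pipe-delimited temp.txt from the question, the same early-exit idea just keeps the field separator (a sketch):
awk -F'|' -v iwant=100 '{x+=$5} NR==iwant{exit} END{print x+0}' temp.txt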

awk and log2 divisions

I have a tab delimited file that looks something like this:
foo 0 4
boo 3 2
blah 4 0
flah 1 1
I am trying to calculate the log2 ratio between the two columns for each row. My problem is with division by zero.
What I have tried is this:
cat file.txt | awk -v OFS='\t' '{print $1, log($3/$2)/log(2)}'
When there is a zero as the denominator, awk crashes. What I want is some sort of conditional statement that prints "inf" as the result when the denominator is equal to 0.
I am really not sure how to go about this.
Any help would be appreciated
Thanks
You can implement that as follows (with a few additional tweaks):
awk 'BEGIN{OFS="\t"} {if ($2==0) {print $1, "inf"} else {print $1, log($3/$2)/log(2)}}' file.txt
Explanation:
if ($2==0) {print $1, "inf"} else {...} - First check to see if the 2nd field ($2) is zero. If so, print $1 and inf and move on to the next line; otherwise proceed as usual.
BEGIN{OFS="\t"} - Set OFS inside the awk script; mostly a preference thing.
... file.txt - awk can read from files when you specify it as an argument; this saves the use of a cat process. (See UUCA)
awk -F'\t' '{print $1,($2 ? log($3/$2)/log(2) : "inf")}' file.txt
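Note that the sample data also has a zero in the numerator (blah 4 0); log(0) typically yields -inf (possibly with a warning) rather than the hard error a zero denominator causes, but if you want to guard both sides explicitly, the ternary extends naturally (a sketch that treats a zero numerator as "-inf"):
awk -F'\t' '{print $1, ($2==0 ? "inf" : ($3==0 ? "-inf" : log($3/$2)/log(2)))}' file.txt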