Confusion about conditional expression in AWK - awk

Here is my input file:
$ cat abc
0 1
2 3
4 5
Why does the following give a one-column output instead of a two-column one?
$ cat abc | awk '{ print $1==0?"000":"111" $1==0? "222":"333" }'
000
333
333
Shouldn't the output be the following?
000 222
111 333
111 333

I think awk is going to parse this as:
awk '{ print ($1==0) ? "000" : (("111" $1==0) ? "222" : "333") }'
That is, when $1 is 0 it prints "000" and never evaluates the rest of the expression. Otherwise it evaluates the second conditional, where concatenation binds more tightly than ==, so the condition is ("111" $1)==0. A string such as "1112" never equals 0, so it prints "333".
You probably want to use:
awk '{ print ($1==0?"000":"111"), ($1==0? "222":"333") }'
where the comma puts a space (OFS or output field separator, to be precise) in the output between the two strings. Or you might prefer:
awk '{ print ($1==0?"000":"111") ($1==0? "222":"333") }'
which concatenates the two strings with no space.
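As a quick check of the parsing described above, here is a minimal sketch that runs the unparenthesized and the parenthesized forms side by side. The variable names (input, unparenthesized, parenthesized) are only illustrative, and the sample data is recreated inline rather than read from the file abc.

```shell
# Sample input from the question, recreated inline.
input='0 1
2 3
4 5'

# Unparenthesized: awk groups this as
#   ($1==0) ? "000" : ((("111" $1)==0) ? "222" : "333")
unparenthesized=$(printf '%s\n' "$input" | awk '{ print $1==0?"000":"111" $1==0?"222":"333" }')

# Parenthesized: two independent conditionals; the comma inserts OFS between them.
parenthesized=$(printf '%s\n' "$input" | awk '{ print ($1==0?"000":"111"), ($1==0?"222":"333") }')

printf '%s\n' "$unparenthesized"
printf '%s\n' "$parenthesized"
```

The first printf should reproduce the one-column output from the question, and the second the two-column output the asker expected.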

Related

awk: Perform arithmetic on subset of columns and print all columns with modified values

I have columns containing some values e.g.:
A 10 20 30 AA AAA AAAA
B 40 50 60 BB BBB BBBB
C 70 80 90 CC CCC CCCC
I want to perform an arithmetic operation like multiplication on cols 2,3,4 and return a new table.
A 100 200 300 AA AAA AAAA
B 400 500 600 BB BBB BBBB
C 700 800 900 CC CCC CCCC
I can operate specifically on cols 2,3,4 using
awk '{print $2*10"\s"$3*10"\s"$4*10}' inp > out
but I don't know how to print the entire table with the modified column values. Is there a way to do this in awk?
Adding a generic solution here, written and tested with the shown samples in GNU awk. List all the field numbers, comma-separated, in the awk variable fields, and set multiplyBy to the number you want to multiply those fields by.
awk -v multiplyBy="10" -v fields="2,3,4" '
BEGIN{
  num=split(fields,arr,",")
  for(i=1;i<=num;i++){
    look[arr[i]]        # referencing the element creates the index; only membership matters
  }
}
{
  for(i=1;i<=NF;i++){
    if(i in look){
      $i=($i * multiplyBy)
    }
  }
}
1' Input_file
NOTE: I just saw the user's comments on the other answer. If someone wants to skip the first 5 lines, change the { before the for loop to FNR>5{ and that should do the trick.
In your example you calculate and print in a single step. With awk you can first modify the fields and then print the whole line, or part of it, like this:
awk '{$2=10*$2; $3=10*$3; $4=10*$4} {print}' file
{print} with no arguments means {print $0}, print the whole line. Also it can be replaced by any true condition, like 1, for example awk '1' file means print every line.
So your command can be also:
awk '{$2=10*$2; $3=10*$3; $4=10*$4} 1' file
Additionally, any action body ({}) can be preceded by a condition. For example, to skip the first 5 lines, the condition is NR>5, where NR is the record (usually meaning row) number. So here the first 5 lines are excluded from the calculation but still printed along with all the other lines:
awk 'NR>5 {$2=10*$2; $3=10*$3; $4=10*$4} {print}' file
Here we ignore the first 5 lines entirely and do not print them either:
awk 'NR>5 {$2=10*$2; $3=10*$3; $4=10*$4; print}' file
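To make the difference between the last two commands concrete, here is a small sketch on the three-line sample table. It uses NR>1 instead of NR>5 purely so that the effect is visible on such a short input, and the variable names (table, keep_all, drop_first) are illustrative only.

```shell
# Sample table from the question, recreated inline.
table='A 10 20 30 AA AAA AAAA
B 40 50 60 BB BBB BBBB
C 70 80 90 CC CCC CCCC'

# Condition on the modifying block only: record 1 is left untouched but still printed.
keep_all=$(printf '%s\n' "$table" | awk 'NR>1 {$2=10*$2; $3=10*$3; $4=10*$4} {print}')

# print inside the conditional block: record 1 is neither modified nor printed.
drop_first=$(printf '%s\n' "$table" | awk 'NR>1 {$2=10*$2; $3=10*$3; $4=10*$4; print}')

printf '%s\n' "$keep_all"
printf '%s\n' "$drop_first"
```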

How to add zeros to print according to number of digit in awk?

I have input:
12
2
3
56
and I would like to print a word followed by the number from the first column, padded to 5 digits. How can I add zeros before the number?
awk '{print "file"$1".ASC}' input > output
The desired output:
file00012
file00002
file00003
file00056
$ awk '{printf "file%05d\n", $1}' infile.txt
file00012
file00002
file00003
file00056
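The %05d conversion pads the number with leading zeros to a width of five. The question's own attempt also appended .ASC, although the desired output omits it. As a small sketch, assuming that suffix is wanted after all, it simply goes after the conversion in the format string (the variable name padded is illustrative):

```shell
# Sample numbers from the question, recreated inline.
padded=$(printf '%s\n' 12 2 3 56 | awk '{ printf "file%05d.ASC\n", $1 }')
printf '%s\n' "$padded"
```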

Print every second consecutive field in two columns - awk

Assume the following file
#zvview.exe
#begin Present/3
77191.0000 189.320100 0 0 3 0111110 16 1
-8.072430+6-8.072430+6 77190 0 1 37111110 16 2
37 2 111110 16 3
8.115068+6 0.000000+0 8.500000+6 6.390560-2 9.000000+6 6.803440-1111110 16 4
9.500000+6 1.685009+0 1.000000+7 2.582780+0 1.050000+7 3.260540+0111110 16 5
37 2 111110 16 18
What I would like to do is print, in two columns, the fields after line 6. Selecting the lines can be done using NR. The tricky part is the following: every second field should go in one column, and an E should be added before the sign, so that the output file will look like this
8.115068E+6 0.000000E+0
8.500000E+6 6.390560E-2
9.000000E+6 6.803440E-1
9.500000E+6 1.685009E+0
1.000000E+7 2.582780E+0
1.050000E+7 3.260540E+0
From the output file you see that I want to keep in $6 only length($6)=10 characters.
How is it possible to do it in awk?
You can do it all in awk, but it is perhaps easier with the Unix toolset:
$ sed -n '6,7p' file | cut -c2-66 | tr ' ' '\n' | pr -2ats' '
8.115068+6 0.000000+0
8.500000+6 6.390560-2
9.000000+6 6.803440-1
9.500000+6 1.685009+0
1.000000+7 2.582780+0
1.050000+7 3.260540+0
Here is an awk-only solution for comparison:
$ awk 'NR>=6 && NR<=7{$6=substr($6,1,10);
for(i=1;i<=6;i+=2) {f[++c]=$i;s[c]=$(i+1)}}
END{for(i=1;i<=c;i++) print f[i],s[i]}' file
8.115068+6 0.000000+0
8.500000+6 6.390560-2
9.000000+6 6.803440-1
9.500000+6 1.685009+0
1.000000+7 2.582780+0
1.050000+7 3.260540+0
Perhaps a shorter version:
$ awk 'NR>=6 && NR<=7{$6=substr($6,1,10);
for(i=1;i<=6;i+=2) print $i FS $(i+1)}' file
8.115068+6 0.000000+0
8.500000+6 6.390560-2
9.000000+6 6.803440-1
9.500000+6 1.685009+0
1.000000+7 2.582780+0
1.050000+7 3.260540+0
To convert the format to standard scientific notation, you can pipe the result to sed, or embed something similar in the awk script (using gsub).
... | sed 's/[+-]/E&/g'
8.115068E+6 0.000000E+0
8.500000E+6 6.390560E-2
9.000000E+6 6.803440E-1
9.500000E+6 1.685009E+0
1.000000E+7 2.582780E+0
1.050000E+7 3.260540E+0
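For the gsub variant mentioned above, a minimal sketch might look like this. It assumes the input is the two-column output shown earlier (two of its lines are recreated inline), and the variable names cols and sci are illustrative. Note the caveat in the comment: a mantissa with a leading minus sign would need a more careful pattern.

```shell
# Two lines of the two-column output above, recreated inline.
cols='8.115068+6 0.000000+0
8.500000+6 6.390560-2'

# gsub() with no target works on $0; & in the replacement stands for the matched sign.
# Caveat: a leading minus sign on a mantissa would also get an E in front of it.
sci=$(printf '%s\n' "$cols" | awk '{ gsub(/[+-]/, "E&"); print }')
printf '%s\n' "$sci"
```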
With GNU awk for FIELDWIDTHS:
$ cat tst.awk
BEGIN { FIELDWIDTHS="9 2 9 2 9 2 9 2 9 2 9 2" }
NR>5 && NR<8 {
for (i=1;i<NF;i+=4) {
print $i "E" $(i+1), $(i+2) "E" $(i+3)
}
}
$ awk -f tst.awk file
8.115068E+6 0.000000E+0
8.500000E+6 6.390560E-2
9.000000E+6 6.803440E-1
9.500000E+6 1.685009E+0
1.000000E+7 2.582780E+0
1.050000E+7 3.260540E+0
If you really want to get rid of the leading blanks then there are various ways to do it (the simplest being gsub(/ /,"",$<field number>) on the relevant fields), but I left them in because the above allows your output to line up properly if/when your numbers start with a -, like they do on line 4 of your sample input.
If you don't have GNU awk, get it as you're missing a LOT of extremely useful functionality.
I tried to combine @karafka's answer with substr, and the following does the trick:
awk 'NR>=6 && NR<=7{$6=substr($6,1,10);for(i=1;i<=6;i+=2) print substr($i,1,8) "E" substr($i,9) FS substr($(i+1),1,8) "E" substr($(i+1),9)}' file
and the output is
8.115068E+6 0.000000E+0
8.500000E+6 6.390560E-2
9.000000E+6 6.803440E-1
9.500000E+6 1.685009E+0
1.000000E+7 2.582780E+0
1.050000E+7 3.260540E+0

Confusion about awk command when dealing with if statement

$ cat awk.txt
12 32 45
5 2 3
33 11 33
$ cat awk.txt | awk '{FS='\t'} $1==5 {print $0}'
5 2 3
$ cat awk.txt | awk '{FS='\t'} $1==33 {print $0}'
Nothing is returned when judging the first field is 33 or not. It's confusing.
By saying
awk '{FS='\t'} $1==5 {print}' file
You are defining the field separator incorrectly. To make it be a tab, you need to say "\t" (with double quotes). Further reading: awk not capturing first line / separator.
Also, you are setting it every line, so it does not affect the first one. You want to use:
awk 'BEGIN{FS="\t"} $1==5' file
Yes, but why did it work in one case but not in the other?
awk '{FS='\t'} $1==5' file # it works
awk '{FS='\t'} $1==33' file # it does not work
You're using single quotes around '\t', which means that you're actually concatenating 3 strings together: '{FS=', \t and '} $1==5' to produce your awk command. The shell interprets the \t as t, so your awk script is actually:
awk '{FS=t} $1==5'
The variable t is unset, so you're setting the field separator to the empty string "". In GNU awk and most other current awks, this means that each record is split into one field per character. You can see this by running awk 'BEGIN{FS='\t'} {print NF}' file, which shows how many fields each record has.
Then, $1 is just 3 and $2 contains the second 3.
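One way to confirm the quoting explanation is to let the shell print the program text that awk actually receives. The sketch below does that for the broken and the corrected quoting, then runs the corrected program on tab-separated data recreated inline; the variable names (broken, fixed, match) are illustrative.

```shell
# The shell joins '{FS=' + \t + '} $1==5'; the unquoted \t collapses to a plain t.
broken=$(printf '%s\n' '{FS='\t'} $1==5')

# With double quotes inside the single-quoted program, awk itself sees "\t".
fixed=$(printf '%s\n' 'BEGIN{FS="\t"} $1==5')

printf '%s\n' "$broken"   # {FS=t} $1==5
printf '%s\n' "$fixed"    # BEGIN{FS="\t"} $1==5

# The corrected form, applied to tab-separated sample data.
match=$(printf '5\t2\t3\n33\t11\t33\n' | awk 'BEGIN{FS="\t"} $1==33')
printf '%s\n' "$match"
```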
First of all, could you explain more clearly what you want to do before asking? Look:
more awk.txt
12 32 45
5 2 3
33 11 33
awk -F"[ \t]" '$1 == 5 { print $0}' awk.txt
5 2 3
awk -F"[ \t]" '$1 == 33 { print $0}' awk.txt
33 11 33
awk -F"[ \t]" '$1 == 12 { print $0}' awk.txt
12 32 45
http://www.staff.science.uu.nl/~oostr102/docs/nawk/nawk_23.html

awk to handle unformatted input

I would like to know how to handle the situation below: the sample input is delimited by spaces, and I want to format it as comma-separated output.
All the text in a line up until the first field starting with a digit should be considered as a single field in the output. In the sample data, there are always 3 numeric fields at the end of a line; in the real data, there are 14 such fields.
Input.txt
mmm 4394850 4465411 2579770
xxx yyy 2155419 2178791 1516446
aaa bbb (incl. ccc) 14291585 14438704 6106341
U.U.(W) 6789781 6882021 5940226
nnn 7335050 7534302 2963345
Have tried the command below, but I know it is incomplete:
awk 'BEGIN {FS =" "; OFS = ","} {print $1,$2,$3,$4,$5,$6} ' Input.txt
Desired output:
mmm,4394850,4465411,2579770
xxx yyy,2155419,2178791,1516446
aaa bbb (incl. ccc),14291585,14438704,6106341
U.U.(W),6789781,6882021,5940226
nnn,7335050,7534302,2963345
With GNU awk for gensub():
$ awk '{match($0,/[0-9 ]+$/); print substr($0,1,RSTART-1) gensub(/ /,",","g",substr($0,RSTART,RLENGTH))}' file
mmm,4394850,4465411,2579770
xxx yyy,2155419,2178791,1516446
aaa bbb (incl. ccc),14291585,14438704,6106341
U.U.(W),6789781,6882021,5940226
nnn,7335050,7534302,2963345
With other awks, save the second substr() output in a variable and use gsub():
awk '{match($0,/[0-9 ]+$/); digs=substr($0,RSTART,RLENGTH); gsub(/ /,",",digs); print substr($0,1,RSTART-1) digs}' file
Assuming that it's the last 3 columns that are numerical (as in your example):
awk '{for(i=1;i<=NF;++i)printf "%s%s",$i,(i<NF-3?OFS:(i<NF?",":ORS))}' file
Basically print each field followed by a space, comma or newline depending on the field number.
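Since the real data has 14 trailing numeric fields rather than 3, the same loop can take that count as a parameter. This is only a sketch: the variable n is a hypothetical name introduced here, not part of the original answer, and the sample lines are recreated inline.

```shell
# Three of the sample lines from the question, recreated inline.
sample='mmm 4394850 4465411 2579770
xxx yyy 2155419 2178791 1516446
aaa bbb (incl. ccc) 14291585 14438704 6106341'

# n (hypothetical) = number of trailing numeric fields; use n=14 for the real data.
# Fields before the last n+1 keep a space; the rest are followed by a comma,
# and the last field by ORS.
csv=$(printf '%s\n' "$sample" |
  awk -v n=3 '{ for (i=1; i<=NF; ++i) printf "%s%s", $i, (i<NF-n ? OFS : (i<NF ? "," : ORS)) }')
printf '%s\n' "$csv"
```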
Another awk (GNU awk, since it uses gensub()):
awk '$0=gensub(/ ([0-9]+)/,",\\1","g")' file
mmm,4394850,4465411,2579770
xxx yyy,2155419,2178791,1516446
aaa bbb (incl. ccc),14291585,14438704,6106341
U.U.(W),6789781,6882021,5940226
nnn,7335050,7534302,2963345