Avoiding automatic round-off in awk
I have a CSV file and I want to add a column that takes values from other columns and performs some calculations. As a simplified version I'm trying this:
awk -F"," '{print $0, $1+1}' myFile.csv |head -1
The output is:
29.325172701023977,...other columns..., 30
The added column should be 30.325172701023977, but the output is rounded off.
I tried some options using printf, CONVFMT and OFMT but nothing worked.
How can I avoid the round off?
Assumptions:
the number of decimal places is not known beforehand
the number of decimal places can vary from line to line
Setup:
$ cat myfile.csv
29.325172701023977,...other columns...
15.12345,...other columns...
120.666777888,...other columns...
46,...other columns...
One awk idea: use the number of decimal places to dynamically generate the printf "%.*f" format:
awk '
BEGIN { FS=OFS="," }
{ split($1,arr,".") # split $1 on period
numdigits=length(arr[2]) # count number of decimal places
newNF=sprintf("%.*f",numdigits,$1+1) # calculate $1+1 and format with "numdigits" decimal places
print $0,newNF # print new line
}
' myfile.csv
NOTE: this assumes OP's locale uses a period to separate integer from fraction. For a locale that uses a comma as the decimal separator, it gets more complicated: it is impossible to distinguish a decimal comma from the field-delimiter comma without some changes to the file's format.
This generates:
29.325172701023977,...other columns...,30.325172701023977
15.12345,...other columns...,16.12345
120.666777888,...other columns...,121.666777888
46,...other columns...,47
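Since IEEE doubles carry at most about 17 significant digits, another option is simply raising OFMT to full precision; a sketch (trailing digits beyond the double's precision may differ slightly from the input text, so the dynamic-width approach above is safer when exact reproduction matters):

```shell
# OFMT controls how "print" renders numbers; the default "%.6g" is what
# rounds the value to six significant digits. "%.17g" keeps full double
# precision. Assumption: values fit in an IEEE double.
echo '29.325172701023977,...other columns...' |
awk 'BEGIN{FS=OFS=","; OFMT="%.17g"} {print $0, $1+1}'
```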
As long as you aren't dealing with numbers greater than 9E15 (the limit of exact integer representation in a double), there's no need to fudge any of CONVFMT, OFMT, or sprintf()/printf() at all:
{m,g}awk '$++NF = int((_=$!__) + sub("^[^.]*",__,_))_' FS=',' OFS=','
29.325172701023977,...other columns...,30.325172701023977
15.12345,...other columns...,16.12345
120.666777888,...other columns...,121.666777888
46,...other columns...,47
If mawk-1 is sending your numbers to scientific notation, do:
mawk '$++NF=int((_=$!!NF)+sub("^[^.]*",__,_))_' FS=',' OFS=',' CONVFMT='%.f'
When you scroll right you'll notice all input digits beyond the decimal point are fully preserved:
2929292929.32323232325151515151727272727270707070701010101010232323232397979797977,...other columns...,2929292930.32323232325151515151727272727270707070701010101010232323232397979797977
1515151515.121212121234343434345,...other columns...,1515151516.121212121234343434345
12121212120.66666666666767676767777777777788888888888,...other columns...,12121212121.66666666666767676767777777777788888888888
4646464646,...other columns...,4646464647
2929.32325151727270701010232397977,...other columns...,2930.32325151727270701010232397977
1515.121234345,...other columns...,1516.121234345
12120.66666767777788888,...other columns...,12121.66666767777788888
4646,...other columns...,4647
Change it to CONVFMT='%\47.f' (\47 is the octal escape for a single quote, giving the "%'d"-style digit-grouping flag), and you can even get mawk-1 to nicely comma-format them for you:
29292929292929.323232323232325151515151515172727272727272707070707070701010101010101023232323232323979797979797977,...other columns...,29,292,929,292,930.323232323232325151515151515172727272727272707070707070701010101010101023232323232323979797979797977
15151515151515.12121212121212343434343434345,...other columns...,15,151,515,151,516.12121212121212343434343434345
121212121212120.666666666666666767676767676777777777777777888888888888888,...other columns...,121,212,121,212,121.666666666666666767676767676777777777777777888888888888888
46464646464646,...other columns...,46,464,646,464,647
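For readers who find the one-liner above hard to parse, the same string-arithmetic idea can be sketched more plainly: add 1 to the integer part, then re-attach the fractional digits verbatim as text (valid here because "+1" can never change the digits after the decimal point):

```shell
# Split $1 on the period; p[1] is the integer part, p[2] the fraction.
# Append integer+1, then the untouched fraction (if any), as a new field.
printf '%s\n' '29.325172701023977,...other columns...' '46,...other columns...' |
awk 'BEGIN { FS=OFS="," }
{
    n = split($1, p, ".")
    $(NF+1) = (p[1] + 1) (n == 2 ? "." p[2] : "")
    print
}'
# → 29.325172701023977,...other columns...,30.325172701023977
#   46,...other columns...,47
```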
Related
AWK script - not showing data
I'm trying to create a variable to sum columns 26 to 30 and 32. So far I have this code, which prints the header and the output format like I want, but no data is being shown:

#! /usr/bin/awk -f
BEGIN { FS="," }
NR>1 {
    TotalPositiveStats = ($26+$27+$28+$29+$30+$32)
}
{
    printf "%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%.2f %,%s,%s,%.2f %,%s,%s,%.2f %,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s, %s\n", EndYear,Rk,G,Date,Years,Days,Age,Tm,Home,Opp,Win,Diff,GS,MP,FG,FGA,FG_PCT,3P,3PA,3P_PCT,FT,FTA,FT_PCT,ORB,DRB,TRB,AST,STL,BLK,TOV,PF,PTS,GmSc,TotalPositiveStats
}
NR==1 {
    print "EndYear,Rk,G,Date,Years,Days,Age,Tm,HOme,Opp,Win,Diff,GS,MP,FG,FGA,FG_PCT,3P,3PA,3P_PCT,FT,FTA,FT_PCT,ORB,DRB,TRB,AST,STL,BLK,TOV,PF,PTS,GmSc,TotalPositiveStats"
}  # header

Input data:

EndYear,Rk,G,Date,Years,Days,Age,Tm,Home,Opp,Win,Diff,GS,MP,FG,FGA,FG_PCT,3P,3PA,3P_PCT,FT,FTA,FT_PCT,ORB,DRB,TRB,AST,STL,BLK,TOV,PF,PTS,GmSc
1985,1,1,10/26/1984,21,252,21.6899384,CHI,1,WSB,1,16,1,40,5,16,0.313,0,0,,6,7,0.857,1,5,6,7,2,4,5,2,16,12.5
1985,2,2,10/27/1984,21,253,21.69267625,CHI,0,MIL,0,-2,1,34,8,13,0.615,0,0,,5,5,1,3,2,5,5,2,1,3,4,21,19.4
1985,3,3,10/29/1984,21,255,21.69815195,CHI,1,MIL,1,6,1,34,13,24,0.542,0,0,,11,13,0.846,2,2,4,5,6,2,3,4,37,32.9
1985,4,4,10/30/1984,21,256,21.7008898,CHI,0,KCK,1,5,1,36,8,21,0.381,0,0,,9,9,1,2,2,4,5,3,1,6,5,25,14.7
1985,5,5,11/1/1984,21,258,21.7063655,CHI,0,DEN,0,-16,1,33,7,15,0.467,0,0,,3,4,0.75,3,2,5,5,1,1,2,4,17,13.2
1985,6,6,11/7/1984,21,264,21.72279261,CHI,0,DET,1,4,1,27,9,19,0.474,0,0,,7,9,0.778,1,3,4,3,3,1,5,5,25,14.9
1985,7,7,11/8/1984,21,265,21.72553046,CHI,0,NYK,1,15,1,33,15,22,0.682,0,0,,3,4,0.75,4,4,8,5,3,2,5,2,33,29.3

Output expected:

EndYear,Rk,G,Date,Years,Days,Age,Tm,Home,Opp,Win,Diff,GS,MP,FG,FGA,FG_PCT,3P,3PA,3P_PCT,FT,FTA,FT_PCT,ORB,DRB,TRB,AST,STL,BLK,TOV,PF,PTS,GmSc,TotalPositiveStats
1985,1,1,10/26/1984,21,252,21.6899384,CHI,1,WSB,1,16,1,40,5,16,0.313,0,0,,6,7,0.857,1,5,6,7,2,4,5,2,16,12.5,35
1985,2,2,10/27/1984,21,253,21.69267625,CHI,0,MIL,0,-2,1,34,8,13,0.615,0,0,,5,5,1,3,2,5,5,2,1,3,4,21,19.4,34
1985,3,3,10/29/1984,21,255,21.69815195,CHI,1,MIL,1,6,1,34,13,24,0.542,0,0,,11,13,0.846,2,2,4,5,6,2,3,4,37,32.9,54
1985,4,4,10/30/1984,21,256,21.7008898,CHI,0,KCK,1,5,1,36,8,21,0.381,0,0,,9,9,1,2,2,4,5,3,1,6,5,25,14.7,38
1985,5,5,11/1/1984,21,258,21.7063655,CHI,0,DEN,0,-16,1,33,7,15,0.467,0,0,,3,4,0.75,3,2,5,5,1,1,2,4,17,13.2,29
1985,6,6,11/7/1984,21,264,21.72279261,CHI,0,DET,1,4,1,27,9,19,0.474,0,0,,7,9,0.778,1,3,4,3,3,1,5,5,25,14.9,36
1985,7,7,11/8/1984,21,265,21.72553046,CHI,0,NYK,1,15,1,33,15,22,0.682,0,0,,3,4,0.75,4,4,8,5,3,2,5,2,33,29.3,51

This script will be called like gawk -f script.awk <filename>. Currently this is the output when calling it (it seems to be calculating the variable, but the rest of the fields are empty).
awk is well suited to summing columns:

awk 'NR>1{$(NF+1)=$26+$27+$28+$29+$30+$32}1' FS=, OFS=, input-file > tmp
mv tmp input-file

That doesn't add a field in the header line, so you might want something like:

awk '{$(NF+1) = NR>1 ? ($26+$27+$28+$29+$30+$32) : "TotalPositiveStats"}1' FS=, OFS=,
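A quick end-to-end check of that header-aware form, on a hypothetical three-column stand-in for the real 30+ column file (only two fields are summed, to keep it short):

```shell
# NR==1 gets the new header label; later rows get the computed sum.
printf 'a,b,c\n1,2,3\n4,5,6\n' |
awk '{$(NF+1) = NR>1 ? ($1+$3) : "sum"}1' FS=, OFS=,
# → a,b,c,sum
#   1,2,3,4
#   4,5,6,10
```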
An explanation of the issues with the current printf output is covered in the 2nd half of this answer (below).

It appears OP's objective is to reformat three of the current fields while also adding a new field on the end of each line. (NOTE: certain aspects of OP's code are not reflected in the expected output, so I'm not 100% sure what OP is looking to generate; regardless, OP should be able to tweak the provided code to generate the desired result.)

Using sprintf() to reformat the three fields, we can rewrite OP's current code as:

awk '
BEGIN { FS=OFS="," }
NR==1 { print $0, "TotalPositiveStats"; next }
      { TotalPositiveStats = ($26+$27+$28+$29+$30+$32)
        $17 = sprintf("%.3f",$17)                  # FG_PCT
        if ($20 != "") $20 = sprintf("%.3f",$20)   # 3P_PCT
        $23 = sprintf("%.3f",$23)                  # FT_PCT
        print $0, TotalPositiveStats
      }
' raw.dat

NOTE: while OP's printf shows a format of "%.2f %" for the 3 fields of interest ($17, $20, $23), the expected output shows that the fields are not actually being reformatted (eg, $17 remains %.3f, $20 is an empty string, $23 remains %.2f); I've opted to leave $20 blank and otherwise reformat all 3 fields as %.3f; OP can modify the sprintf() calls as needed.

This generates:

EndYear,Rk,G,Date,Years,Days,Age,Tm,Home,Opp,Win,Diff,GS,MP,FG,FGA,FG_PCT,3P,3PA,3P_PCT,FT,FTA,FT_PCT,ORB,DRB,TRB,AST,STL,BLK,TOV,PF,PTS,GmSc,TotalPositiveStats
1985,1,1,10/26/1984,21,252,21.6899384,CHI,1,WSB,1,16,1,40,5,16,0.313,0,0,,6,7,0.857,1,5,6,7,2,4,5,2,16,12.5,40
1985,2,2,10/27/1984,21,253,21.69267625,CHI,0,MIL,0,-2,1,34,8,13,0.615,0,0,,5,5,1.000,3,2,5,5,2,1,3,4,21,19.4,37
1985,3,3,10/29/1984,21,255,21.69815195,CHI,1,MIL,1,6,1,34,13,24,0.542,0,0,,11,13,0.846,2,2,4,5,6,2,3,4,37,32.9,57
1985,4,4,10/30/1984,21,256,21.7008898,CHI,0,KCK,1,5,1,36,8,21,0.381,0,0,,9,9,1.000,2,2,4,5,3,1,6,5,25,14.7,44
1985,5,5,11/1/1984,21,258,21.7063655,CHI,0,DEN,0,-16,1,33,7,15,0.467,0,0,,3,4,0.750,3,2,5,5,1,1,2,4,17,13.2,31
1985,6,6,11/7/1984,21,264,21.72279261,CHI,0,DET,1,4,1,27,9,19,0.474,0,0,,7,9,0.778,1,3,4,3,3,1,5,5,25,14.9,41
1985,7,7,11/8/1984,21,265,21.72553046,CHI,0,NYK,1,15,1,33,15,22,0.682,0,0,,3,4,0.750,4,4,8,5,3,2,5,2,33,29.3,56

NOTE: in OP's expected output it appears the last/new field (TotalPositiveStats) does not include the value from $30, hence the mismatch between the expected results and this answer; again, OP can modify the assignment statement for TotalPositiveStats to include/exclude fields as needed.

Regarding the issues with the current printf ...

printf "%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%.2f %,%s,%s,%.2f %,%s,%s,%.2f %,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s, %s\n", EndYear,Rk,G,Date,Years,Days,Age,Tm,Home,Opp,Win,Diff,GS,MP,FG,FGA,FG_PCT,3P,3PA,3P_PCT,FT,FTA,FT_PCT,ORB,DRB,TRB,AST,STL,BLK,TOV,PF,PTS,GmSc,TotalPositiveStats

... is referencing (awk) variables that have never been assigned (eg, EndYear, Rk, G). [NOTE: the one exception is the very last variable in the list, TotalPositiveStats, which has in fact been defined earlier in the script.]

The default value for an unassigned variable is the empty string ("") or zero (0), depending on how the awk code references the variable, eg:

printf "%s", EndYear     => EndYear is treated as a string and the printed result is an empty string; with an output field delimiter of a comma (,) this empty string shows up as 2 commas next to each other (,,)
printf "%.2f %", FG_PCT  => FG_PCT is treated as a numeric (because of the %f format) and the printed result is 0.00 %

Where it gets a little interesting is when the variable name starts with a digit (eg, 3P): awk parses this as the number 3 concatenated with the (empty, unassigned) variable P, so the whole reference evaluates to "3", eg:

printf "%s", 3P          => the printed result is 3

This should explain the 5 static values (0.00 %, 3, 3, 3.00 % and 0.00 %) printed in all output lines, as well as the 'missing' values between the rest of the commas (eg, ,,,,). Obviously the last value in the line is an actual number, ie, the value of the awk variable TotalPositiveStats.
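The unassigned-variable behaviour described above is easy to reproduce in isolation (note %% for a literal percent sign; the question's bare trailing % draws a warning in gawk):

```shell
# EndYear and FG_PCT are never assigned; 3P parses as the number 3
# concatenated with the empty variable P, i.e. the string "3".
awk 'BEGIN { printf "[%s] [%.2f %%] [%s]\n", EndYear, FG_PCT, 3P }'
# → [] [0.00 %] [3]
```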
awk command or sed command
000Bxxxxx111118064085vxas - header
10000000001000000000053009-000000000053009-
10000000005000000000000000+000000000000000+
10000000030000000004025404-000000004025404-
10000000039000000000004930-000000000004930-
10000000088000005417665901-000005417665901-
90000060883328364801913 - trailer

In the above file we have a header, a trailer, and detail records, which start with 1. In the detail records, I want to sum the values starting from position 28 through position 44, including the sign, using an awk/sed command.
Here is sed, with help from bc to do the arithmetic:

sed -rn '
  /header|trailer/! {
    s/[[:digit:]]*[+-]([[:digit:]]+)([+-])$/\2\1/
    H
  }
  $ {
    x
    s/\n//gp
  }
' file | bc

I assume the +/- sign follows the number.
Using awk we can solve this problem making use of substr:

substr(s, m[, n]): Return the at most n-character substring of s that begins at position m, numbering from 1. If n is omitted, or if n specifies more characters than are left in the string, the length of the substring shall be limited by the length of the string s.

This allows us to extract the string which represents the number. Here, I assumed that the sign before and after the number is the same, and is thus the sign of the number:

$ echo "10000000001000000000053009-000000000053009-" \
    | awk '{print length($0); print substr($0,27,43-27)}'
43
-000000000053009

Since awk implicitly converts strings to numbers when you do numeric operations on them, we can write the following awk code to achieve the requested result:

$ awk '/header|trailer/{next} {s+=substr($0,27,43-27)} END{print s}' file.dat
-5421749244

The above example just works on the sample file given by the OP. However, if you have a file containing multiple blocks with header and trailer, and you only want to use the text inside these blocks (excluding everything outside of them), then you should handle it a bit differently:

$ awk '/header/{s=0;c=1;next} /trailer/{S+=s;c=0;next} c{s+=substr($0,27,43-27)} END{print S}' file.dat

Here we do the following:

If a line with header is found, reset the block sum s to zero and set c=1, indicating that the following lines are taken into account.
If a line with trailer is found, add the block sum s to the overall sum S and set c=0, indicating that the following lines are ignored.
If c != 0, add the line's value to the block sum s.
At the END, print the total sum S.
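The core substr-and-sum step can be verified on a two-line stand-in (the first two detail records from the question, using the same 16-character slice starting at position 27):

```shell
# substr() pulls "-000000000053009" and "+000000000000000"; the implicit
# string-to-number conversion in "s +=" honours the leading sign.
printf '%s\n' \
  '10000000001000000000053009-000000000053009-' \
  '10000000005000000000000000+000000000000000+' |
awk '{ s += substr($0, 27, 16) } END { print s }'
# → -53009
```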
Comparing hexadecimal values with awk
I'm having trouble with awk and comparing values. Here's a minimal example:

echo "0000e149 0000e152" | awk '{print($1==$2)}'

which outputs 1 instead of 0. What am I doing wrong, and how should I compare such values?
To convert a string representing a hex number to a numerical value, you need 2 things: prefix the string with "0x" and use the strtonum() function. To demonstrate:

echo "0000e149 0000e152" | gawk '{
    print $1, $1+0
    print $2, $2+0
    n1 = strtonum("0x" $1)
    n2 = strtonum("0x" $2)
    print $1, n1
    print $2, n2
}'
0000e149 0
0000e152 0
0000e149 57673
0000e152 57682

We can see that when the strings are naively treated as numbers, awk thinks their value is 0. This is because the digits preceding the first non-digit happen to be only zeros.

Ref: https://www.gnu.org/software/gawk/manual/html_node/String-Functions.html

Note that strtonum() is a GNU awk extension.
You need to convert $1 and $2 to strings in order to enforce an alphanumeric comparison. This can be done by simply appending an empty string to them:

echo "0000e149 0000e152" | awk '{print($1""==$2"")}'

Otherwise awk performs a numeric comparison, which requires converting both values to numbers. That conversion yields 0 for each: awk reads the longest prefix that looks like a number, and "0000e149" parses as 0000e149, i.e. a zero mantissa times a power of ten, which is 0 (the same holds for "0000e152"). You can verify that with the following command:

echo "0000e149 0000e152" | awk '{print $1+0; print $2+0}'
0
0
When using non-decimal data you just need to tell gawk that's what you're doing, and specify what base you're using in each number:

$ echo "0xe152 0x0000e152" | awk --non-decimal-data '{print($1==$2)}'
1
$ echo "0xE152 0x0000e152" | awk --non-decimal-data '{print($1==$2)}'
1
$ echo "0xe149 0x0000e152" | awk --non-decimal-data '{print($1==$2)}'
0

See http://www.gnu.org/software/gawk/manual/gawk.html#Nondecimal-Data
I think many forget the fact that the hex digits 0-9, A-F, a-f rank in order in ASCII. Instead of spending time performing the conversion, or risking a numeric precision shortfall, simply:

Trim the leading zeros, including the optional 0x / 0X. Depending on the input source, also trim delimiters such as ":" (e.g. IPv6, MAC address), "-" (e.g. UUID), "_" (e.g. "0xffff_ffff_ffff_ffff"), "%" (e.g. URL-encoding), etc. Be mindful of the need to pad in missing leading zeros for formats that are very flexible with delimiters, such as IPv6.

Compare their respective string length()s: if those differ, then one is already distinctly larger.

Otherwise, prefix both with something meaningless like "\1" to guarantee a string-compare operation, without risk of awk being too smart or of extreme edge cases like locale-specific peculiarities in collating order:

(("\1") toupper(hex_str_1)) == (("\1") toupper(hex_str_2))
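A sketch of that recipe as an awk function (the function name and the simple 0x/zero trimming are illustrative assumptions; delimiter stripping for IPv6/UUID-style inputs is left out for brevity):

```shell
echo "0000e149 0000e152" |
awk '
function hexcmp(a, b) {
    sub(/^0[xX]/, "", a); sub(/^0[xX]/, "", b)   # drop optional 0x/0X prefix
    sub(/^0+/, "", a);    sub(/^0+/, "", b)      # trim leading zeros
    a = toupper(a);       b = toupper(b)
    if (length(a) != length(b))                  # shorter string is smaller
        return length(a) < length(b) ? -1 : 1
    a = "\001" a; b = "\001" b                   # force a string comparison
    return a < b ? -1 : (a > b ? 1 : 0)
}
{ print hexcmp($1, $2) }'
# → -1
```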
setting default numeric format in awk
I wanted to do a simple parsing of two files with ids and some corresponding numerical values, and I didn't want awk to print numbers in scientific notation. The file looks like this:

someid-1 860025 50.0401 4.00022
someid-2 384319 22.3614 1.78758
someid-3 52096 3.03118 0.242314
someid-4 43770 2.54674 0.203587
someid-5 33747 1.96355 0.156967
someid-6 20281 1.18004 0.0943328
someid-7 12231 0.711655 0.0568899
someid-8 10936 0.636306 0.0508665
someid-9 10224.8 0.594925 0.0475585
someid-10 10188.8 0.59283 0.047391

When I use print instead of printf:

awk 'BEGIN{FS=OFS="\t"} NR==FNR{x[$1]=$0;next} ($1 in x){split(x[$1],k,FS); print $1,k[2],k[3],k[4],$2,$3,$4}' OSCAo.txt dme_miRNA_PIWI_OSC.txt | sort -n -r -k 7 | head

I get this result:

dme-miR-iab-4-5p 0.333333 0.000016 0.000001 0.25 0.000605606 9.36543e-07
dme-miR-9c-5p 10987.300000 0.525413 0.048798 160.2 0.388072 0.000600137
dme-miR-9c-3p 731.986000 0.035003 0.003251 2.10714 0.00510439 7.89372e-06
dme-miR-9b-5p 30322.500000 1.450020 0.134670 595.067 1.4415 0.00222922
dme-miR-9b-3p 2628.280000 0.125684 0.011673 48 0.116276 0.000179816
dme-miR-9a-3p 10.365000 0.000496 0.000046 0.25 0.000605606 9.36543e-07
dme-miR-999-5p 103.433000 0.004946 0.000459 0.0769231 0.00018634 2.88167e-07
dme-miR-999-3p 1513.790000 0.072389 0.006723 28 0.0678278 0.000104893
dme-miR-998-5p 514.000000 0.024579 0.002283 73 0.176837 0.000273471
dme-miR-998-3p 3529.000000 0.168756 0.015673 42 0.101742 0.000157339

Notice the scientific notation in the last column. I understand that printf with an appropriate format modifier can do the job, but the code becomes very lengthy. I have to write something like this:

awk 'BEGIN{FS=OFS="\t"} NR==FNR{x[$1]=$0;next} ($1 in x){split(x[$1],k,FS); printf "%s\t%3.6f\t%3.6f\t%3.6f\t%3.6f\t%3.6f\t%3.6f\n", $1,k[2],k[3],k[4],$2,$3,$4}' file1.txt file2.txt > fileout.txt

This becomes clumsy when I have to parse fileout with another similarly structured file.
Is there any way to specify default numerical output, such that any string will be printed like a string but all numbers follow a particular format.
I think you misinterpreted the meaning of %3.6f. The first number, before the decimal point, is the field width, not the "number of digits before the decimal point" (see printf(3)). So you should use %10.6f instead. It can be tested easily in bash:

$ printf "%3.6f\n%3.6f\n%3.6f\n" 123.456 12.345 1.234
123.456000
12.345000
1.234000
$ printf "%10.6f\n%10.6f\n%10.6f\n" 123.456 12.345 1.234
123.456000
 12.345000
  1.234000

You can see that the latter aligns on the decimal point properly.

As sidharth c nadhan mentioned, you can use the OFMT awk internal variable (see awk(1)). An example:

$ awk 'BEGIN{print 123.456; print 12.345; print 1.234}'
123.456
12.345
1.234
$ awk -vOFMT=%10.6f 'BEGIN{print 123.456; print 12.345; print 1.234}'
123.456000
 12.345000
  1.234000

As I see in your example, the number with the most digits can be 123456.1234567, so the format %15.7f would cover them all and show a nice-looking table. But unfortunately it will not work if the number has no decimal point in it, or even if it does but ends with .0:

$ awk -vOFMT=%15.7f 'BEGIN{print 123.456;print 123;print 123.0;print 0.0+123.0}'
    123.4560000
123
123
123

I even tried gawk's strtonum() function, but integers are still considered non-OFMT strings. See:

awk -vOFMT=%15.7f -vCONVFMT=%15.7f 'BEGIN{print 123.456; print strtonum(123); print strtonum(123.0)}'

It has the same output as before. So I think you have to use printf anyway. The script can be a little bit shorter and a bit more configurable:

awk -vf='\t'%15.7f 'NR==FNR{x[$1]=sprintf("%s"f f f,$1,$2,$3,$4);next}$1 in x{printf("%s"f f f"\n",x[$1],$2,$3,$4)}' file1.txt file2.txt

The script will not work properly if there are duplicated IDs in the first file. If that cannot happen, then the two conditions can be swapped and the ;next can be left off.
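If the real goal is "print strings as-is but give every numeric-looking field one fixed format", a generic per-field pass is another option; a sketch (the regex for "looks numeric" is a deliberate simplification, and the %.6f format is just an example width):

```shell
# Reformat any field that matches a simple decimal/scientific-notation
# pattern; leave everything else (ids, text) untouched.
printf 'someid-1\t9.36543e-07\t160.2\n' |
awk 'BEGIN { FS=OFS="\t" }
{
    for (i = 1; i <= NF; i++)
        if ($i ~ /^-?[0-9]+([.][0-9]+)?([eE][+-]?[0-9]+)?$/)
            $i = sprintf("%.6f", $i)
    print
}'
# → someid-1	0.000001	160.200000
```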
awk 'NR==FNR{x[$1]=$0;next} ($1 in x){split(x[$1],k,FS); printf "%s\t%9s\t%9s\t%9s\t%9s\t%9s\t%9s\n", $1,k[2],k[3],k[4],$2,$3,$4}' file1.txt file2.txt > fileout.txt
formatted reading using awk
I am trying to read in a formatted file using awk. The content looks like the following:

1PS1 A1 1 11.197 5.497 7.783
1PS1 A1 1 11.189 5.846 7.700
.
.
.

Following C format, these lines are "%5d%5s%5s%5d%8.3f%8.3f%8.3f": the first 5 positions are an integer (1), the next 5 positions are characters (PS1), the next 5 positions are characters (A1), the next 5 positions are an integer (1), and the next 24 positions are divided into 3 columns of 8 positions each, holding floating-point numbers with 3 decimal places.

What I've been using is just referencing these lines by whitespace-separated columns, using $1, $2, $3, and so on. For example:

cat test.gro | awk 'BEGIN{i=0} {MolID[i]=$1; id[i]=$2; num[i]=$3; x[i]=$4; y[i]=$5; z[i]=$6; i++} END { ... }' >test1.gro

But I ran into some problems with this, and now I am trying to read these files in a formatted way, as discussed above. Any idea how I do this?
Looking at your sample input, it seems the format string is actually "%5d%-5s%5s%5d%8.3f%8.3f%8.3f", with the first string field being left-justified. It's too bad awk doesn't have a scanf() function, but you can get your data with a few substr() calls:

awk -v OFS=: '
{
    a=substr($0,1,5)
    b=substr($0,6,5)
    c=substr($0,11,5)
    d=substr($0,16,5)
    e=substr($0,21,8)
    f=substr($0,29,8)
    g=substr($0,37,8)
    print a,b,c,d,e,f,g
}
'

outputs

1:PS1 : A1: 1: 11.197: 5.497: 7.783
1:PS1 : A1: 1: 11.189: 5.846: 7.700

If you have GNU awk, you can use the FIELDWIDTHS variable like this:

gawk -v FIELDWIDTHS="5 5 5 5 8 8 8" -v OFS=: '{print $1, $2, $3, $4, $5, $6, $7}'

which also outputs

1:PS1 : A1: 1: 11.197: 5.497: 7.783
1:PS1 : A1: 1: 11.189: 5.846: 7.700
You never said exactly which fields you think should have what number, so I'd like to be clear about how awk thinks that works. (Your choice to explicitly call the whitespace in your output format string "fields" makes me worry a little; you might have a different idea about this than awk does.) From the manpage:

An input line is normally made up of fields separated by white space, or by regular expression FS. The fields are denoted $1, $2, ..., while $0 refers to the entire line. If FS is null, the input line is split into one field per character.

Take note that the whitespace in the input line does not get assigned a field number, and that sequential whitespace is treated as a single field separator. You can test this with something like:

echo "1 2 3 4" | awk '{print "1:" $1 "\t2:" $2 "\t3:" $3 "\t4:" $4}'

at the command line. All of this assumes that you have not diddled the FS variable, of course.