AWK reading a file and operate between different columns - awk

I have three files!
coord.xvg
veloc.xvg
force.xvg
each of these files have lines with multiple numbers lets say 10000
I would like to construct a script that
opens the three files
reads columns
and make arithmetic operations between them for every line.
For example
if every file has 4 words
coord.xvg >> Time x y z
veloc.xvg >> Time vx vy vz
force.xvg >> Time fx fy fz
and c,v,f stands for coord.xvg, veloc.xvg,force.xvg
if I write the operation 2*v*v+c*f*c the output should be
column1 Column2 Column3 Column4
Time 2*vx*vx+cx*fx*cx 2*vy*vy+cy*fy*cy 2*vz*vz+cz*fz*cz
I have found in the internet the following
awk '{
{ getline < "coord.xvg" ; if (FNR==90307) for(i=1;i<=2;i+=1) c=$i}
{ getline < "veloc.xvg" ; if (FNR==90307) for(i=1;i<=2;i+=1) v=$i}
{ getline < "force.xvg" ; if (FNR==90307) for(i=1;i<=2;i+=1) f=$i}
}
END {print c+v+f}' coord.xvg
which stands for my files which I want to begin reading after 90307 lines.
but it didn't help me much as it returns only the last values of every variable
Any thought??

Something to get you started if I understood you correctly
$ cat *.xvg
Time 1 2 3
Time 4 5 6
Time 7 8 9
Time 10 11 12
Time 13 14 15
Time 16 17 18
Time 19 20 21
Time 22 23 24
Time 25 26 27
The following awk-script
{ if (FNR>=1) {
{ getline < "coord.xvg" ; c1=$2;c2=$3;c3=$4}
{ getline < "veloc.xvg" ; v1=$2;v2=$3;v3=$4}
{ getline < "force.xvg" ; f1=$2;f2=$3;f3=$4}
print c1,c2,c3,v1,v2,v3,f1,f2,f3
print $1, c1+v1+f1, c2+v2+f2, c3+v3+f3
}}
reads a line from each of the files and puts the data in variables
as can be seen here
$ awk -f s.awk coord.xvg
1 2 3 19 20 21 10 11 12
Time 30 33 36
4 5 6 22 23 24 13 14 15
Time 39 42 45
7 8 9 25 26 27 16 17 18
Time 48 51 54
The if (FNR>=1) part controls which lines are displayed. Counting starts at 1, change this to match your need. The actual calculations I leave to you :-)

Related

how to grep a N (7) column row before a grepped number from a long text file

I have a text file with a spacial format.
After the top "N" rows, the file will have a 7 column row ans then there will be "X" rows (X is the value from column number 6 in this 7 column row). Then there will be another row with 7 column and it will have further "Y" sub-rows (Y is the value from column number 6 in this 7 column row). and it occurance of rows will go upto some fixed numbers, say 40.
En example is here
(I am skipping top few rows).
2.857142857143E-01 2.857142857143E-01-2.857142857143E-01 1 1533 9 1.0
1 -3.52823873905418
2 -3.52823873905417
3 -1.77620635653680
4 -1.77620635653680
5 -1.77620570068355
6 -1.77620570068354
7 -1.77620570066112
8 -1.77620570066112
9 -1.60388273192418
1.428571428571E-01 1.428571428571E-01-1.428571428571E-01 2 1506 14 8.0
1 -3.52823678441811
2 -3.52823678441810
3 -1.77620282216865
4 -1.77620282216865
5 -1.77619365786042
6 -1.77619365786042
7 -1.77619324280126
8 -1.77619324280125
9 -1.60387130881086
10 -1.60387130881086
11 -1.60387074066972
12 -1.60387074066972
13 -1.51340357895078
14 -1.51340357895078
1.000000000000E+00 4.285714285714E-01-1.428571428571E-01 20 1524 51 24.0
1 -3.52823580096110
2 -3.52823580096109
3 -1.77624472106293
4 -1.77624472106293
5 -1.77623455229910
6 -1.77623455229909
7 -1.77620473017160
8 -1.77620473017159
9 -1.60387169115834
10 -1.60387169115834
11 -1.60386634866654
12 -1.60386634866654
13 -1.51340851656332
14 -1.51340851656332
15 -1.51340086887553
16 -1.51340086887553
17 -1.51321967923767
18 -1.51321967923766
19 -1.40212716813452
20 -1.40212716813451
21 -1.40187887062753
22 -1.40187887062753
23 -0.749391485667459
24 -0.749391485667455
25 -0.740712218931955
26 -0.740712218931954
27 -0.714030906779278
28 -0.714030906779278
29 -0.689087278411268
30 -0.689087278411265
31 -0.687054399753234
32 -0.687054399753233
33 -0.677686868127079
34 -0.677686868127075
35 -0.405343895324740
36 -0.405343895324739
37 -0.404786479693490
38 -0.404786479693488
39 -0.269454266134757
40 -0.269454266134755
41 -0.267490250650300
42 -0.267490250650296
43 -0.262198373307171
44 -0.262198373307170
45 -0.260912148881762
46 -0.260912148881761
47 -9.015623907768122E-002
48 -9.015623907767983E-002
49 0.150591609452852
50 0.150591609452856
51 0.201194203960446
I want to grep a particular number from my text file and to do so, I use
awk -v c=2 -v t=$GREP 'NR==1{d=$c-t;d=d<0?-d:d;v=$c;next}{m=$c-t;m=m<0?-m:m}m<d{d=m;v=$c}END{print v}' case.dat
Here $GREP is 0.2011942 which prints the last row (it will change according to different file)
51 0.201194203960446
I want to print the header row also with this number, i.e., my script should print,
51 0.201194203960446
1.000000000000E+00 4.285714285714E-01-1.428571428571E-01 20 1524 51 24.0.
How can I print this header row of the grepped numbers?
I have idea, but I could not implement it in script format.
Simply, grep the number using my script and print the first row before this number that have 7 columns.
This may be what you're looking for
awk -v t="$GREP" '
BEGIN { sub("\\.", "\\.", t) }
NF > 2 { header=$0; next }
NF == 2 && $2 ~ t { printf("%s %s\n%s\n", $1, $2, header) }
' file
You can replace the NF > 2 with NF == 7 if you want the strictly seven-column header to be printed (that header contains 6 columns in your sample data, not 7).
Update after the comment "Can you please modify my script so that it should grep upto 13 decimal number":
awk -v t="$GREP" '
BEGIN { if (match(t, "\\.")) {
t = substr(t, 1, RSTART + 13)
sub("\\.", "\\.", t)
}
}
NF > 2 { header=$0; next }
NF == 2 && $2 ~ t { printf("%s %s\n%s\n", $1, $2, header) }
' file

Transform a 1xA table into a BxC table in awk

I am trying to turn a 1xA table into a BxC table. Let's say A is 15, B is 3 and C is 5, hence after each 5 entries I want it to start a new row in the same table.
I have a rather tedious way that appears to get close be it misses some values after each 5. I think the issue is with RS, as a new line forgets the "space" needed by RS, but I tried changing this to something else in file.sum and still no luck. Perhaps there is a better way to do it, but feel this should work.
awk -v RS=" " '{getline a1; getline a2; getline a3; getline a4; getline a5; print a1,a2,a3,a4,a5}' OFS='\t' file.sum
file.sum (my 1xA):
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
Expected results (my BxC):
1 2 3 4 5
6 7 8 9 10
11 12 13 14 15
Actual results:
1 2 3 4 5
7 8 9 10 11
13 14 15 10 11
This should be one of the simplest solution:
xargs -n5 <file
1 2 3 4 5
6 7 8 9 10
11 12 13 14 15
To follow up on your awk. I do not like the getline so I always try to avoid it. Also loop slows down awk some.
But using RS=" " you can do like this:
awk -v RS=" " '{$1=$1} {printf NR%5==0?"%s\n":"%s ",$0}' file
1 2 3 4 5
6 7 8 9 10
11 12 13 14 15
You can remove the {$1=$1}, but will then get a blank line at the end.
The NR%5==0 test if record is every 5th and insert newline when needed.
A tab version:
awk -v RS=" " '{$1=$1} {printf NR%5==0?"%s\n":"%s\t",$0}' file
1 2 3 4 5
6 7 8 9 10
11 12 13 14 15

Select current and previous line if certain value is found

To figure out my problem, I subtract column 3 and create a new column 5 with new values, then I print the previous and current line if the value found is equal to 25 in column 5.
Input file
1 1 35 1
2 5 50 1
2 6 75 1
4 7 85 1
5 8 100 1
6 9 125 1
4 1 200 1
I tried
awk '{$5 = $3 - prev3; prev3 = $3; print $0}' file
output
1 1 35 1 35
2 5 50 1 15
2 6 75 1 25
4 7 85 1 10
5 8 100 1 15
6 9 125 1 25
4 1 200 1 75
Desired Output
2 5 50 1 15
2 6 75 1 25
5 8 100 1 15
6 9 125 1 25
Thanks in advance
you're almost there, in addition to previous $3, keep the previous $0 and only print when condition is satisfied.
$ awk '{$5=$3-p3} $5==25{print p0; print} {p0=$0;p3=$3}' file
2 5 50 1 15
2 6 75 1 25
5 8 100 1 15
6 9 125 1 25
this can be further golfed to
$ awk '25==($5=$3-p3){print p0; print} {p0=$0;p3=$3}' file
check the newly computed field $5 whether equal to 25. If so print the previous line and current line. Save the previous line and previous $3 for the computations in the next line.
You are close to the answer, just pipe it another awk and print it
awk '{$5 = $3 - prev3; prev3 = $3; print $0}' oxxo.txt | awk ' { curr=$0; if($5==25) { print prev;print curr } prev=curr } '
with Inputs:
$ cat oxxo.txt
1 1 35 1
2 5 50 1
2 6 75 1
4 7 85 1
5 8 100 1
6 9 125 1
4 1 200 1
$ awk '{$5 = $3 - prev3; prev3 = $3; print $0}' oxxo.txt | awk ' { curr=$0; if($5==25) { print prev;print curr } prev=curr } '
2 5 50 1 15
2 6 75 1 25
5 8 100 1 15
6 9 125 1 25
$
Could you please try following.
awk '$3-prev==25{print line ORS $0,$3} {$(NF+1)=$3-prev;prev=$3;line=$0}' Input_file | column -t
Here's one:
$ awk '{$5=$3-q;t=p;p=$0;q=$3;$0=t ORS $0}$10==25' file
2 5 50 1 15
2 6 75 1 25
5 8 100 1 15
6 9 125 1 25
Explained:
$ awk '{
$5=$3-q # subtract
t=p # previous to temp
p=$0 # store previous for next round
q=$3 # store subtract value for next round
$0=t ORS $0 # prepare record for output
}
$10==25 # output if equals
' file
No checking for duplicates so you might get same record printed twice. Easiest way to fix is to pipe the output to uniq.

change place of last column but getting new line

I have a file separated by \t.
header text with many lines
V F A B
10 30 26 42
14 33 25 45
16 32 23 43
18 37 22 48
I want to change the 3rd column by the 4th and vice versa. I'm using
awk '
BEGIN {
RS = "\n";
OFS="\t";
record=0;
};
record {
a = $4;
$4 = $3;
$3 = a;
};
$1=="V" {
record=1
};
{
print $0
};
'
}
Instead of just changing the position of the columns, column 3 also has the line break of the original 4th column:
header text with many lines
V F A B
10 30 42
26
14 33 45
25
16 32 43
23
18 37 48
22
How can I prevent this in order to get?
header text with many lines
V F A B
10 30 42 26
14 33 45 25
16 32 43 23
18 37 48 22
Could you please try following, using usual method of storing 1 field's value to a variable and then exchanging the value of 4th field to 3rd field, at last putting 4th field value as variable value(could say swapping values using a variable).
awk 'FNR==1{print;next} {val=$3;$3=$4;$4=val} 1' OFS="\t" Input_file
Or, this messy sed:
sed -E 's/([[:digit:]]+)([[:blank:]]+)([[:digit:]]+)([[:space:]]*)$/\3\2\1\4/' file
# ^^^^^^^^^^^^^^ ^^^^^^^^^^^^^^
# 3rd column tab 4th column optional whitespce

print a line from every 5 elements of a column

I am looking for a way to select a column (e. g. eighth column) of a data file and write the first five numbers of that column in a row, the next five numbers in second row, and so on.
I have been testing with awk and printf without success.
The awk way to do this is to switch from using OFS and ORS to separate the output using the modulus function:
$ seq 1 20 | awk '{printf "%s", $1 (NR % 5 ? OFS : ORS)}'
1 2 3 4 5
6 7 8 9 10
11 12 13 14 15
16 17 18 19 20
Change $1 to $8 for the eigth column for example and NR % 5 to NR % 10 for rows of 10 instead of 5. The seq command just generate a single column of digits from 1 to 20 used for demonstration.
I also find using xargs useful for this kind of thing:
$ seq 1 20 | awk '{print $1}' | xargs -n5
1 2 3 4 5
6 7 8 9 10
11 12 13 14 15
16 17 18 19 20
The awk isn't necessary for the example as seq only produces a single column however for your question change $1 to $8 to select only the eighth column from your input. With this approach you could also switch out awk with cut.
This will also produce the format requested
seq 1 20 | awk '{printf("%s ", $1); if (NR % 5 == 0) printf("\n")}'
where $1 indicates de column number which could be changed when passing an archive to the awk line.