awk script to add numbers in one column [closed]

How can I sum the numbers in the 2nd column, using for or while loops in an awk script, over a sliding window on $1? The values in $1 are arbitrary but in increasing order, and the window is:
1.0 through 2.0 in the 1st iteration,
1.1 through 2.1 in the 2nd iteration,
1.2 through 2.2 in the 3rd iteration,
1.3 through 2.3 in the 4th iteration,
... and so on up to the end.
That is, the window shifts by 0.1 at each iteration.
Expected output:
39 for the 1st iteration
47 for the 2nd iteration
...
Input data:
1.0 1
1.1 3
1.2 4
1.3 3
1.4 5
1.5 7
1.6 10
2.0 6
2.1 9
2.2 2
2.3 8
2.4 0
3.0 4
3.2 5
4.0 8
4.1 6
5.0 7
6.0 6
7.0 7
8.7 9
9.8 2

Here, I multiply each $1 by 10 to avoid problems with imprecise decimal numbers.
awk -v max="$(tail -1 data | awk '{print $1*10}')" '
    {n = $1 * 10}
    NR==1 {min = n}
    {
        for (i = min; i <= (max - 10); i++) {
            if (i <= n && n <= (i + 10)) {
                sum[i, i+10] += $2
            }
        }
    }
    END {
        for (key in sum) {
            split(key, a, SUBSEP)
            printf "[%.1f,%.1f] = %d\n", a[1]/10, a[2]/10, sum[key]
        }
    }
' data | sort -n
Output:
[1.0,2.0] = 39
[1.1,2.1] = 47
[1.2,2.2] = 46
[1.3,2.3] = 50
...
[8.6,9.6] = 9
[8.7,9.7] = 9
[8.8,9.8] = 2
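
If the input ever carries more decimal places, or you simply want to be defensive about the scaling itself, you can round the scaled key explicitly instead of relying on the raw product. This is only a sketch of that idea, reusing the same data file and output format; keying the sums on the lower bound alone (sum[i] instead of sum[i, i+10]) is my simplification, since the upper bound is always i+10:
awk -v max="$(tail -1 data | awk '{printf "%d", $1*10 + 0.5}')" '
    {n = int($1 * 10 + 0.5)}        # round the scaled key instead of trusting the raw product
    NR == 1 {min = n}
    {
        for (i = min; i <= max - 10; i++)
            if (i <= n && n <= i + 10)
                sum[i] += $2        # window [i/10, (i+10)/10]
    }
    END {
        for (i in sum)
            printf "[%.1f,%.1f] = %d\n", i/10, (i + 10)/10, sum[i]
    }
' data | sort -n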

Related

using awk insert new data and adding how far away [closed]

I have a data set like this (file.txt), tab separated:
1 a
3 b
5 c
7 d
8 e
12 f
17 g
20 h
When I want to add a new value
6
I want to create a new column which holds how far each existing value is from the new entry.
Desired output is like this:
1 a 5
3 b 3
5 c 1
6 0
7 d 1
8 e 2
12 f 6
17 g 11
20 h 14
I have tried:
awk -v new="6" '
BEGIN {
FS=OFS="\t"
{gsub(new,",\n")}
{print $0"\n"$2,$3=|new-$1|}
' file
$ awk -v new=6 '
BEGIN {
    FS = OFS = "\t"
}
{
    if ((new < $1) && f == "") {                   # when greater-than-new values are seen
        print new                                  # for the first time, act
        f = 1                                      # flag to print only once
    }
    print $1, $2, ((v = new - $1) < 0 ? -v : v)    # abs with ternary operator
}' file
Output:
1 a 5
3 b 3
5 c 1
6
7 d 1
8 e 2
12 f 6
17 g 11
20 h 14
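
Note that the desired output above shows 6 0 on the inserted row, while this answer prints a bare 6. If you want the distance column on that row too, a small variation of the same idea should do it (only a sketch; it assumes the inserted row really is just the two tab-separated fields 6 and 0):
awk -v new=6 '
    BEGIN { FS = OFS = "\t" }
    (new < $1) && !f {                # first existing value greater than new
        print new, 0                  # inserted row: the new value and distance 0
        f = 1
    }
    { print $1, $2, ((v = new - $1) < 0 ? -v : v) }
' file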

How to separate a float number in awk into two values? [closed]

There is an input file with values
2 0.2
3 0.3
4 0.4
5 1.0
6 1.1
7 1.2
8 1.3
9 2.0
10 2.1
11 2.2
12 3.0
13 3.1
14 4.0
0 0.0
1 0.1
which are produced by this part of the code:
BEGIN {
    n = 4
    b = n / 10
    t = 0
    for (k = 0.0; k <= n; k++) {
        for (j = 0.0; j <= b; j += 0.1) {
            arr[t] = k + j
            t++
        }
        b = b - 0.1
    }
    for (n in arr) {
        printf(" %d %.1f\n ", n, arr[n])
    }
}
The question is: how do I get output that separates the floating-point number into its integer part and its decimal part?
The expected output is:
2 0 2
3 0 3
and so on..
Have you tried:
$ echo "0.1" | awk '{split($0, a, "."); print a[1]; print a[2]}'
split() breaks the value on the decimal point: a[1] is the integer part and a[2] is the decimal part, which is what the desired output needs.
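
Plugged into the generating BEGIN block from the question, that could look roughly like this — only a sketch, with the loop left unchanged and the value formatted with %.1f before the split so there is always exactly one digit after the point:
awk 'BEGIN {
    n = 4
    b = n / 10
    t = 0
    for (k = 0.0; k <= n; k++) {
        for (j = 0.0; j <= b; j += 0.1) {
            arr[t] = k + j
            t++
        }
        b = b - 0.1
    }
    # split each stored value on the decimal point before printing
    for (i in arr) {
        split(sprintf("%.1f", arr[i]), parts, ".")
        printf(" %d %s %s\n", i, parts[1], parts[2])
    }
}'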

Using awk, extract the first and last numbers between two strings in a column in a text file, and difference those?

I have a text file that looks similar to below.
Code 1 (3)
5 10 10
6 10 10
7 10 10
Code 2 (2)
9 11 11
10 8 8
Code 3 (1)
12 10 9
Code 4 (2)
14 8 10
15 8 10
I am only interested in the first and last numbers, in the first column. I would like to extract the first, last, and difference (1+last-first) to a new text file with a column for each first, last, and difference. The result should look like below. Technically, the difference column could be the number between the parentheses, as this number would always be the 1+difference between the last and first numbers between each string. Note, the last row in the input text file does not have a string below it.
5 7 3
9 10 2
12 12 1
14 15 2
Trying awk '/Code/{flag=1;next}/Code/{flag=0}flag' gives me all the lines and columns between each string. Trying awk '$1 ~ /Code/{flag=1;next},$1 ~ 1 /Code/{flag=0}flag' results in a syntax error at ,.
You may use this awk:
awk -v OFS='\t' '/^Code/ {
if (NR > 1)
print first, prev, (prev-first+1)
first = prev = ""
next
}
(first == "") {
first = $1
}
{
prev = $1
}
END {
print first, prev, (prev-first+1)
}' file
5 7 3
9 10 2
12 12 1
14 15 2
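
Since the question notes that the number in parentheses is always 1 + last - first, here is a variant of the same structure that reads that count from the Code line instead of computing it — just a sketch, assuming every Code line really ends in a (count) field:
awk -v OFS='\t' '/^Code/ {
    if (NR > 1)
        print first, prev, count
    count = $NF                 # e.g. "(3)" on the line "Code 1 (3)"
    gsub(/[()]/, "", count)     # strip the parentheses
    first = prev = ""
    next
}
(first == "") {
    first = $1
}
{
    prev = $1
}
END {
    print first, prev, count
}' file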

Convert n number of rows to columns repeatedly using awk

My data is a large text file that consists of 12 rows repeating. It looks something like this:
{
1
2
3
4
5
6
7
8
9
10
}
repeating over and over. I want to turn every 12 rows into columns, so the data would look like this:
{ 1 2 3 4 5 6 7 8 9 10 }
{ 1 2 3 4 5 6 7 8 9 10 }
{ 1 2 3 4 5 6 7 8 9 10 }
I have found some examples of how to convert all the rows to columns using awk: awk '{printf("%s ", $0)}', but no examples of how to convert every 12 rows into columns and then repeat the process.
Here is an idiomatic way (read: a golfed-down version of Tom Fenech's answer) of doing it with awk:
$ awk '{ORS=(NR%12?FS:RS)}1' file
{ 1 2 3 4 5 6 7 8 9 10 }
{ 1 2 3 4 5 6 7 8 9 10 }
{ 1 2 3 4 5 6 7 8 9 10 }
ORS stands for Output Record Separator. We set ORS to FS (a space by default) for every line except each 12th line, where we set it to RS (a newline by default).
You could use something like this:
awk '{printf "%s%s", $0, (NR%12?OFS:RS)}' file
NR%12 evaluates to true except when the record number is exactly divisible by 12. When it is true, the output field separator is used (which defaults to a space). When it is false, the record separator is used (by default, a newline).
Testing it out:
$ awk '{printf "%s%s", $0, (NR%12?OFS:RS)}' file
{ 1 2 3 4 5 6 7 8 9 10 }
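
If the group size ever changes, the same one-liner can take it as a variable (cols here is just an arbitrary name for the sketch):
awk -v cols=12 '{printf "%s%s", $0, (NR % cols ? OFS : RS)}' file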

Selecting elements of two column whose difference is less than some given value using awk

While doing post-processing for a numerical analysis, I have the following data-selection problem:
time_1  result_1  time_2  result_2
1       10        1.1     10.1
2       20        1.6     15.1
3       30        2.1     20.1
4       40        2.6     25.1
5       50        3.1     30.1
6       60        3.6     35.1
7       70        4.1     40.1
8       80        4.6     45.1
9       90        5.1     50.1
10      100       5.6     55.1
                  6.1     60.1
                  6.6     65.1
                  7.1     70.1
                  7.6     75.1
                  8.1     80.1
                  8.6     85.1
                  9.1     90.1
                  9.6     95.1
                  10.1    100.1
This file has 4 columns. The first column (time_1) holds the calculated instants of program 1, and the second column (result_1) the result calculated at each of those instants.
The third column (time_2) holds the calculated instants of another program, program 2, and the fourth column (result_2) the result calculated at each of its instants.
Now I wish to select only those instants of the third column (time_2) that are very near the instants of the first column (time_1); the admitted difference is less than or equal to 0.1. For example:
for the instant 1 of the time_1 column, I wish to select the instant 1.1 of the time_2 column, because (1.1 - 1) = 0.1; I do not want to select the other instants of the time_2 column, because (1.6 - 1) > 0.1 and (2.1 - 1) > 0.1
for the instant 2 of the time_1 column, I wish to select the instant 2.1 of the time_2 column, because (2.1 - 2) = 0.1; I do not want to select the other instants of the time_2 column, because (2.6 - 2) > 0.1 and (3.1 - 2) > 0.1
At the end, I would like to obtain the following data:
time_1  result_1  time_2  result_2
1       10        1.1     10.1
2       20        2.1     20.1
3       30        3.1     30.1
4       40        4.1     40.1
5       50        5.1     50.1
6       60        6.1     60.1
7       70        7.1     70.1
8       80        8.1     80.1
9       90        9.1     90.1
10      100       10.1    100.1
I wish to use awk, but I am not familiar with it. I do not know how to fix an element of the first column and then compare it to all elements of the third column in order to select the right value from that third column. If I do it very simply like this, I can print only the first line:
{if (($3>=$1) && (($3-$1) <= 0.1)) {print $2, $4}}
Thank you in advance for your help !
You can try the following perl script:
#! /usr/bin/perl
use strict;
use warnings;
use autodie;
use File::Slurp qw(read_file);

my @lines = read_file("file");
shift @lines;                      # skip first line

my @a;
for (@lines) {
    my @fld = split;
    if (@fld == 4) {
        push(@a, {id => $fld[0], val => $fld[1]});
    }
}

for (@lines) {
    my @fld = split;
    my $id; my $val;
    if (@fld == 4) {
        $id = $fld[2]; $val = $fld[3];
    } elsif (@fld == 2) {
        $id = $fld[0]; $val = $fld[1];
    }
    my $ind = checkId(\@a, $id);
    if ($ind >= 0) {
        $a[$ind]->{sel} = [] if (! exists($a[$ind]->{sel}));
        push(@{$a[$ind]->{sel}}, {id => $id, val => $val});
    }
}

for my $item (@a) {
    if (exists $item->{sel}) {
        my $s = $item->{sel};
        for (@$s) {
            print $item->{id} . "\t" . $item->{val} . "\t";
            print $_->{id} . "\t" . $_->{val} . "\n";
        }
    }
}

sub checkId {
    my ($a, $id) = @_;
    my $dif = 0.1 + 1e-10;
    for (my $i = 0; $i <= $#$a; $i++) {
        return $i if (abs($a->[$i]->{id} - $id) <= $dif);
    }
    return -1;
}
One thing to be aware of: due to the vagaries of floating point numbers, comparing a value to 0.1 is unlikely to give you the results you're looking for:
awk 'BEGIN {x=1; y=x+0.1; printf "%.20f", y-x}'
0.10000000000000008882
here, y=x+0.1, but y-x > 0.1
So we will look at the difference as diff = 10*y - 10*x instead.
Also, I'm going to process the file twice: once to grab all the time_1/result_1 values, and a second time to extract the "matching" time_2/result_2 values:
awk '
    NR == 1   {print; next}                        # print the header line once
    NR == FNR {if (NF == 4) r1[$1] = $2; next}     # 1st pass: collect time_1 -> result_1
    FNR == 1  {next}                               # 2nd pass: skip the header
    {
        if (NF == 4) {t2 = $3; r2 = $4} else {t2 = $1; r2 = $2}
        for (t1 in r1) {
            diff = 10*t1 - 10*t2
            if (-1 <= diff && diff <= 1) {         # i.e. |time_1 - time_2| <= 0.1
                print t1, r1[t1], t2, r2
                break
            }
        }
    }
' ~/tmp/timings.txt ~/tmp/timings.txt | column -t
time_1  result_1  time_2  result_2
1       10        1.1     10.1
2       20        2.1     20.1
3       30        3.1     30.1
4       40        4.1     40.1
5       50        5.1     50.1
6       60        6.1     60.1
7       70        7.1     70.1
8       80        8.1     80.1
9       90        9.1     90.1
10      100       10.1    100.1
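
If the admitted difference ever needs to be something other than 0.1, the same two-pass approach can take it as a variable. This is only a sketch (tol is a made-up name), and it swaps the scaling-by-10 trick for a plain absolute difference plus a tiny epsilon to absorb the floating-point noise discussed above:
awk -v tol=0.1 '
    function abs(x) { return x < 0 ? -x : x }
    NR == 1   {print; next}                        # print the header line once
    NR == FNR {if (NF == 4) r1[$1] = $2; next}     # 1st pass: collect time_1 -> result_1
    FNR == 1  {next}                               # 2nd pass: skip the header
    {
        if (NF == 4) {t2 = $3; r2 = $4} else {t2 = $1; r2 = $2}
        for (t1 in r1)
            if (abs(t1 - t2) <= tol + 1e-9) {      # epsilon guards against 0.10000000000000008-style results
                print t1, r1[t1], t2, r2
                break
            }
    }
' ~/tmp/timings.txt ~/tmp/timings.txt | column -t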