replacing associative array indexes with their value using awk or sed

I would like to replace column values of ref using key-value pairs from id.
cat id:
[1] a 8-23
[2] g 8-21
[3] d 8-13
cat ref:
a 1 2
b 3 4
c 5 3
d 1 2
e 3 1
f 1 2
g 2 3
Desired output:
8-23 1 2
b 3 4
c 5 3
8-13 1 2
e 3 1
f 1 2
8-21 2 3
I assume it would be best done using awk.
cat replace.awk
BEGIN { OFS="\t" }
NR==FNR {
a[$2]=$3; next
}
$1 in !{!a[#]} {
print $0
}
Not sure what I need to change?

$1 in !{!a[#]} is not awk syntax. You just need $1 in a:
BEGIN { OFS = "\t" }
NR==FNR {
    a[$2] = $3
    next
}
{
    $1 = ($1 in a) ? a[$1] : $1
    print
}
To force awk to rebuild $0 with the new OFS, this version always assigns to $1, even when the key is not in the array.
A bare print prints $0.
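That rebuild behaviour is easy to check on its own; a quick sketch (any POSIX awk), printing the same line with and without a field assignment:
echo 'a 1 2' | awk 'BEGIN { OFS="\t" } { print }'           # record untouched, still space-separated: a 1 2
echo 'a 1 2' | awk 'BEGIN { OFS="\t" } { $1 = $1; print }'  # field assigned, record re-joined with tabs: a<TAB>1<TAB>2
So with the fixed script saved as replace.awk, awk -f replace.awk id ref should print the desired output with tab separators throughout.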

Merge two files depending on range using awk

I have a file (let's call it main.txt) where the 1st column contains some numbers (2, 4, 8, 15).
2 OtherData
4 OtherData
8 OtherData
15 OtherData
Also, I have another file, mapping.txt. I want to compare each value from main.txt (2, 4, 8, 15) with the first two columns of the mapping file.
The 1st column is the minimum allowed value, the 2nd is the maximum.
1 4 1stType
5 9 2ndType
10 14 3rdType
15 99 4thType
100 1000 5thType
How can I get a result like this using awk?
2 OtherData 1stType # 1 <= 2 <= 4
4 OtherData 1stType # 1 <= 4 <= 4
8 OtherData 2ndType # 5 <= 8 <= 9
15 OtherData 4thType # 15 <= 15 <= 99
Could you please try the following, written and tested in GNU awk with the shown samples only.
awk '
FNR==NR{
    ++count
    start[count]=$1
    end[count]=$2
    value[count]=$NF
    next
}
{
    for(i=1;i<=count;i++){
        if($1>=start[i] && $1<=end[i]){
            print $0,value[i]
        }
    }
}
' Input_file2 Input_file1 | column -t
Output will be as follows.
2 OtherData 1stType
4 OtherData 1stType
8 OtherData 2ndType
15 OtherData 4thType
A shorter awk solution that loops through each range and stores the mapping in an array:
awk 'NR == FNR {
    for (i=$1; i<=$2; i++)
        map[i] = $3
    next
}
$1 in map {
    print $0, map[$1]
}' mapping.txt main.txt
2 OtherData 1stType
4 OtherData 1stType
8 OtherData 2ndType
15 OtherData 4thType
Alternative awk, storing each range as a composite key ($1 SUBSEP $2) and scanning all ranges for every line of main.txt:
awk 'NR == FNR {
    map[$1,$2] = $3
    next
}
{
    for (i in map) {
        split(i, a, SUBSEP)
        if ($1 >= a[1] && $1 <= a[2]) {
            print $0, map[i]
            next
        }
    }
}' mapping.txt main.txt

Using awk, how to average numbers in column between two strings in a text file

A text file contains multiple tab-delimited columns between marker strings; an example is below.
Code 1 (3)
5 10 7 1 1
6 10 9 1 1
7 10 10 1 1
Code 2 (2)
9 11 3 1 3
10 8 5 2 1
Code 3 (1)
12 10 2 1 1
Code 4 (2)
14 8 1 1 3
15 8 7 5 1
I would like to average the numbers in the third column for each code block. The example below is what the output should look like.
8.67
4
2
4
Attempt 1
awk '$3~/^[[:digit:]]/ {i++; sum+=$3; print $3} $3!~/[[:digit:]]/ {print sum/i; sum=0;i=0}' in.txt
Returned fatal: division by zero attempted.
Attempt 2
awk -v OFS='\t' '/^Code/ { if (NR > 1) {i++; sum+=$3;} {print sum/i;}}' in.txt
Returned another division by zero error.
Attempt 3
awk -v OFS='\t' '/^Code/ { if (NR > 1) { print s/i; s=0; i=0; } else { s += $3; i += 1; }}' in.txt
Returned 1 value: 0.
Attempt 4
awk -v OFS='\t' '/^Code/ {
if (NR > 1)
i++
print sum += $3/i
}
END {
i++
print sum += $3/i
}'
Returned:
0
0
0
0.3
I am not sure where that last number is coming from, but this has been the closest solution so far. I am getting a number for each block, but not the average.
Could you please try the following.
awk '
/^Code/{
    if(value!=0 && value){
        print sum/value
    }
    sum=value=""
    next
}
{
    sum+=$3        # sum the 3rd column
    value++
}
END{
    if(value!=0 && value){
        print sum/value
    }
}
' Input_file
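With the shown sample this prints the averages in awk's default number format (8.66667 rather than the rounded 8.67 shown in the desired output). If two decimals are wanted, a minimal variation of the same idea using printf (file name as in the attempts):
awk '/^Code/ { if (n) printf "%.2f\n", sum/n; sum=n=0; next }
     { sum += $3; n++ }
     END { if (n) printf "%.2f\n", sum/n }' in.txt
This would print 8.67, 4.00, 2.00 and 4.00; drop the format if trailing zeros are unwanted.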

How to add numbers from files to computation?

I need to get the results of this formula, a column of numbers,
{x = ($1-T1)/Fi; print (x-int(x))}
from the input files. file1:
4 4
8 4
7 78
45 2
file2
0.2
3
2
1
From these files there should be 4 outputs.
$1 is the first column of file1, T1 is the first line in the first column of file1 (the number 4) - it is always this number. Fi, where i = 1, 2, 3, 4, are the numbers from the second file. So I need a cycle over i from 1 to 4: compute the term once with F1=0.2, get the second output with F2=3, the third output with F3=2, and the last output with F4=1. How do I express T1 and Fi in this way, and how do I write the cycle?
awk 'FNR == NR { F[++n] = $1; next } FNR == 1 { T1 = $1 } { for (i = 1; i <= n; ++i) { x = ($1 - T1)/F[i]; print x - int(x) >"output" FNR} }' file2 file1
This gives more than 4 outputs. What is wrong please?
FNR == 1 { T1 = $1 } runs for both files; when file2 starts being read, T1 is reset to 0.2.
>"output" FNR is problematic; you should enclose the output name expression in parentheses.
Here's how I'd do it:
awk '
NR==1 {t1=$1}
NR==FNR {f[NR]=$1; next}
{
    fn="output"FNR
    for(i in f) {
        x=(f[i]-t1)/$1
        print x-int(x) >fn
    }
    close(fn)
}
' file1 file2
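One caveat with this version: for (i in f) does not guarantee numeric order, so the lines inside each output file may come out shuffled. If their order matters, a counter-driven loop keeps them in input order - a sketch of the same idea:
awk '
NR==1   { t1 = $1 }               # T1 = first value of file1
NR==FNR { f[++n] = $1; next }     # first column of file1, in input order
{
    fn = "output" FNR             # one output file per line of file2
    for (i = 1; i <= n; i++) {
        x = (f[i] - t1) / $1      # $1 here is Fi from file2
        print x - int(x) > fn
    }
    close(fn)
}' file1 file2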

Merge two rows with condition AWK

I have a question. I would like to merge two or three rows, on a condition, into one row with specific printing.
INPUT: the file is tab-delimited, with 6 columns per row:
LOL h/h 2 a b c
LOLA h/h 3 b b b
SERP w/w 4 c c c
DARD s/s 5 d d d
GIT w/w 6 a b c
GIT h/h 6 a a b
GIT d/d 6 a b b
LOL h/h 7 a a a
Output: there are 2 conditions: if the $1s are the same and the $3s are the same, merge the rows together with specific printing:
LOL h/h 2 a b c
LOLA h/h 3 b b b
SERP w/w 4 c c c
DARD s/s 5 d d d
GIT w/w 6 a b c h/h 6 a a b d/d 6 a b b
LOL h/h 7 a a a
I have this code:
awk -F'\t' -v OFS="\t" 'NF>1{a[$1] = a[$1]"\t"$2"\t"$3"\t"$4"\t"$5"\t"$6};END{for(i in a){print i""a[i]}}'
But it merges by the 1st column only, and I am not sure if this code is a good approach.
In awk:
$ awk '($1 FS $3) in a{k=$1 FS $3; $1=""; a[k]=a[k] $0;next} {a[$1 FS $3]=$0} END {for(i in a) print a[i]}' file
SERP w/w 4 c c c
LOL h/h 2 a b c
LOLA h/h 3 b b b
DARD s/s 5 d d d
LOL h/h 7 a a a
GIT w/w 6 a b c h/h 6 a a b d/d 6 a b b
Explained:
($1 FS $3) in a {        # if keys already seen in array a
    k=$1 FS $3
    $1=""                # remove $1
    a[k]=a[k] $0         # append to existing
    next
}
{ a[$1 FS $3]=$0 }       # if keys not seen, see them
END {
    for(i in a)          # for all stored keys
        print a[i]       # print
}
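Note that for (i in a) visits keys in an unspecified order, which is why the merged lines above are not in the input order. If the original order must be kept, the same idea works with a small order[] bookkeeping array - a sketch, only checked against the shown sample:
awk '
{ k = $1 FS $3 }
k in a { $1 = ""; a[k] = a[k] $0; next }  # key seen before: append the row minus its 1st field
{ a[k] = $0; order[++n] = k }             # first occurrence: store the row and remember its position
END { for (i = 1; i <= n; i++) print a[order[i]] }
' file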
Here is an answer for gawk v4, which supports true multi-dimensional arrays. The columns from the first file are stored in a multi-dimensional array, which makes them easy to compare against the columns of the second file. My solution shows an example printf which you can modify as per your needs.
#!/bin/gawk -f
NR==FNR {                 # for first file
    a[$1][0] = $2;        # Store columns in
    a[$1][1] = $3;        # multi-dimensional
    a[$1][2] = $4;        # array
    a[$1][3] = $5;
    a[$1][4] = $6;
    next;
}
$1 in a && $3 == a[$1][1] {
    printf("%s\t%s\n", $2, a[$1][0])
}
An answer using gawk v3, where I cannot use true multi-dimensional arrays:
#!/bin/gawk -f
NR==FNR {
    a[$1]
    b[$1] = $2;
    c[$1] = $3;
    d[$1] = $4;
    e[$1] = $5;
    f[$1] = $6;
    next;
}
$1 in a && $3 == c[$1] {
    print $0
}
One-liner
gawk 'NR==FNR {a[$1]; b[$1] = $2; c[$1] = $3; d[$1] = $4; e[$1] = $5; f[$1] = $6; next; } $1 in a && $3 == c[$1] { print $0 }' /tmp/file1 /tmp/file2

Remove data using AWK

I have a file of values that I wish to plot using gnuplot. The problem is that there are some values that I wish to remove.
Here is an example of my data:
1 52
2 3
3 0
4 4
5 1
6 1
7 1
8 0
9 0
I want to remove any row in which the right column is 0, so the data above would end up looking like this:
1 52
2 3
4 4
5 1
6 1
7 1
Let's just check field 2:
awk '$2' file
If the 2nd field has a true value, that is, not 0 and not empty, the condition is true. In that case awk performs its default action, print $0, which prints the current line.
The same filter can also be written out explicitly, first as a skip rule and then with if/else:
awk '$2 == 0 { next; } { print; }'
awk '{ if ($2 == 0) { next; } else { print; } }'
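All of these drop the rows whose 2nd field is 0. If the requirement is strictly "remove rows where the right column is 0" and that column is always numeric (as in the shown data), the explicit condition-only form is an equivalent sketch:
awk '$2 != 0' file
A pattern with no action prints the matching lines, so the three 0 rows disappear and everything else passes through unchanged.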