transpose columns and rows using gawk - awk

I am trying to transpose a really long file and I am concerned that it will not be transposed entirely.
My data looks something like this:
Thisisalongstring12345678 1 AB abc 937 4.320194
Thisisalongstring12345678 1 AB efg 549 0.767828
Thisisalongstring12345678 1 AB hi 346 -4.903441
Thisisalongstring12345678 1 AB jk 193 7.317946
I want my data to look like this:
Thisisalongstring12345678 Thisisalongstring12345678 Thisisalongstring12345678 Thisisalongstring12345678
1 1 1 1
AB AB AB AB
abc efg hi jk
937 549 346 193
4.320194 0.767828 -4.903441 7.317946
Would the length of the first string prove to be an issue? My file is much longer than this, approximately 2000 lines. Also, is it possible to change the name of the first string to Thisis234 and then transpose?

I don't see why it wouldn't be - unless you don't have enough memory. Try the below and see if you run into problems.
Input:
$ cat inf.txt
a b c d
1 2 3 4
. , + -
A B C D
Awk program:
$ cat mkt.sh
awk '
{
for(c = 1; c <= NF; c++) {
a[c, NR] = $c
}
if(max_nf < NF) {
max_nf = NF
}
}
END {
for(r = 1; r <= NR; r++) {
for(c = 1; c <= max_nf; c++) {
printf("%s ", a[r, c])
}
print ""
}
}
' inf.txt
Run:
$ ./mkt.sh
a 1 . A
b 2 , B
c 3 + C
d 4 - D
Credits:
http://www.chemie.fu-berlin.de/chemnet/use/info/gawk/gawk_12.html#SEC121
Hope this helps.
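As for the second part of the question (renaming the first string to Thisis234 before transposing): a simple pre-pass can rewrite field 1 before the transpose. A minimal sketch, using the corrected loop bounds from the follow-up answer below; the name `Thisis234` comes from the question:

```shell
# Rewrite field 1 to "Thisis234", then transpose.
printf '%s\n' \
  'Thisisalongstring12345678 1 AB abc 937 4.320194' \
  'Thisisalongstring12345678 1 AB efg 549 0.767828' |
awk '{ $1 = "Thisis234" } 1' |
awk '{ for (c = 1; c <= NF; c++) a[c, NR] = $c
       if (max_nf < NF) max_nf = NF }
     END { for (r = 1; r <= max_nf; r++) {
             for (c = 1; c <= NR; c++) printf("%s ", a[r, c])
             print "" } }'
# the first output line is the renamed IDs side by side
```

The rename pass rebuilds each record with the new first field; the transpose pass is unchanged, so the long string's length never matters beyond memory.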

This can be done with the rs BSD command:
http://www.unix.com/man-page/freebsd/1/rs/
Check out the -T option.

I tried icyrock.com's answer, but found that I had to change:
for(r = 1; r <= NR; r++) {
for(c = 1; c <= max_nf; c++) {
to
for(r = 1; r <= max_nf; r++) {
for(c = 1; c <= NR; c++) {
to get the NR columns and max_nf rows. So icyrock's code becomes:
$ cat mkt.sh
awk '
{
for(c = 1; c <= NF; c++) {
a[c, NR] = $c
}
if(max_nf < NF) {
max_nf = NF
}
}
END {
for(r = 1; r <= max_nf; r++) {
for(c = 1; c <= NR; c++) {
printf("%s ", a[r, c])
}
print ""
}
}
' inf.txt
If you don't do that and use an asymmetrical input, like:
a b c d
1 2 3 4
. , + -
You get:
a 1 .
b 2 ,
c 3 +
i.e. still 3 rows and 4 columns (the last of which is blank).

Regarding ScubaFishi's and icyrock's code:
The "if (max_nf < NF)" check seems unnecessary. I deleted it, and the code works just fine.

Related

Using awk to count the number of rows in a group

I have a data set: (file.txt)
X Y
1 a
2 b
3 c
10 d
11 e
12 f
15 g
20 h
25 i
30 j
35 k
40 l
41 m
42 n
43 o
46 p
I want to add two columns, Up10 and Down10:
Up10: the count of rows with X values between (X-10) and (X).
Down10: the count of rows with X values between (X) and (X+10).
For example:
X Y Up10 Down10
35 k 3 5
For Up10: 35-10 = 25, and X=35, X=30, X=25 fall in that range, so Total = 3 rows.
For Down10: 35+10 = 45, and X=35, X=40, X=41, X=42, X=43 fall in that range, so Total = 5 rows.
Desired Output:
X Y Up10 Down10
1 a 1 5
2 b 2 5
3 c 3 4
10 d 4 5
11 e 5 4
12 f 5 3
15 g 4 3
20 h 5 3
25 i 3 3
30 j 3 3
35 k 3 5
40 l 3 5
41 m 3 4
42 n 4 3
43 o 5 2
46 p 5 1
This is Pierre François' solution (thanks again, @Pierre François):
awk '
BEGIN{OFS="\t"; print "X\tY\tUp10\tDown10"}
(NR == FNR) && (FNR > 1){a[$1] = $1 + 0}
(NR > FNR) && (FNR > 1){
up = 0; upl = $1 - 10
down = 0; downl = $1 + 10
for (i in a) { i += 0 # tricky: convert i to integer
if ((i >= upl) && (i <= $1)) {up++}
if ((i >= $1) && (i <= downl)) {down++}
}
print $1, $2, up, down;
}
' file.txt file.txt > file-2.txt
But when I use this command on 13 GB of data, it takes too long.
I have also tried this approach on the 13 GB data:
awk 'BEGIN{ FS=OFS="\t" }
NR==FNR{a[NR]=$1;next} {x=y=FNR;while(--x in a&&$1-10<a[x]){} while(++y in a&&$1+10>a[y]){} print $0,FNR-x,y-FNR}
' file.txt file.txt > file-2.txt
When file-2.txt reaches 1.1 GB it freezes. I have waited several hours, but the command never finishes and the final output file never appears.
Note: I am working on Google Cloud, machine type
e2-highmem-8 (8 vCPUs, 64 GB memory)
A single-pass awk that keeps a sliding window of the 10 last records and uses it to count the ups and downs. For symmetry's sake there should be deletes in the END block, but I guess a few extra array elements in memory aren't going to make a difference:
$ awk '
BEGIN {
FS=OFS="\t"
}
NR==1 {
print $1,$2,"Up10","Down10"
}
NR>1 {
a[NR]=$1
b[NR]=$2
for(i=NR-9;i<=NR;i++) {
if(a[i]>=a[NR]-10&&i>=2)
up[NR]++
if(a[i]<=a[NR-9]+10&&i>=2)
down[NR-9]++
}
}
NR>10 {
print a[NR-9],b[NR-9],up[NR-9],down[NR-9]
delete a[NR-9]
delete b[NR-9]
delete up[NR-9]
delete down[NR-9]
}
END {
for(nr=NR+1;nr<=NR+9;nr++) {
for(i=nr-9;i<=nr;i++)
if(a[i]<=a[nr-9]+10&&i>=2&&i<=NR)
down[nr-9]++
print a[nr-9],b[nr-9],up[nr-9],down[nr-9]
}
}' file
Output:
X Y Up10 Down10
1 a 1 5
2 b 2 5
...
35 k 3 5
...
43 o 5 2
46 p 5 1
Another single-pass approach with a sliding window:
awk '
NR == 1 { next } # skip the header
NR == 2 { min = max = cur = 1; X[cur] = $1; Y[cur] = $2; next }
{ X[++max] = $1; Y[max] = $2
if (X[cur] >= $1 - 10) next
for (; X[cur] + 10 < X[max]; ++cur) {
for (; X[min] < X[cur] - 10; ++min) {
delete X[min]
delete Y[min]
}
print X[cur], Y[cur], cur - min + 1, max - cur
}
}
END {
for (; cur <= max; ++cur) {
for (; X[min] < X[cur] - 10; ++min);
for (i = max; i > cur && X[cur] + 10 < X[i]; --i);
print X[cur], Y[cur], cur - min + 1, i - cur + 1
}
}
' file
The script assumes the X column is ordered numerically.
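If that ordering is in doubt, a cheap pre-check can confirm it before running either sliding-window script; a minimal sketch (exits non-zero at the first X value that decreases, skipping the header):

```shell
awk 'NR > 1 {
       if (seen && $1 + 0 < prev) { print "unsorted at line " NR; exit 1 }
       prev = $1 + 0; seen = 1
     }' file.txt
```

This is a single extra pass, which is still far cheaper than the quadratic scan in the original two-file script.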

AWK Convert Decimal to Binary

I want to use AWK to convert a list of decimal numbers in a file to binary but there seems to be no built-in method. Sample file is as below:
134218506
134218250
134217984
1610612736
16384
33554432
Here is an awk way, functionized for your pleasure:
awk '
function d2b(d, b) {
while(d) {
b=d%2b
d=int(d/2)
}
return(b)
}
{
print d2b($0)
}' file
Output of the first three records:
1000000000000000001100001010
1000000000000000001000001010
1000000000000000000100000000
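One edge case worth noting: for an input line of `0`, the `while(d)` loop never executes, so `d2b()` returns an empty string. A guarded variant (sketch):

```shell
printf '0\n5\n' |
awk '
function d2b(d,   b) {
  if (d == 0) return "0"   # guard: the loop below never runs for 0
  while (d) {
    b = d % 2 b            # prepend the low bit
    d = int(d / 2)
  }
  return b
}
{ print d2b($0) }'
# prints:
# 0
# 101
```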
You can try a Perl one-liner:
$ cat hamdani.txt
134218506
134218250
134217984
134217984
1610612736
16384
33554432
$ perl -nle ' printf("%b\n",$_) ' hamdani.txt
1000000000000000001100001010
1000000000000000001000001010
1000000000000000000100000000
1000000000000000000100000000
1100000000000000000000000000000
100000000000000
10000000000000000000000000
$
You can try with dc :
# -f infile : Use infile for data
# after -e comes the dc program
dc -f infile -e '
z # number of values
sa # keep in register a
2
o # set the output radix to 2 : binary
[
Sb # push each value from infile onto register b
# ( b is used here as a stack)
z
0 <M # until there are no more values
] sM # define macro M in [ and ]
lMx # execute macro M to populate stack b
[
Lb # get all values one at a time from stack b
p # print this value in binary
la # get the count of values
1
- # decrement it
d # duplicate
sa # keep one in register a
0<N # the other copy is used here
]sN # define macro N
lNx' # execute macro N to print each value in binary
Here's an approach that works by first converting the decimal to hex and then converting each hex character to its binary equivalent:
$ cat dec2bin.awk
BEGIN {
h2b["0"] = "0000"; h2b["8"] = "1000"
h2b["1"] = "0001"; h2b["9"] = "1001"
h2b["2"] = "0010"; h2b["a"] = "1010"
h2b["3"] = "0011"; h2b["b"] = "1011"
h2b["4"] = "0100"; h2b["c"] = "1100"
h2b["5"] = "0101"; h2b["d"] = "1101"
h2b["6"] = "0110"; h2b["e"] = "1110"
h2b["7"] = "0111"; h2b["f"] = "1111"
}
{ print dec2bin($0) }
function hex2bin(hex, n,i,bin) {
n = length(hex)
for (i=1; i<=n; i++) {
bin = bin h2b[substr(hex,i,1)]
}
sub(/^0+/,"",bin)
return bin
}
function dec2bin(dec, hex, bin) {
hex = sprintf("%x", dec)
bin = hex2bin(hex)
return bin
}
$ awk -f dec2bin.awk file
1000000000000000001100001010
1000000000000000001000001010
1000000000000000000100000000
1100000000000000000000000000000
100000000000000
10000000000000000000000000
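`0` is an edge case here too: `sprintf("%x", 0)` yields `"0"`, which maps to `"0000"`, and `sub(/^0+/,"",bin)` then strips the whole string, so the function returns empty. A guarded variant of the same idea (sketch):

```shell
printf '0\n134218506\n' |
awk '
BEGIN {
  h2b["0"] = "0000"; h2b["1"] = "0001"; h2b["2"] = "0010"; h2b["3"] = "0011"
  h2b["4"] = "0100"; h2b["5"] = "0101"; h2b["6"] = "0110"; h2b["7"] = "0111"
  h2b["8"] = "1000"; h2b["9"] = "1001"; h2b["a"] = "1010"; h2b["b"] = "1011"
  h2b["c"] = "1100"; h2b["d"] = "1101"; h2b["e"] = "1110"; h2b["f"] = "1111"
}
function dec2bin(dec,   hex, bin, i) {
  hex = sprintf("%x", dec)
  for (i = 1; i <= length(hex); i++)
    bin = bin h2b[substr(hex, i, 1)]
  sub(/^0+/, "", bin)
  return (bin == "") ? "0" : bin   # all zeros stripped => the value was 0
}
{ print dec2bin($0) }'
```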
# gawk binary number functions
# RPC 09OCT2022
# convert an 8 bit binary number to an integer
function bin_to_n(i)
{
n = 0;
#printf(">> %s:", i);
for (k = 1; k < 9; k++) {
n = n * 2;
b = substr(i, k, 1);
if (b == "1") {
n = n + 1;
}
}
return (n);
}
# convert a number to a binary number
function dectobin(n)
{
printf("dectobin: n in %d ",n);
binstring = "0b"; # some c compilers allow 0bXXXXXXXX format numbers
bn = 128;
for(k=0;k<8;k++) {
if (n >= bn) {
binstring = binstring "1";
n = n - bn;
} else {
binstring = binstring "0"
}
printf(" bn %d",bn);
bn = bn / 2;
}
return binstring;
}
BEGIN {
FS = " ";
# gawk (I think) has no atoi() function or equivalent, so a table of
# chars (well, 255 of them) can be used with the index function to get
# round this
for (i = 0; i < 255; i++) {
table = sprintf("%s%c", table, i);
}
}
{
# assume on stdin a buffer of 8 bit binary numbers "01000001 01000010" is AB etc
for (i = 1; i <= NF; i++)
printf("bin-num#%d: %x --> %c\n", i, bin_to_n($i), bin_to_n($i));
s = "ABC123string to test";
for (i = 0; i < length(s); i++) {
nn = index(table, substr(s,i+1,1))-1;
printf("substr :%s:%x:",ss,nn);
printf(" :%d: %s\n", i, dectobin(nn));
}
}
On top of what others have already mentioned, this function has a rapid shortcut for non-negative integer powers of 2 (since they always have a binary pattern of /^[1][0]*$/).
version 1 : processing in 3-bit chunks instead of bit-by-bit :
{m,g}awk '
BEGIN {
1 CONVFMT="%.250g"
1 _^=OFMT="%.25g"
}
($++NF=________v1($_))^!_
function ________v1(__,___,_,____,_____)
{
6 if (+__==(_+=_^=____="")^(___=log(__)/log(_))) { # 2
2 return \
___<=_^_^_ \
? (_+_*_*_)^___ \
: sprintf("%.f%0*.f",--_,___,--_)
}
4 ___=(!_!_!_!!_) (_^((_____=_*_*_)+_)-_^_^_+(++_))
4 gsub("..", "&0&1", ___)
41 while(__) {
41 ____ = substr(___,
__%_____*_+(__=int(__/_____))^!_,_)____
}
4 return substr(__=____, index(__, _^(! _)))
}'
version 2 : first use sprintf() to convert to octals, before mapping to binary
function ________v2(__,___,_,____,_____)
{
6 if (+__==(_+=_^=____="")^(___=log(__)/log(_))) { # 2
2 return \
___<=_^_^_ \
? (_+_*_*_)^___ \
: sprintf("%.f%0*.f",--_,___,--_)
}
4 ___=(!_!_!_!!_) (_^((_____=_*_*_)+_)-_^_^_+(++_))
4 gsub("..", "&0&1", ___)
4 _____=___
4 __=sprintf("%o%.*o", int(__/(___=++_^(_*--_+_))),
_*_+!!_, __%___)
4 sub("^[0]+", "", __)
41 for (___=length(__); ___; ___--) {
41 ____ = substr(_____, substr(__,
___,!!_)*_ + !!_,_)____
}
4 return substr(____, index(____,!!_))
}
134218506 1000000000000000001100001010
134218250 1000000000000000001000001010
134217984 1000000000000000000100000000
1610612736 1100000000000000000000000000000
16384 100000000000000
33554432 10000000000000000000000000
version 3 : a reasonably zippy version (29.5 MB/s throughput on mawk2) that uses a caching array and processes 8 bits each round;
outputs are zero-padded to a minimum of 8 binary digits
{m,g,n}awk '
1 function ________(_______,_, __,____,______)
{
1 split(_=__=____=______="", _______, _)
2 for (_^=_<_; -_<=+_; _--) {
4 for (__^=_<_; -__<=+__; __--) {
8 for (____^=_<_; -____<=+____; ____--) {
16 for (______^=_<_; -______<=+______; ______--) {
16 _______[_+_+_+_+_+_+_+_+__+__+\
__+__+____+____+______]=\
(_)__ (____)______
}
}
}
}
1 return _^(_<_)
}
BEGIN {
1 CONVFMT = "%." ((_+=(_^=_<_)+(_+=_))*_)(!_)"g"
1 OFMT = "%." (_*_) "g"
1 _ = ________(_____)
}
($++NF=___($_))^!_
function ___(__,____,_,______)
{
6 if ((__=int(__))<(______=\
(_*=_+=_+=_^=____="")*_)) {
return _____[int(__/_)]_____[__%_]
}
16 do { ____=_____[int(__/_)%_]_____[__%_]____
} while (______<=(__=int(__/______)))
6 return int(_____[int(__/_)%_]\
_____[ (__) %_])____
}
You should not use awk for this but bc:
$ bc <<EOF
ibase=10
obase=2
$(cat file)
EOF
or
bc <<< $(awk 'BEGIN{ print "ibase=10; obase=2"}1' file)

summarizing a text file in awk

I have sequences of characters that I would like to split into 3-character classes from beginning to end, and then get the count of each class. Here is a small example of the sequences for 2 IDs:
>ID1
ATGTCCAAGGGGATCCTGCAGGTGCATCCTCCGATCTGCGACTGCCCGGGCTGCCGAATA
TCCTCCCCGGTGAACCGGGGGCGGCTGGCAGACAAGAGGACAGTCGCCCTGCCTGCCGCC
>ID2
ATGAAACTTTCACCTGCGCTCCCGGGAACAGTTTCTGCTCGGACTCCTGATCGTTCACCT
CCCTGTTTTCCCGACAGCGAGGACTGTCTTTTCCAACCCGACATGGATGTGCTCCCAATG
ACCTGCCCGCCACCACCAGTTCCAAAGTTTGCACTCCTTAAGGATTATAGGCCTTCAGCT
And here is a small example of the output for ID1. I want the same output for every ID in the input file (the lines of characters belonging to each ID are on the lines after its header); the counts for the next ID come just after the first, and so on.
ID1_3nt count
ATG 1
TCC 3
AAG 2
GGG 2
ATC 2
CTG 3
CAG 1
GTG 2
CAT 1
CCT 2
CCG 3
TGC 3
GAC 2
GGC 1
CGA 1
ATA 1
AAC 1
CGG 2
GCA 1
AGG 1
GCC 3
ACA 1
GTC 1
I tried this code:
awk '{i=0; printf ">%s\n",$2; while(i<=length($1)) {printf "%s\n", substr($1,i,3);i+=3}} /,substr,/ {count++}' | awk 'END { printf(" ID_3nt: %d",count)}
but it did not return what I want. Do you know how to improve it?
How about this patsplit()-based implementation? (Note that patsplit() is a gawk extension.)
#! /usr/bin/awk -f
# initialize publicly scoped vars...
function init() {
split("", idx) # index of our class (for ordering)
split("", cls) # our class name
split("", cnt) # num of classes we have seen
sz = 0 # number of classes for this ID
}
# process a class record
function proc( i, n, x) {
# split on each 3 characters
n = patsplit($0, a, /.../)
for (i=1; i<=n; ++i) {
x = a[i]
if (x in idx) {
# if this cls exists, just increment the count
++cnt[idx[x]]
} else {
# if this cls doesn't exist, index it in
cls[sz] = x
cnt[sz] = 1
idx[x] = sz++
}
}
}
# spit out class summary
function flush( i) {
if(!sz)
return
for(i=0; i<sz; ++i)
print cls[i], cnt[i]
init()
}
BEGIN {
init()
}
/^>ID/ {
flush()
sub(/^>/, "")
print $0 "_3nt count"
next
}
{
# we could have just inlined proc(), but using a function
# provides us with locally scoped variables
proc()
}
END {
flush()
}
$ cat tst.awk
sub(/^>/,"") { if (NR>1) prt(); name=$0; next }
{ rec = rec $0 }
END { prt() }
function prt( cnt, class) {
while ( rec != "" ) {
cnt[substr(rec,1,3)]++
rec = substr(rec,4)
}
print name "_3nt count"
for (class in cnt) {
print class, cnt[class]
}
}
$ awk -f tst.awk file
ID1_3nt count
ACA 1
AAC 1
CGA 1
CAT 1
GTG 2
CAG 1
GGG 2
CCG 3
CCT 2
GCA 1
ATA 1
GAC 2
AAG 2
GCC 3
ATC 2
TCC 3
CGG 2
CTG 3
GTC 1
AGG 1
GGC 1
TGC 3
ATG 1
ID2_3nt count
AAA 1
CCC 3
ACA 1
GTG 1
TTT 2
TGT 2
GTT 2
ACC 1
CCG 2
CTC 3
CCT 4
GCA 1
AAG 2
GAC 3
TCA 3
AGC 1
ACT 1
CGT 1
CGG 1
CTT 3
TAT 1
CAA 1
GAG 1
GAT 3
GGA 1
AGG 1
TGC 1
CCA 5
TTC 1
GCT 2
TCT 1
GCG 1
ATG 3
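A side note on the second script: `for (class in cnt)` visits keys in an unspecified, implementation-dependent order, which is why the counts above are not in first-seen or alphabetical order. A portable way to pin the order is to sort the keys before printing; a sketch with a small insertion sort (plain awk, no gawk extensions):

```shell
awk '
sub(/^>/, "") { if (NR > 1) prt(); name = $0; next }
{ rec = rec $0 }
END { prt() }
function prt(   cnt, class, keys, n, i, j, t) {
  while (rec != "") {
    cnt[substr(rec, 1, 3)]++
    rec = substr(rec, 4)
  }
  print name "_3nt count"
  n = 0
  for (class in cnt) keys[++n] = class
  for (i = 2; i <= n; i++) {       # insertion sort, alphabetical
    t = keys[i]
    for (j = i - 1; j >= 1 && keys[j] > t; j--) keys[j + 1] = keys[j]
    keys[j + 1] = t
  }
  for (i = 1; i <= n; i++) print keys[i], cnt[keys[i]]
}' file
```

With gawk specifically, setting `PROCINFO["sorted_in"]` would achieve the same effect without the hand-rolled sort.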

Taking similar consecutive rows and appending them into one longer row with AWK

I've got a big 7-column text file with sorted rows like this:
gi|352964122|gb|JH286168.1| 00884 C C 14 1.00 u
gi|352964122|gb|JH286168.1| 00884 C C 26 0.76 p
gi|352964122|gb|JH286168.1| 00884 C C 33 0.89 f
gi|352964122|gb|JH286168.1| 00885 G G 14 1.00 u
gi|352964122|gb|JH286168.1| 00885 A A 30 0.84 f
gi|352964122|gb|JH286168.1| 00886 T T 31 0.81 f
What I need to do is: if the first two columns are the same in consecutive rows, append the remaining columns to the first such row. There can be 1, 2, or 3 "similar" rows, and I need placeholders to keep the columns intact if there are fewer than 3. So the above would look like this:
gi|352964122|gb|JH286168.1| 00884 C C 14 1.00 u C C 26 0.76 p C C 33 0.89 f
gi|352964122|gb|JH286168.1| 00885 G G 14 1.00 u - - - ------------ G G 33 0.89 f
gi|352964122|gb|JH286168.1| 00886 T T 31 0.81 f - - - ---- - - - ------ - - -- ----- - -
I've tried many approaches with AWK but can't quite get it. How might this be done?
I'm unsure about how you get your second row but this might match at least how I understand the goal:
awk '
{
head=$1 " " $2
tail=$3 " " $4 " " $5 " " $6 " "$7
if(previous!=head) {
if(previous!="") printf("%s %s %s %s\n",previous,p[1],p[2],p[3])
previous=head
i=1
p[i]=tail
p[2]=p[3]="- - - -"
} else {
i=i+1
p[i]=tail
}
}
END { printf("%s %s %s %s\n",previous,p[1],p[2],p[3]) }'
Output:
gi|352964122|gb|JH286168.1| 00884 C C 14 1.00 u C C 26 0.76 p C C 33 0.89 f
gi|352964122|gb|JH286168.1| 00885 G G 14 1.00 u A A 30 0.84 f - - - -
gi|352964122|gb|JH286168.1| 00886 T T 31 0.81 f - - - - - - - -
This should do it:
(Edit: I didn't notice you needed placeholders. I'll look into it....)
awk '
$1 == last1 && $2 == last2 {
printf " %s %s %s %s %s",$3,$4,$5,$6,$7;
last1 = $1; last2 = $2;
next;
}
{
$1 = $1; # normalize spacing
printf "%s%s", NR==1?"":"\n", $0;
last1 = $1; last2 = $2;
}
END { print ""; }
' file
$ cat tst.awk
BEGIN { maxRecs = 3 }
function prta( i, dflt) {
dflt = a[1]
gsub(/[^[:space:]]+/,"-",dflt)
printf "%s ", prev
for (i=1; i<=maxRecs; i++) {
printf "%s%s", (i in a ? a[i] : dflt), (i<maxRecs ? OFS : ORS)
delete a[i]
}
numRecs = 0
}
{ key = $1 FS $2 }
prev && (key != prev) { prta() }
{
$1 = $1
sub(/([^[:space:]]+[[:space:]]+){2}/,"")
a[++numRecs] = $0
prev = key
}
END { prta() }
$
$ awk -f tst.awk file
gi|352964122|gb|JH286168.1| 00884 C C 14 1.00 u C C 26 0.76 p C C 33 0.89 f
gi|352964122|gb|JH286168.1| 00885 G G 14 1.00 u A A 30 0.84 f - - - - -
gi|352964122|gb|JH286168.1| 00886 T T 31 0.81 f - - - - - - - - - -

compare file and print class

I have
file1:
id position
a1 21
a1 39
a1 77
b1 88
b1 122
c1 22
file 2
id class position1 position2
a1 Xfact 1 40
a1 Xred 41 66
a1 xbreak 69 89
b1 Xbreak 77 133
b1 Xred 140 199
c1 Xfact 1 15
c1 Xbreak 19 35
I want something like this
output:
id position class
a1 21 Xfact
a1 39 Xfact
a1 77 Xbreak
b1 88 Xbreak
b1 122 Xbreak
c1 22 Xbreak
I need a simple awk script which prints id and position from file1, then compares that position against the positions in file2: if the position from file1 lies in the range between position1 and position2 in file2, print the corresponding class.
One way using awk. It's not a simple script. The process in short: the key point is the variable 'all_ranges'. While it is unset, the script reads from the file of ranges, saving its data; once it is set, that stops and the script reads from the 'id-position' file, checks each position against the saved ranges, and prints the class if one matches. I've tried to avoid processing the file of ranges several times by doing it in chunks per id, which made the script more complex.
EDIT to add that I assume the id field in both files is sorted. Otherwise this script will fail miserably and you will need another approach.
Content of script.awk:
BEGIN {
## Arguments:
## ARGV[0] = awk
## ARGV[1] = <first_input_argument>
## ARGV[2] = <second_input_argument>
## ARGC = 3
f2 = ARGV[ --ARGC ];
all_ranges = 0
## Read first line from file with ranges to get 'class' header.
getline line <f2
split( line, fields )
class_header = fields[2];
}
## Special case for the header.
FNR == 1 {
printf "%s\t%s\n", $0, class_header;
next;
}
## Data.
FNR > 1 {
while ( 1 ) {
if ( ! all_ranges ) {
## Read line from file with range positions.
ret = getline line <f2
## Check error.
if ( ret == -1 ) {
printf "%s\n", "ERROR: " ERRNO
close( f2 );
exit 1;
}
## Check end of file.
if ( ret == 0 ) {
break;
}
## Split line in spaces.
num = split( line, fields )
if ( num != 4 ) {
printf "%s\n", "ERROR: Bad format of file " f2;
exit 2;
}
range_id = fields[1];
if ( $1 == fields[1] ) {
ranges[ fields[3], fields[4] ] = fields[2];
continue;
}
else {
all_ranges = 1
}
}
if ( range_id == $1 ) {
delete ranges;
ranges[ fields[3], fields[4] ] = fields[2];
all_ranges = 0;
continue;
}
for ( range in ranges ) {
split( range, pos, SUBSEP )
if ( $2 >= pos[1] && $2 <= pos[2] ) {
printf "%s\t%s\n", $0, ranges[ range ];
break;
}
}
break;
}
}
END {
for ( range in ranges ) {
split( range, pos, SUBSEP )
if ( $2 >= pos[1] && $2 <= pos[2] ) {
printf "%s\t%s\n", $0, ranges[ range ];
break;
}
}
}
Run it like:
awk -f script.awk file1 file2 | column -t
With following result:
id position class
a1 21 Xfact
a1 39 Xfact
a1 77 xbreak
b1 88 Xbreak
b1 122 Xbreak
c1 22 Xbreak
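For comparison, if file2 fits in memory, a much shorter two-pass version avoids the interleaved-getline bookkeeping entirely: load every range grouped by id, then scan each file1 row against its id's ranges. A sketch (simpler, but it holds all ranges in memory and prints nothing for positions that match no range):

```shell
awk '
NR == FNR {                                    # first file: the ranges (file2)
  if (FNR == 1) { class_header = $2; next }
  n[$1]++
  lo[$1, n[$1]] = $3 + 0
  hi[$1, n[$1]] = $4 + 0
  cls[$1, n[$1]] = $2
  next
}
FNR == 1 { print $0 "\t" class_header; next }  # header line of file1
{
  for (i = 1; i <= n[$1]; i++)
    if ($2 >= lo[$1, i] && $2 <= hi[$1, i]) {
      print $1 "\t" $2 "\t" cls[$1, i]
      break
    }
}' file2 file1 | column -t
```

Note the ranges file comes first on the command line so the `NR == FNR` pass sees it; unlike the streaming script above, this version does not require either file to be sorted.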