How to subtract a constant number from a column - awk
Is there a way to subtract the smallest value from all the values of a column? I need to subtract the first number in the 1st column from all other numbers in the first column.
I wrote this script, but it's not giving the right result:
$ awk '{$1 = $1 - 1280449530}' file
1280449530 452
1280449531 2434
1280449531 2681
1280449531 2946
1280449531 1626
1280449532 3217
1280449532 4764
1280449532 4501
1280449532 3372
1280449533 4129
1280449533 6937
1280449533 6423
1280449533 4818
1280449534 4850
1280449534 8980
1280449534 8078
1280449534 6788
1280449535 5587
1280449535 10879
1280449535 9920
1280449535 8146
1280449536 6324
1280449536 12860
1280449536 11612
What you have essentially works, you're just not outputting it. This will output what you want:
awk '{print ($1 - 1280449530) " " $2}' file
You can also be slightly cleverer and not hardcode the shift amount:
awk '{
    if(NR == 1) {
        shift = $1
    }
    print ($1 - shift) " " $2
}' file
You were on the right track:
awk '{$1 = $1 - 1280449530; print}' file
Here is a simplified version of Michael's second example:
awk 'NR == 1 {origin = $1} {$1 = $1 - origin; print}' file
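As a quick check, running this simplified version on the sample data above should produce output beginning like this (the first timestamp maps to 0):
$ awk 'NR == 1 {origin = $1} {$1 = $1 - origin; print}' file | head -6
0 452
1 2434
1 2681
1 2946
1 1626
2 3217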
A bash shell script alternative:
#!/bin/bash
exec 4<"file"                    # open the data file on fd 4
read -r col1 col2 <&4            # first line: col1 is the value to subtract
while read -r n1 n2 <&4
do
    echo $((n1 - col1))
    # echo "scale=2; $n1 - $col1" | bc   # dealing with decimals..
done
exec 4<&-                        # close fd 4
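A minimal way to try it, assuming the script is saved as subtract.sh (a placeholder name) next to the data file named file: the output is the shifted first-column value for every row after the first (this script does not print the second column).
$ chmod +x subtract.sh
$ ./subtract.sh
1
1
1
1
2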
In vim you can select the column with Ctrl+V (blockwise visual selection) and go to the bottom of the file with G, then press e to extend the selection to the end of the number. Then you may enter the number, like 56, followed by Ctrl+A: this will add 56 to the column (Ctrl+X subtracts instead).
Related
awk conditional statement based on a value between colon
I was just introduced to awk and I'm trying to retrieve rows from my file based on the value in column 10. I need to filter the data based on the third value when ":" is used as a separator in column 10 (the last column). Here is an example of the data in column 10: 0/1:1,9:10:15:337,0,15. I was able to extract the third value using this command:
awk '{print $10}' file.txt | awk -F ":" '/1/ {print $3}'
This returns the value 10, but how can I return other rows (not just the value in column 10) if this third value is less than or greater than a specific number? I tried this:
awk '{if($10 -F ":" "/1/ ($3<10))" print $0;}' file.txt
but it returns a syntax error. Thanks!
Your code:
awk '{print $10}' file.txt | awk -F ":" '/1/ {print $3}'
should be just 1 awk script:
awk '$10 ~ /1/ { split($10,f,/:/); print f[3] }' file.txt
but I'm not sure that code is doing what you think it does. If you want to print the 3rd value of all $10s that contain :s, as it sounds like from your text, that'd be:
awk 'split($10,f,/:/) > 1 { print f[3] }' file.txt
and to print the rows where that value is less than 7 would be:
awk '(split($10,f,/:/) > 1) && (f[3] < 7)' file.txt
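As a quick sanity check of what split() does with the sample field from the question, this standalone snippet (purely illustrative) prints the number of colon-separated pieces and the third one:
$ echo '0/1:1,9:10:15:337,0,15' | awk '{ n = split($1, f, /:/); print n, f[3] }'
5 10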
awk multiple row and printing results
I would like to print some specific parts of a result with awk, after a multiple pattern selection. What I have is (filetest):
A : 1
B : 2
I expect to have:
1 - B : 2
So, only the result of the first row, then the whole second row. The dash was added by me. I have this:
awk -F': ' '$1 ~ /A|B/ { printf "%s", $2 "-" }' filetest
Result:
1 -2 -
And I cannot get the full second row without failing to show just the result of the first one:
awk -F': ' '$1 ~ /A|B/ { printf "%s", $2 "$1" }' filetest
Result: 1 - A 2 - B
Is there any way to print, in the same line, exactly the column/row that I need with awk? In my case R1C2 - R2C1: R2C2? Thanks!
This will do what you are expecting: awk -F: '/^A/{printf "%s -", $2}/^B/{print}' filetest
$ awk -F: 'NR%2 {printf "%s - ", $2; next}1' filetest
1 - B : 2
You can try this:
awk -F: 'NR%2==1{a=$2; } NR%2==0{print a " - " $0}' file
output:
1 - B : 2
I'd probably go with @jas's answer as it's clear, simple, and not coupled to your data values, but just to show an alternative approach:
$ awk '{printf "%s", (NR%2 ? $3 " - " : $0 ORS)}' file
1 - B : 2
Tried on GNU awk:
awk -F':' 'NR==1{s=$2;next}{FS="";s=s" - "$0;print s}' filetest
grouping and summarizing the rows in a big text file using awk
I have a big text file like this example:
example:
chr11 314980 314981 63 IFITM1 -131
chr11 315025 315026 54 IFITM1 -86
chr5 315085 315086 118 AHRR -53011
chr16 316087 316088 56 ITFG3 -86
chr16 316088 316089 90 ITFG3 -131
chr11 319672 319673 213 IFITM3 -131
chr11 319674 319675 514 IFITM3 -164
I want to group the rows based on the 6th column and sum the values from the 4th column for every group. The new file would have 2 columns: the 1st column would be the group and the 2nd column would be the sum of the values from column 4 for that group. The expected output would look like this:
expected output:
-131 366
-86 110
-53011 118
-164 514
I am trying to do that in awk using the following code:
sort myfile.txt | awk -F'\t' '{ sub(/..$/,"**",$6) }1' OFS='\t' | awk '{print $1 "\t" $2}' > outfile.txt
but actually it returns an empty file. Do you know how to fix it?
I have no idea what you are thinking with your code: why are you replacing the last 2 chars on the line with asterisks? Why aren't you doing any addition anywhere? Why do you sort (by column 1) first?
awk -F'\t' '
    {sum[$6] += $4}
    END {for (key in sum) {print key, sum[key]}}
' file | column -t
Use an associative array: awk '{a[$NF]+=$4}END{for (i in a){print i, a[i]}}' file
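With the sample data from the question, this (like the previous answer) should print the four group sums, though the line order may differ because for (i in a) iteration order is unspecified in awk:
-131 366
-86 110
-53011 118
-164 514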
If you're ok with sorted output, you don't need arrays:
sort -k6n file | awk -F'\t' '
grp != $6 {
    grp = $6
    printf "%s%s%s%s", sum, sep, grp, FS
    sum = 0
    sep = ORS
}
{ sum += $4 }
END { print sum }'
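With the sample data, that pipeline should print the groups in ascending numeric order of column 6, each followed by its sum (tab-separated):
-53011	118
-164	514
-131	366
-86	110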
linux csv file concatenate columns into one column
I've been looking to do this with sed, awk, or cut. I am willing to use any other command-line program that I can pipe data through. I have a large set of data that is comma delimited. The rows have between 14 and 20 columns. I need to recursively concatenate column 10 with column 11 per row such that every row has exactly 14 columns. In other words, this:
a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p
will become:
a,b,c,d,e,f,g,h,i,jkl,m,n,o,p
I can get the first 10 columns. I can get the last N columns. I can concatenate columns. I cannot think of how to do it in one line so I can pass a stream of endless data through it and end up with exactly 14 columns per row. Examples (by request):
How many columns are in the row?
sed 's/[^,]//g' | wc -c
Get the first 10 columns:
cut -d, -f1-10
Get the last 4 columns:
rev | cut -d, -f1-4 | rev
Concatenate columns 10 and 11, showing columns 1-10 after that:
awk -F',' ' NF { print $1","$2","$3","$4","$5","$6","$7","$8","$9","$10$11}'
Awk solution:
awk 'BEGIN{ FS=OFS="," }
     {
       diff = NF - 14;
       for (i=1; i <= NF; i++)
           printf "%s%s", $i, (diff > 1 && i >= 10 && i < (10+diff)? "": (i == NF? ORS : ","))
     }' file
The output:
a,b,c,d,e,f,g,h,i,jkl,m,n,o,p
With GNU awk for the 3rd arg to match() and gensub():
$ cat tst.awk
BEGIN{ FS="," }
match($0,"(([^,]+,){9})(([^,]+,){"NF-14"})(.*)",a) {
    $0 = a[1] gensub(/,/,"","g",a[3]) a[5]
}
{ print }
$ awk -f tst.awk file
a,b,c,d,e,f,g,h,i,jkl,m,n,o,p
If perl is okay - it can be used just like awk for stream processing:
$ cat ip.txt
a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p
1,2,3,4,5,6,3,4,2,4,3,4,3,2,5,2,3,4
1,2,3,4,5,6,3,4,2,4,a,s,f,e,3,4,3,2,5,2,3,4
$ awk -F, '{print NF}' ip.txt
16
18
22
$ perl -F, -lane '$n = $#F - 4; print join ",", (@F[0..8], join("", @F[9..$n]), @F[$n+1..$#F])' ip.txt
a,b,c,d,e,f,g,h,i,jkl,m,n,o,p
1,2,3,4,5,6,3,4,2,43432,5,2,3,4
1,2,3,4,5,6,3,4,2,4asfe3432,5,2,3,4
-F, -lane split on , with results saved in the @F array
$n = $#F - 4 magic number, to ensure output ends with 14 columns. $#F gives the index of the last element of the array (won't work if the input line has less than 14 columns)
join helps to stitch array elements together with the specified string
@F[0..8] array slice with first 9 elements
@F[9..$n] and @F[$n+1..$#F] the other slices as needed
Borrowing from Ed Morton's regex based solution:
$ perl -F, -lape '$n=$#F-13; s/^([^,]*,){9}\K([^,]*,){$n}/$&=~tr|,||dr/e' ip.txt
a,b,c,d,e,f,g,h,i,jkl,m,n,o,p
1,2,3,4,5,6,3,4,2,43432,5,2,3,4
1,2,3,4,5,6,3,4,2,4asfe3432,5,2,3,4
$n=$#F-13 magic number
^([^,]*,){9}\K first 9 fields
([^,]*,){$n} fields to change
$&=~tr|,||dr use tr to delete the commas
e this modifier allows use of Perl code in the replacement section
this solution also has the added advantage of working even if the input line has less than 14 fields
You can try this gnu sed:
sed -E '
    s/,/\n/9g
    :A
    s/([^\n]*\n)(.*)(\n)(([^\n]*\n){4})/\1\2\4/
    tA
    s/\n/,/g
' infile
First variant - with awk:
awk -F, '
{
    for(i = 1; i <= NF; i++) {
        OFS = (i > 9 && i < NF - 4) ? "" : ","
        if(i == NF) OFS = "\n"
        printf "%s%s", $i, OFS
    }
}' input.txt
Second variant - with sed:
sed -r 's/,/#/10g; :l; s/#(.*)((#[^#]){4})/\1\2/; tl; s/#/,/g' input.txt
or, more straightforwardly (without loop) and probably faster:
sed -r 's/,(.),(.),(.),(.)$/#\1#\2#\3#\4/; s/,//10g; s/#/,/g' input.txt
Testing
Input
a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p
a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q,r
a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q,r,s,t,u
Output
a,b,c,d,e,f,g,h,i,jkl,m,n,o,p
a,b,c,d,e,f,g,h,i,jklmn,o,p,q,r
a,b,c,d,e,f,g,h,i,jklmnopq,r,s,t,u
Solved a similar problem using csvtool. Source file, copied from one of the other answers:
$ cat input.txt
a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p
1,2,3,4,5,6,3,4,2,4,3,4,3,2,5,2,3,4
1,2,3,4,5,6,3,4,2,4,a,s,f,e,3,4,3,2,5,2,3,4
Concatenating columns:
$ cat input.txt | csvtool format '%1,%2,%3,%4,%5,%6,%7,%8,%9,%10%11%12,%13,%14,%15,%16,%17,%18,%19,%20,%21,%22\n' -
a,b,c,d,e,f,g,h,i,jkl,m,n,o,p,,,,,,
1,2,3,4,5,6,3,4,2,434,3,2,5,2,3,4,,,,
1,2,3,4,5,6,3,4,2,4as,f,e,3,4,3,2,5,2,3,4
awk script for finding smallest value from column
I am a beginner in AWK, so please help me to learn it. I have a text file named snd and its values are
1 0 141
1 2 223
1 3 250
1 4 280
I want to print the entire row when the third column value is minimum.
This should do it:
awk 'NR == 1 {line = $0; min = $3} NR > 1 && $3 < min {line = $0; min = $3} END{print line}' file.txt
EDIT: What this does is:
Remember the 1st line and its 3rd field.
For the other lines, if the 3rd field is smaller than the min found so far, remember the line and its 3rd field.
At the end of the script, print the line.
Note that the test NR > 1 can be skipped, as for the 1st line, $3 < min will be false. If you know that the 3rd column is always positive (not negative), you can also skip the NR == 1 ... test as min's value at the beginning of the script is zero.
EDIT2: This is shorter:
awk 'NR == 1 || $3 < min {line = $0; min = $3}END{print line}' file.txt
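Run against the snd file from the question (swapping in that name for the file.txt placeholder), either form should print the row whose third column is smallest:
$ awk 'NR == 1 || $3 < min {line = $0; min = $3}END{print line}' snd
1 0 141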
You don't need awk to do what you want. Use sort sort -nk 3 file.txt | head -n 1 Results: 1 0 141
I think sort is an excellent answer, unless for some reason what you're looking for is the awk logic to do this in a larger script, or you want to avoid the extra pipes, or the purpose of this question is to learn more about awk.
$ awk 'NR==1{x=$3;line=$0} $3<x{line=$0} END{print line}' snd
Broken out into pieces, this is:
NR==1 {x=$3;line=$0} -- On the first line, set an initial value for comparison and store the line.
$3<x {line=$0} -- On each line, compare the third field against our stored value, and if the condition is true, store the line. (We could make this run only on NR>1, but it doesn't matter.)
END {print line} -- At the end of our input, print whatever line we've stored.
You should read man awk to learn about any parts of this that don't make sense.
A short answer for this would be:
sort -k3,3n temp | head -1
Since you have asked for awk:
awk '{if(min>$3||NR==1){min=$3;a[$3]=$0}}END{print a[min]}' your_file
But I prefer the shorter one always.
For calculating the smallest value in any column, let's say the last column:
awk '(FNR==1){a=$NF} {a=$NF < a?$NF:a} END {print a}'
This will only print the smallest value of the column. If the complete line is needed, it is better to use sort:
sort -r -n -t [delimiter] -k[column] [file name]
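For example, fed the snd file from the question, it should print 141, the smallest value in the last column:
$ awk '(FNR==1){a=$NF} {a=$NF < a?$NF:a} END {print a}' snd
141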
awk -F ";" '(NR==1){a=$NF;b=$0} {a=$NF<a?$NF:a;b=$NF>a?b:$0} END {print b}' filename this will print the line with smallest value which is encountered first.
awk 'BEGIN {OFS=FS=","}{if ( a[$1]>$2 || a[$1]=="") {a[$1]=$2;} if (b[$1]<$2) {b[$1]=$2;} } END {for (i in a) {print i,a[i],b[i]}}' input_file
We use || a[$1]=="" because when the 1st value of field 1 is encountered, a[$1] will still be null.
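To illustrate with a small made-up CSV, this should print each key followed by its minimum and maximum second field (the order of the output lines is unspecified):
$ printf 'a,3\na,1\nb,2\nb,5\n' | awk 'BEGIN {OFS=FS=","}{if ( a[$1]>$2 || a[$1]=="") {a[$1]=$2;} if (b[$1]<$2) {b[$1]=$2;} } END {for (i in a) {print i,a[i],b[i]}}'
a,1,3
b,2,5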