awk: how to delete first and last value on entire column - awk

I have data comprising several columns. In one column I would like to delete the two commas located at the beginning and the end of each value. My data looks something like this:
a ,3,4,3,2,
b ,3,4,5,1,
c ,1,5,2,4,5,
d ,3,6,24,62,3,54,
Can someone teach me how to delete the first and last commas in this data? I would appreciate it.

$ awk '{gsub(/^,|,$/,"",$NF)}1' file
a 3,4,3,2
b 3,4,5,1
c 1,5,2,4,5
d 3,6,24,62,3,54
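The one-liner above can be run as a self-contained check (sample data fed in via a here-document; the trailing `1` is awk shorthand for "print the line"):

```shell
# Strip a leading and a trailing comma from the last field ($NF) only,
# leaving the inner commas intact; modifying $NF rebuilds $0, and the
# trailing `1` prints each rebuilt line.
awk '{gsub(/^,|,$/, "", $NF)}1' <<'EOF'
a ,3,4,3,2,
b ,3,4,5,1,
EOF
```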

awk '{sub(/,/,""); print substr($0,1,length($0)-1)}' input.txt
Here sub() deletes the first comma and substr() drops the last character. Note that awk strings are 1-indexed, so the substring must start at 1; a start position of 0 is not portable across awk implementations.
Output:
a 3,4,3,2
b 3,4,5,1
c 1,5,2,4,5
d 3,6,24,62,3,54

You can do it with sed too:
sed -e 's/,//' -e 's/,$//' file
That says "substitute the first comma on the line with nothing" and then "substitute a comma followed by end of line with nothing".
If you want it to write a new file, do this:
sed -e 's/,//' -e 's/,$//' file > newfile.txt
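If your sed supports it (GNU sed does; BSD sed needs a backup suffix after -i), you can also edit the file in place instead of redirecting to a new file. A minimal sketch, assuming GNU sed, with the sample file created first so the snippet is self-contained:

```shell
# Create a sample file, then edit it in place with GNU sed's -i:
# the first expression drops the first comma on each line, the
# second drops a trailing comma.
printf 'a ,3,4,3,2,\nb ,3,4,5,1,\n' > file
sed -i -e 's/,//' -e 's/,$//' file
cat file
```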

Related

Count b or B in even lines

I need to count the number of times the letter 'b' or 'B' appears in the even lines of file.txt, e.g. for a file.txt like:
everyB or gbnBra
uitiakB and kanapB bodddB
Kanbalis astroBominus
I got the first part, but I do not know how to count the b and B letters together:
awk '!(NR%2)' file.txt
$ awk '!(NR%2){print gsub(/[bB]/,"")}' file
4
Could you please try the following, one more approach with awk using a field separator:
awk -F'[bB]' 'NR%2 == 0{print (NF ? NF - 1 : 0)}' Input_file
Thanks to @Ed sir for solving the zero-matches-found line problem in the comments.
In a single awk:
awk '!(NR%2){gsub(/[^Bb]/,"");print length}' file.txt
gsub(/[^Bb]/,"") deletes every character in the line except for B and b.
print length prints the length of the resulting string.
awk '!(NR%2)' file.txt | tr -cd 'Bb' | wc -c
Explanation:
awk '!(NR%2)' file.txt : keep only even lines from file.txt
tr -cd 'Bb' : keep only B and b characters
wc -c : count characters
Example:
With the file below, the result is 4.
everyB or gbnBra
uitiakB and kanapB bodddB
Kanbalis astroBominus
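As a self-contained check of the pipeline (sample lines fed via a here-document):

```shell
# Keep even lines only -> delete everything except B/b -> count what is left.
# tr -cd also deletes the newline, so wc -c counts only the letters.
awk '!(NR%2)' <<'EOF' | tr -cd 'Bb' | wc -c
everyB or gbnBra
uitiakB and kanapB bodddB
Kanbalis astroBominus
EOF
```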
Here is another way (GNU sed for the 2~2 step address; tr -d '\n' removes the newline so it is not counted by wc -c):
$ sed -n '2~2s/[^bB]//gp' file | tr -d '\n' | wc -c

Pipe Command Output into Awk

I want to pipe the output of a command into awk. I want to add that number to every row of a new column in an existing .txt file. The new column should be at the end, and won't necessarily be column 2.
$ command1
4512438
$ cat input.txt
A
B
C
D
$ cat desired_output.txt
A 4512438
B 4512438
C 4512438
D 4512438
I think I need to do something along the lines of the following. I'm not sure how to designate that the pipe goes into the new column - this awk command will simply add integers to the column.
$ command1 | awk -F, '{$(NF+1)=++i;}1' OFS=, input.txt > desired_output.txt
It doesn't seem like you really want to pipe the value to awk. Instead, you want to pass it as a parameter. You could read it from the pipe with something like:
cmd1 | awk 'NR==FNR{a=$0} NR!=FNR{print $0,a}' - input.txt
but it seems much more natural to do:
awk '{print $0,a}' a="$(cmd1)" input.txt
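A quick self-contained check of the variable-passing form, with `echo` standing in for the real command (`cmd1` is a placeholder for whatever produces the number):

```shell
# Pass the command's output to awk as the variable `a`, then append it
# to every input line. `echo 4512438` stands in for the real command;
# the a=... assignment is processed before stdin is read.
awk '{print $0, a}' a="$(echo 4512438)" <<'EOF'
A
B
EOF
```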

awk: identify column by condition, change value, and finally print all columns

I want to extract the value in each row of a file that comes after AA. I can do this like so:
awk -F'[;=|]' '{for(i=1;i<=NF;i++)if($i=="AA"){print toupper($(i+1));next}}'
This gives me the exact information I need and converts to uppercase, which is exactly what I want to do. How can I do this and then print the entire row with this altered value in its previous position? I am essentially trying to do a find and replace where the value is changed to uppercase.
EDIT:
Here is a sample input line:
11 128196 rs576393503 A G 100 PASS AC=453;AF=0.0904553;AN=5008;NS=2504;DP=5057;EAS_AF=0.0159;AMR_AF=0.0259;AFR_AF=0.3071;EUR_AF=0.006;SAS_AF=0.0072;AA=g|||;VT=SNP
and here is a how I would like the output to look:
11 128196 rs576393503 A G 100 PASS AC=453;AF=0.0904553;AN=5008;NS=2504;DP=5057;EAS_AF=0.0159;AMR_AF=0.0259;AFR_AF=0.3071;EUR_AF=0.006;SAS_AF=0.0072;AA=G|||;VT=SNP
All that has changed is the g after AA= is changed to uppercase.
The following awk may help:
awk '
{
match($0,/AA=[^|]*/);
print substr($0,1,RSTART+2) toupper(substr($0,RSTART+3,RLENGTH-3)) substr($0,RSTART+RLENGTH)
}
' Input_file
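Running that answer against the sample value as a self-contained snippet (input supplied inline via echo rather than Input_file):

```shell
# Find the AA=... value (everything up to the first `|`), uppercase just
# that value, and reassemble the line around it using RSTART/RLENGTH.
echo 'SAS_AF=0.0072;AA=g|||;VT=SNP' | awk '{
  match($0, /AA=[^|]*/)
  print substr($0, 1, RSTART+2) toupper(substr($0, RSTART+3, RLENGTH-3)) substr($0, RSTART+RLENGTH)
}'
```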
With GNU sed and perl, using word boundaries
$ echo 'SAS_AF=0.0072;AA=g|||;VT=SNP' | sed 's/\bAA=[^;=|]*\b/\U&/'
SAS_AF=0.0072;AA=G|||;VT=SNP
$ echo 'SAS_AF=0.0072;AA=g|||;VT=SNP' | perl -pe 's/\bAA=[^;=|]*\b/\U$&/'
SAS_AF=0.0072;AA=G|||;VT=SNP
\U uppercases everything that follows in the replacement until the end, \E, or another case modifier.
Use the g modifier if there can be more than one match per line.

Awk pattern matching on rows that have a value at specific column. No delimiter

I would like to search a file, using awk, to output rows that have a value commencing at a specific column number. e.g.
I am looking for 979719 starting at column number 10:
moobaaraa979719
moobaaraa123456
moo979719123456
moobaaraa979719
moobaaraa123456
As you can see, there are no delimiters; it is a raw data text file. I would like to output rows 1 and 4, but not row 3, which contains the pattern but not at the desired column number.
awk '/979719$/' file
moobaaraa979719
moobaaraa979719
A simple sed approach.
$ cat file
moobaaraa979719
moobaaraa123456
moo979719123456
moobaaraa979719
moobaaraa123456
Just search for a pattern that ends with 979719, and print the line:
$ sed -n '/^.*979719$/p' file
moobaaraa979719
moobaaraa979719
This code works:
awk 'length($1) == 9' FS="979719" raw-text-file
This code sets 979719 as the field separator, and checks whether the first field has a length of 9 characters. Then prints the line (as default action).
awk 'substr($0,10,6) == 979719' file
You can drop the ,6 if you want the match to cover everything from the 10th character to the end of each line.
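A quick self-contained check of the substr approach against the sample data (quoting "979719" makes it an explicit string comparison, though the unquoted numeric form works here too):

```shell
# Print only the lines whose characters 10 through 15 are exactly 979719.
awk 'substr($0,10,6) == "979719"' <<'EOF'
moobaaraa979719
moobaaraa123456
moo979719123456
moobaaraa979719
moobaaraa123456
EOF
```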

Is it possible to use awk to create an extra column that counts the line position in a table?

For example, given this file:
a;b;c
a;b;v
a;b;f
I would like the command to output:
a;b;c;1
a;b;v;2
a;b;f;3
please help me
Try this:
awk '{print $0,NR}' OFS=";" f
a;b;c;1
a;b;v;2
a;b;f;3
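The same command as a self-contained check (OFS=";" makes the appended NR join with a semicolon, and the assignment in the operand position is processed before stdin is read):

```shell
# Append the current line number NR as a new ;-separated field.
awk '{print $0, NR}' OFS=';' <<'EOF'
a;b;c
a;b;v
a;b;f
EOF
```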