How to remove 0's from the second column - awk

I have a file that looks like this:
k141_173024,001
k141_173071,002
k141_173527,021
k141_173652,034
k141_173724,041
...
How do I remove 0's from each line of the second field?
The desired result is:
k141_173024,1
k141_173071,2
k141_173527,21
k141_173652,34
k141_173724,41
...
What I've tried was:
cut -f 2 -d ',' file | awk '{print $1 + 0}' > file2
cut -f 1 -d ',' file > file1
paste file1 file2 > final_file
This felt like an inefficient way to do it.
Thank you.

awk 'BEGIN{FS=OFS=","} {print $1 OFS $2+0}' Input.txt
Adding 0 forces awk to evaluate the field as a number, which drops the leading zeros.
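A minimal runnable check of that coercion, using two sample lines from the question:

```shell
# Sample lines from the question; $2+0 coerces "001" to 1 and "021" to 21.
printf 'k141_173024,001\nk141_173527,021\n' |
awk 'BEGIN{FS=OFS=","} {print $1, $2+0}'
# k141_173024,1
# k141_173527,21
```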

If only the zeros immediately following the comma should go (`,001` becomes `,1` but `,010` would become `,10`; strictly that is removing leading zeros rather than all zeros, and the example doesn't make the requirement clear), you could replace the comma and zeros with a bare comma:
$ awk '{gsub(/,0+/,",")}1' file
k141_173024,1
k141_173071,2
k141_173527,21
k141_173652,34
k141_173724,41

You could try the following, which removes every 0 from the second field:
awk 'BEGIN{FS=OFS=","} {gsub(/0/,"",$2)}1' Input_file
EDIT: to remove only leading zeros, try the following:
awk 'BEGIN{FS=OFS=","} {sub(/^0+/,"",$2)}1' Input_file

If the second field is a number, you can do this to remove the leading zeroes:
awk 'BEGIN{FS=OFS=","} {print $1 OFS int($2)}' file
As per @Inian's suggestion, this can be further simplified to:
awk -F, -v OFS=, '{$2=int($2)}1' file

This might work for you (GNU sed):
sed 's/,0\+/,/' file
This removes leading zeroes from the second column by replacing a comma followed by one or more zeroes by a comma.
P.S. I guess the OP did not mean to remove zeroes that are part of the number.
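To see how the approaches diverge when a zero appears after the first nonzero digit (an edge case not in the question's sample, so the input here is hypothetical):

```shell
# Numeric coercion removes only leading zeros ("010" -> 10, "0102" -> 102):
printf 'a,010\nb,0102\n' |
awk 'BEGIN{FS=OFS=","} {print $1, $2+0}'
# a,10
# b,102

# gsub deletes every zero in the field ("010" -> 1, "0102" -> 12):
printf 'a,010\nb,0102\n' |
awk 'BEGIN{FS=OFS=","} {gsub(/0/,"",$2)}1'
# a,1
# b,12
```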

Related

AWK remove leading and trailing whitespaces from fields

I need to remove leading and trailing whitespace characters from an input like this below using awk:
27 mA; 25 mA ; 24 mA ; 22 mA;
Required output:
27 mA;25 mA;24 mA;22 mA;
What I've tried:
awk -F";" '{$1=$1}1': doesn't remove the whitespaces, but removes ;s
awk -F";" '{gsub(/^[ \t]+|[ \t]+$/,""); print;}': doesn't remove the whitespaces
How can I modify these commands above to remove all leading and trailing whitespace (0x20) characters?
Update
Input might contain leading whitespace(s) at the first field:
27 mA; 25 mA ; 24 mA ; 22 mA;
With your shown samples, you could try the following awk code. Written and tested with GNU awk; it should work in any awk.
awk -F'[[:space:]]*;[[:space:]]*' -v OFS=";" '{sub(/^[[:space:]]+/,"");$1=$1} 1' Input_file
OR
awk -F'[[:blank:]]*;[[:blank:]]*' -v OFS=";" '{sub(/^[[:blank:]]+/,"");$1=$1} 1' Input_file
Explanation: the field separator is set to zero or more spaces, followed by ;, followed by zero or more spaces, and OFS is set to ;. In the main program, the sub removes any leading whitespace, re-assigning the 1st field forces the record to be rebuilt with OFS, and 1 prints the line.
2nd solution: as per @jhnc's nice suggestion in the comments, you could also try the following with your shown samples.
awk '{sub(/^[[:space:]]+/,"");gsub(/ *; */,";")} 1' Input_file
A simple sed:
sed -E 's/[[:blank:]]*;[[:blank:]]*/;/g; s/^[[:blank:]]+|[[:blank:]]+$//' file
27 mA;25 mA;24 mA;22 mA;
or this awk:
awk '{gsub(/[[:blank:]]*;[[:blank:]]*/, ";"); gsub(/^[[:blank:]]+|[[:blank:]]+$/, "")} 1' file
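As a quick check, the sed variant also handles the updated sample with a leading space:

```shell
# The updated input from the question, with a leading blank on the first field.
printf ' 27 mA; 25 mA ; 24 mA ; 22 mA;\n' |
sed -E 's/[[:blank:]]*;[[:blank:]]*/;/g; s/^[[:blank:]]+|[[:blank:]]+$//'
# 27 mA;25 mA;24 mA;22 mA;
```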

Filtering rows based on column values of csv file

I have a dataset with 1000 rows and 10 columns. Here is the sample dataset
A,B,C,D,E,F,
a,b,c,d,e,f,
g,h,i,j,k,l,
m,n,o,p,q,r,
s,t,u,v,w,x,
From this dataset I want to copy the rows whose column A value is 'a' or 'm' to a new csv file. I also want the header to be copied.
I have tried using awk. It copied all the rows but not the header.
awk '{$1~/a//m/ print}' inputfile.csv > outputfile.csv
How can I copy the header also into the new outputfile.csv?
Thanks in advance.
Considering that your header is on the 1st row, you could try the following.
awk 'BEGIN{FS=OFS=","} FNR==1{print;next} $1 ~ /^a$|^m$/' Input_file > outputfile.csv
OR, as per Cyrus's comment:
awk 'BEGIN{FS=OFS=","} FNR==1{print;next} $1 ~ /^(a|m)$/' Input_file > outputfile.csv
OR, as per Ed's comment, try the following:
awk -F, 'NR==1 || $1~/^[am]$/' Input_file > outputfile.csv
Corrections to the OP's attempt:
Added FS and OFS as , for all lines, since the lines are comma-delimited.
Added the FNR==1 condition, which matches the 1st line and simply prints it, since we want the header in the output file; next then skips the remaining statements for that line.
Used a better regex for the 1st field's condition: $1 ~ /^a$|^m$/
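Running the shortest corrected version against the question's sample (a sketch; the output should match the requirement, header included):

```shell
# Sample dataset from the question, piped in via printf.
printf 'A,B,C,D,E,F,\na,b,c,d,e,f,\ng,h,i,j,k,l,\nm,n,o,p,q,r,\ns,t,u,v,w,x,\n' |
awk -F, 'NR==1 || $1~/^[am]$/'
# A,B,C,D,E,F,
# a,b,c,d,e,f,
# m,n,o,p,q,r,
```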
This might work for you (GNU sed):
sed '1b;/^[am],/!d' oldFile >newFile
Always print the first line and delete any other line that does not begin with a, or m,.
Alternative:
awk 'NR==1 || /^[am],/' oldFile >newFile
With awk: set the field separator (FS) to , and output the current row if it is the first row or if its first column is a or m.
awk 'NR==1 || $1=="a" || $1=="m"' FS=',' in.csv >out.csv
Output to out.csv:
A,B,C,D,E,F,
a,b,c,d,e,f,
m,n,o,p,q,r,
$ awk -F, 'BEGIN{split("a,m",tmp); for (i in tmp) tgts[tmp[i]]} NR==1 || $1 in tgts' file
A,B,C,D,E,F,
a,b,c,d,e,f,
m,n,o,p,q,r,
It appears that awk's default delimiter is whitespace.
Changing the delimiter can be denoted by using the FS variable:
awk 'BEGIN { FS = "," } ; { print $2 }'
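For example, with a comma-delimited line, setting FS selects the second comma-separated field:

```shell
printf 'a,b,c\n' | awk 'BEGIN { FS = "," } ; { print $2 }'
# b
```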

How to remove field separators in awk when printing $0?

eg, each row of the file is like :
1, 2, 3, 4,..., 1000
How can I print out
1 2 3 4 ... 1000
?
If you just want to delete the commas, you can use tr:
$ tr -d ',' <file
1 2 3 4 1000
If it is something more general, you can set FS and OFS (read about FS and OFS) in your begin block:
awk 'BEGIN{FS=","; OFS=""} ...' file
You need to set OFS (the output field separator). Unfortunately, this has no effect unless you also modify the record, leading to the rather cryptic:
awk '{$1=$1}1' FS=, OFS=
Although, if you are happy with some additional space being added, you can leave OFS at its default value (a single space), and do:
awk -F, '{$1=$1}1'
and if you don't mind dropping lines where the assignment evaluates as false (blank lines, and lines whose first field is empty or 0), you can simplify further to:
awk -F, '$1=$1'
You could also remove the field separators:
awk -F, '{gsub(FS,"")} 1'
Set FS to the input field separator (a comma plus optional whitespace; note that \s is a GNU awk extension). Assigning to $1 will then rebuild the record using the output field separator, which defaults to a space:
awk -F',\s*' '{$1 = $1; print}'
See the GNU Awk Manual for an explanation of $1 = $1
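A side-by-side sketch of why the self-assignment matters, using the portable ', *' separator instead of gawk's \s:

```shell
# Changing only the separators leaves $0 untouched:
printf '1, 2, 3\n' | awk -F', *' '{print}'
# 1, 2, 3

# Assigning a field to itself forces awk to rebuild $0 with OFS (default: one space):
printf '1, 2, 3\n' | awk -F', *' '{$1=$1; print}'
# 1 2 3
```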

awk to split variable length record and add unique number on each group of records

I have a file which has variable-length records:
x|y|XREC|DELIMITER|ab|cd|ef|IREC|DELIMITER|j|a|CREC|
p|q|IREC|DELIMITER|ww|xx|ZREC|
What I would like is:
1|x|y|XREC|
1|ab|cd|ef|IREC|
1|j|a|CREC|
2|p|q|IREC|
2|ww|xx|ZREC|
So far I just managed to get a sequence number at the beginning:
awk '{printf "%d|%s\n", NR, $0}' oldfile > with_seq.txt
Any help?
You could set the delimiter to DELIMITER:
$ awk -F 'DELIMITER[|]' '{for (i=1;i<=NF;i++)print NR"|"$i}' file
1|x|y|XREC|
1|ab|cd|ef|IREC|
1|j|a|CREC|
2|p|q|IREC|
2|ww|xx|ZREC|
Using awk
awk -F "DELIMITER" '{for(i=1;i<=NF;i++)print NR "|" $i}' file|sed 's/||/|/g'
1|x|y|XREC|
1|ab|cd|ef|IREC|
1|j|a|CREC|
2|p|q|IREC|
2|ww|xx|ZREC|
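Why the second answer needs the trailing sed: splitting on the bare word DELIMITER leaves the adjacent | characters inside the fields, so every field after the first starts with |, producing a doubled || (a quick check with one line from the question):

```shell
# Without [|] in the separator, field 2 keeps its leading pipe.
printf 'x|y|XREC|DELIMITER|ab|cd|ef|IREC|\n' |
awk -F 'DELIMITER' '{for(i=1;i<=NF;i++) print NR "|" $i}'
# 1|x|y|XREC|
# 1||ab|cd|ef|IREC|
```

The first answer's `-F 'DELIMITER[|]'` avoids this by consuming one of the pipes as part of the separator.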

awk command to change field separator from tilde to tab

I want to replace the tilde delimiter with a tab using awk. Below is the input and the output I expect.
input
~1~2~3~
Output
1 2 3
This doesn't work for me:
awk -F"~" '{ OFS ="\t"; print }' inputfile
It's really a job for tr:
tr '~' '\t'
but in awk you just need to force the record to be recompiled by assigning one of the fields to its own value:
awk -F'~' -v OFS='\t' '{$1=$1}1'
awk NF=NF FS='~' OFS='\t'
Result
1 2 3
Code for sed:
$ echo '~1~2~3~' | sed 'y/~/\t/'
1 2 3
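One caveat for all of the answers above: because the sample line starts and ends with a tilde, the translated line starts and ends with a tab as well. Piping through cat -A (a GNU coreutils option that renders tabs as ^I and line ends as $) makes that visible:

```shell
printf '~1~2~3~\n' | tr '~' '\t' | cat -A
# ^I1^I2^I3^I$
```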