Replace end of line with comma and put parenthesis in sed/awk - awk
I am trying to process the contents of a file from this format:
this1,EUR
that2,USD
other3,GBP
to this format:
this1(EUR),that2(USD),other3(GBP)
The result should be a single line.
So far I have come up with this pipeline of commands, which works fine:
cat myfile | sed -e 's/,/\(/g' | sed -e 's/$/\)/g' | tr '\n' , | awk '{print substr($0, 0, length($0)- 1)}'
Is there a simpler way to do the same with a single awk command?
Another awk:
$ awk -F, '{ printf "%s%s(%s)", c, $1, $2; c = ","} END { print ""}' file
this1(EUR),that2(USD),other3(GBP)
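To see why the separator variable c produces neither a leading nor a trailing comma, here is a small sketch that feeds the question's sample through the same command. The function name join_pairs is hypothetical and only used for illustration.

```shell
# join_pairs is a hypothetical helper name, not part of the original answer.
# c is empty for the first record, so nothing precedes the first item;
# from then on it is "," and is printed before every later item.
join_pairs() {
  awk -F, '{ printf "%s%s(%s)", c, $1, $2; c = "," } END { print "" }' "$@"
}

printf 'this1,EUR\nthat2,USD\nother3,GBP\n' | join_pairs
```

The END block only adds the final newline, since printf never emitted one.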
The following awk may also help:
awk -F, '{val=val?val OFS $1"("$2")":$1"("$2")"} END{print val}' OFS=, Input_file
Toying around with separators and gsub:
$ awk 'BEGIN{RS="";ORS=")\n"}{gsub(/,/,"(");gsub(/\n/,"),")}1' file
this1(EUR),that2(USD),other3(GBP)
Explained:
$ awk '
BEGIN {
RS="" # record ends in an empty line, not newline
ORS=")\n" # the last )
}
{
gsub(/,/,"(") # replace commas with (
gsub(/\n/,"),") # and newlines with ),
}1' file # output
Using paste+sed
$ # paste -s will combine all input lines to single line
$ seq 3 | paste -sd,
1,2,3
$ paste -sd, ip.txt
this1,EUR,that2,USD,other3,GBP
$ # post processing to get desired format
$ paste -sd, ip.txt | sed -E 's/,([^,]*)(,?)/(\1)\2/g'
this1(EUR),that2(USD),other3(GBP)
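The sed expression may be easier to follow in isolation. In this sketch, each match consumes a comma, the currency code after it, and an optional following comma; the code is wrapped in parentheses and the optional comma is put back. The name wrap_currency is hypothetical, and the explicit - operand for paste is an assumption made so that it reads standard input on any implementation.

```shell
# wrap_currency is a hypothetical name for the sed post-processing step.
# \1 is the text between two commas (the currency code),
# \2 is the comma that follows it, or empty at the end of the line.
wrap_currency() {
  sed -E 's/,([^,]*)(,?)/(\1)\2/g'
}

printf 'this1,EUR\nthat2,USD\nother3,GBP\n' | paste -sd, - | wrap_currency
```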
Related
Need to retrieve a value from an HL7 file using awk
In a Linux script, I have the following awk command, used for other purposes and to rename the file:
cat $edifile | awk -F\| ' {
OFS = "|"
print $0
} ' | tr -d "\012" > $newname.hl7
While this is happening, I'd like to grab the 5th field of the MSH segment and save it for later use in the script. Is this possible? If not, how could I do it later or earlier on?
Example of the segment:
MSH|^~\&|business1|business2|/u/tmp/TR0049-GE-1.b64|routing|201811302126||ORU^R01|20181130212105810|D|2.3
What I want to do is retrieve the path and file name in MSH 5 and concatenate it to the end of the new file. I tried the following to capture the data, but had no luck. If fpth is getting set, there is no evidence of it, and I don't have the right syntax for an echo within the awk program.
cat $edifile | awk -F\| ' {
OFS = "|"
{fpth=$(5)}
print $0
} ' | tr -d "\012" > $newname.hl7
Any suggestions? Thank you!
Try
filename=`awk -F'|' '{print $5}' $edifile | head -1`
You can skip the piping through head if the file is a single line.
First of all, it must be mentioned that the awk command in your first piece of code does nothing useful:
$ cat $edifile | awk -F\| ' { OFS = "|"; print $0 }' | tr -d "\012" > $newname.hl7
This is completely equivalent to
$ cat $edifile | tr -d "\012" > $newname.hl7
because OFS is only used to rebuild $0 when you assign to a field. Example:
$ echo "a|b|c" | awk -F\| '{OFS="/"; print $0}'
a|b|c
$ echo "a|b|c" | awk -F\| '{OFS="/"; $1=$1; print $0}'
a/b/c
I understand that you have an HL7 file containing a single line that starts with the string "MSH", and that you want to store the 5th field of this line. This is achieved in the following way:
fpth=$(awk -v outputfile="${newname}.hl7" '
BEGIN{FS="|"; ORS="" }
($1 == "MSH"){ print $5 }
{ print $0 > outputfile }' $edifile)
I have set ORS to an empty string, as this is equivalent to tr -d "\012". The above will work well if you only have a single MSH segment in your file.
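As a minimal sketch of the extraction step alone, the following captures MSH field 5 from the sample segment in the question. It does not write the output file as the command above does. The helper name msh_field5 and the sample variable are hypothetical.

```shell
# msh_field5 is a hypothetical helper; it prints field 5 of the first
# line whose first field is exactly "MSH", then stops reading.
msh_field5() {
  awk -F'|' '$1 == "MSH" { print $5; exit }' "$@"
}

# Sample segment copied from the question.
sample='MSH|^~\&|business1|business2|/u/tmp/TR0049-GE-1.b64|routing|201811302126||ORU^R01|20181130212105810|D|2.3'
fpth=$(printf '%s\n' "$sample" | msh_field5)
echo "$fpth"
```

Command substitution is what gets the value out of awk and into a shell variable; a variable assigned inside the awk program is not visible to the calling script.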
Add delimiters at end of each line
I have a CSV file like the one below.
id,id1,id2,id3,id4,id5
1,101,102,103,104
2,201,202,203
3,301,302
I want to append commas to each line so that all lines have the same number of delimiters. The desired output is:
id,id1,id2,id3,id4,id5
1,101,102,103,104,
2,201,202,203,,
3,301,302,,,
Using
awk -F "," ' { print NF-1 } ' file.csv | sort -r | head -1
I am able to find the maximum number of delimiters on a line, but I am not sure how to compare each line against it and append commas if the line has fewer than the maximum.
With GNU awk (as I do not know if this works for other implementations)
$ # simply assign value to NF
$ awk -F, -v OFS=',' '{NF=6} 1' ip.txt
id,id1,id2,id3,id4,id5
1,101,102,103,104,
2,201,202,203,,
3,301,302,,,
If the first line determines the number of fields required:
$ awk -F, -v OFS=',' 'NR==1{f=NF} {NF=f} 1' ip.txt
id,id1,id2,id3,id4,id5
1,101,102,103,104,
2,201,202,203,,
3,301,302,,,
If any line may determine the maximum number of fields:
$ cat ip.txt
id,id1,id2
1,101,102,103
2,201,202,203,204
3,301,302
$ awk -F, -v OFS=',' 'NR==FNR{f=(!f || NF>f) ? NF : f; next} {NF=f} 1' ip.txt ip.txt
id,id1,id2,,
1,101,102,103,
2,201,202,203,204
3,301,302,,
awk -F"," '{i=NF;c="";while (i++ < 6) {c=c","};print $0""c}' file
Output:
id,id1,id2,id3,id4,id5
1,101,102,103,104,
2,201,202,203,,
3,301,302,,,
You are already using the variable NF, which indicates how many fields there are on a line.
awk -F , 'NF<6 { OFS=FS; for (i=NF+1; i<=6; i++) $i="" }1' filename
We start looping at the first undefined field and set each one to an empty string until we have six fields. Then the 1 at the end takes care of printing the now fully populated line. The OFS=FS is necessary to make the output field separator a comma as well (it is a space by default).
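The same loop can be sketched with the target field count passed in as a parameter rather than hard-coded as 6. The helper name pad_fields and the awk variable n are hypothetical; OFS is set once in BEGIN here instead of inside the action.

```shell
# pad_fields is a hypothetical helper; its first argument is the target
# number of fields, any remaining arguments are input files.
pad_fields() {
  n=$1; shift
  awk -F, -v n="$n" '
    BEGIN { OFS = FS }                                   # rebuild lines with commas
    NF < n { for (i = NF + 1; i <= n; i++) $i = "" }     # create the missing fields
    1                                                    # print every line
  ' "$@"
}

printf 'id,id1,id2,id3,id4,id5\n1,101,102,103,104\n2,201,202,203\n3,301,302\n' | pad_fields 6
```

Assigning to a field beyond NF is what makes awk rebuild the record with OFS, so lines that already have enough fields pass through untouched.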
The following awk may also help.
awk -F, '
FNR==1{
  val=NF;
  print;
  next
}
{
  count=NF;
  while(count<val){
    value=value",";
    count++};
  print $0 value;
  value=count=""
}
' Input_file
The output will be as follows:
id,id1,id2,id3,id4,id5
1,101,102,103,104,
2,201,202,203,,
3,301,302,,,
Unified awk approach (based on the number of fields in the first, header line):
awk -F',' 'NR==1{ max_nf=NF; print } NR>1{ printf "%s%.*s\n", $0, max_nf-NF, ",,,,,,,,," }' file
The output:
id,id1,id2,id3,id4,id5
1,101,102,103,104,
2,201,202,203,,
3,301,302,,,
Or via a loop:
awk -F',' 'NR==1{ max_nf=NF; print } NR>1{ n=max_nf-NF; r=""; while (n--) r=r","; print $0 r }' file
How to merge these codes, awk then cut
I am using awk in Debian.
input
11.22.33.44#55878:
11.22.33.43#55879:
...
...
(smtp:55.66.77.88)
(smtp:55.66.77.89)
...
...
cpe-33-22-11-99.buffalo.res.rr.com[99.11.22.33]
cpe-34-22-11-99.buffalo.res.rr.com[99.11.22.34]
...
Part of the sh code (running in Debian):
awk '/#/ {print > "file1";next} \
/smtp/ {print > "file2";next} \
{print > "file7"}' input
#
if [ -s file1 ] ; then
#IP type => 11.22.33.44#55878:
cut -d'#' -f1 file1 >> output
rm -f file1
fi
#
if [ -s file2 ] ; then
#IP type => (smtp:55.66.77.88)
cut -d':' -f2 file2 | cut -d')' -f1 >> output
rm -f file2
fi
#
if [ -s file7 ] ; then
#IP type => cpe-33-22-11-99.buffalo.res.rr.com[99.11.22.33]
cut -d'[' -f2 file7 | cut -d']' -f1 >> output
rm -f file7
fi
then output
11.22.33.44
11.22.33.43
55.66.77.88
55.66.77.89
99.11.22.33
99.11.22.34
Is it possible to merge this code into awk alone, something like
awk '/#/ {print | cut -d'#' -f1 > "file1";next} \
/smtp/ {print | cut -d':' -f2 | cut -d')' -f1 > "file2";next} \
{print | cut -d'[' -f2 file7 | cut -d']' > "file7"}' input
I am a newbie and have no idea how to do this. Searching existing questions has not helped. Any hints? Thanks. Best regards.
$ awk -F'[][()#]|smtp:' '/#/{print $1;next} /smtp/{print $3;next} /\[/{print $2}' input
11.22.33.44
11.22.33.43
55.66.77.88
55.66.77.89
99.11.22.33
99.11.22.34
To save this in the file output:
awk -F'[][()#]|smtp:' '/#/{print $1;next} /smtp/{print $3;next} /\[/{print $2}' input >output
How it works
-F'[][()#]|smtp:'
This sets the field separator to (a) any of the characters ][()# or (b) the string smtp:.
/#/{print $1;next}
If the line contains #, then print the first field and skip to the next line.
/smtp/{print $3;next}
If the line contains smtp, then print the third field and skip to the next line.
/\[/{print $2}
If the line contains [, then print the second field.
Variation
There is more than one way to solve this problem. For example, using a slightly different field separator, we can still get the desired output:
$ awk -F'[][()#:]' '/#/{print $1;next} /smtp/{print $3;next} /\[/{print $2}' input
11.22.33.44
11.22.33.43
55.66.77.88
55.66.77.89
99.11.22.33
99.11.22.34
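To check why the smtp lines need the third field rather than the first, it can help to print every field that the separator produces. This is a small sketch; show_fields is a hypothetical helper, not part of the answer.

```shell
# show_fields is a hypothetical helper that prints each field with its
# index, using the same field separator as the answer above.
show_fields() {
  awk -F'[][()#]|smtp:' '{ for (i = 1; i <= NF; i++) printf "$%d=<%s>\n", i, $i }' "$@"
}

# "(" and "smtp:" are adjacent separators, so $1 and $2 are both empty
# and the address lands in $3; the trailing ")" leaves an empty $4.
printf '(smtp:55.66.77.88)\n' | show_fields
```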
multiple field separator in awk
I'm trying to process input that has two field separators: ; and space. I'm able to parse the input with one separator using:
echo "10.23;7.15;6.23" | awk -v OFMF="%0.2f" 'BEGIN{FS=OFS=";"} {print $1,$2,$3}'
10.23;7.15;6.23
For input with two separators, I tried this, but it doesn't split on both separators:
echo "10.23;7.15 6.23" | awk -v OFMF="%0.2f" 'BEGIN{FS=OFS=";" || " "} {print $1,$2,$3}'
You want to set FS to a character list:
awk -F'[; ]' 'script' file
and the other builtin variable you're trying to set is named OFMT, not OFMF:
$ echo "10.23;7.15 6.23" | awk -F'[; ]' -v OFMT="%0.2f" '{print $1,$2,$3}'
10.23 7.15 6.23
$ echo "10.23;7.15 6.23" | awk 'BEGIN{FS="[; ]"; OFS=";"; OFMT="%0.2f"} {print $1,$2,$3}'
10.23;7.15;6.23
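As a sketch of why the original attempt appeared to do nothing: in awk, || is a logical operator, so ";" || " " evaluates to 1 (both strings are non-empty, hence true), and FS and OFS both end up as the string 1. The variable names fs_value and fixed below are hypothetical.

```shell
# A non-empty string is true in awk, so the logical OR yields 1,
# and the chained assignment sets both OFS and FS to "1".
fs_value=$(awk 'BEGIN { FS = OFS = ";" || " "; print FS }')
echo "FS is: $fs_value"

# A bracket expression splits on either character instead.
fixed=$(echo "10.23;7.15 6.23" | awk 'BEGIN { FS = "[; ]"; OFS = ";" } { print $1, $2, $3 }')
echo "$fixed"
```

With FS and OFS both set to 1, the sample line is split on the digit 1 and then rejoined with the same digit, which is why the output looked identical to the input.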
Delete lines from file -- awk
I have a file file.dat containing numbers, for example
4
6
7
I would like to use the numbers in this file to delete lines of another file. Is there any way to pass these numbers as parameters to awk and delete those lines from another file? I have this awk solution, but I do not like it much:
awk 'BEGIN { while( (getline x < "./file.dat" ) > 0 ) a[x]=0; } NR in a { next; }1' /path/to/another/file
Can you suggest something more elegant?
Use NR==FNR to test which file awk is reading:
$ awk '{if(NR==FNR)idx[$0];else if(!(FNR in idx))print}' idx.txt data.txt
Or
$ awk 'NR==FNR{idx[$0]; next}; !(FNR in idx)' idx.txt data.txt
Put the line numbers in idx.txt and the data in data.txt.
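A small self-contained sketch of the NR==FNR idiom follows, using temporary files with made-up contents. The directory variable tmp and the result variable kept are hypothetical names.

```shell
# Build two throwaway input files (contents are illustrative only).
tmp=$(mktemp -d)
printf '2\n4\n' > "$tmp/idx.txt"            # line numbers to delete
printf 'a\nb\nc\nd\ne\n' > "$tmp/data.txt"  # file to filter

# While reading the first file, NR (overall count) equals FNR (per-file
# count), so its lines are stored as array keys. For the second file the
# condition is false, and lines whose number is not a key are printed.
kept=$(awk 'NR==FNR { idx[$0]; next } !(FNR in idx)' "$tmp/idx.txt" "$tmp/data.txt")
echo "$kept"

rm -rf "$tmp"
```

One caveat of this idiom: if the first file is empty, NR==FNR stays true while the second file is read, so nothing is printed.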
I would use sed instead of awk:
$ sed $(for i in $(<idx.txt);do echo " -e ${i}d";done) file.txt