Replace end of line with comma and put parenthesis in sed/awk - awk

I am trying to process the contents of a file from this format:
this1,EUR
that2,USD
other3,GBP
to this format:
this1(EUR),that2(USD),other3(GBP)
The result should be a single line.
So far I have come up with this pipeline of commands, which works fine:
cat myfile | sed -e 's/,/\(/g' | sed -e 's/$/\)/g' | tr '\n' , | awk '{print substr($0, 0, length($0)- 1)}'
Is there a simpler way to do the same with a single awk command?

Another awk:
$ awk -F, '{ printf "%s%s(%s)", c, $1, $2; c = ","} END { print ""}' file
this1(EUR),that2(USD),other3(GBP)
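The separator trick in that one-liner is easier to follow when it is spread over several lines. The following is only a sketch: the file name myfile is taken from the question, and the variable name sep is an illustrative stand-in for c.

```shell
# Recreate the sample input from the question (the file name is an assumption).
printf '%s\n' 'this1,EUR' 'that2,USD' 'other3,GBP' > myfile

result=$(awk -F, '
  {
    # sep is empty for the first record, so no leading comma is printed
    printf "%s%s(%s)", sep, $1, $2
    sep = ","            # every later record is prefixed with a comma
  }
  END { print "" }       # finish the single output line with a newline
' myfile)

printf '%s\n' "$result"
```

With the sample input this prints this1(EUR),that2(USD),other3(GBP). Because the comma is written before each record rather than after it, there is no trailing comma to strip afterwards.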

The following awk may also help:
awk -F, '{val=val?val OFS $1"("$2")":$1"("$2")"} END{print val}' OFS=, Input_file

Toying around with separators and gsub:
$ awk 'BEGIN{RS="";ORS=")\n"}{gsub(/,/,"(");gsub(/\n/,"),")}1' file
this1(EUR),that2(USD),other3(GBP)
Explained:
$ awk '
BEGIN {
RS="" # record ends in an empty line, not newline
ORS=")\n" # the last )
}
{
gsub(/,/,"(") # replace commas with (
gsub(/\n/,"),") # and newlines with ),
}1' file # output

Using paste+sed
$ # paste -s will combine all input lines to single line
$ seq 3 | paste -sd,
1,2,3
$ paste -sd, ip.txt
this1,EUR,that2,USD,other3,GBP
$ # post processing to get desired format
$ paste -sd, ip.txt | sed -E 's/,([^,]*)(,?)/(\1)\2/g'
this1(EUR),that2(USD),other3(GBP)

Related

Need to retrieve a value from an HL7 file using awk

In a Linux script program, I've got the following awk command for other purposes and to rename the file.
cat $edifile | awk -F\| '
{ OFS = "|"
print $0
} ' | tr -d "\012" > $newname.hl7
While this is happening, I'd like to grab the 5th field of the MSH segment and save it for later use in the script. Is this possible?
If no, how could I do it later or earlier on?
Example of the segment.
MSH|^~\&|business1|business2|/u/tmp/TR0049-GE-1.b64|routing|201811302126||ORU^R01|20181130212105810|D|2.3
What I want to do is retrieve the path and file name in MSH 5 and concatenate it to the end of the new file.
I've used this to capture the data but no luck. If fpth is getting set, there is no evidence of it and I don't have the right syntax for an echo within the awk phrase.
cat $edifile | awk -F\| '
{ OFS = "|"
{fpth=$(5)}
print $0
} ' | tr -d "\012" > $newname.hl7
any suggestions?
Thank you!
Try
filename=`awk -F'|' '{print $5}' $edifile | head -1`
You can skip the piping through head if the file is a single line
First of all, note that the awk command in your first piece of code has no effect:
$ cat $edifile | awk -F\| ' { OFS = "|"; print $0 }' | tr -d "\012" > $newname.hl7
This is totally equivalent to
$ cat $edifile | tr -d "\012" > $newname.hl7
because OFS is only used to rebuild $0 when you assign to a field.
Example:
$ echo "a|b|c" | awk -F\| '{OFS="/"; print $0}'
a|b|c
$ echo "a|b|c" | awk -F\| '{OFS="/"; $1=$1; print $0}'
a/b/c
I understand that you have an HL7 file containing a single line that starts with the string "MSH", and that you want to store the 5th field of that line. This can be done as follows:
fpth=$(awk -v outputfile="${newname}.hl7" '
BEGIN{FS="|"; ORS="" }
($1 == "MSH"){ print $5 }
{ print $0 > outputfile }' $edifile)
I have set ORS to an empty string, which has the same effect as tr -d "\012". The above works well as long as there is only a single MSH segment in your file.
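To try this end to end, here is a minimal sketch. The file names sample.edi and sample_out, and the second (PID) segment, are made up for illustration; only the MSH line comes from the question. The last step appends the captured path to the new file, which is what the question asks for.

```shell
# Hypothetical sample data: the MSH line from the question plus an invented PID segment.
edifile=sample.edi
newname=sample_out
printf '%s\n' \
  'MSH|^~\&|business1|business2|/u/tmp/TR0049-GE-1.b64|routing|201811302126||ORU^R01|20181130212105810|D|2.3' \
  'PID|1||12345' > "$edifile"

fpth=$(awk -v outputfile="${newname}.hl7" '
  BEGIN { FS = "|"; ORS = "" }     # empty ORS drops the newlines, like tr -d "\012"
  $1 == "MSH" { print $5 }         # goes to stdout, which the shell captures
  { print $0 > outputfile }        # the joined records go to the new file
' "$edifile")

printf 'captured: %s\n' "$fpth"

# Append the captured path to the end of the new file.
printf '%s' "$fpth" >> "${newname}.hl7"
```

Quoting "$edifile" guards against spaces in the file name, which the unquoted form does not.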

Add delimiters at end of each line

I've a csv file like below.
id,id1,id2,id3,id4,id5
1,101,102,103,104
2,201,202,203
3,301,302
What I want is to append commas (,) to each line so that every line has the same number of delimiters. The desired output is:
id,id1,id2,id3,id4,id5
1,101,102,103,104,
2,201,202,203,,
3,301,302,,,
Using
awk -F "," ' { print NF-1 } ' file.csv | sort -r | head -1
I am able to find the maximum number of delimiters, but I am not sure how to compare each line against it and append commas where a line has fewer than the maximum.
With GNU awk (as I do not know if this works for other implementations)
$ # simply assign value to NF
$ awk -F, -v OFS=',' '{NF=6} 1' ip.txt
id,id1,id2,id3,id4,id5
1,101,102,103,104,
2,201,202,203,,
3,301,302,,,
If first line determines number of fields required:
$ awk -F, -v OFS=',' 'NR==1{f=NF} {NF=f} 1' ip.txt
id,id1,id2,id3,id4,id5
1,101,102,103,104,
2,201,202,203,,
3,301,302,,,
If any line determines max field:
$ cat ip.txt
id,id1,id2
1,101,102,103
2,201,202,203,204
3,301,302
$ awk -F, -v OFS=',' 'NR==FNR{f=(!f || NF>f) ? NF : f; next} {NF=f} 1' ip.txt ip.txt
id,id1,id2,,
1,101,102,103,
2,201,202,203,204
3,301,302,,
awk -F"," '{i=NF;c="";while (i++ < 6) {c=c","};print $0""c}' file
Output:
id,id1,id2,id3,id4,id5
1,101,102,103,104,
2,201,202,203,,
3,301,302,,,
You are already using the variable NF which indicates how many fields there are on a line.
awk -F , 'NF<6 { OFS=FS; for (i=NF+1; i<=6; i++) $i="" }1' filename
We start looping at the first undefined field and set it to an empty string, until we have six fields. Then the 1 at the end takes care of printing the now fully populated line. The OFS=FS is necessary to make the output field separator also be a comma (it is a space by default).
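The same padding idea can be used without hard-coding 6. The sketch below is one possible variation, not taken from any of the answers: it buffers the whole file in memory (an assumption that only suits files of modest size), finds the widest row, and pads every line in the END block. The file name ip.txt and the array names line and nf are illustrative.

```shell
# Sample input where a later line, not the header, has the most fields.
printf '%s\n' 'id,id1,id2' '1,101,102,103' '2,201,202,203,204' '3,301,302' > ip.txt

result=$(awk -F, '
  {
    line[NR] = $0                 # remember the raw line
    nf[NR]   = NF                 # and how many fields it had
    if (NF > max) max = NF        # track the widest row seen so far
  }
  END {
    for (i = 1; i <= NR; i++) {
      pad = ""
      for (j = nf[i]; j < max; j++) pad = pad ","   # one comma per missing field
      print line[i] pad
    }
  }
' ip.txt)

printf '%s\n' "$result"
```

Since $0 is never modified, OFS does not need to be set here; the commas are appended as plain text.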
The following awk may also help:
awk -F, '
FNR==1{
val=NF;
print;
next
}
{
count=NF;
while(count<val){
value=value",";
count++};
print $0 value;
value=count=""
}
' Input_file
Output will be as follows:
id,id1,id2,id3,id4,id5
1,101,102,103,104,
2,201,202,203,,
3,301,302,,,
Unified awk approach (based on number of fields of the 1st header line):
awk -F',' 'NR==1{ max_nf=NF; print }
NR>1{ printf "%s%.*s\n", $0, max_nf-NF, ",,,,,,,,," }' file
The output:
id,id1,id2,id3,id4,id5
1,101,102,103,104,
2,201,202,203,,
3,301,302,,,
Or via loop:
awk -F',' 'NR==1{ max_nf=NF; print }
NR>1{ n=max_nf-NF; r=""; while (n--) r=r","; print $0 r }' file

How to merge these codes, awk then cut

I am using awk in Debian.
input
11.22.33.44#55878:
11.22.33.43#55879:
...
...
(smtp:55.66.77.88)
(smtp:55.66.77.89)
...
...
cpe-33-22-11-99.buffalo.res.rr.com[99.11.22.33]
cpe-34-22-11-99.buffalo.res.rr.com[99.11.22.34]
...
Parts of sh codes (running in Debian)
awk '/#/ {print > "file1";next} \
/smtp/ {print > "file2";next} \
{print > "file7"}' input
#
if [ -s file1 ] ; then
#IP type => 11.22.33.44#55878:
cut -d'#' -f1 file1 >> output
rm -f file1
fi
#
if [ -s file2 ] ; then
#IP type => (smtp:55.66.77.88)
cut -d':' -f2 file2 | cut -d')' -f1 >> output
rm -f file2
fi
#
if [ -s file7 ] ; then
#IP type => cpe-33-22-11-99.buffalo.res.rr.com[99.11.22.33]
cut -d'[' -f2 file7 | cut -d']' -f1 >> output
rm -f file7
fi
then output
11.22.33.44
11.22.33.43
55.66.77.88
55.66.77.89
99.11.22.33
99.11.22.34
Is it possible to merge these steps into a single awk command, something like
awk '/#/ {print | cut -d'#' -f1 > "file1";next} \
/smtp/ {print | cut -d':' -f2 | cut -d')' -f1 > "file2";next} \
{print | cut -d'[' -f2 file7 | cut -d']' > "file7"}' input
I am a newbie and have no idea how to do this; searching existing questions has not helped.
Any hints?
Thanks.
Best regards.
$ awk -F'[][()#]|smtp:' '/#/{print $1;next} /smtp/{print $3;next} /\[/{print $2}' input
11.22.33.44
11.22.33.43
55.66.77.88
55.66.77.89
99.11.22.33
99.11.22.34
To save this in the file output:
awk -F'[][()#]|smtp:' '/#/{print $1;next} /smtp/{print $3;next} /\[/{print $2}' input >output
How it works
-F'[][()#]|smtp:'
This sets the field separator to (a) any of the characters ][()# or (b) the string smtp:.
/#/{print $1;next}
If the line contains #, then print the first field and skip to the next line.
/smtp/{print $3;next}
If the line contains smtp, then print the third field and skip to the next line.
/\[/{print $2}
If the line contains [, then print the second field.
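To see why the field numbers differ per line type, it can help to print every field that this separator produces. The following is a small diagnostic sketch using one line of each kind from the question; the angle brackets are only there to make empty fields visible.

```shell
# One sample line of each type from the question.
printf '%s\n' \
  '11.22.33.44#55878:' \
  '(smtp:55.66.77.88)' \
  'cpe-33-22-11-99.buffalo.res.rr.com[99.11.22.33]' > input

result=$(awk -F'[][()#]|smtp:' '{
  out = NF " fields:"
  for (i = 1; i <= NF; i++) out = out " <" $i ">"   # show each field, empty ones included
  print out
}' input)

printf '%s\n' "$result"
```

For the smtp line, both the opening parenthesis and the string smtp: act as separators, which leaves two empty fields ahead of the address; that is why the address is in $3 there.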
Variation
There is more than one way to solve this problem. For example, using a slightly different field separator, we can still get the desired output:
$ awk -F'[][()#:]' '/#/{print $1;next} /smtp/{print $3;next} /\[/{print $2}' input
11.22.33.44
11.22.33.43
55.66.77.88
55.66.77.89
99.11.22.33
99.11.22.34

multiple field separator in awk

I'm trying to process an input which has two field separators, ; and space. I'm able to parse the input with one separator using:
echo "10.23;7.15;6.23" | awk -v OFMF="%0.2f" 'BEGIN{FS=OFS=";"} {print $1,$2,$3}'
10.23;7.15;6.23
For an input with two separators, I tried this, but it doesn't split on both separators:
echo "10.23;7.15 6.23" | awk -v OFMF="%0.2f" 'BEGIN{FS=OFS=";" || " "} {print $1,$2,$3}'
You want to set FS to a character list:
awk -F'[; ]' 'script' file
and the other builtin variable you're trying to set is named OFMT, not OFMF:
$ echo "10.23;7.15 6.23" | awk -F'[; ]' -v OFMT="%0.2f" '{print $1,$2,$3}'
10.23 7.15 6.23
$ echo "10.23;7.15 6.23" | awk 'BEGIN{FS="[; ]"; OFS=";"; OFMT="%0.2f"} {print $1,$2,$3}'
10.23;7.15;6.23
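One thing the examples above do not show is when OFMT actually takes effect. It applies when print converts a number to text, whereas fields read from the input are printed as their original text. The sketch below uses made-up input values chosen so that the difference is visible; adding 0 is simply a common way to force a field to be treated as a number.

```shell
result=$(echo "10.2345;7.1 6" | awk -F'[; ]' -v OFMT="%.2f" '{
  print $1, $2, $3          # fields printed as text: OFMT is not applied
  print $1+0, $2+0, $3+0    # numeric values: OFMT is applied to non-integers
}')

printf '%s\n' "$result"
```

The first line comes out unchanged, while the second is formatted to two decimals, with integer values still printed as integers.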

Delete lines from file -- awk

I have a file file.dat containing numbers, for example
4
6
7
I would like to use the numbers of this file to delete lines of another file.
Is there any way to pass these numbers as parameters to awk and delete those lines from another file?
I have this awk solution, but do not like it too much...
awk 'BEGIN { while( (getline x < "./file.dat" ) > 0 ) a[x]=0; } NR in a { next; }1' /path/to/another/file
Can you suggest something more elegant?
Use NR==FNR to test which file awk is currently reading:
$ awk '{if(NR==FNR)idx[$0];else if(!(FNR in idx))print}' idx.txt data.txt
Or
$ awk 'NR==FNR{idx[$0]; next}; !(FNR in idx)' idx.txt data.txt
Here idx.txt holds the line numbers to delete and data.txt holds the data.
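A small self-contained run of the NR==FNR idiom, as a sketch: the line numbers 4, 6 and 7 come from the question, while the contents of data.txt are invented for illustration.

```shell
# Line numbers to delete (from the question) and some made-up data lines.
printf '%s\n' 4 6 7 > idx.txt
printf 'line%s\n' 1 2 3 4 5 6 7 8 > data.txt

result=$(awk '
  NR == FNR { idx[$0]; next }   # first file: remember each line number to delete
  !(FNR in idx)                 # second file: print lines whose number is not listed
' idx.txt data.txt)

printf '%s\n' "$result"
```

This prints line1, line2, line3, line5 and line8. One caveat: if idx.txt is empty, NR==FNR is also true while reading data.txt, so nothing is printed.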
I would use sed instead of awk:
$ sed $(for i in $(<idx.txt);do echo " -e ${i}d";done) file.txt