Convert multiple lines to a line separated by brackets and "|" - awk

I have the following data in multiple lines:
1
2
3
4
5
6
7
8
9
10
I want to convert them to a single line, with each value wrapped in "()" and the values separated by "|":
(1)|(2)|(3)|(4)|(5)|(6)|(7)|(8)|(9)|(10)
What I have tried is:
seq 10 | sed -r 's/(.*)/(\1)/'|paste -sd"|"
What's the best unix one-liner to do that?

This might work for you (GNU sed):
sed 's/.*/(&)/;H;1h;$!d;x;s/\n/|/g' file
Surround each line by parens.
Append all lines to the hold space except for the first line which replaces the hold space.
Delete all lines except the last.
On the last line, swap to the hold space and replace all newlines by |'s.
N.B. When a line is deleted no further commands are invoked and the command cycle begins again. That is why the last two commands are only executed on the last line of the file.
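As a reading aid, the same script can be spread over several lines with one comment per command. This is only a sketch: the function name join_with_hold_space is made up for illustration, and GNU sed is assumed, as in the answer.

```shell
# join_with_hold_space: hypothetical wrapper name, not part of the answer.
# Reads lines on stdin and prints them as (a)|(b)|(c). Assumes GNU sed.
join_with_hold_space() {
    sed '
        # wrap the current line in parentheses
        s/.*/(&)/
        # append the line to the hold space ...
        H
        # ... except on line 1, where it overwrites the hold space
        1h
        # every line but the last: delete it and start the next cycle
        $!d
        # last line only: swap in the accumulated text
        x
        # the collected lines are joined by newlines; turn them into pipes
        s/\n/|/g
    '
}

printf '1\n2\n3\n' | join_with_hold_space   # prints (1)|(2)|(3)
```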
Alternative:
sed -z 's/\n$//;s/.*/(&)/mg;y/\n/|/' file

With your shown samples, please try the following awk code. It should work in any version of awk.
awk -v OFS="|" '{val=(val?val OFS:"") "("$0")"} END{print val}' Input_file

Using GNU sed
$ sed -Ez ':a;s/([0-9]+)\n/(\1)|/;ta;s/\|$/\n/' input_file
(1)|(2)|(3)|(4)|(5)|(6)|(7)|(8)|(9)|(10)

Here is another simple awk command:
awk 'NR>1 {printf "%s|", p} {p="(" $0 ")"} END {print p}' file
(1)|(2)|(3)|(4)|(5)|(6)|(7)|(8)|(9)|(10)

Here it is:
sed -z 's/^/(/;s/\n/)|(/g;s/|($//' your_input
where -z allows you to treat the whole file as a single string with embedded \ns.
In detail, the sed script above consists of 3 commands separated by ;s:
s/^/(/ inserts a ( at the beginning of the whole file,
s/\n/)|(/g changes every \n to )|(;
s/|($// removes the trailing |( that results from the \n at EOF, which is most likely present in your file, since text files on Linux normally end with a newline.

With perl:
$ seq 10 | perl -pe 's/.*/($&)/; s/\n/|/ if !eof'
(1)|(2)|(3)|(4)|(5)|(6)|(7)|(8)|(9)|(10)
s/.*/($&)/ to surround input lines with ()
s/\n/|/ if !eof will change newline to | except for the last input line.
Here's a solution with paste (just for fun):
$ seq 10 | paste -d'()' /dev/null - /dev/null | paste -sd'|'
(1)|(2)|(3)|(4)|(5)|(6)|(7)|(8)|(9)|(10)

Using any awk:
$ seq 10 | awk '{printf "%s(%s)", sep, $0; sep="|"} END{print ""}'
(1)|(2)|(3)|(4)|(5)|(6)|(7)|(8)|(9)|(10)
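If the brackets or the separator need to vary, the awk approach can be wrapped in a small shell function. This is a minimal sketch: the name wrap_join and its three positional arguments are assumptions for illustration, not something taken from the answers.

```shell
# wrap_join OPEN CLOSE SEP: hypothetical helper, not from the answers above.
# Reads lines on stdin, wraps each one in OPEN/CLOSE and joins them with SEP.
# Note: awk -v interprets backslash escapes in the values.
wrap_join() {
    awk -v op="$1" -v cl="$2" -v sep="$3" '
        # print the separator before every line except the first
        { printf "%s%s%s%s", (NR > 1 ? sep : ""), op, $0, cl }
        # end the output line, but only if there was any input
        END { if (NR) print "" }
    '
}

seq 10 | wrap_join '(' ')' '|'   # prints (1)|(2)|(3)|(4)|(5)|(6)|(7)|(8)|(9)|(10)
```

Printing the separator before each item, rather than after it, avoids having to strip a trailing separator at the end.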

Related

How can i merge every block of 3 lines together while ignoring lower numbers of consecutive lines?

I have a text file like the following. It contains blocks of text, and each block is either a multiple of 3 lines long or just 1 line:
AAAAAAAAAAAAA
BBBBBBBBBBBBB
CCCCCCCCCCCCC
DDDDDDDDDDDDD
EEEEEEEEEEEEE
FFFFFFFFFFFFF
GGGGGGGGGGGGG
HHHHHHHHHHHHH
IIIIIIIIIIIII
JJJJJJJJJJJJJ
KKKKKKKKKKKKK
LLLLLLLLLLLLL
MMMMMMMMMMMMM
NNNNNNNNNNNNN
OOOOOOOOOOOOO
PPPPPPPPPPPPP
QQQQQQQQQQQQQ
RRRRRRRRRRRRR
SSSSSSSSSSSSS
TTTTTTTTTTTTT
UUUUUUUUUUUUU
VVVVVVVVVVVVV
WWWWWWWWWWWWW
XXXXXXXXXXXXX
YYYYYYYYYYYYY
ZZZZZZZZZZZZZ
1111111111111
I would like to merge every group of 3 consecutive lines together, starting with the first line in the block. I want to leave alone lines that are in a block of fewer than 3 consecutive lines.
The characters and lengths of the lines always differ. (I have made the lines the same size in the example so it doesn't look too ugly.)
So the output would be
AAAAAAAAAAAAA BBBBBBBBBBBBB CCCCCCCCCCCCC
DDDDDDDDDDDDD EEEEEEEEEEEEE FFFFFFFFFFFFF
GGGGGGGGGGGGG
HHHHHHHHHHHHH IIIIIIIIIIIII JJJJJJJJJJJJJ
KKKKKKKKKKKKK
LLLLLLLLLLLLL MMMMMMMMMMMMM NNNNNNNNNNNNN
OOOOOOOOOOOOO PPPPPPPPPPPPP QQQQQQQQQQQQQ
RRRRRRRRRRRRR SSSSSSSSSSSSS TTTTTTTTTTTTT
UUUUUUUUUUUUU
VVVVVVVVVVVVV WWWWWWWWWWWWW XXXXXXXXXXXXX
YYYYYYYYYYYYY ZZZZZZZZZZZZZ 1111111111111
I have tried to use
xargs -n3
However, I'm not sure how to ignore the single lines.
How can I achieve this?
With GNU awk for gensub():
$ awk -v RS= -v ORS='\n\n' '{$1=$1; print gensub(/(([^ ]+ ){2}[^ ]+) /,"\\1\n","g")}' file
AAAAAAAAAAAAA BBBBBBBBBBBBB CCCCCCCCCCCCC
DDDDDDDDDDDDD EEEEEEEEEEEEE FFFFFFFFFFFFF
GGGGGGGGGGGGG
HHHHHHHHHHHHH IIIIIIIIIIIII JJJJJJJJJJJJJ
KKKKKKKKKKKKK
LLLLLLLLLLLLL MMMMMMMMMMMMM NNNNNNNNNNNNN
OOOOOOOOOOOOO PPPPPPPPPPPPP QQQQQQQQQQQQQ
RRRRRRRRRRRRR SSSSSSSSSSSSS TTTTTTTTTTTTT
UUUUUUUUUUUUU
VVVVVVVVVVVVV WWWWWWWWWWWWW XXXXXXXXXXXXX
YYYYYYYYYYYYY ZZZZZZZZZZZZZ 1111111111111
In awk:
$ awk -v FS="\n" -v RS="" '{for(i=1;i<=NF;i+=3)print $i,$(i+1),$(i+2);print ""}' file
Output:
AAAAAAAAAAAAA BBBBBBBBBBBBB CCCCCCCCCCCCC
DDDDDDDDDDDDD EEEEEEEEEEEEE FFFFFFFFFFFFF
GGGGGGGGGGGGG
HHHHHHHHHHHHH IIIIIIIIIIIII JJJJJJJJJJJJJ
...
Update Version that won't leave trailing space:
$ awk -v FS="\n" -v RS="" '{for(i=1;i<=NF;i++)printf "%s%s",$i,(i%3==0||i==NF?ORS:OFS);print ""}' file
Please see discussion on some features in the comments. Thanks to the commentators for the constructive feedback.
Here is a different approach that should always work:
awk '(NF==0){print rec ORS; rec="";c=0; next}
{rec = rec (c ? (c%3==0 ? ORS : OFS) : "") $0; c++ }
END {print rec}' file
This might work for you (GNU sed):
sed '/\S/{N;/\n\s*$/b;N;//b;s/\n/ /g}' file
If the current line is not empty, append the next line.
If the appended line is not empty, append the next line.
If that line is also not empty, replace the newlines by spaces.
In all other cases print the line(s) as is.
An alternative, that is more programmatic:
sed ':a;N;s/\n/&/2;Ta;/^\s*$/M{P;D};s/\n/ /g' file
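For comparison, here is a sketch that sticks to POSIX awk features and does not print a blank line after the last block. It assumes, as the answers above do, that the blocks in the input are separated by blank lines; the function name merge_triples is made up for illustration.

```shell
# merge_triples: hypothetical helper name.
# Assumption: blocks are separated by one or more blank lines.
merge_triples() {
    awk 'BEGIN { RS = ""; FS = "\n" }   # paragraph mode: one block per record, one line per field
    NR > 1 { print "" }                 # blank line between blocks, none after the last
    {
        # end the output line after every third field and after the last one
        for (i = 1; i <= NF; i++)
            printf "%s%s", $i, (i % 3 == 0 || i == NF ? "\n" : " ")
    }'
}

printf 'A\nB\nC\nD\nE\nF\n\nG\n\nH\nI\nJ\n' | merge_triples
```

Runs of several blank lines in the input collapse to a single blank line in the output, because paragraph mode treats them as one separator.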

How to add N blank lines between all rows of a text file?

I have a file that looks
a
b
c
d
Suppose I want to add N lines (in the example 3, but I actually need 20 or 100 depending on the file)
a



b



c



d
I can add one blank line between all of them with sed
sed -i '0~1 a\\' file
But sed -i '0~3 a\\' file inserts one line every 3 rows.
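With GNU sed you can use:
sed -i 'G;G;G' file
Each G appends a newline followed by the (empty) hold space, so G;G;G adds three empty lines after every line, including the last one. Use sed -i '$!{G;G;G}' file to skip the last line.
Or, awk (ORS needs four newlines: one to end the line itself plus three blank lines):
awk 'BEGIN{ORS="\n\n\n\n"};1' file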
See an online sed and awk demo.
If you need to set the number of blank lines dynamically, build ORS in a loop:
awk -v n=3 'BEGIN{ORS="\n"; for(c=0;c<n;c++) ORS=ORS "\n"};1' file > newfile
With GNU awk:
awk -i inplace -v lines=3 '{print; for(i=0;i<lines;i++) print ""}' file
Update with Ed's hints (see comments):
awk -i inplace -v lines=3 '{print; for(i=1;i<=lines;i++) print ""}' file
Update (without trailing empty lines):
awk -i inplace -v lines=3 'NR==1; NR>1{for(i=1;i<=lines;i++) print ""; print}' file
Output to file:
a



b



c



d
With sed and coreutils:
N=4
sed "\$b;$(yes G\; | head -n$N)" infile
Similar trick with awk (the variable to set is ORS, and it needs one extra newline to terminate the line itself):
N=4
awk 1 ORS="$(yes \\n | head -n$((N+1)) | tr -d '\n')" infile
This might work for you (GNU sed):
sed ':a;G;s/\n/&/2;Ta' file
This will add 2 blank lines following each line.
Change 2 to whatever number of blank lines you want between each line.
An alternative (more efficient?):
sed '1{x;:a;/^.\{2\}/!s/^/\n/;ta;s/.//;x};G' file
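For a version that takes the count as a parameter and needs neither GNU sed nor gawk, something like the following should work in any POSIX awk. It is a sketch only; the function name add_blank_lines is made up, and it writes to stdout instead of editing the file in place.

```shell
# add_blank_lines N: hypothetical helper name.
# Inserts N empty lines between consecutive input lines (none after the last line).
add_blank_lines() {
    awk -v n="$1" '
        # before every line except the first, emit n empty lines
        NR > 1 { for (i = 0; i < n; i++) print "" }
        { print }
    '
}

printf 'a\nb\nc\n' | add_blank_lines 3
```

To change a file, redirect to a temporary file and move it over the original, for example add_blank_lines 20 < file > file.new && mv file.new file.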

How to delete top and last non empty lines of the file

I want to delete the first and the last non-empty lines of the file.
Example:
cat test.txt
//blank_line
abc
def
xyz
//blank_line
qwe
mnp
//blank_line
Then output should be:
def
xyz
//blank_line
qwe
I have tried with commands
sed "$(awk '/./{line=NR} END{print line}' test.txt)d" test.txt
to remove the last non-empty line. This uses two commands, (1) sed and (2) awk, but I want to do it with a single command.
Reading the whole file in memory at once with GNU sed for -E and -z:
$ sed -Ez 's/^\s*\S+\n//; s/\n\s*\S+\s*$/\n/' test.txt
def
xyz

qwe
or with GNU awk for multi-char RS:
$ awk -v RS='^$' '{gsub(/^\s*\S+\n|\n\S+\s*$/,"")} 1' test.txt
def
xyz

qwe
Both GNU tools accept \s and \S as shorthand for [[:space:]] and [^[:space:]] respectively and GNU sed accepts the non-POSIX-sed-standard \n as meaning newline.
This is a double pass method:
awk '(NR==FNR) { if(NF) {t=FNR;if(!h) h=FNR}; next}
(h<FNR && FNR<t)' file file
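The integers h and t keep track of the head and the tail, that is, the line numbers of the first and last non-empty lines. With if(NF), lines that contain only blanks also count as empty. You could replace if(NF) by if(length($0)) to be more strict and treat only zero-length lines as empty.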
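The same head/tail bookkeeping can also be done on a stream that can only be read once, at the cost of keeping the lines in memory. This is a rough sketch, not the answer's method; the function name and the variable names first and last are assumptions.

```shell
# drop_first_last_nonempty: hypothetical helper name.
# Buffers the input, remembers the line numbers of the first and last
# non-empty lines, and prints only what lies strictly between them.
drop_first_last_nonempty() {
    awk '
        {
            line[NR] = $0
            if (NF) {                     # line has at least one non-blank field
                if (!first) first = NR    # head: first non-empty line
                last = NR                 # tail: last non-empty line seen so far
            }
        }
        END { for (i = first + 1; i < last; i++) print line[i] }
    '
}

printf '\nabc\ndef\nxyz\n\nqwe\nmnp\n\n' | drop_first_last_nonempty
```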
This one reads everything into memory and does a simple replace at the end:
$ awk '{b=b RS $0}
END{ sub(/^[[:blank:]\n]*[^\n]+\n/,"",b);
     sub(/\n[^\n]+[[:blank:]\n]*$/,"",b);
     print b }' file
A single-pass, fast and relatively memory-efficient approach utilising a buffer:
awk 'f {
    if(NF) {
        printf "%s",buf
        buf=""
    }
    buf=(buf $0 ORS)
    next
}
NF {
    f=1
}' file
Here is a golfed version of @kvantour's solution:
$ awk 'NR==(n=FNR){e=!NF?e:n;b=!b?e:b}b<n&&n<e' file{,}
This might work for you (GNU sed):
sed -E '0,/\S/d;H;$!d;x;s/.(.*)\n.*\S.*/\1/' file
Use a range to delete up to and including the first line containing a non-space character. Then copy the remainder of the file into the hold space and, at the end of the file, use a substitution to remove the last line containing a non-space character and any empty lines after it.
Alternative:
sed '0,/\S/d' file | tac | sed '0,/\S/d'| tac

Convert data format using awk?

There is a file which contains data in a 'n*1' format:
1
2
3
4
5
6
Is there any way to convert it to a 'n*3' format like:
1,2,3
4,5,6
via awk rather than using a for loop?
I really have no idea how to approach this. Any help or keyword is appreciated.
Using awk
$ awk '{printf "%s%s",$0,(NR%3==0?ORS:",")}' File
1,2,3
4,5,6
The command printf "%s%s",$0,(NR%3==0?ORS:",") tells awk to print two strings. The first is $0 which is the current line. The second string is NR%3==0?ORS:"," which is either ORS the output record separator (if the line number is a multiple of three) or else , for all other line numbers.
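The same idea extends to any number of columns. The sketch below prints the separator before each value instead of after it, so an incomplete last row does not end in a stray comma; the function name to_columns and its argument are made up for illustration.

```shell
# to_columns N: hypothetical helper name.
# Joins every N input lines into one comma-separated row.
to_columns() {
    awk -v n="$1" '
        {
            # first value: no separator; first value of a new row: newline; otherwise: comma
            sep = (NR == 1 ? "" : ((NR - 1) % n == 0 ? "\n" : ","))
            printf "%s%s", sep, $0
        }
        END { if (NR) print "" }   # terminate the final row
    '
}

seq 7 | to_columns 3
# 1,2,3
# 4,5,6
# 7
```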
Using sed
$ sed 'N;N;s/\n/,/g' File
1,2,3
4,5,6
By default, sed reads in each line from the file one by one. N tells sed to read in another line, appending the line to the current one, separated by a newline. N;N tells sed to do that twice so that we have a total of three lines in the pattern space. s/\n/,/g tells sed to replace those two separator newlines with commas. The result is then printed.
The above assumes that we are using GNU sed. With minor modifications, this can be made to work with BSD/OSX sed.
The most simple one - paste command:
paste -d, - - - <file
The output:
1,2,3
4,5,6
The following may also help:
xargs -n3 < Input_file | sed 's/ /,/g'
Try this:
awk 'NR%3==0{print;next}{printf "%s,",$0}' file
or decomposed :
NR%3==0 # condition: the line number is a multiple of 3
{print;next} # then print the line and skip to the next input line
{printf "%s,",$0} # otherwise print the current line followed by a comma, without a newline
$ awk '{ORS=(NR%3?",":RS)}1' file
1,2,3
4,5,6

awk to transpose lines of a text file

A .csv file that has lines like this:
20111205 010016287,1.236220,1.236440
It needs to read like this:
20111205 01:00:16.287,1.236220,1.236440
How do I do this in awk? Experimenting, I got this far. I need to do it in two passes I think. One sub to read the date&time field, and the next to change it.
awk -F, '{print;x=$1;sub(/.*=/,"",$1);}' data.csv
Use this awk command:
echo "20111205 010016287,1.236220,1.236440" | \
awk -F'[ ,]' '{printf "%s %s:%s:%s.%s,%s,%s\n", \
$1,substr($2,1,2),substr($2,3,2),substr($2,5,2),substr($2,7,3),$3,$4}'
Explanation:
-F'[ ,]': sets the field delimiter to either a space or a comma
printf "%s %s:%s:%s.%s,%s,%s\n": formats the output
substr($2,1,2), substr($2,3,2), ...: cut the second field ($2) into the desired pieces
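A variant that keeps the comma as the only field separator and rewrites just the first field, so it does not depend on the number of remaining columns, might look like this. It is a sketch under the assumption that the first field is always a date, a space, and a 9-digit HHMMSSmmm time; the function name is made up.

```shell
# format_timestamps: hypothetical helper name.
# Assumption: field 1 looks like "YYYYMMDD HHMMSSmmm".
format_timestamps() {
    awk -F, -v OFS=, '{
        split($1, dt, " ")    # dt[1] = date, dt[2] = HHMMSSmmm
        t = dt[2]
        # assigning to $1 rebuilds the record with OFS between the fields
        $1 = dt[1] " " substr(t, 1, 2) ":" substr(t, 3, 2) ":" substr(t, 5, 2) "." substr(t, 7)
        print
    }'
}

echo "20111205 010016287,1.236220,1.236440" | format_timestamps
# 20111205 01:00:16.287,1.236220,1.236440
```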
Or use this sed command:
echo "20111205 010016287,1.236220,1.236440" | \
sed 's/\([0-9]\{8\}\) \([0-9]\{2\}\)\([0-9]\{2\}\)\([0-9]\{2\}\)\([0-9]\{3\}\)/\1 \2:\3:\4.\5/g'
Explanation:
[0-9]\{8\}: first match an 8-digit pattern and save it as \1
[0-9]\{2\}...: after a space, match a 2-digit pattern three times and save the matches as \2, \3 and \4
[0-9]\{3\}: finally, match a 3-digit pattern and save it as \5
\1 \2:\3:\4.\5: format the output
sed is better suited to this job since it's a simple substitution on single lines:
$ sed -r 's/( ..)(..)(..)/\1:\2:\3./' file
20111205 01:00:16.287,1.236220,1.236440
but if you prefer here's GNU awk with gensub():
$ awk '{print gensub(/( ..)(..)(..)/,"\\1:\\2:\\3.",1)}' file
20111205 01:00:16.287,1.236220,1.236440