Sed command issue - awk

I have this file, exported from MariaDB, that looks like this:
name callerid secret context type host
1000 Omar Al-Ani <1000> op1000DIR MANAGEMENT friend dynamic
1001 Ammar Zigderly <1001> 1001 MANAGEMENT peer dynamic
1002 Lubna COO Office <1002> 1002 ELdefault peer dynamic
I want to convert it, using sed and awk, to this format:
[1000]
callerid=Omar Al-Ani <1000>
secret=op1000DIR
context=MANAGEMENT
type=friend
host=dynamic
[1001]
callerid=Ammar Zigderly <1001>
secret=1001
context=MANAGEMENT
type=peer
host=dynamic
[1002]
callerid=Lubna COO Office <1002>
secret=1002
context=ELdefault
type=peer
host=dynamic
This is the output of the command head -3 filename | od -c on the input file:
0000000 n a m e \t c a l l e r i d \t s e
0000020 c r e t \t c o n t e x t \t t y p
0000040 e \t h o s t \n 1 0 0 0 \t O m a
0000060 r A l - A n i < 1 0 0 0 >
0000100 \t o p 1 0 0 0 D I R \t M A N A
0000120 G E M E N T \t f r i e n d \t d y
0000140 n a m i c \n 1 0 0 1 \t A m m
0000160 a r Z i g d e r l y < 1 0 0
0000200 1 > \t 1 0 0 1 \t M A N A G E
0000220 M E N T \t p e e r \t d y n a m i
0000240 c \n
0000243
Any idea would be helpful!

I think awk is going to be a bit simpler, and easier to modify if requirements change:
awk -F'\t' '
BEGIN { labels[2]="callerid"
        labels[3]="secret"
        labels[4]="context"
        labels[5]="type"
        labels[6]="host"
      }
FNR>1 { gsub(/ /,"",$1)                      # remove spaces from 1st column
        printf "[%s]\n",$1
        for (i=2;i<=6;i++)
            printf "\t%s=%s\n", labels[i],$i
        print ""
      }
' names.dat
This generates:
[1000]
callerid=Omar Al-Ani <1000>
secret=op1000DIR
context=MANAGEMENT
type=friend
host=dynamic
[1001]
callerid=Ammar Zigderly <1001>
secret=1001
context=MANAGEMENT
type=peer
host=dynamic
[1002]
callerid=Lubna COO Office <1002>
secret=1002
context=ELdefault
type=peer
host=dynamic
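If the data is still sitting in MariaDB rather than in a flat file, the mysql/mariadb client's batch mode (-B) already prints tab-separated rows with a header line, so the same script can read it straight from a pipe. The database, table and column source below are only placeholders for whatever the OP actually has:

mysql -B -e 'SELECT name, callerid, secret, context, type, host FROM sip_users' mydb |
awk -F'\t' '
BEGIN { labels[2]="callerid"; labels[3]="secret"; labels[4]="context"
        labels[5]="type";     labels[6]="host" }
FNR>1 { gsub(/ /,"",$1)          # FNR>1 skips the header row; gsub strips spaces from column 1
        printf "[%s]\n",$1
        for (i=2;i<=6;i++) printf "\t%s=%s\n", labels[i],$i
        print "" }'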

Assuming tab-separated fields:
$ awk -F'\t' 'NR==1 {split($0,h); next}
              {print "[" $1 "]";
               for(i=2;i<=NF;i++) print "\t" h[i] ":" $i}' file.tcv
[1000]
callerid:Omar Al-Ani <1000>
secret:op1000DIR
context:MANAGEMENT
type:friend
host:dynamic
[1001]
callerid:Ammar Zigderly <1001>
secret:1001
context:MANAGEMENT
type:peer
host:dynamic
[1002]
callerid:Lubna COO Office <1002>
secret:1002
context:ELdefault
type:peer
host:dynamic
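Note that this variant prints "key:value"; to get the "key=value" form asked for in the question, just swap the separator in the final print:

awk -F'\t' 'NR==1 {split($0,h); next}
            {print "[" $1 "]";
             for(i=2;i<=NF;i++) print "\t" h[i] "=" $i}' file.tcv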


Find and replace and move a line that contains a specific string

Assuming I have the following text file:
a b c d 1 2 3
e f g h 1 2 3
i j k l 1 2 3
m n o p 1 2 3
How do I replace '1 2 3' with '4 5 6' in the line that contains the letter (e) and move it after the line that contains the letter (k)?
N.B. the line that contains the letter (k) may come at any location in the file; the lines are not assumed to be in any particular order.
My approach is:
Remove the line I want to replace
Find the lines before the line I want to move it after
Find the lines after the line I want to move it after
Append the output to a file
grep -v 'e' $original > $file
grep -B999 'k' $file > $output
grep 'e' $original | sed 's/1 2 3/4 5 6/' >> $output
grep -A999 'k' $file | tail -n+2 >> $output
rm $file
mv $output $original
but there are a lot of issues with this solution:
a lot of grep commands that seem unnecessary
the -B999 and -A999 arguments assume the file does not contain more than 999 lines; it would be better to have another way to get the lines before and after the matched line
I am looking for a more efficient way to achieve this.
Using sed
$ sed '/e/{s/1 2 3/4 5 6/;h;d};/k/{G}' input_file
a b c d 1 2 3
i j k l 1 2 3
e f g h 4 5 6
m n o p 1 2 3
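If the one-liner is hard to follow, the same script can be written out long-hand with comments (behaviour unchanged):

sed '
  # on the line containing "e": make the replacement, save the line
  # in the hold space and delete it from the output for now
  /e/ {
    s/1 2 3/4 5 6/
    h
    d
  }
  # on the line containing "k": append the held line after it
  /k/ G
' input_file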
Here is a GNU awk solution (\< and \> are GNU word-boundary operators, hence the POSIX alternative below):
awk '
/\<e\>/ {
    s=$0
    sub("1 2 3", "4 5 6", s)
    next
}
/\<k\>/ && s {
    printf("%s\n%s\n",$0,s)
    next
} 1
' file
Or POSIX awk:
awk '
function has(x) {
    for(i=1; i<=NF; i++) if ($i==x) return 1
    return 0
}
has("e") {
    s=$0
    sub("1 2 3", "4 5 6", s)
    next
}
has("k") && s {
    printf("%s\n%s\n",$0,s)
    next
} 1
' file
Either prints:
a b c d 1 2 3
i j k l 1 2 3
e f g h 4 5 6
m n o p 1 2 3
This works regardless of the order of e and k in the file (note the file is passed twice, because it is read in two passes):
awk '
function has(x) {
    for(i=1; i<=NF; i++) if ($i==x) return 1
    return 0
}
has("e") {
    s=$0
    sub("1 2 3", "4 5 6", s)
    next
}
FNR<NR && has("k") && s {
    printf("%s\n%s\n",$0,s)
    s=""
    next
}
FNR<NR
' file file
This awk should work for you:
awk '
/(^| )e( |$)/ {
    sub(/1 2 3/, "4 5 6")
    p = $0
    next
}
1
/(^| )k( |$)/ {
    print p
    p = ""
}' file
a b c d 1 2 3
i j k l 1 2 3
e f g h 4 5 6
m n o p 1 2 3
This might work for you (GNU sed):
sed -n '/e/{s/1 2 3/4 5 6/;s#.*#/e/d;/k/s/.*/\&\\n&/#p};' file | sed -f - file
Design a sed script by passing the file twice and applying the sed instructions from the first pass to the second.
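To see what is going on, this is the intermediate script that the first sed pass generates from the sample input and feeds to the second sed via -f -:

/e/d;/k/s/.*/&\ne f g h 4 5 6/

i.e. delete the original e line, and rewrite the k line as itself followed by a newline and the fixed-up e line.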
Another solution is to use ed (the first command fixes the e line in place, /e/m/k/ then moves that line to after the line matching k, and wq writes the file and quits):
cat <<\! | ed file
/e/s/1 2 3/4 5 6/
/e/m/k/
wq
!
Or if you prefer:
<<<$'/e/s/1 2 3/4 5 6/\n.m/k/\nwq' ed -s file

How to print out lines starting with keyword and connected with backslash with sed or awk

For example, I'd like to print out the lines starting with set_2 and continued with \ like this. I'd like to know whether it's possible to do it with sed, awk or any other text-processing command-line tool.
< Before >
set_1 abc def
set_2 a b c d\
e f g h\
i j k l\
m n o p
set_3 ghi jek
set_2 aaa bbb\
ccc ddd\
eee fff
set_4 1 2 3 4
< After text process >
set_2 a b c d\
e f g h\
i j k l\
m n o p
set_2 aaa bbb\
ccc ddd\
eee fff
Try the following:
awk -v st="set_2" '/^set/ {set=$1} /\\$/ && set==st { prnt=1 } prnt==1 { print } !/\\$/ { prnt=0 }' file
Explanation:
awk -v st="set_2" '       # Pass the set to track as a variable st
/^set/ {
    set=$1                # When the line begins with "set", track the set in the variable set
}
/\\$/ && set==st {
    prnt=1                # When we are in the required set block and the line ends with "\", set a print marker (prnt) to 1
}
prnt==1 {
    print                 # When the print marker is 1, print the line
}
!/\\$/ {
    prnt=0                # If the line does not end with "\", set the print marker back to 0
}' file
Would you please try the following sed solution:
sed -nE '
/^set_2/ { ;# if the line starts with "set_2" execute the block
:a ;# define a label "a"
/\\[[:space:]]*$/! {p; bb} ;# if the line does not end with "\", print the pattern space and exit the block
N ;# append the next line to the pattern space
ba ;# go to label "a"
} ;# end of the block
:b ;# define a label "b"
' file
Please note the character class [[:space:]]* is included just because the OP's posted example contains whitespace after the backslash.
[Alternative]
If perl is your option, the following will also work (the .. range operator starts printing at a line matching ^set_2 and stops after the first line that does not end with a backslash):
perl -ne 'print if /^set_2/..!/\\\s*$/' file
This simple awk command should do the job (note it relies on the continuation lines beginning with whitespace, so that only lines starting a new set_ block update the flag p):
awk '!/^[[:blank:]]/ {p = ($1 == "set_2")} p' file
set_2 a b c d\
e f g h\
i j k l\
m n o p
set_2 aaa bbb\
ccc ddd\
eee fff
And with this awk:
awk -F'[[:blank:]]*' '$1 == "set_2" || $NF ~ /\$/ {print $0;f=1} f && $1 == ""' file
set_2 a b c d\
e f g h\
i j k l\
m n o p
set_2 aaa bbb\
ccc ddd\
eee fff
This might work for you (GNU sed):
sed ':a;/set_2/{:b;n;/set_/ba;bb};d' file
If a line contains set_2, print it and keep printing until another line containing set_ is read, then repeat the first test on that line.
Otherwise delete the line.
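Spelled out with comments (GNU sed, behaviour unchanged):

sed '
  :a
  # a line containing "set_2" starts a block we want to keep
  /set_2/ {
    :b
    # print the current line and fetch the next one
    n
    # if the fetched line begins a new set_ entry, test it from the top
    /set_/ ba
    # otherwise it is a continuation line; keep printing
    bb
  }
  # every other line is deleted
  d
' file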

add filename without the extension at certain columns using awk

I would like to leave the first four columns empty, then add the filename without its extension in the last four columns. My files are named like file.frq, and so on. Later I will apply this to 200 files in a loop.
input
CHR POS REF ALT AF HOM Het Number of animals
1 94980034 C T 0 0 0 5
1 94980057 C T 0 0 0 5
Desired output
file file file file
CHR POS REF ALT AF HOM Het Number of animals
1 94980034 C T 0 0 0 5
1 94980057 C T 0 0 0 5
I tried this, from Add file name and empty column to existing file in awk:
awk '{$0=(NR==1? " \t"" \t"" \t"" \t":FILENAME"\t") "\t" $0}7' file2.frq
But it gave me this:
CHR POS REF ALT AF HOM Het Number of animals
file2.frq 1 94980034 C T 0 0 0 5
file2.frq 1 94980057 C T 0 0 0 5
file2.frq 1 94980062 G C 0 0 0 5
I also tried this:
awk -v OFS="\t" '{print FILENAME, $1=" ",$2=" ",$3=" ", $4=" ",$5 - end}' file2.frq
but it gave me this:
CHR POS REF ALT AF HOM Het Number of animals
file2.frq 1 94980034 C T 0 0 0 5
file2.frq 1 94980057 C T 0 0 0 5
Any help will be appreciated!
Assuming your input is tab-separated like your desired output:
awk '
BEGIN { FS=OFS="\t" }
NR==1 {
    orig = $0
    fname = FILENAME
    sub(/\.[^.]*$/,"",fname)
    $1=$2=$3=$4 = ""
    $5=$6=$7=$8 = fname
    print
    $0 = orig
}
1' file.txt
file file file file
CHR POS REF ALT AF HOM Het Number of animals
1 94980034 C T 0 0 0 5
1 94980057 C T 0 0 0 5
To see it in table format:
$ awk '
BEGIN { FS=OFS="\t" }
NR==1 {
    orig = $0
    fname = FILENAME
    sub(/\.[^.]*$/,"",fname)
    $1=$2=$3=$4 = ""
    $5=$6=$7=$8 = fname
    print
    $0 = orig
}
1' file.txt | column -s$'\t' -t
file file file file
CHR POS REF ALT AF HOM Het Number of animals
1 94980034 C T 0 0 0 5
1 94980057 C T 0 0 0 5
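For the 200-file loop mentioned in the question, one possible sketch: save the awk program above in a file, say addname.awk (the name is just an example), and run it over every .frq file in the current directory, writing each result next to its input (the .out extension is also just an example):

for f in *.frq; do
    awk -f addname.awk "$f" > "${f%.frq}.out"
done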

Counter in awk if-else loop

Can you explain to me why this simple one-liner does not work? Thanks for your time.
awk 'BEGIN{i=1}{if($2 == i){print $0} else{print "0",i} i=i+1}' check
input text file with name "check":
a 1
b 2
c 3
e 5
f 6
g 7
desired output:
a 1
b 2
c 3
0 4
e 5
f 6
g 7
output received:
a 1
b 2
c 3
0 4
0 5
0 6
awk 'BEGIN{i=1}{ if($2 == i){print $0; } else{print "0",i++; print $0 } i++ }' check
Increment i one more time in the else (you are inserting a new line).
Print the current line in the else, too.
This works only if there is only one line missing between the present lines; otherwise you need a loop printing the missing lines.
Or simplified:
awk 'BEGIN{i=1}{ if($2 != i){print "0",i++; } print $0; i++ }' check
Yours is broken because:
1. you read the next line ("e 5"),
2. $2 is not equal to your counter,
3. you print the placeholder line and increment your counter (to 5),
4. you do not print the current line,
5. you read the next line ("f 6"),
6. goto 2
A while loop is warranted here -- that will also handle the case when you have gaps greater than a single number.
awk '
NR == 1 {prev = $2}
{
    while ($2 > prev+1)
        print "0", ++prev
    print
    prev = $2
}
' check
or, if you like impenetrable one-liners:
awk 'NR==1{p=$2}{while($2>p+1)print "0",++p;p=$2}1' check
All you need is:
awk '{while (++i<$2) print 0, i}1' file
Look:
$ cat file
a 1
b 2
c 3
e 5
f 6
g 7
k 11
n 14
$ awk '{while (++i<$2) print 0, i}1' file
a 1
b 2
c 3
0 4
e 5
f 6
g 7
0 8
0 9
0 10
k 11
0 12
0 13
n 14
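The same one-liner, spelled out with comments:

awk '
{
    # before printing the current line, catch the counter i up to $2,
    # emitting a "0 <n>" placeholder for every number that is missing
    while (++i < $2)
        print 0, i
}
1   # then print the input line itself
' file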

awk merge columns from multiple files, append different values and remove same values

I have two files:
try7.txt
a 32145
b eioue
c 32654895
d bdefgac
e kkloi
f 6549465
g test123452
h est0124358
try8.txt
a 32145562
b eioueddf
c 32654
d bdefgac
e kkloi
f 6549465dww
g test123
h est0124358df
i 63574968fd
j dfsdfcd5
desired output:
a 32145562 32145
b eioueddf eioue
c 32654 32654895
d bdefgac 0
e kkloi 0
f 6549465dww 6549465
g test123 test123452
h est0124358df est0124358
i 63574968fd 0
j dfsdfcd5 0
actual output:
a 32145562 32145
b eioueddf eioue
c 32654 32654895
d bdefgac bdefgac
e kkloi kkloi
f 6549465dww 6549465
g test123 test123452
h est0124358df est0124358
i 63574968fd 0
j dfsdfcd5 0
The code I found:
awk 'NR==FNR{a[$1]=$2;next}
     {if($1 in a){print $0,a[$1];delete a[$1]}
      else print $0,"0"}
     END{for(x in a)print x,"0",a[x]}' try7.txt try8.txt|sort -n|column -t
How do I modify this code to meet my requirement?
A bit lengthy:
awk 'FNR==NR{a[$1]=$2; next}
     ($1 in a) && a[$1] != $2 {print $1, $2, a[$1]}
     ($1 in a) && a[$1] == $2 {print $1, $2, "0"}
     !($1 in a) {print $1, $2, 0}' try7.txt try8.txt
This will give output as:
a 32145562 32145
b eioueddf eioue
c 32654 32654895
d bdefgac 0
e kkloi 0
f 6549465dww 6549465
g test123 test123452
h est0124358df est0124358
i 63574968fd 0
j dfsdfcd5 0
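If you also need to handle keys that exist only in try7.txt (there are none in the sample data), you can keep the END block from the code in the question and delete each key once it has been matched, so the END loop only reports the leftovers:

awk 'FNR==NR {a[$1]=$2; next}
     ($1 in a) && a[$1] != $2 {print $1, $2, a[$1]; delete a[$1]; next}
     ($1 in a) && a[$1] == $2 {print $1, $2, "0"; delete a[$1]; next}
     {print $1, $2, 0}
     END {for (x in a) print x, 0, a[x]}' try7.txt try8.txt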