How to merge these codes, awk then cut


I am using awk on Debian.
Input:
11.22.33.44#55878:
11.22.33.43#55879:
...
...
(smtp:55.66.77.88)
(smtp:55.66.77.89)
...
...
cpe-33-22-11-99.buffalo.res.rr.com[99.11.22.33]
cpe-34-22-11-99.buffalo.res.rr.com[99.11.22.34]
...
Part of my sh script (running on Debian):
awk '/#/ {print > "file1";next} \
/smtp/ {print > "file2";next} \
{print > "file7"}' input
#
if [ -s file1 ] ; then
#IP type => 11.22.33.44#55878:
cut -d'#' -f1 file1 >> output
rm -f file1
fi
#
if [ -s file2 ] ; then
#IP type => (smtp:55.66.77.88)
cut -d':' -f2 file2 | cut -d')' -f1 >> output
rm -f file2
fi
#
if [ -s file7 ] ; then
#IP type => cpe-33-22-11-99.buffalo.res.rr.com[99.11.22.33]
cut -d'[' -f2 file7 | cut -d']' -f1 >> output
rm -f file7
fi
The resulting output:
11.22.33.44
11.22.33.43
55.66.77.88
55.66.77.89
99.11.22.33
99.11.22.34
Is it possible to merge these steps using only awk, something like:
awk '/#/ {print | cut -d'#' -f1 > "file1";next} \
/smtp/ {print | cut -d':' -f2 | cut -d')' -f1 > "file2";next} \
{print | cut -d'[' -f2 file7 | cut -d']' > "file7"}' input
I am new to awk and have no idea how to do this; searching existing questions did not help.
Any hints?
Thanks.

$ awk -F'[][()#]|smtp:' '/#/{print $1;next} /smtp/{print $3;next} /\[/{print $2}' input
11.22.33.44
11.22.33.43
55.66.77.88
55.66.77.89
99.11.22.33
99.11.22.34
To save this in the file output:
awk -F'[][()#]|smtp:' '/#/{print $1;next} /smtp/{print $3;next} /\[/{print $2}' input >output
How it works
-F'[][()#]|smtp:'
This sets the field separator to (a) any of the characters ][()# or (b) the string smtp:.
/#/{print $1;next}
If the line contains #, then print the first field and skip to the next line.
/smtp/{print $3;next}
If the line contains smtp, then print the third field and skip to the next line.
/\[/{print $2}
If the line contains [, then print the second field.
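To see why the smtp lines need the third field rather than the first or second, it can help to dump every field awk produces with this separator. The following is a minimal sketch, assuming one sample line of each type from the question; the file name sample_input is only for illustration.

```shell
# One sample line of each type from the question.
printf '%s\n' \
  '11.22.33.44#55878:' \
  '(smtp:55.66.77.88)' \
  'cpe-33-22-11-99.buffalo.res.rr.com[99.11.22.33]' > sample_input

# Print every field wrapped in <> so that empty fields are visible.
fields=$(awk -F'[][()#]|smtp:' '{
    line = ""
    for (i = 1; i <= NF; i++) line = line (i > 1 ? " " : "") "$" i "=<" $i ">"
    print line
}' sample_input)
printf '%s\n' "$fields"
```

For the smtp line, both the opening parenthesis and the string smtp: act as separators, so the first two fields are empty and the address lands in the third field.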
Variation
There is more than one way to solve this problem. For example, using a slightly different field separator, we can still get the desired output:
$ awk -F'[][()#:]' '/#/{print $1;next} /smtp/{print $3;next} /\[/{print $2}' input
11.22.33.44
11.22.33.43
55.66.77.88
55.66.77.89
99.11.22.33
99.11.22.34
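Another possible variation, not part of the answer above, avoids field separators entirely and pulls out the first dotted quad on each line with match(). This is only a sketch: it assumes every line contains exactly one address-like token of four dot-separated numbers, which holds for the sample lines but would not hold for a hostname that itself contains such a token.

```shell
# One sample line of each type from the question.
printf '%s\n' \
  '11.22.33.44#55878:' \
  '(smtp:55.66.77.88)' \
  'cpe-33-22-11-99.buffalo.res.rr.com[99.11.22.33]' > sample_input

# match() sets RSTART and RLENGTH to the leftmost-longest match of the regex.
ips=$(awk 'match($0, /[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+/) {
    print substr($0, RSTART, RLENGTH)
}' sample_input)
printf '%s\n' "$ips"
```

The dash-separated numbers in the hostname are skipped because the pattern requires literal dots between the four numbers.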

Related

Replace end of line with comma and put parenthesis in sed/awk

I am trying to process the contents of a file from this format:
this1,EUR
that2,USD
other3,GBP
to this format:
this1(EUR),that2(USD),other3(GBP)
The result should be a single line.
As of now I have come up with this circuit of commands that works fine:
cat myfile | sed -e 's/,/\(/g' | sed -e 's/$/\)/g' | tr '\n' , | awk '{print substr($0, 0, length($0)- 1)}'
Is there a simpler way to do the same by just an awk command?

Another awk:
$ awk -F, '{ printf "%s%s(%s)", c, $1, $2; c = ","} END { print ""}' file
this1(EUR),that2(USD),other3(GBP)
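The one-liner relies on a common awk idiom: print the separator before each record, and only set it after the first record has been printed. Here is the same logic laid out with comments, as a sketch using the question's three sample lines (the variable name sep and the file name sample_pairs are chosen for illustration):

```shell
printf '%s\n' 'this1,EUR' 'that2,USD' 'other3,GBP' > sample_pairs

joined=$(awk -F, '
{
    # sep is empty for the first record and "," for every later one,
    # so no trailing comma is ever printed.
    printf "%s%s(%s)", sep, $1, $2
    sep = ","
}
END { print "" }   # terminate the single output line with a newline
' sample_pairs)
printf '%s\n' "$joined"
```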

The following awk may also help:
awk -F, '{val=val?val OFS $1"("$2")":$1"("$2")"} END{print val}' OFS=, Input_file

Toying around with separators and gsub:
$ awk 'BEGIN{RS="";ORS=")\n"}{gsub(/,/,"(");gsub(/\n/,"),")}1' file
this1(EUR),that2(USD),other3(GBP)
Explained:
$ awk '
BEGIN {
RS="" # record ends in an empty line, not newline
ORS=")\n" # the last )
}
{
gsub(/,/,"(") # replace commas with (
gsub(/\n/,"),") # and newlines with ),
}1' file # output

Using paste+sed
$ # paste -s will combine all input lines to single line
$ seq 3 | paste -sd,
1,2,3
$ paste -sd, ip.txt
this1,EUR,that2,USD,other3,GBP
$ # post processing to get desired format
$ paste -sd, ip.txt | sed -E 's/,([^,]*)(,?)/(\1)\2/g'
this1(EUR),that2(USD),other3(GBP)

How to grep the outputs of awk, line by line?

Let's say I have the following text file:
$ cat file1.txt
MarkerName Allele1 Allele2 Freq1 FreqSE P-value Chr Pos
rs2326918 a g 0.8510 0.0001 0.5255 6 130881784
rs2439906 c g 0.0316 0.0039 0.8997 10 6870306
rs10760160 a c 0.5289 0.0191 0.8107 9 123043147
rs977590 a g 0.9354 0.0023 0.8757 7 34415290
rs17278013 t g 0.7498 0.0067 0.3595 14 24783304
rs7852050 a g 0.8814 0.0006 0.7671 9 9151167
rs7323548 a g 0.0432 0.0032 0.4555 13 112320879
rs12364336 a g 0.8720 0.0015 0.4542 11 99515186
rs12562373 a g 0.7548 0.0020 0.6151 1 164634379
Here is an awk command which prints MarkerName if Pos >= 11000000
$ awk '{ if($8 >= 11000000) { print $1 }}' file1.txt
This command outputs the following:
MarkerName
rs2326918
rs10760160
rs977590
rs17278013
rs7323548
rs12364336
rs12562373
Question: I would like to feed this into a grep statement to parse another text file, textfile2.txt. Somehow, one pipes the output from the previous awk command into grep AWKOUTPUT textfile2.txt
I would like each row of the awk command above to be grepped against textfile2.txt, i.e.
grep "rs2326918" textfile2.txt
## and then
grep "rs10760160" textfile2.txt
### and then
...
Naturally, I would save all resulting rows from textfile2.txt into a final file, i.e.
$ awk '{ if($8 >= 11000000) { print $1 }}' file1.txt | grep PIPE_OUTPUT_BY_ROW textfile2.txt > final.txt
How does one grep from a pipe line by line?
EDIT: To clarify, the one constraint I have is that file1.txt is actually the output of a previous pipe. (I'm trying to simplify the question somewhat.) How would that change the answer?

awk + grep solution:
grep -f <(awk '$8 >= 11000000{ print $1 }' file1.txt) textfile2.txt > final.txt
-f file - obtain patterns from file, one per line
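When file1.txt is itself the output of an earlier pipeline, as the EDIT in the question describes, one option is to let grep read its patterns from standard input with -f - instead of using process substitution. The sketch below assumes GNU grep, where a file name of - for -f means standard input. The contents of textfile2.txt are invented for illustration, and cat stands in for the upstream pipeline.

```shell
# Work in a scratch directory so no existing files are overwritten.
workdir=$(mktemp -d) && cd "$workdir" || exit 1

# Sample data: a few rows of file1.txt and a hypothetical textfile2.txt.
printf '%s\n' \
  'MarkerName Allele1 Allele2 Freq1 FreqSE P-value Chr Pos' \
  'rs2326918 a g 0.8510 0.0001 0.5255 6 130881784' \
  'rs2439906 c g 0.0316 0.0039 0.8997 10 6870306' \
  'rs977590 a g 0.9354 0.0023 0.8757 7 34415290' > file1.txt
printf '%s\n' 'rs2326918 geneA' 'rs2439906 geneB' 'rs977590 geneC' 'rs9775901 geneD' > textfile2.txt

# NR > 1 skips the header line; $8 + 0 forces a numeric comparison.
# -F treats each pattern as a fixed string, -w matches whole words only.
cat file1.txt |
  awk 'NR > 1 && $8 + 0 >= 11000000 { print $1 }' |
  grep -F -w -f - textfile2.txt > final.txt

cat final.txt
```

The -w flag matters here: without it, the pattern rs977590 would also match the longer identifier rs9775901.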

You can use bash to do this:
bash-3.1$ echo "rs2326918" > filename2.txt
bash-3.1$ (for i in `awk '{ if($8 >= 11000000) { print $1 }}' file1.txt |
grep -v MarkerName`; do grep $i filename2.txt; done) > final.txt
bash-3.1$ cat final.txt
rs2326918
Alternatively,
bash-3.1$ cat file1.txt | (for i in `awk '{ if($8 >= 11000000) { print $1 }}' |
grep -v MarkerName`; do grep $i filename2.txt; done) > final.txt
The switch grep -v tells grep to reverse its usual activity and print all lines that do not match the pattern. This switch "inVerts" the match.

awk alone can do this for you (note the threshold is 11000000, as in the question):
$ awk 'NR>1 && NR==FNR {if ($8 >= 11000000) a[$1]++;next} \
{ for(i in a){if($0~i) print}}' file1.txt textfile2.txt > final.txt

Merge commands line

I have these command lines:
grep -e "[0-9] ERROR" /home/aa/lab/utb/cic/nova-all.log | awk '{ print $6 }' | awk -F'-' '{print $3""$2""$1}' | cut -c 1-4,7-8 > part1date.txt
grep -e "[0-9] ERROR" /home/aa/lab/utb/cic/nova-all.log | awk '{ print $3" "$4" "$5" "$9 }' > part1rest.txt
grep -e "[0-9] ERROR" /home/aa/lab/utb/cic/nova-all.log | awk '{ s = ""; for (i = 15; i <= NF; i++) s = s $i " "; print s}' > part1end.txt
paste -d \ part1date.txt part1rest.txt part1end.txt > temp.txt
rm part1*
cat temp.txt
The first three commands each save their output to a text file.
I then merge the columns of these files into one file to show the output.
Can someone help me do the same in a single command, without saving to intermediate text files?
These commands change the standard log output:
sep 10 11:13:55 node-20 nova-scheduler 2014-10-12 10:36:55.675 3817 ERROR nova.scheduler....
to this format:
ddmmyy hh:mm:ss node-xx PROCESS LOGLEVEL MESSAGE
In other words, they rearrange the columns and change the date format.

awk '/[0-9] ERROR/{gsub("-","",$6);$2=$6;$6=$9;for(i=0;++i<=NF;)$i=i<6?$(i+1):$(i+9);NF-=9;print}' file
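For comparison, here is a more verbose sketch that mirrors the three commands from the question step by step in a single awk call: it builds the ddmmyy date from field 6, appends fields 3, 4, 5 and 9, and then appends everything from field 15 onward. The log lines are hypothetical. The question truncates its sample after "nova.scheduler", so the remaining tokens and the file name sample_nova.log are invented purely to give the line at least 15 fields.

```shell
# Hypothetical log lines; everything after "nova.scheduler" is invented for illustration.
printf '%s\n' \
  'sep 10 11:13:55 node-20 nova-scheduler 2014-10-12 10:36:55.675 3817 ERROR nova.scheduler.driver [req-1 u1 t1 -] example failure message' \
  'sep 10 11:13:56 node-20 nova-scheduler 2014-10-12 10:36:56.001 3817 INFO nova.scheduler.driver [req-2 u1 t1 -] ignored line' > sample_nova.log

formatted=$(awk '/[0-9] ERROR/ {
    split($6, d, "-")                    # d[1]=yyyy, d[2]=mm, d[3]=dd
    out = d[3] d[2] substr(d[1], 3, 2)   # ddmmyy
    out = out " " $3 " " $4 " " $5 " " $9
    for (i = 15; i <= NF; i++) out = out " " $i
    print out
}' sample_nova.log)
printf '%s\n' "$formatted"
```

Building the output in a separate string avoids reassigning fields in place, which makes it easier to check each piece against the original grep, awk and cut pipeline.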

multiple field separator in awk

I'm trying to process an input which has two field separators, ; and space. I'm able to parse the input with one separator using:
echo "10.23;7.15;6.23" | awk -v OFMF="%0.2f" 'BEGIN{FS=OFS=";"} {print $1,$2,$3}'
10.23;7.15;6.23
For an input with two separators, I tried this, but it doesn't split on both separators:
echo "10.23;7.15 6.23" | awk -v OFMF="%0.2f" 'BEGIN{FS=OFS=";" || " "} {print $1,$2,$3}'

You want to set FS to a character list:
awk -F'[; ]' 'script' file
and the other builtin variable you're trying to set is named OFMT, not OFMF:
$ echo "10.23;7.15 6.23" | awk -F'[; ]' -v OFMT="%0.2f" '{print $1,$2,$3}'
10.23 7.15 6.23
$ echo "10.23;7.15 6.23" | awk 'BEGIN{FS="[; ]"; OFS=";"; OFMT="%0.2f"} {print $1,$2,$3}'
10.23;7.15;6.23
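In the examples above OFMT has no visible effect, because fields read from the input are printed exactly as they appeared. OFMT only applies when print outputs a non-integer number. A small sketch of the difference, using sample values with more than two decimals (chosen here for illustration):

```shell
# Fields read from input are printed exactly as they appeared, so OFMT is not applied.
as_text=$(echo "10.234;7.156 6.2" | awk -F'[; ]' -v OFMT="%0.2f" '{ print $1, $2, $3 }')

# Adding 0 turns each field into a number, and print then formats it with OFMT.
as_number=$(echo "10.234;7.156 6.2" | awk -F'[; ]' -v OFMT="%0.2f" '{ print $1 + 0, $2 + 0, $3 + 0 }')

printf '%s\n' "$as_text" "$as_number"
```

The first line prints 10.234 7.156 6.2 unchanged, while the second prints 10.23 7.16 6.20.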

Sort and print in line

Input:
54578787 -58 1
6578999 -658- 3
1352413 -541- 11
4564564 -23- 11
654564 -65- 3
6543564 -65- 1
Desired output:
column3 = 1,3,11
Using:
a=$(awk '{print $3}' text | sort -u | paste -s -d,) && paste <(echo "column3 =") <(echo $a)
I only get:
column3 = [large blank] 1,11,3
Other issue: If I remove all hyphens on the second column, I get
column3 = [large blank] ,1,11,3
I think it's a paste command issue.
Last but not least: why do I have 1,11,3 instead of 1,3,11?

I would just use awk:
$ awk '{a[$3]} END {printf "column3 = "; for (i in a) {printf "%d%s", i, (++v==length(a)?"\n":",")}}' file
column3 = 1,3,11
Explanation
a[$3] populate the a[] array with the 3rd column. This way, any new value will create a new index.
END {} perform commands after processing the whole file.
printf "column3 = " prints "column3 =".
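for (i in a) {printf "%d%s", i, (++v==length(a)?"\n":",")} loops through the stored values and prints them comma separated, ending the last one with a newline instead. Note that awk does not specify the order in which for (i in a) visits the indices, so numerically sorted output is not guaranteed with this approach.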
Your current solution would work like this:
$ paste -d" " <(echo "column3 =") <(awk '{print $3}' file | sort -u | paste -s -d,)
column3 = 1,11,3
Note there is no need to store in $a. And to have just one space, use paste -d" ".
And to have it sorted numerically? Just add -n to your sort:
$ paste -d" " <(echo "column3 =") <(awk '{print $3}' file | sort -nu | paste -s -d,)
column3 = 1,3,11
With this command you get the same output, no matter the hyphens.

You can do something like
echo "column3 = $(awk '{print $3}' test.txt |sort -nu | paste -s -d, )"
gives me
column3 = 1,3,11
One key element is to sort with the -n option to do numerical sorting.
It also works with the hyphens deleted:
echo "column3 = $(tr -d - < test.txt| awk '{print $3}' |sort -nu | paste -s -d, )"
also outputs
column3 = 1,3,11

If perl is acceptable:
perl -lanE '
$c3{$F[2]} = 1;
END {say "column3 = ", join(",", sort {$a <=> $b} keys %c3)}
' file

my gawk line looks like:
awk '{a[$3]} END{c=asorti(a,d,"@ind_num_asc"); printf "column3 = ";
for(x=1;x<=c;x++)printf "%d%s", d[x],(c==x?"\n":",")}' file
output:
column3 = 1,3,11
Note
you need gawk to run that (asorti function)
sorting ascending as numbers
output in single line.

Assuming you truly want the numbers sorted and not just reproduced in the order they are first seen:
$ awk '{print $3}' file | sort -nu | awk '{s=(NR>1?s",":"")$0} END{print "column3 =",s}'
column3 = 1,3,11
You were getting 1,11,3 because without the -n arg for sort you are sorting alphabetically instead of numerically and the first char of 11 (i.e. 1) comes before the first char of 3.
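The difference between the two sort modes can be checked in isolation. This short sketch feeds the third-column values from the question through both variants:

```shell
# Without -n, sort compares the values as strings: "11" sorts before "3".
lexical=$(printf '%s\n' 1 3 11 11 3 1 | sort -u | paste -s -d, -)

# With -n, sort compares the values as numbers.
numeric=$(printf '%s\n' 1 3 11 11 3 1 | sort -nu | paste -s -d, -)

printf '%s\n' "$lexical" "$numeric"
```

The first line prints 1,11,3 and the second prints 1,3,11.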
