find pattern from log file using awk - awk

[QFJ Timer]:2014-07-02 06:19:09,030:bla.all.com.bla.bla.ppp.xxx.abcsedf:
i would like to extract the date and time.
so the date is no problem :
cat bla.log |awk -F: '{print $2}'|awk '{print $1}'
now the issue is with the time.
if i do : cat bla.log |awk '{print $3}' so i get:
06:19:09,030:bla.all.com.bla.bla.ppp.xxx.abcsedf:
which mean that i need another grep here right?
but i did so many tries using also 'FS' and didn't get only the time.
Can someone please advise?
Thank you.

In the GNU version of awk FS can be a regexp:
echo "[QFJ Timer]:2014-07-02 06:19:09,030:bla.all.com.bla.bla.ppp.xxx.abcsedf:" |
awk -vFS=":|," '{ print $2":"$3":"$4;}'
which spits out
2014-07-02 06:19:09
Your left separator is ':' and the right is ',', and unfortunately hours, minutes and seconds are also separated by your left separator. That is solved by printing $3 and $4. Quick and dirty solution, but it isn't going to be be very robust.

You could use sed for this purpose,
$ echo '[QFJ Timer]:2014-07-02 06:19:09,030:bla.all.com.bla.bla.ppp.xxx.abcsedf:' | sed 's/^[^:]*:\([^,]*\).*/\1/g'
2014-07-02 06:19:09

cat bla.log |awk -F":" '{print $2":"$3":"$4}' | awk -F"," '{print $1}'
Which gets you:
2014-07-02 06:19:09

You can use grep, since it is meant for that:
grep -o '[0-9]\{4\}\(-[0-9]\{2\}\)\{2\}\(\( \|:\)[0-9]\{2\}\)\{3\}' log.file
or, a little bit simpler, egrep:
egrep -o '[0-9]{4}(-[0-9]{2}){2}(( |:)[0-9]{2}){3}' log.file

Related

Is using awk at least 'awk -F' always will be fine?

What is the difference on Ubuntu between awk and awk -F? For example to display the frequency of the cpu core 0 we use the command
cat /proc/cpuinfo | grep -i "^ cpu MHz" | awk -F ":" '{print $ 2}' | head -1
But why it uses awk -F? We could put awk without the -F and it would work of course (already tested).
Because without -F , we couldn't find from wath separator i will begin the calculation and print the right result. It's like a way to specify the kind of separator for this awk's using. Without it, it will choose the trivial separator in the line like if i type on the terminal: ps | grep xeyes | awk '{print $1}' ; in this case it will choose the space ' ' as a separator to print the first value: pid OF the process xeyes. I found it in https://www.shellunix.com/awk.html. Thanks for all.

Regexp in gawk matches multiples ways

I have some text I need to split up to extract the relevant argument, and my [g]awk match command does not behave - I just want to understand why?! (I have written a less elegant way around it now...).
So the string is blahblah|msgcontent1=HeaderUUIiewConsenFlagPSMessage|msgtype2=Blah002|msgcontent2=header
I want to output just the contents of msgcontent1=, so did
echo "blahblah|msgcontent1=HeaderUUIiewConsenFlagPSMessage|msgtype2=Blah002|msgcontent2=header" | gawk '{ if (match($0,/msgcontent1=(.*)[|]/,a)) { print a[1]; } }'
Trouble instead of getting
HeaderUUIiewConsenFlagPSMessage
I get the match with everything from there to the last pipe of the string HeaderUUIiewConsenFlagPSMessage|msgtype2=Blah002
Now I accept this is because the regexp in /msgcontent1=(.*)[|]/ can match multiple ways, but HOW do I make it match the way I want it to??
With your shown samples please try following. Written and tested in GNU awk this will print only contents from msgcontent1= till | first occurrence.
awk 'match($0,/msgcontent1=[^|]*/){print substr($0,RSTART+12,RLENGTH-12)}' Input_file
OR with echo + awk try:
echo "blahblah|msgcontent1=HeaderUUIiewConsenFlagPSMessage|msgtype2=Blah002|msgcontent2=header" |
awk 'match($0,/msgcontent1=[^|]*/){print substr($0,RSTART+12,RLENGTH-12)}'
With FPAT option in GNU awk:
awk -v FPAT='msgcontent1=[^|]*' '{sub(/.*=/,"",$1);print $1}' Input_file
This is your input:
s='blahblah|msgcontent1=HeaderUUIiewConsenFlagPSMessage|msgtype2=Blah002|msgcontent2=header'
You may use gnu awk like this to extract value after msgcontent1=:
awk -F= -v RS='|' '$1 == "msgcontent1" {print $2}' <<< "$s"
HeaderUUIiewConsenFlagPSMessage
or using this sed:
sed -E 's/^(.*\|)?msgcontent1=([^|]+).*/\2/' <<< "$s"
HeaderUUIiewConsenFlagPSMessage
Or using this gnu grep:
grep -oP '(^|\|)msgcontent1=\K[^|]+' <<< "$s"
HeaderUUIiewConsenFlagPSMessage
echo "blahblah|msgcontent1=HeaderUUIiewConsenFlagPSMessage|msgtype2=Blah002|msgcontent2=header" | awk '{ if (match($0,/msgcontent1=([^\|]*)/,a)) print a[1] }'
this prints HeaderUUIiewConsenFlagPSMessage
The reason your regex match msgcontent1=HeaderUUIiewConsenFlagPSMessage|msgtype2=Blah002 is that matching is 'hungry' so it allways finds the longest possible match
Also with awk:
echo 'blahblah|msgcontent1=HeaderUUIiewConsenFlagPSMessage|msgtype2=Blah002|msgcontent2=header' | awk -v FS='[=|]' '$2 == "msgcontent1" {print $3}'
HeaderUUIiewConsenFlagPSMessage

Rearrange lines to tabular format

I would like to grab the part after "-" and combine it with the following letter-string into a tab-output. I tried something like cut -d "*-" -f 2 <<< "$your_str" but I am not sure how to do the whole shuffling.
Input:
>1-395652
TATTGCACTTGTCCCGGCCTGT
>2-369990
TATTGCACTCGTCCCGGCCTCC
>3-132234
TATTGCACTCGTCCCGGCCTC
>4-122014
TATTGCACTTGTCCCGGCCTGTAA
>5-118616
Output:
TATTGCACTTGTCCCGGCCTGT 395652
TATTGCACTCGTCCCGGCCTCC 369990
awk to the rescue!
awk -F- '/^>/{k=$2; next} {print $0, k}' file
With GNU sed:
sed -nE 'N;s/.*-([0-9]+)\n(.*)/\2\t\1/p' file
Output:
TATTGCACTTGTCCCGGCCTGT 395652
TATTGCACTCGTCCCGGCCTCC 369990
TATTGCACTCGTCCCGGCCTC 132234
TATTGCACTTGTCCCGGCCTGTAA 122014
Portable sed:
sed -n 's/.*-//;x;n;G;s/\n/ /p' inputfile
Output:
TATTGCACTTGTCCCGGCCTGT 395652
TATTGCACTCGTCCCGGCCTCC 369990
TATTGCACTCGTCCCGGCCTC 132234
TATTGCACTTGTCCCGGCCTGTAA 122014

AWK pipe output to SED to replace

I'm trying to replace a string using AWK pipe out to SED
grep pdo_user /html/data/_default_/configs/application.ini | awk '{print $3}' | sed -i 's/$1/"username"/g' /html/data/_default_/configs/application.ini
but found string is not replaced
Output for
grep pdo_user /html/data/_default_/configs/application.ini | awk '{print $3}'
is
"root"
Any tips on that?
I suggest to use awk and mv:
awk '/pdo_user/ && $3=="\"root\"" {$3="\"username\""}1' /path/to/application.ini > /path/to/application.tmp
mv /path/to/application.tmp /path/to/application.ini
Working solution based on Shelter's tip using AWK and SED
sed -i 's/'$(awk '/pdo_user/{print $3}' /path/to/application.ini)'/"username"/' /path/to/application.ini

use awk to print a column, adding a comma

I have a file, from which I want to retrieve the first column, and add a comma between each value.
Example:
AAAA 12345 xccvbn
BBBB 43431 fkodks
CCCC 51234 plafad
to obtain
AAAA,BBBB,CCCC
I decided to use awk, so I did
awk '{ $1=$1","; print $1 }'
Problem is: this add a comma also on the last value, which is not what I want to achieve, and also I get a space between values.
How do I remove the comma on the last element, and how do I remove the space? Spent 20 minutes looking at the manual without luck.
$ awk '{printf "%s%s",sep,$1; sep=","} END{print ""}' file
AAAA,BBBB,CCCC
or if you prefer:
$ awk '{printf "%s%s",(NR>1?",":""),$1} END{print ""}' file
AAAA,BBBB,CCCC
or if you like golf and don't mind it being inefficient for large files:
$ awk '{r=r s $1;s=","} END{print r}' file
AAAA,BBBB,CCCC
awk {'print $1","$2","$3'} file_name
This is the shortest I know
Why make it complicated :) (as long as file is not too large)
awk '{a=NR==1?$1:a","$1} END {print a}' file
AAAA,BBBB,CCCC
For better porability.
awk '{a=(NR>1?a",":"")$1} END {print a}' file
You can do this:
awk 'a++{printf ","}{printf "%s", $1}' file
a++ is interpreted as a condition. In the first row its value is 0, so the comma is not added.
EDIT:
If you want a newline, you have to add END{printf "\n"}. If you have problems reading in the file, you can also try:
cat file | awk 'a++{printf ","}{printf "%s", $1}'
awk 'NR==1{printf "%s",$1;next;}{printf "%s%s",",",$1;}' input.txt
It says: If it is first line only print first field, for the other lines first print , then print first field.
Output:
AAAA,BBBB,CCCC
In this case, as simple cut and paste solution
cut -d" " -f1 file | paste -s -d,
In case somebody as me wants to use awk for cleaning docker images:
docker image ls | grep tag_name | awk '{print $1":"$2}'
Surpised that no one is using OFS (output field separator). Here is probably the simplest solution that sticks with awk and works on Linux and Mac: use "-v OFS=," to output in comma as delimiter:
$ echo '1:2:3:4' | awk -F: -v OFS=, '{print $1, $2, $4, $3}' generates:
1,2,4,3
It works for multiple char too:
$ echo '1:2:3:4' | awk -F: -v OFS=., '{print $1, $2, $4, $3}' outputs:
1.,2.,4.,3
Using Perl
$ cat group_col.txt
AAAA 12345 xccvbn
BBBB 43431 fkodks
CCCC 51234 plafad
$ perl -lane ' push(#x,$F[0]); END { print join(",",#x) } ' group_col.txt
AAAA,BBBB,CCCC
$
This can be very simple like this:
awk -F',' '{print $1","$1","$2","$3}' inputFile
where input file is : 1,2,3
2,3,4 etc.
I used the following, because it lists the api-resource names with it, which is useful, if you want to access it directly. I also use a label "application" to find specific apps in a namespace:
kubectl -n ops-tools get $(kubectl api-resources --no-headers=true --sort-by=name | awk '{printf "%s%s",sep,$1; sep=","}') -l app.kubernetes.io/instance=application