AWK explanation example - awk

I have a file:
AWK question about the example
This command that works well:
awk '{ gsub(/...../, "&\n" ) ; print}' file
AWK q
uesti
on ab
out t
he ex
ample
Why doesn't this command print the same result?
awk '{ gsub(/.{5}/, "&\n" ) ; print}' file
AWK question about the example
And why doesn't this one print the same result either?
awk -v WIDTH=5 '{ gsub(".{"WIDTH"}", "&\n"); print }' file
AWK question about the example

In gawk versions before 4.0, interval expressions such as {5} are not enabled by default. To use {5} there, you need to enable re-interval like this:
awk --re-interval '{ gsub(/.{5}/, "&\n" ) ; print}' file
awk -v WIDTH=5 --re-interval '{ gsub(".{"WIDTH"}", "&\n"); print }' file
You could also use --posix, but it disables the gawk-specific extensions:
awk -v WIDTH=5 --posix '{ gsub(".{"WIDTH"}", "&\n"); print }' file
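If the awk at hand has no interval support at all, the regex can be avoided entirely. The following is only a minimal sketch and not the approach used above: it swaps gsub() for a substr() loop, and the function name wrap and the variable width are made up for illustration.

```shell
# Portable sketch: print each input line in chunks of `width` characters.
# substr() is used instead of a regex, so no interval support is needed.
# Note: empty input lines produce no output in this version.
wrap() {
    awk -v width="$1" '{
        for (i = 1; i <= length($0); i += width)
            print substr($0, i, width)
    }'
}

printf '%s\n' 'AWK question about the example' | wrap 5
```

Unlike the gsub() versions, this does not leave an extra blank line after a line whose length is an exact multiple of the width.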

You can use the fold command instead of awk:
fold -w 5 input
or if you don't have the input in a file:
echo 'AWK question about the example' | fold -w 5
Both give:
AWK q
uesti
on ab
out t
he ex
ample

Related

How to use awk or sed to get text between two words

I have a list of strings:
./SolutionController.php core/app/Http/Controllers/Admin/SolutionController.php
./ContentController.php core/app/Http/Controllers/Frontpage/ContentController.php
./country-flag vendor/country-flag
I want to get the value between the './' and the first space.
Output:
SolutionController.php
ContentController.php
country-flag
This is my bash script:
#!/bin/bash
tanggal=$(date +%d-%m-%Y)
filename="./update/$tanggal/lists.md"
n=1
tanggalWaktu=$(date +"%d-%m-%Y %H:%M:%S")
mkdir -p ./logs
while read line; do
fileName=$(awk -F'[/ ]' '{print $2}' $line)
echo "file -> $fileName"
done < $filename
Output:
awk: can't open file ./SolutionController.php
source line number
Please help me
Using awk:
awk -F'[/ ]' '{print $2}' string.txt
Using gawk:
awk '{print gensub(/\.\/(.*) (.*)/,"\\1","g")}' string.txt
Test Results:
$ cat string.txt
./SolutionController.php core/app/Http/Controllers/Admin/SolutionController.php
./ContentController.php core/app/Http/Controllers/Frontpage/ContentController.php
./country-flag vendor/country-flag
$ awk -F'[/ ]' '{print $2}' string.txt
SolutionController.php
ContentController.php
country-flag
$ awk '{print gensub(/\.\/(.*) (.*)/,"\\1","g")}' string.txt
SolutionController.php
ContentController.php
country-flag
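As for the error in the original loop: awk was given the contents of $line as arguments, so it tried to open ./SolutionController.php as an input file. If the while loop has to stay, the line can be sent to awk on standard input instead. This is a minimal sketch; show_names is a hypothetical helper and the temporary file merely stands in for lists.md.

```shell
# show_names is a hypothetical helper; pass it the path of the list file.
show_names() {
    while IFS= read -r line; do
        # Feed the line to awk on stdin; as an argument it is treated as file names.
        fileName=$(printf '%s\n' "$line" | awk -F'[/ ]' '{print $2}')
        echo "file -> $fileName"
    done < "$1"
}

# Sample data in a temporary file standing in for lists.md
list=$(mktemp)
printf '%s\n' \
    './SolutionController.php core/app/Http/Controllers/Admin/SolutionController.php' \
    './country-flag vendor/country-flag' > "$list"
show_names "$list"
rm -f "$list"
```

Running one awk over the whole file, as in the commands above, is still simpler and faster than starting awk once per line.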
You can also do it with sed:
echo "./SolutionController.php core/app/Http/Controllers/Admin/SolutionController.php" | sed -r 's/\.\/(.*) .*/\1/'
Or, if the strings are stored in a file:
sed -r 's/\.\/(.*) .*/\1/' strings.txt

Regexp in gawk matches multiple ways

I have some text I need to split up to extract the relevant argument, and my [g]awk match command does not behave - I just want to understand why?! (I have written a less elegant way around it now...).
So the string is blahblah|msgcontent1=HeaderUUIiewConsenFlagPSMessage|msgtype2=Blah002|msgcontent2=header
I want to output just the contents of msgcontent1=, so I did
echo "blahblah|msgcontent1=HeaderUUIiewConsenFlagPSMessage|msgtype2=Blah002|msgcontent2=header" | gawk '{ if (match($0,/msgcontent1=(.*)[|]/,a)) { print a[1]; } }'
The trouble is that instead of getting
HeaderUUIiewConsenFlagPSMessage
I get everything from there to the last pipe of the string: HeaderUUIiewConsenFlagPSMessage|msgtype2=Blah002
Now I accept this is because the regexp in /msgcontent1=(.*)[|]/ can match in multiple ways, but how do I make it match the way I want it to?
With your shown samples, please try the following. Written and tested in GNU awk, it prints only the content from msgcontent1= up to the first occurrence of |.
awk 'match($0,/msgcontent1=[^|]*/){print substr($0,RSTART+12,RLENGTH-12)}' Input_file
Or with echo + awk:
echo "blahblah|msgcontent1=HeaderUUIiewConsenFlagPSMessage|msgtype2=Blah002|msgcontent2=header" |
awk 'match($0,/msgcontent1=[^|]*/){print substr($0,RSTART+12,RLENGTH-12)}'
With the FPAT option in GNU awk:
awk -v FPAT='msgcontent1=[^|]*' '{sub(/.*=/,"",$1);print $1}' Input_file
This is your input:
s='blahblah|msgcontent1=HeaderUUIiewConsenFlagPSMessage|msgtype2=Blah002|msgcontent2=header'
You may use GNU awk like this to extract the value after msgcontent1=:
awk -F= -v RS='|' '$1 == "msgcontent1" {print $2}' <<< "$s"
HeaderUUIiewConsenFlagPSMessage
or using this sed:
sed -E 's/^(.*\|)?msgcontent1=([^|]+).*/\2/' <<< "$s"
HeaderUUIiewConsenFlagPSMessage
Or using this gnu grep:
grep -oP '(^|\|)msgcontent1=\K[^|]+' <<< "$s"
HeaderUUIiewConsenFlagPSMessage
echo "blahblah|msgcontent1=HeaderUUIiewConsenFlagPSMessage|msgtype2=Blah002|msgcontent2=header" | awk '{ if (match($0,/msgcontent1=([^\|]*)/,a)) print a[1] }'
This prints HeaderUUIiewConsenFlagPSMessage
The reason your regex captures HeaderUUIiewConsenFlagPSMessage|msgtype2=Blah002 is that matching is greedy: .* always takes the longest possible match, so it extends to the last | in the string.
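The difference between the greedy .* and the negated class [^|]* can be seen side by side. This is a small sketch using only the two-argument match() with RSTART and RLENGTH, so it does not need gawk; the variable names greedy and bounded are just for illustration.

```shell
s='blahblah|msgcontent1=HeaderUUIiewConsenFlagPSMessage|msgtype2=Blah002|msgcontent2=header'

# Greedy: .* extends to the LAST pipe that still lets the pattern match.
greedy=$(printf '%s\n' "$s" |
    awk 'match($0, /msgcontent1=.*[|]/) { print substr($0, RSTART, RLENGTH) }')

# Negated class: [^|]* cannot cross a pipe, so the match ends before the first one.
bounded=$(printf '%s\n' "$s" |
    awk 'match($0, /msgcontent1=[^|]*/) { print substr($0, RSTART, RLENGTH) }')

printf 'greedy:  %s\nbounded: %s\n' "$greedy" "$bounded"
```

awk regular expressions have no non-greedy quantifier such as .*?, which is why the negated character class is the usual way to stop at the first delimiter.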
Also with awk:
echo 'blahblah|msgcontent1=HeaderUUIiewConsenFlagPSMessage|msgtype2=Blah002|msgcontent2=header' | awk -v FS='[=|]' '$2 == "msgcontent1" {print $3}'
HeaderUUIiewConsenFlagPSMessage

initialising field separators on condition in awk

I know that initialising FS in BEGIN is the correct practice, but what if I need different field separators for different lines (lines containing a particular pattern)? E.g. my awk script is:
{if($0 ~ /.*youtube.*/){FS="=";print $2}}
This code does not process the first line. How can I fix this?
You can use split(). For example, to get the middle value (green) from the third field:
echo "on,cat ,blue|green|red,more" | awk -F, '{split($3,a,"|");print a[2]}'
green
And the BEGIN block is not the only place where you can set the field separator:
echo "on,two,three" | awk -F, '{print $2}'
echo "on,two,three" | awk '{print $2}' FS=,
echo "on,two,three" | awk 'BEGIN{FS=","} {print $2}'
echo "on,two,three" | awk -v FS=, '{print $2}'
All of these print two.
But they differ in when the assignment takes effect. With -F the separator is already set inside BEGIN:
awk -F, 'BEGIN{print FS}'
,
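and this does not work: it prints only the default separator (a single space), because an assignment placed after the program text is not processed until after BEGIN has run.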
awk 'BEGIN{print FS}' FS=,
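The timing difference is easier to see when the value is printed between brackets. A minimal sketch (the variable names with_v and with_operand are made up here; /dev/null is given as input only so that nothing waits on the terminal):

```shell
# -v assignments are applied before BEGIN runs.
with_v=$(awk -v FS=, 'BEGIN { print "[" FS "]" }' < /dev/null)

# An assignment operand after the program is applied only when awk reaches it
# while walking its file arguments, which is after BEGIN has finished.
with_operand=$(awk 'BEGIN { print "[" FS "]" }' FS=, < /dev/null)

printf '%s\n%s\n' "$with_v" "$with_operand"
```

The first command shows the comma, the second still shows the default single space.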
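Back to your problem. Assigning FS inside a rule only affects records read afterwards, because the current line has already been split into fields; that is why the first line is not processed correctly.
So this: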
awk '{if($0 ~ /.*youtube.*/){FS="=";print $2}}' file
should be:
awk '{if($0 ~ /.*youtube.*/){split($0,a,"=");print a[2]}}' file
You do not need to match arbitrary characters before and after the regex, so:
awk '{if($0 ~ /youtube/){split($0,a,"=");print a[2]}}' file
And this can be simplified even further:
awk '/youtube/ {split($0,a,"=");print a[2]}' file
If the data looks like this:
cat file
youtube=thisisyoutube1 //starts here
youtube=thisisyoutube2
youtube=thisisyoutube3
youtube=thisisyoutube4
yautube=thisisnottobeprinted
then do it like this:
awk -F= '/youtube/ {split($2,a," ");print a[1]}' file
thisisyoutube1
thisisyoutube2
thisisyoutube3
thisisyoutube4
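To see why the original script misses the first line, the effect of changing FS inside a rule can be reproduced on two lines of made-up input. This is a sketch under the POSIX rule that a new FS applies from the next record on; the $0 = $0 assignment shown second is a common idiom for forcing the current record to be re-split, offered here as an alternative to split(), not as the method from the answer above.

```shell
# FS changed inside the rule: record 1 was already split on whitespace,
# so $2 is empty there; record 2 is split on "=".
before=$(printf 'youtube=one\nyoutube=two\n' |
    awk '/youtube/ { FS = "="; print "[" $2 "]" }')

# Assigning $0 to itself re-splits the current record with the new FS.
resplit=$(printf 'youtube=one\nyoutube=two\n' |
    awk '/youtube/ { FS = "="; $0 = $0; print "[" $2 "]" }')

printf '%s\n---\n%s\n' "$before" "$resplit"
```

Note that in both variants FS stays "=" for every later line, including lines that do not match, which is one reason split() is the safer choice when only some lines need a different separator.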

pass and compare external variable in awk command

How do I pass an external variable to an awk command and compare against it? Does it also depend on the Unix shell being used?
I am trying to do:
mgrid=`echo $file1 | awk -F'|' '{ print $40}' `
echo $mgrid
var=`/usr/bin/more $HOME/pwd_date_chk/file2.txt | awk -F'|' ' -v search="$mgrid" '{ $41 ~ search print $15}'`
echo $var
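awk can read its input directly from a file, so there is no need for more. Also, -v must come before the program text, and the condition belongs outside the braces. You can try this: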
mgrid=`echo $file1 | awk -F'|' '{ print $40}' `
echo $mgrid
var=`awk -F'|' -v search="$mgrid" '$41 ~ search {print $15}' $HOME/pwd_date_chk/file2.txt`
echo $var
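The -v option is handled by awk itself, so it works the same from any shell; the shell only expands "$mgrid" before awk starts. One thing to watch is that ~ is a regex match and therefore also hits longer values that merely contain the search text. The sketch below uses made-up three-field data (not the 41-field file from the question) to show the difference from an exact comparison with ==.

```shell
# Hypothetical pipe-delimited records: id|name|status
data='101|alice|active
1012|bob|active
205|carol|inactive'
id=101

# ~ treats the variable as a regex: "1012" contains "101", so it matches too.
regex_hits=$(printf '%s\n' "$data" |
    awk -F'|' -v search="$id" '$1 ~ search { print $2 }')

# == compares the whole field, so only the exact id matches.
exact_hits=$(printf '%s\n' "$data" |
    awk -F'|' -v search="$id" '$1 == search { print $2 }')

printf 'regex: %s\nexact: %s\n' "$(echo $regex_hits)" "$exact_hits"
```

If the value in field 41 must equal $mgrid exactly, == is likely the comparison that is wanted.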

awk command with search string variable

Configuration.xml has "mysearchstring" at position (line) 23.
If I use the following statement, it returns 23:
awk '/"mysearchstring"/{print NR}' Configuration.xml
But if I use a shell variable, it returns nothing:
str="mySearchString";awk '/$str/{print NR}' Configuration.xml
Can someone tell me what is incorrect in the second statement?
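The shell does not expand $str inside single quotes, and awk does not interpolate variables inside /.../. You need to pass the variable to awk with -v and then use the ~ comparison: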
awk -v myvar="$str" '$0 ~ myvar {print NR}' Configuration.xml
Example
$ cat a
hello
how
are
you
$ awk '/e/ {print NR}' a   # hardcoded
1
3
$ awk -v myvar="e" '$0~myvar {print NR}' a   # through variable
1
3
You can use the command-line option -v to pass variables to awk:
awk -v searchstr="$str" '$0 ~ searchstr { print NR }' Configuration.xml
Or you can pass the variable like this, after the program text:
awk '$0~s {print NR}' s="$str" file
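The three behaviours can be compared on the small four-line file from the example above. This is a minimal sketch: the temporary file and the variable names are made up, and the index() variant is an addition for the case where the search string should be taken literally rather than as a regex.

```shell
f=$(mktemp)
printf '%s\n' hello how are you > "$f"
str=e

# Inside single quotes the shell leaves $str alone, so awk gets the regex /$str/,
# which has nothing to do with the shell variable and matches none of these lines.
unexpanded=$(awk '/$str/ { print NR }' "$f")

# Passed with -v, the value is used as a dynamic regex.
as_regex=$(awk -v s="$str" '$0 ~ s { print NR }' "$f")

# index() looks for the value as a plain substring, with no regex metacharacters.
as_literal=$(awk -v s="$str" 'index($0, s) { print NR }' "$f")

rm -f "$f"
printf 'unexpanded: [%s]\nregex: %s\nliteral: %s\n' \
    "$unexpanded" "$(echo $as_regex)" "$(echo $as_literal)"
```

For a search string containing characters such as . or [, the regex and literal variants can give different results, so it is worth picking the one that matches the intent.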