Using awk to print without double quotes - awk

I would like to get the right value of the following command as a string without double quotes.
$ grep '^VERSION=' /etc/os-release
VERSION="20.04.3 LTS (Focal Fossa)"
When I pipe it with the following awk, I don't get the desired output.
$ grep '^VERSION=' /etc/os-release | awk '{print $0}'
VERSION="20.04.3 LTS (Focal Fossa)"
$ grep '^VERSION=' /etc/os-release | awk '{print $1}'
VERSION="20.04.3
$ grep '^VERSION=' /etc/os-release | awk '{print $2}'
LTS
How can I fix that?

You may use this single awk command:
awk -F= '$1=="VERSION" {gsub(/"/, "", $2); print $2}' /etc/os-release
20.04.3 LTS (Focal Fossa)

1st solution: With your shown samples, please try the following awk code.
awk 'match($0,/^VERSION="[^"]*/){print substr($0,RSTART+9,RLENGTH-9)}' Input_file
Explanation: awk's match function matches from the starting VERSION=" up to (but not including) the next occurrence of ". substr then prints the matched part minus the 9-character VERSION=" prefix, which gives only the desired output as per the OP's shown samples.
2nd solution: Using GNU grep with the PCRE option (-P) enabled, try the following.
grep -oP '^VERSION="\K[^"]*' Input_file
3rd solution: Using awk's capability to set different field separators and then check conditions accordingly and print values.
awk -F'"' '$1=="VERSION="{print $2}' Input_file
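To see how RSTART and RLENGTH from the 1st solution fit together, here is a minimal sketch. It assumes a sample line copied from the question rather than reading /etc/os-release, so it runs on any system; the variable names are only illustrative.

```shell
# Sample line standing in for /etc/os-release (assumed content, copied from the question).
line='VERSION="20.04.3 LTS (Focal Fossa)"'

# match() sets RSTART (1-based start of the match) and RLENGTH (its length).
# The prefix VERSION=" is 9 characters long, so start 9 characters later
# and shorten the length by 9 to keep only the quoted value.
version=$(printf '%s\n' "$line" |
  awk 'match($0, /^VERSION="[^"]*/) { print substr($0, RSTART + 9, RLENGTH - 9) }')

printf '%s\n' "$version"
```

Because [^"]* stops before the closing quote, nothing has to be trimmed from the end.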

Assuming that "the right value" you want output is 20.04.3:
$ awk -F'[" ]' '/^VERSION=/{print $2}' file
20.04.3
or if it's the whole quoted string:
$ awk -F'"' '/^VERSION=/{print $2}' file
20.04.3 LTS (Focal Fossa)

You can use a GNU awk command like (the three-argument form of match is a gawk extension)
awk 'match($0, /^VERSION="([^"]*)"/, m) {print m[1]}' /etc/os-release
Here, ^VERSION="([^"]*)" matches VERSION=" at the start of the string (^), then captures into Group 1 any zero or more chars other than " (with ([^"]*)) and then matches ". The match is saved in m where m[1] holds the Group 1 value.
Or, sed like
sed -n '/^VERSION="\([^"]*\)".*/s//\1/p' /etc/os-release
See an online test:
s='VERSION="20.04.3 LTS (Focal Fossa)"'
awk 'match($0, /^VERSION="([^"]*)"/, m) {print m[1]}' <<< "$s"
sed -n '/^VERSION="\([^"]*\)".*/s//\1/p' <<< "$s"
Here, the -n option suppresses the default line output. /^VERSION="\([^"]*\)".*/ matches a string starting with VERSION=", captures into Group 1 any zero or more chars other than ", and then matches " and the rest of the string; the whole match is replaced with the Group 1 value. The empty pattern // in s//\1/ means the previous regex pattern is reused. p prints only the result of the substitution.
Both output 20.04.3 LTS (Focal Fossa).
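If GNU awk is not available, the same capture idea can be approximated with portable tools. The sketch below is an assumption-laden variant, not the answer's exact commands: it uses printf instead of the <<< here-string (a bash feature) and two sub() calls instead of the three-argument match.

```shell
s='VERSION="20.04.3 LTS (Focal Fossa)"'

# Portable awk: remove the VERSION=" prefix, then everything from the closing quote on.
with_awk=$(printf '%s\n' "$s" |
  awk '/^VERSION="/ { sub(/^VERSION="/, ""); sub(/".*$/, ""); print }')

# sed with a capture group, written as a single s command.
with_sed=$(printf '%s\n' "$s" | sed -n 's/^VERSION="\([^"]*\)".*/\1/p')

printf '%s\n%s\n' "$with_awk" "$with_sed"
```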

Since the file /etc/os-release conforms to a variable assignment in bash or the shell in general (POSIX), sourcing it should do the job.
source /etc/os-release; echo "$VERSION"
Using a subshell, in case one does not want to pollute the current environment variables.
( source /etc/os-release; echo "$VERSION" )
Assigning it to a variable.
version=$( source /etc/os-release; echo "$VERSION" )
If the shell you're using does not conform to POSIX.
sh -c '. /etc/os-release; echo "$VERSION"'
See your local man page if available.
man 5 os-release
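A small sketch of the subshell variant, assuming a temporary file with two made-up lines in place of /etc/os-release so that it can be tried anywhere. It uses the POSIX . command rather than source.

```shell
# Hypothetical stand-in for /etc/os-release so the sketch runs on any system.
tmpfile=$(mktemp)
printf '%s\n' 'NAME="Ubuntu"' 'VERSION="20.04.3 LTS (Focal Fossa)"' > "$tmpfile"

unset VERSION

# Command substitution runs in a subshell, so the variables set by
# sourcing the file do not leak into the current shell.
version=$( . "$tmpfile"; printf '%s\n' "$VERSION" )

printf 'version=%s\n' "$version"
printf 'VERSION in current shell: %s\n' "${VERSION:-<unset>}"

rm -f "$tmpfile"
```

Only version is set afterwards; VERSION itself stays unset in the calling shell.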

Related

Regex to extract multiple words from a paragraph

$ cat file.txt
NAME="Ubuntu" <--- some of this
VERSION="20.04.4 LTS (Focal Fossa)" <--- and some of this
ID=ubuntu
ID_LIKE=debian
VERSION_ID="20.04"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=focal
I want this: Ubuntu 20.04.4 LTS.
I managed with two commands:
echo "$(grep '^NAME="' ./file.txt | sed -E 's/NAME="(.*)"/\1/') $(grep '^VERSION="' ./file.txt | sed -E 's/VERSION="(.*) \(.*"/\1/')"
How could I simplify this to one command using grep/sed or perl?
With your shown samples try following awk code.
awk -F'"' '/^NAME="/{name=$2;next} /^VERSION="/{print name,$2}' Input_file
Explanation:
Setting the field separator to " for all lines.
If a line starts with NAME=", store its 2nd field in the variable name. next then skips the remaining statements of the awk program for that line, since they do not need to be executed.
If a line starts with VERSION=", print name and the 2nd field. Note that the 2nd field still contains the (Focal Fossa) codename.
Here is another awk solution:
awk -F '=?"' '$1 == "NAME" {s = $2; next}
$1 == "VERSION" {sub(/ \(.*/, "", $2); print s, $2}' file
Ubuntu 20.04.4 LTS
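The two ideas above can be combined: split on " and strip the parenthesised codename before printing. This is a sketch on a shortened copy of the sample file; the variable names sample, name and v are only illustrative.

```shell
# Shortened copy of the sample input from the question.
sample='NAME="Ubuntu"
VERSION="20.04.4 LTS (Focal Fossa)"
ID=ubuntu
VERSION_ID="20.04"'

result=$(printf '%s\n' "$sample" | awk -F'"' '
  /^NAME="/    { name = $2; next }      # remember the distribution name
  /^VERSION="/ { v = $2
                 sub(/ *\(.*/, "", v)    # drop the parenthesised codename
                 print name, v }')

printf '%s\n' "$result"
```

The VERSION_ID line is ignored because the pattern requires =" right after VERSION.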

Will using awk without 'awk -F' always be fine?

What is the difference on Ubuntu between awk and awk -F? For example, to display the frequency of CPU core 0 we use the command
cat /proc/cpuinfo | grep -i "^cpu MHz" | awk -F ":" '{print $2}' | head -1
But why does it use awk -F? We could use awk without the -F and it would work of course (already tested).
Because without -F, awk does not know which separator to split the line on, so it cannot print the right field. -F is the way to specify the field separator for this awk invocation. Without it, awk uses the default separator, runs of spaces and tabs, as if I typed in the terminal: ps | grep xeyes | awk '{print $1}'. In this case it splits on whitespace and prints the first value, the PID of the xeyes process. I found it at https://www.shellunix.com/awk.html. Thanks to all.
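To make the difference concrete, here is a sketch on one made-up /proc/cpuinfo-style line (the value 2400.000 is an assumption, not real output). It prints $2 with the default separator, with -F ':', and with a regex separator that also swallows the spaces after the colon.

```shell
# Made-up sample line in the shape of /proc/cpuinfo (tabs before the colon).
line=$(printf 'cpu MHz\t\t: 2400.000')

# Default separator: runs of spaces and tabs, so $2 is the word "MHz".
default_field=$(printf '%s\n' "$line" | awk '{ print $2 }')

# -F ':' splits only on the colon, so $2 is the value with its leading space.
colon_field=$(printf '%s\n' "$line" | awk -F ':' '{ print $2 }')

# A multi-character FS is treated as a regex; ': *' also consumes the spaces.
trimmed=$(printf '%s\n' "$line" | awk -F ': *' '{ print $2 }')

printf '[%s] [%s] [%s]\n' "$default_field" "$colon_field" "$trimmed"
```

On this sample line, dropping -F changes which field $2 refers to, so the two commands are not interchangeable.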

Regexp in gawk matches multiples ways

I have some text I need to split up to extract the relevant argument, and my [g]awk match command does not behave - I just want to understand why?! (I have written a less elegant way around it now...).
So the string is blahblah|msgcontent1=HeaderUUIiewConsenFlagPSMessage|msgtype2=Blah002|msgcontent2=header
I want to output just the contents of msgcontent1=, so did
echo "blahblah|msgcontent1=HeaderUUIiewConsenFlagPSMessage|msgtype2=Blah002|msgcontent2=header" | gawk '{ if (match($0,/msgcontent1=(.*)[|]/,a)) { print a[1]; } }'
The trouble is that instead of getting
HeaderUUIiewConsenFlagPSMessage
I get a match with everything from there to the last pipe of the string: HeaderUUIiewConsenFlagPSMessage|msgtype2=Blah002
Now I accept this is because the regexp in /msgcontent1=(.*)[|]/ can match multiple ways, but HOW do I make it match the way I want it to??
With your shown samples please try following. Written and tested in GNU awk this will print only contents from msgcontent1= till | first occurrence.
awk 'match($0,/msgcontent1=[^|]*/){print substr($0,RSTART+12,RLENGTH-12)}' Input_file
OR with echo + awk try:
echo "blahblah|msgcontent1=HeaderUUIiewConsenFlagPSMessage|msgtype2=Blah002|msgcontent2=header" |
awk 'match($0,/msgcontent1=[^|]*/){print substr($0,RSTART+12,RLENGTH-12)}'
With FPAT option in GNU awk:
awk -v FPAT='msgcontent1=[^|]*' '{sub(/.*=/,"",$1);print $1}' Input_file
This is your input:
s='blahblah|msgcontent1=HeaderUUIiewConsenFlagPSMessage|msgtype2=Blah002|msgcontent2=header'
You may use gnu awk like this to extract value after msgcontent1=:
awk -F= -v RS='|' '$1 == "msgcontent1" {print $2}' <<< "$s"
HeaderUUIiewConsenFlagPSMessage
or using this sed:
sed -E 's/^(.*\|)?msgcontent1=([^|]+).*/\2/' <<< "$s"
HeaderUUIiewConsenFlagPSMessage
Or using this gnu grep:
grep -oP '(^|\|)msgcontent1=\K[^|]+' <<< "$s"
HeaderUUIiewConsenFlagPSMessage
echo "blahblah|msgcontent1=HeaderUUIiewConsenFlagPSMessage|msgtype2=Blah002|msgcontent2=header" | awk '{ if (match($0,/msgcontent1=([^\|]*)/,a)) print a[1] }'
this prints HeaderUUIiewConsenFlagPSMessage
The reason your regex matches msgcontent1=HeaderUUIiewConsenFlagPSMessage|msgtype2=Blah002 is that matching is greedy, so .* always takes the longest possible match.
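The greedy behaviour can be reproduced side by side. This sketch uses the two-argument match with substr so that it also runs on non-GNU awks; the offsets 12 and 13 assume the 12-character prefix msgcontent1= and, in the greedy case, one trailing pipe.

```shell
s='blahblah|msgcontent1=HeaderUUIiewConsenFlagPSMessage|msgtype2=Blah002|msgcontent2=header'

# Greedy: .* runs to the LAST pipe, so the match spans two fields.
greedy=$(printf '%s\n' "$s" |
  awk 'match($0, /msgcontent1=.*[|]/) { print substr($0, RSTART + 12, RLENGTH - 13) }')

# Bounded: [^|]* cannot cross a pipe, so the match stops at the first one.
bounded=$(printf '%s\n' "$s" |
  awk 'match($0, /msgcontent1=[^|]*/) { print substr($0, RSTART + 12, RLENGTH - 12) }')

printf 'greedy : %s\nbounded: %s\n' "$greedy" "$bounded"
```

awk regexes have no non-greedy quantifier, so a negated character class is the usual way to stop at the first delimiter.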
Also with awk:
echo 'blahblah|msgcontent1=HeaderUUIiewConsenFlagPSMessage|msgtype2=Blah002|msgcontent2=header' | awk -v FS='[=|]' '$2 == "msgcontent1" {print $3}'
HeaderUUIiewConsenFlagPSMessage

using awk command to get the correct name

I want to get the filename from a long string in a shell script. After reading some examples from likegeeks.com, I wrote a simple solution:
#!/bin/bash
cdnurl="http://download.example.com.cn/download/product/vpn/rules/vpn_patch_20190218162130_sign.pkg?wsSecret=9cadeddedfr7bb85a20a064510cd3f353&wsABSTime=5c6ea1e7"
echo "${cdnurl}"
url=`echo "${cdnurl}" | awk -F'/' '{ print $NF }'`
result=`echo "${url}" | awk -F '?' '{ print $1}'`
echo "${url}"
echo "${result}"
I just want to get vpn_patch_20190218162130_sign.pkg, and it does. I wonder whether there is a smarter way (maybe one line).
If the character after pkg is not ?, how can I use pkg itself to get the filename? I am not sure that a ? always follows pkg, but the filename always matches *.pkg.
You can try this, which is more robust than the second awk command:
echo "$cdnurl"|awk -v FS='/' '{gsub(/\?.*/,"",$NF);print $NF}'
vpn_patch_20190218162130_sign.pkg
#less robust
echo "$cdnurl"|awk -v FS='[?/]' '{print $(NF-1)}'
You should use sed:
sed -r 's|.*/(.*\.pkg).*|\1|g'
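Two further sketches, neither taken from the answers above. The first uses plain shell parameter expansion (a different technique from awk or sed) and assumes a ? starts the query string. The second anchors on the .pkg extension instead, for the case where the character after pkg is unknown.

```shell
cdnurl='http://download.example.com.cn/download/product/vpn/rules/vpn_patch_20190218162130_sign.pkg?wsSecret=9cadeddedfr7bb85a20a064510cd3f353&wsABSTime=5c6ea1e7'

# Parameter expansion: no external process needed.
file=${cdnurl##*/}    # remove everything up to and including the last slash
file=${file%%\?*}     # remove the first ? and everything after it

# Anchor on the extension: the first run of non-slash characters ending in .pkg.
# The regex is passed as a string to avoid a / inside an awk regex literal.
by_ext=$(printf '%s\n' "$cdnurl" |
  awk 'match($0, "[^/]*[.]pkg") { print substr($0, RSTART, RLENGTH) }')

printf '%s\n%s\n' "$file" "$by_ext"
```

The second form does not care what follows .pkg, but it assumes that no earlier path segment contains .pkg.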

String concatenation doesn't work in gawk print instruction

I have the following grep and gawk line running in windows
grep ItemDischarged D:\systems\CmcComRouting.log | gawk -v OFS=, "{print $8}" | cut -d ">" -f 1 | uniq -c | gawk -v OFS=, "{print $1,$2}" > d:\03TotalItems.log
the output is as follows
59523,ItemDischargedTlg
What I want to do is add "Lower" to the end of "ItemDischargedTlg" but cannot figure out how to do it, I have tried
{print $1,$2"Lower"}
but it prints nothing.
Thanks
This might do the trick:
gawk -v OFS=, '{$2=$2"Lower";print $1,$2}'
Be careful when mixing commas and string concatenation in the arguments of a print instruction. Assigning the concatenated value to the field first, as above, keeps the two apart.
If on Windows, also be careful with " and ' quoting.
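For comparison, here is a sketch run in a POSIX shell with a single-quoted program, on a made-up input line shaped like uniq -c output. There the original expression prints the expected result, which suggests (as an assumption, not verified on Windows) that quoting in the Windows command line is a plausible culprit rather than awk itself.

```shell
# Made-up line in the shape of `uniq -c` output (count, then value).
input='  59523 ItemDischargedTlg'

# Concatenation binds tighter than the comma in print, so this prints
# $1, then OFS, then $2 joined with the literal "Lower".
direct=$(printf '%s\n' "$input" | awk -v OFS=, '{ print $1, $2 "Lower" }')

# Assigning to $2 first gives the same result.
assigned=$(printf '%s\n' "$input" | awk -v OFS=, '{ $2 = $2 "Lower"; print $1, $2 }')

printf '%s\n%s\n' "$direct" "$assigned"
```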