awk command with BEGIN does not work for me - awk

This is the simple awk command I am trying to write:
grep "Inputs - " access.log | awk 'BEGIN { FS = "Inputs -" } ; { print $2 }'
I am trying to grep the file access.log for all the lines containing "Inputs - " and use awk to print the part after "Inputs - ". This gives the following error:
awk: syntax error near line 1
awk: bailing out near line 1
I am confused about what the issue is; this should work.
I have also tried the following, and it does not work either:
grep "Inputs - " L1Access.log | awk -F='Inputs' '{print $1}'
Here is a sample input text file
This is line number 1. I dont want this line to be part of grep output
This is line number 2. I want this line to be part of grep output. This has "Input -", I want to display only the part after "Input -" from this line using awk

Your problem cannot be reproduced here:
kent$ cat f
foo - xxx
foo - yyy
foo - zzz
fooba
kent$ grep 'foo - ' f| awk 'BEGIN { FS = "foo -"};{print $2}'
xxx
yyy
zzz
There must be something wrong in your awk code. Besides, if you want to extract the part after your "Inputs - ", grep alone can do it in a single shot (note that the -P option requires GNU grep):
kent$ grep -Po 'foo - \K.*' f
xxx
yyy
zzz

Since you stated that you want everything after the first instance of "Inputs -", and since your grep is unnecessary:
nawk -F"Inputs -" 'BEGIN {OFS="Inputs -"} NF>1 { line=$2; for(i=3;i<=NF;i++) line=line OFS $i; print line }' test
Your own answer will only print out the second element. If a line contains more than one "Inputs -", you will miss the remainder of the line. If you don't want the second (or third...) "Inputs -" in the output, you could use:
nawk -F"Inputs -" '{ for(i=2;i<=NF;i++) print $i}' test

OK folks, I see what my issue is. I am using Solaris, and the default awk on Solaris has no regex capability for the field separator, meaning it does not support more than one character in the field separator. So I used nawk.
Please refer to this post
Stackoverflow post
grep "Inputs - " L1Access.log | nawk 'BEGIN { FS = "Inputs -" } { print $2 }'
This worked.

It is not clear what output you want. Here is a sample file:
cat file
test Inputs - more data
Here is nothing to get
yes Inputs - This is what we need Inputs - but what about this?
You can then use awk to get data:
awk -F"Inputs - " 'NF>1 {print $2}' file
more data
This is what we need
or like this?
awk -F"Inputs - " 'NF>1 {print $NF}' file
more data
but what about this?
By setting the field separator to "Inputs - " and testing for NF>1, awk prints only the lines that contain "Inputs - ".
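If the awk at hand only accepts a single-character field separator, as reported above for the default Solaris awk, one possible workaround is to leave FS alone and cut the line with index() and substr(). This is only a sketch; the helper name after_marker is made up for illustration.

```shell
# Print everything after the first "Inputs - " on each line that contains it.
# after_marker is a hypothetical helper name, not taken from the answers above.
after_marker() {
    awk '{
        pos = index($0, "Inputs - ")      # 0 when the marker is absent
        if (pos > 0)
            print substr($0, pos + 9)     # 9 is the length of "Inputs - "
    }' "$@"
}

printf '%s\n' 'test Inputs - more data' 'Here is nothing to get' | after_marker
```

Lines without the marker are skipped, so no separate grep is needed.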

Related

Proper way to use variables in awk in a script? [duplicate]

I found some ways to pass external shell variables to an awk script, but I'm confused about ' and ".
First, I tried with a shell script:
$ v=123test
$ echo $v
123test
$ echo "$v"
123test
Then tried awk:
$ awk 'BEGIN{print "'$v'"}'
123test
$ awk 'BEGIN{print '"$v"'}'
123
Why the difference?
Lastly I tried this:
$ awk 'BEGIN{print " '$v' "}'
 123test
$ awk 'BEGIN{print ' "$v" '}'
awk: cmd. line:1: BEGIN{print
awk: cmd. line:1: ^ unexpected newline or end of string
I'm confused about this.
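One way to see where the difference comes from is to print the arguments exactly as the shell hands them over, using printf in place of awk. This is a small sketch for inspection only; the square brackets are added just to make the argument boundaries visible.

```shell
v=123test

# Print each argument on its own line, wrapped in brackets.
printf '[%s]\n' 'BEGIN{print "'$v'"}'     # one argument: a program printing the string "123test"
printf '[%s]\n' 'BEGIN{print '"$v"'}'     # one argument: print 123test, i.e. the number 123 followed by an unset variable named test
printf '[%s]\n' 'BEGIN{print ' "$v" '}'   # three arguments: the program text stops after "print "
```

In the second case awk concatenates 123 with the empty value of test, which gives 123. In the last case the unquoted spaces split the command into three arguments, so awk receives an incomplete program and reports a syntax error.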
Getting shell variables into awk
This may be done in several ways. Some are better than others. This should cover most of them. If you have a comment, please leave it below.
Using -v (The best way, most portable)
Use the -v option: (P.S. use a space after -v or it will be less portable. E.g., awk -v var= not awk -vvar=)
variable="line one\nline two"
awk -v var="$variable" 'BEGIN {print var}'
line one
line two
This should be compatible with most awk implementations, and the variable is available in the BEGIN block as well.
If you have multiple variables:
awk -v a="$var1" -v b="$var2" 'BEGIN {print a,b}'
Warning: as Ed Morton writes, escape sequences will be interpreted, so \t becomes a real tab and not a literal \t, if that is what you are searching for. This can be solved by using ENVIRON[] or by accessing the value via ARGV[].
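A minimal sketch of that escape-sequence difference, assuming two throwaway helper functions (via_v and via_environ are hypothetical names):

```shell
# Both helpers print their first argument through awk.
via_v()       { awk -v var="$1" 'BEGIN { print var }'; }
via_environ() { var="$1" awk 'BEGIN { print ENVIRON["var"] }'; }

via_v 'a\tb'         # the \t is converted to a real tab
via_environ 'a\tb'   # the value is printed unchanged, backslash included
```

The second helper puts the value into awk's environment for that one command only, so nothing is left exported in the calling shell.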
PS: If you use a vertical bar or other regexp metacharacters as the separator, such as | ? ( etc., they must be double escaped. For example, three vertical bars ||| become -F'\\|\\|\\|'. You can also use -F"[|][|][|]".
Example of getting data from a program/function into awk (here date is used):
awk -v time="$(date +"%F %H:%M" -d '-1 minute')" 'BEGIN {print time}'
Example of testing the contents of a shell variable as a regexp:
awk -v var="$variable" '$0 ~ var{print "found it"}'
Variable after code block
Here we get the variable after the awk code. This will work fine as long as you do not need the variable in the BEGIN block:
variable="line one\nline two"
echo "input data" | awk '{print var}' var="${variable}"
or
awk '{print var}' var="${variable}" file
Adding multiple variables:
awk '{print a,b,$0}' a="$var1" b="$var2" file
In this way we can also set a different field separator FS for each file.
awk 'some code' FS=',' file1.txt FS=';' file2.ext
Variable after the code block will not work for the BEGIN block:
echo "input data" | awk 'BEGIN {print var}' var="${variable}"
Here-string
A variable can also be passed to awk using a here-string from shells that support them (including Bash):
awk '{print $0}' <<< "$variable"
test
This is the same as:
printf '%s' "$variable" | awk '{print $0}'
P.S. this treats the variable as a file input.
ENVIRON input
As TrueY writes, you can use the ENVIRON to print Environment Variables.
If you export a variable before running awk (it has to be in the environment, not just set in the shell), you can print it out like this:
export X=MyVar
awk 'BEGIN{print ENVIRON["X"],ENVIRON["SHELL"]}'
MyVar /bin/bash
ARGV input
As Steven Penny writes, you can use ARGV to get the data into awk:
v="my data"
awk 'BEGIN {print ARGV[1]}' "$v"
my data
To get the data into the code itself, not just the BEGIN:
v="my data"
echo "test" | awk 'BEGIN{var=ARGV[1];ARGV[1]=""} {print var, $0}' "$v"
my data test
Variable within the code: USE WITH CAUTION
You can use a variable within the awk code, but it's messy and hard to read, and as Charles Duffy points out, this version may also be a victim of code injection. If someone adds bad stuff to the variable, it will be executed as part of the awk code.
This works by extracting the variable within the code, so it becomes a part of it.
If you want to make an awk that changes dynamically with use of variables, you can do it this way, but DO NOT use it for normal variables.
variable="line one\nline two"
awk 'BEGIN {print "'"$variable"'"}'
line one
line two
Here is an example of code injection:
variable='line one\nline two" ; for (i=1;i<=1000;++i) print i"'
awk 'BEGIN {print "'"$variable"'"}'
line one
line two
1
2
3
.
.
1000
You can add lots of commands to awk this way, or even make it crash with invalid commands.
One valid use of this approach, though, is when you want to pass a symbol to awk to be applied to some input, e.g. a simple calculator:
$ calc() { awk -v x="$1" -v z="$3" 'BEGIN{ print x '"$2"' z }'; }
$ calc 2.7 '+' 3.4
6.1
$ calc 2.7 '*' 3.4
9.18
There is no way to do that using an awk variable populated with the value of a shell variable, you NEED the shell variable to expand to become part of the text of the awk script before awk interprets it. (see comment below by Ed M.)
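Since the operator ends up inside the awk program text, it seems prudent to restrict it to a known set before expanding it. The following is one possible hardening of the calc function, not part of the original answer; the list of accepted operators is an assumption.

```shell
# Same idea as calc above, but the operator is checked against a whitelist
# before it is spliced into the awk program (the whitelist is an assumption).
calc() {
    case $2 in
        '+'|'-'|'*'|'/') ;;                                   # accepted operators
        *) echo "calc: unsupported operator: $2" >&2; return 1 ;;
    esac
    awk -v x="$1" -v z="$3" 'BEGIN { print x '"$2"' z }'
}

calc 2.7 '+' 3.4
calc 2.7 '*' 3.4
```

Anything that is not one of the four listed symbols is rejected before awk ever sees it, while the operands still travel as ordinary -v data.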
Extra info:
Use of double quote
It's always good to double quote variable "$variable"
If not, multiple lines will be added as a long single line.
Example:
var="Line one
This is line two"
echo $var
Line one This is line two
echo "$var"
Line one
This is line two
Other errors you can get without double quote:
variable="line one\nline two"
awk -v var=$variable 'BEGIN {print var}'
awk: cmd. line:1: one\nline
awk: cmd. line:1: ^ backslash not last character on line
awk: cmd. line:1: one\nline
awk: cmd. line:1: ^ syntax error
And with single quote, it does not expand the value of the variable:
awk -v var='$variable' 'BEGIN {print var}'
$variable
More info about AWK and variables
Read this faq.
It seems that the good-old ENVIRON awk built-in hash is not mentioned at all. An example of its usage:
$ X=Solaris awk 'BEGIN{print ENVIRON["X"], ENVIRON["TERM"]}'
Solaris rxvt
You could pass in the command-line option -v with a variable name (v) and a value (=) of the environment variable ("${v}"):
% awk -vv="${v}" 'BEGIN { print v }'
123test
Or to make it clearer (with far fewer vs):
% environment_variable=123test
% awk -vawk_variable="${environment_variable}" 'BEGIN { print awk_variable }'
123test
You can utilize ARGV:
v=123test
awk 'BEGIN {print ARGV[1]}' "$v"
Note that if you are going to continue into the body, you will need to adjust
ARGC:
awk 'BEGIN {ARGC--} {print ARGV[2], $0}' file "$v"
I just adapted @Jotne's answer for use in a for loop.
for i in `seq 11 20`; do host myserver-$i | awk -v i="$i" '{print "myserver-"i" " $4}'; done
I had to insert the date at the beginning of each line of a log file, and it is done like below:
DATE=$(date +"%Y-%m-%d")
awk '{ print "'"$DATE"'", $0; }' /path_to_log_file/log_file.log
The output can be redirected to another file to save it.
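The same result can be had with -v, which keeps the date string as data instead of splicing it into the program text. A sketch, with prefix_date as a hypothetical function name:

```shell
# Prefix every input line with today's date (YYYY-MM-DD).
# Reads the files given as arguments, or stdin when there are none.
prefix_date() {
    awk -v d="$(date +"%Y-%m-%d")" '{ print d, $0 }' "$@"
}

printf '%s\n' 'first log line' 'second log line' | prefix_date
```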
Pro Tip
It can come in handy to create a function that handles this so you don't have to type everything every time. Using the selected solution we get:
awk_switch_columns() {
    # a and b are the numbers of the columns to swap; awk reads stdin directly
    awk -v a="$1" -v b="$2" '{ t = $a; $a = $b; $b = t; print }'
}
And use it as...
echo 'a b c d' | awk_switch_columns 2 4
Output:
a d c b

How to move grep inside awk script?

Below I have 3 grep commands that I would like to replace with awk's equivalent of grep, so I have tried
! /000000000000/;
! /000000000000/ $0;
! /000000000000/ $3;
where I don't get an error, but testing with both the script below and
$ echo 000000000000 | awk '{ ! /000000000000/; print }'
000000000000
it doesn't skip the lines as expected.
Question
Can anyone explain why my "not grep" doesn't work in awk?
grep -v '^#' $hosts | grep -E '[0-9A-F]{12}\b' | grep -v 000000000000 | awk '{
print "host "$5" {"
print " option host-name \""$5"\";"
gsub(/..\B/,"&:", $3)
print " hardware ethernet "$3";"
print " fixed-address "$1";"
print "}"
print ""
}' > /etc/dhcp/reservations.conf
Could you please try changing your code to:
echo 000000000000 | awk '!/000000000000/'
The problem in your attempt, echo 000000000000 | awk '{ ! /000000000000/; print }', is that the condition ! /000000000000/ is followed by a ; so it is evaluated as a standalone statement and has no effect. The print after it is not governed by that condition, so it simply prints the line.
awk works on pattern{action} pairs. Inside an action, a semicolon ends the statement before it, and whatever follows the ; is an entirely new statement for awk.
EDIT: Adding a possible solution based on the OP's attempt; it is not tested at all, since the OP shows no samples. I am also using --re-interval because my awk version is old; you can remove it if you have a newer version of awk on your box.
awk --re-interval '!/^#/ && !/000000000000/ && /[0-9A-Fa-f]{12}/{
print "host "$5" {"
print " option host-name \""$5"\";"
gsub(/..\B/,"&:", $3)
print " hardware ethernet "$3";"
print " fixed-address "$1";"
print "}"
print ""
}' "$hosts" > /etc/dhcp/reservations.conf
Taking a look at your code:
$ echo 000000000000 | awk '
{
! /000000000000/ # on given input this evaluates to false
# but since its in action, affects nothing
print # this prints the record regardless of whatever happened above
}'
Adding a print may help you understand:
$ echo 000000000000 | awk '{ print ! /000000000000/; print }'
0
000000000000
Removing the !:
$ echo 000000000000 | awk '{ print /000000000000/; print }'
1
000000000000
This is all I can help you with since there is not enough information for more.
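If the filter has to sit next to other rules in a longer script, a common pattern is to give it its own rule that calls next, which abandons the current record before any later rule runs. A minimal sketch (skip_zero_mac is a hypothetical name):

```shell
# Drop records containing twelve zeros, print everything else.
skip_zero_mac() {
    awk '/000000000000/ { next }   # skip this record and run no further rules
         { print }' "$@"
}

printf '%s\n' 000000000000 0123456789AB | skip_zero_mac
```

Only the second input line is printed.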

Run command inside awk and store result inplace

I have a script that I need to run on every value. It basically returns a number when given an argument, like below:
>>./myscript 4832
>>1100
my.csv contains the following:
123,4832
456,4833
789,4834
My command
cat my.csv | awk -F',' '{$3=system("../myscript $2");print $1,$2,$3'}
myscript is unable to understand that I'm passing the second input field $2 as argument. I need the output from the script to be added to the output as my 3rd column.
The expected output is
123,4832,1100
456,4833,17
789,4834,42
where the third field is the output from myscript with the second field as the argument.
If you are attempting to add a third field with the output from myscript $2 where $2 is the value of the second field, try
awk -F , '{ printf ("%s,%s,", $1, $2); system("../myscript " $2) }' my.csv
where we exploit the convenient fact that printf leaves the line unterminated, so the output from myscript completes it with the calculated value and a newline.
This isn't really a good use of Awk; you might as well do
while IFS=, read -r first second; do
printf "%s,%s," "$first" "$second"
../myscript "$second"
done <my.csv
I'm assuming you require comma-separated output; changing this to space-separated is obviously a trivial modification.
The syntax you want is:
awk 'BEGIN{FS=OFS=","}
{
cmd = "./myscript \047" $2 "\047"
val = ( (cmd | getline line) > 0 ? line : "NaN" )
close(cmd)
print $0, val
}
' file
Tweak the getline part to do different error handling if you like and make sure you read and fully understand http://awk.freeshell.org/AllAboutGetline before using getline.
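To try the getline pattern without the real myscript, the command can be passed in as a parameter. This is only a sketch: add_third_column and its prog variable are hypothetical, and echo and true merely stand in for the script.

```shell
# Append the first output line of "<command> <field 2>" as a third CSV column.
# $1 is the command to run; awk executes it through the shell for every row.
add_third_column() {
    awk -v prog="$1" 'BEGIN { FS = OFS = "," }
    {
        cmd = prog " \047" $2 "\047"                     # \047 is a single quote
        val = ((cmd | getline line) > 0) ? line : "NaN"  # NaN when nothing was read
        close(cmd)                                       # release the pipe for the next row
        print $0, val
    }'
}

printf '%s\n' '123,4832' '456,4833' | add_third_column echo
printf '%s\n' '123,4832' | add_third_column true
```

With echo as the stand-in, each row gets its own second field back as the third column; with true, which prints nothing, the fallback value NaN appears instead.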
In GNU awk we can use two-way communication with another process (a coprocess):
awk -F',' '{"../myscript "$2 |& getline v; print $1,$2,v}' my.csv
you get,
123 4832 1100
456 4833 17
789 4834 42
awk -F',' 'BEGIN { OFS=FS }{"../myscript "$2 |& getline v; print $1,$2,v}' my.csv
you get,
123,4832,1100
456,4833,17
789,4834,42
From the GNU awk online documentation:
system: Execute the operating system command command and then return to the awk program. Return command’s exit status (see further on).
So system() returns the exit status, not the output. You need to use getline instead; see the documentation on piping a command into getline.
You need to specify the $2 separately in the string concatenation, that is
awk -F',' '{ system("echo \"echo " $1 "$(../myexecutable " $2 ") " $3 "\" | bash"); }' my.csv

use awk to print a column, adding a comma

I have a file, from which I want to retrieve the first column, and add a comma between each value.
Example:
AAAA 12345 xccvbn
BBBB 43431 fkodks
CCCC 51234 plafad
to obtain
AAAA,BBBB,CCCC
I decided to use awk, so I did
awk '{ $1=$1","; print $1 }'
Problem is: this add a comma also on the last value, which is not what I want to achieve, and also I get a space between values.
How do I remove the comma on the last element, and how do I remove the space? Spent 20 minutes looking at the manual without luck.
$ awk '{printf "%s%s",sep,$1; sep=","} END{print ""}' file
AAAA,BBBB,CCCC
or if you prefer:
$ awk '{printf "%s%s",(NR>1?",":""),$1} END{print ""}' file
AAAA,BBBB,CCCC
or if you like golf and don't mind it being inefficient for large files:
$ awk '{r=r s $1;s=","} END{print r}' file
AAAA,BBBB,CCCC
awk {'print $1","$2","$3'} file_name
This is the shortest I know
Why make it complicated :) (as long as file is not too large)
awk '{a=NR==1?$1:a","$1} END {print a}' file
AAAA,BBBB,CCCC
For better portability:
awk '{a=(NR>1?a",":"")$1} END {print a}' file
You can do this:
awk 'a++{printf ","}{printf "%s", $1}' file
a++ is interpreted as a condition. In the first row its value is 0, so the comma is not added.
EDIT:
If you want a newline, you have to add END{printf "\n"}. If you have problems reading in the file, you can also try:
cat file | awk 'a++{printf ","}{printf "%s", $1}'
awk 'NR==1{printf "%s",$1;next;}{printf "%s%s",",",$1;}' input.txt
It says: If it is first line only print first field, for the other lines first print , then print first field.
Output:
AAAA,BBBB,CCCC
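The approaches above can be wrapped into a small reusable function that takes the separator as an argument. A sketch, assuming the hypothetical name join_first_column and input on stdin:

```shell
# Join the first column of every input line with the separator given in $1.
join_first_column() {
    awk -v sep="$1" '{ printf "%s%s", (NR > 1 ? sep : ""), $1 }
                     END { print "" }'      # finish with a single newline
}

printf '%s\n' 'AAAA 12345 xccvbn' 'BBBB 43431 fkodks' 'CCCC 51234 plafad' |
    join_first_column ,
```

The separator is emitted before every record except the first, so there is no trailing comma to remove afterwards.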
In this case, a simple cut and paste solution:
cut -d" " -f1 file | paste -s -d,
In case somebody, like me, wants to use awk for cleaning Docker images:
docker image ls | grep tag_name | awk '{print $1":"$2}'
Surprised that no one is using OFS (output field separator). Here is probably the simplest solution that sticks with awk and works on Linux and Mac: use "-v OFS=," to output with a comma as the delimiter:
$ echo '1:2:3:4' | awk -F: -v OFS=, '{print $1, $2, $4, $3}'
generates:
1,2,4,3
It works for multiple characters too:
$ echo '1:2:3:4' | awk -F: -v OFS=., '{print $1, $2, $4, $3}'
outputs:
1.,2.,4.,3
Using Perl
$ cat group_col.txt
AAAA 12345 xccvbn
BBBB 43431 fkodks
CCCC 51234 plafad
$ perl -lane ' push(@x,$F[0]); END { print join(",",@x) } ' group_col.txt
AAAA,BBBB,CCCC
$
This can be very simple like this:
awk -F',' '{print $1","$1","$2","$3}' inputFile
where the input file is:
1,2,3
2,3,4
etc.
I used the following, because it lists the api-resource names with it, which is useful, if you want to access it directly. I also use a label "application" to find specific apps in a namespace:
kubectl -n ops-tools get $(kubectl api-resources --no-headers=true --sort-by=name | awk '{printf "%s%s",sep,$1; sep=","}') -l app.kubernetes.io/instance=application

How to pass a shell variable to awk in Bourne shell?

I'm a newbie to Bourne shell and want to do simple array simulation. This works:
COLORS='FF0000 0000FF 00FF00'
i=2
color=$(echo ${COLORS} | awk '{print $2}')
echo "color selected: $color"
What I want to do is to pass $i instead of the fixed $2 parameter in print (this will later be used in a loop). I spent hours figuring out the right combination of single and double quotes to do this, no luck.
The closest I got is
color=$("echo ${COLORS} | awk '{print "$"${i}}'")
The run result is:
+ COLORS=FF0000 0000FF 00FF00
+ i=2
+ echo FF0000 0000FF 00FF00 | awk '{print $2}'
./tempgraph.sh: ./tempgraph.sh: 37: echo FF0000 0000FF 00FF00 | awk '{print $2}': not found
+ color=
+ echo color selected:
color selected:
Any help is appreciated.
Don't waste your time trying to get the shell to expand the variable correctly in the awk command, just define a variable using -v:
echo "$COLORS" | awk -v col=2 '{ print $col }'
In terms of your i variable, this becomes:
i=1
echo "$COLORS" | awk -v col="$i" '{ print $col }'
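Since the question mentions using this in a loop later, here is a sketch of how the -v approach might look there. The helper name nth_color and the fixed count of 3 are assumptions made for illustration.

```shell
COLORS='FF0000 0000FF 00FF00'

# Print the colour at position $1 (1-based) from the COLORS list.
nth_color() {
    echo "$COLORS" | awk -v col="$1" '{ print $col }'
}

i=1
while [ "$i" -le 3 ]; do
    echo "color $i: $(nth_color "$i")"
    i=$(expr "$i" + 1)      # expr keeps the loop usable in older shells
done
```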
You can also get at your environment directly:
export COLORS='FF0000 0000FF 00FF00'
awk 'END {split(ENVIRON["COLORS"],colors);for(col in colors) { print "Color",col,"is",colors[col]}}' /dev/null
which gives the following output on this mac:
Color 2 is 0000FF
Color 3 is 00FF00
Color 1 is FF0000
I'd do it like this:
color=$(echo ${COLORS} | awk "{print \$$i}")
If you use '...', the content is not expanded. But you want the value of $i inserted in your script. So "..." is to be used, which does variable expanding. But you also want a $ in front of the number for AWK, so you've got to escape it (\$).
Variables assigned on invocation with -v foo=bar are available in the BEGIN block, whereas variables assigned with a plain baz=qux operand are not. Given this hello.awk:
BEGIN { print foo, bar; }
{ print foo, bar; }
See the difference:
echo Don\'t Panic! | awk -f ./hello.awk -v foo=Hello bar=World
Hello
Hello World