AWK column from a script param? - awk

I want to call awk from a bash script like this:
#!/bin/bash
awk -vFPAT='[^ ]*|"[^"]*"|\\[[^]]*\\]' '{ print $2 }' $1
I want $2 to be a number that I specify. So if the script is named get-log-column I'd like to be able to call it this way: get-log-column /var/log/apache2/access.log 4
In this example, 4 would be the column so the output would be column 4 from access.log.
In other words, if access.log looked like this:
alpha beta orange apple snickers
paris john michael peace world
So the output would be:
apple
peace

Could you please try following.
#!/bin/bash
var="$1"
awk -v FPAT='[^ ]*|"[^"]*"|\\[[^]]*\\]' -v varcol="$var" '{ print $varcol }' Input_file
Explanation:
Have created a shell variable var which will have $1 value in it. Where $1 value is the argument passed to script. Now in awk we can't pass shell variables directly so created 1 awk variable named var_col which will have value of var in it. Now mentioning $varcol will print column value from current line as per OP's question. $ means field number and varcol is a variable which has user entered value in it.

This may work
#!/bin/bash
awk -v var="$1" -v FPAT='[^ ]*|"[^"]*"|\\[[^]]*\\]' '{ print $var }' $1
See this on how to use variable from shell in awk
How do I use shell variables in an awk script?

Related

awk command to read a key value pair from a file

I have a file input.txt which stores information in KEY:VALUE form. I'm trying to read GOOGLE_URL from this input.txt which prints only http because the seperator is :. What is the problem with my grep command and how should I print the entire URL.
SCRIPT
$> cat script.sh
#!/bin/bash
URL=`grep -e '\bGOOGLE_URL\b' input.txt | awk -F: '{print $2}'`
printf " $URL \n"
INPUT_FILE
$> cat input.txt
GOOGLE_URL:https://www.google.com/
OUTPUT
https
DESIRED_OUTPUT
https://www.google.com/
Since there are multiple : in your input, getting $2 will not work in awk because it will just give you 2nd field. You actually need an equivalent of cut -d: -f2- but you also need to check key name that comes before first :.
This awk should work for you:
awk -F: '$1 == "GOOGLE_URL" {sub(/^[^:]+:/, ""); print}' input.txt
https://www.google.com/
Or this non-regex awk approach that allows you to pass key name from command line:
awk -F: -v k='GOOGLE_URL' '$1==k{print substr($0, length(k FS)+1)}' input.txt
Or using gnu-grep:
grep -oP '^GOOGLE_URL:\K.+' input.txt
https://www.google.com/
Could you please try following, written and tested with shown samples in GNU awk. This will look for string GOOGLE_URL and will catch further either http or https value from url, in case you need only https then change http[s]? to https in following solution please.
awk '/^GOOGLE_URL:/{match($0,/http[s]?:\/\/.*/);print substr($0,RSTART,RLENGTH)}' Input_file
Explanation: Adding detailed explanation for above.
awk ' ##Starting awk program from here.
/^GOOGLE_URL:/{ ##Checking condition if line starts from GOOGLE_URL: then do following.
match($0,/http[s]?:\/\/.*/) ##Using match function to match http[s](s optional) : till last of line here.
print substr($0,RSTART,RLENGTH) ##Printing sub string of matched value from above function.
}
' Input_file ##Mentioning Input_file name here.
2nd solution: In case you need anything coming after first : then try following.
awk '/^GOOGLE_URL:/{match($0,/:.*/);print substr($0,RSTART+1,RLENGTH-1)}' Input_file
Take your pick:
$ sed -n 's/^GOOGLE_URL://p' file
https://www.google.com/
$ awk 'sub(/^GOOGLE_URL:/,"")' file
https://www.google.com/
The above will work using any sed or awk in any shell on every UNIX box.
I would use GNU AWK following way for that task:
Let file.txt content be:
EXAMPLE_URL:http://www.example.com/
GOOGLE_URL:https://www.google.com/
KEY:GOOGLE_URL:
Then:
awk 'BEGIN{FS="^GOOGLE_URL:"}{if(NF==2){print $2}}' file.txt
will output:
https://www.google.com/
Explanation: GNU AWK FS might be pattern, so I set it to GOOGLE_URL: anchored (^) to begin of line, so GOOGLE_URL: in middle/end will not be seperator (consider 3rd line of input). With this FS there might be either 1 or 2 fields in each line - latter is case only if line starts with GOOGLE_URL: so I check number of fields (NF) and if this is second case I print 2nd field ($2) as first record in this case is empty.
(tested in gawk 4.2.1)
Yet another awk alternative:
gawk -F'(^[^:]*:)' '/^GOOGLE_URL:/{ print $2 }' infile

Extract fields from logs with awk and aggregate them for a new command

I have this kind of log:
2018-10-05 09:12:38 286 <190>1 2018-10-05T09:12:38.474640+00:00 app web - - Class uuid=uuid-number-one cp=xxx action='xxxx'
2018-10-05 10:11:23 286 <190>1 2018-10-05T10:11:23.474640+00:00 app web - - Class uuid=uuid-number-two cp=xxx action='xxxx'
I need to extract uuid and run a second query with:
./getlogs --search 'uuid-number-one OR uuid-number-two'
For the moment for the first query I do this to extract uuid:
./getlogs | grep 'uuid' | awk 'BEGIN {FS="="} { print $2 }' | cut -d' ' -f1
My three question :
I think I could get rid of grep and cut and use only awk?
How could I capture only the value of uuid. I tried awk '/uuid=\S*/{ print $1 }' or awk 'BEGIN {FS="uuid=\\S*"} { print $1 }' but it's a failure.
How could I aggregate the result and turn it into one shell variable that I can use after for the new command?
You could define two field separators:
$ awk -F['= '] '/uuid/{print $12}' file
Result:
uuid-number-one
uuid-number-two
Question 2:
The pattern part in awk just selects lines to process. It doesn't change the internal variables like $1 or NF. You need to do the replacement afterwards:
$ awk '/uuid=/{print gensub(/.*uuid=(\S*).*/, "\\1", "")}' file
Question 3:
var=$(awk -F['= '] '/uuid/{r=r","$12}END{print substr(r,2)}' file)
Implement the actual aggregation for each line (here r=r","$12).
Could you please try following(tested on shown samples and in BASH environment).
awk 'match($0,/uuid=[^ ]*/){print substr($0,RSTART+5,RLENGTH-5)}' Input_file
Solution 2nd: In case your uid is not having space in it then use following.
awk '{sub(/.*uuid=/,"");sub(/ .*/,"")} 1' Input_file
solution 3rd: using sed following may help you(considering that uid is not having any space in its values).
sed 's/\(.*uuid=\)\([^ ]*\)\(.*\)/\2/' Input_file
Solution 4th: using awk field separator method for shown samples.
awk -F'uuid=| cp' '{print $2}' Input_file
To concatenate all values into a shell variable use following.
shell_var=$(awk 'match($0,/uuid=[^ ]*/){val=val?val OFS substr($0,RSTART+5,RLENGTH-5):substr($0,RSTART+5,RLENGTH-5)} END{print val}' Input_file)

awk field separator not working for first line

echo 'NODE_1_length_317516_cov_18.568_ID_4005' | awk 'FS="_length" {print $1}'
Obtained output:
NODE_1_length_317516_cov_18.568_ID_4005
Expected output:
NODE_1
How is that possible? I'm missing something.
When you are going through lines using Awk, the field separator is interpreted before processing the record. Awk reads the record according the current values of FS and RS and goes ahead performing the operations you ask it for.
This means that if you set the value of FS while reading a record, this won't have effect for that specific record. Instead, the FS will have effect when reading the next one and so on.
So if you have a file like this:
$ cat file
1,2 3,4
5,6 7,8
And you set the field separator while reading one record, it takes effect from the next line:
$ awk '{FS=","} {print $1}' file
1,2 # FS is still the space!
5
So what you want to do is to set the FS before starting to read the file. That is, set it in the BEGIN block or via parameter:
$ awk 'BEGIN{FS=","} {print $1}' file
1,2 # now, FS is the comma
5
$ awk -F, '{print $1}' file
1
5
There is also another way: make Awk recompute the full record with {$0=$0}. With this, Awk will take into account the current FS and act accordingly:
$ awk '{FS=","} {$0=$0;print $1}' file
1
5
awk Statement used incorrectly
Correct way is
awk 'BEGIN { FS = "#{delimiter}" } ; { print $1 }'
In your case you can use
awk 'BEGIN { FS = "_length" } ; { print $1 }'
Inbuilt variables like FS, ORS etc must be set within a context i.e in 1 of the following blocks: BEGIN, condition blocks or END.
$ echo 'NODE_1_length_317516_cov_18.568_ID_4005' | awk 'BEGIN{FS="_length"} {print $1}'
NODE_1
$
You can also pass the delimiter using -F switch like this:
$ echo 'NODE_1_length_317516_cov_18.568_ID_4005' | awk -F "_length" '{print $1}'
NODE_1
$

how to search using variables in awk

I've this file test1 as follows
NAME1 04-03-2014
NAME2 04-04-2014
Now I'm able to get the o/p that I need when I execute from the command line but not when I use it as a variable. My awk version is 3.1.5
works from the cmd line: awk '$2 = /04-03-2014/' test1
doesn't work in the bash/ksh script:
export m=04 d=03 y=2014
awk -v m="$m" -v d="$d" -v y="$y" '$2 = /m-d-y/' test1 OR
awk -v m="$m" -v d="$d" -v y="$y" '$2 = m-d-y' test1 ### This replaces the 2nd field as -2010
I've used variables in awk before in printf statements to calculate numbers but the above example somehow doesn't seem to work for me. However, I tried a workaround i.e echo the awk cmd in a file and then executing the file as a script which gives me the desired o/p however I don't think this is the correct way of doing things and would appreciate if someone could give me a smarter way.
It's a guess but I bet you're trying to print every line where $2 is equal to a specific string constructed from your variables. If so that'd be:
awk -v m="$m" -v d="$d" -v y="$y" '$2 == m"-"d"-"y' file

How to find if substring is in a variable in awk

i am using awk and need to find if a variable , in this case $24 contains the word 3:2- if so to print the line (for sed command)- the variable may include more letters or spaces or \n.......
for ex.
$24 == "3:2" {print "s/(inter = ).*/\\1\"" "3:2_pulldown" "\"/" >> NR }
in my above line- it never find such a string although it exists.
can you help me with the command please??
If you're looking for "3:2" within $24, then you want $24 ~ /3:2/ or index($24, "3:2") > 0
Why are you using awk to generate a sed script?
Update
To pass a variable from the shell to awk, use the -v option:
val="3:2" # or however you determine this value
awk -v v="$val" '$24 ~ v {print}'
awk '$24~/3:2/' file_name
this will serach for "3:2" in field 24