How to use variables in awk scripts

I am having trouble using variables in awk scripts. I want to assign
myvariable = tolower(substr($1,0,2)) tolower(substr($2,0,8))
so that I can use $myvariable in the script instead of repeating the expression every time.
I have tried this, but it prints the whole line, with nothing cut from the strings.
Thanks.

Awk is different from Linux shell scripting: you don't put "$" in front of variable names. In awk, "$" has its own special meaning: it references a field of the current record ($1, $2, ...) or the whole record ($0).
If you are declaring your variable, declare it like this:
myvariable = tolower(substr($1,0,2)) tolower(substr($2,0,8))
and then refer to it as myvariable, without the $ in front, inside your awk statement.
If you have a variable declared in your shell and you want to use it in your awk script, you can assign it to an awk variable like this:
awk -v awkvariable="$myshellvariable" '{...commands....}'
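As a rough illustration of both points, the sketch below assigns a user-defined awk variable and reads it back by its bare name, then passes a shell value in with -v. The sample input, the shell variable prefix and the result names key and tagged are made up for the example. It also assumes a start index of 1 in substr(), since awk strings are indexed from 1.

```shell
# A user-defined awk variable is assigned and read back without "$".
key=$(echo "JOHN Doesmith" | awk '{
    myvariable = tolower(substr($1, 1, 2)) tolower(substr($2, 1, 8))
    print myvariable
}')
echo "$key"

# A shell variable is handed to awk with -v and used under its awk name.
prefix="user_"    # hypothetical shell variable
tagged=$(echo "JOHN Doesmith" | awk -v awkvariable="$prefix" '{
    print awkvariable tolower($1)
}')
echo "$tagged"
```

Inside the awk program the only "$" signs left are the field references $1 and $2.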

AWK has three "types" of variables:
fields, accessed by position ($1, $2, $3, ...)
a bunch of system variables (NR, NF, FS, RS, ...)
user-defined variables (my_var = 123, i = "hello", ...)
To print either of the latter two, you simply write the variable name:
my_var = "john doe";
print my_var;
print NF; # Print Number of Fields
e.g.
echo john was here | awk '{print NF}' # 3
The interesting part is that you can combine system or user-defined variables with the field operator $ like so:
my_var = 2;
print $my_var; # print second field
print $NF; # Print last field (using the "Number of Fields" variable)
e.g.
echo john was here | awk '{print $NF}' # here
echo john was here | awk '{print $(NF-1)}' # was
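This field indirection is also a likely reason why putting "$" in front of a string-valued variable prints the whole line: a non-numeric string converts to 0, and $0 is the entire record. A minimal sketch, with the variable names name, idx, whole and second chosen only for illustration:

```shell
# "$name" means "the field whose number is the value of name".
# A non-numeric string converts to 0, so this is $0, the whole line.
whole=$(echo "john was here" | awk '{ name = "jo"; print $name }')
echo "$whole"

# With a numeric value the same syntax selects that field.
second=$(echo "john was here" | awk '{ idx = 2; print $idx }')
echo "$second"
```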

Related

awk to store string format in variable

With the awk command below, f is empty when I echo it. If I remove the $f part I get the desired result, but then the new formatting is not stored in the $d variable. Basically, I am trying to convert the string in the $d variable into a new, formatted variable $f. Thank you :).
file
ID,1A
DATE,220102
awk
d=$(awk -F, '/Date/ {print $2}' file) | f=$(date -d "$d" +'%Y-%m-%d')
f --- desired ---
2022-01-02

You need to use it this way to return a value from awk and set a shell variable:
f=$(date -d "$(awk -F, '/DATE/ {print $2}' file)" +'%Y-%m-%d')
echo "$f"
2022-01-02

With awk:
awk 'BEGIN{FS=","; OFS="-"} $1=="DATE"{ print "20" substr($2,1,2), substr($2,3,2), substr($2,5,2) }' file
Output:
2022-01-02
See: 8 Powerful Awk Built-in Variables – FS, OFS, RS, ORS, NR, NF, FILENAME, FNR
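If the formatted string should live in an awk variable before it is printed, sprintf() can build it. The sketch below assumes the same two-line sample input, fed through printf instead of a file, and the names f and formatted are illustrative only.

```shell
# Build the formatted date in an awk variable with sprintf(), then print it.
formatted=$(printf 'ID,1A\nDATE,220102\n' | awk -F, '$1 == "DATE" {
    f = sprintf("20%s-%s-%s", substr($2, 1, 2), substr($2, 3, 2), substr($2, 5, 2))
    print f
}')
echo "$formatted"
```

The command substitution then carries the value out of awk into the shell variable formatted.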

With your shown samples, please try the following awk code, written and tested in GNU awk. It uses the three-argument form of match() with the regex ^DATE,([0-9]{2})([0-9]{2})([0-9]{2})$ to get the required output. The regex creates 3 capturing groups, and match() stores the captured values in the array arr. Once the match succeeds, the code prints 20 followed by the 3 array values, separated by -, as required.
awk -v OFS="-" '
match($0,/^DATE,([0-9]{2})([0-9]{2})([0-9]{2})$/,arr){
print "20" arr[1],arr[2],arr[3]
}
' Input_file

While the other answers provide a more efficient method of reformatting the date (and assuming OP has no need for d in follow-on code), I'm going to focus solely on a couple of issues with OP's current code:
the awk script needs to match (all caps) DATE instead of Date
the current code pipes d=$(...) into f=$(...); in bash, each side of a pipeline normally runs in its own subprocess, so the value of d is not visible to the f=$(...) side, and the assignment to f is effectively 'unassigned' as soon as its subprocess exits; what OP really needs is to separate the d=$(...) and f=$(...) commands so that both assignments occur in the current shell, which can be done by replacing the pipe with a semicolon.
If we make these two simple edits:
# old code:
d=$(awk -F, '/Date/ {print $2}' file) | f=$(date -d "$d" +'%Y-%m-%d')
              ^^^^                   ^^^
# new code:
d=$(awk -F, '/DATE/ {print $2}' file) ; f=$(date -d "$d" +'%Y-%m-%d')
              ^^^^                   ^^^
OP's code will now generate the desired result:
$ echo "${f}"
2022-01-02
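The difference between the pipe and the semicolon can be seen without awk or date at all. This is a small sketch assuming bash with default settings (no lastpipe); the variable names x, y, after_pipe and after_semicolon are made up.

```shell
unset x y

# Each side of the pipeline runs in a subprocess, so y is gone afterwards.
x=$(echo first) | y=$(echo second)
after_pipe="${y:-unset}"
echo "$after_pipe"

# With a semicolon both assignments happen in the current shell.
x=$(echo first) ; y=$(echo second)
after_semicolon="${y:-unset}"
echo "$after_semicolon"
```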

the string approaches :
{n,g}awk -F'^[^,]*,' 'gsub("^....|..", "-&", $(_=!(NF*=NF==NR)))\
($+_ = substr($+_,++_+_--))^_' OFS=20
mawk -F'^[^,]*,' '$(gsub("^....|..", "-&",
$!(NF*=NF==NR))*(_=!NF)) = substr($_,++_+_)' OFS=20
mawk2 'gsub("^....|..", "-&",
$!(NF*=NF==NR)) + sub(".",_)^_' FS='^.+,' OFS=20
the numeric approach :
mawk -F',' 'NF==NR && ($!NF = sprintf("20%.*s-%.*s-%0*.f", _+=_^=_<_,
__ = $NF, _++, substr(__,_), --_, __%(_+_*_*_)^_))'
2022-01-02

Redhat: how can I use a space and double quote as separators to output the 2nd, 4th and last column from each line in a file

I have the following multiple lines in a file on Linux, the line information differs but the format is always the same:
-item bread.maker -model "modelname model type modelnum-43453-23241.7" -date1 23.10.01 -date2 30.10.04 -date3 04.02.05
I want to output only the 2nd, 4th and last columns of each line. I've tried awk -F, and print $NF, but I cannot seem to get it to treat the double-quoted part as 1 column.

With any awk:
$ awk 'match($0,/"[^"]*"/){print $2, substr($0,RSTART,RLENGTH), $NF}' file
bread.maker "modelname model type modelnum-43453-23241.7" 04.02.05
or:
$ awk -v OFS='"' '{split($0,f,/"/); print $2, f[2], $NF}' file
bread.maker"modelname model type modelnum-43453-23241.7"04.02.05
or with GNU awk for FPAT:
$ awk -v FPAT='[^" ]+|"[^"]*"' '{print $2, $4, $10}' file
bread.maker "modelname model type modelnum-43453-23241.7" 04.02.05
Set OFS as appropriate if you want something other than a blank char to separate the output fields. I used " as the OFS for the 2nd script since it must not be present in your input if you're already using it to quote strings.
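As one possible variation on the first script, the sketch below sets OFS to a comma and strips the surrounding quotes by narrowing the substr() range by one character on each side. The shell variables line and csv are illustrative, and the input is the sample line from the question.

```shell
line='-item bread.maker -model "modelname model type modelnum-43453-23241.7" -date1 23.10.01 -date2 30.10.04 -date3 04.02.05'

# RSTART points at the opening quote and RLENGTH includes both quotes,
# so RSTART+1 / RLENGTH-2 selects only the text between them.
csv=$(printf '%s\n' "$line" | awk -v OFS=',' '
    match($0, /"[^"]*"/) { print $2, substr($0, RSTART + 1, RLENGTH - 2), $NF }
')
echo "$csv"
```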

With bash v5.1, we can assign an even-numbered list of words as an associative array: it will be treated as a key-value list.
declare -A fields
while IFS= read -r line; do
eval fields=("$line") # yeah, eval is needed here to respect
# the quotes in the line
printf '%s,%s,%s\n' "${fields[-item]}" "${fields[-model]}" "${fields[-date3]}"
done < file
bread.maker,modelname model type modelnum-43453-23241.7,04.02.05

AWK column from a script param?

I want to call awk from a bash script like this:
#!/bin/bash
awk -vFPAT='[^ ]*|"[^"]*"|\\[[^]]*\\]' '{ print $2 }' $1
I want $2 to be a number that I specify. So if the script is named get-log-column I'd like to be able to call it this way: get-log-column /var/log/apache2/access.log 4
In this example, 4 would be the column so the output would be column 4 from access.log.
In other words, if access.log looked like this:
alpha beta orange apple snickers
paris john michael peace world
then the output would be:
apple
peace

Could you please try the following.
#!/bin/bash
var="$2"
awk -v FPAT='[^ ]*|"[^"]*"|\\[[^]]*\\]' -v varcol="$var" '{ print $varcol }' "$1"
Explanation:
This creates a shell variable var holding the value of $2, the second argument passed to the script (the column number); $1 is the file name. awk can't read shell variables directly, so an awk variable named varcol is created with the value of var. $varcol then prints that column of the current line, as per OP's question: $ selects a field by number, and varcol holds the number the user entered.

This may work:
#!/bin/bash
awk -v var="$2" -v FPAT='[^ ]*|"[^"]*"|\\[[^]]*\\]' '{ print $var }' "$1"
See this on how to use a variable from the shell in awk:
How do I use shell variables in an awk script?
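A self-contained way to try the idea is sketched below. It assumes the file name is the first argument and the column number the second, as in the question's example call, and it leaves out FPAT so that it runs in any awk. The function name get_column and the temporary file are hypothetical.

```shell
# get_column FILE N: print field N of every line in FILE.
get_column() {
    awk -v col="$2" '{ print $col }' "$1"
}

# Sample data from the question, written to a temporary file.
tmp=$(mktemp)
printf 'alpha beta orange apple snickers\nparis john michael peace world\n' > "$tmp"

cols=$(get_column "$tmp" 4)
echo "$cols"
rm -f "$tmp"
```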

How to use variable including special symbol in awk?

In my case, if a certain pattern is found as the second field of a line in a file, I need to print the first two fields. It should also handle special symbols such as a backslash.
My solution is to first use sed to replace \ with \\, then pass the new variable to awk; awk parses \\ as \ and then matches field 2.
escaped_str=$( echo "$pattern" | sed 's/\\/\\\\/g')
input | awk -v awk_escaped_str="$escaped_str" '$2==awk_escaped_str { $0=$1 " " $2 " "}; { print } '
This seems too complicated, though, and it cannot handle every case.
Is there a simpler way that also covers all other special symbols?

The way to pass a shell variable to awk without backslashes being interpreted is to pass it in the arg list instead of populating an awk variable outside of the script:
$ shellvar='a\tb'
$ awk -v awkvar="$shellvar" 'BEGIN{ printf "<%s>\n",awkvar }'
<a b>
$ awk 'BEGIN{ awkvar=ARGV[1]; ARGV[1]=""; printf "<%s>\n",awkvar }' "$shellvar"
<a\tb>
and then you can search a file for it as a string using index() or ==:
$ cat file
a b
a\tb
$ awk 'BEGIN{ awkvar=ARGV[1]; ARGV[1]="" } index($0,awkvar)' "$shellvar" file
a\tb
$ awk 'BEGIN{ awkvar=ARGV[1]; ARGV[1]="" } $0 == awkvar' "$shellvar" file
a\tb
You need to set ARGV[1]="" after populating the awk variable to avoid the shell variable value also being treated as a file name. Unlike any other way of passing in a variable, ALL characters used in a variable this way are treated literally with no "special" meaning.

There are three variations you can try without needing to escape your pattern:
This one tests literal strings. No regex instance is interpreted:
$2 == expr
This one tests if a literal string is a subset:
index($2, expr)
This one tests regex pattern:
$2 ~ pattern
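The three tests behave differently on the same data, which a short sketch can make concrete. The needle a.b, the three sample rows and the result names exact, contains and regex are invented for the example; each command collects the row numbers that pass the test.

```shell
needle='a.b'
data='1 a.b
2 axb
3 xa.bx'

# Exact string comparison: only the row whose second field is exactly "a.b".
exact=$(printf '%s\n' "$data" | awk -v s="$needle" '$2 == s { out = out $1 } END { print out }')

# Literal substring test: rows whose second field contains "a.b".
contains=$(printf '%s\n' "$data" | awk -v s="$needle" 'index($2, s) { out = out $1 } END { print out }')

# Regex match: "." matches any character, so "axb" matches too.
regex=$(printf '%s\n' "$data" | awk -v s="$needle" '$2 ~ s { out = out $1 } END { print out }')

echo "$exact $contains $regex"
```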

How to find if substring is in a variable in awk

I am using awk and need to find whether a variable, in this case $24, contains the string 3:2; if so, I want to print the line (for a sed command). The variable may include more letters, spaces or \n.
For example:
$24 == "3:2" {print "s/(inter = ).*/\\1\"" "3:2_pulldown" "\"/" >> NR }
The line above never finds the string, although it exists.
Can you help me with the command, please?

If you're looking for "3:2" within $24, then you want $24 ~ /3:2/ or index($24, "3:2") > 0
Why are you using awk to generate a sed script?
Update
To pass a variable from the shell to awk, use the -v option:
val="3:2" # or however you determine this value
awk -v v="$val" '$24 ~ v {print}'

awk '$24~/3:2/' file_name
This will search for "3:2" in field 24.
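To see why the == comparison in the question finds nothing when the field holds extra text, the sketch below runs both tests on made-up rows. It uses field 2 instead of field 24 to keep the sample short, and the names rows, sub and eq are illustrative.

```shell
rows='x 3:2_pulldown
y 2:2
z 3:2'

# index() succeeds whenever "3:2" appears anywhere in the field.
sub=$(printf '%s\n' "$rows" | awk -v v="3:2" 'index($2, v) > 0 { out = out $1 } END { print out }')

# == succeeds only when the field is exactly "3:2".
eq=$(printf '%s\n' "$rows" | awk -v v="3:2" '$2 == v { out = out $1 } END { print out }')

echo "$sub $eq"
```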
