Can we use shell variables in awk? - awk

Can we use shell variables in AWK like $VAR instead of $1, $2? For example:
UL=(AKHIL:AKHIL_NEW,SWATHI:SWATHI_NEW)
NUSR=`echo ${UL[*]}|awk -F, '{print NF}'`
echo $NUSR
echo ${UL[*]}|awk -F, '{print $NUSR}'
I am an Oracle DBA and we get a lot of import requests, so I'm trying to automate them with a script. The script will find the users in the dump and prompt for the users into which the dump needs to be loaded.
Suppose the dump has two users, AKHIL and SWATHI (there can be many users in the dump, and I may want to import more of them). I want to import the dump into new users AKHIL_NEW and SWATHI_NEW, so the input to be read is something like AKHIL:AKHIL_NEW,SWATHI:SWATHI_NEW.
First, I need to find the Number of users to be created, then I need to get new users i.e. AKHIL_NEW,SWATHI_NEW from the input we have given. So that I can connect to the database and create the new users and then import. I'm not copying the entire code: I just copied the code from where it accepts the input users.
UL=(AKHIL:AKHIL_NEW,SWATHI:SWATHI_NEW) ## there can be many users, like USER1:USER1_NEW,USER2:USER2_NEW,USER3:USER3_NEW..
NUSR=`echo ${UL[*]}|awk -F, '{print NF}'` #finding number of fields or users
y=1
while [ $y -le $NUSR ] ; do
USER=`echo ${UL[*]}|awk -F, -v NUSR=$y '{print $NUSR}' |awk -F: '{print $2}'` #getting the users to be created (AKHIL_NEW and SWATHI_NEW) and passing them to SQLPLUS
if [[ $USER = SCPO* ]]; then
TBS=SCPODATA
else
if [[ $USER = WWF* ]]; then
TBS=WWFDATA
else
if [[ $USER = STSC* ]]; then
TBS=SCPODATA
else
if [[ $USER = CSM* ]]; then
TBS=CSMDATA
else
if [[ $USER = TMM* ]]; then
TBS=TMDATA
else
if [[ $USER = IGP* ]]; then
TBS=IGPDATA
fi
fi
fi
fi
fi
fi
sqlplus -s '/ as sysdba' <<EOF # CREATING the USERS in the database
CREATE USER $USER IDENTIFIED BY $USER DEFAULT TABLESPACE $TBS TEMPORARY TABLESPACE TEMP QUOTA 0K on SYSTEM QUOTA UNLIMITED ON $TBS;
GRANT
CONNECT,
CREATE TABLE,
CREATE VIEW,
CREATE SYNONYM,
CREATE SEQUENCE,
CREATE DATABASE LINK,
RESOURCE,
SELECT_CATALOG_ROLE
to $USER;
EOF
y=`expr $y + 1`
done
impdp sysem/manager DIRECTORY=DATA_PUMP DUMPFILE=imp.dp logfile=impdp.log SCHEMAS=AKHIL,SWATHI REMAP_SCHEMA=${UL[*]}
In the last impdp command I need to get the original users in the dumps i.e AKHIL,SWATHI using the variables.

Yes, you can use the shell variables inside awk. There are a bunch of ways of doing it, but my favorite is to define a variable with the -v flag:
$ echo | awk -v my_var=4 '{print "My var is " my_var}'
My var is 4
Just pass the value of the shell variable as the argument to the -v flag. For example, if you have this variable:
$ VAR=3
$ echo $VAR
3
Use it this way:
$ echo | awk -v env_var="$VAR" '{print "The value of VAR is " env_var}'
The value of VAR is 3
Of course, you can give the same name, but the $ will not be necessary:
$ echo | awk -v VAR="$VAR" '{print "The value of VAR is " VAR}'
The value of VAR is 3
A note about the $ in awk: unlike bash, Perl, PHP etc., it is not part of the variable's name but instead an operator.
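Because $ is an operator, it can be applied to any expression that evaluates to a field number, not only to a literal. A minimal sketch, where the variable name n and the sample line are made up for illustration:

```shell
line="alpha beta gamma"
n=2

# $n means "field number n"; $(n+1) applies $ to an arithmetic expression;
# $NF applies $ to the built-in field count, giving the last field.
out=$(printf '%s\n' "$line" | awk -v n="$n" '{ print $n, $(n+1), $NF }')
echo "$out"
```

Here awk never sees the shell's $n; it only sees its own variable n, which -v initialised from the shell value.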

Awk and Gawk provide the ENVIRON associative array that holds all exported environment variables. So in your awk script you can use ENVIRON["VarName"] to get the value of VarName, provided that VarName has been exported before running awk.
Note ENVIRON is a predefined awk variable NOT a shell environment variable.
Since I don't have enough reputation to comment on the other answers I have to include them here!
The earlier answer showing $ENVIRON is incorrect - that syntax would be expanded by the shell, and probably result in expanding to nothing.
The earlier comments claiming that C cannot access environment variables are also wrong. C (and C++) can read environment variables using the getenv("VarName") function, and many other languages provide similar access (e.g., Java: System.getenv(), Python: os.environ, Haskell: System.Environment, ...). Note that in all cases a program works on its own copy of the environment: it can read the variables, but any change it makes is not passed back to the calling script.
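A small sketch of both points, assuming a made-up variable called GREETING: awk only sees the variable in ENVIRON once it has been exported, and a change made by a child process does not travel back to the caller.

```shell
# GREETING is a made-up variable name used only for this sketch.
unset GREETING
GREETING="hello"

# Not exported yet: ENVIRON has no such entry, so awk prints an empty string.
before=$(awk 'BEGIN { print "[" ENVIRON["GREETING"] "]" }')

export GREETING
after=$(awk 'BEGIN { print "[" ENVIRON["GREETING"] "]" }')

# A child process changes only its own copy; the caller keeps the old value.
sh -c 'GREETING=changed'

echo "$before $after $GREETING"
```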

There are two ways to pass variables to awk: one way is defining the variable in a command line argument:
$ echo ${UL[*]}|awk -F, -v NUSR=$NUSR '{print $NUSR}'
SWATHI:SWATHI_NEW
Another way is converting the shell variable to an environment variable using export, and reading the environment variable from the ENVIRON array:
$ export NUSR
$ echo ${UL[*]}|awk -F, '{print $ENVIRON["NUSR"]}'
SWATHI:SWATHI_NEW
Update 2016: The OP has comma-separated data and wants to extract an item given its index. The index is in the shell variable NUSR. The value of NUSR is passed to awk, and awk's dollar operator extracts the item.
Note that it would be simpler to declare UL as an array of more than one element, and do the extraction in bash, and take awk out of the equation completely. This however uses 0-based indexing.
UL=(AKHIL:AKHIL_NEW SWATHI:SWATHI_NEW)
NUSR=1
echo ${UL[NUSR]} # prints SWATHI:SWATHI_NEW
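Taking that idea one step further, the whole OLD:NEW list can probably be split in bash alone. The following is only a sketch under the assumption that names contain no commas, colons or spaces; the array names pairs, old_users and new_users are invented for the example.

```shell
#!/usr/bin/env bash
# Sample input copied from the question.
input="AKHIL:AKHIL_NEW,SWATHI:SWATHI_NEW"

# Split on commas into an array of OLD:NEW pairs.
IFS=',' read -r -a pairs <<< "$input"

old_users=()
new_users=()
for pair in "${pairs[@]}"; do
    old_users+=("${pair%%:*}")   # text before the first colon
    new_users+=("${pair#*:}")    # text after the first colon
done

# Join the old names with commas, e.g. for a SCHEMAS= argument.
schemas=$(IFS=','; echo "${old_users[*]}")

echo "count:   ${#pairs[@]}"
echo "schemas: $schemas"
echo "new:     ${new_users[*]}"
```

The number of users is then ${#pairs[@]}, and the original user names needed for the final impdp command are available in schemas.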

There is another way, but it could cause immense confusion:
$ VarName="howdy" ; echo | awk '{print "Just saying '$VarName'"}'
Just saying howdy
$
So you are temporarily exiting the single quote environment (which would normally prevent the shell from interpreting '$') to interpret the variable and then going back into it. It has the virtue of being relatively brief.
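One source of that confusion is that the spliced value becomes part of the awk program text. A sketch with a made-up value containing a space shows the difference from -v:

```shell
VarName='two words'

# Quote splicing: the unquoted expansion is word-split by the shell, so awk
# receives a truncated program with an unterminated string and fails.
if echo | awk '{print "Just saying '$VarName'"}' 2>/dev/null; then
    spliced="ok"
else
    spliced="failed"
fi

# -v passes the value as data, so the space is harmless.
passed=$(echo | awk -v v="$VarName" '{print "Just saying " v}')

echo "spliced: $spliced"
echo "passed:  $passed"
```

For the same reason, a value that happens to contain awk syntax would be executed as code when spliced in, which is another argument for -v or ENVIRON.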

Not sure if I understand your question.
But let's say we have a variable number=3 and we want to use it instead of $3. In awk we can do that with the following code:
results="100 Mbits/sec 110 Mbits/sec 90 Mbits/sec"
number=3
speed=$(echo $results | awk '{print '"\$${number}"'}')
so the speed variable will get the value 110.
Hope this helps.

No. You can pass the value of a shell variable to an awk script just like you can pass the value of a shell variable to a C program but you cannot access a shell variable in an awk script any more than you could access a shell variable in a C program. Like C, awk is not shell. See question 24 in the comp.unix.shell FAQ at cfajohnson.com/shell/cus-faq-2.html#Q24.
One way to write your code would be:
UL="AKHIL:AKHIL_NEW,SWATHI:SWATHI_NEW"
NUSR=$(awk -F, -v ul="$UL" 'BEGIN{print gsub(FS,"",ul)+1; exit}') # number of separators in ul, plus one, is the number of fields
echo "$NUSR"
echo "$UL" | awk -F, -v nusr="$NUSR" '{print $nusr}' # could have just done print $NF
but since your original starting point:
UL=(AKHIL:AKHIL_NEW,SWATHI:SWATHI_NEW)
was declaring UL as an array with just one entry, you might want to rethink whatever it is you're trying to do as you may have completely the wrong approach.
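Independent of the awk question, the nested if chain in the question's loop could be written as a single case statement. This is only a sketch: tablespace_for is a hypothetical helper name, and returning an empty value for an unmatched prefix is an assumption, since the question does not say what should happen in that case.

```shell
# tablespace_for is a hypothetical helper name; the prefix-to-tablespace
# mapping is copied from the question.
tablespace_for() {
    case $1 in
        SCPO*|STSC*) echo SCPODATA ;;
        WWF*)        echo WWFDATA ;;
        CSM*)        echo CSMDATA ;;
        TMM*)        echo TMDATA ;;
        IGP*)        echo IGPDATA ;;
        *)           echo "" ;;   # assumption: no match yields an empty value
    esac
}

for user in SCPO_NEW WWF_TEST AKHIL_NEW; do
    echo "$user -> [$(tablespace_for "$user")]"
done
```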

Related

How to expand awk variables within the code?

Assuming that I passed some variables to the awk script:
$AWK -f script.awk -v var01="foo" var02="bar"
And inside the script I match some pattern:
# pattern01 var01
/pattern01/ {
if (??? == "foo") print
}
I want to expand the variable "$2" ("var01") to its given value.
I have been trying with gawk and it seems to be able to expand variables in the following way:
print $$x
But this, for some reason, doesn't work in the first example, and I also need to keep POSIX compatibility. Is it possible to expand the variable in the given example?
(Note: I specifically want this behavior (if possible), so I don't want workarounds with other tools or shell expansion.)
Equivalent in shell:
file01:
foobar
some random text
pattern01 var01
more random text...
code.sh:
#!/bin/sh
var01="Hello"
x="$(grep '^pattern01' file01 | awk '{print $2}')"
eval echo "$"$x # prints Hello
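Using POSIX awk, there is no way to look up the value of a variable by its name. Instead, consider using an array to store the values. Not the most elegant, but the array lookup itself is portable (note that the -e option used below is a GNU awk extension; with a strictly POSIX awk, put the BEGIN block in its own file and pass it with a second -f):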
$AWK -e 'BEGIN { v["var01"] = "foo" ; v["var02"] = "bar" }' -f script.awk
script.awk
# pattern01 var01
/pattern01/ {
if ( v[$2] == "foo") print
}
If you know that you will be using a recent GNU awk version, and you are OK with using extensions, you can use the SYMTAB array. From the man page:
SYMTAB An array whose indices are the names of all currently
defined global variables and arrays in the program. The array may be
used for indirect access to read or write the value of a variable:
foo = 5
SYMTAB["foo"] = 4
print foo # prints 4
$AWK -f script.awk -v var01="foo" var02="bar"
script.awk
# pattern01 var01
/pattern01/ {
if ( SYMTAB[$2] == "foo") print
}
Both approaches eliminate the need to create environment variables, which may affect other programs and may be hard to scale.
I have found one solution, by setting the variable as part of the environment and then calling the special variable "ENVIRON" with the name (as it acts as a dictionary):
# pattern01 var01
/pattern01/ {
if (ENVIRON[$2] == "foo") print
}
I think that by creating manually the dictionary at the BEGIN stage, the same behaviour could be achieved without making use of the environment.
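One way that idea might look without touching the environment is to pass all name/value pairs in a single -v string and build the dictionary in BEGIN. This is a sketch only: the variable name pairs and its "name=value name=value" format are invented here, and it assumes the values contain no spaces.

```shell
# Sample input file from the question, written to a temporary location.
tmp=$(mktemp)
printf '%s\n' 'foobar' 'some random text' 'pattern01 var01' 'more random text...' > "$tmp"

# "pairs" is a made-up variable: a space-separated list of name=value items.
# BEGIN splits it into the array v, which is then indexed by the name in $2.
result=$(awk -v pairs="var01=foo var02=bar" '
    BEGIN {
        n = split(pairs, items, " ")
        for (i = 1; i <= n; i++) {
            eq = index(items[i], "=")
            v[substr(items[i], 1, eq - 1)] = substr(items[i], eq + 1)
        }
    }
    /pattern01/ { if (v[$2] == "foo") print }
' "$tmp")

echo "$result"
rm -f "$tmp"
```

Only split, index and substr are used, so this should run on any POSIX awk.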
You can try this, which relies on bash indirect expansion (${!x}):
var01="Hello"
x="$(grep '^pattern01' file01 | awk '{print $2}')"
echo ${!x}
Hope this helps.

Why is error flagging the variable as undefined?

I'm trying to truncate a part of the file name of my present working directory. This is the code I am using:
set test = "$cwd" | awk -F "/" '{print $3}'
set USER_CUST = ${test:s/abc_//}
(Explanation: I want to cut the "abc_" part out of the third folder from the root.)
When I run the script (script_check.csh) I get this in my console:
tcsh -x script_check.csh
set test = /proj/proj_name/abc_username/folder/sub_folder
awk -F / {print $3}
test: Undefined variable.
Why is test an undefined variable? Is there another possible workaround?
The right-hand side of a variable assignment isn't a shell command, so you can't use pipes there. You can see this with:
$ set test = ls
$ echo $test
ls
Why does it give the confusing "undefined variable" error? Who knows. The (t)csh parser is quirky and full of strange things like this, especially when given bad input. It's one of the main reasons scripting in (t)csh is generally discouraged (as someone pointed out in the comments) ;-)
To make it a shell command, add backticks like so:
$ set test = `echo "$cwd" | awk -F "/" '{print $3}'`
$ echo $test
martin
You can make this a bit shorter with pwd by the way:
$ set test = `pwd | awk -F "/" '{print $3}'`

awk: setting environment variables directly from within an awk script

First post here, but I've been a lurker for ages. I have googled for ages but can't find what I want (many ambiguous topic titles that don't ask what the title suggests ...). I'm not new to awk or scripting, just a little rusty :)
I'm trying to write an awk script which will set shell env values as it runs, for another bash script to pick up and use later on. I cannot simply use stdout from awk to report the value I want set (i.e. "export whatever=awk cmd here"), as that's already directed to a 'results file' which the awk script is creating (plus I have more than one variable to export in the final code anyway).
As an example test script, to demo my issue:
echo $MYSCRIPT_RESULT # returns nothing, not currently set
echo | awk -f scriptfile.awk # do whatever, setting MYSCRIPT_RESULT as we go
echo $MYSCRIPT_RESULT # desired: returns the env value set in scriptfile.awk
Within scriptfile.awk, I have tried (without success):
1/) building and executing an adhoc string directly:
{
cmdline="export MYSCRIPT_RESULT=1"
cmdline
}
2/) using the system function:
{
cmdline="export MYSCRIPT_RESULT=1"
system(cmdline)
}
... but these do not work. I suspect that these 2 commands are creating a subshell within the shell awk is executing from, and doing what I ask (proven by touching files as a test), but once the "cmd"/system calls have completed, the subshell dies, unfortunately taking whatever I have set with it, so my env changes don't stick from the perspective of the caller of awk.
So my question is: how do you actually set env variables within awk directly, so that a calling process can access these env values after awk execution has completed? Is it actually possible?
Other than the adhoc/system ways above, which I have proven fail for me, I cannot see how this could be done (other than writing these values to a 'random' file somewhere to be picked up and read by the calling script, which IMO is a little dirty anyway), hence, help!
All ideas/suggestions/comments welcomed!
You cannot change the environment of your parent process. If
MYSCRIPT_RESULT=$(awk stuff)
is unacceptable, what you are asking cannot be done.
You can also use something like what is described at
Set variable in current shell from awk
unset var
var=99
declare $( echo "foobar" | awk '/foo/ {tmp="17"} END {print "var="tmp}' )
echo "var=$var"
var=17
The awk END clause is essential; otherwise, if there are no matches to the pattern, declare dumps the current environment to stdout and doesn't change the content of your variable.
Multiple values can be set by separating them with spaces.
declare a=1 b=2
echo -e "a=$a\nb=$b"
NOTE: declare is bash only, for other shells, use eval with the same syntax.
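If the values are plain data, a possible alternative is to let read split awk's output, which avoids executing that output as shell code. A minimal sketch with invented sample numbers; it assumes the values themselves contain no whitespace:

```shell
# Hypothetical example: awk computes two values and prints them on one line.
stats=$(printf '%s\n' 3 5 9 | awk '{ s += $1 } END { print NR, s }')

# Split that line into two shell variables without eval or declare.
read -r count total <<EOF
$stats
EOF

echo "count=$count total=$total"
```

This uses only a here-document and read, so it should also work in a plain POSIX sh.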
You can do this, but it's a bit of a kludge. Since awk does not allow redirection to a file descriptor, you can use a fifo or a regular file:
$ mkfifo fifo
$ echo MYSCRIPT_RESULT=1 | awk '{ print > "fifo" }' &
$ IFS== read var value < fifo
$ eval export $var=$value
It's not really necessary to split the var and value; you could just as easily have awk print the "export" and just eval the output directly.
I found a good answer. Encapsulate everything in a subshell!
The command declare works as below:
#Creates 3 variables
declare var1=1 var2=2 var3=3
ex1:
#Exactly the same as above
$(echo | awk 'BEGIN{var="declare "}{var=var"var1=1 var2=2 var3=3"}END{print var}')
I found some really interesting uses for this technique. In the next example I have several partitions with labels. I create variables using the labels as variable names and the device names as variable values.
ex2:
#Partition data
lsblk -o NAME,LABEL
NAME LABEL
sda
├─sda1
├─sda2
├─sda5 System
├─sda6 Data
└─sda7 Arch
#Creates a subshell to execute the text
$(
#Pipe lsblk to awk
lsblk -o NAME,LABEL | awk '
#Initiate the variable with the text for the declare command
BEGIN{txt="declare "}
#Filters devices with labels Arch or Data
#Concatenate txt with itself plus text for the variables (name and value)
#substr eliminates the special characters before the device name
/Data|Arch/{txt=txt$2"="substr($1,3)" "}
#AWK prints the text and the subshell executes it as a command
END{print txt}'
)
The end result of this is 2 variables: Data with value sda6 and Arch with value sda7.
The same example in a single line.
$(lsblk -o NAME,LABEL | awk 'BEGIN{txt="declare "}/Data|Arch/{txt=txt$2"="substr($1,3)" "}END{print txt}')

Awk unable to store the value into an array

I am using a script like below
SCRIPT
declare -a GET
i=1
awk -F "=" '$1 {d[$1]++;} {for (c in d) {GET[i]=d[c];i=i+1}}' session
echo ${GET[1]} ${GET[2]}
DESCRIPTION
The problem is the GET value printed outside is not the correct value ...
I understand your question as "how can I use the results of my awk script inside the shell where awk was called". The truth is that it isn't really trivial. You wouldn't expect to be able to use the output from a C program or python script easily inside your shell. Same with awk, which is a scripting language of its own.
There are some workarounds. For a robust solution, write your results from the awk script to a file in a suitably easy format and read them from the shell. As a hack, you could also try to read the output from awk directly into the shell using $(). Combine that with the set command and you could do:
set -- $(awk <awk script here>)
and then use $1 $2 etc. but you have to be careful with spaces in the output from awk.
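In bash, another option is to collect awk's output lines into an array. The sketch below assumes a session file of key=value lines (its real contents are not shown in the question) and sorts the counts only to make the order predictable, because awk's for (c in d) order is unspecified.

```shell
#!/usr/bin/env bash
# Stand-in for the "session" file from the question; its contents are assumed.
session=$(mktemp)
printf '%s\n' 'user=a' 'host=x' 'user=b' > "$session"

# awk prints one count per line; the shell collects the lines into GET.
GET=()
while IFS= read -r count; do
    GET+=("$count")
done < <(awk -F= '$1 { d[$1]++ } END { for (c in d) print d[c] }' "$session" | sort -n)

echo "${GET[0]} ${GET[1]}"
rm -f "$session"
```

Note that bash arrays start at index 0, so the first count is ${GET[0]}, not ${GET[1]}.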

awk split question

I wrote a small script, using awk 'split' command to get the current directory name.
echo $PWD
I need to replace '8' with the number of tokens as a result of the split operation.
// If PWD = /home/username/bin. I am trying to get "bin" into package.
package="`echo $PWD | awk '{split($0,a,"/"); print a[8] }'`"
echo $package
Can you please tell me what do I substitute in place of 'print a[8]' to get the script working for any directory path ?
-Sachin
You don't need awk for that. If you always want the last dir in a path just do:
#!/bin/sh
cur_dir="${PWD##*/}/"
echo "$cur_dir"
The above has the added benefit of not creating any subshells and/or forks to external binaries. It's all native POSIX shell syntax.
You could use print a[length(a)], but it's better to avoid splitting and use a custom field separator and $NF:
echo $PWD | awk -F/ '{print $NF}'
But in that specific case you should rather use basename:
basename "$PWD"
The other answers are better replacements to perform the function you're trying to accomplish. However, here is the specific answer to your question:
package=$(echo $PWD | awk '{n = split($0,a,"/"); print a[n] }')
echo "$package"
split() returns the number of resulting elements.
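For comparison, here is a small sketch that runs the split(), $NF and parameter-expansion approaches on the sample path from the question; the variable names are made up for the example.

```shell
path="/home/username/bin"   # sample path from the question

# split() returns the element count n, so a[n] is the last component.
via_split=$(printf '%s\n' "$path" | awk '{ n = split($0, a, "/"); print a[n] }')
# With / as the field separator, $NF is the last component.
via_nf=$(printf '%s\n' "$path" | awk -F/ '{ print $NF }')
# Pure shell: strip everything up to and including the last slash.
via_shell=${path##*/}

echo "$via_split $via_nf $via_shell"
```

All three give bin here. They differ from basename when the path ends in a slash: for /home/username/bin/ these three yield an empty string, while basename still prints bin.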