awk: setting environment variables directly from within an awk script - awk

first post here, but been a lurker for ages. i have googled for ages, but can't find what i want (many ambiguous topic subjects which don't ask what the topic suggests they do ...). not new to awk or scripting, just a little rusty :)
i'm trying to write an awk script which will set shell env values as it runs - for another bash script to pick up and use later on. i cannot simply use stdout from awk to report the value i want set (i.e. "export whatever=awk cmd here"), as that's already directed to a 'results file' which the awk script is creating (plus i have more than one variable to export in the final code anyway).
As an example test script, to demo my issue:
echo $MYSCRIPT_RESULT # returns nothing, not currently set
echo | awk -f scriptfile.awk # do whatever, setting MYSCRIPT_RESULT as we go
echo $MYSCRIPT_RESULT # desired: returns the env value set in scriptfile.awk
within scriptfile.awk, i have tried (without success)
1/) building and executing an adhoc string directly:
{
cmdline="export MYSCRIPT_RESULT=1"
cmdline
}
2/) using the system function:
{
cmdline="export MYSCRIPT_RESULT=1"
system(cmdline)
}
... but these do not work. I suspect that these 2 commands are creating a subshell within the shell awk is executing from, and doing what i ask (proven by touching files as a test), but once the "cmd"/system calls have completed, the subshell dies, unfortunately taking whatever i have set with it - so my env setting changes don't stick from "the caller of awk"'s perspective.
so my question is, how do you actually set env variables within awk directly, so that a calling process can access these env values after awk execution has completed? is it actually possible?
other than the adhoc/system ways above, which i have proven fail for me, i cannot see how this could be done (other than writing these values to a 'random' file somewhere to be picked up and read by the calling script, which imo is a little dirty anyway), hence, help!
all ideas/suggestions/comments welcomed!

You cannot change the environment of your parent process. If
MYSCRIPT_RESULT=$(awk stuff)
is unacceptable, what you are asking cannot be done.
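If the only obstacle is that awk's stdout is already used for the results file, one option is to let awk write the results file itself with print > file, which leaves stdout free for the value. A minimal sketch, assuming a hypothetical results file (created here with mktemp) and toy input:

```shell
# Hypothetical results file; the real script would use its own path.
results=$(mktemp)

# awk writes its normal output to the results file itself,
# leaving stdout free to carry the value back to the caller.
MYSCRIPT_RESULT=$(printf 'a\nb\nc\n' | awk -v out="$results" '
    { print "seen: " $0 > out }   # regular results go to the file
    END { print NR }              # only the value goes to stdout
')

echo "MYSCRIPT_RESULT=$MYSCRIPT_RESULT"
```

This still sets the variable in the calling shell, not in awk; awk only supplies the value.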

You can also use something like what is described at
Set variable in current shell from awk
unset var
var=99
declare $( echo "foobar" | awk '/foo/ {tmp="17"} END {print "var="tmp}' )
echo "var=$var"
var=17
The awk END clause is essential; otherwise, if there are no matches to the pattern, declare dumps the current environment to stdout and doesn't change the content of your variable.
Multiple values can be set by separating them with spaces.
declare a=1 b=2
echo -e "a=$a\nb=$b"
NOTE: declare is bash only, for other shells, use eval with the same syntax.
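For shells without declare, the eval variant mentioned in the note might look like the sketch below. The variable names count and last are made up for the illustration, and eval should only be fed output you control, since it executes whatever awk prints.

```shell
# Sample input; in a real script this would be the data awk already processes.
input='foo 1
bar 2
foo 3'

# awk prints shell assignments (space separated, as with declare) and eval
# executes them in the current shell. The END block guarantees that
# something is always printed, even when no line matches.
eval "$(printf '%s\n' "$input" | awk '
    /foo/ { count++; last = $2 }
    END   { printf "count=%d last=%d\n", count, last }
')"

echo "count=$count last=$last"
```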

You can do this, but it's a bit of a kludge. Since awk does not allow redirection to a file descriptor, you can use a fifo or a regular file:
$ mkfifo fifo
$ echo MYSCRIPT_RESULT=1 | awk '{ print > "fifo" }' &
$ IFS== read var value < fifo
$ eval export $var=$value
It's not really necessary to split the var and value; you could just as easily have awk print the "export" and just eval the output directly.
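The regular-file variant can be sketched as follows: awk prints export lines into a temporary file and the calling shell sources it. The file comes from mktemp and the name MYSCRIPT_FIELDS is illustrative only; neither is from the original question.

```shell
# Temporary file that carries the assignments from awk back to the shell.
envfile=$(mktemp)

# awk keeps stdout for its normal output and writes export lines to the file.
echo "some input" | awk -v envfile="$envfile" '
    {
        print "processed:", $0                        # normal output on stdout
        print "export MYSCRIPT_RESULT=1" > envfile    # values for the caller
        print "export MYSCRIPT_FIELDS=" NF > envfile
    }
'

# Source the file in the current shell so the exports take effect here.
. "$envfile"
rm -f "$envfile"

echo "MYSCRIPT_RESULT=$MYSCRIPT_RESULT MYSCRIPT_FIELDS=$MYSCRIPT_FIELDS"
```

Sourcing runs whatever is in the file, so this only makes sense when the awk script fully controls what it writes there.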

I found a good answer. Encapsulate everything in a subshell!
The command declare works as below:
#Creates 3 variables
declare var1=1 var2=2 var3=3
ex1:
#Exactly the same as above
$(awk 'BEGIN{var="declare "}{var=var"var1=1 var2=2 var3=3"}END{print var}')
I found some really interesting uses for this technique. In the next example I have several partitions with labels. I create variables using the labels as variable names and the device names as variable values.
ex2:
#Partition data
lsblk -o NAME,LABEL
NAME LABEL
sda
├─sda1
├─sda2
├─sda5 System
├─sda6 Data
└─sda7 Arch
#Creates a subshell to execute the text
$(\
#Pipe lsblk to awk
lsblk -o NAME,LABEL | awk \
#Initiate the variable with the text for the declare command
'BEGIN{txt="declare "}'\
#Filters devices with labels Arch or Data
'/Data|Arch/'\
#Concatenate txt with itself plus text for the variables(name and value)
#substr eliminates the special characters before the device name
'{txt=txt$2"="substr($1,3)" "}'\
#AWK prints the text and the subshell execute as a command
'END{print txt}'\
)
The end result of this is 2 variables: Data with value sda6 and Arch with value sda7.
The same example in a single line.
$(lsblk -o NAME,LABEL | awk 'BEGIN{txt="declare "}/Data|Arch/{txt=txt$2"="substr($1,3)" "}END{print txt}')

Related

Issue with awk, bad interpretor [duplicate]

I'm trying to make an awk file executable. I've written the script, and did chmod +x filename. Here is the code:
#!/bin/awk -v
'TOPNUM = $1
## pick1 - pick one random number out of y
## main routine
BEGIN {
## set seed
srand ()
## get a random number
select = 1 +int(rand() * TOPNUM)
# print pick
print select
}'
When I try and run the program and put in a variable for the TOPNUM:
pick1 50
I get the response:
-bash: /home/petersone/bin/pick1: /bin/awk: bad interpreter: No such file or directory
I'm sure that there's something simple that I'm messing up, but I simply cannot figure out what it is. How can I fix this?
From a command line, run this command:
which awk
This will print the path of AWK, which is likely /usr/bin/awk. Correct the first line and your script should work.
Also, your script shouldn't have the single-quote characters at the beginning and end. You can run AWK from the command line and pass in a script as a quoted string, or you can write a script in a file and use the #!/usr/bin/awk first line, with the commands just in the file.
Also, the first line of your script isn't going to work right. In AWK, setup code needs to be inside the BEGIN block, and $1 is a reference to the first word in the input line. You need to use ARGV[1] to refer to the first argument.
http://www.gnu.org/software/gawk/manual/html_node/ARGC-and-ARGV.html
As @TrueY pointed out, there should be a -f on the first line:
#!/usr/bin/awk -f
This is discussed here: Invoking a script, which has an awk shebang, with parameters (vars)
Working, tested version of the program:
#!/usr/bin/awk -f
## pick1 - pick one random number out of y
## main routine
BEGIN {
TOPNUM = ARGV[1]
## set seed
srand ()
## get a random number
select = 1 +int(rand() * TOPNUM)
# print pick
print select
}
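As noted above, the same program can also be passed to awk as a quoted string instead of living in a script file. A small sketch, using -v to supply the upper bound (the variable name topnum is chosen here for illustration):

```shell
# Pick one random number between 1 and topnum, passing the bound with -v.
topnum=50
pick=$(awk -v topnum="$topnum" 'BEGIN {
    srand()                          # seed from the time of day
    print 1 + int(rand() * topnum)   # value in the range 1..topnum
}')
echo "$pick"
```

Because srand() seeds from the clock, two runs within the same second may print the same number.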
Actually, with gawk this form is preferable:
#! /bin/awk -E
The gawk man page says:
-E Similar to -f, however, this option is the last one processed and should be used with #! scripts, particularly for CGI applications, to avoid passing in options or source code (!) on the command line from a URL. This option disables command-line variable assignments.

AWK invoking sh instead of csh

I'm sure there is a much easier way to do this, so I'm all ears.
sort -nrk 7 my_list.tsv | tail -n 1 | awk '{print("setenv INPUT_DIR `pwd`/"$1)}'
The first item in my .tsv are filenames (sorted) that I'm trying to set as an environment variable in csh. I want to add the pull the path too. I thought this would work but...
sh: setenv: command not found
Even though I'm in csh. Can I get the awk system function to use csh/tcsh?
J
'to add the pull the path too'. ??
Do you mean
echo "filename" | awk '{printf("setenv INPUT_DIR '`pwd`'/%s\n", $1)}'
output
setenv INPUT_DIR /home/shellter/filename
I've replaced your sort ... | tail .. as that doesn't seem to be your core question.
Also note that the single-quotes prevent the back-quote command-substitution from being processed. You turn off the single-quotes, get your cmd-sub value, then turn single-quotes back on again.
If this is not what you mean, please edit your question and replace sort ... tail ... as above with a simple echo "string", AND include the expected output. It will also help to include the output you are currently getting.
To answer the question in your headline: awk runs system() commands and command pipes through sh, not through your login shell or whatever $SHELL is set to (POSIX defines awk's system() in terms of the C system() function, which uses sh). That is where a message like "sh: setenv: command not found" comes from. You can still do a set | grep /bin/sh and setenv | grep /bin/sh to see where /bin/sh is referenced in your environment. Then decide how you're going to manage that.
When I run your code, I don't get any indication that the shell was executed. I get
setenv INPUT_DIR `pwd`/file
but I'm running the code under ksh. I don't have a csh to test with. But for the given case, it shouldn't matter.
IHTH
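One quick way to check which shell awk hands its commands to is to have that shell print its own name. A minimal sketch; the exact string printed depends on the awk implementation and the system, so no particular output is assumed. The file name file.tsv is a placeholder.

```shell
# Ask the shell spawned by awk's system() for its own name ($0).
# The single quotes keep the calling shell from expanding $0 itself.
awk_shell=$(awk 'BEGIN { system("echo $0") }')
echo "awk runs system() commands with: $awk_shell"

# Printing a command from awk does not run it; the text only reaches stdout,
# where the calling shell can capture or evaluate it.
cmd=$(echo "file.tsv" | awk -v dir="$PWD" '{ print "setenv INPUT_DIR " dir "/" $1 }')
echo "$cmd"
```

The second half passes the directory in with -v instead of splicing `pwd` into the awk program text, which avoids the quote juggling described above.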

Can we use shell variables in awk?

Can we use shell variables in AWK like $VAR instead of $1, $2? For example:
UL=(AKHIL:AKHIL_NEW,SWATHI:SWATHI_NEW)
NUSR=`echo ${UL[*]}|awk -F, '{print NF}'`
echo $NUSR
echo ${UL[*]}|awk -F, '{print $NUSR}'
Actually, I'm an Oracle DBA and we get a lot of import requests. I'm trying to automate them using a script. The script will find out the users in the dump and prompt for the users to which the dump needs to be loaded.
Suppose the dump has two users, AKHIL and SWATHI (there can be many users in the dump, and I may want to import more users). I want to import the dump to new users AKHIL_NEW and SWATHI_NEW. So the input to be read is something like AKHIL:AKHIL_NEW,SWATHI:SWATHI_NEW.
First, I need to find the Number of users to be created, then I need to get new users i.e. AKHIL_NEW,SWATHI_NEW from the input we have given. So that I can connect to the database and create the new users and then import. I'm not copying the entire code: I just copied the code from where it accepts the input users.
UL=(AKHIL:AKHIL_NEW,SWATHI:SWATHI_NEW) ## it can be many users like USER1:USER1_NEW,USER2:USER2_NEW,USER3:USER3_NEW..
NUSR=`echo ${UL[*]}|awk -F, '{print NF}'` #finding number of fields or users
y=1
while [ $y -le $NUSR ] ; do
USER=`echo ${UL[*]}|awk -F, -v NUSR=$y '{print $NUSR}' |awk -F: '{print $2}'` #getting Users to created AKHIL_NEW and SWATHI_NEW and passing to SQLPLUS
if [[ $USER = SCPO* ]]; then
TBS=SCPODATA
else
if [[ $USER = WWF* ]]; then
TBS=WWFDATA
else
if [[ $USER = STSC* ]]; then
TBS=SCPODATA
else
if [[ $USER = CSM* ]]; then
TBS=CSMDATA
else
if [[ $USER = TMM* ]]; then
TBS=TMDATA
else
if [[ $USER = IGP* ]]; then
TBS=IGPDATA
fi
fi
fi
fi
fi
fi
sqlplus -s '/ as sysdba' <<EOF # CREATING the USERS in the database
CREATE USER $USER IDENTIFIED BY $USER DEFAULT TABLESPACE $TBS TEMPORARY TABLESPACE TEMP QUOTA 0K on SYSTEM QUOTA UNLIMITED ON $TBS;
GRANT
CONNECT,
CREATE TABLE,
CREATE VIEW,
CREATE SYNONYM,
CREATE SEQUENCE,
CREATE DATABASE LINK,
RESOURCE,
SELECT_CATALOG_ROLE
to $USER;
EOF
y=`expr $y + 1`
done
impdp system/manager DIRECTORY=DATA_PUMP DUMPFILE=imp.dp logfile=impdp.log SCHEMAS=AKHIL,SWATHI REMAP_SCHEMA=${UL[*]}
In the last impdp command I need to get the original users in the dumps i.e AKHIL,SWATHI using the variables.
Yes, you can use the shell variables inside awk. There are a bunch of ways of doing it, but my favorite is to define a variable with the -v flag:
$ echo | awk -v my_var=4 '{print "My var is " my_var}'
My var is 4
Just pass the environment variable as a parameter to the -v flag. For example, if you have this variable:
$ VAR=3
$ echo $VAR
3
Use it this way:
$ echo | awk -v env_var="$VAR" '{print "The value of VAR is " env_var}'
The value of VAR is 3
Of course, you can give the same name, but the $ will not be necessary:
$ echo | awk -v VAR="$VAR" '{print "The value of VAR is " VAR}'
The value of VAR is 3
A note about the $ in awk: unlike bash, Perl, PHP etc., it is not part of the variable's name but instead an operator.
Awk and Gawk provide the ENVIRON associative array that holds all exported environment variables. So in your awk script you can use ENVIRON["VarName"] to get the value of VarName, provided that VarName has been exported before running awk.
Note ENVIRON is a predefined awk variable NOT a shell environment variable.
Since I don't have enough reputation to comment on the other answers I have to include them here!
The earlier answer showing $ENVIRON is incorrect - that syntax would be expanded by the shell, and probably result in expanding to nothing.
Further, the earlier comments about C not being able to access environment variables are wrong. Contrary to what is said above, C (and C++) can access environment variables using the getenv("VarName") function. Many other languages provide similar access (e.g., Java: System.getenv(), Python: os.environ, Haskell: System.Environment, ...). Note that in all cases a program only works on its own copy of the environment: you cannot change an environment variable in a program and get that value back to the calling script.
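A short sketch of both points: awk sees only exported variables through ENVIRON, and nothing awk does to ENVIRON reaches the calling shell. The variable LOCAL_ONLY is made up for the example.

```shell
# Only exported variables are visible in ENVIRON.
LOCAL_ONLY=hidden
export VarName=howdy

seen=$(awk 'BEGIN { print ENVIRON["VarName"] "|" ENVIRON["LOCAL_ONLY"] }')
echo "$seen"

# Assigning to ENVIRON inside awk only changes awk's own copy.
awk 'BEGIN { ENVIRON["VarName"] = "changed" }'
echo "$VarName"
```

The first echo shows the exported value followed by an empty field for the unexported one; the second still shows the original value.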
There are two ways to pass variables to awk: one way is defining the variable in a command line argument:
$ echo ${UL[*]}|awk -F, -v NUSR=$NUSR '{print $NUSR}'
SWATHI:SWATHI_NEW
Another way is converting the shell variable to an environment variable using export, and reading the environment variable from the ENVIRON array:
$ export NUSR
$ echo ${UL[*]}|awk -F, '{print $ENVIRON["NUSR"]}'
SWATHI:SWATHI_NEW
Update 2016: The OP has comma-separated data and wants to extract an item given its index. The index is in the shell variable NUSR. The value of NUSR is passed to awk, and awk's dollar operator extracts the item.
Note that it would be simpler to declare UL as an array of more than one element, and do the extraction in bash, and take awk out of the equation completely. This however uses 0-based indexing.
UL=(AKHIL:AKHIL_NEW SWATHI:SWATHI_NEW)
NUSR=1
echo ${UL[NUSR]} # prints SWATHI:SWATHI_NEW
There is another way, but it could cause immense confusion:
$ VarName="howdy" ; echo | awk '{print "Just saying '$VarName'"}'
Just saying howdy
$
So you are temporarily exiting the single quote environment (which would normally prevent the shell from interpreting '$') to interpret the variable and then going back into it. It has the virtue of being relatively brief.
Not sure if I understand your question.
But let's say we have a variable number=3 and we want to use it instead of $3. In awk we can do that with the following code:
results="100 Mbits/sec 110 Mbits/sec 90 Mbits/sec"
number=3
speed=$(echo $results | awk '{print '"\$${number}"'}')
so the speed variable will get the value 110.
Hope this helps.
No. You can pass the value of a shell variable to an awk script just like you can pass the value of a shell variable to a C program but you cannot access a shell variable in an awk script any more than you could access a shell variable in a C program. Like C, awk is not shell. See question 24 in the comp.unix.shell FAQ at cfajohnson.com/shell/cus-faq-2.html#Q24.
One way to write your code would be:
UL="AKHIL:AKHIL_NEW,SWATHI:SWATHI_NEW"
NUSR=$(awk -F, -v ul="$UL" 'BEGIN{print split(ul, parts, FS); exit}')
echo "$NUSR"
echo "$UL" | awk -F, -v nusr="$NUSR" '{print $nusr}' # could have just done print $NF
but since your original starting point:
UL=(AKHIL:AKHIL_NEW,SWATHI:SWATHI_NEW)
was declaring UL as an array with just one entry, you might want to rethink whatever it is you're trying to do as you may have completely the wrong approach.
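If UL is kept as a plain string as above, the old and new user names can be pulled apart with shell parameter expansion alone, without awk. A sketch under that assumption; the variable names old_users and new_users are illustrative and not from the original script:

```shell
UL="AKHIL:AKHIL_NEW,SWATHI:SWATHI_NEW"

old_users=""
new_users=""

# Split the list on commas, then split each pair on the colon.
old_ifs=$IFS
IFS=,
for pair in $UL; do
    old=${pair%%:*}     # text before the first colon
    new=${pair#*:}      # text after the first colon
    old_users="${old_users:+$old_users,}$old"
    new_users="${new_users:+$new_users,}$new"
done
IFS=$old_ifs

echo "SCHEMAS=$old_users"
echo "new users: $new_users"
```

The comma-joined old_users string has the shape the SCHEMAS= argument of the impdp command in the question expects.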

Awk unable to store the value into an array

I am using a script like below
SCRIPT
declare -a GET
i=1
awk -F "=" '$1 {d[$1]++;} {for (c in d) {GET[i]=d[c];i=i+1}}' session
echo ${GET[1]} ${GET[2]}
DESCRIPTION
The problem is the GET value printed outside is not the correct value ...
I understand your question as "how can I use the results of my awk script inside the shell where awk was called". The truth is that it isn't really trivial. You wouldn't expect to be able to use the output from a C program or python script easily inside your shell. Same with awk, which is a scripting language of its own.
There are some workarounds. For a robust solution, write your results from the awk script to a file in a suitably easy format and read them from the shell. As a hack, you could also try to read the output from awk directly into the shell using $(). Combine that with the set command and you could do:
set -- $(awk <awk script here>)
and then use $1 $2 etc. but you have to be careful with spaces in the output from awk.
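A sketch of the set approach, assuming a hypothetical session file of key=value lines. The END block makes awk print the counts once, after all input is read, and sort fixes the order because the order of awk's for (c in d) loop is unspecified.

```shell
# Hypothetical sample data standing in for the real session file.
session=$(mktemp)
printf 'user=a\nhost=x\nuser=b\nuser=c\nhost=y\n' > "$session"

# Count occurrences of each key, print one count per line, smallest first,
# and load the counts into the positional parameters.
set -- $(awk -F= '$1 { d[$1]++ } END { for (c in d) print d[c] }' "$session" | sort -n)
rm -f "$session"

first=$1
second=$2
echo "$first $second"
```

The unquoted $(...) relies on word splitting, which is why counts (plain numbers) are safe here while values containing spaces would not be.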

How to assign the output of a program to a variable in a DCL com script on VMS?

For example, I have a perl script p.pl that writes "5" to stdout. I'd like to assign that output to a variable like so:
$ x = perl p.pl ! not working code
$ ! now x would be 5
The PIPE command allows you to do Unix-ish pipelining, but DCL is not bash. Getting the output assigned to a symbol is tricky. Each PIPE segment runs in a separate subprocess (like Unix) and there's no way to return a symbol from a subprocess. AFAIK, DCL has no equivalent of bash's assigning stdout to a variable.
The typical approach is to write (redirect) the output to a file and then read it back:
$ PIPE perl p.pl > temp.txt
$ open t temp.txt
$ read t x
$ close t
Another approach is to assign the return value as a JOB logical which is shared by all subprocesses. This can be done as a one-liner using PIPE:
$ PIPE perl p.pl | DEFINE/JOB RET_VALUE @SYS$PIPE
$ x = f$logical("RET_VALUE")
Since the "RET_VALUE" is shared by all processes in the job, you have to be careful of side-effects.
Look up the PIPE command. It lets you do unix like things.
I wanted to identify a particular ACE from a file's ACL and then assign the value to a variable I could refer to later in the script. I wanted to avoid the overhead of writing to/reading from a file as I had 1000s of files to iterate over. This method worked for me.
$ PIPE DIR/SEC filename | SEARCH SYS$PIPE variable | (READ SYS$PIPE variable && DEFINE/JOB/NOLOG variable &variable)
$ SHOW LOGICAL variable