Is there a way to create an awk input inactivity timer?

Is there a way to create an awk input inactivity timer? - awk

I have a text source (a log file), which gets new lines appended to it by some third party.
I can output the additions to my source file using tail -f source. I can then pipe that through an awk script awk -f parser.awk to parse and format the output.
My question is: while tail -f source | awk -f parser.awk is running, is there a way to call function foo() inside my parser.awk script every time there is more than 5 seconds elapsed without anything coming through the pipe into the standard input of the awk script?
Edit: Currently using GNU Awk 3.1.6. May be able to upgrade to newer version if required.

If your shell's read supports -t and -u, here's an ugly hack:
{ echo hello; sleep 6; echo world; } | awk 'BEGIN{
while( "while read -t 5 -u 3 line; do echo \"$line\"; done" | getline > 0 )
print
}' 3<&0
You can replace the print in the body of the while loop with your script. However, it would probably make a lot more sense to put the read timeout between tail and awk in the pipeline, and it would make even more sense to re-implement tail to timeout.

Not exactly the answer to your question. However there is a little hack in shell that can do practically what you want:
{ tail -f log.file >&2 | { while : ; do sleep 5; echo SECRET_PHRASE ; done ; } ; } 2>&1 | awk -f script.awk
When awk receives SECRET_PHRASE it will run foo function every 5 seconds. Unfortunately is will run it every 5 second even in case there was some output during this time from tail.
ps. You can replace '{}' with '()' and vice versa. In the first case it won't create subshell, in the second one it will.
The another way is to append this secret phrase dirctly to log file in case nobody wrote there during last five seconds. But looks like it's not good idea due to you will have spoiled log file.

Related

Extracting data after a tag and CR with Busybox sed

I have a script that extracts a file from a bash script combined with a binary file. It does so using the following GNU sed syntax
sed -n '/__DATA__/{n;:1;n;p;b1}' /tmp/combined.file > /tmp/binary.file
The files are assembled by cat'ing an ISO file to the end of a bash script. Which is then sent over the network to an embedded device and extracted on the device, piping the ISO file to a temporary dir and executing the bash script to install it.
However, on executing this I get a
sed: unterminated {
Am I missing something here? Is this task possible with BusyBox sed?

It tried the "Second attempt" below with OSX/BSD awk and it failed, just printing up til the first NUL character. So you can't do this job portably with awk or sed.
Here's what should work everywhere given that the POSIX standard says
the input file to tail can be any type
so the input to tail doesn't have to be a POSIX text file (no NULs) and we're exiting from awk before the first NUL is encountered in the input so they should both be happy:
$ tail -n +"$(awk '/^__DATA__$/{print NR+2; exit}' binary.bin)" binary.bin | cat -ev
ER^H^#^#^#M-^PM-^P^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#3��M-^Nռ^#|��f1�f1�fSfQ^FWM-^N�M-^N�R�^#|�^#^F�^#^A��K^F^#^#R�A��U1�0���^Sr^VM-^A�U�u^PM-^C�^At^Kf�^F�^F�B�^U�^B1�ZQ�^H�^S[^O��#PM-^C�?Q��SRP�^#|�^D^#f��^G�D^#^OM-^BM-^#^#f#M-^#�^B��fM-^A>#|��xpu ��{�D|^#^#�M-^C^#isolinux.bin missing or corrupt.^M$
f`f1�f^C^F�{f^S^V�{fRfP^FSj^Aj^PM-^I�f�6�{��^FM-^H�M-^H�M-^R�6�{M-^H�^H�A�^A^BM-^J^V�{�^SM-^Md^Pfa��^^^#Operating system load error.^M$
^��^NM-^J>b^D�^G�^P<$
u��^X���^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#L^D^#^#^#^#^#^#�K�6^#^#M-^#^#^A^#^#?�M-^K^#^#^#^#^#`^\^#^#�������<R^#^#^#^_^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#U�EFI PART^#^#^A^#\^#^#^#]3�.^#^#^#^#^A^#^#^#^#^#^#^#�_^\^#^#^#^#^##^#^#^#^#^#^#^#�_^\^#^#^#^#^#Uc�r^Oqc#M-^Rc^F�$LZ�^L^#^#^#^#^#^#^#�^#^#^#M-^#^#^#^#�t^]F^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#$
Second attempt:
Now that I have a better idea what you're trying to do (process a file consisting of POSIX text lines up to a point and then can contain NUL characters afterwards), try this:
$ cat -ev file
echo "I: Installation finished!"$
exit 0$
$
__DATA__$
$
foo^#bar^#etc
$ cat tst.awk
/^__DATA__$/ { n=NR + 1 }
n && (NR == n) { RS="\0"; ORS="" }
n && (NR > n) { print (c++ ? RS : "") $0 }
$ awk -f tst.awk file | cat -ev
foo^#bar^#etc
The above doesn't try to store any input lines containing NUL in memory, instead it reads \n-terminated text lines until it reaches the line after the one containing __DATA__ and then switches to reading NUL-terminated records into memory and printing NULs between them on output.
It's still undefined behavior per POSIX (see my comments below) but in theory it should work since it just relies on being able to set one variable (RS) to NUL rather than trying to store input strings that contain NULs. Also, setting RS to NUL has been a (flawed) workaround for awk scripts for years to be able to read a whole file into memory at once so being able to set RS to NUL should work in any modern awk.
Using the new sample you provided with the missing blank line after the __DATA__ line added:
$ cat -ev file
#!/bin/bash$
$
echo "I: Awesome Things happened here"$
exit 0$
$
__DATA__$
$
ER^H^#^#^#M-^PM-^P^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#3M-mM-zM-^NM-UM-<^#|M-{M-|f1M-[f1M-IfSfQ^FWM-^NM-]M-^NM-ERM->^#|M-?^#^FM-9^#^AM-sM-%M-jK^F^#^#RM-4AM-;M-*U1M-I0M-vM-yM-M^Sr^VM-^AM-{UM-*u^PM-^CM-a^At^KfM-G^FM-s^FM-4BM-k^UM-k^B1M-IZQM-4^HM-M^S[^OM-6M-F#PM-^CM-a?QM-wM-aSRPM-;^#|M-9^D^#fM-!M-0^GM-hD^#^OM-^BM-^#^#f#M-^#M-G^BM-bM-rfM-^A>#|M-{M-#xpu M-zM-<M-l{M-jD|^#^#M-hM-^C^#isolinux.bin missing or corrupt.^M$
f`f1M-Rf^C^FM-x{f^S^VM-|{fRfP^FSj^Aj^PM-^IM-ffM-w6M-h{M-#M-d^FM-^HM-aM-^HM-EM-^RM-v6M-n{M-^HM-F^HM-aAM-8^A^BM-^J^VM-r{M-M^SM-^Md^PfaM-CM-h^^^#Operating system load error.^M$
^M-,M-4^NM-^J>b^DM-3^GM-M^P<$
uM-qM-M^XM-tM-kM-}^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#L^D^#^#^#^#^#^#M-/KM-66^#^#M-^#^#^A^#^#?M-`M-^K^#^#^#^#^#`^\^#^#M-~M-^?M-^?M-oM-~M-^?M-^?<R^#^#^#^_^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#UM-*EFI PART^#^#^A^#\^#^#^#]3M-%.^#^#^#^#^A^#^#^#^#^#^#^#M-^?_^\^#^#^#^#^##^#^#^#^#^#^#^#M-J_^\^#^#^#^#^#UcM-)r^Oqc#M-^Rc^FM-2$LZM-p^L^#^#^#^#^#^#^#M-P^#^#^#M-^#^#^#^#M-{t^]F^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#$
.
$ awk -f tst.awk file | cat -ev
ER^H^#^#^#M-^PM-^P^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#3M-mM-zM-^NM-UM-<^#|M-{M-|f1M-[f1M-IfSfQ^FWM-^NM-]M-^NM-ERM->^#|M-?^#^FM-9^#^AM-sM-%M-jK^F^#^#RM-4AM-;M-*U1M-I0M-vM-yM-M^Sr^VM-^AM-{UM-*u^PM-^CM-a^At^KfM-G^FM-s^FM-4BM-k^UM-k^B1M-IZQM-4^HM-M^S[^OM-6M-F#PM-^CM-a?QM-wM-aSRPM-;^#|M-9^D^#fM-!M-0^GM-hD^#^OM-^BM-^#^#f#M-^#M-G^BM-bM-rfM-^A>#|M-{M-#xpu M-zM-<M-l{M-jD|^#^#M-hM-^C^#isolinux.bin missing or corrupt.^M$
f`f1M-Rf^C^FM-x{f^S^VM-|{fRfP^FSj^Aj^PM-^IM-ffM-w6M-h{M-#M-d^FM-^HM-aM-^HM-EM-^RM-v6M-n{M-^HM-F^HM-aAM-8^A^BM-^J^VM-r{M-M^SM-^Md^PfaM-CM-h^^^#Operating system load error.^M$
^M-,M-4^NM-^J>b^DM-3^GM-M^P<$
uM-qM-M^XM-tM-kM-}^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#L^D^#^#^#^#^#^#M-/KM-66^#^#M-^#^#^A^#^#?M-`M-^K^#^#^#^#^#`^\^#^#M-~M-^?M-^?M-oM-~M-^?M-^?<R^#^#^#^_^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#UM-*EFI PART^#^#^A^#\^#^#^#]3M-%.^#^#^#^#^A^#^#^#^#^#^#^#M-^?_^\^#^#^#^#^##^#^#^#^#^#^#^#M-J_^\^#^#^#^#^#UcM-)r^Oqc#M-^Rc^FM-2$LZM-p^L^#^#^#^#^#^#^#M-P^#^#^#M-^#^#^#^#M-{t^]F^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#$
Original answer:
Assuming this question is related to your previous question, this will work using any awk in any shell on every UNIX box:
$ awk '/^__DATA__$/{n=NR+1} n && NR>n' file
3<ED>M-^PM-^PM-^PM-^PM-^
When it finds __DATA__ it sets a variable n to the line number to start printing after and then when n is set prints every line for which the line number is greater than n.
The above was run against this input file from your previous question:
$ cat -ev file
echo "I: Installation finished!"$
exit 0$
$
__DATA__$
$
3<ED>M-^PM-^PM-^PM-^PM-^$

TCL (expect): can't read "NF" (from awk): no such variable

Attempting to get the last word of the first line from a file. Not sure why the following command:
send "cat moo.txt | grep QUACK * | awk 'NF>1{print $NF}' meow.txt >> bark.txt "
is getting the error message can't read "NF": no such variable.
I can run the awk 'NF>1{print $NF}' meow.txt >> bark.txt snippet just fine on my machine. Yet, when it runs in my expect script, it gives me that error.
Anyone know why expect doesn't recognize the awk built-in variable?

I think your script is trying to expand the variable $NF with it's value before shooting that command through send. $NF isn't set in your shell since it's internal to awk, which hasn't had a chance to even run yet and so it's balking.
Try escaping that variable so it is treated as a string literal and awk will be able to use it when it comes time for awk to run:
send "cat moo.txt | grep QUACK * | awk 'NF>1{print \$NF}' meow.txt >> bark.txt "

Exit when the result is ready and do not wait for the rest job?

I want to exit immediately when the result has been provided and do not wait for the rest of the jobs. I provided three examples by different approaches, i.e. awk, head and read. I want to exit after the '1' is shown in the following example without waiting for sleep. But none of the do not work. Is there any guy to help me?
(echo 1; sleep 10; seq 10) | head -n 1
(echo 1; sleep 10; seq 10) | awk -e 'NR==1{print $1;exit}'
(echo 1; sleep 10; seq 10) | ./test.sh
where the test.sh is the following:
while read -r -d $'\n' x
do
echo "$x"
exit
done

Refactor Using Bash Process Substitution
I want to exit after the '1' is shown in the following example without waiting for sleep.
By default, Bash shell pipelines wait for each pipeline segment to complete before processing the next segment of the pipeline. This is usually the expected behavior, because otherwise your commands wouldn't be able to act on the completed output of from each pipeline element. For example, how could sort do its job in a pipeline if it doesn't have all the data available at once?
In this specific case, you can do what you want, but you have to refactor your code so that awk is reading from process substitution rather than a pipe. For example:
$ time awk -e 'NR==1 {print $1; exit}' < <(echo 1; sleep 10; seq 10)
1
real 0m0.004s
user 0m0.001s
sys 0m0.002s
From the timings, you can see that the process exits when awk does. This may not be how you want to do it, but it certainly does what you want to accomplish with a minimum of fuss. Your mileage with non-Bash shells may vary.
Asynchronous Pipelines
Asynchronous pipelines are not really a generic solution, but using one works sufficiently to accomplish your goals for the given use case. The following returns immediately:
$ { echo 1 & sleep 10 & seq 10 & } | awk -e 'NR==1 {print $1; exit}'
1
because the commands in the command list are run asynchronously. When you run commands asynchronously in Bash:
The shell does not wait for the command to finish, and the return status is 0 (true).
However, note that this only appears to do what you want. Your other commands (e.g. sleep and seq) are actually still running in the background. You can validate this with:
$ { echo 1 & sleep 10 & seq 10 & } | awk -e 'NR==1 {print $1; exit}'; pgrep sleep
1
14921
As you can see, this allows awk to process the output of echo without waiting for the entire list of commands to complete, but it doesn't really short-circuit the execution of the command list. Process substitution is still likely to be the right solution, but it's always good to know you have alternatives.

Can we use shell variables in awk?

Can we use shell variables in AWK like $VAR instead of $1, $2? For example:
UL=(AKHIL:AKHIL_NEW,SWATHI:SWATHI_NEW)
NUSR=`echo ${UL[*]}|awk -F, '{print NF}'`
echo $NUSR
echo ${UL[*]}|awk -F, '{print $NUSR}'
Actually am an oracle DBA we get lot of import requests. I'm trying to automate it using the script. The script will find out the users in the dump and prompt for the users to which dump needs to be loaded.
Suppose the dumps has two users AKHIL, SWATHI (there can be may users in the dump and i want to import more number of users). I want to import the dumps to new users AKHIL_NEW and SWATHI_NEW. So the input to be read some think like AKHIL:AKHIL_NEW,SWATHI:SWATHI_NEW.
First, I need to find the Number of users to be created, then I need to get new users i.e. AKHIL_NEW,SWATHI_NEW from the input we have given. So that I can connect to the database and create the new users and then import. I'm not copying the entire code: I just copied the code from where it accepts the input users.
UL=(AKHIL:AKHIL_NEW,SWATHI:SWATHI_NEW) ## it can be many users like USER1:USER1_NEW,USER2_USER2_NEW,USER3:USER_NEW..
NUSR=`echo ${UL[*]}|awk -F, '{print NF}'` #finding number of fields or users
y=1
while [ $y -le $NUSR ] ; do
USER=`echo ${UL[*]}|awk -F, -v NUSR=$y '{print $NUSR}' |awk -F: '{print $2}'` #getting Users to created AKHIL_NEW and SWATHI_NEW and passing to SQLPLUS
if [[ $USER = SCPO* ]]; then
TBS=SCPODATA
else
if [[ $USER = WWF* ]]; then
TBS=WWFDATA
else
if [[ $USER = STSC* ]]; then
TBS=SCPODATA
else
if [[ $USER = CSM* ]]; then
TBS=CSMDATA
else
if [[ $USER = TMM* ]]; then
TBS=TMDATA
else
if [[ $USER = IGP* ]]; then
TBS=IGPDATA
fi
fi
fi
fi
fi
fi
sqlplus -s '/ as sysdba' <<EOF # CREATING the USERS in the database
CREATE USER $USER IDENTIFIED BY $USER DEFAULT TABLESPACE $TBS TEMPORARY TABLESPACE TEMP QUOTA 0K on SYSTEM QUOTA UNLIMITED ON $TBS;
GRANT
CONNECT,
CREATE TABLE,
CREATE VIEW,
CREATE SYNONYM,
CREATE SEQUENCE,
CREATE DATABASE LINK,
RESOURCE,
SELECT_CATALOG_ROLE
to $USER;
EOF
y=`expr $y + 1`
done
impdp sysem/manager DIRECTORY=DATA_PUMP DUMPFILE=imp.dp logfile=impdp.log SCHEMAS=AKHIL,SWATHI REMPA_SCHEMA=${UL[*]}
In the last impdp command I need to get the original users in the dumps i.e AKHIL,SWATHI using the variables.

Yes, you can use the shell variables inside awk. There are a bunch of ways of doing it, but my favorite is to define a variable with the -v flag:
$ echo | awk -v my_var=4 '{print "My var is " my_var}'
My var is 4
Just pass the environment variable as a parameter to the -v flag. For example, if you have this variable:
$ VAR=3
$ echo $VAR
3
Use it this way:
$ echo | awk -v env_var="$VAR" '{print "The value of VAR is " env_var}'
The value of VAR is 3
Of course, you can give the same name, but the $ will not be necessary:
$ echo | awk -v VAR="$VAR" '{print "The value of VAR is " VAR}'
The value of VAR is 3
A note about the $ in awk: unlike bash, Perl, PHP etc., it is not part of the variable's name but instead an operator.

Awk and Gawk provide the ENVIRON associative array that holds all exported environment variables. So in your awk script you can use ENVIRON["VarName"] to get the value of VarName, provided that VarName has been exported before running awk.
Note ENVIRON is a predefined awk variable NOT a shell environment variable.
Since I don't have enough reputation to comment on the other answers I have to include them here!
The earlier answer showing $ENVIRON is incorrect - that syntax would be expanded by the shell, and probably result in expanding to nothing.
Further earlier comments about C not being able to access environment variable is wrong. Contrary to what is said above, C (and C++) can access environment variables using the getenv("VarName") function. Many other languages provide similar access (e.g., Java: System.getenv(), Python: os.environ, Haskell System.Environment, ...). Note in all cases access to environment variables is read-only, you cannot change an environment variable in a program and get that value back to the calling script.

There are two ways to pass variables to awk: one way is defining the variable in a command line argument:
$ echo ${UL[*]}|awk -F, -v NUSR=$NUSR '{print $NUSR}'
SWATHI:SWATHI_NEW
Another way is converting the shell variable to an environment variable using export, and reading the environment variable from the ENVIRON array:
$ export NUSR
$ echo ${UL[*]}|awk -F, '{print $ENVIRON["NUSR"]}'
SWATHI:SWATHI_NEW
Update 2016: The OP has comma-separated data and wants to extract an item given its index. The index is in the shell variable NUSR. The value of NUSR is passed to awk, and awk's dollar operator extracts the item.
Note that it would be simpler to declare UL as an array of more than one element, and do the extraction in bash, and take awk out of the equation completely. This however uses 0-based indexing.
UL=(AKHIL:AKHIL_NEW SWATHI:SWATHI_NEW)
NUSR=1
echo ${UL[NUSR]} # prints SWATHI:SWATHI_NEW

There is another way, but it could cause immense confusion:
$ VarName="howdy" ; echo | awk '{print "Just saying '$VarName'"}'
Just saying howdy
$
So you are temporarily exiting the single quote environment (which would normally prevent the shell from interpreting '$') to interpret the variable and then going back into it. It has the virtue of being relatively brief.

Not sure if i understand your question.
But lets say we got a variable number=3 and we want to use it istead of $3, in awk we can do that with the following code
results="100 Mbits/sec 110 Mbits/sec 90 Mbits/sec"
number=3
speed=$(echo $results | awk '{print '"\$${number}"'}')
so the speed variable will get the value 110.
Hope this helps.

No. You can pass the value of a shell variable to an awk script just like you can pass the value of a shell variable to a C program but you cannot access a shell variable in an awk script any more than you could access a shell variable in a C program. Like C, awk is not shell. See question 24 in the comp.unix.shell FAQ at cfajohnson.com/shell/cus-faq-2.html#Q24.
One way to write your code would be:
UL="AKHIL:AKHIL_NEW,SWATHI:SWATHI_NEW"
NUSR=$(awk -F, -v ul="$UL" 'BEGIN{print gsub(FS,""); exit}')
echo "$NUSR"
echo "$UL" | awk -F, -v nusr="$NUSR" '{print $nusr}' # could have just done print $NF
but since your original starting point:
UL=(AKHIL:AKHIL_NEW,SWATHI:SWATHI_NEW)
was declaring UL as an array with just one entry, you might want to rethink whatever it is you're trying to do as you may have completely the wrong approach.

Problem with awk and grep

I am using the following script to get the running process to print the id, command..
if [ "`uname`" = "SunOS" ]
then
awk_c="nawk"
ps_d="/usr/ucb/"
time_parameter=7
else
awk_c="awk"
ps_d=""
time_parameter=5
fi
main_class=RiskEngine
connection_string=db.regression
AWK_CMD='BEGIN{printf "%-15s %-6s %-8s %s\n","ID","PID","STIME","Cmd"} {printf "%-15s %-6s %-8s %s %s %s\n","MY_APP",$2,$time_parameter, main_class, connection_string, port}'
while getopts ":pnh" opt; do
case $opt in
p) AWK_CMD='{ print $2 }'
do_print_message=1;;
n) AWK_CMD='{printf "%-15s %-6s %-8s %s %s %s\n","MY_APP",$2,$time_parameter,main_class, connection_string, port}' ;;
h) print "usage : `basename ${0}` {-p} {-n} : Returns details of process running "
print " -p : Returns a list of PIDS"
print " -n : Returns process list without preceding header"
exit 1 ;
esac
done
ps auxwww | grep $main_class | grep 10348 | grep -v grep | ${awk_c} -v main_class=$merlin_main_class -v connection_string=$merlin_connection_
string -v port=10348 -v time_parameter=$time_parameter "$AWK_CMD"
# cat /etc/redhat-release
Red Hat Enterprise Linux AS release 4 (Nahant Update 6)
# uname -a
Linux deapp25v 2.6.9-67.0.4.EL #1 Fri Jan 18 04:49:54 EST 2008 x86_64 x86_64 x86_64 GNU/Linux
When I am executing the following from the script independently or inside script
# ps auxwww | grep $main_class | grep 10348 | grep -v grep | ${awk_c} -v main_class=$merlin_main_class -v connection_string=$merlin_connection_string -v port=10348 -v time_parameter=$time_parameter "$AWK_CMD"
I get two rows on Linux:
ID PID STIME Cmd
MY_APP 6217 2355352 RiskEngine 10348
MY_APP 21874 5316 RiskEngine 10348
I just have one jvm (Java command) running in the background but still I see 2 rows.
I know one of them (Duplicate with pid 21874) comes from awk command that I am executing. It includes again the main class and the port so two rows. Can you please help me to avoid the one that is duplicate row?
Can you please help me?

AWK can do all that grepping for you.
Here is a simple example of how an AWK command can be selective:
ps auxww | awk -v select="$mainclass" '$0 ~ select && /10348/ && ! (/grep/ || /awk/) && {print}'
ps can be made to selectively output fields which will help a little to reduce false positives. However pgrep may be more useful to you since all you're really using is the PID from the result.
pgrep -f "$mainclass.*10348"

I've reformatted the code as code, but you need to learn that the return key is your friend. The monstrously long pipelines should be split over multiple lines - I typically use one line per command in the pipeline. You can also write awk scripts on more than one line. This makes your code more readable.
Then you need to explain to us what you are up to.
However, it is likely that you are using 'awk' as a variant on grep and are finding that the value 10348 (possibly intended as a port number on some command line) is also in the output of ps as one of the arguments to awk (as is the 'main_class' value), so you get the extra information. You'll need to revise the awk script to eliminate (ignore) the line that contains 'awk'.
Note that you could still be bamboozled by a command running your main class on port 9999 (any value other than 10348) if it so happens that it is run by a process with PID or PPID equal to 10348. If you're going to do the job thoroughly, then the 'awk' script needs to analyze only the 'command plus options' part of the line.

You're already using the grep -v grep trick in your code, why not just update it to exclude the awk process as well with grep -v ${awk_c}?
In other words, the last line of your script would be (on one line and with the real command parameters to awk rather than blah blah blah).:
ps auxwww
| grep $main_class
| grep 10348
| grep -v grep
| grep -v ${awk_c}
| ${awk_c} -v blah blah blah
This will ensure the list of processes will not containg any with the word awk in it.
Keep in mind that it's not always a good idea to do it this way (false positives) but, since you're already taking the risk with processes containing grep, you may as well do so with those containing awk as well.

You can add this simple code in front of all your awk args:
'!/awk/ { .... original awk code .... }'
The '!/awk/' will have the effect of telling awk to ignore any line containing the string awk.
You could also remove your 'grep -v' if you extended my awk suggestion into something like:
'!/awk/ && !/grep/ { ... original awk code ... }'.

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas

Is there a way to create an awk input inactivity timer? - awk

Related

Extracting data after a tag and CR with Busybox sed

TCL (expect): can't read "NF" (from awk): no such variable

Exit when the result is ready and do not wait for the rest job?

Can we use shell variables in awk?

Problem with awk and grep

Categories

Resources