execute a system() method in awk if a condition matches - awk

I am trying to check the gcc version and, based on the result,
execute another system call (scl enable devtoolset-7 bash).
Below is what I have tried so far
gcc --version | awk '/gcc/ && ($3+0)<7.0{print "Current Version",$3,"is less than 7.5" system("scl enable devtoolset-7 bash")}'
I wish to know how to get this done correctly in awk. Also, I am open to other simpler approaches to do the above, if any.

Don't complicate things by having awk spawn a subshell to call another command if you don't have to:
if gcc --version | awk '/gcc/ && ($3+0)<7.0{print "Current Version",$3,"is less than 7.5"; f=1} END{exit !f}' >&2; then
scl enable devtoolset-7 bash
fi
Note that your test is <7.0 but message says <7.5. You might want to make it this instead to ensure consistency:
if gcc --version | awk -v minVer='7.0' '/gcc/ && ($3+0)<(minVer+0){print "Current Version",$3,"is less than",minVer; f=1} END{exit !f}' >&2; then
scl enable devtoolset-7 bash
fi
In general, though, software versions can't be compared directly as either strings or numbers, so think about how you do the comparison. For example, comparing 1.2.3 with 1.2.4 numerically truncates both to 1.2 and declares them equal, while comparing 12.3 with 5.7 as strings considers 5.7 the larger.
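One way around both pitfalls is to let a version-aware sort decide the order. The following is a minimal sketch, assuming a sort that supports the -V option (GNU coreutils; it is not required by POSIX). The function name version_lt is made up for illustration.

```shell
#!/usr/bin/env bash
# version_lt A B: succeed if version A sorts strictly before version B.
# Assumption: sort supports -V (version sort), as GNU coreutils does.
# version_lt is a hypothetical helper name, not an existing command.
version_lt() {
    [ "$1" != "$2" ] &&
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

version_lt 1.2.3 1.2.4 && echo "1.2.3 < 1.2.4"
version_lt 5.7 12.3 && echo "5.7 < 12.3"
version_lt 7.0 7.0 || echo "7.0 is not less than 7.0"
```

With a helper like this, the version string extracted from gcc --version could be passed as the first argument and the minimum version as the second.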

Related

Process part of a line through the shell pipe

I would like to process part of each line of command output, leaving the rest untouched.
Problem
Let's say I have some du output:
❯ du -xhd 0 /usr/lib/gr*
3.2M /usr/lib/GraphicsMagick-1.3.40
584K /usr/lib/grantlee
12K /usr/lib/graphene-1.0
4.2M /usr/lib/graphviz
4.0K /usr/lib/grcrt1.o
224K /usr/lib/groff
Now I want to process each path with another command, for example running pacman -Qo on it, leaving the remainder of the line untouched.
Approach
I know I can use awk {print $2} to get only the path, and could probably use it in a convoluted for loop to weld it back together, but maybe there is an elegant way, ideally easy to type on the fly, producing this in the end:
3.2M /usr/lib/GraphicsMagick-1.3.40/ is owned by graphicsmagick 1.3.40-2
584K /usr/lib/grantlee/ is owned by grantlee 5.3.1-1
12K /usr/lib/graphene-1.0/ is owned by graphene 1.10.8-1
4.2M /usr/lib/graphviz/ is owned by graphviz 7.1.0-1
4.0K /usr/lib/grcrt1.o is owned by glibc 2.36-7
224K /usr/lib/groff/ is owned by groff 1.22.4-7
Workaround
This is the convoluted contraption I am living with for now:
❯ du -xhd 0 /usr/lib/gr* | while read line; do echo "$line $(pacman -Qqo $(echo $line | awk '{print $2}') | paste -s -d',')"; done | column -t
3.2M /usr/lib/GraphicsMagick-1.3.40 graphicsmagick
584K /usr/lib/grantlee grantlee,grantleetheme
12K /usr/lib/graphene-1.0 graphene
4.2M /usr/lib/graphviz graphviz
4.0K /usr/lib/grcrt1.o glibc
224K /usr/lib/groff groff
But multiple parts of it are pacman-specific.
du -xhd 0 /usr/lib/gr* | while read line; do echo "$line" | awk -n '{ORS=" "; print $1}'; pacman --color=always -Qo $(echo $line | awk '{print $2}') | head -1; done | column -t
3.2M /usr/lib/GraphicsMagick-1.3.40/ is owned by graphicsmagick 1.3.40-2
584K /usr/lib/grantlee/ is owned by grantlee 5.3.1-1
12K /usr/lib/graphene-1.0/ is owned by graphene 1.10.8-1
4.2M /usr/lib/graphviz/ is owned by graphviz 7.1.0-1
4.0K /usr/lib/grcrt1.o is owned by glibc 2.36-7
224K /usr/lib/groff/ is owned by groff 1.22.4-7
This is a more generic solution, but what if there are three columns of output and I want to process only the middle one?
It grows in complexity, and I thought there must be a simpler way avoiding duplication.
Use a bash loop
(
IFS=$'\t'
while read -r -a fields; do
fields[1]=$(pacman -Qo "${fields[1]}")
printf '%s\n' "${fields[*]}"
done
)
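The loop above can be generalised so that the column number and the command are parameters, which also covers the "three columns, process the middle one" case. This is only a sketch: map_field is a hypothetical helper name, basename stands in for pacman -Qo so the example runs anywhere, and because tab is treated as IFS whitespace, empty fields would be collapsed.

```shell
#!/usr/bin/env bash
# map_field N CMD...: replace the Nth tab-separated field (1-based) of each
# stdin line with the output of CMD run on that field.
# map_field is a hypothetical helper name, not an existing tool.
map_field() {
    local n=$1; shift
    local -a fields
    while IFS=$'\t' read -r -a fields; do
        # </dev/null keeps the command from consuming the loop input.
        fields[n-1]=$("$@" "${fields[n-1]}" </dev/null)
        # Join the fields with tabs again; IFS is changed only in the subshell.
        (IFS=$'\t'; printf '%s\n' "${fields[*]}")
    done
}

# Stand-in for pacman -Qo: basename strips the directory part of the path.
printf '3.2M\t/usr/lib/graphviz\tx\n12K\t/usr/lib/groff\ty\n' |
    map_field 2 basename
```

On an Arch system the last line would presumably become something like du -xhd 0 /usr/lib/gr* | map_field 2 pacman -Qo.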
Use a simple shell loop.
du -xhd 0 /usr/lib/gr* |
while read -r size package; do
pacman --color=always -Qo "$package" |
awk -v sz="$size" '{
printf "%s %s\n", sz, $0 }'
done
If you want to split out parts of the output from pacman, Awk makes that easy to do; for example, judging by the output shown in the question, the package name should be in Awk's $(NF-1) and the version in $NF.
(Sorry, don't have pacman here; perhaps edit your question to show its output if you need more details. Going forward, please take care to ask the actual question you need help with, so you don't have to move the goalposts by editing after you have received replies - this is problematic for many reasons, not least of which because the answers you already received will seem wrong or unintelligible if they no longer answer the question as it stands after your edit.)
These days, many tools have options to let you specify which fields exactly you want to output, and a formatting option to produce them in machine-readable format. The pacman man page mentions a --machinereadable option, though it does not seem to be of particular use here. Many modern tools will produce JSON, which can be unwieldy to handle in shell scripts, but easy if you have a tool like jq which understands JSON format (less convenient if the only available output format is XML; some tools will let you get the result as CSV, which is mildly clumsy but relatively easy to parse). Maybe also look for an option like --format for specifying how exactly to arrange the output. (In curl it's called -w/--write-out.)

Using awk to set first character to lowercase UNIX

I've written the following bash script that utilizes awk, the aim is to set the first character to lower case. The script works mostly fine, however I'm adding an extra space when I concat the two values. Any ideas how to remove this errant space?
Script:
#!/bin/bash
foo="MyCamelCaseValue"
awk '{s=tolower(substr($1,1,1))}{g=substr($1,2,length($1))}{print s,g}' <<<$foo
Output:
m yCamelCaseValue
edit:
Please see the discussion between Bobdylan and RavinderSingh13 on the accepted answer, as it highlights issues with the default macOS bash version.
bash --version
GNU bash, version 3.2.57(1)-release (x86_64-apple-darwin19)
Copyright (C) 2007 Free Software Foundation, Inc.
You were close. The extra space comes from the comma in print s,g, which makes awk insert the output field separator (a space by default) between the two values. You also don't need the length of the line: if you give substr only a start position and no length, it returns everything from that position to the end of the string. Try the following.
(Problems like this can usually be solved in bash itself, but the bash solution provided by @bob dylan did not work for the OP because of an old version of bash, so I am undeleting this answer, which works for the OP.)
echo "$foo" | awk '{print tolower(substr($0,1,1)) substr($0,2)}'
Explanation:
Use awk's substr function to get substrings of the current line.
Grab the first letter with substr($0,1,1) and wrap it in tolower to make it lowercase.
Then print the rest of the line (the first character has already been handled by the previous substr) with substr($0,2), which returns everything from the 2nd character to the end of the line.
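To make the difference concrete, here is a small sketch that runs both forms side by side: the comma form reproduces the stray space from the question, and the concatenating form does not. The function name lcfirst is invented for this example.

```shell
#!/bin/sh
foo="MyCamelCaseValue"

# A comma in print inserts OFS (a single space by default) between the values.
printf '%s\n' "$foo" | awk '{print tolower(substr($0,1,1)), substr($0,2)}'

# Juxtaposition concatenates, so no separator is added.
printf '%s\n' "$foo" | awk '{print tolower(substr($0,1,1)) substr($0,2)}'

# lcfirst is a hypothetical helper name wrapping the concatenating form.
lcfirst() {
    printf '%s\n' "$1" | awk '{print tolower(substr($0,1,1)) substr($0,2)}'
}
lcfirst "$foo"
```

Because it relies only on awk, a wrapper like this does not depend on the bash version.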
EDIT by @bob dylan:
https://www.shell-tips.com/mac/upgrade-bash/
macOS comes with an older version of bash. However, if you're on 4+ you should be able to use bash's native parameter expansion to convert the first character from upper to lower case:
$ bash --version
GNU bash, version 4.4.19(1)-release (x86_64-redhat-linux-gnu)
Copyright (C) 2016 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software; you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
$ cat foo.sh
#!/bin/bash
foo="MyCamelCaseValue"
echo "${foo,}"
$ bash foo.sh
myCamelCaseValue
Further examples for the whole string, to lower, to upper etc.
$ echo $foo
myCamelCaseValue
$ echo "${foo,}"
myCamelCaseValue
$ echo "${foo,,}"
mycamelcasevalue
$ echo "${foo^}"
MyCamelCaseValue
$ echo "${foo^^}"
MYCAMELCASEVALUE

Awk Greater Than Less Than

I am using this command
num1=2.2
num2=4.5
result=$(awk 'BEGIN{print ($num2>$num1)?1:0}')
This always returns 0, whether num2>num1 or num1>num2.
But when I put the actual numbers as such
result=$(awk 'BEGIN{print (4.5>2.2)?1:0}')
I would get a return value of 1. Which is correct.
What can I do to make this work?
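It fails with shell variables because the awk script is enclosed in single quotes, so bash does not expand $num1 and $num2 and awk receives them literally. Inside awk, num1 and num2 are then uninitialized awk variables and $ is the field operator, so both sides evaluate to $0, which is empty in a BEGIN block, and the comparison is always false. To pass values from bash to awk, use the -v option as follows: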
num1=2.2
num2=4.5
result=$(awk -v n1=$num1 -v n2=$num2 'BEGIN{print (n2>n1)?1:0}')
Note that program variables used inside the awk script must not be prefixed with $.
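If the result is only needed for a shell condition, awk's exit status can carry it directly instead of printed output. A minimal sketch, where num_gt is a hypothetical helper name and the +0 is added to force a numeric comparison:

```shell
#!/bin/sh
# num_gt A B: succeed if A is numerically greater than B.
# num_gt is a hypothetical helper name. The +0 forces a numeric comparison
# even if awk would otherwise treat a value as a string.
num_gt() {
    awk -v a="$1" -v b="$2" 'BEGIN { exit !((a + 0) > (b + 0)) }'
}

num1=2.2
num2=4.5
if num_gt "$num2" "$num1"; then result=1; else result=0; fi
echo "$result"
```

Quoting "$1" and "$2" in the -v assignments keeps values containing spaces or glob characters from being split or expanded by the shell.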
Try doing this :
result=$(awk -v num1=2.2 -v num2=4.5 'BEGIN{print (num2 > num1) ? 1 : 0}')
See :
man awk | less +/'^ *-v'
Because $num1 and $num2 are not expanded by bash -- you are using single quotes. The following will work, though:
result=$(awk "BEGIN{print ($num2>$num1)?1:0}")
Note, however, as pointed out in the comments that this is poor coding style and mixing bash and awk. Personally, I don't mind such constructs; but in general, especially for complex things and if you don't remember what things will get evaluated by bash when in double quotes, turn to the other answers to this question.
See the excellent example from @EdMorton below in the comments.
EDIT: Actually, instead of awk, I would use bc:
num1=2.2
num2=4.5
result=$( echo "$num2 > $num1" | bc )
Why? Because it is just a bit clearer... and lighter.
Or with Perl (because it is shorter, because I like Perl more than awk, and because I like backticks more than $()):
result=`perl -e "print( ($num2 > $num1) ? 1 : 0 )"`
Or, to be fancy (and probably inefficient):
if [ `echo -e "$num1\n$num2" | sort -n | head -1` != "$num1" ] ; then result=0 ; else result=1 ; fi
(Yes, I know)
I had a brief, intensive, 3-year long exposure to awk, in prehistoric times. Nowadays bash is everywhere and can do loads of stuff (I had sh/csh only at that time) so often it can be used instead of awk, while computers are fast enough for Perl to be used in ad hoc command lines instead of awk. Just sayin'.
This might work for you:
result=$(awk 'BEGIN{print ('$num2'>'$num1')?1:0}')
Think of the ''s as like poking holes through the awk command to the underlying bash shell.

Implementing `make check` or `make test`

How can I implement a simple regression test framework with Make? (I’m using GNU Make, if that matters.)
My current makefile looks something like this (edited for simplicity):
OBJS = jscheme.o utility.o model.o read.o eval.o print.o
%.o : %.c jscheme.h
	gcc -c -o $@ $<
jscheme : $(OBJS)
	gcc -o $@ $(OBJS)
.PHONY : clean
clean :
	-rm -f jscheme $(OBJS)
I’d like to have a set of regression tests, e.g., expr.in testing a “good” expression & unrecognized.in testing a “bad” one, with expr.cmp & unrecognized.cmp being the expected output for each. Manual testing would look like this:
$ jscheme < expr.in > expr.out 2>&1
$ jscheme < unrecognized.in > unrecognized.out 2>&1
$ diff -q expr.out expr.cmp # identical
$ diff -q unrecognized.out unrecognized.cmp
Files unrecognized.out and unrecognized.cmp differ
I thought to add a set of rules to the makefile looking something like this:
TESTS = expr.test unrecognized.test
.PHONY : test $(TESTS)
test : $(TESTS)
%.test : jscheme %.in %.cmp
	jscheme < [something.in] > [something.out] 2>&1
	diff -q [something.out] [something.cmp]
My questions:
• What do I put in the [something] placeholders?
• Is there a way to replace the message from diff with a message saying, “Test expr failed”?
Your original approach, as stated in the question, is best. Each of your tests is in the form of a pair of expected inputs and outputs. Make is quite capable of iterating through these and running the tests; there is no need to use a shell for loop. In fact, by doing this you are losing the opportunity to run your tests in parallel, and are creating extra work for yourself in order to clean up temp files (which are not needed).
Here's a solution (using bc as an example):
SHELL := /bin/bash
all-tests := $(addsuffix .test, $(basename $(wildcard *.test-in)))
.PHONY : test all %.test
BC := /usr/bin/bc
test : $(all-tests)
%.test : %.test-in %.test-cmp $(BC)
	@$(BC) <$< 2>&1 | diff -q $(word 2, $?) - >/dev/null || \
	(echo "Test $@ failed" && exit 1)
all : test
	@echo "Success, all tests passed."
The solution directly addresses your original questions:
The placeholders you're looking for are $< and $(word 2, $?), corresponding to the prerequisites %.test-in and %.test-cmp respectively. Contrary to the @reinierpost comment, temp files are not needed.
The diff message is hidden and replaced using echo.
The makefile should be invoked with make -k to run all the tests regardless of whether an individual test fails or succeeds.
With make -k all, the recipe for all runs only if every test succeeds.
We avoid enumerating each test manually when defining the all-tests variable by leveraging the file naming convention (*.test-in) and the GNU make functions for file names. As a bonus this means the solution scales to tens of thousands of tests out of the box, as the length of variables is unlimited in GNU make. This is better than the shell based solution which will fall over once you hit the operating system command line limit.
Make a test runner script that takes a test name and infers the input filename, output filename and sample data from that:
#!/bin/sh
set -e
jscheme < $1.in > $1.out 2>&1
diff -q $1.out $1.cmp
Then, in your Makefile:
TESTS := expr unrecognized
.PHONY: test
test:
	for test in $(TESTS); do bash test-runner.sh $$test || exit 1; done
You could also try implementing something like automake's simple test framework.
What I ended up with looks like this:
TESTS = whitespace list boolean character \
        literal fixnum string symbol quote
.PHONY: clean test
test: $(JSCHEME)
	for t in $(TESTS); do \
		$(JSCHEME) < test/$$t.ss > test/$$t.out 2>&1; \
		diff test/$$t.out test/$$t.cmp > /dev/null || \
		echo Test $$t failed >&2; \
	done
It’s based on Jack Kelly’s idea, with Jonathan Leffler’s tip included.
I'll address just your question about diff. You can do:
diff file1 file2 > /dev/null || echo Test blah blah failed >&2
although you might want to use cmp instead of diff.
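A minimal sketch of the cmp variant: cmp -s prints nothing and reports only through its exit status, so no redirection to /dev/null is needed. The helper name check_output and the throwaway files are invented for the demonstration, and mktemp -d is assumed to be available.

```shell
#!/bin/sh
# check_output NAME ACTUAL EXPECTED: compare two files byte for byte and
# report a failure on stderr. check_output is a hypothetical helper name.
check_output() {
    if cmp -s "$2" "$3"; then
        return 0
    fi
    echo "Test $1 failed" >&2
    return 1
}

# Self-contained demonstration with throwaway files (mktemp -d is assumed).
tmp=$(mktemp -d)
printf '42\n' > "$tmp/expr.out"
printf '42\n' > "$tmp/expr.cmp"
printf '41\n' > "$tmp/bad.cmp"
check_output expr "$tmp/expr.out" "$tmp/expr.cmp" && echo "expr ok"
check_output bad "$tmp/expr.out" "$tmp/bad.cmp" || echo "bad reported as failing"
rm -rf "$tmp"
```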
On another note, you might find it helpful to go ahead and take
the plunge and use automake. Your Makefile.am (in its entirety)
will look like:
bin_PROGRAMS = jscheme
jscheme_SOURCES = jscheme.c utility.c model.c read.c eval.c print.c jscheme.h
TESTS = test-script
and you will get a whole lot of really nice targets for free, including a pretty full-featured test framework.

Problem with awk and grep

I am using the following script to print the ID and command of a running process:
if [ "`uname`" = "SunOS" ]
then
awk_c="nawk"
ps_d="/usr/ucb/"
time_parameter=7
else
awk_c="awk"
ps_d=""
time_parameter=5
fi
main_class=RiskEngine
connection_string=db.regression
AWK_CMD='BEGIN{printf "%-15s %-6s %-8s %s\n","ID","PID","STIME","Cmd"} {printf "%-15s %-6s %-8s %s %s %s\n","MY_APP",$2,$time_parameter, main_class, connection_string, port}'
while getopts ":pnh" opt; do
case $opt in
p) AWK_CMD='{ print $2 }'
do_print_message=1;;
n) AWK_CMD='{printf "%-15s %-6s %-8s %s %s %s\n","MY_APP",$2,$time_parameter,main_class, connection_string, port}' ;;
h) print "usage : `basename ${0}` {-p} {-n} : Returns details of process running "
print " -p : Returns a list of PIDS"
print " -n : Returns process list without preceding header"
exit 1 ;
esac
done
ps auxwww | grep $main_class | grep 10348 | grep -v grep | ${awk_c} -v main_class=$merlin_main_class -v connection_string=$merlin_connection_string -v port=10348 -v time_parameter=$time_parameter "$AWK_CMD"
# cat /etc/redhat-release
Red Hat Enterprise Linux AS release 4 (Nahant Update 6)
# uname -a
Linux deapp25v 2.6.9-67.0.4.EL #1 Fri Jan 18 04:49:54 EST 2008 x86_64 x86_64 x86_64 GNU/Linux
When I am executing the following from the script independently or inside script
# ps auxwww | grep $main_class | grep 10348 | grep -v grep | ${awk_c} -v main_class=$merlin_main_class -v connection_string=$merlin_connection_string -v port=10348 -v time_parameter=$time_parameter "$AWK_CMD"
I get two rows on Linux:
ID PID STIME Cmd
MY_APP 6217 2355352 RiskEngine 10348
MY_APP 21874 5316 RiskEngine 10348
I just have one jvm (Java command) running in the background but still I see 2 rows.
I know one of them (the duplicate with PID 21874) comes from the awk command that I am executing: its command line also contains the main class and the port, hence the two rows. Can you please help me avoid the duplicate row?
AWK can do all that grepping for you.
Here is a simple example of how an AWK command can be selective:
ps auxww | awk -v select="$main_class" '$0 ~ select && /10348/ && ! (/grep/ || /awk/) {print}'
ps can be made to selectively output fields which will help a little to reduce false positives. However pgrep may be more useful to you since all you're really using is the PID from the result.
pgrep -f "$main_class.*10348"
I've reformatted the code as code, but you need to learn that the return key is your friend. The monstrously long pipelines should be split over multiple lines - I typically use one line per command in the pipeline. You can also write awk scripts on more than one line. This makes your code more readable.
Then you need to explain to us what you are up to.
However, it is likely that you are using 'awk' as a variant on grep and are finding that the value 10348 (possibly intended as a port number on some command line) is also in the output of ps as one of the arguments to awk (as is the 'main_class' value), so you get the extra information. You'll need to revise the awk script to eliminate (ignore) the line that contains 'awk'.
Note that you could still be bamboozled by a command running your main class on port 9999 (any value other than 10348) if it so happens that it is run by a process with PID or PPID equal to 10348. If you're going to do the job thoroughly, then the 'awk' script needs to analyze only the 'command plus options' part of the line.
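A rough sketch of that more thorough approach follows. It assumes the ps auxwww layout in which the command starts at field 11, and it matches the port only as a whole number inside the command part, so a PID or PPID of 10348 no longer counts. The function name filter_cmd and the sample ps lines are made up for illustration.

```shell
#!/bin/sh
# filter_cmd CLASS PORT: read ps-aux-style lines on stdin and print the PID
# of lines whose command part (field 11 onward) mentions both CLASS and PORT,
# skipping awk and grep processes. filter_cmd is a hypothetical helper name.
# Assumption: the command starts at field 11, as in typical ps aux output.
filter_cmd() {
    awk -v cls="$1" -v port="$2" '
        {
            # Rebuild the command part from field 11 to the end of the line.
            cmd = ""
            for (i = 11; i <= NF; i++) cmd = cmd " " $i
        }
        # Ignore the filtering tools themselves (awk, nawk, gawk, grep).
        $11 ~ /(^|\/)([a-z]*awk|grep)$/ { next }
        # Require the class name and the port as a standalone number.
        index(cmd, cls) && cmd ~ ("(^|[^0-9])" port "([^0-9]|$)") { print $2 }
    '
}

# Invented sample input; on a real system this would be: ps auxwww | filter_cmd ...
filter_cmd RiskEngine 10348 <<'EOF'
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
app 6217 0.5 2.1 2355352 90000 ? Sl 10:00 0:42 java RiskEngine db.regression 10348
app 10348 0.1 1.0 2000000 50000 ? Sl 10:01 0:10 java RiskEngine db.regression 9999
app 21874 0.0 0.0 5316 800 pts/0 S+ 10:05 0:00 awk -v main_class=RiskEngine -v port=10348 {print}
EOF
```

Only the first java line should be reported: the second runs on a different port even though its PID is 10348, and the third is the awk process itself.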
You're already using the grep -v grep trick in your code, why not just update it to exclude the awk process as well with grep -v ${awk_c}?
In other words, the last line of your script would be (on one line, and with the real command parameters to awk rather than blah blah blah):
ps auxwww
| grep $main_class
| grep 10348
| grep -v grep
| grep -v ${awk_c}
| ${awk_c} -v blah blah blah
This ensures the list of processes will not contain any with the word awk in them.
Keep in mind that it's not always a good idea to do it this way (false positives) but, since you're already taking the risk with processes containing grep, you may as well do so with those containing awk as well.
You can add this simple code in front of all your awk args:
'!/awk/ { .... original awk code .... }'
The '!/awk/' will have the effect of telling awk to ignore any line containing the string awk.
You could also remove your 'grep -v' if you extended my awk suggestion into something like:
'!/awk/ && !/grep/ { ... original awk code ... }'.