How to output just the user of a running process - awk

When I do ps aux | grep mongod, I get
mongod 53830 0.1 0.3 247276 27168 ? Sl Apr04 128:21 /var/lib/mongodb-mms-automation/mongodb-mms-monitoring-agent-5.4.4.366-1.rhel7_x86_64/mongodb-mms-monitoring-agent
mongod 104378 0.6 0.8 469384 71920 ? Ssl Mar22 571:03 /opt/mongodb-mms-automation/bin/mongodb-mms-automation-agent -f /etc/mongodb-mms/automation-agent.config -pidfilepath /var/run/mongodb-mms-automation/mongodb-mms-automation-agent.pid >> /var/log/mongodb-mms-automation/automation-agent-fatal.log 2>&1
mongod 104471 0.6 5.4 1993296 433624 ? Sl Mar22 578:03 /var/lib/mongodb-mms-automation/mongodb-linux-x86_64-3.4.13/bin/mongod -f /data/mdiag/data/automation-mongod.conf
However, I'm only interested in outputting user of the third entry, which runs the actual mongod process. I'd like the output to be just
mongod
How would I tweak around ps, grep, and awk to do this?

Pipe it to awk and search:
ps aux | grep mongod | awk '/bin\/mongod /{print $1}'
With that you can drop the grep and let awk do the searching:
ps aux | awk '/bin\/mongod /{print $1}'
This searches for the string "bin/mongod " (with the trailing space) anywhere in the record and prints the first field of that record, which is the USER column in ps aux output.

Matching on ps output to find that process is fragile and likely to break. Can you start mongod with the --pidfilepath option?
/var/lib/mongodb-mms-automation/mongodb-linux-x86_64-3.4.13/bin/mongod -f /data/mdiag/data/automation-mongod.conf --pidfilepath /run/mongodb-pid.txt
Then you can run ps $(cat /run/mongodb-pid.txt) to get only the specific process you want, or ps -o user= -p "$(cat /run/mongodb-pid.txt)" to print just its user.

It is probably best to specify the pattern to match the end of the first part of the COMMAND field we are interested in. Also, using bracket expressions in place of \/ makes at least my eyes happier when looking at patterns for matching file paths.
ps aux | awk -v command=11 -v user=1 '$command ~ /[/]bin[/]mongod$/ { print $user }'
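To see the effect of anchoring the match on the command field, here is a small sketch that runs an equivalent awk program on canned input instead of live ps output. The sample lines are abbreviated from the question (the first path and the long agent command line are trimmed), the slashes are escaped rather than bracketed for portability across awk variants, and the names ps_sample and mongod_user are only for illustration.

```shell
# Canned stand-in for `ps aux` output, abbreviated from the question.
ps_sample='mongod 53830 0.1 0.3 247276 27168 ? Sl Apr04 128:21 /var/lib/mongodb-mms-automation/mongodb-mms-monitoring-agent
mongod 104378 0.6 0.8 469384 71920 ? Ssl Mar22 571:03 /opt/mongodb-mms-automation/bin/mongodb-mms-automation-agent -f /etc/mongodb-mms/automation-agent.config
mongod 104471 0.6 5.4 1993296 433624 ? Sl Mar22 578:03 /var/lib/mongodb-mms-automation/mongodb-linux-x86_64-3.4.13/bin/mongod -f /data/mdiag/data/automation-mongod.conf'

# Field 1 is USER and field 11 is the first word of COMMAND in `ps aux` output.
# Only the third line has a command path that ends in /bin/mongod.
mongod_user=$(printf '%s\n' "$ps_sample" |
  awk -v command=11 -v user=1 '$command ~ /\/bin\/mongod$/ { print $user }')

printf '%s\n' "$mongod_user"
```

The second sample line contains /bin/mongod inside a longer executable name, so it is the trailing $ anchor that keeps the automation agent out of the result.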

Perhaps filter out the agents?
$ ps aux | awk '/mongod/ && !/agent/{print $1}'

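The following awk may help here:
ps aux | awk '/mongod/ && /automation-mongod.conf/{print $1}'
OR
ps aux | awk '/mongod/ && /\/data\/mdiag\/data\/automation-mongod.conf/{print $1}'
Note that the first form can also match the awk process itself, because its own command line contains both strings. The second form avoids this, since the escaped slashes in awk's command line do not match the pattern.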

Related

Output of awk in color

I am trying to set up polybar on my newly-installed Arch system. I know very little bash scripting. I am just getting started. If this is not an appropriate question for this forum, I will gladly delete it. I want to get the following awk output in color:
sensors | grep "Package id 0:" | tr -d '+' | awk '{print $4}'
I know how to do this with echo, so I tried to pass the output so that with the echo command, it would be rendered in color:
sensors | grep "Package id 0:" | tr -d '+' | awk '{print $4}' | echo -e "\e[1;32m ???? \033[0m"
where I want to put the appropriate information where the ??? are.
The awk output is just a temperature, something like this: 50.0°C.
edit: It turns out that there is a very easy way to pass colors to outputs of bash scripts (even python scripts too) in polybar. But I am still stumped as to why the solutions suggested here in the answers work in the terminal but not in the polybar modules. I have several custom modules that use scripts with no problems.
Using awk:
$ sensors | awk '/Package id 0:/{gsub(/\+/,""); print "\033[32m"$4"\033[0m"}'
If that does not work, you can try this approach:
$ sensors | awk -v color="$(tput setaf 2)" -v end="$(tput sgr0)" '/Package id 0:/{gsub(/\+/,""); print color $4 end}'
A pipe into echo does not work because echo ignores its standard input. This is where you want command substitution to capture the output of awk. Since awk can do what grep and tr do, I've integrated the pipeline into one awk invocation:
temp=$(sensors | awk '/Package id 0:/ {gsub(/\+/, ""); print $4}')
echo -e "\e[1;32m $temp \033[0m"
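A self-contained sketch of the same capture-then-colorize idea, using a canned line in place of real sensors output so it can be run anywhere. The sample line and the variable names sensors_sample and colored are assumptions for illustration, and printf is swapped in for echo -e because it handles the escape sequences more portably.

```shell
# Canned stand-in for one line of `sensors` output (an assumption for this sketch).
sensors_sample='Package id 0:  +50.0°C  (high = +100.0°C, crit = +100.0°C)'

# Command substitution captures what awk prints into a shell variable.
temp=$(printf '%s\n' "$sensors_sample" |
  awk '/Package id 0:/ { gsub(/\+/, ""); print $4 }')

# printf interprets \033 in its format string; echo -e is not available in every shell.
colored=$(printf '\033[1;32m%s\033[0m' "$temp")
printf '%s\n' "$colored"
```

Passing the temperature as a %s argument rather than embedding it in the format string keeps any stray % or backslash in the data from being interpreted.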

Caret regexp produces no output in mawk

I am trying to print all files in /usr/bin/ where the filename starts with a v. This works,
ls -lA /usr/bin/ | awk '{print $9}' | grep ^v
Surprisingly, this returns no output,
ls -lA /usr/bin/ | awk '/^v/ {print $9}'
I don't understand the difference. I am running Ubuntu 21.10 with awk -W version saying that it is on 1.3.4 20200120.
Edit: I understand that awk may not be the best way to accomplish what I am wanting to do here. But, this is an exercise in learning awk by testing my understanding via comparing it to the real output.
The difference between the two pipelines is that the first prints the 9th column and then checks whether that starts with a v, while the second checks whether the whole line starts with a v. Change the second to:
$ ls -lA /usr/bin/ | awk '$9 ~ /^v/ {print $9}'
When writing:
/pattern/ { ... }
it's the same as writing
$0 ~ /pattern/ { ... }
but in your case you want to compare the 9th column, so write that instead.
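The difference can be seen on canned input. This is a sketch only: the three lines in ls_sample are invented to look like ls -lA output, and the variable names are hypothetical.

```shell
# Invented lines in the usual `ls -l` layout; field 9 is the file name.
ls_sample='-rwxr-xr-x 1 root root 1234 Jan  1 12:00 awk
-rwxr-xr-x 1 root root 5678 Jan  1 12:00 vi
lrwxrwxrwx 1 root root    2 Jan  1 12:00 view -> vi'

# /^v/ is tested against the whole record, which starts with "-" or "l".
whole_line=$(printf '%s\n' "$ls_sample" | awk '/^v/ { print $9 }')

# $9 ~ /^v/ is tested against the file name only.
ninth_field=$(printf '%s\n' "$ls_sample" | awk '$9 ~ /^v/ { print $9 }')

printf 'whole-line match: [%s]\n' "$whole_line"
printf 'field match: [%s]\n' "$ninth_field"
```

The first result is empty because every line begins with the permission string, while the second yields vi and view.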
But you really don't want to create a pipeline for this, and what would happen if your files contain a space?
You can consider using find or globs instead:
$ printf '%s\n' /usr/bin/v*
/usr/bin/vi
/usr/bin/view
...
or
$ find /usr/bin -name 'v*' -print
/usr/bin/vi
/usr/bin/view
...
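A minimal sketch of the file-names-with-spaces point, using a throwaway directory rather than /usr/bin. The directory and the file names in it are made up for the demonstration.

```shell
# Build a scratch directory with one awkward file name.
demo_dir=$(mktemp -d)
touch "$demo_dir/vi" "$demo_dir/view" "$demo_dir/v with space" "$demo_dir/awk"

# The glob expands to exactly one word per matching file, spaces included.
set -- "$demo_dir"/v*
glob_count=$#

for path in "$@"; do
  printf '%s\n' "${path##*/}"   # strip the directory part
done

rm -rf "$demo_dir"
```

Splitting ls output on whitespace would break "v with space" into several fields, whereas the glob keeps it as a single name.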

Is using awk without 'awk -F' always fine?

What is the difference on Ubuntu between awk and awk -F? For example, to display the frequency of CPU core 0 we use the command
cat /proc/cpuinfo | grep -i "^cpu MHz" | awk -F ":" '{print $2}' | head -1
But why does it use awk -F? We could use awk without the -F and it would work of course (already tested).
Because without -F, awk would not know which separator to split the line on, so it could not print the right value. The option specifies the field separator for that awk invocation. Without it, awk falls back to its default separator, which is runs of whitespace. For example, if I type ps | grep xeyes | awk '{print $1}' in the terminal, awk splits on whitespace and prints the first value, the PID of the xeyes process. I found it at https://www.shellunix.com/awk.html. Thanks to all.
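A small sketch of how the separator changes what $2 refers to, run on a canned line shaped like a /proc/cpuinfo entry. The frequency value and the variable names are assumptions for illustration.

```shell
# One line in the shape of a /proc/cpuinfo entry (the value is made up).
cpuinfo_sample=$(printf 'cpu MHz\t\t: 2400.000')

# Default splitting on whitespace: the fields are "cpu", "MHz", ":", "2400.000".
default_split=$(printf '%s\n' "$cpuinfo_sample" | awk '{ print $2 }')

# Splitting on ":" instead: field 2 is everything after the colon.
colon_split=$(printf '%s\n' "$cpuinfo_sample" | awk -F ':' '{ print $2 }')

printf 'default: [%s]\n' "$default_split"
printf 'with -F: [%s]\n' "$colon_split"
```

With the default separator the frequency is field 4, so both forms can be made to work on this line; -F ':' is convenient because the value is always field 2 no matter how many words the label has.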

ksh cmd one-liner to grep for several PIDs at once

I got a bunch of processes that I need to check CPU affinity for, so I got this one liner:
for i in `ps -Fae | grep proc_name| awk '{print $2}'`; do taskset -acp $i;done
but I have a problem, taskset shows all the child processes' pid too so I get a massive line of numbers along with their cpu affinity.
I want to pipe the above line into an egrep 'pid1|pid2' so I can filter out all the child processes.
I tried this:
for i in `ps -Fae | grep proc_name| awk '{print $2}'`; do taskset -acp $i;done | xargs egrep 'ps -Fae | grep proc_name| awk '{print $2}''
but my ksh shell didn't like the awk brackets at all.
So I have two questions:
can taskset be changed to show only parent pid?
how do I write the last bit where I egrep only the parent pid?
Filter inside the loop:
for i in $(ps -Fae | grep proc_name| grep -v grep | awk '{print $2}'); do
    taskset -acp "$i" | grep "$i"
done
It sounds like you're asking for this syntax if it were bash (see https://mywiki.wooledge.org/BashFAQ/001, I'm not sure what the equivalent robust read loop syntax is for ksh):
while IFS= read -r i; do
    taskset -acp "$i"
done < <(ps -Fae | awk '/proc_name/{print $2}') |
grep -E 'pid1|pid2'
but that's pretty fragile, e.g. if pid1 appeared as a substring of some other pid. If you edit your question to provide concise, testable sample input (i.e. the output of ps -Fae and the associated output of taskset) plus the expected output then we can be of more help.
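The substring problem can be sketched on canned data. The lines below assume taskset prints one "pid N's current affinity list: ..." line per task, and the PIDs are invented; the variable names are hypothetical too.

```shell
# Canned stand-in for combined `taskset -acp` output; the PIDs are made up.
taskset_sample="pid 123's current affinity list: 0-3
pid 1234's current affinity list: 0-3
pid 1235's current affinity list: 2"

parent_pid=123

# A plain substring match also catches 1234 and 1235.
substring_hits=$(printf '%s\n' "$taskset_sample" | grep -c "$parent_pid")

# Comparing the whole second field ("123's") matches the parent only.
# \047 is the octal escape for a single quote inside an awk string.
parent_only=$(printf '%s\n' "$taskset_sample" |
  awk -v pid="$parent_pid" '$2 == (pid "\047s")')

printf 'substring matches: %s\n' "$substring_hits"
printf '%s\n' "$parent_only"
```

Comparing a whole field for equality sidesteps the regex entirely, so a short PID can never match inside a longer one.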

Problem with awk and grep

I am using the following script to find the running process and print its ID, command, and so on.
if [ "`uname`" = "SunOS" ]
then
    awk_c="nawk"
    ps_d="/usr/ucb/"
    time_parameter=7
else
    awk_c="awk"
    ps_d=""
    time_parameter=5
fi
main_class=RiskEngine
connection_string=db.regression
AWK_CMD='BEGIN{printf "%-15s %-6s %-8s %s\n","ID","PID","STIME","Cmd"} {printf "%-15s %-6s %-8s %s %s %s\n","MY_APP",$2,$time_parameter, main_class, connection_string, port}'
while getopts ":pnh" opt; do
    case $opt in
        p) AWK_CMD='{ print $2 }'
           do_print_message=1;;
        n) AWK_CMD='{printf "%-15s %-6s %-8s %s %s %s\n","MY_APP",$2,$time_parameter,main_class, connection_string, port}' ;;
        h) print "usage : `basename ${0}` {-p} {-n} : Returns details of process running "
           print " -p : Returns a list of PIDS"
           print " -n : Returns process list without preceding header"
           exit 1 ;
    esac
done
ps auxwww | grep $main_class | grep 10348 | grep -v grep | ${awk_c} -v main_class=$merlin_main_class -v connection_string=$merlin_connection_string -v port=10348 -v time_parameter=$time_parameter "$AWK_CMD"
# cat /etc/redhat-release
Red Hat Enterprise Linux AS release 4 (Nahant Update 6)
# uname -a
Linux deapp25v 2.6.9-67.0.4.EL #1 Fri Jan 18 04:49:54 EST 2008 x86_64 x86_64 x86_64 GNU/Linux
When I execute the following, either on its own or inside the script,
# ps auxwww | grep $main_class | grep 10348 | grep -v grep | ${awk_c} -v main_class=$merlin_main_class -v connection_string=$merlin_connection_string -v port=10348 -v time_parameter=$time_parameter "$AWK_CMD"
I get two rows on Linux:
ID PID STIME Cmd
MY_APP 6217 2355352 RiskEngine 10348
MY_APP 21874 5316 RiskEngine 10348
I have only one JVM (java command) running in the background, but I still see two rows.
I know that one of them (the duplicate with PID 21874) comes from the awk command I am executing: its command line also contains the main class and the port, hence the two rows. Can you please help me avoid the duplicate row?
AWK can do all that grepping for you.
Here is a simple example of how an AWK command can be selective:
ps auxww | awk -v select="$main_class" '$0 ~ select && /10348/ && !(/grep/ || /awk/) {print}'
ps can be made to selectively output fields which will help a little to reduce false positives. However pgrep may be more useful to you since all you're really using is the PID from the result.
pgrep -f "$main_class.*10348"
I've reformatted the code as code, but you need to learn that the return key is your friend. The monstrously long pipelines should be split over multiple lines - I typically use one line per command in the pipeline. You can also write awk scripts on more than one line. This makes your code more readable.
Then you need to explain to us what you are up to.
However, it is likely that you are using 'awk' as a variant on grep and are finding that the value 10348 (possibly intended as a port number on some command line) is also in the output of ps as one of the arguments to awk (as is the 'main_class' value), so you get the extra information. You'll need to revise the awk script to eliminate (ignore) the line that contains 'awk'.
Note that you could still be bamboozled by a command running your main class on port 9999 (any value other than 10348) if it so happens that it is run by a process with PID or PPID equal to 10348. If you're going to do the job thoroughly, then the 'awk' script needs to analyze only the 'command plus options' part of the line.
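One way to restrict the match to the command part is to rebuild it from field 11 onwards, as in this sketch. It assumes the usual ps aux layout where the first ten fields are fixed columns, and every value in ps_sample, including the PIDs and the 9999 port, is invented to reproduce the situations described above.

```shell
# Canned stand-in for `ps auxwww` output; all values are invented for this sketch.
ps_sample='appuser 6217 0.5 2.1 2355352 180000 ? Sl 10:01 1:23 java -cp app.jar RiskEngine db.regression 10348
appuser 21874 0.0 0.0 5316 800 pts/0 S+ 10:05 0:00 awk -v main_class=RiskEngine -v port=10348 {print}
appuser 10348 0.1 1.0 900000 90000 ? Sl 09:00 0:10 java -cp app.jar RiskEngine db.regression 9999'

matching_pids=$(printf '%s\n' "$ps_sample" |
  awk -v cls="RiskEngine" -v port="10348" '
    {
      # Rebuild the command line from field 11 onwards; fields 1-10 are ps columns.
      cmd = ""
      for (i = 11; i <= NF; i++) cmd = cmd " " $i
    }
    # Skip awk/grep helpers, then look for the class and port in the command only.
    $11 !~ /(awk|grep)$/ && index(cmd, cls) && index(cmd, port) { print $2 }')

printf '%s\n' "$matching_pids"
```

Here the awk helper is skipped because its executable name is field 11, and the process whose PID happens to be 10348 is skipped because that number never appears in its command line.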
You're already using the grep -v grep trick in your code, why not just update it to exclude the awk process as well with grep -v ${awk_c}?
In other words, the last command of your script would be as follows (with the real command parameters to awk rather than blah blah blah; it is split across lines here for readability, with each line ending in | so the shell continues the pipeline):
ps auxwww |
grep $main_class |
grep 10348 |
grep -v grep |
grep -v ${awk_c} |
${awk_c} -v blah blah blah
This will ensure the list of processes will not contain any with the word awk in it.
Keep in mind that it's not always a good idea to do it this way (false positives) but, since you're already taking the risk with processes containing grep, you may as well do so with those containing awk as well.
You can add this simple code in front of all your awk args:
'!/awk/ { .... original awk code .... }'
The '!/awk/' will have the effect of telling awk to ignore any line containing the string awk.
You could also remove your 'grep -v' if you extended my awk suggestion into something like:
'!/awk/ && !/grep/ { ... original awk code ... }'