Why can't I kill an Erlang process?

I am spawning 2 processes and it seems I cannot kill either of them:
restarter - a process that spawns the worker whenever it goes down
worker - a process that gets messages from the shell, accumulates them, and returns them in the exit reason it sends to the restarter, which in turn forwards them to the shell.
The worker process can't be killed, since the restarter restarts it whenever it traps an exit message. But what keeps the restarter process alive?
-module(mon).
-compile([debug_info]).
-export([worker/1,init/0,restarter/2,clean/1]).
% ctrl+g
init() ->
    Pid = spawn(?MODULE, restarter, [self(), []]),
    register(restarter, Pid),
    Pid.

restarter(Shell, Queue) ->
    process_flag(trap_exit, true),
    Wk = spawn_link(?MODULE, worker, [Queue]),
    register(worker, Wk),
    receive
        {'EXIT', Pid, {Queue, normal}} ->
            Shell ! {Queue, "From res: worker died peacefully, wont restart"};
        {'EXIT', Pid, {Queue, horrible}} ->
            Shell ! {Queue, "Processed so far:"},
            Shell ! "will restart in 5 seconds, select fresh/stale -> 1/0",
            receive
                1 ->
                    Shell ! "Will restart fresh",
                    restarter(Shell, []);
                0 ->
                    Shell ! "Will continue work",
                    restarter(Shell, Queue)
            after 5000 ->
                Shell ! "No response -> started with 666",
                restarter(Shell, [666])
            end;
        {MSG} ->
            Shell ! {"Unknown message...closing", MSG}
    end.

worker(Queue) ->
    receive
        die -> exit({Queue, horrible});
        finish -> exit({Queue, normal});
        MSG -> worker([{time(), MSG} | Queue])
    end.
Usage
mon:init().
regs(). %worker and restarter are working
whereis(worker) ! "msg 1", whereis(worker) ! "msg2".
whereis(worker) ! finish.
flush(). % should get the first clause from restarter
regs(). % worker should be up and running again
exit(whereis(restarter),reason).
regs(). % restarter should be dead

In this scenario, the restarter process is trapping exits, so exit(whereis(restarter), reason) doesn't kill it. The exit signal gets converted to a message, and gets put into the message queue of the process:
> process_info(whereis(restarter), messages).
{messages,[{'EXIT',<0.76.0>,reason}]}
The reason it's still in the message queue is that none of the clauses in the receive expression matches this message. The first two clauses are specific to the exit reasons used by the worker process, and the last clause might look like a catch-all clause but it actually isn't - it matches any message that is a tuple with one element. If it were written MSG instead of {MSG}, it would have received the exit reason message, and sent "Unknown message" to the shell.
If you really want to kill the process, use the kill reason:
exit(whereis(restarter), kill).
A kill exit signal is untrappable, even if the process is trapping exits.
Another thing: the first two receive clauses will only match if the worker's queue is empty. That is because it reuses the variable name Queue, so the queue in {'EXIT',Pid,{Queue,normal}} must be equal to the value passed as an argument to the restarter function. In a situation like this, you'd normally use NewQueue or something as the variable in the receive clauses.
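Putting those two fixes together, the receive in restarter/2 might look roughly like the sketch below; the interactive restart prompt from the original is left out to keep it short, and only the clause heads really change:

receive
    {'EXIT', _Pid, {NewQueue, normal}} ->
        Shell ! {NewQueue, "From res: worker died peacefully, wont restart"};
    {'EXIT', _Pid, {NewQueue, horrible}} ->
        Shell ! {NewQueue, "Processed so far:"},
        restarter(Shell, NewQueue);               % prompt/timeout logic as before
    MSG ->                                        % plain variable: a real catch-all
        Shell ! {"Unknown message...closing", MSG}
end.

With the plain-variable catch-all, the {'EXIT',<0.76.0>,reason} message shown above would be matched and drained from the mailbox instead of sitting there forever.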

Related

Monit false alerts

I am monitoring a Java daemon process with a PID file. Below is the configuration:
check process SemanticReplication with pidfile "/ngs/app/edwt/opsmonit /monit/scripts/process.pid"
  start = "/ngs/app/edwt/scripts/javadaemon/start_daemon.ksh"
  stop = "/ngs/app/edwt/scripts/javadaemon/stop_daemon.ksh"
Many times, even though the Java daemon process is up and running, I get a false alert saying the process is not running.
In the next Monit check cycle (after a minute), another alert triggers saying the process is up and running again.
Can someone help with how to avoid these false alerts?
Your check statement has Monit check for the existence of the PID file (whose path looks odd with the space in it, by the way). If the file isn't there, Monit sends an alert by default and then runs the start directive.
I get around this by having a check process ... matching statement like so:
check process app-pass matching 'Passenger RubyApp: \/home\/app\/app-name\/public'
Essentially, "matching" does the equivalent of ps aux | grep ..., which works better when I can't rely on a PID file existing, for example with a child process.
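Applied to the question above, a matching-based check might look something like this sketch; the pattern is only a guess at a string that appears in the Java daemon's command line, so adjust it to whatever ps shows for your process:

check process SemanticReplication matching "SemanticReplication"
  start program = "/ngs/app/edwt/scripts/javadaemon/start_daemon.ksh"
  stop program = "/ngs/app/edwt/scripts/javadaemon/stop_daemon.ksh"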

Killing processes - AHK

So far I have:
Process, Exist notepad.exe
Process, Close, %p_id%
How do you set AHK to kill the process if it exists? I read it's something to do with the PID, but I don't know how to implement that.
Have a look at the Documentation.
You can kill by simply using the name of the process:
Process, Close, notepad.exe
If the process does not exist, it will do nothing.
If you still would like to kill the process by using the pid instead, you must use the WinGet command in order to retrieve the pid.
There are at least two ways to get the PID from a window that I can think of immediately:
1:
WinGet, My_PID, PID, WinTitle
2:
Run, ProgramFilePath "Args", Options, My_PID
The first one gets the PID of an already running window, and the second gets the PID when opening the program with AHK. In both cases the variable "My_PID" now contains the window's process ID.
To answer your question of closing a process if it exists, you could try a couple of methods:
ifWinExist ahk_pid %My_PID%
    Process, Close, %My_PID%
; OR
Process, Exist, %My_PID% ; from my examples above
;Process, Exist, notepad.exe ; from your example above
If ErrorLevel ; ErrorLevel is set to the matching PID if found
    Process, Close, %ErrorLevel%
I think that should answer your immediate question.
This AHK script kills the active process when pressing Ctrl+Alt+K:
^!k::
{
    WinGet, xPID, PID, A
    Process, Close, %xPID%
}
return

Sun Grid Engine resubmit job stuck in 'Rq' state

I have what I hope is a pretty simple question, but I'm not super familiar with Sun Grid Engine, so I've been having trouble finding the answer. I am currently submitting jobs to a grid using a bash submission script that generates a command and then executes it. I have read online that if a Sun Grid Engine job exits with a code of 99, it gets resubmitted to the grid. I have successfully written my bash script to do this:
[code to generate command, stores in $command]
$command
STATUS=$?
if [[ $STATUS -ne 0 ]]; then
    exit 99
fi
exit 0
When I submit this job to the grid with a command that I know has a non-zero exit status, the job does indeed appear to be resubmitted; however, the scheduler never sends it to another host. Instead it just remains stuck in the queue with the status "Rq":
job-ID prior name user state submit/start at queue slots ja-task-ID
-----------------------------------------------------------------------------------------------------------------
2150015 0.55500 GridJob.sh my_user Rq 04/08/2013 17:49:00 1
I have a feeling that this is something simple in the config options for the queue, but I haven't been able to find anything googling. I've tried submitting this job with the qsub -r y option, but that doesn't seem to change anything.
Thanks!
Rescheduled jobs will only get run in queues that have their rerun attribute (FALSE by default) set to TRUE, so check your queue configuration (qconf -mq myqueue). Without this, your job remains in the rescheduled-pending state indefinitely because it has nowhere to go.
IIRC, submitting jobs with qsub -r yes only qualifies them for automatic rescheduling in the event of an exec node crash; exiting with status 99 should trigger a reschedule regardless.
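For example, you could inspect and change the attribute like this (myqueue is just a placeholder for your queue name):

qconf -sq myqueue | grep rerun    # shows "rerun FALSE" if rescheduled jobs can't run here
qconf -mq myqueue                 # opens the queue config in an editor; set "rerun TRUE"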

Bad spawn_id while executing expect command

I am writing a script that will copy Valgrind onto whatever shelf we enter on the command line. The syntax is as follows:
vgrindCopy [shelf number]
For some reason, the files copy over without any issue, but after the copy completes the following error is observed:
bad spawn_id (process died earlier?)
while executing
"expect "#""
Here is a copy of the relevant code:
function login_shelf {
    expect -c "
        set timeout 15
        spawn $1
        expect \"password:\"
        send \"$PW\r\"
        expect \"#\"
        sleep 1
        exit
    "
}
# login and make the valgrind directory at /sfs/software/shelf/current
set -- /opt/swe/tools/ext/gnu/valgrind-3.7.0/i686-linux2.6/lib/valgrind/*
login_shelf "/opt/corp/projects/shelftools/bin/app rsync -Lau $* $shelf:/shelf/valgrind"
After playing around with the code, I found that if I remove the line "expect \"#\"", then the program doesn't copy any of the files over anymore. What's odd as well is that I'm seeing the issue when I run the script, but a co-worker is not.
Has anyone had a similar issue and determined the cause? Any help would be greatly appreciated as always!
Your code spawns the rsync and, at the expect \"#\", waits for rsync to output a #, which it never does; rsync eventually exits, and expect reports the error.
When you remove the expect \"#\", the expect script exits immediately, terminating the rsync before it has copied anything.
Instead of expect \"#\" you should wait for rsync to exit:
expect eof
wait
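So the login_shelf function would end up something like this sketch, with everything else unchanged:

function login_shelf {
    expect -c "
        set timeout 15
        spawn $1
        expect \"password:\"
        send \"$PW\r\"
        expect eof
        wait
    "
}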

Terminate NSTask even if app crashes

If my app crashes I don't get a chance to terminate the NSTasks it spawned, so they stay around eating up resources.
Is there any way to launch a task such that it terminates when your app terminates (even if it crashes)?
I suppose you need to handle application crashes manually and terminate the spawned processes in a different way. For example, check the following article: http://cocoawithlove.com/2010/05/handling-unhandled-exceptions-and.html. In the exception/signal handler that runs when the application crashes, send a kill signal to your child processes using kill(pid, SIGKILL); for this you also need to keep the PIDs of the child processes (NSTask's -(int)processIdentifier) somewhere the handler can read them.
What I've done in the past is create a pipe in the parent process and pass one end of it to the child, while the parent holds on to the other end without ever using it. The child watches its end of the pipe: when the parent exits, even by crashing, the kernel closes the parent's end, and the child can detect that (end-of-file if the child holds the read end) and shut itself down. You'll also need to mark the parent's end of the pipe close-on-exec so the spawned child doesn't inherit a duplicate of it, which would keep the pipe from ever looking closed.
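As an illustration only, here is a minimal C sketch of the child-side watch, assuming the parent kept the write end (marked close-on-exec) and arranged for the child to receive the read end as file descriptor 3; the fd number and messages are made up for the example:

#include <stdio.h>
#include <unistd.h>

#define PARENT_PIPE_FD 3   /* assumption: where the launcher put the pipe's read end */

int main(void) {
    char buf;
    /* The parent never writes, so this read() blocks until the parent's end of
       the pipe is closed -- which the kernel does automatically when the parent
       exits, even if it crashed. read() then returns 0 (end of file). */
    ssize_t n = read(PARENT_PIPE_FD, &buf, 1);
    if (n <= 0) {
        fprintf(stderr, "parent is gone, shutting down\n");
    }
    return 0;
}

In a real child you would run that blocking read on its own thread (or watch the descriptor from a run loop) so the child can keep doing its normal work while it waits.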
I actually wrote a program / script / whatever that does just this… Here's the shell script that was the foundation of it. The project actually implements it within Xcode as a single-file executable; weird that Apple makes this so precarious, IMO.
#!/bin/bash
echo "arg1 is the SubProcess: $1, arg2 is sleepytime: $2, and arg3 is ParentPID, aka $$: $3"
CHILD=$1 && SLEEPYTIME=$2 || SLEEPYTIME=10; PARENTPID=$3 || PARENTPID=$$

GoSubProcess () {                    # define functions, start script at very end.
    $CHILD arguments &               # "&" puts SubP in background subshell
    CHILDPID=$!                      # what all the fuss is about.
    if kill -0 $CHILDPID; then       # rock the cradle to make sure it aint dead
        echo "Child is alive at $!"  # glory be to god
    else
        echo "couldnt start child. dying."; exit 2
    fi
    babyRISEfromtheGRAVE             # keep an eye on child process
}

babyRISEfromtheGRAVE () {
    echo "PARENT is $PARENTPID"      # remember where you came from, like j.lo
    while kill -0 $PARENTPID; do     # is that fount of life, nstask parent alive?
        echo "Parent is alive, $PARENTPID is its PID"
        sleep $SLEEPYTIME            # you lazy boozehound
        if kill -0 $CHILDPID; then   # check on baby.
            echo "Child is $CHILDPID and is alive."
            sleep $SLEEPYTIME        # naptime!
        else
            echo "Baby, pid $CHILDPID died! Respawn!"
            GoSubProcess             # restart daemon if it dies
        fi
    done                             # if this while loop ends, the parent PID crashed.
    logger "My Parent Process, aka $PARENTPID died!"
    logger "I'm killing my baby, $CHILDPID, and myself."
    kill -9 $CHILDPID; exit 1        # process table cleaned. nothing is left. all three tasks are dead. long live nstask.
}

GoSubProcess                         # this is where we start the script.
exit 0                               # this is where we never get to
You could have your tasks periodically check to see if their parent process still exists.
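For instance, once the original parent dies, the child gets re-parented (to launchd, PID 1, on macOS), so a change in getppid() means the parent is gone. A minimal C sketch of that idea, with an arbitrary 5-second polling interval:

#include <stdlib.h>
#include <unistd.h>

int main(void) {
    pid_t original_parent = getppid();
    for (;;) {
        if (getppid() != original_parent) {
            /* the parent exited or crashed; clean up and quit */
            exit(0);
        }
        sleep(5);   /* polling interval is an arbitrary choice */
    }
}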