How to get notified when a process terminates in Windows and Linux?

I want to write a program that is notified by the OS whenever any running process on that OS dies.
I don't want to poll and compare every time to see whether a previously existing process has died. I want my program to be alerted by the OS whenever a process terminates.
How do I go about it? Some sample code would be very helpful.
PS: Looking for approaches in Java/C++.

On Windows, it sounds like you want PsSetCreateProcessNotifyRoutine(). Note that it is a kernel-mode routine, so it has to be called from a driver. See this article to get started:
http://www.codeproject.com/KB/threads/procmon.aspx

Under Unix, you could use the SIGCHLD signal to get notified of the death of a process. This requires, however, that the process being monitored is a child of the monitoring process.
Under Windows, you need a valid handle to the process. If you spawn the process yourself using CreateProcess, you get the handle for free; otherwise you must acquire one by other means (for example with OpenProcess, requesting the SYNCHRONIZE access right). It might then be possible to wait for the process to terminate by calling WaitForSingleObject on the handle.
Sorry, I don't have any example code for this. I am not even sure that waiting on the process handle under Windows really awaits termination of the process (as opposed to some other "significant" condition that causes the process handle to enter the "signaled" state).
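For the Java side of the question, here is a minimal sketch, assuming Java 9 or later, where ProcessHandle.onExit() returns a CompletableFuture that completes when the process terminates. The class name ExitWatcher and the method whenTerminated are made up for illustration. It watches a single known PID, not every process on the system, and how the JDK detects the exit internally is an implementation detail that may differ per platform.

```java
import java.util.Optional;
import java.util.concurrent.CompletableFuture;

class ExitWatcher {
    /**
     * Returns a future that completes once the process with the given PID
     * has terminated. If no such process exists (any more), the returned
     * future is already complete.
     */
    static CompletableFuture<Void> whenTerminated(long pid) {
        Optional<ProcessHandle> handle = ProcessHandle.of(pid);
        if (!handle.isPresent()) {
            return CompletableFuture.completedFuture(null);
        }
        // onExit() completes when the process terminates; thenAccept drops the handle.
        return handle.get().onExit().thenAccept(h -> { });
    }

    public static void main(String[] args) {
        // PID to watch, passed on the command line (e.g. found with pgrep).
        long pid = Long.parseLong(args[0]);
        whenTerminated(pid)
                .thenRun(() -> System.out.println("Process " + pid + " has terminated"))
                .join(); // keep the main thread alive until the notification fires
    }
}
```

Unlike SIGCHLD, this does not require the watched process to be a child of the watcher. Covering processes that start later would still need some other mechanism.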

I don't have a code sample ready, but one idea on Linux might be to find the ID of the process you'd like to watch when your watcher program first starts (e.g. using $ pgrep) and then use inotify to watch /proc/<PID>/, which disappears when the process dies. In contrast to polling, this doesn't cost any significant CPU resources.
Now, procfs is not completely supported by inotify, so I can't guarantee this approach would actually work but it is certainly worth looking into.

Related

Listen or wait for a specific time without using timer

Is there a way to listen or wait for a specific time (e.g. 11:30 am) every day? The only way I know is to set a timer that checks the current time every 60 seconds, which I have actually implemented using a BackgroundWorker. But is there a way to just wait and listen for the specified time (similar to monitoring for directory changes) and then take some action?
Thanks in advance.
Typically, rather than having a program resident in memory waiting, you would set up a Scheduled Task for this (or a cron job on Linux). The scheduled task will run the program at the appropriate time. The program can still check (validate) the expected time if needed, but it shouldn't just sit in the background using up resources if it's only going to run once per day.
The scheduled task is also better because it will recover automatically from computer reboots, crashes, etc. If something happens that interrupts your program's normal running, the scheduled task will still be able to run.
This is especially important in the .NET world, because .NET requires you to be very careful when writing long-lived programs to avoid address space fragmentation. The .NET garbage collector is good at freeing up old memory and returning it to the operating system, but over time your program's virtual address space can become fragmented, and eventually you will no longer be able to allocate new memory.
Even if this is part of a larger program, where there are also other things happening based on user interactions, it's still a good idea to split this off into a separate process.
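If the wait does have to stay in-process despite the advice above, the usual alternative to checking the clock every 60 seconds is to compute the delay until the next occurrence once and schedule a single wake-up. The sketch below illustrates that in Java rather than .NET; the class name DailyAt and the method delayUntilNext are made up for illustration. It works on local wall-clock time and, as an assumption, ignores daylight-saving transitions and clock adjustments.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.LocalTime;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

class DailyAt {
    /** Time from 'now' until the next occurrence of 'target' (today or tomorrow). */
    static Duration delayUntilNext(LocalDateTime now, LocalTime target) {
        LocalDateTime next = now.toLocalDate().atTime(target);
        if (!next.isAfter(now)) {
            // Today's occurrence has already passed (or is right now): use tomorrow's.
            next = next.plusDays(1);
        }
        return Duration.between(now, next);
    }

    public static void main(String[] args) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        long delayMs = delayUntilNext(LocalDateTime.now(), LocalTime.of(11, 30)).toMillis();
        // One sleeping thread, no periodic clock checks.
        scheduler.schedule(() -> {
            System.out.println("It is 11:30, running the daily action");
            scheduler.shutdown();
        }, delayMs, TimeUnit.MILLISECONDS);
    }
}
```

To repeat every day, the task could recompute the delay and reschedule itself instead of shutting the scheduler down. That still does not survive reboots or crashes, which is the main argument for a scheduled task.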

Spring Batch restart crashed jobs

Hi spring batch users,
regarding the documentation http://docs.spring.io/spring-batch/reference/htmlsingle/#d5e1320
"If the process died ("kill -9" or server failure) the job is, of course, not running, but the JobRepository has no way of knowing because no-one told it before the process died."
I am trying to find and restart the stale job executions by using
Set<JobExecution> jobExecutions = jobExplorer.findRunningJobExecutions(jobName);
...
jobExecution.setStatus(FAILED);
jobExecution.setEndTime(new Date());
jobRepository.update(jobExecution);
jobOperator.restart(jobExecution.getId());
But this seems to be very inconvenient.
1) I have to do this before other (new) jobs could be started.
2) I have to handle multiple instances of running servers so findRunningJobExecutions will not do the trick.
You can find other questions regarding this topic:
https://jira.spring.io/browse/BATCH-2433?jql=project%20%3D%20BATCH%20AND%20status%20%3D%20Open%20ORDER%20BY%20priority%20DESC
Spring Batch after JVM crash
I would love to see a solution to register a "start up clean jobs listener". This will still not fix the problems caused by the multi-server environment, because Spring Batch does not know whether a JobExecution marked STARTED is still running on another instance.
Thanks for any advice
Alex
Your job cannot and should not recover "automatically" from a kill -9 scenario. A kill -9 is treated very differently than your application throwing a caught Exception. The reason for this is that you've effectively pulled the carpet out from under the application without giving it a chance to reach a synchronization point with the database to commit any necessary information to the ExecutionContext or update the job/step status(es). Therefore, the last status touchpoint with the database will remain and the job will still look STARTED.
"OK, fine" you say, "but if I start another execution, I want it to find that STARTED execution, and pick up where it left off." The problem here is that there is no clean way for the application to distinguish a job that is ACTUALLY RUNNING from one that has failed but couldn't update the database. The framework here correctly errs on the side of caution and prevents you from starting a job that already appears to be running, and this is a GOOD thing.
Why? Because let's assume your job was actually still running and you restarted it by accident. As coded, the framework will start to spin up, see your running execution and fail with the following message: "A job execution for this job is already running". I can't tell you how many times we've been saved by this because someone accidentally launched a job twice!
If you were to implement the listener you suggest, the 2nd execution would instead be allowed to start and you'd have 2 different JVMs repeating the same work, possibly writing to the same files/tables and causing a huge data mess that could be impossible to clean up.
Trust me, in the event the Linux terminal kills your job or your job dies because the connection to the database has been severed, you WANT human eyes on those execution states before you attempt a restart.
Finally, on the off chance you actually wanted to kill your job, you can leverage several other standard patterns for stopping jobs:
Stop via throw Exception
Stop via JobOperator.stop()

dump per process stack in linux

I need to dump the stack of each and every process in the Linux kernel when the system hangs.
I am currently trying to implement this in one of my kernel modules, based on the watchdog timer timeout.
Watchdog timer is reset by a user daemon for every timeslice.
When the system hangs, there's no one up to reset the timer.
Hence it expires and generates an interrupt.
I wrote an interrupt handler where I should dump the stack of every process running.
So, my question is: how can I dump the stack of every process in the kernel?
Thanks
Venkatesh
show_state() (include/linux/sched.h) will do this for you. BTW, this function is already available from the sysrq handler (the 't' key), which you might be able to make use of if it's enabled. See Documentation/sysrq.txt (Documentation/admin-guide/sysrq.rst in newer kernel trees).
Also, there are some other kernel debugging options you might be able to enable to help find your problem. Check out the Kernel Hacking menu in make menuconfig. In particular, CONFIG_LOCKUP_DETECTOR ("Detect Hard and Soft Lockups" in the menu) might be helpful.

Will detached NSThreads always complete prior to application exit?

When using NSThread's detachNewThreadSelector:toTarget:withObject:, I'm finding that the thread will fully complete its execution before the application is terminated normally if the user were to attempt to quit the application while the background process was executing.
In this case, this is the behavior I desire, but I couldn't find anything in Apple's docs that suggests that this will always be the case. The only relevant information I was able to find was the following, from Apple's Threading Programming Guide:
Important: At application exit time, detached threads can be terminated immediately but joinable threads cannot. Each joinable thread must be joined before the process is allowed to exit. Joinable threads may therefore be preferable in cases where the thread is doing critical work that should not be interrupted, such as saving data to disk.
So from this, I know that detached threads can be terminated at the time of application exit, but will they ever be terminated automatically? Or, am I always safe to assume the thread will complete its execution before the application quits?
You cannot assume that any thread -- including the main thread -- will ever complete execution normally, regardless of the documentation.
This is because the user can quit an application at any time, the system may lose power/panic, or the app may crash.
As for detached threads, it would not be unheard of for the system frameworks to automatically terminate the app forcibly after some timeout once the main event loop has given up the ghost.

Tracking Chrome and its many processes

I'm trying to keep an eye on how long an application runs. To do this, I capture every process's ID as it starts, and when that process is shut down, I log the time. However, Google's Chrome starts and stops like 6 processes when you start it up and shut it down, meaning each execution of Chrome gets logged multiple times.
Is there a better way to track the execution of an application than by process ID? Or is there, perhaps, a technique for getting around this particular problem? I'd considered not adding a process ID if a process with the same ID was added within a second or so, but that seems exploitable.
Any ideas?
I am not 100% sure, but I would assume that one of the Chrome processes must be the parent. Try eliminating processes from your list if they share the same parent (PPID), as long as that parent is not init (PID 1).
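One way to read the parent-based idea is sketched below in Java: treat a process as part of an already-tracked application when its parent is in the list and has the same executable name, and log only the remaining "root" processes. The class ProcessRoots, the Proc type and its fields are hypothetical, and the sketch assumes the PID, PPID and name of each process have already been collected by whatever mechanism the tracker uses.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ProcessRoots {
    /** Hypothetical snapshot of one observed process. */
    static final class Proc {
        final long pid;
        final long ppid;
        final String name;

        Proc(long pid, long ppid, String name) {
            this.pid = pid;
            this.ppid = ppid;
            this.name = name;
        }
    }

    /** Keeps only processes whose parent is not a listed process with the same name. */
    static List<Proc> roots(List<Proc> procs) {
        Map<Long, Proc> byPid = new HashMap<>();
        for (Proc p : procs) {
            byPid.put(p.pid, p);
        }
        List<Proc> result = new ArrayList<>();
        for (Proc p : procs) {
            Proc parent = byPid.get(p.ppid);
            boolean childOfSameApp = parent != null && parent.name.equals(p.name);
            if (!childOfSameApp) {
                result.add(p); // top-level process of an application run
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Proc> procs = new ArrayList<>();
        procs.add(new Proc(100, 1, "chrome"));   // browser process
        procs.add(new Proc(101, 100, "chrome")); // helper spawned by 100
        procs.add(new Proc(200, 1, "bash"));
        for (Proc p : roots(procs)) {
            System.out.println(p.pid + " " + p.name);
        }
    }
}
```

Comparing names as well as parents is a design choice of this sketch: it avoids collapsing unrelated programs that happen to be launched from the same shell or launcher.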
I ended up just checking if I was adding a duplicate. Not very efficient, but easy and effective. It will serve for now.