Reference for proper handling of PID file on Unix - locking

Where can I find a well-respected reference that details the proper handling of PID files on Unix?
On Unix operating systems, it is common practice to “lock” a program (often a daemon) by use of a special lock file: the PID file.
This is a file in a predictable location, often ‘/var/run/foo.pid’. The program is supposed to check when it starts up whether the PID file exists and, if the file does exist, exit with an error. So it's a kind of advisory, collaborative locking mechanism.
The file contains a single line of text, being the numeric process ID (hence the name “PID file”) of the process that currently holds the lock; this allows an easy way to automate sending a signal to the process that holds the lock.
What I can't find is a good reference on expected or “best practice” behaviour for handling PID files. There are various nuances: how to actually lock the file (don't bother? use the kernel? what about platform incompatibilities?), handling stale locks (silently delete them? when to check?), when exactly to acquire and release the lock, and so forth.
Where can I find a respected, most-authoritative reference (ideally on the level of W. Richard Stevens) for this small topic?

First off, on all modern UNIXes /var/run does not persist across reboots.
The general method of handling the PID file is to create it during initialization and delete it on any exit, whether normal or via a signal handler.
There are two canonical ways to atomically create/check for the file. The main one these days is to open it with the O_EXCL flag: if the file already exists, the call fails. The old way (mandatory on systems without O_EXCL) is to create the file under a unique temporary name and then link() it to the well-known name; the link fails if the target already exists.
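For illustration, here is a minimal C sketch of the O_EXCL variant (the /var/run/foo.pid path and the error handling are illustrative only; the link()-based variant follows the same pattern, with a temporary file and link() in place of O_EXCL):
/* Sketch: atomically create a PID file with O_EXCL. If the file already
   exists, open() fails with EEXIST and we assume another instance holds
   the lock. */
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(void)
{
    const char *pidfile = "/var/run/foo.pid";   /* illustrative path */
    int fd = open(pidfile, O_WRONLY | O_CREAT | O_EXCL, 0644);
    if (fd == -1) {
        if (errno == EEXIST)
            fprintf(stderr, "already running: %s exists\n", pidfile);
        else
            perror("open");
        return EXIT_FAILURE;
    }
    char buf[32];
    int len = snprintf(buf, sizeof buf, "%ld\n", (long)getpid());
    if (write(fd, buf, len) != len) {
        perror("write");
        unlink(pidfile);
        return EXIT_FAILURE;
    }
    close(fd);
    /* ... daemon work ... */
    unlink(pidfile);                             /* release the "lock" on exit */
    return EXIT_SUCCESS;
}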

As far as I know, PID files are a convention rather than something for which you can find a respected, most-authoritative source. The closest I could find is this section of the Filesystem Hierarchy Standard.
This Perl library might be helpful, since it looks like the author has at least given thought to some issues that can arise.
I believe that files under /var/run are often handled by the distro maintainers rather than daemons' authors, since it's the distro maintainers' responsibility to make sure that all of the init scripts play nice together. I checked Debian's and Fedora's developer documentation and couldn't find any detailed guidelines, but you might be able to get more info on their developers' mailing lists.

See Kerrisk's The Linux Programming Interface, section 55.6 "Running Just One Instance of a Program" which is based on the pidfile implementation in Stevens' Unix Network Programming, v2.
Note also that the location of the pidfile is usually something handled by the distro (via an init script), so a well written daemon will take a command line argument to specify the pidfile and not allow this to be accidentally overridden by a configuration file. It should also gracefully handle a stale pid file by itself (O_EXCL should not be used). fcntl() file locking should be used--you may assume that a daemon's pidfile is located on a local (non-NFS) filesystem.
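As a rough illustration of the fcntl() approach, here is a sketch in the spirit of the Stevens/Kerrisk pidfile code (not their actual source; the path and error handling are simplified):
/* Sketch: hold a write lock on the PID file for the daemon's lifetime.
   A stale file left by a crashed instance is harmless, because its lock
   died with it; the new instance simply re-locks and overwrites it. */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

static int acquire_pidfile(const char *path)
{
    int fd = open(path, O_RDWR | O_CREAT, 0644);
    if (fd == -1)
        return -1;

    struct flock fl;
    memset(&fl, 0, sizeof fl);
    fl.l_type = F_WRLCK;                  /* exclusive write lock */
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;                         /* 0 = lock the whole file */

    if (fcntl(fd, F_SETLK, &fl) == -1) {  /* non-blocking attempt */
        close(fd);
        return -1;                        /* another instance holds the lock */
    }

    /* We own the lock: replace whatever stale contents were there. */
    char buf[32];
    int len = snprintf(buf, sizeof buf, "%ld\n", (long)getpid());
    if (ftruncate(fd, 0) == -1 || write(fd, buf, len) != len) {
        close(fd);
        return -1;
    }
    return fd;                            /* keep fd open: closing it drops the lock */
}

int main(void)
{
    if (acquire_pidfile("/var/run/foo.pid") == -1) {   /* illustrative path */
        fprintf(stderr, "another instance appears to be running\n");
        return EXIT_FAILURE;
    }
    /* ... daemon work; the lock is released automatically when the process exits ... */
    return EXIT_SUCCESS;
}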

Depending on the distribution, it's actually the init script that handles the pidfile. It checks for its existence when starting, removes it when stopping, etc. I don't like doing it that way. I write my own init scripts and don't typically use the standard init functions.
A well-written program (daemon) will have some kind of configuration file saying where this pidfile (if any) should be written. It will also take care to establish signal handlers so that the PID file is cleaned up on normal or abnormal exit, wherever a signal can be handled. The PID file then gives the init script the correct PID so the daemon can be stopped.
Therefore, if the pidfile already exists when starting, it's a very good indicator to the program that it previously crashed and should attempt some kind of recovery effort (if applicable). You kind of shoot that logic in the foot if you have the init script itself checking for the existence of the PID file, or unlinking it.
As for naming, the pidfile should follow the program name. If you are starting 'foo-daemon', it would be foo-daemon.pid.
You should also explore /var/lock/subsys, however that's used mostly on Red Hat flavors.
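A hedged C sketch of the cleanup behaviour described in this answer; the pidfile path, the choice of signals, and the placeholder main loop are illustrative only:
/* Sketch: remove the PID file on normal exit and on the common
   termination signals, so the init script sees a clean state. */
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

static const char *g_pidfile = "/var/run/foo-daemon.pid";  /* placeholder path */

static void remove_pidfile(void)
{
    unlink(g_pidfile);
}

static void on_terminate(int sig)
{
    unlink(g_pidfile);          /* unlink() is async-signal-safe */
    _exit(128 + sig);
}

int main(void)
{
    FILE *f = fopen(g_pidfile, "w");
    if (!f) { perror("fopen"); return EXIT_FAILURE; }
    fprintf(f, "%ld\n", (long)getpid());
    fclose(f);

    atexit(remove_pidfile);     /* normal exit path */

    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_terminate;
    sigaction(SIGTERM, &sa, NULL);
    sigaction(SIGINT, &sa, NULL);
    sigaction(SIGHUP, &sa, NULL);

    /* ... real daemon work would go here ... */
    pause();                    /* placeholder for the daemon's main loop */
    return EXIT_SUCCESS;
}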

The systemd package on Red Hat 7 provides a man page daemon(7) with the header line "Writing and packaging system daemons."
This man page discusses both "old style" (SysV) and "new style" (systemd) daemonization. In new style, systemd itself handles the PID file for you (if configured to do so). However, for old style, the man page has this to say:
In the daemon process, write the daemon PID (as returned by getpid())
to a PID file, for example /run/foobar.pid (for a hypothetical daemon
"foobar") to ensure that the daemon cannot be started more than once.
This must be implemented in race-free fashion so that the PID file is
only updated when it is verified at the same time that the PID
previously stored in the PID file no longer exists or belongs to a
foreign process.
You can also read this man page online.
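As a rough sketch of the verification step the man page describes: the following only checks whether the stored PID is still alive. Making the whole check-and-rewrite race-free also requires holding a lock (e.g. fcntl()) across it, and the man page's "foreign process" check is omitted here.
/* Sketch: check whether the PID recorded in an existing PID file still
   refers to a live process. */
#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/* Returns 1 if the PID file looks stale (safe to take over), 0 otherwise. */
static int pidfile_is_stale(const char *path)
{
    FILE *f = fopen(path, "r");
    if (!f)
        return errno == ENOENT;          /* no file at all: nothing holds it */

    long pid = 0;
    int ok = (fscanf(f, "%ld", &pid) == 1);
    fclose(f);
    if (!ok || pid <= 0)
        return 1;                        /* unreadable contents: treat as stale */

    if (kill((pid_t)pid, 0) == -1 && errno == ESRCH)
        return 1;                        /* no process with that PID any more */

    return 0;                            /* some process still has that PID */
}

int main(void)
{
    const char *path = "/run/foobar.pid";    /* the man page's example path */
    printf("%s %s stale\n", path, pidfile_is_stale(path) ? "looks" : "does not look");
    return EXIT_SUCCESS;
}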

Related

Is it possible to accurately log what applications the user has launched through the linux kernel?

My goal is to write to a file, and timestamp the event, whenever the user launches an application such as Firefox.
The tricky part is having to do this from the kernel (or a module loaded onto the kernel).
From the research I've done so far (sources listed below), the execve system call seemed the most viable: it carries the filename of the program being executed, which seemed like gold at the time. But I quickly learned that it wasn't as useful as I thought, since this system call isn't limited to user-related operations.
So then I thought of using ps -ef as it listed all the current running processes and I would just have to filter through which ones were applications opened by the user.
But the issue with that method is that I would have to poll every X seconds, so it has the potential to miss something if the user launches and closes an application in the window between calls to ps -ef.
I've also realized that writing to a file would be a challenge as well, since you don't have access to the standard library from the kernel. So my guess for that would be making use of proc somehow to allow the user to actually access the information that I'm trying to log.
Basically I'm running out of leads and I'd greatly appreciate it if anyone could point me in the right direction.
Thanks.
Sources:
http://tldp.org/LDP/lkmpg/2.6/html/x978.html (not very recent)
https://0xax.gitbooks.io/linux-insides/content/SysCall/syscall-4.html
First, writing to or reading a regular file from the kernel is a bad idea and is not done in kernel code. There are of course VFS files, like /sys/fs or /proc, but those are a special case and are allowed.
See this article in Linux Journal,
"Driving Me Nuts - Things You Never Should Do in the Kernel" by Greg Kroach-Hrtman
http://www.linuxjournal.com/article/8110
Every new process that is created in Linux adds an entry under /proc,
as /proc/pidNum, where pidNum is the process ID of the new process.
You can find out the name of the new application which was invoked simply by
cat /proc/pidNum/cmdline.
So for example, if your crond daemon has pid 1336, then
$ cat /proc/1336/cmdline
will give
cron
And there are ways in Linux to monitor when entries are added under a directory.
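For illustration, here is a small user-space C program that does the same thing as the cat command above (it does not run in the kernel, and the PID is passed on the command line):
/* Sketch: print the command line of a given PID by reading
   /proc/<pid>/cmdline (arguments are separated by NUL bytes). */
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    if (argc != 2) {
        fprintf(stderr, "usage: %s <pid>\n", argv[0]);
        return EXIT_FAILURE;
    }
    char path[64];
    snprintf(path, sizeof path, "/proc/%s/cmdline", argv[1]);

    FILE *f = fopen(path, "r");
    if (!f) { perror(path); return EXIT_FAILURE; }

    int c;
    while ((c = fgetc(f)) != EOF)
        putchar(c == '\0' ? ' ' : c);    /* show the arguments space-separated */
    putchar('\n');
    fclose(f);
    return EXIT_SUCCESS;
}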

Determine application executable artifact scope through monitoring on OpenVMS

We have a legacy COBOL application based on OpenVMS for which we do not have a clear idea of the configuration. In this context, by "configuration" I am talking about:
Which executable files comprise the application;
Which pristine source files correspond to which executable files.
It may seem odd that 1 above is not known, but over time what has happened is that executables have "come and gone" (and many still remain in use). Which executable files constitute the application as it exists today is not known, since knowledge of which executables are no longer required has been lost in time. In practical terms, the team faithfully compiles all source code files and deploys the resultant executables, despite the fact that there are obviously programs that are no longer used.
It goes without saying that there is no formal configuration management process and the source code is not kept in a version control system. Since the application runs on OpenVMS, the corresponding Files-11-based file system keeps older versions of files (including source files) and this has long been the excuse for not putting the application source into a version control system (despite the reasons for using a VCS extending far beyond merely having a record of previous versions).
There are a number of ways in which the configuration can be determined, of course, but I'd like to start with a first "small step", that is: determine the set of executables that comprise the application. At this point I should mention that the executable components of the application are not limited to OpenVMS images, but also DCL command files. I would like to:
Log all invocations of images that reside in a certain directory or set of directories;
Log all invocations of command files that reside in a certain directory or set of directories.
If we run this logging on our production system over an extended period of time, say two months, we can get a pretty good idea of what the application comprises. Together with user consultation, we'll be able to confirm the need for the executable files that aren't being called.
I think I have an idea of how to do 1 above (use SET/AUDIT), although I'm not sure of the specifics. The second part, at this stage, I have no idea how to do.
So, the main criterion for this effort is that as little of the existing system as possible be affected in order to gain the above information. Due to the question mark around the configuration (and the complete lack of automated tests), changing anything is a nerve-wracking undertaking.
Using operating-system-level services like SET/AUDIT would allow one to get to know what's being run without the need to change source and/or recompile anything. So, my question is a multi-parter:
Is this the optimal way to do this on OpenVMS?
What would I need to do to restrict SET/AUDIT to only monitor images in a particular directory?
How would I log command file invocation without changing the .COM source files?
What should I expect in terms of performance degradation as a result of logging such information?
Regarding 2. and 3.:
I would try security auditing with ACLs. From a privileged account, something like ...
Make sure ACL auditing is enabled:
$ show audit
should show
System security audits currently enabled for:
...
ACL
...
If it doesn't, enable it with
$ set audit/audit/enable=acl
and then you may want to disable it when you are done with
$ set audit/audit/disable=acl
Set audit ACLs on all the wanted files:
$ set sec/acl=(audit=security,access=success+execute) [.app]*.com
$ set sec/acl=(audit=security,access=success+execute) [.app]*.exe
and you may want to delete the ACLs when you are done with
$ set security/acl=(audit=security,access=success+execute)/delete [.app]*.com
$ set security/acl=(audit=security,access=success+execute)/delete [.app]*.exe
You can check what ACLs are set with:
$ show security [.app]*.*
Run your application ...
Get the results from the audit file
$ analyze/audit [vms$common.sysmgr]security.audit$journal/sel=access=execute/full/since=17:00/out=app.log
Check your report for your files:
$ pipe type app.log |search sys$pipe "File name", ,"Access requested"
File name: _EMUVAX$DUA0:[USER.APP]NOW.COM;1
Access requested: READ,EXECUTE
Auditable event: Object access
File name: _EMUVAX$DUA0:[USER.APP]ECHO.EXE;1
Access requested: READ,EXECUTE
$
Sorry, I have no answer for 1. and 4.
It would help to know the OpenVMS version (e.g. 6.2, 7.3-2, 8.4...) and the architecture (VAX, Alpha, Itanium).
Recent OpenVMS versions have great sda extensions
http://h71000.www7.hp.com/doc/84final/6549/6549pro_ext1.html
or
http://de.openvms.org/Spring2009/05-SDA_EXTENSIONS.pdf
such as LNM to check the logical names used by a process, PCS for PC sampling of a process, FLT to check the faulting behavior of applications, RMS for RMS data structures, PERF (Itanium only) for performance tracing, and PROCIO for the reads and writes of all files opened by a process.
Post a
dir sys$share:*sda.exe
so that we know which Sda extensions are available for you.
You can always check what a process with a pid of 204002B4 does with
$ ana/sys
set proc/id=204002b4
sh process /channel
exam #pc
and repeat while the process moves on.

How to check if another instance of the app/binary is already running

I'm writing a command line application for the Mac in Objective-C.
At the start of the application, I want to check whether another instance of the same application is already running. If it is, then I should either wait for it to finish, or exit the current instance, or quit the other instance, etc.
Is there any way of doing this?
The standard Unix solution for this is to create a "run file". When you start up, you try to create that file and write your pid to it if it doesn't exist; if it does exist, read the pid out of it, and if there's a running program with that pid and your process name, wait/exit/whatever.
The question is, where do you put that file, and what do you call it?
Well, first, you have to decide what exactly "already running" means. Obviously not "anywhere in the world", but it could be anything from "anywhere on the current machine" to "in the current desktop session". (For example, if User A starts your program, pauses it, then User B comes along and takes over the computer via Fast User Switching, should she be able to run the program, or not?)
For pretty much any reasonable answer to that question, there's an obvious pathname pattern. For example, on a Mac, /tmp is shared system-wide, while $TMPDIR is specific to a given session, so, e.g., /tmp/${ARGV[0]}.pid is a good way to say "only one copy on the machine, period", while ${TMPDIR}/${ARGV[0]}.pid is a good way to say "only one copy per session".
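A small C sketch of building those two paths from the program name (purely illustrative; it assumes the BSD/macOS convention that $TMPDIR points at a per-session temporary directory):
/* Sketch: derive a "one per machine" and a "one per session" PID-file
   path from the program name, following the pattern described above. */
#include <libgen.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    (void)argc;
    const char *name = basename(argv[0]);    /* roughly ARGV[0] without the path */

    char machine_wide[512];
    snprintf(machine_wide, sizeof machine_wide, "/tmp/%s.pid", name);

    const char *tmpdir = getenv("TMPDIR");   /* per-session directory on macOS */
    char per_session[512];
    snprintf(per_session, sizeof per_session, "%s/%s.pid",
             tmpdir ? tmpdir : "/tmp", name);

    printf("machine-wide: %s\nper-session:  %s\n", machine_wide, per_session);
    return EXIT_SUCCESS;
}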
A simple but common way to do this is to check the process list for the name of your executable:
ps -A | grep <your executable name>
Thank you @abarnert.
This is how I have presently implemented it. At the start of main(), I check whether a file named .lock exists in the binary's own directory (I am considering moving it to /tmp). If it does, the application exits.
If not, it creates the file.
At the end of the application, the .lock file is removed.
I haven't yet written the pid to that file, but I will when exiting the previous instance is required (as of yet I don't need it, but may in the future).
I think PID can be retrieved using
int myPID=[[NSProcessInfo processInfo] processIdentifier];
The program will be invoked by a custom scheduler which is running as a root daemon. So it would be run as root.
Seeing the answers, I would assume that there is no direct method of solving the problem.

Platform independent file locking?

I'm running a very computationally intensive scientific job that spits out results every now and then. The job is basically to just simulate the same thing a whole bunch of times, so it's divided among several computers, which use different OSes. I'd like to direct the output from all these instances to the same file, since all the computers can see the same filesystem via NFS/Samba. Here are the constraints:
Must allow safe concurrent appends. Must block if some other instance on another computer is currently appending to the file.
Performance does not count. I/O for each instance is only a few bytes per minute.
Simplicity does count. The whole point of this (besides pure curiosity) is so I can stop having every instance write to a different file and manually merging these files together.
Must not depend on the details of the filesystem. Must work with an unknown filesystem on an NFS or Samba mount.
The language I'm using is D, in case that matters. I've looked, there's nothing in the standard lib that seems to do this. Both D-specific and general, language-agnostic answers are fully acceptable and appreciated.
Over NFS you face some problems with client-side caching and stale data. I have written an OS-independent lock module to work over NFS before. The simple idea of creating a [datafile].lock file does not work well over NFS. The basic idea to work around it is to create a lock file [datafile].lock which, if present, means the file is NOT locked; a process that wants to acquire the lock renames the file to a different name like [datafile].lock.[hostname].[pid]. The rename is an atomic enough operation that it works well enough over NFS to guarantee exclusivity of the lock. The rest is basically a bunch of fail-safes, loops, error checking, and lock recovery in case the process dies before releasing the lock by renaming the lock file back to [datafile].lock.
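A minimal C sketch of that rename-based scheme (the results.dat.lock name and the retry policy are illustrative; a real implementation needs the error checking and stale-lock recovery described above):
/* Sketch of the rename-based scheme: the presence of results.dat.lock
   means "unlocked"; a process acquires the lock by renaming the file to a
   name containing its hostname and PID, and releases it by renaming it
   back. */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#define LOCKFILE "results.dat.lock"          /* hypothetical lock file name */

int main(void)
{
    char host[256];
    gethostname(host, sizeof host);
    host[sizeof host - 1] = '\0';

    char mine[512];
    snprintf(mine, sizeof mine, LOCKFILE ".%s.%ld", host, (long)getpid());

    /* Acquire: keep retrying until the rename succeeds. */
    while (rename(LOCKFILE, mine) == -1)
        sleep(1);                            /* someone else holds the lock */

    /* ... append to the data file while holding the lock ... */

    /* Release: put the lock file back under its well-known name. */
    if (rename(mine, LOCKFILE) == -1)
        perror("rename (release)");
    return EXIT_SUCCESS;
}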
The classic solution is to use a lock file, or more accurately a lock directory. On all common OSs creating a directory is an atomic operation so the routine is:
try to create a lock directory with a fixed name in a fixed location
if the create failed, wait a second or so and try again - repeat until success
write your data to the real data file
delete the lock directory
This has been used by applications such as CVS for many years across many platforms. The only problem occurs in the rare cases when your app crashes while writing and before removing the lock.
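A minimal C sketch of that routine (the lock-directory name is illustrative, and a real implementation would add a stale-lock timeout for the crash case just mentioned):
/* Sketch of the lock-directory routine: mkdir() either creates the
   directory atomically or fails because it already exists. */
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

#define LOCKDIR "results.lock.d"             /* hypothetical fixed lock name */

int main(void)
{
    /* Try to create the lock directory; wait while someone else holds it. */
    while (mkdir(LOCKDIR, 0755) == -1) {
        if (errno != EEXIST) { perror("mkdir"); return EXIT_FAILURE; }
        sleep(1);
    }

    /* ... write to the real data file here ... */

    if (rmdir(LOCKDIR) == -1)                /* release the lock */
        perror("rmdir");
    return EXIT_SUCCESS;
}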
Why not just build a simple server which sits between the file and the other computers?
Then if you ever wanted to change the data format, you would only have to modify the server, and not all of the clients.
In my opinion building a server would be much easier than trying to use a Network file system.
Lock File with a twist
Like other answers have mentioned, the easiest method is to create a lock file in the same directory as the datafile.
Since you want to be able to access the same file from multiple PCs, the best solution I can think of is to include the identifier of the machine currently writing to the data file.
So the sequence for writing to the data file would be:
Check if there is a lock file present
If there is a lock file, see if I'm the one owning it by checking that its content has my identifier.
If that's the case, just write to the data file then delete the lock file.
If that's not the case, just wait a second or a small random length of time and try the whole cycle again.
If there is no lock file, create one with my identifier and try the whole cycle again to avoid race condition (re-check that the lock file is really mine).
Along with the identifier, I would record a timestamp in the lock file and check whether it's older than a given timeout value.
If the timestamp is too old, then assume that the lock file is stale and just delete it, as it would mean one of the PCs writing to the data file may have crashed or its connection may have been lost.
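A compact C sketch of this identifier-plus-timestamp variant (the file name and timeout are illustrative, and as discussed below the scheme is still not reliable over NFS client-side caching):
/* Sketch of the identifier + timestamp variant: create the lock file
   exclusively, record who holds it and when, and treat an existing lock
   as stale once it is older than a timeout. */
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <time.h>
#include <unistd.h>

#define LOCKFILE   "results.dat.lock"        /* hypothetical lock file name */
#define STALE_SECS 300                       /* illustrative timeout */

int main(void)
{
    for (;;) {
        int fd = open(LOCKFILE, O_WRONLY | O_CREAT | O_EXCL, 0644);
        if (fd != -1) {                      /* we created it: lock acquired */
            char host[256];
            gethostname(host, sizeof host);
            host[sizeof host - 1] = '\0';
            char line[512];
            int len = snprintf(line, sizeof line, "%s %ld %ld\n",
                               host, (long)getpid(), (long)time(NULL));
            if (write(fd, line, len) != len)
                perror("write");
            close(fd);
            break;
        }
        if (errno != EEXIST) { perror("open"); return EXIT_FAILURE; }

        struct stat st;
        if (stat(LOCKFILE, &st) == 0 && time(NULL) - st.st_mtime > STALE_SECS)
            unlink(LOCKFILE);                /* stale lock: drop it and retry */
        else
            sleep(1);                        /* held by someone else: wait */
    }

    /* ... write to the data file while holding the lock ... */

    unlink(LOCKFILE);                        /* release */
    return EXIT_SUCCESS;
}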
Another solution
If you are in control of the format of the data file, another option could be to reserve a structure at the beginning of the file to record whether it is locked or not.
If you just reserve a byte for this purpose, you could assume, for instance, that 00 would mean the data file isn't locked, and that other values would represent the identifier of the machine currently writing to it.
Issues with NFS
OK, I'm adding a few things because Jiri Klouda correctly pointed out that NFS uses client-side caching that will result in the actual lock file being in an undetermined state.
A few ways to solve this issue:
mount the NFS directory with the noac or sync options. This is easy, but it doesn't completely guarantee data consistency between client and server, so there may still be issues, although in your case it may be OK.
Open the lock file or data file using the O_DIRECT, O_SYNC, or O_DSYNC flags. This is supposed to disable caching altogether.
This will lower performance but will ensure consistency.
You may be able to use flock() to lock the data file, but its implementation is spotty and you will need to check whether your particular OS actually uses the NFS locking service; it may do nothing at all otherwise. (A minimal sketch appears a few lines further down.)
If the data file is locked, then another client opening it for writing will fail.
Oh yeah, and it doesn't seem to work on SMB shares, so it's probably best to just forget about it.
Don't use NFS and just use Samba instead: there is a good article on the subject and why NFS is probably not the best answer to your usage scenario.
You will also find in this article various methods for locking files.
Jiri's solution is also a good one.
Basically, if you want to keep things simple, don't use NFS for frequently-updated files that are shared amongst multiple machines.
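Referring back to the flock() option above, here is a minimal C sketch (the file name is illustrative, and over NFS or SMB the call may silently do nothing depending on the OS and mount options, so treat it as best-effort):
/* Sketch: advisory whole-file locking with flock(). */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/file.h>
#include <unistd.h>

int main(void)
{
    int fd = open("results.dat", O_WRONLY | O_APPEND | O_CREAT, 0644);
    if (fd == -1) { perror("open"); return EXIT_FAILURE; }

    if (flock(fd, LOCK_EX) == -1) {          /* block until we get the lock */
        perror("flock");
        return EXIT_FAILURE;
    }

    const char line[] = "result: 42\n";      /* placeholder output */
    if (write(fd, line, sizeof line - 1) == -1)
        perror("write");

    flock(fd, LOCK_UN);                      /* release (close() also releases it) */
    close(fd);
    return EXIT_SUCCESS;
}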
Something different
Use a small database server to save your data into and bypass the NFS/SMB locking issues altogether, or keep your current multiple-data-file system and just write a small utility to concatenate the results.
It may still be the safest and simplest solution to your problem.
I don't know D, but I think using a mutex file to do the job might work. Here's some pseudo-code you might find useful:
do {
// Try to create a new file to use as mutex.
// If it's already created, it will throw some kind of error.
mutex = create_file_for_writing('lock_file');
} while (mutex == null);
// Open your log file and write results
log_file = open_file_for_appending('the_log_file');
write(log_file, data);
close_file(log_file);
close_file(mutex);
// Free mutex and allow other processes to create the same file.
delete_file(mutex);
So, all processes will try to create the mutex file but only the one who wins will be able to continue. Once you write your output, close and delete the mutex so other processes can do the same.

How to reliably handle files uploaded periodically by an external agent?

It's a very common scenario: some process wants to drop a file on a server every 30 minutes or so. Simple, right? Well, I can think of a bunch of ways this could go wrong.
For instance, processing a file may take more or less than 30 minutes, so it's possible for a new file to arrive before I'm done with the previous one. I don't want the source system to overwrite a file that I'm still processing.
On the other hand, the files are large, so it takes a few minutes to finish uploading them. I don't want to start processing a partial file. The files are just transferred with FTP or sftp (my preference), so OS-level locking isn't an option.
Finally, I do need to keep the files around for a while, in case I need to manually inspect one of them (for debugging) or reprocess one.
I've seen a lot of ad-hoc approaches to shuffling upload files around, swapping filenames, using datestamps, touching "indicator" files to assist in synchronization, and so on. What I haven't seen yet is a comprehensive "algorithm" for processing files that addresses concurrency, consistency, and completeness.
So, I'd like to tap into the wisdom of crowds here. Has anyone seen a really bulletproof way to juggle batch data files so they're never processed too early, never overwritten before done, and safely kept after processing?
The key is to do the initial juggling at the sending end. All the sender needs to do is:
Store the file with a unique filename.
As soon as the file has been sent, move it to a subdirectory called e.g. completed.
Assuming there is only a single receiver process, all the receiver needs to do is:
Periodically scan the completed directory for any files.
As soon as a file appears in completed, move it to a subdirectory called e.g. processed, and start working on it from there.
Optionally delete it when finished.
On any sane filesystem, file moves are atomic provided they occur within the same filesystem/volume. So there are no race conditions.
Multiple Receivers
If processing could take longer than the period between files being delivered, you'll build up a backlog unless you have multiple receiver processes. So, how to handle the multiple-receiver case?
Simple: Each receiver process operates exactly as before. The key is that we attempt to move a file to processed before working on it: that, and the fact the same-filesystem file moves are atomic, means that even if multiple receivers see the same file in completed and try to move it, only one will succeed. All you need to do is make sure you check the return value of rename(), or whatever OS call you use to perform the move, and only proceed with processing if it succeeded. If the move failed, some other receiver got there first, so just go back and scan the completed directory again.
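A small C sketch of that receiver loop (directory names are illustrative; a real receiver would run this periodically or be driven by a notification mechanism):
/* Sketch of the receiver: scan "completed", try to claim each file by
   renaming it into "processed", and only process it if the rename
   succeeded. */
#include <dirent.h>
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    DIR *d = opendir("completed");
    if (!d) { perror("opendir"); return EXIT_FAILURE; }

    struct dirent *e;
    while ((e = readdir(d)) != NULL) {
        if (e->d_name[0] == '.')
            continue;                        /* skip "." and ".." */

        char src[4096], dst[4096];
        snprintf(src, sizeof src, "completed/%s", e->d_name);
        snprintf(dst, sizeof dst, "processed/%s", e->d_name);

        if (rename(src, dst) == 0) {
            /* We won the race: this receiver now owns the file. */
            printf("processing %s\n", dst);
            /* ... do the real work on dst here ... */
        }
        /* If rename() failed, another receiver claimed it first; move on. */
    }
    closedir(d);
    return EXIT_SUCCESS;
}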
If the OS supports it, use file system hooks to intercept open and close file operations. Something like Dazuko. Other operating systems may let you know about file operations in another way; for example, Novell Open Enterprise Server lets you define epochs and read the list of files modified during an epoch.
Just realized that in Linux you can use the inotify subsystem, or the utilities from the inotify-tools package.
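For the Linux case, here is a minimal C sketch using inotify to watch the completed directory (the directory name is illustrative; IN_MOVED_TO fires when a file is renamed into the watched directory, matching the atomic-move delivery described above):
/* Sketch: use inotify to learn when a file lands in "completed". */
#include <stdio.h>
#include <stdlib.h>
#include <sys/inotify.h>
#include <unistd.h>

int main(void)
{
    int fd = inotify_init1(0);
    if (fd == -1) { perror("inotify_init1"); return EXIT_FAILURE; }

    if (inotify_add_watch(fd, "completed", IN_MOVED_TO | IN_CLOSE_WRITE) == -1) {
        perror("inotify_add_watch");
        return EXIT_FAILURE;
    }

    char buf[4096] __attribute__((aligned(__alignof__(struct inotify_event))));
    for (;;) {
        ssize_t n = read(fd, buf, sizeof buf);   /* blocks until events arrive */
        if (n <= 0)
            break;
        for (char *p = buf; p < buf + n; ) {
            const struct inotify_event *ev = (const struct inotify_event *)p;
            if (ev->len > 0)
                printf("new file ready: %s\n", ev->name);
            p += sizeof *ev + ev->len;
        }
    }
    close(fd);
    return EXIT_SUCCESS;
}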
File transfer is one of the classics of system integration. I'd recommend getting the Enterprise Integration Patterns book to build your own answer to these questions -- to some extent, the answer depends on the technologies and platforms you are using for endpoint implementation and for file transfer. It's a quite comprehensive collection of workable patterns, and fairly well written.