How to avoid race condition when checking if file exists and then creating it?


I'm thinking about corner cases in my code, and I can't figure out how to avoid the problem that arises when you check whether a file exists and, if it does not, create a file with that filename. The code looks approximately like this:
struct stat st;
// 1
status = stat(filename, &st);
if (status != 0) { // stat() failed, so assume the file does not exist
    // 2
    create_file(filename);
}
Between steps 1 and 2 another process could create the file. How can I avoid this problem, and is there a general solution to this type of problem? It comes up often in systems programming.

This is what the O_EXCL | O_CREAT flags to open() were designed for:
If O_CREAT and O_EXCL are set, open() shall fail if the file exists. The check for the existence of the file and the creation of the file if it does not exist shall be atomic with respect to other threads executing open() naming the same filename in the same directory with O_EXCL and O_CREAT set. If O_EXCL and O_CREAT are set, and path names a symbolic link, open() shall fail and set errno to [EEXIST], regardless of the contents of the symbolic link. If O_EXCL is set and O_CREAT is not set, the result is undefined.
So:
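/* Needs <fcntl.h>, <errno.h>, <stdio.h>, <string.h>, <stdlib.h>. */
fd = open(FILENAME, O_EXCL | O_CREAT | O_RDWR, 0644); /* O_CREAT requires a mode argument */
if (fd < 0) { /* file exists or there were problems like permissions */
    fprintf(stderr, "open() failed: \"%s\"\n", strerror(errno));
    abort();
}
/* file was newly created */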
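For comparison, here is a minimal sketch of the same check-and-create step in Python, whose os.open() passes these flags straight through to open(). The helper name create_exclusively and the 0o644 mode are assumptions made for illustration, not part of the answer above.

```python
import os


def create_exclusively(path, mode=0o644):
    """Atomically create `path`.

    Returns an open file descriptor if this call created the file,
    or None if the file already existed. Other errors (permissions,
    missing parent directory, ...) propagate as OSError.
    """
    try:
        # O_CREAT | O_EXCL makes "check and create" a single atomic step.
        return os.open(path, os.O_CREAT | os.O_EXCL | os.O_RDWR, mode)
    except FileExistsError:
        return None
```

A caller that gets None knows someone else created the file first and can decide what to do without ever having called stat().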

You're supposed to attempt the creation anyway and tell the OS, as part of that same call, whether a new file should be created if it doesn't already exist. You shouldn't perform a separate check beforehand.

Related

How to add copy to filename if already exists?

If a file already exists (say /hello.txt) and you run the command:
[data writeToFile:@"/hello.txt" atomically:YES];
is there a way, instead of overwriting, to create the file hello copy.txt and then hello copy 2.txt, as the Finder does?

First, you need to use a data-writing method that will refuse to overwrite an existing file. You can use -[NSData writeToFile:options:error:] with the option NSDataWritingWithoutOverwriting. Check its return value to see if it failed and then check the returned NSError to see if the reason it failed was an existing file. If it is, build a new path string based on the original and the number of tries you've made, adding either " copy" or " copy %u", and loop around to try again. Stop looping if you succeed or you get any other error. (You might also put a limit on the maximum number of tries, in case something unforeseen happens.)
The NSError indicates a failure because a file already exists at that path if its domain is NSCocoaErrorDomain and its code is NSFileWriteFileExistsError.
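The retry loop described above can be sketched as follows. This is a Python illustration rather than Objective-C, with the exclusive-create mode "x" of open() standing in for NSDataWritingWithoutOverwriting. The function name write_without_overwrite and the default limit of 100 tries are assumptions.

```python
import os


def write_without_overwrite(path, data, max_tries=100):
    """Write `data` to `path`, or to "<name> copy<ext>", "<name> copy 2<ext>", ...

    Returns the path that was actually written. Any error other than
    "file exists" propagates immediately, which ends the loop.
    """
    root, ext = os.path.splitext(path)
    for attempt in range(max_tries):
        if attempt == 0:
            candidate = path
        elif attempt == 1:
            candidate = f"{root} copy{ext}"
        else:
            candidate = f"{root} copy {attempt}{ext}"
        try:
            # Mode "x" refuses to open a file that already exists.
            with open(candidate, "xb") as handle:
                handle.write(data)
            return candidate
        except FileExistsError:
            continue  # name taken, try the next one
    raise FileExistsError(f"no free name after {max_tries} tries: {path}")
```

Because each attempt is itself an exclusive create, two processes running this loop at the same time end up with different names instead of overwriting each other.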

How to fetch data from a file location and run it using Tcl code

In Tcl, how do I get the data from a file at some location and run that data using Tcl code?
For example:
In folder 1 there is a config file. I want to read the information in the config file and execute it, whether or not particular information is present.

If the configuration file contains Tcl code, it's just:
# Put the filename in quotes if you want, or in a variable, or ...
source /the/path/to/the/file.tcl
If the file contains Tcl code but you don't trust it, you can use a “safe interpreter” context. This disables many commands, giving a much more restricted set of capabilities that you can then add specific exceptions to (with interp alias):
# Make the context
set i [interp create -safe]
# Set up a way for the context to let the master find out about what to
# really set
interp alias $i configure {} recordConfiguration
proc recordConfiguration args {
    puts "configured with $args"
}
# Evaluate the script (note that [source] is hidden by default) in the context
$i invokehidden source /the/path/to/the/file.tcl
# Dispose the context
interp delete $i
If the file isn't Tcl code, you have to parse it. That's a substantially more complex matter, so much so that we'll need to know the format of the file before we can answer.

If you are trying to read data (like text strings) from a file then you'll have to open a channel for that particular file like this:
set fileid [open "path/to/your/file.txt" r]
See the manual page for open.
Then you can use the gets command to read data from the file through the channel fileid.

Is there something I'm not understanding about the fileExistsAtPath:isDirectory: method?

I do not understand how this method works. Here is the code
BOOL isDir = NO;
BOOL returnVal;
NSString *path = @"/Users/me/Desktop/kkk";
returnVal = [[NSFileManager defaultManager] fileExistsAtPath:path isDirectory:&isDir];
And here are the results if:
1) kkk is a file
returnVal = NO
isDir = NO
2) kkk is an empty directory
returnVal = YES
isDir = YES
Scenario #2 seems to work as expected, but according to the documentation:
path
The path of a file or directory. If path begins with a tilde (~), it must first be expanded with stringByExpandingTildeInPath, or this method will return NO.
isDirectory
Upon return, contains YES if path is a directory or if the final path element is a symbolic link that points to a directory, otherwise contains NO. If path doesn’t exist, this value is undefined upon return. Pass NULL if you do not need this information.
So for scenario #1, shouldn't the result be the following?
returnVal = YES
isDir = NO
1) Edit
In reply to the comments below:
But the files do exist. I created the file manually to test it; it's only a program with 4 lines of code. I have both the file and the folder on the desktop. First I put a file there called "kkk" (with no extension), then I removed the file and placed a folder there called "kkk". It works for the folder, but not for the file. Interestingly, if the file has an extension, it works. So is there something wrong with a file with no extension? (Are you still not able to reproduce it with no extension?)
2) Edit
Thanks for helping me solve this. I have my Mac set to display file extensions, but it seems the Mac has an odd behavior. I selected the file "kk.plist" and renamed it to "kk", as you see in the image. As soon as I did this, Mac OS X automatically selected the hide-extension option. So when I thought the file was "kk", it was still "kk.plist" with its extension hidden. As you can see, both files have the same extension; one is hidden, the other is not. I didn't realize a hidden extension can be applied to a single file. Thanks.

1) kkk is a file
returnVal = NO
isDir = NO
⋮
So for scenario #1, shouldn't the result be the following?
returnVal = YES
isDir = NO
Yes. But be wary of the Finder hiding things from you when you're trying to verify results from this method.
As you found, one example is hiding extensions: you gave a path with no extension and were surprised when it didn't find a file that you thought had no extension; in truth, it still had an extension, which the Finder had hidden, so it still did not match the path, so the result you got was correct.
The other example is hidden (a.k.a. invisible) items. You may get a result of YES for a file that you can't find in the Finder. The Go command will temporarily reveal an invisible directory, but won't help you for files.
Whenever the results of fileExistsAtPath:isDirectory: surprise you, and the Finder appears to show that the results are wrong, try to ls the path in the Terminal:
ls -dl /path/to/item
If that command prints a description of the item, then it exists. If it prints an error, then it doesn't. You can also tell from the output whether the item is a directory or not.
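The hidden-extension pitfall can also be reproduced without the Finder. The following Python sketch is a rough analogue of the existence check, not the Cocoa API itself. The helper names exists_and_is_dir and demo are hypothetical, and the file names mirror the ones from the question.

```python
import os
import tempfile


def exists_and_is_dir(path):
    """Rough analogue of fileExistsAtPath:isDirectory: -> (exists, is_dir)."""
    return os.path.exists(path), os.path.isdir(path)


def demo():
    """Create "kk.plist" and a folder "kkk", then probe three paths."""
    with tempfile.TemporaryDirectory() as tmp:
        open(os.path.join(tmp, "kk.plist"), "w").close()
        os.mkdir(os.path.join(tmp, "kkk"))
        return {
            # The name as displayed with the extension hidden: no such file.
            "kk": exists_and_is_dir(os.path.join(tmp, "kk")),
            # The real on-disk name.
            "kk.plist": exists_and_is_dir(os.path.join(tmp, "kk.plist")),
            "kkk": exists_and_is_dir(os.path.join(tmp, "kkk")),
        }
```

The lookup only ever sees the on-disk name, so probing "kk" reports that nothing exists even though a file displayed as "kk" is visible in a file browser.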

stat vs mkdir with EEXIST

I need to create a folder if it does not exist, so I use:
bool mkdir_if_not_exist(const char *dir)
{
    bool ret = false;
    if (dir) {
        // first check if folder exists
        struct stat folder_info;
        if (stat(dir, &folder_info) != 0) {
            if (errno == ENOENT) { // create folder
                if (mkdir(dir, S_IRWXU | S_IXGRP | S_IRGRP | S_IROTH | S_IXOTH) != 0) // 755
                    perror("mkdir");
                else
                    ret = true;
            } else
                perror("stat");
        } else
            ret = true; // dir exists
    }
    return ret;
}
The folder is created only during the first run of the program; after that it is just a check.
It has been suggested that I skip the stat call, call mkdir directly, and check errno against EEXIST.
Does that give any real benefit?

More important, with the stat + mkdir approach, there is a race condition: in between the stat and the mkdir another program could do the mkdir, so your mkdir could still fail with EEXIST.

There's a slight benefit. Look up 'LBYL vs EAFP' or 'Look Before You Leap' vs 'Easier to Ask Forgiveness than Permission'.
The slight benefit is that the stat() system call has to parse the directory name and get to the inode - or the missing inode in this case - and then mkdir() has to do the same. Granted, the data needed by mkdir() is already in the kernel buffer pool, but it still involves two traversals of the path specified instead of one. So, in this case, it is slightly more efficient to use EAFP than to use LBYL as you do.
However, whether that is really a measurable effect in the average program is highly debatable. If you are doing nothing but create directories all over the place, then you might detect a benefit. But it is definitely a small effect, essentially unmeasurable, if you create a single directory at the start of a program.
You might need to deal with the case where strcmp(dir, "/some/where/or/another") == 0 but, although "/some/where" exists, neither "/some/where/or" nor (of necessity) "/some/where/or/another" exists. Your current code does not handle missing directories in the middle of the path; it just reports the ENOENT that mkdir() returns. The "look" part of your code does not check that dir actually is a directory, either: it just assumes that if it exists, it is a directory. Handling these variations properly is trickier.

Similar to Race condition with stat and mkdir in sequence, your solution is incorrect not only due to the race condition (as already pointed out by the other answers over here), but also because you never check whether the existing file is a directory or not.
When re-implementing functionality that's already widely available in existing command-line tools in UNIX, it always helps to see how it was implemented in those tools in the first place.
For example, take a look at how mkdir(1) -p option is implemented across the BSDs (bin/mkdir/mkdir.c#mkpath in OpenBSD and NetBSD), all of which, on mkdir(2)'s error, appear to immediately call stat(2) to subsequently run the S_ISDIR macro to ensure that the existing file is a directory, and not just any other type of a file.
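The mkdir-first approach described in these answers (try mkdir, and only on "already exists" check that the existing entry is a directory) can be sketched as follows. This is a Python illustration, not the BSD mkpath code. It reuses the name mkdir_if_not_exist from the question, and the decision to let every other error propagate is an assumption.

```python
import os
import stat


def mkdir_if_not_exist(path, mode=0o755):
    """Create `path` without checking first (EAFP).

    Returns True if `path` is a directory afterwards, False if the name
    is taken by something that is not a directory. Other failures, such
    as a missing parent directory, propagate as OSError.
    """
    try:
        os.mkdir(path, mode)
        return True
    except FileExistsError:
        # Someone (possibly another process) got there first:
        # verify that what exists is really a directory.
        return stat.S_ISDIR(os.stat(path).st_mode)
```

There is no window between a check and the creation here, because the only check happens after mkdir has already reported the outcome. For Python specifically, os.makedirs(path, exist_ok=True) also creates missing intermediate directories.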

Why does a read-only open of a named pipe block?

I've noticed a couple of oddities when dealing with named pipes (FIFOs) under various flavors of UNIX (Linux, FreeBSD and MacOS X) using Python. The first, and perhaps most annoying is that attempts to open an empty/idle FIFO read-only will block (unless I use os.O_NONBLOCK with the lower level os.open() call). However, if I open it for read/write then I get no blocking.
Examples:
f = open('./myfifo', 'r') # Blocks unless data is already in the pipe
f = os.open('./myfifo', os.O_RDONLY) # ditto
# Contrast to:
f = open('./myfifo', 'w+') # does NOT block
f = os.open('./myfifo', os.O_RDWR) # ditto
f = os.open('./myfifo', os.O_RDONLY|os.O_NONBLOCK) # ditto
Note: The behavior is NOT Python specific. The examples are in Python to make them easier to replicate and understand for a broader audience.
I'm just curious why. Why does the open call block rather than some subsequent read operation?
Also, I've noticed that a non-blocking file descriptor can exhibit two different behaviors in Python. In the case where I use os.open() with os.O_NONBLOCK for the initial opening operation, an os.read() seems to return an empty string if data is not ready on the file descriptor. However, if I use fcntl.fcntl(f.fileno(), fcntl.F_SETFL, fcntl.F_GETFL | os.O_NONBLOCK), then an os.read() raises an exception (errno.EWOULDBLOCK).
Is there some other flag being set by the normal open() that's not set by my os.open() example? How are they different and why?

That's just the way it's defined. From the Open Group page for the open() function:
O_NONBLOCK
When opening a FIFO with O_RDONLY or O_WRONLY set:
If O_NONBLOCK is set:
An open() for reading only will return without delay. An open() for writing only will return an error if no process currently has the file open for reading.
If O_NONBLOCK is clear:
An open() for reading only will block the calling thread until a thread opens the file for writing. An open() for writing only will block the calling thread until a thread opens the file for reading.
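The two O_NONBLOCK cases from the quoted text can be observed with a short script. This is a minimal sketch for POSIX systems (os.mkfifo is not available on Windows). The function name probe_fifo_open and the temporary FIFO path are assumptions made for the example.

```python
import os
import tempfile


def probe_fifo_open():
    """Open a fresh FIFO non-blockingly, first write-only, then read-only.

    Returns (write_errno, read_open_ok): the errno of the failed
    write-only open (None if it unexpectedly succeeded) and whether
    the read-only open returned.
    """
    with tempfile.TemporaryDirectory() as tmp:
        fifo = os.path.join(tmp, "myfifo")
        os.mkfifo(fifo)

        # Write-only + O_NONBLOCK with no reader: open() returns an error.
        try:
            wfd = os.open(fifo, os.O_WRONLY | os.O_NONBLOCK)
        except OSError as exc:
            write_errno = exc.errno
        else:
            os.close(wfd)
            write_errno = None

        # Read-only + O_NONBLOCK: open() returns without delay,
        # even though nobody has the FIFO open for writing.
        rfd = os.open(fifo, os.O_RDONLY | os.O_NONBLOCK)
        os.close(rfd)
        return write_errno, True
```

On a POSIX system the write-only attempt is expected to fail with ENXIO, while the read-only open returns at once. Dropping os.O_NONBLOCK from either call would make the script hang until another process opens the other end.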
