flock() vs. fcntl() semantics in glibc - locking

It's stated that flock() (BSD locks) and fcntl() (POSIX record-level locks) give the user incompatible semantics, particularly with regard to lock release.
However, in glibc flock() is implemented in terms of POSIX fcntl(). (I checked this in the official git repo; the link here is just a browsable copy.)
https://code.woboq.org/userspace/glibc/sysdeps/posix/flock.c.html#18
/* This file implements the `flock' function in terms of the POSIX.1 `fcntl'
locking mechanism. In 4BSD, these are two incompatible locking mechanisms,
perhaps with different semantics? */
How can these facts hold together?

On Linux, flock is a system call. flock locks and fcntl locks are independent and do not interfere with each other (on local file systems at least).
The glibc source file sysdeps/posix/flock.c is not actually used on Linux. The real implementation is the system call wrapper generated from this line in sysdeps/unix/sysv/linux/syscalls.list:
flock - flock i:ii __flock flock
Open file description (OFD) locks are yet another kind of lock, and unlike flock locks they do interact with POSIX record locks. However, they behave more reasonably with multiple threads, and closing one descriptor does not release all locks held by the same process on the same underlying file (the POSIX record lock behavior that makes those locks so difficult to use in multi-threaded processes).
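The independence described in this answer can be probed from Python, whose fcntl.flock() and fcntl.lockf() wrap flock(2) and fcntl(2) record locking respectively. This is a minimal sketch, assuming Linux and a local file system; probe_lock_independence is a hypothetical helper name, not something from glibc or the Python standard library.

```python
import fcntl
import os
import tempfile


def probe_lock_independence(path):
    """Return (second_flock_blocked, child_record_lock_granted)."""
    holder = open(path, "w")
    # BSD-style lock, owned by this open file description.
    fcntl.flock(holder, fcntl.LOCK_EX)

    # A second open file description in the SAME process conflicts with it.
    other = open(path, "w")
    try:
        fcntl.flock(other, fcntl.LOCK_EX | fcntl.LOCK_NB)
        second_flock_blocked = False
    except BlockingIOError:
        second_flock_blocked = True
    finally:
        other.close()

    # A child process asks for a POSIX record lock on the whole file.
    pid = os.fork()
    if pid == 0:
        status = 1
        try:
            with open(path, "w") as f:
                fcntl.lockf(f, fcntl.LOCK_EX | fcntl.LOCK_NB)
            status = 0
        except OSError:
            status = 1
        finally:
            os._exit(status)  # leave the child without running parent cleanup

    _, wait_status = os.waitpid(pid, 0)
    holder.close()
    granted = os.WIFEXITED(wait_status) and os.WEXITSTATUS(wait_status) == 0
    return second_flock_blocked, granted
```

Under the stated assumptions both values should come back True. The second flock() is refused even within one process, which would not happen if flock() were emulated with per-process fcntl() locks, while the child's record lock is granted despite the flock() lock held by the parent.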

NOTE. This is completely wrong, see the accepted answer. Still keeping it alive since it has a few useful links
Well, this was quite dull: fcntl takes the same struct flock as an argument and distinguishes open file description locks (BSD locks in my notation above) from process-associated file locks (POSIX record-level locks in my notation above) based on the value of the l_pid field.
glibc docs on Open File Description Locks:
Open file description locks use the same struct flock as process-associated locks as an argument (see File Locks) and the macros for the command values are also declared in the header file fcntl.h. To use them, the macro _GNU_SOURCE must be defined prior to including any header file.
...
In contrast to process-associated locks, any struct flock used as an argument to open file description lock commands must have the l_pid value set to 0. Also, when returning information about an open file description lock in a F_GETLK or F_OFD_GETLK request, the l_pid field in struct flock will be set to -1 to indicate that the lock is not associated with a process.
Also, see glibc doc on process-associated file locks

Related

What does "Top level reordering" mean?

I have an Open MPI code in Fortran that compiles and runs without errors when no optimization flags are used. When I switch on the -O1 flag, there is a segmentation fault at execution time. The only optimization flag that causes this problem is -ftoplevel-reorder. Can you explain intuitively what this flag does, and what the best strategy is for spotting a bug in the code (if any)?
from https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html
-fno-toplevel-reorder
Do not reorder top-level functions, variables, and asm statements. Output them in the same order that they appear in the input file. When this option is used, unreferenced static variables are not removed. This option is intended to support existing code that relies on a particular ordering. For new code, it is better to use attributes when possible.
Enabled at level -O0. When disabled explicitly, it also implies -fno-section-anchors, which is otherwise enabled at -O0 on some targets.
You might be accessing an array out of bounds. Depending on how the variables end up laid out in memory (which is what this flag changes for top-level variables), the consequences range from unnoticeable to a fatal crash.

When a dll is loaded into memory, which part(s) can be shared between processes?

I came across this question in an interview test:
When a dll is loaded into memory, which part(s) can be shared between processes?
A. code segment
B. static variables and global variables
C. external definitions and references for linking
D. BSS segment
Can someone give me an answer and a clear explanation?
Processes are isolated programs, each running one or more threads under the OS. As a general policy, the operating system isolates the memory of each process from every other process.
Code Segment : [NOT SHARED]
BSS and Static Fields : [NOT SHARED]
The reason is simple: why would an operating system allow process A to access process B's binary? That would be a security and memory protection violation. Process A could corrupt process B's memory (if write access were given).
What about external definitions?
Here comes the interesting part: external definitions can be statically or dynamically linked.
A statically linked library means the definitions are linked at build time and the program's binary contains their machine code.
Dynamically linked means the definitions are resolved when the user asks to load the program into memory: the OS invokes the dynamic loader to resolve the program's external dependencies, using the shared object's path.
The shared object is cached by the operating system in its own page frames, and every time a program asks for this library, the OS simply maps those page frames into the process's virtual address space and performs the required dynamic linking. This way multiple processes share a single in-memory copy of the library's binary.
This saves RAM and the time needed to load the library from disk. Dynamic linking also reduces the binary size of the program.
It is also possible that the OS chooses to load the library from disk again, and thus makes two copies of the same library. This is part of the dynamic linking operation. I won't go into more depth, but if you are really interested, see https://en.wikipedia.org/wiki/Dynamic_linker or just ping me in the comments section.
But the BSS and static fields are, again, not shared. Whenever a write is performed on such a region (which is initially shared), the operating system creates a private copy of the affected pages for the writing process. This ensures that each process has its own copy of the BSS and static fields.
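The copy-on-write behavior described above can be observed with private file mappings, which use the same kind of mechanism a loader relies on for writable data. This is a minimal sketch rather than a model of a real DLL loader: it creates two copy-on-write views inside a single process, and private_mapping_demo and the file contents are made up for illustration.

```python
import mmap
import os
import tempfile


def private_mapping_demo():
    """Return (view_a_bytes, view_b_bytes, bytes_on_disk) after a write to view A."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"shared-data")
        # Two copy-on-write views backed by the same file pages.
        view_a = mmap.mmap(fd, 0, access=mmap.ACCESS_COPY)
        view_b = mmap.mmap(fd, 0, access=mmap.ACCESS_COPY)
        # Writing through view A gives A a private copy of the touched page.
        view_a[0:6] = b"LOCAL!"
        seen_a = bytes(view_a[:])
        seen_b = bytes(view_b[:])
        with open(path, "rb") as f:
            on_disk = f.read()
        view_a.close()
        view_b.close()
        return seen_a, seen_b, on_disk
    finally:
        os.close(fd)
        os.unlink(path)
```

Only the view that performed the write sees the change. The other view and the file on disk keep the original bytes, just as each process ends up with its own copy of a library's writable data.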

How to instruct linker to consider strong IRQ definition present in static library instead of weak definition

We have a problem linking the strong USB_IRQ handler.
The real USB IRQ definition is in a static library.
We fill the .vector table in the application startup file (*.s) with the handler name, and the __weak definition is also defined in the same startup file.
While linking, we see that the linker always picks up the weak IRQ definition from the startup file instead of the strong IRQ definition from the library (*.a).
If we remove the weak definition from the startup file, the strong definition is used and it works well.
The problem, as we see it, is that the library file containing the strong definition is not referenced in any way from our application; that is, we do not use any functions or structures from that file. Only the IRQ handler is used, and it is triggered only when there is a hardware event.
We use the ARM GNU toolchain and have tried multiple options; nothing helps.
We searched the internet and found a few options, such as the --no_remove and --keep linker options, but these flags do not seem to be supported.
Please suggest something if you have any input in this regard.
I think you need to make sure that at least one symbol in the same source file as the strong definition gets referenced from your program. Otherwise, the linker will have no reason to load the object file that contains the IRQ. So for example you could define some init function in that source file, and call it from your program.
Make sure the IRQ function is declared with __attribute__((used)) as well.

STM32 programming tips and questions

I could not find any good documentation on the internet about STM32 programming. STM's own documents do not explain anything more than register functions. I would greatly appreciate it if anyone could answer the following questions.
I noticed that in all example programs that STM provides, local variables for main() are always defined outside of the main() function (with occasional use of static keyword). Is there any reason for that? Should I follow a similar practice? Should I avoid using local variables inside the main?
I have a global variable which is updated within the clock interrupt handler. I am using the same variable inside another function as a loop condition. Don't I need to access this variable using some form of atomic read operation? How can I know that a clock interrupt does not change its value in the middle of the function execution? Do I need to disable the clock interrupt every time I use this variable inside a function? (This seems extremely inefficient to me, as I use it as a loop condition. I believe there should be better ways of doing it.)
Keil automatically inserts a startup code which is written in assembly (i.e. startup_stm32f4xx.s). This startup code has the following import statements:
IMPORT SystemInit
IMPORT __main
In C, this makes sense. However, in C++ both main and system_init have different names (e.g. _int_main__void). How can this startup code still work in C++ even without using extern "C"? (I tried it and it worked.) How can the C++ linker (armcc --cpp) associate these statements with the correct functions?
You can use local or global variables. Using locals in embedded systems carries a risk of your stack colliding with your data; with globals you don't have that problem. But this is true no matter where you are: embedded microcontroller, desktop, etc.
I would make a copy of the global in the foreground task that uses it.
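volatile unsigned int myglobal; /* written by the interrupt handler */
void fun ( void )
{
    unsigned int myg;
    myg = myglobal; /* take one snapshot of the shared variable */
    /* ... use only myg from here on ... */
}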
and then only use myg for the rest of the function. Basically you are taking a snapshot and using the snapshot. You would want to do the same thing when reading a register: if you want to do multiple things based on a sample of something, take one sample and make all your decisions on that one sample, otherwise the item can change between samples. If you are using one global to communicate back and forth with the interrupt handler, I would use two variables instead: one from foreground to interrupt, the other from interrupt to foreground.
Yes, there are times when you need to carefully manage a shared resource like that. Normally these are times when you need to do more than one thing. For example, if you have several items that all need to change as a group before the handler can see them change, then you need to disable the interrupt handler until all the items have changed. Here again there is nothing special about embedded microcontrollers; this is all basic stuff you would see on a desktop system with a full-blown operating system.
Keil knows what they are doing: if they support C++, then they have this worked out at the system level. I don't use Keil; I use gcc and llvm for microcontrollers like this one.
Edit:
Here is an example of what I am talking about
https://github.com/dwelch67/stm32vld/tree/master/stm32f4d/blinker05
It is an stm32 using timer-based interrupts; the interrupt handler modifies a variable shared with the foreground task. The foreground task takes a single snapshot of the shared variable (per loop) and, if need be, uses the snapshot more than once in the loop rather than the shared variable, which can change. This is C, not C++, I understand that, and I am using gcc and llvm, not Keil. (Note: llvm has known problems optimizing tight while loops, a very old bug; I don't know why they have no interest in fixing it. llvm works for this example.)
Question 1: Local variables
The sample code provided by ST is not particularly efficient or elegant. It gets the job done, but sometimes there are no good reasons for the things they do.
In general, you always want your variables to have the smallest scope possible. If you only use a variable in one function, define it inside that function. Add the "static" keyword to local variables if and only if you need them to retain their value after the function is done.
In some embedded environments, like the PIC18 architecture with the C18 compiler, local variables are much more expensive (more program space, slower execution time) than global. On the Cortex M3, that is not true, so you should feel free to use local variables. Check the assembly listing and see for yourself.
Question 2: Sharing variables between interrupts and the main loop
People have written entire chapters explaining the answers to this group of questions. Whenever you share a variable between the main loop and an interrupt, you should definitely use the volatile keyword on it. Variables of 32 or fewer bits can be accessed atomically (unless they are misaligned).
If you need to access a larger variable, or two variables at the same time from the main loop, then you will have to disable the clock interrupt while you are accessing the variables. If your interrupt does not require precise timing, this will not be a problem. When you re-enable the interrupt, it will automatically fire if it needs to.
Question 3: main function in C++
I'm not sure. You can use arm-none-eabi-nm (or whatever nm is called in your toolchain) on your object file to see what symbol name the C++ compiler assigns to main(). I would bet that C++ compilers refrain from mangling the main function for this exact reason, but I'm not sure.
STM's sample code is not an exemplar of good coding practice, it is merely intended to exemplify use of their standard peripheral library (assuming those are the examples you are talking about). In some cases it may be that variables are declared external to main() because they are accessed from an interrupt context (shared memory). There is also perhaps a possibility that it was done that way merely to allow the variables to be watched in the debugger from any context; but that is not a reason to copy the technique. My opinion of STM's example code is that it is generally pretty poor even as example code, let alone from a software engineering point of view.
In this case, access to your clock interrupt variable is atomic so long as it is 32 bits or less and you are not using read-modify-write semantics with multiple writers. You can safely have one writer and multiple readers regardless. This is true for this particular platform, but not necessarily universally; the answer may be different for 8 or 16 bit systems, or for multi-core systems for example. The variable should be declared volatile in any case.
I am using C++ on STM32 with Keil, and there is no problem. I am not sure why you think that the C++ entry points are different, they are not here (Keil ARM-MDK v4.22a). The start-up code calls SystemInit() which initialises the PLL and memory timing for example, then calls __main() which performs global static initialisation then calls C++ constructors for global static objects before calling main(). If in doubt, step through the code in the debugger. It is important to note that __main() is not the main() function you write for your application, it is a wrapper with different behaviour for C and C++, but which ultimately calls your main() function.

Should a class or method which processes a file close the file as a side effect?

I'm wondering which is the more 'Pythonic' / better way to write methods which process files. Should the method which processes the file close that file as a side effect? Should the concept of the data being a 'file' be completely abstracted from the method which is processing the data, meaning it should expect some 'stream' but not necessarily a file?
As an example, is it OK to do this:
process(open('somefile','r'))
... carry on
Where process() closes the file handle:
def process(somefile):
    # do some stuff with somefile
    somefile.close()
Or is this better:
file = open('somefile','r')
process(file)
file.close()
For what it's worth, I generally am using Python to write relatively simple scripts that are extremely specifically purposed, where I'll likely be the only one ever using them. That said, I don't want to teach myself any bad practices, and I'd rather learn the best way to do things, as even a small job is worth doing well.
Generally, it is better practice for the opener of a file to close the file. In your question, the second example is better.
This is to prevent possible confusion and invalid operations.
Edit: If your real code isn't any more complex than your example code, then it might be better just to have process() open and close the file, since the caller doesn't use the file for anything else. Just pass in the path/name of the file. But if it is conceivable that a caller to process() will use the file before the file should be closed, then leave the file open and close operations outside of process().
Using with means not having to worry about this.
At least, not on 2.5+.
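As a rough illustration of the with approach: the caller opens the file, the processing function only reads from the stream it is given, and the with block closes the file on exit. The names count_nonblank_lines and count_in_file are invented for this sketch.

```python
import os
import tempfile


def count_nonblank_lines(stream):
    """Process an already-open text stream; never closes it."""
    return sum(1 for line in stream if line.strip())


def count_in_file(path):
    # The opener owns the file: 'with' closes it even if processing raises.
    with open(path, "r") as handle:
        total = count_nonblank_lines(handle)
    # After the block the handle is closed, and the function never had to care.
    return total, handle.closed
```

Because the processing function does not close anything, the same function works unchanged on any other iterable of lines.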
No, because it reduces flexibility. If the function doesn't close the file and the caller doesn't need it any more, the caller can close it. If the function does close it but the caller still needs the file open, the caller is stuck with reopening the file, using StringIO, or something else.
Also, closing the file object requires additional assumptions about its real type (an object can support .read() or .write() but have no meaningful .close()), which hinders duck typing.
And files left open are not a problem for CPython: the file is closed as soon as the last reference to it is dropped, i.e. right after something(open('somefile')) returns. (Other implementations will garbage-collect and close the file too, but at an unspecified moment.)
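The duck-typing point can be made concrete with a small sketch. ReadOnlySource is a hypothetical class that offers read() but no close(), and word_count is an invented processing function that relies on read() alone.

```python
import io


class ReadOnlySource:
    """Hypothetical minimal stream: supports read() only, has no close()."""

    def __init__(self, text):
        self._text = text

    def read(self):
        return self._text


def word_count(stream):
    # Only read() is required, so any object that provides it will do.
    return len(stream.read().split())


buffer = io.StringIO("one two three")
print(word_count(buffer))                  # an in-memory stream works
print(buffer.closed)                       # still open; the caller can reuse it
print(word_count(ReadOnlySource("x y")))   # an object without close() works too
```

If word_count called stream.close() at the end, the last call would fail with AttributeError, and the StringIO buffer would no longer be usable by the caller.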