Understanding postgreSQL shared memory - sql

I've watched the presentation and still have one question about working of shared buffers. As the slide 16 shows, when the server handles an incoming request, the postmaster process calls fork() to create a child one for handling the incoming request. Here is a picture from there:
So, we have the entire copy of the postmaster process except its pid. Now, if the child process update some data belonging to shared memory (putting in shared buffers, as shown in the slide 17), we need the other threads be awared of the changes. The picture:
The synchronization process is what I don't understand. Any process owns a copy of the shared memory and while copying it doesn't know if another thread will write something to its copy of the shared memory. What if after creating proc1 by calling fork(), another process proc2 is created a little bit later and start writing something into the its copy of the shared memory.
Question: How does proc1 know what to do with the part of the shared memory that are being modified by proc2?

The crucial thing to understand is that there are two different types of memory sharing used.
One is the copy-on-write sharing used by fork() (without exec()), where the child process inherits the parent process's memory and state. In this case when the child or parent modify anything, a new private copy of the modified memory page is allocated. So the child doesn't see changes made by the parent after fork() and the parent doesn't see changes made by the child after fork(). Peer children cannot see each other's changes either. They're all isolated as far as memory is concerned, they just share a common ancestor.
That memory is what's shown in the Program (text), data and stack sections of the diagram.
Because of that isoltion, PostgreSQL also uses POSIX shared memory - or, in older versions, system V shared memory. These are explicitly shared memory segments that are mapped to a range of addresses. Each process sees the same memory, and it is not copy-on-write. It's fully read/write shared.
This is what is shown in the purple "shared memory" section of the diagram.
POSIX shared memory is used for inter-process communication for locking, for shared_buffers, etc etc. Not the memory inherited from fork()ing.
While memory from fork is often shared copy-on-write, that's really an operating system implementation detail. The operating system could choose not to share it at all, and make an immediate copy of the parent's whole address space for the child at fork time. The only way the copy-on-write sharing is really relevant is when looking at top etc.
When PostgreSQL refers to "shared memory" it's always talking about the POSIX or System V shared memory block(s) that are mapped into each process's address space. Not copy-on-write sharing from fork().

I don't know about this special case but generally in linux and most other operating systems in order to speedup creating a new process, when a process asks operating system to create a new process then OS creates the new one with minimum requirements (specifically in DB applications) and share most of parent memory space with child. Now when a child want to modify some part of shared memory, OS uses COW (copy on write) concept and create a new copy of that part of the memory for child process usage. So this part becomes specific for child process and is no longer shared with parent process.

Related

How is the Address Space (of a process) and Process Control Block (PCB) are related in Operating System?

If we talk about Address Space of a process it is the virtual address range which includes static data, stack and heap memory for that particular process. And coming to Process Control Block (PCB) which is a data structure maintained by operating system for each process it manages, where PCB includes a lot of information about the process like process no., process state, program counter, list of open files, cpu scheduling info ...and more.
Now this is the point where I got confused that Address Space is also a memory which stores information about a process and similar thing is done by PCB too. Then how these are connected to each other. I am not able to visualize this in my mind. Why we have these two things existing simultaneously. Isn't it possible to achieve our goal just by using PCB?
Process Address space refer to memory regions the process is using. It typically consists of heap, stack, initialized data, uninitialized data and text. There are mainly two address spaces of a process -- logical and physical.
The PCB is a structure resides in kernel to track the state of process. One of the things PCB contain is memory information. In a typically system, PCB may contain information about pages the process has.
To answer your question, Process Address space is an idea built on top of PCB and many other things (such as page table).

Operating Systems: Process Creation

What does this statement in the context of process creation mean?
"In UNIX, the child's initial address space is a copy of the parent's, but there are definitely two distinct address spaces involved; no writable memory is shared".
I do understand that after the fork system call, the parent process is cloned and that clears the copied part. What I find difficult understanding is the "different address spaces" part, after the address space is copied.
Thank You
"Different address spaces" just means that the two processes have separate and independent copies of all their data in memory. Initially those copies are the same, but each process can change data in its own memory and the changes are not visible to the other process. For example, if the initial process has a variable called x stored at address 0x01234567, after the fork() both processes will have a variable at that address, but they're different variables that can hold different values despite having the same address. An address like 0x01234567 actually corresponds to different places in RAM in each process.
If both processes shared the same address space, they'd both be looking at the same memory (rather than separate and independent copies of it), so changes made by one process would be visible to the other. An address like 0x01234567 would refer to the same spot in RAM in both processes.
(In principle, fork() makes a complete copy of all the calling process's memory. In practice, the copying is typically deferred, using a technique called "copy-on-write" that allows the system to avoid making duplicate copies of data that's the same in both processes. But that's an implementation detail that's basically invisible to applications; the system behaves as if fork() made a complete copy of everything.)
In linux there is a concept of COW (Copy On Write). When a fork() is called the child process will be created. The child and parent process will have there own address space. The child process clones the address space of parent (including stack and heap). But it is required to know when the cloning will happen. If the parent has stack memory of say 100 bytes and the child process does not modify that memory (or simply reads) through out it's life span then the stack memory of 100 bytes will not be cloned. Only when the child process tries to write to that memory the memory will be cloned. This functionality in linux is called COW.

Does context switch between processes invalidate the MMU(memory control unit)?

This is a sentence in the PowerPoint of my system lecture, but I don't understand why context switch invalidates the MMU. I know it will invalidate the cache since the cache contains information of another process. However, as for MMU, it just maps virtual memory to physical memory. If context switch invalidates it, does this mean the MMU use different mechanism of mapping in different processes?
Does this mean the MMU use different mechanism of mapping in different processes?
Your conclusion is essentially right.
Each process has its mapping from virtual to physical addresses (called context).
The address 0x401000 for example can be translated to 0x01234567 for process A and to 0x89abcdef for process B.
Having different contexts allows for an easy isolation of the processes, easy on demand paging and simplified relocation.
So each context switch must invalidate the TLB or the CPU would continue using the old translations.
Some pages however are global, meaning that they have the same translation independently of the current process address space.
For example the kernel code is mapped in the same way for every process adn thus doesn't need to be remapped.
So in the end only a part of the TLB is invalidated.
You can read how Linux handles the process address space for a real example of applied theory.
What you are describing is entirely system specific.
First of all, what they are probably referring to is invaliding the MMU cache. That assume the MMU has a cache (likely these days but not guaranteed).
When a context switch occurs, the processor has set put the MMU in a state where leftovers from the previous process would screw up the new process. If it did not, the cache would map the new process's logical pages to the old process's physical page frames.
For example, some processors use one page table for the system space and one or more other page tables for the user space. After a context switch, it would be ideal for the processor to invalidate any caching of the user space page tables but leave any caching of the system table table alone.
Note that in most processors all of this is done entirely behind the scenes. Even OS programmers do not need to deal with (or even be aware of) any flushing or invalidation of the MMU. There is a single switch process context instruction that handles everything. Other processors require the OS programmer to handle additional tasks as part of a context switch which, in some oddball processors, includes explicitly flushing the MMU cache.

IPC through writing into files?

I have a question about Inter-process-communication in operating systems.
Can 2 processes communicate with each other by both processes opening the same file (which say was created before both processes, so both processes have the file handler) and then communicating by writing into this file?
If yes, what does this method come under? I have heard that 2 major ways of IPC is by shared-memory and message-passing. Which one of these, this method comes under?
The reason, I am not sure if it comes under shared-memory is that, because this file is not mapped to address space of any of these processes. And, from my understanding, in shared-memory, the shared-memory-region is part of address space of both the processes.
Assume that processes write into the file in some pre-agreed protocol/format so both have no problem in knowing where the other process writes and when etc. This assumption is to merely understand. In real world though, this may be too stringent to hold true etc.
If no, what is wrong with this scenario? Is it that if 2 different processes open the same file, then the changes made by 1st process are not flushed into persistent storage for others to view until the process terminates? or something else?
Any real world example from Windows and Linux should also be useful.
Thanks,
Using a file is a kind of shared memory. Instead of allocating a common memory buffer in RAM, a common file is used.
To successfully manage the communication some kind of locking mechanism for different ranges in the file is needed. This could either be locking of ranges provided by the file system (available at least on Windows) or global operating system mutexes.
One real-world scenario where disk storage is used for inter-process-communication is the quorom disk used in clusters. It is a common disk resource, accessible over a SAN by all cluster nodes, that stores the cluster's configuration.
The posix system call mmap does mappings of files to virtual memory. If the mapping is shared between two processes, writes to that area in one process will affect other processes. Now coming to you question, yes a process reading from or writing to the underlying file will not always see the same data that the process that has mapped it, since the segment of the file is copied into RAM and periodically flushed to disk. Although I believe you can force synchronization with the msync system call. Do read up on mmap(). It has a host of other memory sharing options.

process control block vs process descriptor

what is the exact difference between process control block and process descriptor?.
I was reading about kernel of linux. It was written that there is some thread_info structure which contains the pointer to actual process descriptor table. It was written that the thread_info lies just above/below of kernel stack. So definitely thread_info is in main memory. But what about actual process descriptor task_struct? where is it located? If process descriptor resides in main memory, where is the actual place for it ?
The thread_info and task_struct structures are just two different structures that hold different pieces of information about a thread, with the thread_info holding more architecture-specific data than the task_struct. It makes more sense to split up the information rather than keep it all in the same structure. (Although you could put them in the same struct; the 2.4 Linux kernel did this.)
How those structs are allocated depends on the architecture you're using. The relevant functions you want to examine are alloc_task_struct() and alloc_thread_info().
In the kernel, the process descriptor is a structure called task_struct, which keeps track of process attributes and information. All kernel information regarding a process is found there.