How does the system choose the right Page Table? - process

Let's focus on uniprocessor computer systems. When a process is created, as far as I know, a page table is set up that maps its virtual addresses to the physical memory address space. Each process gets its own page table, stored in the kernel address space. But how does the MMU choose the right page table for the process, given that more than one process is running and many context switches are happening?
Any help is appreciated!
Best,
Simon

Processors have a privileged register called the page table base register (PTBR); on x86 it is CR3. On a context switch, the OS changes the value of the PTBR so that the processor knows which page table to use. In addition to the PTBR, many modern processors have a notion of an address space number (ASN). Each process is given an ASN from a limited pool, and the OS also writes it to a register on a context switch. The ASN is used as part of TLB matching, which allows TLB entries from multiple address spaces to coexist. The TLB needs to be flushed only when an ASN is reused, and then only for entries matching that ASN. x86 has traditionally been more coarse-grained than this: reloading CR3 flushes all TLB entries except those for global pages, which are typically used for kernel mappings that are identical in every address space.
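As a rough illustration of ASN-tagged TLB matching, here is a minimal Python sketch. The class `TaggedTLB`, its method names, and the page and frame numbers are all invented for this example; real hardware does this in silicon, not through any such interface.

```python
class TaggedTLB:
    """Toy TLB whose entries are keyed by (ASN, virtual page number)."""

    def __init__(self):
        self.entries = {}      # (asn, vpn) -> physical frame number
        self.current_asn = 0   # stand-in for the hardware ASN register

    def switch_to(self, asn):
        # Context switch: only the ASN register changes, nothing is flushed.
        self.current_asn = asn

    def insert(self, vpn, pfn):
        # Cache a translation for the address space that is currently active.
        self.entries[(self.current_asn, vpn)] = pfn

    def lookup(self, vpn):
        # A hit requires both the page number and the current ASN to match.
        return self.entries.get((self.current_asn, vpn))

    def flush_asn(self, asn):
        # Needed only when an ASN is recycled for a different process.
        self.entries = {k: v for k, v in self.entries.items() if k[0] != asn}


tlb = TaggedTLB()

tlb.switch_to(1)
tlb.insert(0x400, 0x1234)   # process with ASN 1 maps page 0x400

tlb.switch_to(2)
tlb.insert(0x400, 0x89AB)   # process with ASN 2 maps the same page elsewhere

tlb.switch_to(1)
frame_seen_by_asn1 = tlb.lookup(0x400)   # still cached, no flush was needed
```

Both translations for page 0x400 live in the TLB at once; which one is used depends only on the ASN that is currently loaded.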

The MMU in this case is completely unaware of what a process is. The operating system, which keeps track of processes, generates a page table for each process as it is created, as you say. The process for context switching is as follows:
1. The operating system tells the MMU to use the page table located at physical address 0xFOO.
2. The operating system programs the programmable interval timer (PIT) to cause a hardware interrupt after BAR milliseconds.
3. The operating system restores the process state (CPU registers, program counter, etc.) and jumps to the correct address.
4. The process runs until the PIT triggers an interrupt.
5. The operating system routine for handling the PIT interrupt then saves the program state (registers etc.), uses a scheduling algorithm to determine the next process to run (in a simple case, a circular linked list), then starts over at step 1.
I hope that clears up any doubts you may have. The short answer: The MMU is process agnostic and doesn't know what a process is.
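The five steps above can be sketched as a small simulation. This is only a toy model under stated assumptions: `Process`, `Machine`, and `run_round_robin` are hypothetical names, one loop iteration stands in for one timer interval, and "running" a process just increments its program counter.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Process:
    name: str
    page_table_addr: int   # physical address of this process's page table
    registers: dict = field(default_factory=lambda: {"pc": 0})


class Machine:
    def __init__(self):
        self.ptbr = None       # page table base register (CR3 on x86)
        self.registers = {}


def run_round_robin(machine, processes, ticks):
    ready = deque(processes)   # circular queue of runnable processes
    trace = []
    for _ in range(ticks):
        proc = ready[0]
        machine.ptbr = proc.page_table_addr        # step 1: select page table
        # step 2: arm the timer (implicit: one iteration = one time slice)
        machine.registers = dict(proc.registers)   # step 3: restore state
        machine.registers["pc"] += 1               # step 4: process runs
        trace.append((proc.name, machine.ptbr))
        proc.registers = dict(machine.registers)   # step 5: save state
        ready.rotate(-1)                           # step 5: pick next process
    return trace


a = Process("A", 0x1000)
b = Process("B", 0x2000)
trace = run_round_robin(Machine(), [a, b], ticks=4)
```

Note that the `Machine` never sees a `Process` object as such; it only sees a new value in `ptbr` and a new set of registers.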

Related

How are the Address Space (of a process) and the Process Control Block (PCB) related in an Operating System?

The address space of a process is the virtual address range that includes the static data, stack, and heap memory for that particular process. The Process Control Block (PCB) is a data structure maintained by the operating system for each process it manages. The PCB includes a lot of information about the process, such as the process number, process state, program counter, list of open files, CPU scheduling info, and more.
This is where I got confused: the address space is memory that stores information about a process, and the PCB does something similar. How are these connected to each other? I am not able to visualize this in my mind. Why do these two things exist simultaneously? Isn't it possible to achieve our goal just by using the PCB?
The process address space refers to the memory regions the process is using. It typically consists of heap, stack, initialized data, uninitialized data, and text. There are mainly two address spaces of a process: logical and physical.
The PCB is a structure that resides in the kernel to track the state of the process. One of the things the PCB contains is memory information. In a typical system, the PCB may contain information about the pages the process has.
To answer your question: the process address space is an abstraction built on top of the PCB and other structures (such as the page table).
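To make the relationship concrete, here is a minimal sketch in Python. The field names and layout are assumptions for illustration only and do not match any particular kernel: the PCB is the kernel's bookkeeping record, and one of its fields points to the structures that describe the address space.

```python
from dataclasses import dataclass, field


@dataclass
class AddressSpace:
    # Hypothetical description of a process's memory: named regions plus
    # a page table mapping virtual page numbers to physical frame numbers.
    regions: dict = field(default_factory=dict)      # name -> (start, end)
    page_table: dict = field(default_factory=dict)   # vpn -> pfn


@dataclass
class PCB:
    pid: int
    state: str
    program_counter: int
    open_files: list
    address_space: AddressSpace   # the PCB refers to the address space


space = AddressSpace(
    regions={"text": (0x400000, 0x401000), "stack": (0x7FFF0000, 0x80000000)},
    page_table={0x400: 0x1234},
)
pcb = PCB(pid=42, state="ready", program_counter=0x400000,
          open_files=["stdin", "stdout"], address_space=space)
```

The PCB holds metadata about the process; the address space is the memory the process itself reads and writes, which the PCB merely describes.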

Does a context switch between processes invalidate the MMU (memory management unit)?

This is a sentence in the PowerPoint of my systems lecture, but I don't understand why a context switch invalidates the MMU. I know it will invalidate the cache, since the cache contains information from another process. The MMU, however, just maps virtual memory to physical memory. If a context switch invalidates it, does this mean the MMU uses a different mapping mechanism for different processes?
Does this mean the MMU uses a different mapping mechanism for different processes?
Your conclusion is essentially right.
Each process has its mapping from virtual to physical addresses (called context).
The address 0x401000 for example can be translated to 0x01234567 for process A and to 0x89abcdef for process B.
Having different contexts allows for an easy isolation of the processes, easy on demand paging and simplified relocation.
So each context switch must invalidate the TLB or the CPU would continue using the old translations.
Some pages however are global, meaning that they have the same translation independently of the current process address space.
For example, the kernel code is mapped in the same way for every process and thus doesn't need to be remapped.
So in the end only a part of the TLB is invalidated.
You can read how Linux handles the process address space for a real example of applied theory.
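The partial invalidation described in this answer can be sketched as follows. This is a toy model: `ToyTLB` and the page and frame numbers are made up for the example, and a real TLB is a hardware structure rather than a dictionary.

```python
class ToyTLB:
    """Toy TLB mapping virtual page numbers to (frame, is_global) pairs."""

    def __init__(self):
        self.entries = {}

    def insert(self, vpn, pfn, is_global=False):
        self.entries[vpn] = (pfn, is_global)

    def lookup(self, vpn):
        entry = self.entries.get(vpn)
        return None if entry is None else entry[0]

    def context_switch(self):
        # Drop every translation that belongs to the outgoing process,
        # but keep global entries (for example, kernel mappings).
        self.entries = {vpn: e for vpn, e in self.entries.items() if e[1]}


USER_VPN = 0x401       # arbitrary user page, mapped differently per process
KERNEL_VPN = 0xC0000   # arbitrary kernel page, marked global

tlb = ToyTLB()
tlb.insert(USER_VPN, 0x1234)                     # process A's translation
tlb.insert(KERNEL_VPN, 0x0100, is_global=True)   # identical for every process

tlb.context_switch()                             # switch from A to B
stale_user_entry = tlb.lookup(USER_VPN)          # A's entry is gone
kernel_entry = tlb.lookup(KERNEL_VPN)            # the global entry survived

tlb.insert(USER_VPN, 0x89AB)                     # B loads its own translation
```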
What you are describing is entirely system specific.
First of all, what they are probably referring to is invalidating the MMU cache. That assumes the MMU has a cache (likely these days but not guaranteed).
When a context switch occurs, the processor has to put the MMU in a state where leftovers from the previous process cannot screw up the new process. If it did not, the cache would map the new process's logical pages to the old process's physical page frames.
For example, some processors use one page table for the system space and one or more other page tables for the user space. After a context switch, it would be ideal for the processor to invalidate any caching of the user space page tables but leave any caching of the system page table alone.
Note that in most processors all of this is done entirely behind the scenes. Even OS programmers do not need to deal with (or even be aware of) any flushing or invalidation of the MMU. There is a single switch process context instruction that handles everything. Other processors require the OS programmer to handle additional tasks as part of a context switch which, in some oddball processors, includes explicitly flushing the MMU cache.

Setting mode bits during OS system calls

I wanted to know exactly whose responsibility it is to set the mode bits during system calls to the kernel.
Does the job scheduler manage these bits, or is the whole Process Status Word (PSW) a part of the Process Control Block?
Or is it the responsibility of the interrupt handler to do this? If so, how does the interrupt service routine (being a routine itself) get to perform such a privileged task when no other user routine can? What if some user process tries to address the PSW? Is the behavior different for different operating systems?
A lot of the protection mechanisms you ask about are architecture specific. I believe that the Process Status Word refers to an IBM architecture, but I am not certain. I don't know specifically how the Process Status Word is used in that architecture.
I can, however, give you an example of how this is done in the case of x86. In x86, privileged instructions can only be executed on ring 0, which is what the interrupt handlers and other kernel code execute in.
On x86, the CPU tracks the current privilege level (CPL) in the low two bits of the CS segment register. It changes only through controlled entry points such as interrupts, exceptions, and system call instructions, whose targets are set up by the kernel. Each page in the virtual memory system also carries protection bits. When a process is created, certain areas of memory are marked as user pages and other areas, where the kernel is mapped, are marked as supervisor pages, so the processor can refuse user-mode access to kernel memory. Since only the kernel can modify these mappings and entry points, user code is unable to execute privileged instructions.
The Process Control Block is not architecture specific, which means that it is entirely up to the operating system to determine how it is used to set up privileges and such. One thing is certain, however: the CPU does not read the Process Control Block as it exists in the operating system. Some architectures could have their own process control mechanism built in, but this is not strictly necessary. On x86, the Process Control Block would be used to know what sort of system calls the process can make, as well as the virtual memory mappings that tell the CPU which pages user-mode code may access.
While different architectures have different mechanisms for protecting user code, they all share many common attributes in that when kernel code is executed via a system call, the system knows that only the code in that particular location can be privileged.
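As a very simplified model of the x86 case discussed in this answer, the two checks can be written as small predicates. This is a sketch, not a description of the full architecture: it assumes only rings 0 and 3 matter, models the page-level user/supervisor bit as a boolean, and ignores segmentation and newer features such as SMEP and SMAP. The function names are invented for the example.

```python
RING_KERNEL = 0   # most privileged ring; kernel and interrupt handlers
RING_USER = 3     # least privileged ring; ordinary processes


def may_access_page(cpl, page_is_user):
    """User-mode code (ring 3) may touch only pages marked as user pages."""
    return page_is_user or cpl < RING_USER


def may_execute_privileged(cpl):
    """Privileged instructions are allowed only in ring 0."""
    return cpl == RING_KERNEL


# A user process touching its own page is fine, touching a kernel page is not.
user_reads_user_page = may_access_page(RING_USER, page_is_user=True)
user_reads_kernel_page = may_access_page(RING_USER, page_is_user=False)
kernel_reads_kernel_page = may_access_page(RING_KERNEL, page_is_user=False)
```

In real hardware a failed check raises a fault that the kernel handles; here the predicates simply return `False`.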

Running multiple executables linked to 0x400000

I'm interested in operating systems and I have a basic question. Standard PE executable files are linked to 0x400000. My question is: how can the operating system load multiple executables with the same image base, when virtual memory just maps virtual addresses to physical ones? Is it storing the PDE and PTE index of each thread somewhere? Is something added to each address before execution starts? How does it work?
Each process gets its own virtual address space, and hence there's no conflict. All virtual address spaces that exist at any one time in the system get mapped into the physical address space. Virtual memory that can't be, or currently isn't, mapped onto physical memory is held in the swap file (swap partition, or the like); this is called paging.
During thread switches, when the CPU is about to execute a thread from a different process than it was executing so far, the operating system's scheduler informs the CPU (sets the respective registers) about the new virtual address translation table to use. Thus the CPU thinks there's just one virtual address space at the given time, while the operating system can manage many more, one for each process.
Disclaimer: My answer may be thought of as a bit superficial or imprecise compared to reality. This is for the sake of simplicity, given the nature of the OP's question. Also, these mechanisms are CPU-dependent and operating system-dependent.
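Since the question mentions PDE and PTE indices, here is a hedged sketch of how the same virtual address 0x400000 can resolve differently in two processes. It assumes classic 32-bit x86 paging without PAE (10-bit directory index, 10-bit table index, 12-bit offset) and models each process's page directory as nested dictionaries; the frame numbers are arbitrary.

```python
def split_virtual_address(va):
    """Split a 32-bit virtual address into (PDE index, PTE index, offset)."""
    pde_index = (va >> 22) & 0x3FF   # top 10 bits
    pte_index = (va >> 12) & 0x3FF   # middle 10 bits
    offset = va & 0xFFF              # low 12 bits, unchanged by translation
    return pde_index, pte_index, offset


def translate(page_directory, va):
    """Walk a toy two-level page table: directory -> table -> frame."""
    pde_index, pte_index, offset = split_virtual_address(va)
    frame = page_directory[pde_index][pte_index]
    return (frame << 12) | offset


# Each process has its own page directory; the OS points the CPU at the
# right one on every switch between processes.
directory_a = {1: {0: 0x10, 1: 0x11}}
directory_b = {1: {0: 0x80, 1: 0x81}}

indices = split_virtual_address(0x400000)
phys_a = translate(directory_a, 0x400000)
phys_b = translate(directory_b, 0x400000)
```

Nothing is added to the address itself: the indices extracted from 0x400000 are the same in both processes, and only the tables they index into differ.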

System call without context switching?

I was just reading up on how linux works in my OS-book when I came across this..
[...] the kernel is created as a single, monolithic binary. The main reason is to improve performance. Because all kernel code and data structures are kept in a single address space, no context switches are necessary when a process calls an operating-system function or when a hardware interrupt is delivered.
That sounded quite amazing to me, surely it must store the process's context before running off into kernel mode to handle an interrupt.. But ok, I'll buy it for now. A few pages on, while describing a process's scheduling context, it said:
Both system calls and interrupts that occur while the process is executing will use this stack.
"this stack" being the place where the kernel stores the process's registers and such.
Isn't this a direct contradiction of the first quote? Am I misinterpreting it somehow?
I think the first quote is referring to the differences between a monolithic kernel and a microkernel.
Linux being monolithic, all its kernel components (device drivers, scheduler, VM manager) run at ring 0. Therefore, no context switch is necessary when performing system calls and handling interrupts.
Contrast microkernels, where components like device drivers and IPC providers run in user space, outside of ring 0. Therefore, this architecture requires additional context switches when performing system calls (because the performing module might reside in user space) and handling interrupts (to relay the interrupts to the device drivers).
"Context switch" could mean one of a couple of things, both relevant: (1) switching from user to kernel mode to process the system call, or an involuntary switch to kernel mode to process an interrupt against the interrupt stack, or (2) switching to run another user process in user space, with a jump to kernel space in between the two.
Any movement from user space to kernel space implies saving enough user-space state to return to it reliably. If the kernel-space code decides, while the user code for that process is no longer running, that it's time to let another user process run, that process gets in.
So at the least, you're talking 2-3 stacks or places to store a "context": hardware-interrupts need a kernel-level stack to say what to return to; user method/subroutine calls use a standard stack for getting that done. Etc.
The original Unix kernels (and the model isn't that different now for this part) ran the system calls like a short-order cook processing breakfast orders: move this over on the stove to make room for the order of bacon that just arrived, start the bacon, go back to the first order. All of it is context switching inside the kernel. It was not a huge monitoring application, which probably drove the IBM and DEC software folks mad.
When making a system call in Linux, a context switch is done from user space to kernel space (ring 3 to ring 0). Each process has an associated kernel-mode stack, which is used by the system call. On entry to the kernel, the user-mode CPU registers of the process are saved on this kernel-mode stack, which is separate from the user-mode stack that the process uses for user-space execution.
When a process is in kernel mode (or user mode), calling functions of the same mode does not require a context switch. This is what the first quote refers to.
The second quote refers to the kernel mode stack, and not the user-mode stack.
Having said this, I must mention Linux optimisations where no transition to kernel space is needed to execute certain system calls, i.e. all processing related to the system call is done in user space itself (thus no context switch). vsyscall and vDSO are such techniques. The idea behind them is quite simple: the kernel makes available to user space the data required to carry out the corresponding system call. More info can be found in this LWN article.
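The idea behind vsyscall and vDSO can be illustrated with a toy model. This is not how Linux implements it; `ToyKernel`, `shared_page`, and `vdso_gettime` are invented names, and a dictionary stands in for a memory page that the kernel maps read-only into user space.

```python
class ToyKernel:
    def __init__(self):
        self.kernel_entries = 0           # counts user-to-kernel transitions
        self._time = 0
        self.shared_page = {"time": 0}    # data exposed to user space

    def timer_tick(self):
        # Runs inside the kernel; keeps the shared data up to date.
        self._time += 1
        self.shared_page["time"] = self._time

    def syscall_gettime(self):
        # Conventional path: every call enters the kernel.
        self.kernel_entries += 1
        return self._time


def vdso_gettime(shared_page):
    # Fast path: plain user-space read, no kernel entry at all.
    return shared_page["time"]


kernel = ToyKernel()
for _ in range(3):
    kernel.timer_tick()

slow = kernel.syscall_gettime()            # costs one kernel entry
fast = vdso_gettime(kernel.shared_page)    # costs none
```

Both paths return the same value; only the conventional one pays for a mode transition.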
In addition to this, there have been some research projects in which all execution happens in the same ring: user-space programs and the OS code both reside in the same ring. The idea is to get rid of the overhead of ring switches. Microsoft's Singularity OS is one such project.