This is a sentence in the PowerPoint of my systems lecture, but I don't understand why a context switch invalidates the MMU. I know it will invalidate the cache, since the cache contains information from another process. However, the MMU just maps virtual memory to physical memory. If a context switch invalidates it, does this mean the MMU uses a different mapping mechanism for different processes?
Does this mean the MMU uses a different mapping mechanism for different processes?
Your conclusion is essentially right.
Each process has its mapping from virtual to physical addresses (called context).
The address 0x401000 for example can be translated to 0x01234567 for process A and to 0x89abcdef for process B.
Having different contexts allows for an easy isolation of the processes, easy on demand paging and simplified relocation.
So each context switch must invalidate the TLB or the CPU would continue using the old translations.
Some pages however are global, meaning that they have the same translation independently of the current process address space.
For example, the kernel code is mapped in the same way for every process and thus doesn't need to be remapped.
So in the end only a part of the TLB is invalidated.
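A minimal sketch of that behaviour, assuming a toy TLB modelled as a dictionary and 4 KiB pages. The class and function names (`Tlb`, `translate`) and the page-table layout are made up for illustration and do not correspond to any real hardware interface.

```python
from dataclasses import dataclass

PAGE_SIZE = 4096  # assumed page size for this sketch


@dataclass
class TlbEntry:
    frame: int        # physical frame number
    is_global: bool   # global entries survive a context switch


class Tlb:
    """Toy TLB: a cache of page-number -> frame translations."""

    def __init__(self):
        self.entries = {}

    def fill(self, page, frame, is_global=False):
        self.entries[page] = TlbEntry(frame, is_global)

    def lookup(self, page):
        entry = self.entries.get(page)
        return None if entry is None else entry.frame

    def context_switch(self):
        # Drop every non-global entry; global (e.g. kernel) entries stay cached.
        self.entries = {p: e for p, e in self.entries.items() if e.is_global}


def translate(tlb, page_table, vaddr):
    """Translate vaddr, consulting the TLB first and the page table on a miss.

    page_table is a hypothetical dict: page number -> (frame, is_global).
    """
    page, offset = divmod(vaddr, PAGE_SIZE)
    frame = tlb.lookup(page)
    if frame is None:
        frame, is_global = page_table[page]
        tlb.fill(page, frame, is_global)
    return frame * PAGE_SIZE + offset
```

If `context_switch()` is skipped between two processes, `translate` keeps returning the previous process's frame for the same virtual page, which is exactly the stale-translation problem described above.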
You can read how Linux handles the process address space for a real example of applied theory.
What you are describing is entirely system specific.
First of all, what they are probably referring to is invalidating the MMU cache. That assumes the MMU has a cache (likely these days but not guaranteed).
When a context switch occurs, the processor has to put the MMU in a state where leftovers from the previous process cannot interfere with the new process. If it did not, the cache would map the new process's logical pages to the old process's physical page frames.
For example, some processors use one page table for the system space and one or more other page tables for the user space. After a context switch, it would be ideal for the processor to invalidate any caching of the user space page tables but leave any caching of the system page table alone.
Note that in most processors all of this is done entirely behind the scenes. Even OS programmers do not need to deal with (or even be aware of) any flushing or invalidation of the MMU. There is a single switch process context instruction that handles everything. Other processors require the OS programmer to handle additional tasks as part of a context switch which, in some oddball processors, includes explicitly flushing the MMU cache.
Related
I am reading the wikipedia page on system calls and I cannot reconcile a few of the statements that are made there.
At the bottom, it says that "A system call does not generally require a context switch to another process; instead, it is executed in the context of whichever process invoked it."
Yet, at the top, it says that "[...] applications to request services via system calls, which are often initiated via interrupts. An interrupt [...] passes control to the kernel [and then] the kernel executes a specific set of instructions over which the calling program has no direct control".
It seems to me that if the interrupt "passes control to the kernel," that means that the kernel, which is "another process," is executing and therefore a context switch happened. Therefore, there seems to be a contradiction in the wikipedia page. Where is my understanding wrong?
Your understanding is wrong because the kernel isn't a separate process. The kernel is sitting in RAM in shared memory areas. Typically, it sits in the top half of the virtual address space.
When the kernel is invoked with a system call, it is not necessarily using an interrupt. On x86-64, it is invoked directly using a specific processor instruction (syscall). This instruction makes the processor jump to the address stored in a special register.
Syscalls don't necessarily involve a full context switch. They must involve a user mode to kernel mode context switch. Most often, kernels have a kernel stack per process. This stack is mostly unused and empty when no system call is active as it then makes no sense to have anything stored in it.
The registers also need to be saved since the kernel can use them. I don't know about other processors, but x86-64 does have the TSS, which allows for an automated user mode to kernel mode stack switch. The registers still need to be saved manually.
In the end, there is actually a necessary partial context switch when entering the kernel through a system call but it doesn't involve switching the whole process. Since the temporary storage for swapped registers and the kernel stack are already reserved, it involves much less overhead as the kernel doesn't need to touch the page tables. Swapping page tables often involves cache managing and some cache flushing to make it consistent.
Let's focus on uniprocessor computer systems. When a process gets created, as far as I know, the page table gets set up which maps the virtual addresses to the physical memory address space. Each process gets its own page table, stored in the kernel address space. But how does the MMU choose the right page table for the process since there is not only one process running and there will be many context switches happening?
Any help is appreciated!
Best,
Simon
Processors have a privileged register called the page table base register (PTBR); on x86 it is CR3. On a context switch, the OS changes the value of the PTBR so that the processor now knows which page table to use. In addition to the PTBR, many modern processors have a notion of an address space number (ASN). Processes are given an address space number (from a limited pool) and this ASN is set in a register on a context switch as well. This ASN is used as part of TLB matching and allows TLB entries from multiple address spaces to coexist. Only when an ASN is reused is it necessary to flush the TLB, and then only for entries matching that ASN. Most x86 implementations are more coarse grained than this, and there is a notion of global pages (typically used for kernel mappings that are identical in every address space).
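To make the ASN idea concrete, here is a minimal sketch, assuming a toy TLB whose entries are keyed by an (ASN, page) pair. `TaggedTlb` and its methods are hypothetical names, not a real processor interface.

```python
class TaggedTlb:
    """Toy TLB whose entries are tagged with an address space number (ASN)."""

    def __init__(self):
        self.entries = {}        # (asn, page) -> frame
        self.current_asn = None

    def switch_to(self, asn):
        # A context switch only changes the active tag; nothing is flushed.
        self.current_asn = asn

    def fill(self, page, frame):
        self.entries[(self.current_asn, page)] = frame

    def lookup(self, page):
        # Entries belonging to other address spaces never match.
        return self.entries.get((self.current_asn, page))

    def flush_asn(self, asn):
        # Needed only when an ASN is recycled for a different process.
        self.entries = {k: v for k, v in self.entries.items() if k[0] != asn}
```

Because the tag is part of the lookup key, switching back to an earlier process finds its translations still cached, which is the benefit over flushing on every switch.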
The MMU in this case is completely unaware of what a process is. The operating system, which keeps track of processes, generates a page table for each process, as you say, as they are created. The process for context switching is as follows:
1) The operating system tells the MMU to use the page table located at physical address 0xFOO.
2) The operating system programs the programmable interval timer (PIT) to cause a hardware interrupt after BAR milliseconds.
3) The operating system restores the process state (CPU registers, program counter, etc.) and jumps to the correct address.
4) The process runs until the PIT triggers an interrupt.
5) The operating system routine for handling the PIT interrupt then saves the program state (registers etc.), uses a scheduling algorithm to determine the next process to run (in a simple case, a circular linked list), then starts over at step 1.
I hope that clears up any doubts you may have. The short answer: The MMU is process agnostic and doesn't know what a process is.
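The loop described above can be sketched roughly as follows. This is a simulation under stated assumptions, not real kernel code: `Mmu`, `run_round_robin`, and the page-table base values are invented for illustration, and each loop iteration stands in for one timer interrupt.

```python
from collections import deque


class Mmu:
    """Toy MMU: it only knows the current page-table base, not processes."""

    def __init__(self):
        self.page_table_base = None

    def load_base(self, base):
        self.page_table_base = base


def run_round_robin(processes, mmu, ticks):
    """Simulate `ticks` timer interrupts; return which process ran on each tick.

    `processes` is a list of hypothetical (name, page_table_base) pairs.
    """
    ready = deque(processes)   # circular queue of runnable processes
    history = []
    for _ in range(ticks):
        name, base = ready[0]
        mmu.load_base(base)    # the OS tells the MMU which page table to use
        # The process now "runs" until the timer interrupt fires.
        history.append((name, mmu.page_table_base))
        ready.rotate(-1)       # the scheduler picks the next process in the circle
    return history
```

Note that the `Mmu` object never sees a process name, only a base address, which mirrors the point that the MMU is process agnostic.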
I was just reading up on how Linux works in my OS book when I came across this:
[...] the kernel is created as a single, monolithic binary. The main reason is to improve performance. Because all kernel code and data structures are kept in a single address space, no context switches are necessary when a process calls an operating-system function or when a hardware interrupt is delivered.
That sounded quite amazing to me, surely it must store the process's context before running off into kernel mode to handle an interrupt.. But ok, I'll buy it for now. A few pages on, while describing a process's scheduling context, it said:
Both system calls and interrupts that occur while the process is executing will use this stack.
"this stack" being the place where the kernel stores the process's registers and such.
Isn't this a direct contradiction of the first quote? Am I misinterpreting it somehow?
I think the first quote is referring to the differences between a monolithic kernel and a microkernel.
Linux being monolithic, all its kernel components (device drivers, scheduler, VM manager) run at ring 0. Therefore, no context switch is necessary when performing system calls and handling interrupts.
Contrast microkernels, where components like device drivers and IPC providers run in user space, outside of ring 0. Therefore, this architecture requires additional context switches when performing system calls (because the performing module might reside in user space) and handling interrupts (to relay the interrupts to the device drivers).
"Context switch" could mean one of a couple of things, both relevant: (1) switching from user to kernel mode to process the system call, or an involuntary switch to kernel mode to process an interrupt against the interrupt stack, or (2) switching to run another user process in user space, with a jump to kernel space in between the two.
Any movement from user space to kernel space implies saving enough user-space state to return to it reliably. If the kernel-space code decides, while the user code for that process is not running, that it's time to let another user process run, that other process gets switched in.
So at the least, you're talking 2-3 stacks or places to store a "context": hardware-interrupts need a kernel-level stack to say what to return to; user method/subroutine calls use a standard stack for getting that done. Etc.
The original Unix kernels - and the model isn't that different now for this part - ran the system calls like a short-order cook processing breakfast orders: move this over on the stove to make room for the order of bacon that just arrived, start the bacon, go back to the first order. All in kernel switching context. Was not a huge monitoring application, which probably drove the IBM and DEC software folks mad.
When making a system call in Linux, a context switch is done from user space to kernel space (ring 3 to ring 0). Each process has an associated kernel-mode stack that is used by the system call. Before the system call is executed, the CPU registers of the process are saved on this kernel-mode stack, which is separate from the user-mode stack that the process uses for user-space execution.
When a process is in kernel mode (or user mode), calling functions of the same mode will not require a context switch. This is what the first quote refers to.
The second quote refers to the kernel mode stack, and not the user-mode stack.
Having said this, I must mention Linux optimisations, where no transition is needed to the kernel space for executing a system call, i.e. all processing related to the system call is done in the user space itself (thus no context switch). vsyscall, and VDSO are such techniques. The idea behind them is quite simple. It is to send to the user space, the data that is required for execution of the corresponding system call. More info can be found in this LWN article.
In addition to this, there have been some research projects in which all the execution happens in the same ring. User space programs and the OS code both reside in the same ring. The idea is to get rid of the overhead of ring switches. Microsoft's Singularity OS is one such project.
I have a question about Inter-process-communication in operating systems.
Can 2 processes communicate with each other by both processes opening the same file (which say was created before both processes, so both processes have the file handler) and then communicating by writing into this file?
If yes, what does this method come under? I have heard that the 2 major ways of doing IPC are shared memory and message passing. Which of these does this method come under?
The reason I am not sure whether it comes under shared memory is that this file is not mapped into the address space of either process. From my understanding, in shared memory the shared region is part of the address space of both processes.
Assume that processes write into the file in some pre-agreed protocol/format so both have no problem in knowing where the other process writes and when etc. This assumption is to merely understand. In real world though, this may be too stringent to hold true etc.
If no, what is wrong with this scenario? Is it that if 2 different processes open the same file, then the changes made by 1st process are not flushed into persistent storage for others to view until the process terminates? or something else?
Any real world example from Windows and Linux should also be useful.
Thanks,
Using a file is a kind of shared memory. Instead of allocating a common memory buffer in RAM, a common file is used.
To successfully manage the communication some kind of locking mechanism for different ranges in the file is needed. This could either be locking of ranges provided by the file system (available at least on Windows) or global operating system mutexes.
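As a rough illustration, here is a minimal sketch of file-based message exchange guarded by a lock. It assumes a Unix-like system and uses Python's `fcntl.flock`, which takes an advisory lock on the whole file rather than on a byte range, so it is a simpler stand-in for the range locking mentioned above. The function names and the one-message-per-line format are assumptions made for this example.

```python
import fcntl
import os


def append_message(path, message):
    """Append one newline-terminated message while holding an exclusive lock."""
    with open(path, "a", encoding="utf-8") as f:
        fcntl.flock(f, fcntl.LOCK_EX)   # blocks until no other process holds the lock
        try:
            f.write(message + "\n")
            f.flush()
            os.fsync(f.fileno())        # push the data out before releasing the lock
        finally:
            fcntl.flock(f, fcntl.LOCK_UN)


def read_messages(path):
    """Read all messages written so far while holding a shared lock."""
    with open(path, "r", encoding="utf-8") as f:
        fcntl.flock(f, fcntl.LOCK_SH)   # several readers may hold this at once
        try:
            return f.read().splitlines()
        finally:
            fcntl.flock(f, fcntl.LOCK_UN)
```

Two cooperating processes would each call `append_message` and `read_messages` on the same path. The lock is advisory, so it only protects against processes that also take it.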
One real-world scenario where disk storage is used for inter-process communication is the quorum disk used in clusters. It is a common disk resource, accessible over a SAN by all cluster nodes, that stores the cluster's configuration.
The POSIX system call mmap maps files into virtual memory. If the mapping is shared between two processes, writes to that area in one process will affect other processes. Now, coming to your question: yes, a process reading from or writing to the underlying file will not always see the same data as the process that has mapped it, since the segment of the file is copied into RAM and periodically flushed to disk. I believe you can force synchronization with the msync system call, though. Do read up on mmap(). It has a host of other memory sharing options.
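A minimal sketch of a shared file mapping, using Python's `mmap` module. For simplicity it creates both mappings inside one process, on the assumption that a second process mapping the same file as shared would see the same effect. The function name `shared_mapping_demo` and the use of a temporary file are choices made for this example.

```python
import mmap
import os
import tempfile


def shared_mapping_demo():
    """Map one file twice; a write through one mapping shows up in the other."""
    fd, path = tempfile.mkstemp()
    try:
        os.ftruncate(fd, mmap.PAGESIZE)          # a mapping needs a non-empty file
        writer = mmap.mmap(fd, mmap.PAGESIZE)    # shared, read-write mapping by default
        reader = mmap.mmap(fd, mmap.PAGESIZE, access=mmap.ACCESS_READ)
        writer[0:5] = b"hello"
        seen_by_reader = bytes(reader[0:5])      # read through the second mapping
        writer.flush()                           # ask the OS to write the pages back to the file
        with open(path, "rb") as f:
            seen_in_file = f.read(5)             # read through ordinary file I/O
        reader.close()
        writer.close()
        return seen_by_reader, seen_in_file
    finally:
        os.close(fd)
        os.remove(path)
```

Calling `writer.flush()` is the Python counterpart of the explicit synchronization step mentioned above.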
In our embedded system (using a PowerPC processor), we want to disable the processor cache. What steps do we need to take?
To clarify a bit, the application in question must have as constant a speed of execution as we can make it.
Variability in executing the same code path is not acceptable. This is the reason to turn off the cache.
I'm kind of late to the question, and also it's been a while since I did all the low-level processor init code on PPCs, but I seem to remember the cache & MMU being pretty tightly coupled (one had to be enabled to enable the other) and I think in the MMU page tables, you could define the cacheable attribute.
So my point is this: if there's a certain subset of code that must run in deterministic time, maybe you locate that code (via a linker command file) in a region of memory that is defined as non-cacheable in the page tables? That way all the code that can/should benefit from the cache does, and the (hopefully) subset of code that shouldn't, doesn't.
I'd handle it this way anyway, so that later, if you want to enable caching for part of the system, you just need to flip a few bits in the MMU page tables, instead of (re-)writing the init code to set up all the page tables & caching.
From the E600 reference manual:
The HID0 special-purpose register contains several bits that invalidate, disable, and lock the instruction and data caches.
You should use HID0[DCE] = 0 to disable the data cache.
You should use HID0[ICE] = 0 to disable the instruction cache.
Note that at power up, both caches are disabled.
You will need to write this in assembly code.
Perhaps you don't want to globally disable cache, you only want to disable it for a particular address range?
On some processors you can configure TLB (translation lookaside buffer) entries for address ranges such that each range could have caching enabled or disabled. This way you can disable caching for memory mapped I/O, and still leave caching on for the main block of RAM.
The only PowerPC I've done this on was a PowerPC 440EP (from IBM, then AMCC), so I don't know if all PowerPCs work the same way.
What kind of PPC core is it? The cache control is very different between different cores from different vendors... also, disabling the cache is in general considered a really bad thing to do to the machine. Performance becomes so crawlingly slow that you would do as well with an old 8-bit processor (exaggerating a bit). Some ARM variants have TCMs, tightly-coupled memories, that work instead of caches, but I am not aware of any PPC variant with that facility.
Maybe a better solution is to keep Level 1 caches active, and use the on-chip L2 caches as statically mapped RAM instead? That is common on modern PowerQUICC devices, at least.
Turning off the cache will do you no good at all. Your execution speed will drop by an order of magnitude. You would never ship a system like this, so its performance under these conditions is of no interest.
To achieve a steady execution speed, consider one of these approaches:
1) Lock some or all of the cache. All current PowerPC chips from Freescale, IBM, and AMCC offer this feature.
2) If it's a Freescale chip with L2 cache, consider mapping part of that cache as on-chip memory.