Does OS know the process state? - process

Are all the process states, such as new, ready, running, waiting and terminated, recognized by the operating system kernel, or are they just for convenience of understanding? If they are recognized by the operating system, how does it do so?

The process state you are talking about (as opposed to the context, which some literature also calls the process state) is needed solely by the OS itself. It is a bookkeeping instrument. As such, it introduces overhead in the hope of gaining, among other things, performance elsewhere. For example, by considering only ready processes, the OS avoids switching to processes that would immediately yield to the next one, which would generate superfluous context switches.
The implementation of the concept may differ. The PCB does not always have an explicit data field for the process state. Frequently, the state is implemented by different queues into which processes are sorted. Sometimes an OS even keeps a redundant representation of the process state. The representation is a matter of efficiency: for example, if the OS is looking for some ready process (not caring which), taking one from a ready queue is O(1), while searching a list of PCBs with explicit state fields would be O(n).
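The O(1) versus O(n) point can be sketched roughly as follows. This is only an illustration, not how any particular kernel does it: the `PCB` class, the state strings and both helper functions are hypothetical names made up for the example.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class PCB:
    pid: int
    state: str  # explicit state field, e.g. "ready" or "waiting"


def pick_ready_by_scan(pcbs):
    """O(n): walk the PCB list until an entry marked ready is found."""
    for pcb in pcbs:
        if pcb.state == "ready":
            return pcb
    return None


def pick_ready_from_queue(ready_queue):
    """O(1): being in the queue *is* the state, so just take the head."""
    return ready_queue.popleft() if ready_queue else None


pcbs = [PCB(1, "waiting"), PCB(2, "waiting"), PCB(3, "ready"), PCB(4, "ready")]

# Redundant representation: the same information kept as a queue as well.
ready_queue = deque(p for p in pcbs if p.state == "ready")

scanned = pick_ready_by_scan(pcbs)           # had to skip two waiting PCBs
queued = pick_ready_from_queue(ready_queue)  # no searching needed
```

Both calls return the PCB with pid 3 here; the difference is only in how much work was needed to find it.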
To summarize: if the OS were not aware of the process states, they would be superfluous. How the state is implemented and how it is used differs from system to system.

The problem with the question here is that process states are entirely system-specific.
Your first suggestion is largely correct: the named process states are mostly pedagogical constructs for "convenience of understanding."
The operating system does have to know the state of the process, though. That is likely to be maintained in a variety of ways, including state variables and queues.

Related

Can a Deadlock occur with CPU as a resource?

I am in my fourth year of Software Engineering and we are covering the topic of deadlocks.
The general description is that a deadlock occurs when two processes A and B use two resources X and Y, and each waits for the other process to release its resource before releasing its own.
My question would be, given that the CPU is a resource in itself, is there a scenario where there could be a deadlock involving CPU as a resource?
My first thought on this problem is that you would require a system where a process cannot be released from the CPU by timed interrupts (it could just be a FCFS algorithm). You would also require no waiting queues for resources, because getting into a queue would release the resource. But then I also ask, can there be Deadlocks when there are queues?
A CPU scheduler can be implemented in any way you like. You could build one that uses FCFS and lets processes decide when to relinquish the CPU. Such an implementation would be neither practical nor reliable, though: the CPU is the single most important resource an operating system has, and letting a process hold it in such a way that it may never be preempted effectively makes that process the owner of the system. That contradicts the basic idea that the operating system should always be in control of the system.
As far as contemporary operating systems (Linux, Windows, etc.) are concerned, this will never happen because they don't allow such situations.

Does a context switch between processes invalidate the MMU (memory management unit)?

This is a sentence in the PowerPoint of my systems lecture, but I don't understand why a context switch invalidates the MMU. I know it will invalidate the cache, since the cache contains information from another process. However, the MMU just maps virtual memory to physical memory. If a context switch invalidates it, does this mean the MMU uses a different mapping mechanism for different processes?
Does this mean the MMU uses a different mapping mechanism for different processes?
Your conclusion is essentially right.
Each process has its own mapping from virtual to physical addresses (called a context).
The address 0x401000, for example, can be translated to 0x01234567 for process A and to 0x89abcdef for process B.
Having different contexts allows for easy isolation of the processes, easy on-demand paging and simplified relocation.
So each context switch must invalidate the TLB, or the CPU would continue using the old translations.
Some pages, however, are global, meaning that they have the same translation regardless of the current process address space.
For example, the kernel code is mapped in the same way for every process and thus doesn't need to be remapped.
So in the end only a part of the TLB is invalidated.
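As a rough sketch of that partial invalidation, here is a toy TLB that keeps entries flagged as global and drops the rest on a context switch. The `ToyTLB` class is hypothetical, and the page and frame numbers are assumptions (4 KiB pages, so virtual address 0x401000 is page 0x401); real hardware does this inside the MMU, not in software like this.

```python
class ToyTLB:
    """Toy translation cache: virtual page -> (physical frame, is_global)."""

    def __init__(self):
        self.entries = {}

    def insert(self, vpage, frame, is_global=False):
        self.entries[vpage] = (frame, is_global)

    def lookup(self, vpage):
        """Return the cached frame, or None on a TLB miss."""
        hit = self.entries.get(vpage)
        return hit[0] if hit else None

    def context_switch(self):
        """Drop per-process translations, keep the global ones."""
        self.entries = {v: e for v, e in self.entries.items() if e[1]}


tlb = ToyTLB()
tlb.insert(0x401, 0x01234)                    # user page of process A
tlb.insert(0xC0000, 0x00100, is_global=True)  # kernel page (assumed global)

tlb.context_switch()                          # switch from A to B

after_user = tlb.lookup(0x401)      # miss: B must not see A's translation
after_kernel = tlb.lookup(0xC0000)  # still cached: same for every process
```

After the switch, the lookup for page 0x401 misses, so process B's own page table is consulted, while the kernel page stays cached.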
You can read how Linux handles the process address space for a real example of applied theory.
What you are describing is entirely system specific.
First of all, what they are probably referring to is invalidating the MMU cache. That assumes the MMU has a cache (likely these days but not guaranteed).
When a context switch occurs, the processor has to put the MMU in a state where leftovers from the previous process cannot screw up the new process. If it did not, the cache would map the new process's logical pages to the old process's physical page frames.
For example, some processors use one page table for the system space and one or more other page tables for the user space. After a context switch, it would be ideal for the processor to invalidate any caching of the user space page tables but leave any caching of the system page table alone.
Note that in most processors all of this is done entirely behind the scenes. Even OS programmers do not need to deal with (or even be aware of) any flushing or invalidation of the MMU. There is a single switch process context instruction that handles everything. Other processors require the OS programmer to handle additional tasks as part of a context switch which, in some oddball processors, includes explicitly flushing the MMU cache.

operating system - context switches

I have been confused about context switches between processes, given a round-robin scheduler with a certain time slice (which is what Unix and Windows both use in a basic sense).
So, suppose we have 200 processes running on a single core machine. If the scheduler is using even 1ms time slice, each process would get its share every 200ms, which is probably not the case (imagine a Java high-frequency app, I would not assume it gets scheduled every 200ms to serve requests). Having said that, what am I missing in the picture?
Furthermore, Java and other languages allow you to put the running thread to sleep for e.g. 100ms. Am I correct in saying that this does not cause a context switch, and if so, how is this achieved?
So, suppose we have 200 processes running on a single core machine. If
the scheduler is using even 1ms time slice, each process would get its
share every 200ms, which is probably not the case (imagine a Java
high-frequency app, I would not assume it gets scheduled every 200ms
to serve requests). Having said that, what am I missing in the
picture?
No, you aren't missing anything. The same holds for non-preemptive systems. With preemption, processes that have a higher priority than others can easily displace the less important ones, to the extent that a high-priority process might run, say, 10 times as often as the lowest-priority process (the actual ratio depends entirely on the situation and the implementation), as long as this does not starve the lowest-priority process.
For processes of similar priority, it depends entirely on the round-robin algorithm you mentioned, though which process is picked first is again up to the implementation. Windows and Unix use broadly similar process scheduling algorithms: both use round-robin, whereas the Linux task scheduler is called the Completely Fair Scheduler (CFS).
Furthermore, Java and other languages allow you to put the running thread
to sleep for e.g. 100ms. Am I correct in saying that this does not
cause a context switch, and if so, how is this achieved?
Programming languages and libraries implement "sleep" functionality with the aid of the kernel. Without kernel-level support, they'd have to busy-wait, spinning in a tight loop, until the requested sleep duration elapsed. This would wastefully consume the processor.
For threads that are put to sleep (Thread.sleep(long millis)), most systems generally do the following:
Suspend execution of the process and mark it as not runnable.
Set a timer for the given wait time. Systems provide hardware timers that let the kernel register to receive an interrupt at a given point in the future.
When the timer hits, mark the process as runnable.
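The three steps above can be sketched as a small simulation. This is only a model under stated assumptions: `ToyKernel`, its simulated millisecond clock and the `advance` method (standing in for the hardware timer interrupt) are all hypothetical and do not correspond to any real kernel API.

```python
import heapq


class ToyKernel:
    """Toy model of kernel-assisted sleep with a simulated clock (in ms)."""

    def __init__(self, pids):
        self.now = 0
        self.runnable = set(pids)
        self.timers = []  # min-heap of (wake_time, pid)

    def sleep(self, pid, millis):
        # Step 1: suspend the process and mark it as not runnable.
        self.runnable.discard(pid)
        # Step 2: register a timer for the requested wake-up time.
        heapq.heappush(self.timers, (self.now + millis, pid))

    def advance(self, millis):
        """Stand-in for timer interrupts: move the clock, fire due timers."""
        self.now += millis
        while self.timers and self.timers[0][0] <= self.now:
            _, pid = heapq.heappop(self.timers)
            # Step 3: the timer fired, so mark the process runnable again.
            self.runnable.add(pid)


kernel = ToyKernel(pids=[1, 2])
kernel.sleep(1, 100)               # process 1 asks to sleep for 100 ms

kernel.advance(50)
halfway = sorted(kernel.runnable)  # only process 2 can be scheduled

kernel.advance(50)
woken = sorted(kernel.runnable)    # process 1 is runnable again
```

While process 1 is off the runnable set, the scheduler is free to give the CPU to process 2, which is why sleeping this way does not waste the processor the way a busy-wait would.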
I assume you are aware of threading models such as one-to-one, many-to-one, and many-to-many, so I won't go into much detail; this is just a pointer for your own reference.
It might appear to you as if this increases the overhead and complexity, but that is how threads (user threads created in the JVM) are handled, and how they are then scheduled depends on the threading models I mentioned above. Check this Quora question and its answers, and please go through the best answer, given by Robert Love.
For further reading, I'd suggest the Scheduling Algorithms explanation on OSDev.org and the book Operating System Concepts by Silberschatz, Galvin, and Gagne.

Interrupts execution context

I'm trying to figure out this basic scenario:
Suppose my CPU receives an exception or an interrupt. What I do know is that the CPU starts to perform an interrupt service routine (it looks at the IDTR register to locate the IDT, and goes to the appropriate entry to get the ISR address), but in what context is the code running?
Meaning, if I have a thread currently running and generating an interrupt of some sort, in which context will the ISR run: in the initial process that "holds" the thread, or in some other magical thread?
Thanks!
Interesting question, which raises a few different issues.
The first is that interrupts don’t actually run inside of any thread from the CPU’s perspective. Indeed, the CPU itself is barely aware of threads; it may know a bit more if it has hyper threading or some similar technology, but a thread is really an operating system thing (or, sometimes, an application thing).
The second is that ISRs (Interrupt Service Routines) generally run at some elevated privilege level; you don’t really say which processor family you’re talking about, so it’s difficult to be specific, but modern processors normally have at least one special mode that they enter for handling interrupts — often with its own register bank. One might also ask, as part of your question, whose page table is active during an interrupt?
Third is the question of whose memory map ISRs have when they are entered. The answer, again, is going to be highly processor specific; it’s possible to imagine architectures that disable paging on ISR entry, other architectures that switch automatically to an interrupt page table, and (probably the most common approach) those that decide not to bother doing anything about the page table when entering an ISR.
The fourth is that some operating systems have policies of their own on these kinds of things. A common approach on modern operating systems is to make ISRs themselves as short as possible, and where any significant work needs to be done, convert the interrupt into some kind of event that can be handled by a kernel thread (or even, potentially, by a user thread). In this kind of system, the code that actually handles an interrupt may well be running in a specific thread, though it probably isn’t actually an interrupt service routine at that point.
Summary:
ISRs themselves don’t really run in the context of any given thread.
ISRs may run with the page table of the interrupted thread (depends on architecture).
ISRs may start with a copy of that thread’s registers (depends on architecture).
In modern systems, ISRs commonly try to schedule an event and then exit quickly. That event might be handled by a specific thread (e.g. for processor exceptions, it’s usually delivered as a signal or Structured Exception or similar to the thread that caused it); or by a pool of threads (e.g. to service I/O in the kernel).
If you’re interested in the specifics for x86 (I guess you are, as you use some Intel specific terms in your question), you need to look at the Intel 64 and IA-32 Architectures Software Developer’s Manual, volume 3 (the System Programming Guide), and you’ll need to look at the operating system documentation. x86 is a very complicated architecture compared to some others. For instance, it can optionally perform a task switch on interrupt delivery (if you put a “task gate” in the IDT), in which case it will certainly have its own set of registers and quite possibly its own page table; even if this feature is used by a given operating system, there is no guarantee that x86 tasks map straightforwardly (or at all) to operating system processes and/or threads.

Context switch time - Role of RTOS and Processor

Does the RTOS or the processor play the major role in determining the context switch time? What share does each of these two major players have in determining it?
Can anyone answer this with respect to the uC/OS-II RTOS?
I would say both are significant, but it is not really as simple as that:
The actual context switch time is simply a matter of the number of instruction cycles required to perform the switch; like anything in software, it may be coded efficiently or it may not. On the other hand, all other things being equal, a processor with a large register set will require more instruction cycles to save the context; but having a large register set may make other code far more efficient.
A processor may also have an architecture that directly supports fast context switching. For example, the lowly 8-bit 8051 has four duplicate register banks, so a context switch is little more than a register bank switch (so long as you have no more than four threads), and given that Silicon Labs produce 8051-based devices at 100 MIPS, that could be very fast indeed!
More sophisticated processors and operating systems may use an MMU to provide thread memory protection; this is an additional context switch overhead, but with benefits that may outweigh it. Also, of course, such processors generally have high clock rates, which helps.
So all in all, the processor speed, the processor architecture, the quality of the RTOS implementation, and the functionality provided by the RTOS may all affect context switch time. But in the end the easiest way to improve switch time is almost certainly to increase the clock rate.
Although it is nice to have more headroom, if context switch time is a make-or-break issue for your project on any reputable RTOS, you should reconsider the suitability of either your hardware or your design. You should aim for a design that minimises context switches. For example, if an ADC conversion takes 6 µs and a context switch takes 20 µs, then you would do better to busy-wait than to use a conversion-complete interrupt; better yet, use DMA transfers to avoid context switches on single data items where possible.
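The ADC trade-off can be put into a back-of-the-envelope check. This is a simplified sketch, not a measurement method: the function name is made up, and it assumes that blocking on the interrupt costs two context switches (one away from the waiting task and one back), which will not hold for every RTOS or design.

```python
def cheaper_to_busy_wait(wait_us, switch_us, switches=2):
    """Return True if spinning for wait_us costs less CPU time than blocking.

    Assumption: blocking costs `switches` context switches of switch_us each
    (by default one switch away from the task and one back to it).
    """
    return wait_us < switches * switch_us


# Figures from the example: 6 us conversion, 20 us context switch.
adc_case = cheaper_to_busy_wait(wait_us=6, switch_us=20)

# A much longer wait (hypothetical 500 us) tips the balance the other way.
long_wait_case = cheaper_to_busy_wait(wait_us=500, switch_us=20)
```

With the figures from the example, spinning for 6 µs is cheaper than paying for the switches, whereas for a wait of several hundred microseconds blocking wins.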
The uC/OS-II RTOS is written in C, with some very specific sections (maybe in assembly) for the processor-specific handling. The context switching will be part of the sections that are very specific to the processor.
So the context switch time will be very dependent on the processor selected and the specific sections used to adapt uC/OS-II to that processor. I believe all the source code is available, so you should be able to see how much source is needed for a context switch. I also think uC/OS-II has callbacks that may allow you to add some performance-measuring code.
Just to add to what Clifford was saying, context switching time also depends on the conditions that trigger the context switch, so it mainly depends on the benchmark.
Depending on the RTOS implementation, in some cases it's possible to switch directly to the first waiting process bypassing the scheduler altogether.
This of course gives a huge boost in some benchmarks.
For example, we made some benchmarks that measure the overhead (in µs) required to deliver a signal and switch to the high-priority process, varying the kernel configuration and the target architecture:
http://www.bertos.org/discover/context-switch-overhead