Since Process in main memory is stored in the form of stack, heap, data section and static data section. Where does the PCB lies in here? Or is it stored independent from this?
Is it stored on the bottom of the stack of the process or is it independent of the memory representation of process
PCB (Process Control Blocks) which contains all the info about a process are usually stored in a specially reserved memory as part of the kernel space.
The kernel space, part of logical space in RAM, is the core of an Operating System which has full access to the underlying hardware.
Related
The Operating System notes from my university reads :
The PCB is created when a process is born via fork, and is reclaimed
when a process is terminated. While systems calls such as exec rewrite
the memory image of the process, the PCB (and the entities pointed by
it, like the kernel stack) largely remain intact during exec or any
other system call, except for slight modifications (like changing the
page tables to point to the new memory image).
But during fork system call, the memory image from the parent is wiped and a new memory image is initialized to the child process. Hence the PCB located in the kernel stack of the memory image is also wiped and hence a completely new PCB is re-written to the process is my understanding.
What concept have I understood wrong?
The process control block is located in the Kernel space in the RAM. The kernel space also has the Paging table. When he exec system command is called, the memory image of the process is wiped nd new memory image is written for the process without affecting the process control block in the kernel space for that process, but the paging table that maps the logical address and virtual address of the processes has to be changes since the memory image is changed.Hence in my knowledge,the PCB is not re-written.
Are they removed from memory or they are maintained in memory? If they are in memory, what is the purpose?
The PCB (process context block) holds the register state for the process (thread). It is one of many data structures an operating system maintains for a process. The PCB serves no purpose once the process is terminated so it is typically freed or reused by another process (or thread). Most other data structures are are freed or reused as well.
If we talk about Address Space of a process it is the virtual address range which includes static data, stack and heap memory for that particular process. And coming to Process Control Block (PCB) which is a data structure maintained by operating system for each process it manages, where PCB includes a lot of information about the process like process no., process state, program counter, list of open files, cpu scheduling info ...and more.
Now this is the point where I got confused that Address Space is also a memory which stores information about a process and similar thing is done by PCB too. Then how these are connected to each other. I am not able to visualize this in my mind. Why we have these two things existing simultaneously. Isn't it possible to achieve our goal just by using PCB?
Process Address space refer to memory regions the process is using. It typically consists of heap, stack, initialized data, uninitialized data and text. There are mainly two address spaces of a process -- logical and physical.
The PCB is a structure resides in kernel to track the state of process. One of the things PCB contain is memory information. In a typically system, PCB may contain information about pages the process has.
To answer your question, Process Address space is an idea built on top of PCB and many other things (such as page table).
what is the exact difference between process control block and process descriptor?.
I was reading about kernel of linux. It was written that there is some thread_info structure which contains the pointer to actual process descriptor table. It was written that the thread_info lies just above/below of kernel stack. So definitely thread_info is in main memory. But what about actual process descriptor task_struct? where is it located? If process descriptor resides in main memory, where is the actual place for it ?
The thread_info and task_struct structures are just two different structures that hold different pieces of information about a thread, with the thread_info holding more architecture-specific data than the task_struct. It makes more sense to split up the information rather than keep it all in the same structure. (Although you could put them in the same struct; the 2.4 Linux kernel did this.)
How those structs are allocated depends on the architecture you're using. The relevant functions you want to examine are alloc_task_struct() and alloc_thread_info().
In the kernel, the process descriptor is a structure called task_struct, which keeps track of process attributes and information. All kernel information regarding a process is found there.
In a typical C program, the linux kernel provides 84K - ~100K of memory. How does the kernel allocate more memory for the stack when the process uses the given memory.
IMO when the process takes up all the memory of the stack and now uses the next contiguous memory, ideally it should page fault and then the kernel handles the page fault.
Is it here that the kernel provides more memory to the stack for the given process, and which data structure in linux kernel identifies the size of the stack for the process??
There are a number of different methods used, depending on the OS (linux realtime vs. normal) and the language runtime system underneath:
1) dynamic, by page fault
typically preallocate a few real pages to higher addresses and assign the initial sp to that. The stack grows downward, the heap grows upward. If a page fault happens somewhat below the stack bottom, the missing intermediate pages are allocated and mapped. Effectively increasing the stack from the top towards the bottom automatically. There is typically a maximum up to which such automatic allocation is performed, which can or can not be specified in the environment (ulimit), exe-header, or dynamically adjusted by the program via a system call (rlimit). Especially this adjustability varies heavily between different OSes. There is also typically a limit to "how far away" from the stack bottom a page fault is considered to be ok and an automatic grow to happen. Notice that not all systems' stack grows downward: under HPUX it (used?) to grow upward so I am not sure what a linux on the PA-Risc does (can someone comment on this).
2) fixed size
other OSes (and especially in embedded and mobile environments) either have fixed sizes by definition, or specified in the exe header, or specified when a program/thread is created. Especially in embedded real time controllers, this is often a configuration parameter, and individual control tasks get fix stacks (to avoid runaway threads taking the memory of higher prio control tasks). Of course also in this case, the memory might be allocated only virtually, untill really needed.
3) pagewise, spaghetti and similar
such mechanisms tend to be forgotten, but are still in use in some run time systems (I know of Lisp/Scheme and Smalltalk systems). These allocate and increase the stack dynamically as-required. However, not as a single contigious segment, but instead as a linked chain of multi-page chunks. It requires different function entry/exit code to be generated by the compiler(s), in order to handle segment boundaries. Therefore such schemes are typically implemented by a language support system and not the OS itself (used to be earlier times - sigh). The reason is that when you have many (say 1000s of) threads in an interactive environment, preallocating say 1Mb would simply fill your virtual address space and you could not support a system where the thread needs of an individual thread is unknown before (which is typically the case in a dynamic environment, where the use might enter eval-code into a separate workspace). So dynamic allocation as in scheme 1 above is not possible, because there are would be other threads with their own stacks in the way. The stack is made up of smaller segments (say 8-64k) which are allocated and deallocated from a pool and linked into a chain of stack segments. Such a scheme may also be requried for high performance support of things like continuations, coroutines etc.
Modern unixes/linuxes and (I guess, but not 100% certain) windows use scheme 1) for the main thread of your exe, and 2) for additional (p-)threads, which need a fix stack size given by the thread creator initially. Most embedded systems and controllers use fixed (but configurable) preallocation (even physically preallocated in many cases).
edit: typo
The stack for a given process has a limited, fixed size. The reason you can't add more memory as you (theoretically) describe is because the stack must be contiguous, and it grows toward the heap. So, when the stack reaches the heap, no extension is possible.
The stack size for a userland program is not determined by the kernel. The kernel stack size is a configuration option for the kernel (usually 4k or 8k).
Edit: if you already know this, and were merely talking about the allocation of physical pages for a process, then you have the procedure down already. But there's no need to keep track of the "stack size" like this: the virtual pages in the stack with no pagetable entries are just normal overcommitted virtual pages. Physical memory will be granted on their first access. But the kernel does not have to overcommit memory, and thus a stack will probably have complete physical realization when the executable is first loaded.
The stack can only be used up to a certain length, because it has a fixed storage capacity in memory. If your question asks in what direction does the stack being used up? the answer is downwards. It is filled down in memory towards the heap. The heap is a dynamic component of memory by which it can actually grow from the bottom up, based on your need of data storage.