Process states in an operating system and resource utilization

What is the difference between sleeping, waiting, and suspending a process in an OS? Do any of these states consume resources or waste CPU cycles?

In all three cases, the process is not runnable, so it does not consume CPU. The process is not returned to the runnable state until some event happens. The difference is what that event is:
Sleep: This can describe two different things. Either the process becomes runnable after a certain (fixed) period of time elapses, or it becomes runnable after the device itself wakes up from a power-saving mode.
Wait: The process becomes runnable after something finishes. That something is usually an I/O operation (disk, network) completing.
Suspend: Either the OS or another process takes the process out of the run state. This can overlap with "Sleeping" above.
Processes in all three states don't consume CPU time, but they do consume memory unless the process is entirely paged out. And processes in the wait state may be consuming I/O resources.
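To make the three states concrete, here is a minimal C sketch (my own illustration, assuming a POSIX-like system; it is not part of the answer above) that puts a process into each of them in turn: a timed sleep, a blocking read that waits for I/O, and a self-delivered SIGSTOP that suspends the process until some other process sends SIGCONT.

    #include <signal.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void) {
        /* Sleep: not runnable until a fixed period of time elapses. */
        sleep(2);

        /* Wait: not runnable until an I/O operation completes
           (here, a blocking read from standard input). */
        char buf[64];
        ssize_t n = read(STDIN_FILENO, buf, sizeof buf);
        if (n > 0)
            printf("read %zd bytes\n", n);

        /* Suspend: the process is stopped here until some other process
           (or the shell) sends it SIGCONT. */
        raise(SIGSTOP);

        printf("resumed after SIGCONT\n");
        return 0;
    }

None of these calls burns CPU while the process is not runnable; the kernel simply does not schedule the process until the timer expires, the input arrives, or the SIGCONT is delivered.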

Related

How to reduce the time taken for threads to reach a safepoint (Sync state)

About the Issue:
During heavy I/O in the VM, we faced JVM pauses/slowness because stopping threads took too long. Looking at the safepoint logs showed that the Sync state takes most of the time.
We also tried printing safepoint traces on a timeout delay (-XX:+SafepointTimeout -XX:SafepointTimeoutDelay=200) to find out which thread is causing this issue, but nothing seems suspicious. Also, when setting a timeout for safepoints, we do not get the 'timeout detected' print when the time is spent in the 'Sync' state.
Questions about this safepoint tracing:
How does the safepoint timeout work?
After logging the thread details, does the safepoint end and do all threads resume?
Will that VM operation still be carried out? What will happen if the vmop is a GC?
Using Async-profiler:
We tried time-to-safepoint profiling using async-profiler and noticed that the VM Thread spends most of its time in the SafepointSynchronize::begin() method, and that the C2 compiler threads take almost as much time as the VM Thread.
We suspect that the C2 compiler threads may be taking a long time to reach the safepoint. Can someone help us resolve this issue and interpret this time-to-safepoint flame graph? Thanks in advance.
The SafepointTimeout option affects nothing but logging: threads will not be interrupted, the VM operation will run normally, etc.
SafepointTimeout does not always print timed-out threads: a thread may already have reached the safepoint by the time printing occurs. Furthermore, SafepointTimeout may not even detect a timeout if the entire process has been frozen by the operating system.
For example, such 'freezes' may happen
when a process has exhausted its CPU quota in a cgroup (container);
when a system is low on physical memory, and direct reclaim occurs;
due to activity of another process (e.g. I observed long JVM pauses when atop utility inspected the system).
async-profiler indeed has a time-to-safepoint profiling option (--ttsp), though using it correctly may seem tricky. It works best in wall-clock profiling mode with JFR output. In this configuration, async-profiler will sample all threads (both running and blocked) during safepoint synchronization, and record each individual event with a timestamp.
Such a profile can then be analyzed with JDK Mission Control: choose the time interval around the long pause and look at the stack traces of the Java threads in that interval.
Note that if the JVM process is 'frozen', the async-profiler thread does not work either, i.e. you will not see collected samples during this period. Normally, in wall-clock profiling mode, all threads are sampled evenly. So if you see a 'gap' (missed events during some time interval), it apparently means the JVM process did not receive CPU time. In that case, the reason for the JVM pauses is not in the Java application, but rather in the operating system / environment.

Operating System Basics

I am reading about process management, and I have a few doubts:
What is meant by an I/O request? For example, a process is executing and hence is in the running state; it is in the waiting state if it is waiting for the completion of an I/O request. I don't understand what is meant by an I/O request. Can you please give an example to elaborate?
Another doubt: let's say a process is executing and suddenly an interrupt occurs, so the process stops its execution and is put in the ready state. Is it possible that some other process begins its execution while the interrupt is being processed?
Regarding the first question:
A simple way to think about it...
Your computer has lots of components: CPU, hard drive, network card, sound card, GPU, etc. All of those work in parallel and independently of each other. They are also generally slower than the CPU.
This means that whenever a process makes a call that, down the line (on the OS side), ends up communicating with an external device, there is no point in the OS sitting stuck waiting for the result, since the time that operation takes to complete is probably an eternity from the CPU's point of view.
So the OS fires up whatever communication the process requested (call it an I/O request), flags the process as waiting for I/O, and switches execution to another process so the CPU can do something useful instead of sitting around blocked waiting for the I/O request to complete.
When the external device finishes whatever operation was requested, it generates an interrupt, so the OS is informed the work is done and can then flag the blocked process as ready again.
This is all a very simplified view of course, but that's the main idea. It allows the CPU to do useful work instead of waiting for IO requests to complete.
Regarding the second question:
It's tricky, even for single-CPU machines, and depends on how the OS handles interrupts.
For code simplicity, a simple OS might, for example, process each interrupt in one go whenever it happens, then resume whichever process it decides is appropriate once the interrupt handling is done. In that case, no other process would run until the interrupt handling is complete.
In practice, things get a bit more complicated for performance and latency reasons.
If you think of an interrupt's lifetime as just another task for the CPU (from when the interrupt arrives to the point the OS considers its handling complete), you can effectively code the interrupt handling to run in parallel with other things.
Just think of the interrupt as a notification for the OS to start another task (the interrupt handling). It grabs whatever context it needs at the point the interrupt arrived, then keeps processing that task in parallel with other processes.
An I/O request generally just means a request to do input, output, or both. The exact meaning varies depending on your context: HTTP, networking, console operations, or maybe some process in the CPU.
A process waiting for I/O: say, for example, you were writing a program in C to accept the user's name on the command line and then print 'Hello User' back. Your code goes into the waiting state until the user enters their name and hits Enter. This is a higher-level example, but even a very low-level process executing on your computer's processor works on the same basic principle; a minimal sketch follows.
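Here is roughly what that C example might look like (a minimal sketch of the idea, not a definitive implementation): the process sits in the waiting state inside the input call until the user presses Enter.

    #include <stdio.h>
    #include <string.h>

    int main(void) {
        char name[64];

        printf("Enter your name: ");
        fflush(stdout);

        /* The process is in the waiting state inside this call until the
           terminal delivers a line of input (the user hits Enter). */
        if (fgets(name, sizeof name, stdin) == NULL)
            return 1;
        name[strcspn(name, "\n")] = '\0';   /* strip the trailing newline */

        printf("Hello %s\n", name);
        return 0;
    }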
Can the processor work on other processes when the current one is interrupted and waiting on something? Yes! You'd better hope it does. That's what scheduling algorithms and stacks are for. The real answer, however, depends on what architecture you are on, whether it supports parallel or serial processing, and so on.

How does the operating system handle its responsibilities while a process is executing?

I have had this question in mind for a long time, and it may sound a little vacuous. We know that the operating system is responsible for handling memory allocation, process management, etc. The CPU can perform only one task at a time (assume it is single-core). Suppose the operating system has allocated a CPU cycle to some user-initiated process and the CPU is executing it. Where is the operating system running at that moment? If some other process is using the CPU, then is the operating system not running at that moment, since the OS itself needs the CPU to run? And if the OS is not running, then who is handling process management, device management, etc. for that period?
The question is mixing up who's in control of the memory and who's in control of the CPU. The wording “running” is imprecise: on a single CPU, a single task is running at any given time in the sense that the processor is executing its instructions; but many tasks are executing in the sense that their state is stored in memory and their execution can resume at any time.
While a process is executing on the CPU, the kernel is not executing. Its state is saved in memory. The execution of the kernel can resume:
if the process code makes a jump into kernel code — this is called a system call.
if an interrupt occurs.
If the operating system provides preemptive multitasking, it will schedule an interrupt to happen after an interval of time (called a time slice). On a non-preemptive operating system, the process will run forever if it doesn't yield the CPU. See What mechanisms prevent a process from taking over the processor forever? for an explanation of how preemption works.
Tasks such as process management and device management are triggered by some event. If the event is a request by the process, the request will take the form of a system call, which executes kernel code. If the event is triggered from hardware, it will take the form of an interrupt, which executes kernel code.
(Note: in this answer, I use “CPU” and “processor” synonymously, to mean a single execution thread: a single core, or whatever the hardware architecture is.)
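As a small, hedged illustration of the system-call path described above (my own sketch, assuming Linux and glibc), the following C program enters the kernel explicitly via syscall(); each call is a point where the process stops executing user code and kernel code runs on its behalf.

    #define _GNU_SOURCE
    #include <sys/syscall.h>
    #include <unistd.h>

    int main(void) {
        const char msg[] = "hello from user space\n";

        /* Each syscall() below traps into the kernel: the process stops
           executing user code and kernel code runs on its behalf. */
        syscall(SYS_write, STDOUT_FILENO, msg, sizeof msg - 1);
        long pid = syscall(SYS_getpid);

        return pid > 0 ? 0 : 1;
    }

Running the program under strace shows these calls (among the usual startup and exit ones) at exactly the moments kernel code resumes on behalf of the process.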
The OS kernel does nothing at all until it is entered via an interrupt. It may be entered because of a hardware interrupt that causes a driver to run and the driver chooses to exit via the OS, or a running thread may make a syscall interrupt.
Unless an interrupt occurs, the OS kernel does nothing at all. It does not need to do anything.
Edit:
DMA is (usually) used for bulk I/O and is handled by a hardware subsystem that services requests issued by a system call (software interrupt). When a DMA operation is complete, the DMA hardware raises a hardware interrupt, which runs a driver that can further signal the OS of the completion, possibly changing the set of running threads; so DMA is managed by interrupts.
A new process/thread can only be loaded by an existing thread that has issued a system call (software interrupt), and so new processes are initiated by interrupts (see the sketch below).
It's interrupts, all the way down :)
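To make that concrete, here is a minimal C sketch (my own example, assuming a POSIX system): a new process only comes into existence because an existing thread issues system calls such as fork() and execve() (here via execlp()).

    #include <stdio.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int main(void) {
        pid_t pid = fork();              /* system call: the kernel creates a new process */
        if (pid < 0) {
            perror("fork");
            return 1;
        }
        if (pid == 0) {
            /* Child: replace ourselves with another program (another system call). */
            execlp("echo", "echo", "started by fork/exec", (char *) NULL);
            perror("execlp");
            return 127;
        }
        /* Parent: block (wait state) until the child exits. */
        waitpid(pid, NULL, 0);
        return 0;
    }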
It depends on which type of CPU scheduling you are using (in the single-core case):
With preemptive scheduling, the running process can be interrupted for some duration so the CPU can be used by another process or by the OS; with non-preemptive scheduling, a process will not yield the CPU before completing its execution.
On a single core, if there is only one process it simply executes its instructions; if there are multiple processes, their states are stored in PCBs (process control blocks), which form the process queue, and they execute one after another if no interrupts occur.
The PCB is used for any process management.
When a process initializes, it calls library functions that issue system calls; kernel execution is also invoked if some process fails during execution or an interrupt occurs.

OS Concepts Terminology

I'm doing some fill in the blanks from a sample exam for my class and I was hoping you could double check my terminology.
1. The various scheduling queues used by the operating system would consist of lists of processes.
2. Interrupt handling is the technique of periodically checking to see if a condition (such as completion of some requested I/O operation) has been met.
3. When the CPU is in kernel mode, a running program has access to a restricted set of CPU functionality.
4. The job of the CPU scheduler is to select a process on the ready queue and change its state.
5. The CPU normally supports a vector of interrupts so the OS can respond appropriately when some event of interest occurs in the hardware.
6. Using traps, a device controller can use idle time on the bus to read from or write to main memory.
7. During a context switch, the state of one process is copied from the CPU and saved, and the state of a different process is restored.
8. An operating system consists of a kernel and a collection of application programs that run as user processes and either provide OS services to the user or work in the background to keep the computer running smoothly.
There are so many terms from our chapters, I am not quite sure if I am using the correct ones.
My thoughts:
1. Processes and/or threads. Jobs and tasks aren't unheard of either. There can be other things. E.g. in MS Windows there are also Deferred Procedure Calls (DPCs) that can be queued.
2. This must be polling.
4. Why CPU scheduler? Why not just scheduler?
6. I'm not sure about traps in the hardware/bus context.

What is the difference between a thread/process/task?

Process:
A process is an instance of a computer program that is being executed.
It contains the program code and its current activity.
Depending on the operating system (OS), a process may be made up of multiple threads of execution that execute instructions concurrently.
Process-based multitasking enables you to run the Java compiler at the same time that you are using a text editor.
When employing multiple processes with a single CPU, context switching between the various memory contexts is used.
Each process has a complete set of its own variables.
Thread:
A thread is a basic unit of CPU utilization, consisting of a program counter, a stack, and a set of registers.
A thread of execution results from a fork of a computer program into two or more concurrently running tasks.
The implementation of threads and processes differs from one operating system to another, but in most cases, a thread is contained inside a process. Multiple threads can exist within the same process and share resources such as memory, while different processes do not share these resources.
An example of threads in the same process is automatic spell checking and automatic saving of a file while you are writing.
Threads are basically processes that run in the same memory context.
Threads may share the same data while execution.
[Thread diagram: a single-threaded process vs. a multi-threaded process]
Task:
A task is a set of program instructions that are loaded in memory.
Short answer:
A thread is a scheduling concept; it's what the CPU actually 'runs' (you don't run a process). A process needs at least one thread that the CPU/OS executes.
A process is a data-organization concept. Resources (e.g. memory for holding state, the allowed address space, etc.) are allocated to a process.
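A small C sketch of that distinction (my own illustration, assuming POSIX threads; compile with cc -pthread): a forked child gets its own copy of the data, while a thread created in the same process sees and modifies the shared variable.

    #include <pthread.h>
    #include <stdio.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int counter = 0;                     /* lives in the process's address space */

    void *thread_body(void *arg) {
        (void) arg;
        counter += 1;                    /* same address space: visible to main() */
        return NULL;
    }

    int main(void) {
        pid_t pid = fork();
        if (pid == 0) {                  /* child process: gets its own copy of memory */
            counter += 1;
            _exit(0);
        }
        waitpid(pid, NULL, 0);
        printf("after child process: counter = %d\n", counter);   /* still 0 */

        pthread_t t;
        pthread_create(&t, NULL, thread_body, NULL);
        pthread_join(t, NULL);
        printf("after thread:        counter = %d\n", counter);   /* now 1 */
        return 0;
    }

The child's increment is invisible to the parent because fork() gives the child a separate copy of memory; the thread's increment is visible because threads share the process's address space.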
To explain in simpler terms:
Process: a process is a set of instructions (code) that operates on related data, and a process has its own various states: sleeping, running, stopped, etc. When a program gets loaded into memory it becomes a process. Each process has at least one thread when the CPU is allocated; with exactly one it is called a single-threaded program.
Thread: a thread is a portion of a process; more than one thread can exist as part of a process. Each thread has its own program area and memory area (its own stack and registers). Multiple threads inside one process cannot safely access each other's data without coordination; the process has to handle synchronization of its threads to achieve the desired behaviour.
Task: 'task' is not a widely used concept. When program instructions are loaded into memory, people call it either a process or a task; task and process are synonyms nowadays.
A process invokes or initiates a program. It is an instance of a program; there can be multiple instances running the same application. A thread is the smallest unit of execution and lies within a process; a process can have multiple threads running. The execution of a thread results in a task; hence, in a multithreading environment, multitasking takes place.
A program in execution is known as a process. A program can have any number of processes. Every process has its own address space.
Threads use the address space of their process. The difference between a thread and a process is that when the CPU switches from one process to another, the current information needs to be saved in the process descriptor and the information of the new process loaded; switching from one thread to another is simple.
A task is simply a set of instructions loaded into memory. Threads can split themselves into two or more simultaneously running tasks.
For more understanding, refer to this link: http://www.careerride.com/os-thread-process-and-task.aspx
Wikipedia sums it up quite nicely:
Threads compared with processes
Threads differ from traditional multitasking operating system processes in that:
processes are typically independent, while threads exist as subsets of a process
processes carry considerable state information, whereas multiple threads within a process share state as well as memory and other resources
processes have separate address spaces, whereas threads share their address space
processes interact only through system-provided inter-process communication mechanisms
context switching between threads in the same process is typically faster than context switching between processes.
Systems like Windows NT and OS/2 are said to have "cheap" threads and "expensive" processes; in other operating systems there is not so great a difference except the cost of address space switch which implies a TLB flush.
Task and process are used synonymously.
From the wiki, a clear explanation:
1:1 (Kernel-level threading)
Threads created by the user are in 1-1 correspondence with schedulable entities in the kernel.[3] This is the simplest possible threading implementation. Win32 used this approach from the start. On Linux, the usual C library implements this approach (via the NPTL or older LinuxThreads). The same approach is used by Solaris, NetBSD and FreeBSD.
N:1 (User-level threading)
An N:1 model implies that all application-level threads map to a single kernel-level scheduled entity;[3] the kernel has no knowledge of the application threads. With this approach, context switching can be done very quickly and, in addition, it can be implemented even on simple kernels which do not support threading. One of the major drawbacks however is that it cannot benefit from the hardware acceleration on multi-threaded processors or multi-processor computers: there is never more than one thread being scheduled at the same time.[3] For example: If one of the threads needs to execute an I/O request, the whole process is blocked and the threading advantage cannot be utilized. The GNU Portable Threads uses User-level threading, as does State Threads.
M:N (Hybrid threading)
M:N maps some M number of application threads onto some N number of kernel entities,[3] or "virtual processors." This is a compromise between kernel-level ("1:1") and user-level ("N:1") threading. In general, "M:N" threading systems are more complex to implement than either kernel or user threads, because changes to both kernel and user-space code are required. In the M:N implementation, the threading library is responsible for scheduling user threads on the available schedulable entities; this makes context switching of threads very fast, as it avoids system calls. However, this increases complexity and the likelihood of priority inversion, as well as suboptimal scheduling without extensive (and expensive) coordination between the userland scheduler and the kernel scheduler.
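As a hedged illustration of the 1:1 model described above (my own sketch, assuming Linux with NPTL; compile with cc -pthread), each pthread created by the program is backed by its own kernel thread, which is why SYS_gettid returns a different ID in every thread and why each one appears as a separate entry under /proc/<pid>/task.

    #define _GNU_SOURCE
    #include <pthread.h>
    #include <stdio.h>
    #include <sys/syscall.h>
    #include <unistd.h>

    static void *show_tid(void *arg) {
        (void) arg;
        /* Each user-level pthread is backed by its own kernel task,
           so SYS_gettid returns a different ID in every thread. */
        printf("pthread in pid %d has kernel tid %ld\n",
               getpid(), (long) syscall(SYS_gettid));
        return NULL;
    }

    int main(void) {
        pthread_t a, b;
        pthread_create(&a, NULL, show_tid, NULL);
        pthread_create(&b, NULL, show_tid, NULL);
        pthread_join(a, NULL);
        pthread_join(b, NULL);
        printf("main thread: pid %d, kernel tid %ld\n",
               getpid(), (long) syscall(SYS_gettid));
        return 0;
    }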