What is a crashloop?

I'm reading Google's Site Reliability Engineering book and ran across the word "crashloop", which I've never heard before and have not been able to find a definition for:
"If a task tries to use more resources than it requested, Borg kills the task and restarts it (as a slowly crashlooping task is usually preferable to a task that hasn’t been restarted at all)."
What is a crashloop and how does it compare to an infinite loop if at all?

A crashloop is when a process crashes and is restarted by a watchdog daemon, indefinitely.
That is, the history is:
Process starts at time T.
Process crashes at time T+1.
Watchdog daemon restarts process.
Process starts at time T+2.
Process crashes at time T+3.
Watchdog daemon restarts process.
Process starts...etc.
Here, the watchdog daemon is Borg, and the process is encapsulated in a task.
In general, in distributed computing if you want something to eventually succeed, you have to write down your intent for it to be completed and you need a worker to loop continually to act on this intent. This is "at least once delivery" of a work item.
Here, the intent is that the task runs (written down into Borg), and Borg itself is running the loop that is constantly trying to make sure the task runs. This is why when a task crashes, it is restarted. When a task crashes repeatedly, together you end up with a crashloop.
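To make the loop concrete, here is a minimal Python sketch of a watchdog that keeps restarting a crashing task. It is not how Borg is implemented. The names `supervise`, `run_task`, `max_restarts`, `base_delay` and `max_delay` are made up for illustration, and the exponential backoff is an assumption about what makes the crashloop "slow".

```python
import time


def supervise(run_task, max_restarts, base_delay=1.0, max_delay=60.0, sleep=time.sleep):
    """Keep restarting run_task until it succeeds or max_restarts is used up.

    Returns the list of delays (in seconds) waited before each restart.
    """
    delays = []
    while True:
        try:
            run_task()
            return delays  # the task ran to completion, so the intent is satisfied
        except Exception:
            if len(delays) >= max_restarts:
                raise  # this sketch gives up; a real watchdog may retry forever
            # Assumed policy: wait longer after each crash so the loop stays slow.
            delay = min(base_delay * 2 ** len(delays), max_delay)
            delays.append(delay)
            sleep(delay)
```

A task that crashes on every start will go around this loop until the restart budget runs out, which is the crashloop. Passing a no-op `sleep` makes it easy to try out without actually waiting.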

Related

Process of State

I learned that when an interrupt occurs, the process goes to the ready queue rather than going through the blocked queue. However, in this picture, the interrupted process has moved to the blocked queue (the pink circle). I'm confused about which case goes to the ready queue and which goes to the blocked queue.
Process management in general is much more complex than this. A task is often tied to one specific processor core. Several tasks are tied to the same processor core and each of these tasks can be blocked waiting for IO. It means that any task can be interrupted at any time by an interrupt triggered by a device controller even if the task currently running on the core had nothing to do with that specific interrupt.
The diagram is thus incomplete. It doesn't take into account the complete process lifecycle. In your diagram, the process goes on the blocked queue if it is waiting for IO (after a syscall like read()). It goes to the ready queue if it was preempted by the kernel so that another process can have some time on that core.
I think people often have the misconception that each process will run all the time until completion. It cannot be that way, otherwise most processes would never get time on any core. Instead, if the number of processes is higher than the number of cores, the kernel uses the per-core local APIC's timer (local APIC is on x86-64, but you will have similar mechanisms on every architecture) to give every process tied to that core a time slice. When a certain process is scheduled on a certain core, the kernel starts the timer with its time slice. When the time slice has elapsed, the local APIC triggers an interrupt letting the kernel know that another process should be scheduled on that core. This is why a process can be preempted in the middle of its execution. The process is still considered ready to run. It is simply that its time slice was exhausted, so the kernel decides to give some time to another process. The preempted process will be given some more time later. Since, in human terms, the time slice of each process is very short, it gives the impression that each process is running continuously without interruption when that is not really the case. (By the way, this diagram is very Linux kernel specific.)
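The difference between the two queues can be sketched with a toy single-core simulation in Python. This is only an illustration, not how a real kernel works: the function `simulate`, its event names, and the crude rule that every pending I/O completes before the next dispatch are all assumptions made for the example.

```python
from collections import deque


def simulate(processes, time_slice):
    """Trace state transitions on one core.

    processes maps a name to a non-empty list of positive CPU burst lengths.
    Between two bursts the process waits for I/O in the blocked queue.
    Returns a list of (name, event) tuples in the order they happen.
    """
    ready = deque((name, list(bursts)) for name, bursts in processes.items())
    blocked = deque()
    trace = []
    while ready or blocked:
        # Assumed I/O model: every pending I/O completes before the next dispatch.
        while blocked:
            ready.append(blocked.popleft())
        name, bursts = ready.popleft()
        ran = min(time_slice, bursts[0])
        bursts[0] -= ran
        if bursts[0] > 0:
            # Time slice expired mid-burst: still runnable, back to the ready queue.
            trace.append((name, "preempted"))
            ready.append((name, bursts))
            continue
        bursts.pop(0)
        if bursts:
            # Burst finished and more work follows: the process waits for I/O.
            trace.append((name, "blocked"))
            blocked.append((name, bursts))
        else:
            trace.append((name, "finished"))
    return trace
```

With `simulate({"A": [3], "B": [1, 1]}, time_slice=2)`, process A is preempted because its slice runs out, while B leaves the core voluntarily to wait for I/O.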

Listen or wait for a specific time without using timer

Is there a way to listen or wait for a specific time (e.g. 11:30 am) every day. The only way I know how is to set a timer that checks for the current time every 60 seconds which I have actually implemented using a backgroundworker. But is there a way to just wait and listen for the specified time (similar to monitoring for directory changes) and then take some action?
Thanks in advance.
Typically, rather than having a program resident in memory waiting, you would set up a Scheduled Task for this (or a cron job on Linux). The scheduled task will run the program at the appropriate time. The program can still check (validate) the expected time if needed, but it shouldn't just always sit in the background using up resources if it's only going to run once per day.
The scheduled task is also better because it will recover automatically from computer reboots, crashes, etc. If something happens that interrupts your program's normal running, the scheduled task will still be able to run.
This is especially important in the .NET world, because .NET requires you to be very careful when writing long-lived programs to avoid address space fragmentation. The .NET garbage collector is good at freeing up and returning old memory to the operating system, but over time your program's virtual address space can become fragmented and eventually you will not be able to allocate new memory any longer.
Even if this is part of a larger program, where there are also other things happening based on user interactions, it's still a good idea to split this off into a separate process.
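As a rough illustration of the "check (validate) the expected time" step, a program started by the scheduler could do something like the following. This is a Python sketch rather than .NET code, and the name `is_expected_run_time`, the 11:30 default and the five-minute tolerance are assumptions chosen for the example. It uses naive local datetimes and does not handle a window that crosses midnight.

```python
from datetime import datetime, time, timedelta


def is_expected_run_time(now, expected=time(11, 30), tolerance=timedelta(minutes=5)):
    """Return True if now is within tolerance of today's expected run time."""
    target = datetime.combine(now.date(), expected)
    return abs(now - target) <= tolerance


if __name__ == "__main__":
    if is_expected_run_time(datetime.now()):
        print("Running the daily job")
    else:
        print("Started outside the expected window, exiting")
```

The scheduler stays responsible for starting the program. The check only guards against the program being launched by hand or at the wrong time.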

Can a process terminate after I/O without returning to the CPU?

I have a question about the following diagram from Operating Systems Concepts: http://unboltingbinary.in/wp-content/uploads/2015/04/image028.jpg
This diagram seems to imply that after every I/O operation, the process is placed back on the ready queue before being sent to the CPU again. However, is it possible for a process to terminate after I/O but before being sent to the ready queue?
Suppose we have a program that computes a number and then writes it to storage. In this case, does the process really need to return to the CPU after the I/O operation? It seems to me that the process should be allowed to terminate right after I/O. That way, there would be no need for a context switch.
Once one process has successfully executed a termination request on another, the threads of the terminated process should never run again, no matter what state they were in - blocked on I/O, blocked on inter-thread comms, running on a core, sleeping, whatever - they all must be stopped immediately if running and all be put in a state where they will never run again.
Anything else would be a security issue: terminated threads should not be given execution at all (otherwise it might not be possible to terminate the process).
Process termination requires the CPU. Changes to kernel-mode structures on process exit, returning memory resources, etc. all require the CPU.
A process does not simply evaporate. The term you want here is process rundown, I think.

How does the processor get to know to switch to a high-priority process?

I read that the process scheduler will replace the process that is currently being executed by the CPU with a high-priority process. At any point, only one process is executed by the processor. In that case, where is the scheduler running so that it can notify the CPU about the high-priority process while the CPU is busy executing a low-priority process?
The process scheduler is the component of the operating system that is responsible for deciding whether the currently running process should continue running and, if not, which process should run next.
To help the scheduler monitor processes and the amount of CPU time that they use, a programmable interval timer interrupts the processor periodically (typically 50 or 60 times per second). This timer is programmed when the operating system initializes itself. At each interrupt, the operating system’s scheduler gets to run and decide whether the currently running process should be allowed to continue running or whether it should be suspended and another ready process allowed to run. This is the mechanism used for preemptive scheduling.
So, basically, the process scheduler resides in the same main memory, but it is only activated when it is invoked by an interrupt. Hence, it is not running all the time.
BTW, that was a great conceptual question to answer. Best wishes for your topic.
The higher-priority thread/process will preempt the lower-priority thread when an interrupt causes the scheduler to be run to decide on what set of threads to run next, and the scheduler algorithm decides that the lower-priority thread needs to be replaced by the higher-priority one.
Interrupts come in two flavours:
Software interrupts (syscalls) from threads that are already running, which change the state of threads, e.g. by signaling an event, mutex or semaphore upon which another thread is waiting, and so making it ready to run.
Hardware interrupts that cause a driver to run, where that driver chooses to invoke the scheduler on exit because an I/O operation has completed or some timeout interval has expired that needs to change the set of running threads (e.g. disk, NIC, keyboard, mouse, timer).
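The decision the scheduler makes when an interrupt hands it control can be sketched as a toy Python model. This is not any real kernel's algorithm. The `Scheduler` class, its method names, and the convention that a larger number means higher priority are assumptions for illustration.

```python
import heapq


class Scheduler:
    """Toy single-core scheduler: the highest-priority ready thread runs."""

    def __init__(self):
        self._ready = []     # min-heap of (-priority, sequence, name)
        self._seq = 0        # tie-breaker: FIFO order within one priority
        self.running = None  # (priority, name) of the current thread, or None

    def _push(self, priority, name):
        heapq.heappush(self._ready, (-priority, self._seq, name))
        self._seq += 1

    def make_ready(self, name, priority):
        """An interrupt or syscall made a thread runnable; the scheduler then runs."""
        self._push(priority, name)
        self.on_interrupt()

    def on_interrupt(self):
        """Scheduler entry point: preempt only if a ready thread outranks the running one."""
        if not self._ready:
            return
        best_priority = -self._ready[0][0]
        if self.running is None or best_priority > self.running[0]:
            if self.running is not None:
                # The preempted thread is still runnable, so it goes back to ready.
                self._push(*self.running)
            neg_priority, _, name = heapq.heappop(self._ready)
            self.running = (-neg_priority, name)
```

The scheduler code does nothing between calls. It only runs when `make_ready` or `on_interrupt` is invoked, which mirrors the point that the scheduler is not running alongside the low-priority process.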

Will detached NSThreads always complete prior to application exit?

When using NSThread's detachNewThreadSelector:toTarget:withObject:, I'm finding that if the user quits the application while the background thread is executing, the thread fully completes its execution before the application terminates normally.
In this case, this is the behavior I desire, but I couldn't find anything in Apple's docs that suggests that this will always be the case. The only relevant information I was able to find was the following, from Apple's Threading Programming Guide:
Important: At application exit time, detached threads can be terminated immediately but joinable threads cannot. Each joinable thread must be joined before the process is allowed to exit. Joinable threads may therefore be preferable in cases where the thread is doing critical work that should not be interrupted, such as saving data to disk.
So from this, I know that detached threads can be terminated at the time of application exit, but will they ever be terminated automatically? Or, am I always safe to assume the thread will complete its execution before the application quits?
You cannot assume that any thread -- including the main thread -- will ever complete execution normally, regardless of the documentation.
This is because the user can quit an application at any time, the system may lose power/panic, or the app may crash.
As for detached threads, it would not be unheard of for the system frameworks to automatically terminate the app forcibly after some timeout once the main event loop has given up the ghost.