Why do interrupts require very fast servicing?

Is there any reason beyond slowing the system down a little?
I ask because of nos's comment on this question:
Why kernel code/thread executing in interrupt context cannot sleep?
"Also, interrupts usually require very fast servicing, or you can easily get into all sorts of trouble."
What kind of trouble can this cause?

Have you ever had your computer busy working, for instance during startup, while you kept pressing keys, and after a while you just got a beep and those keys weren't registered or buffered anymore? That's an example of what can happen.
If you don't handle interrupts fast enough, they may arrive faster than you can service them, and eventually there is no room to queue more.
Modern hardware and modern OSes will not run into such limits as quickly as Ye Olde DOS machine, but that doesn't mean their buffers are unlimited.
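To make the "no room to queue more" case concrete, here is a minimal sketch in C of a bounded key buffer. Everything in it is hypothetical (the names `keybuf_t`, `keybuf_push`, `keybuf_pop` and the capacity of 16 are assumptions for illustration, not how any particular BIOS or OS does it), and a real ISR would also need `volatile` or atomic accesses:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define KEYBUF_CAPACITY 16u  /* assumed size; a power of two so the index mask works */

/* Hypothetical bounded key queue: the ISR is the only producer,
 * the main loop is the only consumer. */
typedef struct {
    uint8_t  slots[KEYBUF_CAPACITY];
    uint32_t head;     /* total keys written; advanced only by the ISR */
    uint32_t tail;     /* total keys read; advanced only by the consumer */
    uint32_t dropped;  /* keys lost because the buffer was full */
} keybuf_t;

/* Would be called from the keyboard ISR. Returns false if the key is dropped. */
static bool keybuf_push(keybuf_t *kb, uint8_t scancode)
{
    assert(kb != NULL);
    if (kb->head - kb->tail == KEYBUF_CAPACITY) {
        kb->dropped++;   /* buffer full: this is where an old PC would beep */
        return false;
    }
    kb->slots[kb->head & (KEYBUF_CAPACITY - 1u)] = scancode;
    kb->head++;
    return true;
}

/* Would be called from the main loop. Returns false if nothing is waiting. */
static bool keybuf_pop(keybuf_t *kb, uint8_t *out)
{
    assert(kb != NULL && out != NULL);
    if (kb->head == kb->tail) {
        return false;
    }
    *out = kb->slots[kb->tail & (KEYBUF_CAPACITY - 1u)];
    kb->tail++;
    return true;
}
```

If the consumer stalls while keys keep arriving, `keybuf_push` starts returning false after 16 keys and `dropped` counts the lost input. Making the buffer bigger only delays that point.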


Can a computer work with software interrupts only?

Can software interrupts do some of the work of hardware interrupts?
Can a system detect power failure and similar events while relying only on software interrupts?
Then we wouldn't need special hardware like interrupt controllers.
While this may be technically possible, I doubt you'll end up with a system that's stable or even reliable. Hardware interrupts are especially important because they, well, interrupt the processing of other tasks asynchronously. This allows physical components at their lowest level to respond to events quickly and correctly.
Let's play out the scenario you mention and imagine a component on the motherboard detecting a power failure. Without an interrupt, the best it can do is write to a register or cache. It must then rely on another piece of hardware, or even the operating system, to check that value. This basically means periodic polling, which is not as efficient. Furthermore, if a long sequence of instructions is currently running and hogging the resources needed to check the value, you have no deterministic way of knowing when that check might occur. It could be near-instant, or it could be a second from now. If it's the latter, your computer loses power and shuts down before it can react.

How do you avoid interrupt starvation in a nested interrupt system?

I am learning about interrupts and couldn't understand what happens when there are too many interrupts to a point where the CPU can't process the foreground loop or complete the existing interrupts. I read through this article https://www.cs.utah.edu/~regehr/papers/interrupt_chapter.pdf but didn't completely understand how a scheduler would help, if there are simply too many interrupts?
Do we switch to a faster CPU if the interrupts can not be missed?
Yes, you have to switch to a faster CPU!
You have to ensure that there is enough time left for the main loop. Therefore it is really important to keep your interrupt service routines as short as possible and to run some CPU load tests.
Indeed, any time there is contention over a shared resource, there is the possibility of starvation. The schedulers discussed in the paper limit the interrupt rate, thus ensuring some interrupt-free processing time during each interval. During high activity periods, interrupt handling is disabled, and the scheduler switches to polling mode where it interrogates the state of the interrupt request lines periodically, effectively throttling the stream of interrupts. The operating system strives to do as little as possible in each interrupt handler - tasks are often simply queued so they can be handled later at a different stage. There are many considerations and trade-offs that go into any scheduling algorithm.
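As a rough illustration of the throttling idea, here is a minimal sketch in C. It is not the scheduler from the paper: the type `irq_throttle_t`, the function names and the fixed per-interval budget are all assumptions made for this example, and the comments mark where real code would mask or unmask the interrupt line.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-source throttle: at most `budget` interrupts are taken
 * per interval. After that the source is masked and must be polled. */
typedef struct {
    uint32_t budget;       /* interrupts allowed per interval (assumed tunable) */
    uint32_t taken;        /* interrupts handled in the current interval */
    bool     irq_enabled;  /* false while the source is masked and polled */
} irq_throttle_t;

static void throttle_init(irq_throttle_t *t, uint32_t budget)
{
    assert(t != NULL && budget > 0);
    t->budget = budget;
    t->taken = 0;
    t->irq_enabled = true;
}

/* Call at the top of the ISR. Returns true if the interrupt stays enabled. */
static bool throttle_on_interrupt(irq_throttle_t *t)
{
    assert(t != NULL);
    t->taken++;
    if (t->taken >= t->budget) {
        t->irq_enabled = false;  /* real code would mask the interrupt line here */
    }
    return t->irq_enabled;
}

/* Call from a periodic timer tick that starts a new interval. */
static void throttle_on_tick(irq_throttle_t *t)
{
    assert(t != NULL);
    t->taken = 0;
    t->irq_enabled = true;       /* real code would unmask the interrupt line here */
}
```

With a budget of 3, the third interrupt in an interval masks the source until the next tick, so the main loop is guaranteed the rest of that interval no matter how fast the device raises requests.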
Overall you need an idea of how much time each part of your program consumes. This is pretty easy to measure live with an oscilloscope. If you set a GPIO when entering the interrupt and clear it when leaving, you see not only how much time the ISR consumes, but also how often it kicks in. If you do this for each ISR, you get a good idea of how much time they need. You can then do something similar in main() to get a rough estimate of the complete execution cycle of the program, main + interrupts.
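Once you have those scope readings, the arithmetic is simple: each ISR costs its rate times its duration, and the sum is the fraction of CPU time spent in interrupts. A minimal sketch in C, where `isr_profile_t` and `isr_cpu_fraction` are hypothetical names and the numbers you feed in are whatever you measured:

```c
#include <assert.h>
#include <stddef.h>

/* One measured ISR, as read off the oscilloscope (hypothetical type). */
typedef struct {
    double rate_hz;     /* how often the ISR fires per second */
    double duration_s;  /* how long one invocation keeps the GPIO high */
} isr_profile_t;

/* Fraction of CPU time spent in the given ISRs: sum of rate * duration.
 * A result of 1.0 or more means the main loop gets no time at all. */
static double isr_cpu_fraction(const isr_profile_t *isrs, size_t count)
{
    assert(isrs != NULL || count == 0);
    double total = 0.0;
    for (size_t i = 0; i < count; i++) {
        total += isrs[i].rate_hz * isrs[i].duration_s;
    }
    return total;
}
```

For example, an ISR firing at 1 kHz for 200 µs plus one firing at 100 Hz for 1 ms gives 0.2 + 0.1 = 0.3, so about 30% of the CPU goes to interrupts and 70% is left for main(). This ignores interrupt entry and exit overhead that happens outside the GPIO window, so treat it as a lower bound.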
As for the best solution, it is obviously to reduce the number of interrupts. Use polling if possible. Use DMA. Use serial peripherals (UART, CAN etc) that are hardware-buffered instead of interrupt-intensive ones. Use hardware PWM instead of output compare timers. And so on. These things need to be considered early on, when you pick a suitable MCU for your project. If you picked the wrong MCU, then you'll obviously have to change it. Twiddling with the CPU clock sounds like a quick and dirty fix. Get the design right instead.

How does a CPU idle (or run below 100%)?

I first learned about how computers work in terms of a primitive single stored program machine.
Now I'm learning about multitasking operating systems, scheduling, context switching, etc. I think I have a fairly good grasp of it all, except for one thing. I have always thought of a CPU as something which is just charging forward non-stop. It always knows where to go next (program counter), and it goes to that instruction, etc, ad infinitum.
Clearly this is not the case since my desktop computer CPU is not always running at 100%. So how does the CPU shut itself off or throttle itself down, and what role does the OS play in this? I'm guessing there's an input on the CPU somewhere which allows it to power down... and the OS can set this if it has nothing to schedule, but the next logical question is how does it start back up again? I'm guessing either one of two things:
It never shuts down completely, just runs at a very low frequency waiting for the scheduler to get busy again
It shuts down completely but is woken up by interrupts
I searched all over for info on this and came up fairly empty-handed. Any insight would be much appreciated.
The answer is that it depends on the hardware, the operating system and the way that the operating system has been configured.
And it could involve either or both of the strategies you proposed.
Another possibility for machines based on the x86 architecture, is that x86 has an HLT instruction that causes the core to stop until it receives an external interrupt. So the "Idle" task could simply execute HLT in a tight loop.
Just go to Task Manager, Performance tab, and watch the CPU usage while you're doing absolutely nothing on your computer. It never stops fluctuating. With an operating system like Windows running, the CPU is ALWAYS going to be functioning; it never completely shuts down.
Having your monitor display an image requires your CPU to run code that allows it to display anything, and so on.
Everything runs through the CPU. Just like your brain, it controls everything; nothing would function without it.
Some CPUs do have a 'wait for interrupt' instruction, which lets the CPU stop executing instructions when there is nothing to do and not wake up again until there is an interrupt event. This is particularly useful in microcontrollers, which can sit for long periods of time waiting for something to happen.
Intel = HLT (Halt)
ARM = WFI (Wait for interrupt)
Sometimes a 'busy wait' is also used, where the CPU sits in a little 'idle' loop, checking for things to do. In this case, the CPU is still running instructions, but the operating system is in an idle state. It's not as efficient as using a HLT.
Modern CPUs can also adjust their power usage, and are capable of reducing clock rates, or shutting down parts of the CPU that aren't being used. In this way, power usage during an active idle state can be less than during active processing, even though the core CPU is still running and executing instructions.
On the x86 architecture, when an operating system has nothing to do, it can use the HLT instruction.
The HLT instruction stops the CPU until the next interrupt.
See http://en.m.wikipedia.org/wiki/HLT for details.
Other architectures have similar instructions to give the CPU a rest.

Inter Processor Interrupt usage

An educational principle is: there is no such thing as a stupid question. The basic idea behind this is that people learn by asking.
I was asked: "Can you show and explain at a programming level what bad things will happen if every task could execute all instructions?"
I gave the code
and explained it (the system freezes for good on UP).
Then I was asked: "Is it possible to give an example where the system does not freeze even though interrupts are cleared?"
I modified the previous example:
I gave the code
and explained it.
Trivially: if we have demand paging, i=i/0 first causes a page fault (the data page is not present), so another task can be scheduled to run with interrupts enabled during the disk read; later on, the divide by zero will throw this task away for good.
But the answers were based on UP. What about SMP? I must admit that the answers are incomplete.
It is still easy enough to construct:
int i;
for(i=0;i<100;i++)// Suppose we have less than 100 CPUs
{ sleep(5);//The generating task has (most probable) time to do all forks
This will disable interrupts on all CPUs, because every CPU gets a poisonous task to run.
Even so, a stupid question revealed many things that are good for a beginner to learn: privileged instructions, paging, fault handling, scheduling during DMA, fork...
But a minor doubt remains (shame on me) about the first program running on SMP.
Will one CPU be out permanently or not?
The other CPUs continue and can send a re_schedule() IPI message.
What happens then?
It is easy to speculate that the frozen CPU does not wake up, because interrupts are disabled.
But to be perfectly sure, one must know more.
My question was:
Is the Inter-Processor Interrupt (IPI) maskable or non-maskable?
I mean in the most common, "popular" implementations.
Excuse my stupid question. It can't be very difficult to find an answer. I will look for it.
I mean the interrupt pin number (which tells whether it is maskable, I guess).
My own answer - correct?
I studied the issue, because nobody else seemed interested in it, and came to the following thoughts:
In important real-time applications we have long had a watchdog timer (hardware that interrupts the CPU, which must somehow answer "I am alive").
For example, we have a main control computer and a standby computer that takes over the system if the main computer is down.
What about Linux?
Does it have some kind of watchdog?
We can compile the Linux kernel with or without a watchdog.
What does the Linux watchdog do?
On many(!) x86/x86-64 type hardware there is a feature that enables us to generate 'watchdog NMI interrupts'.
It's even possible to disable the NMI watchdog at run time by writing "0" to /proc/sys/kernel/nmi_watchdog.
If any CPU in the system does not execute the periodic local timer interrupt for more than 5 seconds, the APIC tries to fix the situation with a non-maskable interrupt (the CPU executes the handler, which kills the process)!
(SCC Linux is a different case as to NMI.)
My answers (in the original question) were based on a system without a watchdog!
It is problematic to answer at a general level and give examples based on some fixed system. The answers can be correct or not depending on the CPU, the configuration and the settings.
Anyway, talking about NMI did make some sense, didn't it?
If the CPU didn't restrict access to some instructions, it would be too easy to accidentally or deliberately cause a catastrophe.
push $0
push $0
lidt (%esp)
int $42
This code sequence will reset an x86 processor. Here's why:
The code loads the IDTR register with an interrupt descriptor table (IDT) at linear address 0, with a size of one byte.
It then raises interrupt 42, which can't work because it is beyond the 1-byte limit of the IDT.
The CPU tries to raise a general protection fault, interrupt 13. This fails too, because interrupt 13 is beyond the one-byte limit.
The CPU tries to raise a double fault exception, interrupt 8. This fails too, because interrupt 8 is also beyond the limit of the IDT.
This is known as a triple-fault. The CPU does a shutdown bus cycle to tell the motherboard that it is now ignoring everything and stopping execution. The motherboard asserts reset, rebooting the machine.
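The fault chain can be modelled in a few lines of C. This is only a sketch of the limit check, assuming 8-byte gates as in 32-bit protected mode; it ignores everything else the CPU validates (present bit, gate type, privilege levels), and the names `raise_interrupt`, `delivery_t` and `delivery_name` are made up for this example:

```c
#include <assert.h>
#include <stdint.h>

#define IDT_ENTRY_SIZE         8u   /* bytes per gate in 32-bit protected mode */
#define VEC_DOUBLE_FAULT       8u
#define VEC_GENERAL_PROTECTION 13u

typedef enum { DELIVERED, DELIVERED_GP, DELIVERED_DF, TRIPLE_FAULT } delivery_t;

/* A vector is usable only if its whole 8-byte gate lies within the IDT limit.
 * The limit is the offset of the last valid byte, so a 1-byte IDT has limit 0. */
static int vector_in_idt(uint16_t idt_limit, uint8_t vector)
{
    return (uint32_t)vector * IDT_ENTRY_SIZE + (IDT_ENTRY_SIZE - 1u) <= idt_limit;
}

/* Simplified model: try the requested vector, then #GP, then #DF. */
static delivery_t raise_interrupt(uint16_t idt_limit, uint8_t vector)
{
    if (vector_in_idt(idt_limit, vector)) {
        return DELIVERED;
    }
    if (vector_in_idt(idt_limit, VEC_GENERAL_PROTECTION)) {
        return DELIVERED_GP;   /* the #GP handler runs instead */
    }
    if (vector_in_idt(idt_limit, VEC_DOUBLE_FAULT)) {
        return DELIVERED_DF;   /* the #DF handler runs instead */
    }
    return TRIPLE_FAULT;       /* nothing can be delivered: shutdown cycle */
}

static const char *delivery_name(delivery_t d)
{
    switch (d) {
    case DELIVERED:    return "delivered";
    case DELIVERED_GP: return "general protection fault";
    case DELIVERED_DF: return "double fault";
    case TRIPLE_FAULT: return "triple fault";
    }
    assert(!"unknown delivery_t");
    return "?";
}
```

In this model `raise_interrupt(0, 42)` corresponds to the sequence above: with a limit of 0, not even vector 0 fits, so the result is `TRIPLE_FAULT`. A full 256-entry table has limit 0x7FF, and there the same call returns `DELIVERED`.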
This is actually negligible compared to what code could do. A code sequence could easily hijack the machine altogether and start destroying all of the data on the hard drive. It could send all of your files to a malicious server on the internet, change your password, enable remote access, or connect out to a malicious server and grant an attacker unlimited shell access. There's no limit on what a program could do.
Processors have privileged instructions for two reasons. The primary purpose is to protect the operating system from buggy programs that might accidentally do something to bring down or hijack the whole machine. The secondary purpose is to restrict deliberately malicious programs from doing the same.

Which takes longer: switching between user and kernel modes, or switching between two processes?

Which takes longer:
switching between the user and kernel modes, or switching between two processes?
Please explain the reason too.
EDIT: I do know that whenever there is a context switch, it takes some time for the dispatcher to save the state of the previous process in its PCB and then reload the next process from its corresponding PCB. And for switching between user and kernel modes, I know that the mode bit has to be changed. Is that all, or is there more to it?
Switching between processes (given that you actually switch rather than run them in parallel), by an order of oh-my-god.
Trapping from userspace to kernelspace used to be done with a processor interrupt. Around 2005 (I don't remember the kernel version), after a discussion on the mailing list where someone found that trapping was slower (in absolute terms!) on a high-end Xeon processor than on an earlier Pentium II or III (again, from memory), it was reimplemented with a newer CPU instruction, sysenter (which had actually existed since the Pentium Pro, I think). This is done in the Virtual Dynamic Shared Object (vdso) page in each process (cat /proc/pid/maps to find it), IIRC.
So, nowadays, a kernel trap is basically just a couple of CPU instructions, hence rather few cycles, compared to tens or hundreds of thousands when using an interrupt (which is really slow on modern CPUs).
A context switch between processes is heavy. It means storing all processor state (registers, etc) to RAM (at a magic memory location in the user process space actually, guess where!), in practice dirtying all cached memory in the CPU, and reading back the process state for the new process. The new process will (likely) have nothing left in the CPU cache from the last time it ran, so each memory read will be a cache miss and will have to be served from RAM. This is rather slow. When I was at the university, I "invented" (well, I did come up with the idea, knowing that there is plenty of die area in a CPU, but not enough cooling if it's constantly powered) a cache of infinite size that was unpowered when unused (i.e. only used on context switches), and implemented it in Simics. I implemented support for this magic cache, which I called CARD (Context-switch Active, Run-time Drowsy), in Linux and benchmarked it rather heavily. I found that it could speed up a Linux machine with lots of heavy processes sharing the same core by about 5%. This was at relatively short (low-latency) process time slices, though.
Anyway. A context switch is still pretty heavy, while a kernel trap is basically free.
Answer to at which memory location in user-space, for each process:
At address zero. Yep, the null pointer! You can't read from this entire page from user-space anyway :) This was back in 2005, but it's probably the same now unless the CPU state information has grown larger than a page size, in which case they might have changed the implementation.