Understanding how operating systems store/retrieve I/O device input

I am a bit confused about how I/O devices like keyboards store their input for use by the operating system or an application. If I have a computer with a single processor (a CPU with a single core), and the currently executing process is a game, how is the game able to be "aware" of keyboard input? Even if a key press were to force a hardware interrupt (and thus a context switch), and the OS were to "feed" the key value to the game whenever it gives control back to the game process, there's no guarantee that the game loop would even be checking for player input at that time; it could just as well be updating game object positions or rendering the game world.
So my question is this...
Do I/O devices like keyboards store their input in some kind of on-board, hardware-specific microprocessor or on-board memory buffer queue, which can later be "read" and flushed by external processes or the OS itself? Any insight is greatly appreciated, thanks!

Do I/O devices like keyboards store their input in some kind of on-board, hardware-specific microprocessor or on-board memory buffer queue, which can later be "read" and flushed by external processes or the OS itself?
Let's split it into 3 parts:
The Device Specific Part
For old keyboards (before USB); a microcontroller in the keyboard regularly scans a grid of switches and detects when a key is pressed or released, converts that into a code (which could be multiple bytes), then sends the code one byte at a time to the computer. The computer also has a little microcontroller to receive these bytes. That microcontroller has a 1-byte buffer (not even big enough for multi-byte codes).
For newer keyboards (USB); the keyboard's internals are mostly the same (microcontroller scanning a grid of switches, etc.); but the USB controller asks the keyboard "Anything happen?" regularly (typically every 8 milliseconds) and the keyboard's microcontroller replies.
In any case; the keyboard driver gets the code that came from the keyboard and processes it; typically converting it into a fixed-length "key code", merging it with other data (whether shift or capslock or... was active at the time; whether there are Unicode code points that make sense for the key, etc.) and bundling all that into a data structure.
The OS Specific Part
That data structure (created by the keyboard driver) is typically standardized by the OS as a "user input event" (so, same "event" data structure for keyboard, mouse, touchscreen, joystick, ...).
That "user input event" is sent from driver via. some form of inter-process communication (pipes, messages, ...) to something else (e.g. GUI). This inter-process communication has 2 common behaviors - if the receiving program is blocked waiting to receive an event then the scheduler unblocks it (cancels the waiting) and schedules it to get CPU time again; and if the receiving program isn't waiting the event is often put on a queue (in memory) of pending events.
Of course often there's many processes involved, and the "user input event" might be forwarded from one process (e.g. input method editor) to another process (e.g. GUI) to another process (e.g. whichever window has keyboard focus). Also (for old legacy command line stuff) it might end up at a translation layer (e.g. terminal emulator) that converts the events into a character stream (stdin) while destroying most of the information (e.g. when a key is released).
The Language Specific Part
To get the event from high-level code, it depends on what the language is and sometimes also which library is being used. The most common is some kind of "getEvent()" that causes the program to fetch the next event from its queue (from memory), and may cause the program to wait (and not use any CPU time) if there isn't any event yet. However, often that is buried further, such that you register a callback; then something else calls "getEvent()" and, when it receives an event, calls the callback you registered. So it might end up like (e.g. for Java):

    public boolean handleEvent(Event evt) {
        switch (evt.id) {
            case Event.KEY_PRESS:
                ....
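A minimal sketch of the blocking "getEvent()" loop in C; get_event() and the event IDs here are hypothetical stand-ins for whatever the OS or library actually provides (e.g. GetMessage on Windows, XNextEvent on X11, SDL_WaitEvent in SDL):

    /* Hypothetical event API: get_event() blocks (using no CPU time)
       until the OS puts an event on this process's queue. */
    struct event { int id; int keycode; };
    enum { KEY_PRESS = 1, KEY_RELEASE = 2, QUIT = 3 };

    extern struct event get_event(void);

    void event_loop(void)
    {
        for (;;) {
            struct event evt = get_event();
            switch (evt.id) {
            case KEY_PRESS:   /* react to the key... */    break;
            case KEY_RELEASE: /* ...or to its release */   break;
            case QUIT:        return;
            }
        }
    }

This is also why a game doesn't miss input: the events wait on the queue until the game loop gets around to reading them.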

Keyboards are mostly USB today. On most computers, including ARM computers, you have a USB controller implementing the xHCI specification, developed by several major tech companies. If you google "xhci intel spec" or something similar, one of the first results should be the full specification.
The xHCI spec requires implementations to be in the form of a PCI device. PCI is another spec, developed by the PCI-SIG group. That spec specifies everything down to hardware requirements. It is not a free spec like xHCI; it is actually quite expensive to obtain (around $3k).
The OS detects PCI devices using ACPI or similar specifications, which can sometimes be board specific (especially on ARM; all x86-based computers implement ACPI). ACPI tables, found at conventional locations in RAM, say where to find the base addresses of the memory-mapped configuration space of each PCI device.
The OS interacts with PCI devices using memory-mapped registers. The OS reads/writes at the positions specified by the ACPI tables and by the configuration spaces themselves to gather information about a device and to make the device do operations on its behalf. The configuration space of a PCI device has a general portion (the same for every device) and a more specific portion (device dependent). In the general portion, there are BAR (Base Address) registers that contain the addresses of the device's memory-mapped register regions. Each implementer of the convention can do whatever they want with the device-specific portion. The general portion must follow the standard layout for every device so that the OS can properly detect and initialize the device.
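As an illustration, here is a sketch of reading one PCI configuration register through the legacy x86 mechanism (I/O ports 0xCF8/0xCFC); outl()/inl() are assumed port-I/O helpers, and on systems using the memory-mapped (ECAM) window described by ACPI the same registers are read through that mapping instead:

    #include <stdint.h>

    extern void outl(uint16_t port, uint32_t val);  /* assumed port-I/O helpers */
    extern uint32_t inl(uint16_t port);

    uint32_t pci_cfg_read32(uint8_t bus, uint8_t dev, uint8_t fn, uint8_t off)
    {
        uint32_t addr = (1u << 31)            /* enable bit */
                      | ((uint32_t)bus << 16)
                      | ((uint32_t)dev << 11)
                      | ((uint32_t)fn  << 8)
                      | (off & 0xFCu);        /* dword-aligned register offset */
        outl(0xCF8, addr);
        return inl(0xCFC);
    }

    /* Example: the first BAR lives at offset 0x10 of the general portion. */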
Today, each type of device has to respect a certain convention to work with current computers. For example, hard disks will implement SATA and the board will have an AHCI (a PCI SATA controller). Similarly, keyboards will implement USB and the board has an xHCI.
The xHCI itself has complex interaction mechanisms. A summary for keyboards is that the OS will "activate" the interrupt IN endpoint of the keyboard. The OS will then place transfer requests (TRBs) on a circular ring in RAM. For interrupt endpoints, the xHCI will read one transfer request per time interval, specified by the OS in a specific xHCI register. When the xHCI executes a TRB (when it does the transfer), it will post an event to another circular ring in RAM called the Event Ring. When a new event is posted, the xHCI triggers an interrupt. In the interrupt handler, the OS will read the Event Ring to determine what happened. For a keyboard, the OS will probably see that a transfer was done and read the data. Most often, keyboards will return 8 bytes per report, describing the keys that were pressed at the moment of the transfer. The bytes contain conventional scancodes, so they are not directly in UTF-8 or ASCII format. There is one scancode per keyboard key. It is up to the OS to determine what to do for each keyboard key. For example, if the data says that the 'A' key is pressed, then the OS can look at whether the 'SHIFT' key is pressed to determine if the 'A' should be uppercase or lowercase. If the next report says that the 'A' key is not pressed, then the OS should consider this a release of the key (the user released the key). In other words, the keyboard is polled by the OS at certain intervals.
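For reference, the 8 bytes most keyboards return are the USB HID "boot protocol" keyboard report; its layout sketched as a C struct:

    #include <stdint.h>

    struct hid_boot_kbd_report {
        uint8_t modifiers;  /* bitmask: left/right Ctrl, Shift, Alt, GUI */
        uint8_t reserved;
        uint8_t keys[6];    /* up to 6 concurrently pressed keys, 0 = none */
    };

    /* E.g. 02 00 04 00 00 00 00 00 = Left Shift held + 'A' key pressed;
       a following report without 0x04 means the 'A' key was released. */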
The interrupt handler will probably pass the key to other portions of the kernel and save the key to a process-specific structure. The process will probably poll a lock-protected buffer that contains events. It could also be a system-wide buffer that simply contains all events.
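A minimal user-space sketch of such a lock-protected event buffer (pthreads stand in for the kernel's own locking primitives, and overflow handling is omitted):

    #include <pthread.h>
    #include <stdint.h>

    #define QLEN 64  /* power of two so the index arithmetic wraps cleanly */

    struct event { uint32_t keycode; int pressed; };

    static struct event q[QLEN];
    static unsigned head, tail;  /* head == tail means empty */
    static pthread_mutex_t qlock = PTHREAD_MUTEX_INITIALIZER;
    static pthread_cond_t  qcond = PTHREAD_COND_INITIALIZER;

    void push_event(struct event e)  /* driver/interrupt side */
    {
        pthread_mutex_lock(&qlock);
        q[head++ % QLEN] = e;
        pthread_cond_signal(&qcond);  /* wake a blocked reader */
        pthread_mutex_unlock(&qlock);
    }

    struct event pop_event(void)  /* process side */
    {
        pthread_mutex_lock(&qlock);
        while (head == tail)          /* sleep without using CPU time */
            pthread_cond_wait(&qcond, &qlock);
        struct event e = q[tail++ % QLEN];
        pthread_mutex_unlock(&qlock);
        return e;
    }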
The higher-level implementation probably varies between OSes. If you understand the lower-level inner workings, then you can probably imagine how an OS will work. It is certain that the whole thing is very complex, because CPUs nowadays have several cores and caches and other complex mechanisms. The OS must make sure to work seamlessly with all these mechanisms, so it is quite complex but achievable. To understand the whole thing, you'd probably need to understand the whole OS; it has lots of ramifications in other OS concepts.

How often do operating systems poll key inputs?

Is it at each screen refresh or exactly when keys are pressed (through interrupts etc.)?
That largely depends on the device. There were effectively three generations of devices:
Polling. The CPU reads the device's status register in a loop to see whether input has arrived.
Character interrupts. Each key press generates an interrupt.
Programmed interrupts. The device is configurable so that it only generates an interrupt when necessary. For example, some terminal devices support programming such that the user can enter a string of characters (and even edit those characters) and there is only an interrupt when the user hits <RETURN>.
On all non-trivial systems, I/O events are signaled by hardware interrupts that cause drivers to be run. No polling is required or desired.
Any thread waiting on KB input will be made ready, and hopefully running, when the KB driver's interrupt handling completes. It can then handle the KB event.

Most common firmware update protocol

I am supposed to pick (and maybe implement) the firmware update protocol/software/procedure for an embedded device without USB and with limited program memory. The device will work autonomously most of the time, but once in a while a technician will come and update the firmware.
What would be the most common choice for the update protocol if I wanted to use RS232 or CAN?
The requirements for the update are: complete after interrupted update (boot loader will be needed, I assume); small memory footprint; merge user settings with the newly introduced user data fields (in EEPROM); backup the previous version of the firmware with the possibility to roll the update back; safely update the boot loader itself.
It would be nice if the implementation of the boot loader and update client software existed already too (at least for Windows).
And just out of curiosity - are there any good alternatives to DFU for devices with USB?
Thanks in advance
I am not sure about "most common"; I am not sure anyone could answer that authoritatively or whether the answer is even useful. However, I can tell you that I have implemented XMODEM-CRC/XMODEM-1K on a number of devices (ARM7, ARM Cortex-M, PIC24, TI C55xx for example) in less than 4 Kbytes. The bootloader sends an XMODEM start request on each port that is to support update; then, for each port, if a response is received within a short timeout (a few tens of milliseconds), the transfer continues. If no response is received, the application is started normally.
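For reference, the CRC that gives XMODEM-CRC its name is CRC-16/XMODEM (polynomial 0x1021, initial value 0, MSB first); a bit-by-bit version in C fits easily in a small bootloader:

    #include <stddef.h>
    #include <stdint.h>

    /* CRC-16/XMODEM: poly 0x1021, init 0x0000, no reflection, no final XOR. */
    uint16_t crc16_xmodem(const uint8_t *data, size_t len)
    {
        uint16_t crc = 0;
        while (len--) {
            crc ^= (uint16_t)(*data++) << 8;
            for (int i = 0; i < 8; i++)
                crc = (crc & 0x8000u) ? (uint16_t)((crc << 1) ^ 0x1021u)
                                      : (uint16_t)(crc << 1);
        }
        return crc;
    }

A table-driven version is faster but costs 512 bytes of ROM, which can matter at these sizes.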
complete after interrupted update (boot loader will be needed, I assume)
The approach I have taken is to not program the start address to flash immediately on receipt, but to copy it aside and then program it last. The bootloader checks the start address on start-up, and if it is 0xFFFFFFFF (i.e. not programmed) the transfer did not complete, and the bootloader restarts, continuously polling for an XMODEM start.
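A minimal sketch of that start-up check, assuming a hypothetical application base address and NOR flash (which erases to all ones):

    #include <stdint.h>

    /* Hypothetical location of the application start address; it is
       programmed last, so an erased value means the transfer never finished. */
    #define APP_START_WORD ((volatile uint32_t *)0x00004000u)

    int app_is_valid(void)
    {
        return *APP_START_WORD != 0xFFFFFFFFu;  /* erased flash reads all ones */
    }

    /* Bootloader: if (app_is_valid()) jump to the app; else poll for XMODEM. */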
merge user settings with the newly introduced user data fields (in EEPROM),
In my case I have used Intel HEX files, but EEPROM memory is not commonly memory mapped, so it has no natural address in the file. You could solve that by using a proprietary data format, or by setting the address of the HEX data to an area that is invalid on the processor, which the bootloader code will recognise as data to be sent to the EEPROM instead.
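A sketch of that address-routing trick; the address ranges and the flash_write()/eeprom_write() driver calls are hypothetical:

    #include <stddef.h>
    #include <stdint.h>

    #define FLASH_BASE  0x00000000u
    #define FLASH_SIZE  0x00040000u  /* assumption: 256 Kbyte part */
    #define EEPROM_MARK 0x00F00000u  /* assumption: invalid as flash on this CPU */

    extern void flash_write(uint32_t addr, const uint8_t *d, size_t n);   /* hypothetical */
    extern void eeprom_write(uint32_t addr, const uint8_t *d, size_t n);  /* hypothetical */

    /* Called once per decoded Intel HEX data record. */
    void program_record(uint32_t addr, const uint8_t *data, size_t len)
    {
        if (addr >= EEPROM_MARK)
            eeprom_write(addr - EEPROM_MARK, data, len);
        else if (addr >= FLASH_BASE && addr - FLASH_BASE + len <= FLASH_SIZE)
            flash_write(addr, data, len);
        /* else: reject the record */
    }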
backup the previous version of the firmware with the possibility to roll the update back,
That is a function of the bootloader implementation rather than the protocol. It of course requires that you have space to store two copies of the application. The unused copy could possibly be compressed, but incorporating decompression in the bootloader will increase its size. A perhaps simpler and less costly approach would be to have the bootloader support output of the current application image via XMODEM, allowing the back-up to be stored on the host. However, by doing that you are potentially enabling a third party to access your code.
safely update the boot loader itself.
Again, that is a function of your bootloader rather than the protocol. If the code runs from RAM (i.e. the bootloader is copied from ROM to RAM and executed), then it is straightforward. In this case it is safest, if possible, to load the entire bootloader image into RAM before programming flash memory, in order to minimise the time the target has no bootloader, and so that successful programming does not rely on the host connection being maintained throughout.
If however the bootloader runs from flash, replacing it from the bootloader itself is not possible. Instead you might load an application that the bootloader runs and which replaces the bootloader before loading (or reloading) the final application.
It would be nice if the implementation of the boot loader and update client software existed already too (at least for Windows).
Terminal emulator software such as TeraTerm or HyperTerminal will support XMODEM transfer (plain PuTTY does not, though some forks add it). Implementing your own custom XMODEM sender is relatively straightforward, with XMODEM source code widely available.
And just out of curiosity - are there any good alternatives to DFU for devices with USB?
What I have done is implement a CDC/ACM device-class USB stack in the bootloader so that it appears to the host as a serial port, and then used the same XMODEM code as before to do the data transfer. This increases the size of the bootloader; in my case to about 12 Kbytes. It was implemented using a stack and CDC/ACM (virtual COM port) app-note provided by the chip vendor. Strictly speaking, for this you will need a USB vendor ID (VID) registered to your company; you should not use just any old ID.

Accessing memory space / registers on externally connected devices through software

This question is a bit vague, and I apologize for that, but a fairly vague answer will do :)
How do people typically access memory addresses of external devices (say, connected to a PC through USB, or even just, say, a multipurpose microcontroller)? I'm wondering how software is able to find the addresses used to write to registers or EEPROM space.
For example if I want to write a value to register 0x1234, does software just send this information (the register and the value to be written) to some sort of driver that "talks" to the device and takes care of the value change through hardware?
Is implementation of this functionality mostly a hardware endeavor?
Thanks!
Let's use as an example a fairly common USB peripheral controller that is based on an 8-bit 8051 microcontroller core. One side of it attaches to the USB host controller on a desktop computer; the device end presents itself as a FIFO endpoint to the host.
Some 8051 firmware will be required to initialize the device side. A class driver will be required on the host side. Once those are in place, the application developer will have a device name on the host side which may be opened for read/write. Sometimes a vendor will provide a library to perform device specific tasks and isolate the user from the raw device. Often a Windows DLL is available to hide the low level I/O and present device operations as function calls.
Additional 8051 firmware monitors the FIFO from the device end and interprets messages sent from the host application or DLL, then takes actions. These actions may be low level, such as a read/write of a memory location or register. They may be high level, such as setting the PWM value of a programmable counter array.
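As an illustration, the message format might be as simple as this (the layout is invented for this example, not any real vendor's protocol):

    #include <stdint.h>

    /* Hypothetical host-to-device command sent through the FIFO endpoint. */
    struct reg_write_cmd {
        uint8_t  opcode;    /* made-up: 0x01 = "write register" */
        uint8_t  value;     /* value to store */
        uint16_t reg_addr;  /* e.g. 0x1234 */
    };

    /* The 8051 firmware reads this from its FIFO, decodes the opcode, and
       performs the actual write; a real protocol would also pin down byte
       order and struct packing. */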
So your hypothetical description of a write to register 0x1234 is not far from how it is often implemented.

How CPU finds ISR and distinguishes between devices

I should first share all what I know - and that is complete chaos. There are several different questions on the topic, so please don't get irritated :).
1) To find an ISR, the CPU is provided with an interrupt number. In x86 machines (286/386 and above) there is an IVT with pointers to ISRs in it; each entry is 4 bytes in size. So we need to multiply the interrupt number by 4 to find the ISR. So the first bunch of questions is: I am completely confused about the mechanism of the CPU receiving the interrupt. To raise an interrupt, the device first asserts its IRQ line - then what? The interrupt number travels "on the IRQ line" towards the CPU? I also read something about the device putting the ISR address on the data bus; what's that then? What is the concept of devices overriding the ISR? Can somebody tell me a few example devices where the CPU polls for interrupts? And where does it find the ISR for them?
2) If two devices share an IRQ (which is very much possible), how does the CPU differentiate between them? What if both devices raise an interrupt of the same priority simultaneously? I got to know there will be masking of same-type and low-priority interrupts - but how does this communication happen between the CPU and the device controller? I studied the role of the PIC and APIC for this problem, but could not understand.
Thanks for reading.
Thank you very much for answering.
CPUs don't poll for interrupts, at least not in a software sense. With respect to software, interrupts are asynchronous events.
What happens is that hardware within the CPU recognizes the interrupt request, which is an electrical input on an interrupt line, and in response, sets aside the normal execution of events to respond to the interrupt. In most modern CPUs, what happens next is determined by a hardware handshake particular to the type of CPU, but most of them receive a number of some kind from the interrupting device. That number can be 8 bits or 32 or whatever, depending on the design of the CPU. The CPU then uses this interrupt number to index into the interrupt vector table, to find an address to begin execution of the interrupt service routine. Once that address is determined, (and the current execution context is safely saved to the stack) the CPU begins executing the ISR.
When two devices share an interrupt request line, they can cause different ISRs to run by returning a different interrupt number during that handshaking process. If you have enough vector numbers available, each interrupting device can use its own interrupt vector.
But two devices can even share an interrupt request line and an interrupt vector, provided that the shared ISR is clever enough to go back to all the possible sources of the given interrupt, and check status registers to see which device requested service.
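A minimal sketch of that kind of shared handler in C; the device list and the irq_pending()/service() hooks are hypothetical:

    /* Each device on the shared line exposes a status check and a handler. */
    struct dev {
        struct dev *next;
        int  (*irq_pending)(struct dev *);  /* reads the device's status register */
        void (*service)(struct dev *);      /* handles and clears the request */
    };

    void shared_isr(struct dev *devs_on_this_line)
    {
        /* Ask every possible source whether it requested service. */
        for (struct dev *d = devs_on_this_line; d; d = d->next)
            if (d->irq_pending(d))
                d->service(d);
    }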
A little more detail
Suppose you have a system composed of a CPU, an interrupt controller, and an interrupting device. In the old days, these would have been separate physical devices; now all three might even reside in the same chip, but all the signals are still there inside the ceramic case. I'm going to use a PowerPC (PPC) CPU with an integrated interrupt controller, connected to a device on a PCI bus, as an example that should serve nicely.
Let's say the device is a serial port that's transmitting some data. A typical serial port driver will load a bunch of data into the device's FIFO, and the CPU can do regular work while the device does its thing. Typically these devices can be configured to generate an interrupt request when the device is running low on data to transmit, so that the device driver can come back and feed more into it.
The hardware logic in the device will expect a PCI bus interrupt acknowledge, at which point, a couple of things can happen. Some devices use 'autovectoring', which means that they rely on the interrupt controller to see to it that the correct service routine gets selected. Others will have a register, which the device driver will pre-program, that contains an interrupt vector that the device will place on the data bus in response to the interrupt acknowledge, for the interrupt controller to pick up.
A PCI bus has only four interrupt request lines, so our serial device will have to assert one of those. (It doesn't matter which at the moment; it's usually somewhat slot dependent.) Next in line is the interrupt controller (e.g. PIC/APIC), which will decide whether to acknowledge the interrupt based on mask bits that have been set in its own registers. Assuming it acknowledges the interrupt, it either then obtains the vector from the interrupting device (via the data bus lines), or may, if so programmed, use a 'canned' value provided by the APIC's own device driver. So far, the CPU has been blissfully unaware of all these goings-on, but that's about to change.
Now it's time for the interrupt controller to get the attention of the CPU core. The CPU will have its own interrupt mask bit(s) that may cause it to just ignore the request from the PIC. Assuming that the CPU is ready to take interrupts, it's now time for the real action to start. The current instruction usually has to be retired before the ISR can begin, so with pipelined processors this is a little complicated, but suffice it to say that at some point in the instruction stream, the processor context is saved off to the stack and the hardware-determined ISR takes over.
Some CPU cores have multiple request lines, and can start the process of narrowing down which ISR runs via hardware logic that jumps the CPU instruction pointer to one of a handful of top-level handlers. The old 68K, and possibly others, did it that way. The PowerPC (and, I believe, the x86) has a single interrupt request input. The x86 itself behaves a bit like a PIC, and can obtain a vector from the external PIC(s), but the PowerPC just jumps to a fixed address, 0x00000500.
In the PPC, the code at 0x0500 is probably just going to immediately jump out to somewhere in memory where there's room enough for some serious decision making code, but it's still the interrupt service routine. That routine will first go to the PIC and obtain the vector, and also ask the PIC to stop asserting the interrupt request into the CPU core. Once the vector is known, the top level ISR can case out to a more specific handler that will service all the devices known to be using that vector. The vector specific handler then walks down the list of devices assigned to that vector, checking interrupt status bits in those devices, to see which ones need service.
When a device, like the hypothetical serial port, is found wanting service, the ISR for that device takes appropriate actions, for example, loading the next FIFO's worth of data out of an operating system buffer into the port's transmit FIFO. Some devices will automatically drop their interrupt request in response to being accessed, for example, writing a byte into the transmit FIFO might cause the serial port device to de-assert the request line. Other devices will require a special control register bit to be toggled, set, cleared, what-have-you, in order to drop the request. There are zillions of different I/O devices and no two of them ever seem to do it the same way, so it's hard to generalize, but that's usually the way of it.
Now, obviously there's more to say - what about interrupt priorities? what happens in a multi-core processor? What about nested interrupt controllers? But I've burned enough space on the server. Hope any of this helps.
I came across this question after about 3 years.. Hope I can help ;)
The Intel 8259A, or simply the "PIC", has 8 input pins, IRQ0-IRQ7; every pin connects to a single device..
Let's suppose you pressed a button on the keyboard.. the voltage of the IRQ1 pin, which is connected to the keyboard, goes high.. so after the CPU gets interrupted and acknowledges the interrupt bla bla bla... the PIC simply adds 8 to the number of the IRQ line, so IRQ1 means 1+8, which means 9
So the CPU loads its CS and IP from the 9th entry in the vector table.. and because the IVT is an array of 4-byte entries, it just multiplies the entry number by 4 ;)
    CPU.CS = IVT[9].CS
    CPU.IP = IVT[9].IP
the ISR deals with the device through the I/O ports ;)
Sorry for my bad English.. I'm an Arab though :)

How does firmware communicate with electronic devices to perform its operations?

Almost all electronic devices come with firmware. I know it is stored in ROM (read-only memory), so it is non-volatile (no power source is required to keep the contents from being erased, unlike RAM).
What I want to know is "How does firmware communicate with electronic devices to perform its operations?"
Let's say there is a small roller.. When a button is pressed, how does the firmware make it move?
Can someone please explain what resides behind the scenes to make it happen..
I think it may require a brief explanation to unwind it..
Also, what is the most popular language used for coding firmware?
Modern hardware like you're describing has a program stored in ROM and an all-purpose microcomputer (CPU) executing that program.
The CPU reads information from ROM by setting up addresses on its address bus and then asking the ROM to tell it the value stored at that location. There's something like a read pulse being raised (on a separate line) to tell the ROM to make the value accessible on the lines of the data bus. That, in a nutshell, is reading.
To get the hardware to do something, the CPU basically executes a kind of write operation. It puts a value, which is just a bunch of bits if you want to look at it that way, on the address bus to select a certain device and perhaps function on that device, then it raises another signal line saying "write!" The device that recognizes its address on the address bus responds to that signal by accepting the data from the data bus and then performing whatever its function is. Typically, one of the data bus bits will be connected within the output device to a power output stage, i.e. a transistor stronger than the ones used just for computation, and that transistor will connect some electrical device to current sufficient to make it move/glow/whatever.
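To make that concrete, a memory-mapped output write in C might look like this; the address and bit meaning are made up for the example:

    #include <stdint.h>

    /* Hypothetical control register of an output device; 'volatile' keeps the
       compiler from optimizing away stores that have hardware side effects. */
    #define MOTOR_CTRL (*(volatile uint8_t *)0x40001000u)

    void motor_on(void)  { MOTOR_CTRL = 0x01u; }  /* bit 0 drives the output stage */
    void motor_off(void) { MOTOR_CTRL = 0x00u; }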
Tiny, cheap devices are coded in assembly language to save costs on ROM; in industrial quantities, even small amounts of memory can affect price. The assembly language is specific to the CPU; some chips called "8051", "6502" and "Atmel (something or other)" are popular. Bigger devices with more complex requirements may have their firmware written in C or a C-like dialect, which makes programming a little easier than assembler. The biggest ones even run C++ code. Compiled, of course.
In most systems there are special memory addresses which are used for I/O. Reading and writing on such addresses executes some function instead of just moving data around. In x86 systems there are also special I/O instructions IN and OUT for that.
The simplest case is called general-purpose I/O (GPIO), where you can read or write data directly from/to external electrical pins on the device. There are several memory addresses, called registers, where you can read data from the port (voltage near 0 = 0, near supply voltage = 1), where you can write data to the port, and where you can define whether a particular pin is an input (the corresponding bit is typically 0) or an output (the bit is 1). Every microcontroller has GPIO.
So in your example the button could be connected to a pin set to input, which the software could sense. It would typically do this every 10 ms and only react if it reads a stable value for several reads; this is called debouncing. Then it would write a 1 to some output, which, via some transistor for amplification, could drive a motor. If it senses that you released the switch, it could turn the motor off again by writing a 0. And so on; this program would run until you turn the device off.
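A sketch of that 10 ms debounce-and-drive loop; the register addresses and the delay_ms() helper are hypothetical (real parts name these PORTx/PINx/DDRx and so on):

    #include <stdint.h>

    #define GPIO_IN  (*(volatile uint8_t *)0x40002000u)  /* hypothetical input port */
    #define GPIO_OUT (*(volatile uint8_t *)0x40002004u)  /* hypothetical output port */
    #define BUTTON   0x01u
    #define MOTOR    0x02u

    extern void delay_ms(unsigned ms);  /* hypothetical timer helper */

    void main_loop(void)
    {
        uint8_t history = 0;
        for (;;) {
            delay_ms(10);                            /* sample every 10 ms */
            history = (uint8_t)((history << 1) | (GPIO_IN & BUTTON));
            if ((history & 0x0Fu) == 0x0Fu)          /* stable '1' for 4 reads */
                GPIO_OUT |= MOTOR;                   /* pressed: motor on */
            else if ((history & 0x0Fu) == 0x00u)     /* stable '0' for 4 reads */
                GPIO_OUT &= (uint8_t)~MOTOR;         /* released: motor off */
        }
    }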
There are lots of other I/O devices for other purposes with typically hundreds of registers for controlling them. If you want to see more you could look into the data sheet of some microcontroller. For example, here is the data sheet of ATtiny4/5/9/10, a very small controller from the Atmel AVR family.
Today most firmware is written in C, except for the smallest devices and for a little special code for handling resets and interrupts, which is written in assembly language.