Most hardware is controlled at the driver level through memory mapped handles. How is RAM controlled? Is there a spec for it? - firmware

I'm just getting into the massive topic of learning UEFI driver development and from what I understand so far, hardware peripherals are controlled using specific addresses mapped to memory. Well, the memory is hardware too. Is it not controlled by drivers?
I assume the CPU and motherboard have built-in circuits that handle this, but my curiosity is whether drivers have any hardware level control to this handling. I'd just prefer to know for sure and I'm not sure what manual would explain this.
[kernel/UEFI] driver <-> memory mapped address <-> firmware [hardware:keyboard]
[kernel/UEFI] driver <-> ? <-> firmware [hardware:RAM]
guess:
spec
driver <-> CPU microcode <-> motherboard circuit <-> firmware
I just think assumptions are bad and can't find a citation confirming the probable answer. The answer is relevant to security and which supply chain / standard we're trusting. Like PCIe or NVMe are standard specs, perhaps there's a standard for RAM <-> CPU communication?
Maybe this question is a better fit for an Engineering SE site?

From software development perspective, there isn't a driver for RAM control ‐ it's exposed as an indistinguishable part of the hardware via the Instruction Set for a given Architecture (ISA). Just like CPU is hardware, but without a driver to control it. Reason for drivers is access to hardware unknown to the ISA, such as a USB device, a particular manufacturer's SSD (which may or may not be present at power up time), graphics hardware, etc. Just like CPU, RAM must be present at power up ‐ you won't get much further than an error code via lights and beeps without RAM in your system at power up time. This isn't the case for most, if not all, other hardware; such other hardware is therefore optional, and isn't part of the ISA definition; drivers (software) are needed to access such optional hardware.
RAM is managed by both hardware (CPU; see Intel manual, volume 3), especially modern hardware, which provides virtualization support for a modern OS, including paging support, and software (the OS memory allocator), for purposes of virtualizing RAM for the running processes. No drivers though ‐ just addressing via ISA instructions.
If you're looking for an answer from a hardware perspective, such as the details of the bus circuitry, which CPU pins are involved, exact protocol, etc., then this question is a better candidate for Engineering StackExchange site.

Related

How did the first GPUs get support from CPUs?

I imagine CPUs have to have features that allow it to communicate and work with the GPU, and I can imagine this exists today, but in the early days of GPUs, how did companies get support from large CPU companies to have their devices be supported, and what features did CPU companies add to enable this?
You mean special support beyond just being devices on a bus like PCI? (Or even older, ISA or VLB.)
TL:DR: All the special features CPUs have which are useful for improved bandwidth to write (and sometimes read) video memory came after 3D graphics cards were commercially successful. They weren't necessary, just a performance boost.
Once GPUs were commercially successful and popular, and a necessary part of a gaming PC, it made obvious sense for CPU vendors to add features to make things better.
The same IO busses that let you plug in a sound card or network card already have the capabilities to access device memory and MMIO, and device IO ports, which is all that's necessary for video drivers to make a graphics card do things.
Modern GPUs are often the highest-bandwidth devices in a system (especially non-servers), so they benefit from fast buses, hence AGP for a while, until PCI Express (PCIe) unified everything again.
Anyway, graphics cards could work on standard busses; it was only once 3D graphics became popular and commercially important (and fast enough for the PCI bus to be a bottleneck), that things needed to change. At that point, CPU / motherboard companies were fully aware that consumers cared about 3D games, and thus it would make sense to develop a new bus specifically for graphics cards.
(Along with a GART, graphics address/aperture remapping table, an IOMMU that made it much easier / safer for drivers to let an AGP or PCIe video card read directly from system memory. Including I think with addresses under control of user-space, without letting user-space read arbitrary system memory, thanks to it being an IOMMU that only allows a certain address range.)
Before the GART was a thing, I assume drivers for PCI GPUs needed to have the host CPU initiate DMA to the device. Or if bus-master DMA by the GPU did happen, it could read any byte of physical memory in the system if it wanted, so drivers would have to be careful not to let programs pass arbitrary pointers.
Anyway, having a GART was new with AGP, which post-dates early 3D graphics cards like 3dfx's Voodoo and ATI 3D Rage. I don't know enough details to be sure I'm accurately describing the functionality a GART enables.
So most of the support for GPUs was in terms of busses, and thus a chipset thing, not CPUs proper. (Back then, CPUs didn't have integrated memory controllers, instead just talking to the chipset northbridge over a frontside bus.)
Relevant CPU instructions included Intel's SSE and SSE2 instruction sets, which had streaming (NT = non-temporal) stores which are good for storing large amounts of data that won't be re-read by the CPU any time soon, if at all.
SSE4.1 in 2nd-gen Core2 (2008 ish) added a streaming load instruction (movntdqa) which (still) only does anything special if used on memory regions marked in the CPU's page tables or MTRR as WC (aka USWC: uncacheable, write-combining). Copying back from GPU memory to the host was the intended use-case. (Non-temporal loads and the hardware prefetcher, do they work together?)
x86 CPUs introducing the MTRR (Memory Type Range Register) is another feature that improved CPU -> GPU write bandwidth. Again, this came after 3D graphics were commercially successful for gaming.

Why do you need a Programmable Real Time Unit (PRU) while you can have an RTOS?

The beaglebone Black processor includes two independent Programmable Real Time Units (PRUs). Hobbyists and professionals are excited about possible use of these units for real-time applications, which is understood. However, if you can have a RTOS (whether for the beaglebone or the raspberry pi), why would you need the PRUs?
EDIT-
For information, the BBB has an ARM Cortex A8 running at 1 GHz, with 1.9 DMIPS / MHz. The PRUs are simple RISCs running at 200 MHz.
Linux, even with the real-time scheduler is unsuited to many critical hard real-time tasks with response requirements at the microsecond level, on the other hand it provides or enables a great deal of functionality in terms of UI, connectivity and filesystem support. These things are either not available in an RTOS or are provided at significant cost in high end RTOS, and with much more limited hardware support.
So if you have a system that has hard-real time constraints, but needs more general purpose computing features such a networking, filesystem connection to commercial-off-the-shelf (COTS) peripherals etc., then the PRU provides a solution to that.
On the other hand I can't help but think that this is a marketing exercise on the part of TI to sell more chips. A similar solution has always been possible (and indeed common) using one or more processors to perform time critical tasks, possibly running an RTOS, while UI and connectivity are handled by a single processor with the necessary hardware and memory resources but without the real-time constraints. The PRU device does have two 32 bit cores, but XMOS xCORE devices have as many as 16 cores - with 16 communicating cores, you may not even need an RTOS.
To answer the question...
[...] if you can have a RTOS [...], why would you need the PRUs?
... directly; you probably wouldn't need them in that case, but you would loose Linux - and your application may need that. It is just one of many solutions to real-time applications using Linux. You pays your money, and takes your choice.
Most likely the processor in BeagleBone or RaspberryPI is too "heavy" for real time - after all, you could run RTOS on your PC, but it will not be very responsive entirely deterministic, even when it's faster than your typical microcontroller (I guess that these PRUs are some sort of microcontrollers with a new fancy name). In such high-level application processor as found on these boards you rarely have direct access to hardware or interrupts, which are essential for real time applications that actually do something time-critical.

cortex a9 boot and memory

I am a newbie starting out in micro-controller programming. The chip of interest here is cortex-a9. At reset or power up there has to be code at 0x0000000 from my readings. My questions though they may sound too trivial will help me in putting some concepts in perspective.
Does the memory address 0x0000000 reside in ROM?
What happens right after the code is read from that address?
Should there be some sort of boot-loader present & if so at what address should this be in & Should it also be residing in ROM?
Finally, at what point does the kernel kick in & where does the kernel code reside?
ARM sells cores not chips, what resides at that address depends on the chip vendor that has bought the ARM core and put it in their chip. Implementations vary from vendor to vendor, chip to chip.
Traditionally an ARM will boot from address zero, more correctly the reset exception vector is at address zero. Unlike other processor families, the traditional ARM model is NOT a list of addresses for exception entry points but instead the ARM EXECUTES the instruction at that address, which means you need to use either a relative branch or a load pc instruction. The newer cortex-m series, which are thumb/thumb2 only (they cannot execute ARM (32 bit) instructions) uses the traditional (non-ARM) like list of addresses, also the zero address is not an exception vector, it is the address to load in the stack pointer, then the second entry is reset and so on. Also the cortex-m exception list is different, that family has like 128 individual interrupts, where the traditional ARM has two, fast and normal. There is a recent cortex-m based question or perhaps phrased as thumb2 question for running linux on a thumb2 ARM. I think the cortex-m implementations are all microcontroller class chips and only have on chip memory in the tens of kbytes, basically these dont fall into the category you are asking about. And you are asking about cortex-a9 anyway.
A number of cores or maybe all of them have a boot option where the boot address can be 0x00000000 or something like 0xFFFF0000 as an alternate address. using that would be very confusing for ARM users, but it provides the ability for example to have a rom at one address and a ram at another allowing you to boot on power up from a rom then switch the exception table into ram for runtime operation. You probably have a chip with a core that can do this but it is up to the chip vendor whether or not to use these edge of the core features or to hardwire them to some setting and not provide you that flexibility.
You need to look at the datasheet/docs for the chip in question. Find out what the name of the ARM core is, as you mentioned cortex-a9. Ideally you want to know the rev as well r0p0 kind of a thing, then go to ARM's website and find the TRM, technical reference manual for that core. You will also want to get a copy of the ARM ARM, ARM Architectural Reference Manual. The (traditional) ARM exception vectors are described in the ARM ARM as well as quite a ton more info. You also need the chip vendors documentation, and look into their boot scheme. Some will point address zero to the boot prom on power up, then the bootloader will need to do something, flip a bit in a register, and the memory controller will switch address 0 to ram. Some might have address 0 always configured as ram, and some other address always configured as rom, lets say 0x80000000 for example, and the chip will copy some items from rom to ram for you before boot, or the chip may simply have the power up setting for the reset vector to be a branch to rom, then it is up to the bootloader to patch up the vector table. As many different schemes as you can think of, it is likely someone has tried it, so you have to study the chip vendors documentation or sample code to understand Basically the answer to your rom question, is it depends and you have to check with the chip vendor.
The ARM TRM for the core should describe, if any, the strap options on the core (like being able to boot from an alternate address), connect those strap options, if any, that are implemented by the vendor. The ARM ARM is not really going to get into that like the TRM. A vendor worth buying from though will have some of their own documentation and/or code that shows what their rom based boot strategy is.
For a system destined to be a linux system you are going to have a bootloader, some non-linux code (very much like the bios on your desktop/laptop) that brings up the system and eventually launches linux. Linux is going to need a fair amount of memory (relative to microcontroller and other well known ARM implementations), that ram may end up being sram or dram and the bootloader may have to initialize the memory interface before it can start linux up. There are popular bootloaders like redboot and uboot. both are significant overkill, but provide features for developers and users like being able to re-flash linux, etc.
ARM linux has ATAGs (ARM TAGs). You can use both the traditional linux command line to tell linux boot information like what address to find the root file system, and ATAGs. Atags are structures in memory that I think r0 or something like that is set to when you branch from the bootloader to linux. The general concept though is the chip powers up, boots from rom or ram, if prepares ram so that it is ready to use, linux might want/need to be copied from rom to ram, the root file system, if separate, might want to be copied to somewhere else in ram. ATAGs are prepared to tell arm where to decompress linux if need be, as well as where to find the command line and or where to find things like the root file system, some registers are prepared as passed parameters to linux and lastly the bootloader branches to the address containing the entry point in the linux kernel.
You have to have boot code available at the address where the hardware starts executing.
This is usually accomplished by having the hardware map some sort of flash or boot ROM to the boot address and start running from there.
Note that in micro controllers the code that starts running at boot has a pretty tough life - no hardware is initialized yet, and by no hardware I mean that even the DDR controllers that control the access to RAM are not working yet... so your code needs to run without RAM.
After the initial boot code sets enough of the hardware (e.g. sets the RAM chips, set up TLBs etc, program MACs etc.) you have the bootloader run.
In some systems, the initial boot code is just the first part of the boot loader. In some systems, a dedicated boot code sets things up and then reads the boot loader from flash and runs it.
The job of the boot loader is to bring the image of the kernel/OS into RAM, usually from flash or network (but can also be shared memory with another board, PCI buses and the like although that is more rare). Once the boot loader has the image of the kernel/OS binary in RAM it might optionally uncompress it, and hand over control (call) the start address of the kernel/OS image.
Sometime, the kernel/OS image is actually a small decompressor and blob of compressed kernel.
At any rate the end result is that the kernel/OS is available in RAM and the boot loader, optionally through the piggy back decompressor, has passed control to it.
Then the kernel/OS starts running and the OS is up.

How to find an embedded platform?

I am new to the locating hardware side of embedded programming and so after being completely overwhelmed with all the choices out there (pc104, custom boards, a zillion option for each board, volume discounts, devel kits, ahhh!!) I am asking here for some direction.
Basically, I must find a new motherboard and (most likely) re-implement the program logic. Rewriting this in C/C++/Java/C#/Pascal/BASIC is not a problem for me. so my real problem is finding the hardware. This motherboard will have several other devices attached to it. Here is a summary of what I need to do:
Required:
2 RS232 serial ports (one used all the time for primary UI, the second one not continuous)
1 modem (9600+ baud ok) [Modem will be in simultaneous use with only one of the serial port devices, so interrupt sharing with one serial port is OK, but not both]
Minimum permanent/long term storage: Whatever O/S requires + 1 MB (executable) + 512 KB (Data files)
RAM: Minimal, whatever the O/S requires plus maybe 1MB for executable.
Nice to have:
USB port(s)
Ethernet network port
Wireless network
Implementation languages (any O/S I will adapt to):
First choice Java/C# (Mono ok)
Second choice is C/Pascal
Third is BASIC
Ok, given all this, I am having a lot of trouble finding hardware that will support this that is low in cost. Every manufacturer site I visit has a lot of options, and it's difficult to see if their offering will even satisfy my must-have requirements (for example they sometimes list 3 "serial ports", but it appears that only one of the three is RS232, for example, and don't mention what the other two are). The #1 constraint is cost, #2 is size.
Can anyone help me with this? This little task has left me thinking I should have gone for EE and not CS :-).
EDIT: A bit of background: This is a system currently in production, but the original programmer passed away, and the current hardware manufacturer cannot find hardware to run the (currently) DOS system, so I need to reimplement this in a modern platform. I can only change the programming and the motherboard hardware.
I suggest buying a cheap Atom Mini-ITX board, some of which come with multi - 4+ RS232 ports.
But with Serial->USB converters, this isn't really an issue. Just get an Atom. And if you have code, port your software to Linux.
Here is a link to a Jetway Mini-Itx board, and a link to a 4 port RS232 expansion module for it. ~$170 total, some extra for memory, a disk, and a case and PSU. $250-$300 total.
Now here is an Intel Atom Board at $69 to which you could add flash storage instead of drives, and USB-serial converters for any data collection you need to do.
PC104 has a lot of value in maximizing the space used in 19" or 23" rackmount configurations - if you're not in that space, PC104 is a waste of your time and money, IMHO.
The BeagleBoard should have everything you need for $200 or so - it can run Linux so use whatever programming language you like.
A 'modern' system will run DOS so long as it is x86, I suggest that you look at an industrial PC board from a supplier such as Advantech, your existing system may well run unchanged if it adheres to PC/DOS/BIOS standards.
That said if your original system runs on DOS, the chances are that you do not need the horsepower of a modern x86 system, and can save money by using a microcontroller board using something fairly ubiquitous such as an ARM. Also if DOS was the OS, then you most likely do not need an OS at all, and could develop the system "bare-metal". The resources necessary just to support Linux are probably far greater than your existing application and OS together, and for little or no benefit unless you intend on extending the capability of the system considerably.
There are a number of resources available (free and commercial) for implementing a file system and USB on a bare-metal system or a system using a simple real-time kernel such as FreeRTOS or eCOS which have far smaller footprints than Linux.
The Windows embedded site ( http://www.microsoft.com/windowsembedded/en-us/default.mspx )
has a lot of resources and links to hardware partners, distributors and development kits. There's even a "Spark" incubation project ( http://www.microsoft.com/windowsembedded/en-us/community/spark/default.mspx )
What's also really nice about using windows ce is that it now supports Silverlight as a development environment.
I've used the jetway boards / daughter cards that Chris mentioned with success for various projects from embedded control, my home router, my HTPC front end.
You didn't mention what the actual application was but if you need something more industrial due to temperature or moisture constraints i've found http://www.logicsupply.com/ to be a good resource for mini-itx systems that can take a beating.
A tip for these board is that given your minimal storage requirements, don't use a hard drive. Use an IDE adapter for a compact flash card as the system storage or an SD card. No moving parts is usually a big plus in these applications. They also usually offer models with DC power input so you can use a laptop like or wall wart external supply which minimizes its final size.
This http://www.fit-pc.com/web/ is another option in the very small atom PC market, you'd likely need to use some USB converters to get to your desired connectivity.
The beagle board Paul mentioned is also a good choice, there are daughter cards for that as well that will add whatever ports you need and it has an on board SD card reader for whatever storage you need. This is also a substantially lower power option vs the atom systems.
There are a ton of single board computers that would fit your needs. When searching you'll normally find that they don't keep many interface connectors on the processor board itself but rather you need to look at the stackable daughter cards they offer which would provide whatever connections you need (RS-232, etc.). This is often why you see just "serial port" in the description as the final physical layer for the serial port will be defined on the daughter card.
There are a ton of arm based development boards you could also use, to many to list, these are similar to the beagle board. Googling for "System on module" is a good way to find many options. These again are usually a module with the processor/ram/flash on 1 card and then offer various carrier boards which the module plugs into which will provide the various forms of connectivity you need.
In terms of development, the atom boards will likely be the easiest if your more familiar with x86 development. ARM is strongly supported under linux though so there is little difficulty in getting these up and running.
Personally i would avoid windows for a headless design like your discussing, i rarely see a windows based embedded device that isn't just bad.
Take at look at one of the boards in the Arduino line, in particular the Arduino Mega. Very flexible boards at a low cost, and the Mega has enough I/O ports to do what you need it to do. There is no on-chip modem, but you can connect to something like a Phillips PCD3312C over the I2C connector or you can find an Arduino add-on board (called a "shield") to give you modem functionality (or Bluetooth, ethernet, etc etc). Also, these are very easy to connect to an external memory device (like a flash drive or an SD card) so you should have plenty of storage space.
For something more PC-like, look for an existing device that is powered by a VIA EPIA board. There are lot of devices out there that use these (set-top boxes, edge routers, network security devices etc) that you can buy and re-program. For example, I found a device that was supposed to be a network security device. It came with the EPIA board, RAM, a hard drive, and a power supply. All I had to do was format the hard drive, install Linux (Debian had all necessary drivers already included), and I had a complete mini-computer ready to go. It only cost me around $45 too (bought brand new, unopened on ebay).
Update: The particular device I found was an EdgeSecure i10 from Ingrian Networks.

How is the BIOS used by a modern OS?

What's the function of the BIOS in a modern OS? Is it still used after booting? And is there some kind of BIOS API?
The BIOS is still the first thing that runs on the just-started CPU and responsible for getting the motherboard hardware turned on, setting basic chipset modes and registers, initializing some hardware, and running the code that loads the kernel.
The BIOS is usually not used once the kernel is loaded, and depends on a 16-bit execution environment as opposed to the 32- or 64-bit protected mode environment that a modern kernel operates in.
The boot loader normally does require the BIOS IO calls to get the kernel into memory. The BIOS is being replaced even in this role by newer boot-time software such as Coreboot to provide faster boot times. EFI will one day replace the traditional BIOS, and hopefully the boot loader, passing control directly to the kernel after loading it from storage.
The discovered hardware configuration, memory range settings, and ACPI metadata tables are probably the only BIOS-based data used by the OS after the kernel is loaded. Any runnable ACPI code is encoded as ACPI Machine Language and is interpreted by the OS.
Any good traditional book on MS-DOS assembly programming will include information on the BIOS programming interface. Check out The Art of ASSEMBLY LANGUAGE PROGRAMMING
I wrote BIOS for notebook computers for several years. The BIOS does a lot of things while the OS is running.
A major task is to inform the OS when many events happen so the OS can look smart (as if it somehow figured these things out on its own). For example, the BIOS tells the OS when: the power button is pressed, batteries are inserted or removed, AC power comes or goes, the system connects to or disconnects from a docking station, hard drives and or certain types of optical drives are inserted or removed.
Most portable computers have features that you can access/control through Fn keys and through OS-level applications provided by the manufacturers. The BIOS responds to these hotkeys and has code to interface with the OS-level apps. Features like controlling screen brightness (which certain OSes want to appear to control) or controlling bling LEDs fall into this category.
Perhaps the most important task of the BIOS is to shut down the system when the power button is held down for more than 4 seconds (to recover from OS hangs!).
The biggest benefit to having OS control over BIOS now is to control hardware level variables such as fan speed, temperature gauges, etc.