On reset what happens in embedded system? - embedded

I have some questions about reset on power-up:
1. As I understand it, a microcontroller is hardwired to start at a particular memory location, say 0000h, on power-up. Does 0000h hold the reset service routine itself (initialization of the stack pointer, program counter, etc.), or does it hold a reset address (say 7000h), so that the microcontroller jumps to 7000h and the stack and PC initialization is written there?
2. Who writes this reset service routine? Is it the manufacturer of the microcontroller chip (Intel, Microchip, etc.), or can any programmer change it (for example, by changing the power-up reset address from 7000h to 4000h, so that the first instruction is fetched from 4000h instead of 7000h)?
3. How are the stack pointer and program counter initialized to their initial addresses, given that on power-up the microcontroller is not yet in a state to load addresses into those registers (no initialization has been done before the reset service routine runs)?
4. What steps should the reset service routine perform, considering all possibilities?

With reference to your numbering:
1. The hardware reset process is processor dependent and will be fully described in the data sheet or reference manual for the part, but your description is generally the case; different architectures may have subtle variations.
2. While some microcontrollers include a ROM-based bootloader that may contain start-up code, such bootloaders are typically only used to load code over a communications port, either to program flash memory directly or to load a secondary bootloader into RAM and execute it, which then programs flash memory. As far as C runtime start-up goes, this is either provided with the compiler/toolchain, or you write it yourself in assembler. Normally, even when start-up code is provided by the compiler vendor, it is supplied as source to be assembled and linked with your application. The compiler vendor cannot always know things like the memory map, SDRAM mapping and timing, processor clock speed or which oscillator crystal is used in your hardware, so the start-up code will generally need customisation or extension through initialisation stubs that you must implement for your hardware.
3. On ARM Cortex-M devices the initial PC and stack pointer are in fact loaded by hardware: they are stored at the reset address and loaded on power-up. In the general case, however, you are right: the reset address contains either the start-up code or a vector to the start-up code. On pre-Cortex ARM architectures, the reset address actually contains a jump instruction rather than a true vector address. Either way, the start-up code for a C/C++ runtime must at least initialise the stack pointer, initialise static data, perform any necessary C library initialisation and jump to main(). In the case of C++ it must also execute the constructors of any global static objects before calling main().
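As a rough illustration of the start-up steps listed above, here is a minimal sketch in C that runs on a host machine. The names `memory_layout`, `init_static_memory`, `startup` and `app_main` are hypothetical; real start-up code takes the section boundaries from linker-script symbols, and the stack pointer is assumed to have been set already (by hardware or by a few assembler instructions).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for the regions a linker script would define. */
typedef struct {
    const uint8_t *data_load;  /* initial values of static data, held in ROM/flash */
    uint8_t       *data_start; /* run-time location of that data in RAM */
    size_t         data_size;
    uint8_t       *bss_start;  /* zero-initialised statics in RAM */
    size_t         bss_size;
} memory_layout;

/* Initialise static data: copy the initialised part, clear the rest. */
void init_static_memory(const memory_layout *m)
{
    memcpy(m->data_start, m->data_load, m->data_size);
    memset(m->bss_start, 0, m->bss_size);
}

/* Placeholder for the application's main(); hypothetical. */
int app_main(void)
{
    return 0;
}

/* Order of operations a C start-up routine typically follows. */
int startup(const memory_layout *m)
{
    init_static_memory(m);
    /* C library initialisation and C++ global constructors would run here. */
    return app_main();
}
```

On a real target the last step never returns; start-up code usually loops forever if main() does return.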

Processor cores normally have, as you say, a starting address or table of some sort: either a list of addresses or, like ARM, a place where instructions are executed. What is wrapped around that core, but still within the chip, can vary. Cores that are not specific to the chip vendor, like 8051, MIPS, ARM, XScale, etc., are going to have a much wider range of different answers. Some microcontroller vendors, for example, will look at strap pins, and if the strap is wired a certain way when reset is released, the chip executes from a special boot flash inside the chip: a bootloader that you can use, for example, to program the user boot flash. If the strap is not tied that certain way, then sometimes it boots your user code. One vendor I know of still has it boot their bootloader flash; if the vector table has a valid checksum they jump to the reset vector in your vector table, otherwise they sit in their bootloader mode waiting for you to talk to them.
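The checksum-gated hand-off described above could look roughly like the following sketch. The rule used here (the first eight words of the table must sum to zero modulo 2^32) is an assumption chosen for illustration, and all names are hypothetical; the real rule for a given part is in the vendor's documentation.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CHECKED_WORDS 8u /* assumption: the checksum covers the first 8 entries */

/* True when the first CHECKED_WORDS entries sum to zero (mod 2^32). */
bool vector_table_valid(const uint32_t *table)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < CHECKED_WORDS; i++)
        sum += table[i];
    return sum == 0;
}

/* Fill in the checksum slot so that the table passes the check above. */
void vector_table_sign(uint32_t *table, size_t checksum_index)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < CHECKED_WORDS; i++)
        if (i != checksum_index)
            sum += table[i];
    table[checksum_index] = (uint32_t)(0u - sum); /* two's complement of the sum */
}

typedef enum { BOOT_USER_CODE, STAY_IN_BOOTLOADER } boot_decision;

/* What the vendor bootloader decides after reset. */
boot_decision choose_boot_path(const uint32_t *table)
{
    return vector_table_valid(table) ? BOOT_USER_CODE : STAY_IN_BOOTLOADER;
}
```

With this particular rule an erased flash (all 0xFF) fails the check, so a blank part stays in the bootloader.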
When you get into the bigger processors (non-microcontrollers), the software lives outside the processor, either on a boot flash (a separate chip from the processor) or in some RAM that is managed somehow before reset, etc. Those usually follow the rule for the core: start at address 0xFFFFFFF0 or start at address 0x00000000. If there is garbage there, oh well, fire off the undefined instruction vector; if that is garbage too, just hang there or sit in an infinite loop calling the undefined instruction vector. This works well for an ARM, for example: you can build a board with a boot flash that is erased from the factory (all 0xFFs), then use JTAG to stop the ARM and program the flash the first time, and you don't have to unsolder or socket or pre-program anything. So long as your bootloader doesn't hang the ARM you can have an unbrickable design. (Actually you can often hold the ARM in reset and still get at it with the JTAG debugger and not worry about bad code messing with the JTAG pins or hanging the ARM core.)
The short answer: how many different processor chip vendors have there been? There are many different solutions; as many as you can think of, and more, have been deployed. Placing a reset handler address in a known place in memory is the most common, though.
EDIT:
Questions 2 and 3. If you are buying a chip, some of the microcontrollers have this protected bootloader, but even with that you normally write the boot code that will be used by the product. Part of that boot code is to initialize the stack pointers, prepare memory, bring up parts of the chip and all those good things. Sometimes chip vendors will provide examples. If you are buying a board-level product, you will often find a board support package (BSP) which has working example code to bring up the board and perhaps do a few things. The beagleboard, the open-rd or the boards from embeddedarm.com, for example, come with a bootloader (U-Boot or other) and some already have Linux pre-installed. With boards like that the user usually just writes some Linux apps/drivers and adds them to the BSP, but you are not limited to that; you are often welcome to completely rewrite and replace the bootloader. And whoever writes the bootloader has to set up the stacks, bring up the hardware, etc.
On systems like the Gameboy Advance or NDS or the like, the vendor has some startup code that calls your startup code. So they may have the stack and such set up already, but they are handing off to you; much of the system may be up, and you just get to decide how to slice up the memories and where you want your stack, data, program, etc.
Some vendors want to keep this stuff controlled or secret, others do not. In some cases you may end up with a board or chip with no example code, just some data sheets and reference manuals.
If you want to get into this business, though, you need to be prepared to write this startup code (in assembler), which may call some C code to bring up the rest of the system, which in turn might start up the main operating system or application or whatever. It sounds like microcontrollers are what you are playing with; the answers to your questions are in the chip vendor's user guides, and some vendors are better than others. Search for the word reset or boot in the document to try to figure out what their boot schemes are. I recommend you use "dollar votes" to choose the better vendors. If a vendor has bad docs, secret docs or bad support, don't give them your money; spend it on vendors with freely downloadable, well written docs, with well written examples and/or user forums with full-time employees trolling around answering questions. There are times when the docs are not available except to serious, paying customers; it depends on the market. Most general purpose embedded systems, though, are openly documented. The quality varies widely, but the docs, etc. are there.

Depends completely on the controller/embedded system you use. The ones I've used in game development have the IP point at a starting address in RAM. The boot strap code supplied from the compiler initializes static/const memory, sets the stack pointer, and then jumps execution to a main() routine of some sort. Older systems also started at a fixed address, but you manually had to set the stack, starting vector table, and other stuff in assembler. A common name for the starting assembler file is CRT0.s for the stuff I've done.
So 1. You are correct. The microprocessor has to start at some fixed address.
2. The ISR can be supplied by the manufacturer or compiler creator, or you can write one yourself, depending on the complexity of the system in question.
3. The stack and initial program counter are usually handled via some sort of bootstrap routine that quite often can be overridden with your own code. See above.
Last: The steps will depend on the chip. If there is a power interruption of any sort, RAM may be scrambled and all ISR vector tables and startup code should be rewritten, and the app should be run as if it just powered up. But, read your documentation! I'm sure there is platform specific stuff there that will answer these for your specific case.

Related

why are system calls handled using interrupts?

I have a basic question about Linux system calls.
Why are system calls not handled just like normal function calls, and why are they handled via software interrupts instead?
Is it because no linking is performed between the user-space application and the kernel when the user application is built?
Linking between separately compiled pieces of code is a minor problem. Shared libraries have had a workaround for it for quite some time (relocatable code, export tables, etc). You pay the cost typically just once, when you load the library in the program.
The bigger problem is that you need to switch the CPU from the unprivileged user mode into the privileged kernel mode, and you need to do it in a controllable way, without letting user code escape and wreak havoc on the kernel. That is typically done with special or designated instructions. You may also benefit from automatic interrupt disabling when transitioning into the kernel, which the x86 int instruction can do for you. Most CPUs have something like this instruction and it is a common way of implementing the system call interface, although not the only one.
If you had asked about MS-DOS or the original MINIX, both of which ran on the i8086 in real address mode, where the kernel couldn't protect itself or other programs from anything because all the memory and system resources were accessible to all code, then there would be less reason to use a special instruction like int. There were not two modes, only one, and in that respect int would be largely equivalent to a simple call (far).
Also noteworthy is the fact that CPUs often handle the following 3 types of events in a very similar fashion:
hardware interrupts from I/O devices
exceptions, errors from code execution (e.g. division by 0, page faults, etc)
system calls
That makes using something like the int instruction a natural choice as your entry and exit points in all of the above handlers would be if not fully then largely identical.
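To make the "one mechanism for three kinds of events" point concrete, here is a small simulated sketch in C: a single vector table and a single dispatch routine serve a hardware interrupt, an exception and a system call. Everything here is hypothetical and runs as ordinary user code. Vector 0 (divide error) and 0x80 (the classic Linux `int 0x80` vector) follow x86 convention, while the timer vector number and the handler bodies are assumptions.

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative vector numbers; the timer number is an assumption. */
enum { VEC_DIVIDE_ERROR = 0, VEC_TIMER_IRQ = 32, VEC_SYSCALL = 0x80, VEC_COUNT = 256 };

typedef struct {
    uint32_t vector; /* which event occurred */
    uint32_t arg;    /* event-specific argument, e.g. a syscall argument */
    int32_t  result; /* value handed back to the interrupted code */
} trap_frame;

typedef void (*trap_handler)(trap_frame *frame);

static trap_handler handlers[VEC_COUNT];
static uint32_t timer_ticks;

static void on_divide_error(trap_frame *f) { f->result = -1; }
static void on_timer_irq(trap_frame *f)    { timer_ticks++; f->result = 0; }
static void on_syscall(trap_frame *f)      { f->result = (int32_t)f->arg + 1; } /* pretend service */

/* Install one handler per event type in the same table. */
void trap_init(void)
{
    handlers[VEC_DIVIDE_ERROR] = on_divide_error; /* exception */
    handlers[VEC_TIMER_IRQ]    = on_timer_irq;    /* hardware interrupt */
    handlers[VEC_SYSCALL]      = on_syscall;      /* system call */
}

/* Common entry point: the same path is taken whatever caused the trap.
 * Returns 0 if a handler ran, -1 if the vector has no handler. */
int trap_dispatch(trap_frame *f)
{
    if (f->vector >= (uint32_t)VEC_COUNT || handlers[f->vector] == NULL)
        return -1;
    handlers[f->vector](f);
    return 0;
}

uint32_t trap_timer_ticks(void)
{
    return timer_ticks;
}
```

In a real kernel the common part also saves and restores registers and switches privilege level, which is exactly the work that is shared between the three cases.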

Reading live RAM variables from a Micro controller in VB.net

I want to read the global variables via the JTAG port, live, when a program is running on the microcontroller. Is it possible?
JTAG defines only a physical interface; it does not describe the on-chip debug capabilities of a particular processor, which may or may not support access during execution.
Moreover whether it can be done in VB is not really the issue, the important issue is what hardware device and/or I/O port you are using for the JTAG interface, and whether a driver and API to access via .Net is available. That said VB.Net is not the first language I'd choose for that in any case.
A good place to start perhaps is OpenOCD, though it is not .Net specific.
"Almost-Live" is possibly doable, depending on the JTAG implementation. Often JTAG activity which reads memory does so by stealing cycles from the micro (or sometimes even inserting instructions into the pipeline). I'm not sure there's a micro which allows completely transparent access to memory over JTAG.
"All you need to do" is understand the JTAG implementation, know where the variable is located and issue a "memory read" command by wiggling the JTAG pins in the appropriate fashion. This is not a small task, which is why professional engineers are willing to pay (sometimes large amounts of) money for tools which perform this task.
Often the free (limited) toolchains the vendors provide can perform this also.
Yes, I suppose it is possible. But you'll need to drive the JTAG port (that sounds painful!) and know exactly where the data is stored on the chip, and what the formatting is.

cortex a9 boot and memory

I am a newbie starting out in microcontroller programming. The chip of interest here is the Cortex-A9. From my reading, at reset or power-up there has to be code at 0x00000000. My questions may sound trivial, but the answers will help me put some concepts in perspective.
Does the memory address 0x00000000 reside in ROM?
What happens right after the code is read from that address?
Should there be some sort of bootloader present, and if so, at what address should it be, and should it also reside in ROM?
Finally, at what point does the kernel kick in & where does the kernel code reside?
ARM sells cores, not chips; what resides at that address depends on the chip vendor that has bought the ARM core and put it in their chip. Implementations vary from vendor to vendor, chip to chip.
Traditionally an ARM will boot from address zero; more correctly, the reset exception vector is at address zero. Unlike other processor families, the traditional ARM model is NOT a list of addresses for exception entry points; instead the ARM EXECUTES the instruction at that address, which means you need to use either a relative branch or a load-PC instruction. The newer Cortex-M series, which is Thumb/Thumb-2 only (it cannot execute 32-bit ARM instructions), uses the traditional (non-ARM-like) list of addresses. Also, the zero address is not an exception vector; it holds the value to load into the stack pointer, then the second entry is reset, and so on. The Cortex-M exception list is also different: that family can have on the order of a hundred or more individual interrupts (the exact number depends on the chip), where the traditional ARM has two, fast and normal. There is a recent Cortex-M based question, or perhaps it was phrased as a Thumb-2 question, about running Linux on a Thumb-2 ARM. I think the Cortex-M implementations are all microcontroller-class chips and only have on-chip memory in the tens of kbytes; basically these don't fall into the category you are asking about. And you are asking about Cortex-A9 anyway.
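The difference between the two reset schemes can be sketched in a few lines of C. The first function builds the 32-bit ARM `B` instruction that a traditional ARM would execute at its reset vector; the second reads the first two words of a Cortex-M style vector table the way the hardware does at reset. The function and type names are hypothetical; the encodings follow the ARM architecture documentation.

```c
#include <stdint.h>

/* Classic ARM: the word at the reset vector is executed, so it is usually a
 * branch. Encode an unconditional "B target" placed at address `from`.
 * The offset is counted in words, relative to from + 8 (pipeline effect). */
uint32_t arm_encode_branch(uint32_t from, uint32_t target)
{
    uint32_t offset_words = (target - (from + 8u)) >> 2;
    return 0xEA000000u | (offset_words & 0x00FFFFFFu);
}

typedef struct {
    uint32_t sp; /* initial stack pointer */
    uint32_t pc; /* address of the first instruction */
} reset_state;

/* Cortex-M: word 0 of the table is the initial stack pointer and word 1 is
 * the reset handler address. Bit 0 of the handler address marks Thumb state
 * and is not part of the fetch address. */
reset_state cortex_m_reset(const uint32_t *vector_table)
{
    reset_state s;
    s.sp = vector_table[0];
    s.pc = vector_table[1] & ~1u;
    return s;
}
```

For example, a branch-to-self at address 0 encodes as 0xEAFFFFFE, the familiar "hang here" instruction.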
A number of cores, or maybe all of them, have a boot option where the boot address can be 0x00000000 or an alternate address, something like 0xFFFF0000. Using that would be very confusing for ARM users, but it provides the ability, for example, to have a ROM at one address and RAM at another, allowing you to boot on power-up from ROM and then switch the exception table into RAM for runtime operation. You probably have a chip with a core that can do this, but it is up to the chip vendor whether to use these edge-of-the-core features or to hardwire them to some setting and not provide you that flexibility.
You need to look at the datasheet/docs for the chip in question. Find out what the name of the ARM core is; as you mentioned, Cortex-A9. Ideally you want to know the rev as well, an r0p0 kind of thing, then go to ARM's website and find the TRM, the Technical Reference Manual for that core. You will also want to get a copy of the ARM ARM, the ARM Architecture Reference Manual. The (traditional) ARM exception vectors are described in the ARM ARM, along with a ton more info. You also need the chip vendor's documentation, and you need to look into their boot scheme. Some will point address zero to the boot PROM on power-up; then the bootloader will need to do something, such as flip a bit in a register, and the memory controller will switch address 0 to RAM. Some might have address 0 always configured as RAM and some other address always configured as ROM, let's say 0x80000000 for example, and the chip will copy some items from ROM to RAM for you before boot. Or the chip may simply have the power-up setting for the reset vector be a branch to ROM, and then it is up to the bootloader to patch up the vector table. As many different schemes as you can think of, it is likely someone has tried them, so you have to study the chip vendor's documentation or sample code to understand. Basically the answer to your ROM question is: it depends, and you have to check with the chip vendor.
The ARM TRM for the core should describe the strap options on the core, if any (like being able to boot from an alternate address); you then have to connect those with the strap options, if any, that the vendor has actually implemented. The ARM ARM is not really going to get into that the way the TRM does. A vendor worth buying from, though, will have some of their own documentation and/or code that shows what their ROM-based boot strategy is.
For a system destined to be a Linux system you are going to have a bootloader, some non-Linux code (very much like the BIOS on your desktop/laptop) that brings up the system and eventually launches Linux. Linux is going to need a fair amount of memory (relative to microcontrollers and other well-known ARM implementations); that RAM may end up being SRAM or DRAM, and the bootloader may have to initialize the memory interface before it can start Linux up. There are popular bootloaders like RedBoot and U-Boot. Both are significant overkill, but they provide features for developers and users like being able to re-flash Linux, etc.
ARM Linux has ATAGs (ARM TAGs). You can use both the traditional Linux command line, to tell Linux boot information such as where to find the root file system, and ATAGs. ATAGs are structures in memory; a register (r2 in the classic ARM Linux boot protocol) is set to point at them when you branch from the bootloader to Linux. The general concept, though, is that the chip powers up and boots from ROM or RAM; it prepares RAM so that it is ready to use; Linux might want/need to be copied from ROM to RAM; the root file system, if separate, might want to be copied to somewhere else in RAM. ATAGs are prepared to tell ARM Linux where to decompress if need be, as well as where to find the command line and/or things like the root file system; some registers are prepared as parameters passed to Linux; and lastly the bootloader branches to the entry point of the Linux kernel.
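As a hedged illustration of what "preparing ATAGs" means, the sketch below builds a small tagged list in a word buffer and walks it again. The tag identifiers and the two-word header (size in 32-bit words, then tag id) follow the classic ARM Linux boot protocol; the helper names are hypothetical, only a few tag types are shown, and there is no bounds checking. In that protocol the bootloader enters the kernel with r0 = 0, r1 = machine type and r2 = address of this list.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ATAG_NONE    0x00000000u
#define ATAG_CORE    0x54410001u
#define ATAG_MEM     0x54410002u
#define ATAG_CMDLINE 0x54410009u

/* Each helper appends one tag at word index `pos` and returns the next free
 * index. The caller must supply a large enough buffer. */
size_t atag_put_core(uint32_t *buf, size_t pos)
{
    buf[pos]     = 2;          /* header only: an empty core tag */
    buf[pos + 1] = ATAG_CORE;  /* must be the first tag in the list */
    return pos + 2;
}

size_t atag_put_mem(uint32_t *buf, size_t pos, uint32_t size, uint32_t start)
{
    buf[pos]     = 4;
    buf[pos + 1] = ATAG_MEM;
    buf[pos + 2] = size;       /* size of the RAM region in bytes */
    buf[pos + 3] = start;      /* physical start address */
    return pos + 4;
}

size_t atag_put_cmdline(uint32_t *buf, size_t pos, const char *cmdline)
{
    size_t bytes = strlen(cmdline) + 1; /* include the terminator */
    size_t words = (bytes + 3) / 4;     /* round up to whole words */
    buf[pos]     = (uint32_t)(2 + words);
    buf[pos + 1] = ATAG_CMDLINE;
    memset(&buf[pos + 2], 0, words * 4);
    memcpy(&buf[pos + 2], cmdline, bytes);
    return pos + 2 + words;
}

size_t atag_put_none(uint32_t *buf, size_t pos)
{
    buf[pos]     = 0;          /* the terminator's size field is 0 */
    buf[pos + 1] = ATAG_NONE;
    return pos + 2;
}

/* Walk the list the way the kernel would; returns the tag or NULL. */
const uint32_t *atag_find(const uint32_t *list, uint32_t tag)
{
    const uint32_t *p = list;
    while (p[1] != ATAG_NONE) {
        if (p[1] == tag)
            return p;
        p += p[0];             /* size field gives the distance to the next tag */
    }
    return NULL;
}
```

The memory size, start address and command line passed to these helpers come from the bootloader's knowledge of the board.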
You have to have boot code available at the address where the hardware starts executing.
This is usually accomplished by having the hardware map some sort of flash or boot ROM to the boot address and start running from there.
Note that in microcontrollers the code that starts running at boot has a pretty tough life: no hardware is initialized yet, and by no hardware I mean that even the DDR controllers that control access to RAM are not working yet, so your code needs to run without RAM.
After the initial boot code has set up enough of the hardware (e.g. set up the RAM chips, set up the TLBs, programmed the MACs, etc.) the bootloader runs.
In some systems, the initial boot code is just the first part of the boot loader. In some systems, a dedicated boot code sets things up and then reads the boot loader from flash and runs it.
The job of the boot loader is to bring the image of the kernel/OS into RAM, usually from flash or the network (but it can also be shared memory with another board, PCI buses and the like, although that is rarer). Once the boot loader has the image of the kernel/OS binary in RAM it might optionally uncompress it, and then hand over control to (call) the start address of the kernel/OS image.
Sometimes the kernel/OS image is actually a small decompressor and a blob of compressed kernel.
At any rate the end result is that the kernel/OS is available in RAM and the boot loader, optionally through the piggy back decompressor, has passed control to it.
Then the kernel/OS starts running and the OS is up.

How does one use dynamic recompilation?

It came to my attention that some emulators and virtual machines use dynamic recompilation. How do they do that? In C I know how to call a function in RAM using typecasting (although I have never tried), but how does one read opcodes and generate code for it? Does the person need to have premade assembly chunks and copy/batch them together? Is the assembly written in C? If so, how do you find the length of the code? How do you account for system interrupts?
-edit-
System interrupts and how to (re)compile the data are what I am most interested in. Upon more research, I heard of one person (no source available) who used JS: read the machine code, output JS source and use eval to 'compile' the JS source. Interesting.
It sounds like I MUST have knowledge of the target platform's machine code to dynamically recompile
Yes, absolutely. That is why parts of the Java Virtual Machine must be rewritten (namely, the JIT) for every architecture.
When you write a virtual machine, you have a particular host-architecture in mind, and a particular guest-architecture. A portable VM is better called an emulator, since you would be emulating every instruction of the guest-architecture (guest-registers would be represented as host-variables, rather than host-registers).
When the guest- and host-architectures are the same, like VMWare, there are a ton of (pretty neat) optimizations you can do to speed up the virtualization - today we are at the point that this type of virtual machine is BARELY slower than running directly on the processor. Of course, it is extremely architecture-dependent - you would probably be better off rewriting most of VMWare from scratch than trying to port it.
It's quite possible - though obviously not trivial - to disassemble code from a memory pointer, optimize the code in some way, and then write back the optimized code - either to the original location or to a new location with a jump patched into the original location.
Of course, emulators and VMs don't have to RE-write, they can do this at load-time.
This is a wide open question, not sure where you want to go with it. Wikipedia covers the generic topic with a generic answer. The native code being emulated or virtualized is replaced with native code. The more the code is run the more is replaced.
I think you need to do a few things. First decide if you are talking about an emulation or a virtual machine like VMware or VirtualBox. In an emulation, the processor and hardware are emulated in software, so the next instruction is read by the emulator, the opcode is pulled apart by code and you determine what to do with it. I have been doing some 6502 emulation and static binary translation, which is dynamic recompilation but pre-processed instead of real time.
So your emulator may take an LDA #10, load A with immediate. The emulator sees the load A immediate instruction and knows it has to read the next byte, which is the immediate. The emulator has a variable in the code for the A register and puts the immediate value in that variable. Before completing the instruction the emulator needs to update the flags: in this case the Zero flag is clear, the N flag is clear, and C and V are untouched. But what if the next instruction was a load X immediate? No big deal, right? Well, the load X will also modify the Z and N flags, so the next time you execute the load A instruction you may figure out that you don't have to compute the flags because they will be destroyed; it is dead code in the emulation.
You can continue with this kind of thinking. Say you see code that copies the X register to the A register, then pushes the A register on the stack, then copies the Y register to the A register and pushes it on the stack; you could replace that chunk with simply pushing the X and Y registers on the stack. Or you may see a couple of add-with-carries chained together to perform a 16-bit add and store the result in adjacent memory locations. Basically you are looking for operations that the processor being emulated couldn't do but that are easy to do in the emulation.
Static binary translation, which I suggest you look into before dynamic recompilation, performs this analysis and translation in a static manner, as in, before you run the code. Instead of emulating, you translate the opcodes to C, for example, and remove as much dead code as you can (a nice feature is that the C compiler can remove more dead code for you).
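The LDA/LDX example above can be written as a tiny interpreter step in C. This is a minimal sketch that handles only the two immediate-load opcodes (0xA9 for LDA #imm and 0xA2 for LDX #imm on the 6502); the struct and function names are hypothetical.

```c
#include <stdint.h>

#define FLAG_C 0x01u
#define FLAG_Z 0x02u
#define FLAG_V 0x40u
#define FLAG_N 0x80u

typedef struct {
    uint8_t  a, x; /* emulated registers held in host variables */
    uint8_t  p;    /* status flags */
    uint16_t pc;   /* program counter */
} cpu6502;

/* Update Z and N from a loaded value; C and V are left untouched. */
static void set_zn(cpu6502 *cpu, uint8_t value)
{
    cpu->p &= (uint8_t)~(FLAG_Z | FLAG_N);
    if (value == 0)
        cpu->p |= FLAG_Z;
    if (value & 0x80u)
        cpu->p |= FLAG_N;
}

/* Fetch, decode and execute one instruction.
 * Returns 0 on success, -1 for an opcode this sketch does not handle. */
int step(cpu6502 *cpu, const uint8_t *mem)
{
    uint8_t opcode = mem[cpu->pc++];
    switch (opcode) {
    case 0xA9: /* LDA #imm */
        cpu->a = mem[cpu->pc++];
        set_zn(cpu, cpu->a);
        return 0;
    case 0xA2: /* LDX #imm */
        cpu->x = mem[cpu->pc++];
        set_zn(cpu, cpu->x);
        return 0;
    default:
        return -1;
    }
}
```

A translator that sees LDA #imm immediately followed by LDX #imm could drop the first `set_zn` call, since the second load overwrites Z and N before anything reads them. That is the dead-code removal described above.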
Once the concepts of emulation and translation are understood, you can try to do it dynamically; it is certainly not trivial. I would again suggest trying a static translation of a binary to the machine code of the target processor first, which is a good exercise. I wouldn't attempt dynamic run-time optimizations until I had succeeded in performing them statically against a/the binary.
Virtualization is a different story: you are talking about running the same processor on the same processor, so x86 on an x86 for example. The beauty here is that, on reasonably recent x86 processors, you can take the program being virtualized and run the actual opcodes on the actual processor, with no emulation. You set up traps built into the processor to catch things. Loading values into AX, adding BX and so on all happen in real time on the processor. When the program wants to read or write memory, what happens depends on your trap mechanism: if the addresses are within the virtual machine's RAM space, there are no traps, but let's say the program writes to an address which is the virtualized UART. You have the processor trap that, and then VMware or whatever decodes that write and emulates it talking to a real serial port. That one instruction, though, wasn't real time; it took quite a while to execute. What you could do, if you chose to, is replace that instruction or set of instructions that write a value to the virtualized serial port and have them write to a different address instead, which could be the real serial port or some other location that is not going to cause a fault and force the VM manager to emulate the instruction. Or add some code in the virtual memory space that performs a write to the UART without a trap, and have the original code branch to this UART write routine instead. The next time you hit that chunk of code, it runs in real time.
Another thing you can do is, for example, emulate and, as you go, translate to a virtual intermediate bytecode, like LLVM's. From there you can translate from the intermediate machine to the native machine, eventually replacing large sections of the program if not the whole thing. You still have to deal with the peripherals and I/O.
Here's an explanation of how they are doing dynamic recompilation for the 'Rubinius' Ruby interpreter:
http://www.engineyard.com/blog/2010/making-ruby-fast-the-rubinius-jit/
This approach is typically used by environments with an intermediate byte code representation (like Java, .NET). The byte code contains enough "high level" structures (high level in the sense of higher level than machine code) that the VM can take chunks out of the byte code and replace them with a compiled memory block. The VM typically decides which part gets compiled by counting how many times the code has already been interpreted, since the compilation itself is a complex and time-consuming process. So it is useful to compile only the parts which get executed many times.
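The counting idea can be sketched in a few lines of C. The threshold value and all names below are hypothetical; a real VM would call its embedded compiler where the comment indicates and would then jump to the generated machine code.

```c
#include <stdbool.h>
#include <stdint.h>

#define HOT_THRESHOLD 1000u /* arbitrary value chosen for illustration */

typedef struct {
    uint32_t run_count; /* how many times the block has been interpreted */
    bool     compiled;  /* set once the block has been handed to the compiler */
} code_block;

/* Called each time execution enters the block. Returns true when the block
 * should run as compiled code, false when it should still be interpreted. */
bool enter_block(code_block *b)
{
    if (b->compiled)
        return true;
    if (++b->run_count >= HOT_THRESHOLD) {
        /* A real VM would invoke its embedded compiler here. */
        b->compiled = true;
    }
    return b->compiled;
}
```

Blocks that run only a handful of times never pay the compilation cost, which is the trade-off described above.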
but how does one read opcodes and generate code for it?
The scheme of the opcodes is defined by the specification of the VM, so the VM opens the program file, and interprets it according to the spec.
Does the person need to have premade assembly chunks and copy/batch them together? is the assembly written in C?
This process is an implementation detail of the VM; typically there is an embedded compiler which is capable of transforming the VM opcode stream into machine code.
How do you account for system interrupts?
Very simple: you don't. The code in the VM can't interact with real hardware. The VM interacts with the OS and transfers OS events to the code by jumping to or calling specific parts of the interpreted code. Every event in the code or from the OS must pass through the VM.
Hardware virtualization products can also use some kind of JIT. A typical use case in the x86 world is the translation of 16-bit real-mode code to 32- or 64-bit protected-mode code, so as not to be forced to emulate a CPU in real mode. A software-only VM also replaces jump instructions in the executing code with jumps into the VM control software, which at each branch scans the following code path for jump instructions and replaces them before it jumps to the real code destination. But I doubt whether the jump replacement qualifies as JIT compilation.
IIS does this by shadow copying: after compilation it copies assemblies to some temporary place and runs them from temp.
Imagine that a user changes some files. IIS will then recompile the assemblies in the following steps:
It recompiles (all requests are still handled by the old code)
It copies the new assemblies (all requests are still handled by the old code)
All new requests are handled by the new code; all requests already in progress are handled by the old code.
I hope this'd be helpful.
A virtual machine loads "byte code" or "intermediate language", not machine code, so I suppose it just recompiles the byte code more efficiently once it has more runtime data.
http://en.wikipedia.org/wiki/Just-in-time_compilation

How firmwares communicate to the electronic devices to perform its operations?

Almost all electronic devices come with firmware. I know it is stored in ROM (read-only memory), so it is non-volatile (no power source is required to keep the contents from being erased, unlike RAM).
What I want to know is "How firmwares communicate to the electronic devices to perform its operations?"
Let's say there is a small roller. When a button is pressed, how does the firmware make it move?
Can someone please explain what goes on behind the scenes to make this happen?
I think it may take a brief explanation to unwind it.
Also, what is the most popular language used for writing firmware?
Modern hardware like you're describing has a program stored in ROM and an all-purpose microcomputer (CPU) executing that program.
The CPU reads information from ROM by setting up addresses on its address bus and then asking the ROM to tell it the value stored at that location. There's something like a read pulse being raised (on a separate line) to tell the ROM to make the value accessible on the lines of the data bus. That, in a nutshell, is reading.
To get the hardware to do something, the CPU basically executes a kind of write operation. It puts a value, which is just a bunch of bits if you want to look at it that way, on the address bus to select a certain device and perhaps function on that device, then it raises another signal line saying "write!" The device that recognizes its address on the address bus responds to that signal by accepting the data from the data bus and then performing whatever its function is. Typically, one of the data bus bits will be connected within the output device to a power output stage, i.e. a transistor stronger than the ones used just for computation, and that transistor will connect some electrical device to current sufficient to make it move/glow/whatever.
Tiny, cheap devices are coded in assembly language to save costs for ROM; in industrial quantities, even small amounts of memory can affect price. The assembly language is specific to the CPU; some chips called "8051", "6502" and "Atmel (something or other)" are popular. Bigger devices with more complex requirements may have their firmware written in C or a C-like dialect, which makes programming a little easier than assembler. The biggest ones even run C++ code. Compiled, of course.
In most systems there are special memory addresses which are used for I/O. Reading and writing on such addresses executes some function instead of just moving data around. In x86 systems there are also special I/O instructions IN and OUT for that.
The simplest case is called general-purpose I/O (GPIO), where you can read or write data directly from/to external electrical pins on the device. There are several memory addresses, called registers: one where you can read data from the port (voltage near 0 = 0, near supply voltage = 1), one where you can write data to the port, and one where you can define whether a particular pin is an input (the corresponding bit is typically 0) or an output (the bit is 1). Every microcontroller has GPIO.
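A minimal sketch of those three registers in C follows. The register layout is hypothetical and the port is an ordinary struct so the code runs on a host; on a real microcontroller the struct would be placed at a fixed address taken from the data sheet, and the register names and bit meanings would follow that data sheet.

```c
#include <stdint.h>

/* Hypothetical GPIO port: one bit per pin in each register. */
typedef struct {
    volatile uint32_t dir; /* direction: 0 = input, 1 = output */
    volatile uint32_t in;  /* levels read from the pins */
    volatile uint32_t out; /* levels driven onto output pins */
} gpio_port;

void gpio_set_output(gpio_port *port, unsigned pin)
{
    port->dir |= (1u << pin);
}

void gpio_set_input(gpio_port *port, unsigned pin)
{
    port->dir &= ~(1u << pin);
}

/* Drive an output pin high (level != 0) or low (level == 0). */
void gpio_write(gpio_port *port, unsigned pin, int level)
{
    if (level)
        port->out |= (1u << pin);
    else
        port->out &= ~(1u << pin);
}

/* Read the current level of a pin: returns 0 or 1. */
int gpio_read(const gpio_port *port, unsigned pin)
{
    return (int)((port->in >> pin) & 1u);
}
```

The `volatile` qualifier matters on real hardware: it tells the compiler that every read and write must actually reach the register.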
So in your example the button could be connected to a pin set to input, which the software could sense. It would typically do this every 10 ms and only react once it has seen a stable value for several reads; this is called debouncing. Then it would write a 1 to some output, which via some transistor for amplification could drive a motor. If it senses that you have released the switch, it could turn the motor off again by writing a 0. And so on; this program would run until you turn the device off.
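The debouncing step can be sketched as a small state machine in C. The number of stable reads and the names are assumptions; the function is meant to be called once per sampling period (for example every 10 ms) with the raw level read from the input pin.

```c
#include <stdbool.h>
#include <stdint.h>

#define STABLE_READS 3u /* assumption: accept a level after 3 identical samples */

typedef struct {
    bool    stable;   /* debounced button state */
    bool    last_raw; /* most recent raw sample */
    uint8_t count;    /* consecutive samples equal to last_raw */
} debouncer;

/* Feed one raw sample; returns the debounced state. */
bool debounce_sample(debouncer *d, bool raw)
{
    if (raw == d->last_raw) {
        if (d->count < STABLE_READS)
            d->count++;
    } else {
        d->last_raw = raw; /* level changed: start counting again */
        d->count = 1;
    }
    if (d->count >= STABLE_READS)
        d->stable = d->last_raw;
    return d->stable;
}
```

The main loop would then write the returned value to the motor output pin each period, so a short glitch on the button line never reaches the motor.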
There are lots of other I/O devices for other purposes, typically with hundreds of registers for controlling them. If you want to see more you could look into the data sheet of some microcontroller. For example, see the data sheet of the ATtiny4/5/9/10, a very small controller from the Atmel AVR family.
Today most firmware is written in C, except for the smallest devices and for a little special code for handling resets and interrupts, which is written in assembly language.