First some background:
We have the following setup in our iMX6-based embedded system. There are two U-Boot partitions and two system (Linux) partitions. Currently we use only the first U-Boot partition and it uses a standard method for selecting, running and (if need be) rolling back the system partitions.
We are now looking into a similar scheme for upgrading U-Boot itself (this will happen very rarely but we do want the ability to do this without having to return the devices to base).
However, this is more fraught with danger because, once you tell the iMX6 device to boot from the alternate U-Boot partition, that's it - there's no U-Boot/watchdog combo that will revert to the previous one if boot fails, so a bad update runs a serious risk of bricking the device until we can return it to base for repair (a costly option which is why we're trying to mitigate it as much as possible).
The method chosen is a two-step U-Boot install procedure, consisting of 'write' and 'activate'. It relies on our ability to successfully figure out which U-Boot partition will be run if the device reboots (the selected one) and which is currently being run (the booted one). We've got this bit sorted out already.
But the bit that we're missing is the ability for U-Boot to transfer control to the other U-Boot partition under some circumstances. We got it doing different actions based solely on the U-Boot environment as follows:
First, mmcboot has a prefix added so that it checks for the control transfer, specifically it's set to run ub_xfer_chk ; <original content of mmcboot>.
Secondly, we have a variable ub_xfer_flag normally set to 0.
Thirdly, we have the checking function ub_xfer_chk, defined as:
if test ${ub_xfer_flag} -eq 1 ; then
echo Soft-booting other UBoot...
setenv ub_xfer_flag 0
saveenv
weave_magic
fi
The weave_magic code is where we are having trouble :-) The idea is that this will load the other U-Boot partition into memory (at our CONFIG_SYS_TEXT_BASE of 0x17800000) and execute it as if the actual device had done it.
We've tested the meat of this solution by using reset in place of weave_magic and it successfully restarts the device once, so we're certain we can make it safe.
My specific question then is: how can I convince U-Boot to load a second copy from another partition and run it?
The two U-Boot partitions live in the /dev/mmcblk3boot0 and /dev/mmcblk3boot1 devices accessible from the system partition and are 2M files, including the 1K lead-in header and a fair bit of padding at the end.
Update:
We have actually had some success and managed to load an IMX image from the boot partition with the command:
ext4load mmc ${mmcdev}:${mmcpart} 0x17800000 ${bootdir}/u-boot.imx
but, when trying to execute it with:
go 0x17800000
we get an illegal instruction and immediate reboot:
pc : [<17800070>] lr : [<4ff83c64>]
sp : 4f579ac0 ip : 00000030 fp : 4f57be58
r10: 00000002 r9 : 4f579efc r8 : 4ffbe2b0
r7 : 4f57be68 r6 : 17800000 r5 : fffff200 r4 : 000002cc
r3 : 17800000 r2 : 4f57be6c r1 : 4f57be6c r0 : 00000000
Flags: nZCv IRQs off FIQs off Mode SVC_32
Resetting CPU ...
So I'm guessing that's not executable code at the start of that file. Any ideas on where to go from here?
The actual code in the IMX file is not at the beginning. You can discover this fact by using the excellent on-line disassembler with armv5 big-endian no-thumb architecture to figure out that the bytes at the beginning frequently give you invalid and/or not-very-sensible code:
ldtdmi a1, [a1], -a2 ; <UNPREDICTABLE>
strne a1, [a1, a1]
andeq a1, a1, a1
ldrbne pc, [pc, -ip, lsr #8]! ; <UNPREDICTABLE>
In any case, the data at the start of the IMX file is known to be header information (the d1 at the start is a "magic" marker indicating the IVT header, and there should also be a DCD block after that). However, even beyond the IVT and DCD blocks (based on their purported lengths in the header fields), the code is not sensible.
However, there's viable information at offset 0xc00 following a large chunk of 0x00 bytes:
00000be0: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00000bf0: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00000c00: 0f00 00ea 14f0 9fe5 14f0 9fe5 14f0 9fe5 ................
00000c10: 14f0 9fe5 14f0 9fe5 14f0 9fe5 14f0 9fe5 ................
Putting the hex bytes at offset 0xc00 into the disassembler, and adjusting for areas that are skipped by branches, shows both valid and sane ARM code.
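The 0xc00 offset does not have to be found by eye. In the i.MX6 boot image layout the IVT stores both the absolute entry address and the absolute address of the IVT itself, so their difference is the file offset of the first instruction, assuming the IVT sits at byte 0 of the file as it appears to here. The following is a minimal host-side sketch under that assumption; the function name imx_entry_offset is mine and is not part of any U-Boot or NXP tool.

```c
#include <stddef.h>
#include <stdint.h>

/* i.MX6 Image Vector Table: a one-word header (tag, length, version)
 * followed by seven little-endian 32-bit fields. */
#define IVT_TAG       0xD1u
#define IVT_SIZE      32u
#define IVT_OFF_ENTRY 4u   /* absolute address of the first instruction */
#define IVT_OFF_SELF  20u  /* absolute address of the IVT itself */

/* Read a little-endian 32-bit value regardless of host byte order. */
static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Hypothetical helper: stores the file offset of the entry point in
 * *offset and returns 0, or returns -1 if the buffer does not start
 * with a plausible IVT. Assumes the IVT is at byte 0 of the image. */
int imx_entry_offset(const uint8_t *image, size_t len, uint32_t *offset)
{
    if (image == NULL || offset == NULL || len < IVT_SIZE)
        return -1;
    if (image[0] != IVT_TAG)
        return -1;

    uint32_t entry = read_le32(image + IVT_OFF_ENTRY);
    uint32_t self  = read_le32(image + IVT_OFF_SELF);

    /* The entry point must lie inside the file, after the IVT start. */
    if (entry < self || entry - self >= len)
        return -1;

    *offset = entry - self;
    return 0;
}
```

For an image whose IVT holds entry = 0x17800000 and self = 0x177ff400 (assumed values, chosen to be consistent with the load address used here rather than read from the real file), the helper reports 0xc00, which is 3072 in decimal.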
And, indeed, stripping the IMX file with:
dd if=u-boot.imx bs=1 skip=3072 of=ub-at-c00.imx
should give you a file you can boot with:
ext4load mmc ${mmcdev}:${mmcpart} 0x17800000 ${bootdir}/ub-at-c00.imx
go 0x17800000
When we do this, it outputs:
U-Boot 2014.04 (Nov 07 2018 - 19:05:32)
CPU: Freescale i.MX6Q rev1.5 at 792 MHz
CPU: Temperature 32 C, calibration data: 0x5764e169
Reset cause: unknown reset
Board: DTI BRD0208 (Spitfire I) 05/01/2017
I2C: ready
DRAM: 1 GiB
We know this is the newer U-Boot simply because the normal one we're using outputs an October date rather than a November one.
Unfortunately, it hangs at that point, with the watchdog timer eventually kicking in and rebooting back to the original U-Boot, but I suspect that has to do with U-Boot not liking the current state of the device (i.e., it doesn't like initialising it twice).
So we'll have to figure out how to convince it to do so but at least we've gotten it booting another copy of itself, which is what the question was about.
I'm new to controller coding. Please anyone help me to understand the below points.
1. How code executes in the controller?
2. If we dump the code to the controller it will save it in the Flash memory. After reset how the code will fetch from the memory?
3. What all the process will be execute in the controller?
4. I came to know that at the run time code will be copied to RAM memory(?) and executes from the RAM. Is this statement is correct? If so when flash code move to RAM?
5. If code will copy from flash to RAM, then it will use the RAM space. Then that much of RAM bytes is occupied, so Stack and heap need to be used after this memory?
I'm really confused how it works.
You say controller do you mean microcontroller?
Microcontrollers are designed to be systems on a chip, this includes the non-volatile storage where the program lives. Namely flash or some other form of rom. Just like on your x86 desktop/laptop/server there is some rom/flash in the address space of the processor at the address that the processor uses to boot. You have not specified a microcontroller so it depends on which microcontroller you are talking about as to the specific address and those details, but that doesnt matter in general they all tend to be designed to work the same way.
So there is some flash to use as a general term mapped into the address space of the processor, your reset/interrupt vector tables or start address or whatever the architecture requires PLUS your program/application are in flash in the address space. Likewise some amount of ram is there, generally you do NOT run your programs from ram like you would with your laptop/desktop/server, the rams tend to be relatively small and the flash is there for your program to live. There are exceptions, for example performance, sometimes the flash operates with wait states, and often the sram can run at the cpu rate so you might want to copy some execution time sensitive routines to ram to be run. Generally not though.
There are exceptions of course, these would include situations where there is a (sometimes semi-secret) rom with a bootloader in the chip, and your program is loaded from outside the chip into ram then run. Sometimes you may wish to design your application that way for some reason, and having bootloaders is not uncommon, a number of microcontrollers have a chip vendor supplied bootloader in a separate flash space that you may or may not be able to replace, these allow you to do development or in circuit programming of the flash.
A microcontroller contains a processor just like your desktop/laptop/server or phone or anything else like that. It is a system on a chip rather than spread across a board, so you have the processor itself, you have some non-volatile storage as mentioned above and you have ram and the peripherals all on the same chip. So just like any other processor there are logic/design defined rules for how it boots and runs (uses a vector table of addresses or uses well known entry point addresses) but beyond that it is just machine code instructions that are executed. Nothing special. What all processes are run are the ones you write and tell it to run, it runs the software you write which at the end of the day is just machine code. Processes, functions, threads, tasks, procedures, etc these are all human terms to try to manage software development, you pick the language (although the vast majority are programmed in C with a little assembly) and the software design so long as it fits within the constraints of the system.
EDIT
So lets say I had an arm microcontroller with flash starting at address 0x00000000 and ram starting at address 0x20000000. Assume an older arm like the ARM7TDMI which was used in microcontrollers (some of which can still be purchased). So the way that processor boots is there are known addresses that execution starts for reset and for interrupts and undefined exceptions and things like that. The reset address is 0x00000000 so after reset the processor starts execution at address 0x00000000, it reads that instruction first and runs it. The next exception handler starts execution at address 0x00000004 and so on for several possible exceptions, so as you will see we have to branch out of this exception table as the first thing we do.
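To make the "and so on" concrete, here is a small sketch of the classic ARM exception vector layout as a lookup table, assuming the vectors sit at 0x00000000 as in this example. The struct and function names are mine, purely for illustration. There are eight slots, four bytes apart, which is why each slot only has room for a single branch instruction.

```c
#include <stddef.h>
#include <stdint.h>

/* One slot of the exception vector table (illustrative type). */
struct arm_vector {
    uint32_t address;  /* where execution starts for this exception */
    const char *name;
};

/* Classic ARM (e.g. ARM7TDMI) vector layout with the table at address 0. */
static const struct arm_vector arm_vectors[] = {
    { 0x00, "reset" },
    { 0x04, "undefined instruction" },
    { 0x08, "software interrupt (SWI)" },
    { 0x0C, "prefetch abort" },
    { 0x10, "data abort" },
    { 0x14, "reserved" },
    { 0x18, "IRQ" },
    { 0x1C, "FIQ" },
};

/* Returns the exception name for a vector address, or NULL if the
 * address is not one of the eight vector slots. */
const char *arm_vector_name(uint32_t address)
{
    size_t count = sizeof arm_vectors / sizeof arm_vectors[0];
    for (size_t i = 0; i < count; i++) {
        if (arm_vectors[i].address == address)
            return arm_vectors[i].name;
    }
    return NULL;
}
```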
here is an example program that would run but doesnt do anything interesting, just demonstrates a few things.
vectors.s
.globl _start
_start:
b reset
b hang
b hang
b hang
b hang
b hang
b hang
b hang
reset:
mov sp,#0x20000000
orr sp,sp,#0x8000
bl one
hang: b hang
one.c
unsigned int hello;
unsigned int world;
extern unsigned int two ( unsigned int );
unsigned int one ( void )
{
hello=5;
world=6;
world+=two(hello);
return(hello+world);
}
two.c
extern unsigned int hello;
extern unsigned int world;
unsigned int two ( unsigned int temp )
{
hello++;
world+=2;
return(hello+world+temp);
}
memmap (the linker script)
MEMORY
{
rom : ORIGIN = 0x00000000, LENGTH = 0x10000
ram : ORIGIN = 0x20000000, LENGTH = 0x8000
}
SECTIONS
{
.text : { *(.text*) } > rom
.bss : { *(.bss*) } > ram
}
and then I build it
arm-none-eabi-as --warn --fatal-warnings vectors.s -o vectors.o
arm-none-eabi-gcc -Wall -Werror -O2 -nostdlib -nostartfiles -ffreestanding -c one.c -o one.o
arm-none-eabi-gcc -Wall -Werror -O2 -nostdlib -nostartfiles -ffreestanding -c two.c -o two.o
arm-none-eabi-ld vectors.o one.o two.o -T memmap -o so.elf
arm-none-eabi-objdump -D so.elf > so.list
before we look at the linked output we can look at the individual parts
arm-none-eabi-objdump -D vectors.o
vectors.o: file format elf32-littlearm
Disassembly of section .text:
00000000 <_start>:
0: ea000006 b 20 <reset>
4: ea000008 b 2c <hang>
8: ea000007 b 2c <hang>
c: ea000006 b 2c <hang>
10: ea000005 b 2c <hang>
14: ea000004 b 2c <hang>
18: ea000003 b 2c <hang>
1c: ea000002 b 2c <hang>
00000020 <reset>:
20: e3a0d202 mov sp, #536870912 ; 0x20000000
24: e38dd902 orr sp, sp, #32768 ; 0x8000
28: ebfffffe bl 0 <one>
0000002c <hang>:
2c: eafffffe b 2c <hang>
That is what is in the object file. An object file is not just machine code or data, it also includes various other things: how much data there is, how much program there is, and, as in this case, it might contain label names to make debugging easier. The labels "hang" and "reset" and others are not in the machine code, these are for the human to make programming easier; the machine code has no notion of labels. How much of this extra stuff goes in the file depends on the format (there are many: elf, coff, etc), the tool, and its defaults and command line options.
Notice since we have not "linked" the program the branch to the function one() is actually incomplete as you will see in the final linked binary. The one label (function name) is not defined in this code so it cannot yet be resolved, the linker has to do it.
same story with the one function
arm-none-eabi-objdump -D one.o
one.o: file format elf32-littlearm
Disassembly of section .text:
00000000 <one>:
0: e3a03005 mov r3, #5
4: e3a02006 mov r2, #6
8: e92d4070 push {r4, r5, r6, lr}
c: e59f402c ldr r4, [pc, #44] ; 40 <one+0x40>
10: e59f502c ldr r5, [pc, #44] ; 44 <one+0x44>
14: e1a00003 mov r0, r3
18: e5853000 str r3, [r5]
1c: e5842000 str r2, [r4]
20: ebfffffe bl 0 <two>
24: e5943000 ldr r3, [r4]
28: e5952000 ldr r2, [r5]
2c: e0800003 add r0, r0, r3
30: e5840000 str r0, [r4]
34: e0800002 add r0, r0, r2
38: e8bd4070 pop {r4, r5, r6, lr}
3c: e12fff1e bx lr
...
that is the machine code and a disassembly that makes up the one function, the function two is not resolved in this code so it also has a placeholder as well as the global variables hello and world.
these two are getting the address of hello and world from locations
that have to be filled in by the linker
c: e59f402c ldr r4, [pc, #44] ; 40 <one+0x40>
10: e59f502c ldr r5, [pc, #44] ; 44 <one+0x44>
and these two perform the initial write of values to hello and world as the code shows
18: e5853000 str r3, [r5]
1c: e5842000 str r2, [r4]
hello=5;
world=6;
Notice all the addresses are zero based, they have not been linked.
two is similar if you look at it yourself.
The linker script tells the linker that we want .text the program, the machine code to live at 0x00000000 and .bss to be at 0x20000000. bss is global things that are not initialized like
unsigned int this;
.data which I dont deal with here are things like
unsigned int this=5;
global things that are initialized, .bss is assumed by programmers to be zero, but I cheated here and did not zero out the .bss memory space which you will see, instead I initialized the variables in the program rather than pre-initialized them and had to do different work.
reset:
mov sp,#0x20000000
orr sp,sp,#0x8000
bl one
hang: b hang
normally a bootstrap like above would need to deal with the stack as needed (certainly in the case of baremetal microcontroller code like this) as well as zero .bss and copy .data to ram. It takes more linker and compiler magic to put the initialized variables
unsigned int like_this=7;
in flash, as we need to remember that that variable boots with the value 7 and ram is volatile, doesnt survive a power outage. so to support .data you have to tell the linker it wants to live in 0x2000xxxx but put it in flash somewhere and I will copy it over. I didnt demonstrate that here.
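The two jobs skipped in this example, zeroing .bss and copying .data from flash to ram, usually come down to two short loops. The sketch below shows them as plain C functions that take their boundaries as pointers so they can be tried on a host; the function names are mine, and this is one common way to do it rather than what any particular toolchain's startup code does.

```c
#include <stdint.h>

/* Zero every word in [start, end), which is what C programmers
 * expect of uninitialized globals in .bss. */
void zero_bss(uint32_t *start, const uint32_t *end)
{
    while (start < end)
        *start++ = 0;
}

/* Copy the initial values of .data from where they are stored (flash)
 * to where the program expects to find them at run time (ram).
 * dst/dst_end bound the ram copy, src is the flash copy. */
void copy_data(uint32_t *dst, const uint32_t *dst_end, const uint32_t *src)
{
    while (dst < dst_end)
        *dst++ = *src++;
}
```

On the target the reset code would call these before one(), passing addresses exported by the linker script. The script shown here does not define such symbols, so their names and the extra .data section (placed in ram for running but stored in rom, for example with GNU ld's `> ram AT > rom`) would have to be added.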
from the so.list output of commands above, fully linked program.
Disassembly of section .text:
00000000 <_start>:
0: ea000006 b 20 <reset>
4: ea000008 b 2c <hang>
8: ea000007 b 2c <hang>
c: ea000006 b 2c <hang>
10: ea000005 b 2c <hang>
14: ea000004 b 2c <hang>
18: ea000003 b 2c <hang>
1c: ea000002 b 2c <hang>
00000020 <reset>:
20: e3a0d202 mov sp, #536870912 ; 0x20000000
24: e38dd902 orr sp, sp, #32768 ; 0x8000
28: eb000000 bl 30 <one>
0000002c <hang>:
2c: eafffffe b 2c <hang>
00000030 <one>:
30: e3a03005 mov r3, #5
34: e3a02006 mov r2, #6
38: e92d4070 push {r4, r5, r6, lr}
3c: e59f402c ldr r4, [pc, #44] ; 70 <one+0x40>
40: e59f502c ldr r5, [pc, #44] ; 74 <one+0x44>
44: e1a00003 mov r0, r3
48: e5853000 str r3, [r5]
4c: e5842000 str r2, [r4]
50: eb000008 bl 78 <two>
54: e5943000 ldr r3, [r4]
58: e5952000 ldr r2, [r5]
5c: e0800003 add r0, r0, r3
60: e5840000 str r0, [r4]
64: e0800002 add r0, r0, r2
68: e8bd4070 pop {r4, r5, r6, lr}
6c: e12fff1e bx lr
70: 20000004 andcs r0, r0, r4
74: 20000000 andcs r0, r0, r0
00000078 <two>:
78: e59fc02c ldr r12, [pc, #44] ; ac <two+0x34>
7c: e59f102c ldr r1, [pc, #44] ; b0 <two+0x38>
80: e59c2000 ldr r2, [r12]
84: e5913000 ldr r3, [r1]
88: e2822001 add r2, r2, #1
8c: e2833002 add r3, r3, #2
90: e52de004 push {lr} ; (str lr, [sp, #-4]!)
94: e082e003 add lr, r2, r3
98: e08e0000 add r0, lr, r0
9c: e58c2000 str r2, [r12]
a0: e5813000 str r3, [r1]
a4: e49de004 pop {lr} ; (ldr lr, [sp], #4)
a8: e12fff1e bx lr
ac: 20000000 andcs r0, r0, r0
b0: 20000004 andcs r0, r0, r4
Disassembly of section .bss:
20000000 <hello>:
20000000: 00000000 andeq r0, r0, r0
20000004 <world>:
20000004: 00000000 andeq r0, r0, r0
At address 0x00000000, the address of the first instruction executed after reset for this architecture, is a branch to address 0x20, and then we do more stuff and call the one() function. main() is to some extent arbitrary; in this case I can make whatever function names I want, I dont need main() specifically so didnt feel like using it. After reset the bootstrap calls one(), one() calls two(), and then both return back.
We can see that not only did the linker put all of my program in the 0x00000000 address space, it patched up the addresses to branch to the nested functions.
28: eb000000 bl 30 <one>
50: eb000008 bl 78 <two>
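You can check the patched branches by hand. An ARM B or BL instruction holds a signed 24-bit word offset in its low bits, and the offset is relative to the pc, which reads as the instruction's own address plus 8. The following sketch (the function name is mine) does that arithmetic for the encodings shown in the listing.

```c
#include <stdint.h>

/* Compute the destination of an ARM (32-bit, non-thumb) B or BL
 * instruction `insn` located at `address`. */
uint32_t arm_branch_target(uint32_t address, uint32_t insn)
{
    uint32_t imm24 = insn & 0x00FFFFFFu;

    /* Sign-extend the 24-bit field to 32 bits. */
    if (imm24 & 0x00800000u)
        imm24 |= 0xFF000000u;

    /* The field counts words, so multiply by 4; pc reads as address + 8.
     * Unsigned wraparound makes backward branches work out. */
    return address + 8u + (imm24 << 2);
}
```

With this, 0xeb000000 at address 0x28 lands on 0x30 (one) and 0xeb000008 at 0x50 lands on 0x78 (two), matching the linked listing, while 0xeafffffe at 0x2c lands back on 0x2c, the hang loop.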
It also defined the addresses for hello and world in ram
20000000 <hello>:
20000000: 00000000 andeq r0, r0, r0
20000004 <world>:
20000004: 00000000 andeq r0, r0, r0
in the address space we asked for and patched up the functions so they could access these global variables
78: e59fc02c ldr r12, [pc, #44] ; ac <two+0x34>
7c: e59f102c ldr r1, [pc, #44] ; b0 <two+0x38>
80: e59c2000 ldr r2, [r12]
84: e5913000 ldr r3, [r1]
ac: 20000000 andcs r0, r0, r0
b0: 20000004 andcs r0, r0, r4
I used the disassembler, the word at 0xAC for example is not an andcs instruction it is the address 0x20000000 where we have the variable hello stored. This disassembler tries to disassemble everything, instructions or data so we know that is not instructions so just ignore the disassembly.
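The `[pc, #44]` loads find those data words the same way the branches find their targets: relative to the pc, which reads as the instruction address plus 8. A minimal sketch of that calculation, with a function name of my own choosing, assuming the immediate-offset form of ldr with pc as the base register as used in this listing:

```c
#include <stdint.h>

/* Address read by an ARM `ldr rX, [pc, #imm]` instruction `insn`
 * located at `address` (immediate-offset form, pc as base register). */
uint32_t arm_literal_address(uint32_t address, uint32_t insn)
{
    uint32_t imm12 = insn & 0xFFFu;   /* byte offset, bits 11..0 */
    uint32_t base  = address + 8u;    /* pc reads two instructions ahead */

    /* Bit 23 (U) selects whether the offset is added or subtracted. */
    return (insn & (1u << 23)) ? base + imm12 : base - imm12;
}
```

For 0xe59fc02c at 0x78 this gives 0xac, and for 0xe59f102c at 0x7c it gives 0xb0, the two words that hold the addresses of hello and world.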
Now this elf file format is not the exact bytes you put in the flash when programming, some tools you use to program a flash might accept this file format and then extract from it the actual bytes that go in the flash, ignoring the rest of the file (or using it to find those bytes).
arm-none-eabi-objcopy so.elf -O binary so.bin
would create a file that represents just the data that would go in flash.
arm-none-eabi-objcopy so.elf -O binary so.bin
calvin so # hexdump so.bin
0000000 0006 ea00 0008 ea00 0007 ea00 0006 ea00
0000010 0005 ea00 0004 ea00 0003 ea00 0002 ea00
0000020 d202 e3a0 d902 e38d 0000 eb00 fffe eaff
0000030 3005 e3a0 2006 e3a0 4070 e92d 402c e59f
0000040 502c e59f 0003 e1a0 3000 e585 2000 e584
0000050 0008 eb00 3000 e594 2000 e595 0003 e080
0000060 0000 e584 0002 e080 4070 e8bd ff1e e12f
0000070 0004 2000 0000 2000 c02c e59f 102c e59f
0000080 2000 e59c 3000 e591 2001 e282 3002 e283
0000090 e004 e52d e003 e082 0000 e08e 2000 e58c
00000a0 3000 e581 e004 e49d ff1e e12f 0000 2000
00000b0 0004 2000
00000b4
this is dumping little endian halfwords (16 bit) but you can still see
that the machine code from above is in there and that is all that is
in there.
0000000 0006 ea00 0008 ea00 0007 ea00 0006 ea00
00000000 <_start>:
0: ea000006 b 20 <reset>
4: ea000008 b 2c <hang>
8: ea000007 b 2c <hang>
...
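To go from the hexdump columns back to the instruction words in the listing, take the columns in pairs and put the second one in the upper half. A tiny sketch of that, assuming the dump was produced on a little-endian host as it was here (the function name is mine):

```c
#include <stdint.h>

/* Rebuild a 32-bit little-endian word from two consecutive 16-bit
 * columns of the hexdump output: `first` is the left column (low
 * halfword), `second` is the one to its right (high halfword). */
uint32_t word_from_halfwords(uint16_t first, uint16_t second)
{
    return ((uint32_t)second << 16) | first;
}
```

The first two columns, 0006 and ea00, combine to 0xea000006, the branch to reset at address 0.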
If/when you dump the flash back out you only have the machine code and maybe some .data depending on how you build your project. The microcontroller can as mentioned above execute this code directly from flash and that is the primary use case, and generally it is fast enough for the type of work microcontrollers are used for. Sometimes you can speed up the microcontroller, but the flash generally has a speed limit that might be slower and they might have to add wait states so that it doesnt push the flash too fast and cause corruption. And yes with some work you can copy some or all of your program to ram and run it there if you have enough resources (ram) and are that pushed for performance (and have exhausted other avenues like examining what the compiler is producing and if you can affect that with command line options or by adjusting or cleaning up your code).
Code executes on the microcontroller similar to any other microprocessor, though code is often organized separate from data (google "Harvard Architecture"). The program counter starts at the reset vector (see next answer) and advances every instruction, changing when branching instructions occur.
Typically your compiler will insert into your code a number of "vectors". These vectors usually include a "reset vector" that points at the place where your microcontroller expects the first instruction. It might be at memory location zero, or it might be elsewhere. From there, it operates on the code similar to any other computer. Every microprocessor and microcontroller expects code to start at a certain memory location upon reset, though it varies among different parts. For more information on vectors, here's a handy reference (http://www.avrbeginners.net/architecture/int/int.html). Note the second sentence which talks about the reset vector and its address at 0x0000.
Microcontrollers are often coded in assembly language or C, so that programmers can control to the byte what code is running. Those exact processes are what will run.
This might vary from chip to chip, but with the chips I'm expert in, code is not copied to RAM to execute. Again, it's the Harvard architecture at work. Small microcontrollers might have as little as zero RAM and as much as a few Kbytes, but typically the instructions are read directly from flash. Proper programming in these environments means the heap is tiny, the stack is carefully controlled, and RAM is used very sparingly.
I recommend you pick a processor line -- I'm expert at the Atmel ATtiny and ATmega controllers -- and read their datasheets to understand in detail how they work. Atmel documentation is thorough and they also publish many application notes for specific applications, often with useful code examples. There are also internet forums dedicated to discussion and learning on the Atmel AVR line.
How code executes in the controller?
If you mean, "how does the code start executing", the answer is that once the MCU has determined that the supply voltage and clocks are ok, it will automatically start executing at the boot address. But, now we're getting into the gory details. I am mostly into MMU-less controllers such as ARM Cortex-M, 8051, PIC, AVR etc., so my answer might not apply fully to your questions.
The boot address is typically the first address in the flash for most small MCUs, but in some MCUs, the flash is expected to contain a vector at a specific location, which in turn points to the first start address. Other MCUs, such as ARM, allow the electronic designer to select if the MCU shall start executing from internal flash, external flash, system boot ROM (if such exists), enter some kind of bootloader mode etc., by setting certain pins high or low.
If we dump the code to the controller it will save it in the Flash memory. after reset how the code will fetch from the memory?
See the above answer.
what all the process will be execute in the controller?
I don't understand the question. Can you please rephrase it?
I came to know that at the run time code will be copied to RAM memory(?) and executes from the RAM. is this statement is correct?
This depends on the design of the firmware. If you really need to, you would copy the code from Flash to RAM and execute from RAM, but if the internal flash is large enough and you don't need to squeeze every clock of the MCU, you would simply execute from flash. It's so much easier. And safer, too, since it's harder for a bug to accidentally overwrite the code-space.
But, in case you need a lot of code, your MCU might not have enough flash to fit everything. In that case, you would need to store the code in an external flash. Depending on how price-sensitive you are, you will possibly choose an SPI-flash. Since it is impossible to execute from those flash chips, you must copy the code to RAM and execute from RAM.
if so when flash code move to RAM?
This would normally be implemented in a boot-loader, or very early in the main() function. If your RAM is smaller than the flash, you will need to implement some kind of page-swap algorithm, dynamically copying code from flash as you need it. This is basically similar to how any Linux-based MCU works, but you might need to carefully design the memory layout.
If code will copy from flash to RAM, then it will use the RAM space. then that much of RAM bytes is occupied, so Stack and heap need to be used after this memory?
Yes. You will certainly need to adjust the memory map, using compile-time switches to the linker and compiler.
I would like to come up with the byte code in assembler (assembly?) for Windows machines to add two 32-bit longs and throw away the carry bit. I realize the "Windows machines" part is a little vague, but I'm assuming that the bytes for ADD are pretty much the same in all modern Intel instruction sets.
I'm just trying to abuse VB a little and make some things faster. So as an example of running direct assembly in VB, the hex string "8A4C240833C0F6C1E075068B442404D3E0C20800" is the assembly code for SHL that can be "injected" into a VB6 program for a fast SHL operation expecting two Long parameters (we're ignoring here that 32-bit longs in VB6 are signed, just pretend they are unsigned).
Along those same lines, what is the hex string of bytes representing assembler instructions that will do the same thing to return the sum of two 32-bit unsigned integers?
The hex code above for SHL is, according to the author:
mov eax, [esp+4]
mov cl, [esp+8]
shl eax, cl
ret 8
I spit those bytes into a file and tried unassembling them in a windows command prompt using the old debug utility, but I figured out it's not working with the newer instruction set because it didn't like EAX when I tried assembling something but it was happy with AX.
I know from comments in the source code that SHL EAX, CL is D3E0, but I don't have any reference to know what the bytes are for instruction ADD EAX, CL or I'd try it. (Though I know now that the operands have to be the same size.)
I tried flat assembler and am not getting anything I can figure out how to use. I used it to assemble the original SHL code and got a very different result, not the same bytes. Help?
I disassembled the bytes you provided and got the following code:
(__TEXT,__text) section
f:
00000000 movb 0x08(%esp),%cl
00000004 xorl %eax,%eax
00000006 testb $0xe0,%cl
00000009 jne 0x00000011
0000000b movl 0x04(%esp),%eax
0000000f shll %cl,%eax
00000011 retl $0x0008
Which is definitely more complicated than the source code the author provided. It checks that the second operand isn't too large, for example, which isn't in the code you showed at all (see Edit 2, below, for a more complete analysis). Here's a simple stdcall function that adds two arguments together and returns the result:
mov 4(%esp), %eax
add 8(%esp), %eax
ret $8
Assembling that gives me this output:
(__TEXT,__text) section
00000000 8b 44 24 04 03 44 24 08 c2 08 00
I hope those bytes do what you want them to!
Edit: Perhaps more usefully, I just did the same in C:
__attribute__((__stdcall__))
int f(int a, int b)
{
return a + b;
}
Compiled with -Oz and -fomit-frame-pointer it generates exactly the same code (well, functionally equivalent, anyway):
$ gcc -arch i386 -fomit-frame-pointer -Oz -c -o example.o example.c
$ otool -tv example.o
example.o:
(__TEXT,__text) section
_f:
00000000 movl 0x08(%esp),%eax
00000004 addl 0x04(%esp),%eax
00000008 retl $0x0008
The machine code output:
$ otool -t example.o
example.o:
(__TEXT,__text) section
00000000 8b 44 24 08 03 44 24 04 c2 08 00
Sure beats hand-writing assembly code!
Edit 2:
@ErikE asked in the comments below what would happen if a shift of 32 bits or greater was attempted. The disassembled code at the top of this answer (for the bytes provided in the original question) can be represented by the following higher-level code:
unsigned int shift_left(unsigned int a, unsigned char b)
{
    if (b >= 32)
        return 0;
    else
        return a << b;
}
From this logic it's pretty easy to see that if you pass a value of 32 or greater as the second parameter to the shift function, you'll just get 0 back.
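The `testb $0xe0,%cl` in the disassembly is a compact way of asking whether the count is 32 or more: 0xe0 covers bits 5, 6 and 7 of the byte, and any count with one of those bits set is at least 32. The guard matters because a 32-bit x86 shift only uses the low 5 bits of the count, so without it a shift by 32 would leave the value unchanged instead of producing 0. Here is a small sketch that mirrors the machine code's test and checks the equivalence for every possible byte value; both function names are mine.

```c
#include <stdint.h>

/* Mirrors the injected machine code: return 0 when any of bits 5..7
 * of the count are set, otherwise shift. The guard also keeps the C
 * shift well defined, since the count is then always below 32. */
uint32_t shift_left_guarded(uint32_t a, uint8_t b)
{
    if (b & 0xE0u)
        return 0;
    return a << b;
}

/* Returns 1 if "(b & 0xE0) != 0" and "b >= 32" agree for all 256
 * possible byte values of the count, 0 otherwise. */
int mask_matches_compare(void)
{
    for (unsigned b = 0; b < 256; b++) {
        if (((b & 0xE0u) != 0) != (b >= 32))
            return 0;
    }
    return 1;
}
```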