Creating a coredump for an ARM-based embedded system

I am basically following up on the core dump note section. I didn't post that question, but I am trying to do the same thing: write a program that creates a core dump file from scratch, except that I am doing it for a custom, single-threaded firmware running on an embedded ARM processor.
I am also referring to the Google coredumper source to understand how core files are usually created. So far I have successfully created a core file with a PT_NOTE and a PT_LOAD program header which GDB can read.
Note that I am creating this core file for custom firmware; this is not a Linux environment. My question is about the PT_LOAD program headers. From what I understood, I just need to create as many PT_LOAD program headers as there are active threads (for which the core needs to be created), with each header describing that thread's memory mappings. Since my firmware is single-threaded, I created only one PT_LOAD program header, with the memory mapping covering the address values on the stack.
When I load the firmware's ELF image together with this newly created core file, GDB prints the registers accurately with "info reg". GDB also identifies the PC (program counter) value and displays the corresponding symbol accurately. It cannot, however, display the remaining stack frames ("bt" doesn't work). It complains that it "Cannot access memory at address (SP+4)".
I've already provided the firmware's stack mappings in the core file, so GDB should have been able to read the address (SP+4). Note that I can examine the value at (SP+4) with "x 0x(SP+4)".
Can anyone tell me what I am missing here?
Thanks

I figured this out. Apparently, the contents of the PT_LOAD program header - the stack mappings - were not complete. The problem was that it needed the entire mapping of the one running thread. After I included the contents of the entire CPU SRAM, GDB's "bt" and all other commands worked just fine.
Also, from what I understood, the executable holds the addresses of all variables and the core file holds the run-time values for those variables. So if any of the symbols are memory (RAM) resident, a separate PT_LOAD program header with the RAM mapping should be added. After that, GDB should be able to print the runtime values of those variables accurately. Without the mapping, GDB shows the value of the variable as 0.
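For anyone attempting the same thing, here is a minimal sketch (not the asker's actual code) of how such a core file could be emitted in C. It assumes a little-endian 32-bit ARM target; the SRAM base address, the SRAM size, and the 148-byte NT_PRSTATUS payload size are placeholders that must match what GDB's ARM port expects for your target.

```c
/*
 * Minimal sketch of emitting an ELF32 core file for a single-threaded
 * little-endian ARM target. SRAM_BASE, SRAM_SIZE and the 148-byte
 * NT_PRSTATUS payload are assumptions; the prstatus layout (registers,
 * signal info) is target-specific, so the zero-filled blob is a placeholder.
 */
#include <elf.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define SRAM_BASE 0x20000000u   /* assumed SRAM start address */
#define SRAM_SIZE 0x00010000u   /* assumed SRAM size (64 KiB)  */

int main(void)
{
    FILE *f = fopen("fw.core", "wb");
    if (!f) return 1;

    /* PT_NOTE payload: note header + "CORE" name (padded to 4) + prstatus blob. */
    unsigned char note[256];
    size_t notesz = 0;
    Elf32_Nhdr nh = { .n_namesz = 5, .n_descsz = 148, .n_type = NT_PRSTATUS };
    memcpy(note + notesz, &nh, sizeof nh);      notesz += sizeof nh;
    memcpy(note + notesz, "CORE\0\0\0", 8);     notesz += 8;
    memset(note + notesz, 0, 148);              notesz += 148;  /* fill with real prstatus */

    Elf32_Ehdr eh;
    memset(&eh, 0, sizeof eh);
    memcpy(eh.e_ident, ELFMAG, SELFMAG);
    eh.e_ident[EI_CLASS]   = ELFCLASS32;
    eh.e_ident[EI_DATA]    = ELFDATA2LSB;
    eh.e_ident[EI_VERSION] = EV_CURRENT;
    eh.e_type      = ET_CORE;
    eh.e_machine   = EM_ARM;
    eh.e_version   = EV_CURRENT;
    eh.e_phoff     = sizeof eh;
    eh.e_ehsize    = sizeof eh;
    eh.e_phentsize = sizeof(Elf32_Phdr);
    eh.e_phnum     = 2;                         /* one PT_NOTE + one PT_LOAD */

    uint32_t note_off = sizeof eh + 2 * sizeof(Elf32_Phdr);
    uint32_t load_off = note_off + (uint32_t)notesz;

    Elf32_Phdr ph[2];
    memset(ph, 0, sizeof ph);
    ph[0].p_type   = PT_NOTE;
    ph[0].p_offset = note_off;
    ph[0].p_filesz = (Elf32_Word)notesz;

    ph[1].p_type   = PT_LOAD;                   /* the entire CPU SRAM, as described above */
    ph[1].p_offset = load_off;
    ph[1].p_vaddr  = SRAM_BASE;
    ph[1].p_filesz = SRAM_SIZE;
    ph[1].p_memsz  = SRAM_SIZE;
    ph[1].p_flags  = PF_R | PF_W;

    fwrite(&eh, sizeof eh, 1, f);
    fwrite(ph, sizeof ph[0], 2, f);
    fwrite(note, 1, notesz, f);
    /* ...append SRAM_SIZE bytes of raw SRAM contents dumped from the target... */
    fclose(f);
    return 0;
}
```

GDB reads the registers from the note and the stack/variable contents from the PT_LOAD data, which is why including the full SRAM image is what makes "bt" work.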

Related

Windows: how to show the memory segments of a process?

We have tools like objdump, readelf, and dumpbin to show the contents of an executable file.
But when an executable file is loaded into memory (a process is created), the segments in memory are usually different from the segments in the executable file. For example, when it is loaded, two extra segments, namely the stack and the heap, are allocated (we overlook the details of page mapping here).
Is there a tool that helps show the runtime memory segments/status of a process?
Windows executables use the Portable Executable format. This format describes sections of memory that are allocated when the process is loaded, and optionally raw data (.text, .data sections) to be loaded into those sections.
Each section will typically have a file offset specifying where in the raw file the data is located, and a Virtual Address at which the data will be loaded. These may or may not resemble each other.
PE Explorer can give you details on the sections (and everything else about a PE file) of an executable.
Immunity Debugger will allow you to attach to a running process and see its memory map.
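If you'd rather script it than use a GUI tool, a rough sketch of the same idea using the Win32 VirtualQueryEx API is below; the PID is a placeholder and error handling is minimal.

```c
/*
 * Rough sketch: walk the virtual address space of a running process with
 * VirtualQueryEx and print every region, similar to a debugger's memory
 * map view. The PID is a placeholder.
 */
#include <windows.h>
#include <stdio.h>

int main(void)
{
    DWORD pid = 1234;   /* assumed target process id */
    HANDLE h = OpenProcess(PROCESS_QUERY_INFORMATION, FALSE, pid);
    if (!h) return 1;

    MEMORY_BASIC_INFORMATION mbi;
    unsigned char *addr = NULL;
    while (VirtualQueryEx(h, addr, &mbi, sizeof mbi) == sizeof mbi) {
        printf("%p  size=%10llu  state=%-7s  protect=0x%lx\n",
               mbi.BaseAddress,
               (unsigned long long)mbi.RegionSize,
               mbi.State == MEM_COMMIT  ? "commit"  :
               mbi.State == MEM_RESERVE ? "reserve" : "free",
               (unsigned long)mbi.Protect);
        addr = (unsigned char *)mbi.BaseAddress + mbi.RegionSize;  /* next region */
    }
    CloseHandle(h);
    return 0;
}
```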

ELF files - What is a section and why do we need it?

I've been reading the ELF standard here. From what I understand, each ELF file contains an ELF header, program headers (why more than one?) and section headers. Can anyone please explain:
How are ELF files generated? Is it the compiler's responsibility?
What are sections and why do we need them?
What are program headers and why do we need them?
Inside program headers, what's the meaning of the fields p_vaddr and p_paddr?
Does each section have its own section header?
Alternatively, does anyone have a link to a more friendly documentation of ELF?
How are ELF files generated? Is it the compiler's responsibility?
They can be generated by a compiler, an assembler, or any other tool that can generate them - even a program you wrote yourself for generating ELF files ;) They're just streams of bytes, after all, so they can be generated by simply writing bytes into a file in binary mode. You can do that too.
What are sections and why do we need them?
ELF files are subdivided into sections. Sections are the smallest continuous regions in the file. You can think of them as pages in an organizer, each with its own name and a type that describes what it contains. Linkers use this information to combine different parts of the program coming from different modules into one executable file or library, by merging sections of the same type (gluing pages together, if you will).
In executable files, sections are optional, but they're usually there to describe what's in the file, where it begins, and how many bytes it takes.
What are program headers and why do we need them?
They're mostly for making executable files. In order to run a program, sections aren't enough, because you have to specify not only what's in the file, but also where it should be loaded into memory in the running process. Program headers are just for that purpose: they describe segments, which are regions of memory in the running process, with different access privileges & stuff.
Each program header describes one segment. It tells the loader where it should load a certain region of the file into memory and what permissions it should set for that region (e.g. should it be allowed to execute code from it? should it be writable, or read-only?).
Segments can be further subdivided into sections. For example, you might specify that your code segment is further subdivided into code and static read-only strings for the messages the program displays. Or that your data segment is subdivided into funky data and hardcore data :J It's for you to decide.
In executable files, sections are optional, but it's nice to have them, because they describe what's in the file and allow for dumping selected parts of it (e.g. with the objdump tool). Sometimes they are needed, though, for storing dynamic linking information, symbol tables, debugging information, stuff like that.
Inside program headers, what's the meaning of the fields p_vaddr and p_paddr?
Those are the addresses at which the data in the file will be loaded. They map the contents of the file to their corresponding memory locations. The first one is a virtual address, the second one is a physical address.
Physical addresses are the "raw" memory addresses. On modern operating systems, they are no longer used in userland. Instead, userland programs use virtual addresses. The operating system deceives the userland program into thinking that it is alone in memory and that the entire address space is available to it. Under the hood, the operating system maps those virtual addresses to physical ones in the actual memory, transparently to the program.
Of course, not every address in the virtual address space is available at the same time. There are limitations imposed by the actual physical memory available. So the operating system just maps memory for the segments the program actually uses (this is where the "segments" from the ELF file's program headers come into play). If the process tries to access some unmapped memory, the operating system steps in and says, "sorry, chap, this memory doesn't belong to you". (The program can address it, but it cannot access it.)
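As a quick illustration of what lives in those program headers, here is a small sketch that dumps each segment's type, p_vaddr and p_paddr, roughly what `readelf -l` prints. A 64-bit little-endian ELF read on a matching host is assumed and most error handling is skipped.

```c
/* Sketch: print each program header's type, p_vaddr, p_paddr and flags. */
#include <elf.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    if (argc < 2) return 1;
    FILE *f = fopen(argv[1], "rb");
    if (!f) return 1;

    Elf64_Ehdr eh;
    if (fread(&eh, sizeof eh, 1, f) != 1) return 1;

    for (int i = 0; i < eh.e_phnum; i++) {
        Elf64_Phdr ph;
        fseek(f, (long)(eh.e_phoff + (Elf64_Off)i * eh.e_phentsize), SEEK_SET);
        if (fread(&ph, sizeof ph, 1, f) != 1) break;
        printf("segment %2d: type=0x%-8x vaddr=0x%012lx paddr=0x%012lx %c%c%c\n",
               i, (unsigned)ph.p_type,
               (unsigned long)ph.p_vaddr, (unsigned long)ph.p_paddr,
               (ph.p_flags & PF_R) ? 'R' : '-',
               (ph.p_flags & PF_W) ? 'W' : '-',
               (ph.p_flags & PF_X) ? 'X' : '-');
    }
    fclose(f);
    return 0;
}
```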
Does each section have its own section header?
Yes. If it doesn't have an entry in the Section Headers Table, it's not a section :q The only way to tell whether some part of the file is a section is by looking into the Section Headers Table, which tells you what sections are defined in the file and where you can find them.
You can think of the Section Headers Table as the table of contents of a book. Without the table of contents, there aren't any chapters after all, because they're not listed anywhere. The book may have headings, but the content is not subdivided into logical chapters that can be found through the table of contents. The same goes for sections in ELF files: there can be regions of data, but you can't tell without the "table of contents", which is the SHT.
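To make the "table of contents" picture concrete, a short sketch that walks the Section Headers Table and prints each section's name (resolved through the string table that e_shstrndx points at) might look like this; again a 64-bit ELF is assumed and error handling is kept minimal.

```c
/* Sketch: list every section named in the Section Headers Table. */
#include <elf.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    if (argc < 2) return 1;
    FILE *f = fopen(argv[1], "rb");
    if (!f) return 1;

    Elf64_Ehdr eh;
    if (fread(&eh, sizeof eh, 1, f) != 1) return 1;

    Elf64_Shdr *sh = malloc(eh.e_shnum * sizeof *sh);
    fseek(f, (long)eh.e_shoff, SEEK_SET);
    if (fread(sh, sizeof *sh, eh.e_shnum, f) != eh.e_shnum) return 1;

    /* Load the section-name string table (usually .shstrtab). */
    Elf64_Shdr *str = &sh[eh.e_shstrndx];
    char *names = malloc(str->sh_size);
    fseek(f, (long)str->sh_offset, SEEK_SET);
    if (fread(names, 1, str->sh_size, f) != str->sh_size) return 1;

    for (int i = 0; i < eh.e_shnum; i++)
        printf("[%2d] %-20s size=%lu\n",
               i, names + sh[i].sh_name, (unsigned long)sh[i].sh_size);

    free(names);
    free(sh);
    fclose(f);
    return 0;
}
```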
This link includes a better explanation.
How are ELF files generated? Is it the compiler's responsibility?
It is architecture dependent.
What are sections and why do we need them?
Different sections hold different information, such as code, initialized data, uninitialized data, etc. This information is used by the compiler and linker.
What are program headers and why do we need them?
Program headers are used by the operating system when it loads the executable. These headers contain information about the segments (contiguous memory blocks with certain permissions), such as which parts need to be loaded, interpreter info, etc.
Inside program headers, what's the meaning of the fields p_vaddr and p_paddr?
In general the virtual address and the physical address are the same, but they could differ depending on the system.
Does each section have it's own section header?
Yes. Each section has a section header entry in the section header table.
This is the best documentation I've found: http://www.skyfree.org/linux/references/ELF_Format.pdf
Each section has only one section header, but there can be section headers without sections.
2 - There are many different sections. For example, the relocation section records information about relocation symbols; I use that information to load an ELF object and relocate/run it.
Another example: the debug section records debug information, which GDB uses for showing debug messages.
The symbol section records symbol information.
3 - The program headers are used by the loader; it loads an ELF executable file by looking up the program headers.

What is the difference between byte code files like Java bytecode and machine code executables like ELF?

What are the differences between byte code binaries, such as Java class files, Parrot bytecode files or CLR files, and machine code executables such as ELF, Mach-O and PE?
What are the distinctive differences between the two?
For example, what part of the class file corresponds to the .text area in the ELF structure?
Or: they all have headers, but the ELF and PE headers contain the target architecture while the class file does not.
Java Class File
Elf file
PE File
Byte code is, as imulsion noted, an intermediate step, right before compilation into machine code. Because the last step is left to load time (and often runtime, as is the case with Just-In-Time (JIT) compilation), byte code is architecture independent: the runtime (CLR for .NET or JVM for Java) is responsible for mapping the byte code opcodes to their underlying machine code representation.
By comparison, native code (Windows: PE, PE32+, OS X/iOS: Mach-O, Linux/Android/etc.: ELF) is compiled code, suited for a particular architecture (Android/iOS: ARM; most else: Intel 32-bit (i386) or 64-bit). These are all very similar, but still require sections (or, in Mach-O parlance, "Load Commands") to set up the memory structure of the executable as it becomes a process (old DOS supported the ".com" format, which was a raw memory image). In all of the above, you can say, roughly, the following:
Sections with a "." are created by the compiler, and are "default" or expected to have default behavior
The executable has the main code section, usually called "text" or ".text". This is native code, which can run on the specific architecture
Strings are stored in a separate section. These are used for hard-coded output (what you print out) as well as symbol names.
Symbols - which are what the linker uses to put together the executable with its libraries (Windows: DLLs, Linux/Android: shared objects, OS X/iOS: .dylibs or frameworks) - are stored in a separate section. Usually there is also a "PLT" (Procedure Linkage Table), which enables the compiler to simply put in stubs for the functions you call (printf, open, etc.) that the linker can connect when the executable loads.
The import table (in Windows parlance; in ELF this is the DYNAMIC section, in OS X this is an LC_LOAD_DYLIB command) is used to declare additional libraries. If those aren't found when the executable is loaded, the load fails, and you can't run it.
The export table (for libraries/dylibs/etc.) holds the symbols which the library (or, in Windows, even an .exe) can export so that others can link with them.
Constants are usually in what you see as the ".rodata".
Hope this helps. Really, your question was vague...
TG
Byte code is a 'halfway' step. So the Java compiler (javac) will turn the source code into byte code. Machine code is the next step, where the computer takes the byte code, turns it into machine code (which can be read by the computer) and then executes your program by reading the machine code. Computers cannot read source code directly, likewise compilers cannot translate immediately into machine code. You need a halfway step to make programs work.
Note that ELF binaries don't necessarily need to be machine/arch specific per se.
The interesting piece is the "interpreter" header field: it holds the path name of a loader program that's executed instead of the actual binary. That loader is then responsible for loading the actual program, loading and linking libraries, etc. This is how, e.g., ld.so comes in.
Theoretically one could create an ELF binary that holds java bytecode (or a complete jar). This just needs some appropriate "interpreter" program which starts up a JVM and loads the code from the binary into it.
Not sure whether this actually has been done before, but certainly possible.
The same can be done with pretty much any non-native code.
It could also serve for direct multiarch support via some VM like qemu:
Let the target platform (libc + linker scripts) put the arch name into the interpreter program name (e.g. /lib/ld.so.x86_64, /lib/ld.so.armhf, ...).
Then, on a particular arch (e.g. x86_64), the one with the native arch name will point to the original ld.so, while the others point to some special one that calls up something like qemu-system-XXX.
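To see which loader a given binary asks for, you can read the PT_INTERP segment directly; here is a hedged sketch (64-bit ELF assumed, minimal error handling).

```c
/* Sketch: print the "interpreter" path (PT_INTERP) recorded in an ELF file. */
#include <elf.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    if (argc < 2) return 1;
    FILE *f = fopen(argv[1], "rb");
    if (!f) return 1;

    Elf64_Ehdr eh;
    if (fread(&eh, sizeof eh, 1, f) != 1) return 1;

    for (int i = 0; i < eh.e_phnum; i++) {
        Elf64_Phdr ph;
        fseek(f, (long)(eh.e_phoff + (Elf64_Off)i * eh.e_phentsize), SEEK_SET);
        if (fread(&ph, sizeof ph, 1, f) != 1) break;
        if (ph.p_type == PT_INTERP) {
            char path[256] = {0};
            size_t n = ph.p_filesz < sizeof path - 1 ? ph.p_filesz : sizeof path - 1;
            fseek(f, (long)ph.p_offset, SEEK_SET);
            if (fread(path, 1, n, f) == n)
                printf("interpreter: %s\n", path);  /* the loader run instead of the binary */
        }
    }
    fclose(f);
    return 0;
}
```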

Creating a Custom Media Library - Loading Images for Rendering (VB.net)

OK, I'm working on a project right now and I need to create a graphics library.
The game I'm experimenting with is an RPG; this project is expected to contain many big graphic files and I would prefer not to load everything into memory at once, like I've done before with other smaller projects.
So, does anyone have experience with libraries such as this one? Here's what I've come up with:
Have graphic library files and paths in an XML file
Each entry in the XML file would be designated "PERMANENT" or "TEMPORARY", with PERMANENT meaning that once loaded it stays in memory and won't be cleared (like menu graphics)
The library that the XML file loads into would have a CLEAR command, that clears out all non-PERMANENT graphics
I have experience throwing everything into memory at startup and running the program under the assumption that all necessary graphics are currently in memory. Are there any other considerations I might need to think about?
Ideally everything would be temporary and you would have a sensible evict function that chooses the right objects to victimize (based on access patterns) when your program decides it needs more memory.
There'll be some minimum amount of RAM your game needs to run, otherwise stuff will be constantly swapping, but this approach does mean you're not dumping objects marked TEMPORARY that you will just need to reload next frame because you happen to be using them currently.
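The eviction idea itself is language-agnostic; a tiny sketch of an access-ordered ("least recently used") policy, written here in C purely for illustration (the real library would of course be VB.net, and all names and sizes are made up), could look like this:

```c
/*
 * Illustrative sketch only: PERMANENT entries are never evicted; the
 * least-recently-used TEMPORARY entry goes first when memory is needed.
 */
#define MAX_ASSETS 64

typedef struct {
    char name[32];
    int  permanent;   /* menu graphics etc.: never evicted           */
    long last_used;   /* logical timestamp of the most recent access */
    int  loaded;      /* is the graphic data currently in memory?    */
} Asset;

static Asset cache[MAX_ASSETS];
static long  tick;

/* Call whenever an asset is drawn, so the eviction order tracks real usage. */
void asset_touch(Asset *a) { a->last_used = ++tick; }

/* Evict the least-recently-used, non-permanent, loaded asset; returns it. */
Asset *asset_evict_one(void)
{
    Asset *victim = 0;
    for (int i = 0; i < MAX_ASSETS; i++) {
        Asset *a = &cache[i];
        if (!a->loaded || a->permanent) continue;
        if (!victim || a->last_used < victim->last_used)
            victim = a;
    }
    if (victim)
        victim->loaded = 0;   /* release its graphic data here */
    return victim;
}
```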

How to run unmanaged executable from memory rather than disc

I want to embed a command-line utility in my C# application, so that I can grab its bytes as an array and run the executable without ever saving it to disk as a separate file (this avoids storing the executable as a separate file and avoids needing the ability to write temporary files anywhere).
I cannot find a method to run an executable from just its byte stream. Does Windows require it to be on disk, or is there a way to run it from memory?
If Windows requires it to be on disk, is there an easy way in the .NET framework to create a virtual drive/file of some kind and map the file to the executable's memory stream?
You are asking for a very low-level, platform-specific feature to be implemented in a high-level, managed environment. Anything's possible...but nobody said it would be easy...
(BTW, I don't know why you think temp file management is onerous. The BCL does it for you: http://msdn.microsoft.com/en-us/library/system.io.path.gettempfilename.aspx )
1. Allocate enough memory to hold the executable. It can't reside on the managed heap, of course, so like almost everything in this exercise you'll need to PInvoke. (I recommend C++/CLI, actually, so as not to drive yourself too crazy). Pay special attention to the attribute bits you apply to the allocated memory pages: get them wrong and you'll either open a gaping security hole or have your process shut down by DEP (i.e., you'll crash). See http://msdn.microsoft.com/en-us/library/aa366553(VS.85).aspx
2. Locate the executable in your assembly's resource library and acquire a pinned handle to it.
3. Memcpy() the code from the pinned region of the managed heap to the native block.
4. Free the GCHandle.
5. Call VirtualProtect to prevent further writes to the executable memory block.
6. Calculate the address of the executable's Main function within your process' virtual address space, based on the handle you got from VirtualAlloc and the offset within the file as shown by DUMPBIN or similar tools.
7. Place the desired command line arguments on the stack. (Windows stdcall convention.) Any pointers must point to native or pinned regions, of course.
8. Jump to the calculated address. Probably easiest to use _call (inline assembly language).
9. Pray to God that the executable image doesn't have any absolute jumps in it that would've been fixed up by calling LoadLibrary the normal way. (Unless, of course, you feel like re-implementing the brains of LoadLibrary during step #3.)
10. Retrieve the return value from the EAX register.
11. Call VirtualFree.
Steps #5 and #11 should be done in a finally block and/or use the IDisposable pattern.
The other main option would be to create a RAMdrive, write the executable there, run it, and cleanup. That might be a little safer since you aren't trying to write self-modifying code (which is tough in any case, but especially so when the code isn't even yours). But I'm fairly certain it will require even more platform API calls than the dynamic code injection option -- all of them requiring C++ or PInvoke, naturally.
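For what it's worth, the native side of the VirtualAlloc/VirtualProtect route looks roughly like the sketch below. It is deliberately simplified: it only works for a self-contained, position-independent blob of code, not a full .exe image, which would still need its imports resolved and relocations applied (the "brains of LoadLibrary" mentioned above). The run_from_memory name and the entry-point-at-offset-0 assumption are illustrative only.

```c
/* Simplified sketch of executing a raw code blob from memory on Windows. */
#include <windows.h>
#include <string.h>

typedef int (WINAPI *entry_fn)(void);

int run_from_memory(const unsigned char *code, size_t size)
{
    /* 1. Reserve and commit writable memory outside the managed heap. */
    void *mem = VirtualAlloc(NULL, size, MEM_COMMIT | MEM_RESERVE, PAGE_READWRITE);
    if (!mem) return -1;

    /* 2. Copy the code in while the pages are still writable. */
    memcpy(mem, code, size);

    /* 3. Flip the pages to read+execute so DEP doesn't kill the process. */
    DWORD old;
    if (!VirtualProtect(mem, size, PAGE_EXECUTE_READ, &old)) {
        VirtualFree(mem, 0, MEM_RELEASE);
        return -1;
    }
    FlushInstructionCache(GetCurrentProcess(), mem, size);

    /* 4. Call the (assumed) entry point at offset 0 and collect the result. */
    int ret = ((entry_fn)mem)();

    /* 5. Clean up. */
    VirtualFree(mem, 0, MEM_RELEASE);
    return ret;
}
```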
Take a look at the "In Memory" section of this paper. Realize that it's from a remote DLL injection perspective, but the concept should be the same.
Remote Library Injection
Creating a RAMdisk or dumping the code into memory and then executing it are both possible, but extremely complicated solutions (possibly more so in managed code).
Does it need to be an executable? If you package it as an assembly, you can use Assembly.Load() from a memory stream - a couple of trivial lines of code.
Or if it really has to be an executable, what's actually wrong with writing a temp file? It'll take a few lines of code to dump it to a temp file, execute it, wait for it to exit, and then delete the temp file - it may not even get out of the disk cache before you've deleted it! Sometimes the simple, obvious solution is the best solution.
This is explicitly not allowed in Vista+. You can use some undocumented Win32 API calls in XP to do this but it was broken in Vista+ because it was a massive security hole and the only people using it were malware writers.