Block Reloc Tables in DLL Files - dll

I write assemblers and compilers, and a couple of years ago I was able to directly write PE+ EXE files.
Now I'm extending that to DLL files, and I'm slowly getting there. I'm just about to start generating a Block Relocation Table (I only found out yesterday that I needed one).
However, this table specifies relocations grouped within 4KB blocks. My question is, what happens if a relocatable field (it'll be 32- or 64-bits) crosses over into the next 4K block? That is, crosses the boundary from this 4096-byte virtual address page into the next.
For example, if a 64-bit relocatable field starts at offset 0xFFE in this page, and finishes at 0x005 in the next (2 bytes in one and 6 in the other). Will the relocation mechanism deal with that? If not, what do I have to do?

Never mind. I set up a test to see what would happen. This was an array of some 700 64-bit addresses, deliberately misaligned so that one word would cross a 4K boundary, and I examined the block table produced by an existing linker. (Harder with code because the linker I used - gcc/ld - injected lots of code of its own. Another reason I need my own solution).
And there was indeed a 64-bit relocation item that started at offset 0xFFB.
I guess this was the only practical way of doing it (relocated fields in code will not usually be aligned), but it would be nice if the docs mentioned it.
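For anyone generating the table by hand, here is a minimal C sketch of emitting one such block. The layout follows the PE/COFF base relocation format (an 8-byte header per 4 KB page, then 16-bit entries packing a 4-bit type and a 12-bit offset within the page); the struct name, page RVA and offsets here are made up for illustration, modeled on the Windows SDK's IMAGE_BASE_RELOCATION, and worth checking against the spec for your toolchain.

    #include <stdint.h>
    #include <stdio.h>

    /* One .reloc block header: page RVA plus total block size in bytes. */
    typedef struct {
        uint32_t VirtualAddress;  /* RVA of the 4 KB page this block covers */
        uint32_t SizeOfBlock;     /* header + all 16-bit entries, in bytes  */
    } BaseRelocBlock;

    #define IMAGE_REL_BASED_DIR64    10  /* relocate a full 64-bit field */
    #define IMAGE_REL_BASED_ABSOLUTE  0  /* padding entry, no relocation */

    /* Each entry packs a 4-bit type and a 12-bit offset within the page.
       A 64-bit field that starts at page offset 0xFFB is still described
       by a single entry in the block for that page, even though its last
       bytes spill into the next page. */
    static uint16_t make_entry(uint16_t type, uint16_t page_offset)
    {
        return (uint16_t)((type << 12) | (page_offset & 0x0FFF));
    }

    int main(void)
    {
        uint16_t entries[2];
        entries[0] = make_entry(IMAGE_REL_BASED_DIR64, 0xFFB);
        entries[1] = make_entry(IMAGE_REL_BASED_ABSOLUTE, 0); /* pad to 4-byte multiple */

        BaseRelocBlock block = {
            .VirtualAddress = 0x3000,                              /* example page RVA */
            .SizeOfBlock    = sizeof(BaseRelocBlock) + sizeof entries,
        };

        FILE *f = fopen("reloc_block.bin", "wb");
        if (!f) return 1;
        fwrite(&block, sizeof block, 1, f);
        fwrite(entries, sizeof entries, 1, f);
        fclose(f);
        return 0;
    }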

Related

Reading DIEs in ELF file

Hello, I'm fairly new to the DWARF standard and the ELF format. I have a few questions. I am using the DWARF 2 standard; I have a pretty basic understanding of how DIEs work, but I need more clarity on how they are represented in bytes.
The ELF Wiki provides a good table showing the order in which bytes go in the program header, sections, and segments. But what is the correct way to represent DIEs in bytes for the DWARF 2 standard?
I have tried to dive deep into the DWARF standard's PDF documents to understand how DIEs are represented in bytes. Perhaps there is a section I am missing?
I would like to use this information to be able to delete certain DIEs to save space in the debugging section. I am only interested in the DIEs that provide variable addresses.
I recommend that anyone starting out in DWARF begin with the Introduction to the DWARF Debugging Format. It's a very concise overview that provides an excellent foundation for exploring the format in further depth. Armed with this background, compile a debug version of a very simple program and compare a hex dump of the two ELF sections .debug_abbrev and .debug_info with the output of dwarfdump or readelf.
Once you are broadly familiar with the encoding of a DIE you will see that simply deleting its corresponding bytes from .debug_info would corrupt the entire file — in terms of both DWARF and ELF. For example, each DIE is identified by its relative file offset; deleting one DIE's bytes would alter the offsets of all subsequent DIEs and any references to them would therefore be broken. A robust solution would require parsing the DWARF to create an internal representation of the tree before eliminating unwanted nodes and writing out new DWARF. After modifying .debug_info you'd then need to edit the fabric of the ELF itself: at the very least, this would involve updating the section header table to reflect the new offsets for any shifted sections and updating any relocations.
If your principal concern is indeed space saving then I suggest you instead investigate what compiler options you have. The Oracle Studio Compilers, for example, allow fine control over the content included in the DWARF. Depending on your compiler and OS it may also be possible to emit files with compressed DWARF sections (e.g. .zdebug_info) or even leave the DWARF in different files altogether. The problem of DWARF bloat is well known and, if you are interested in tackling it at a low level yourself, you will find other suggestions in Michael Eager's introduction and in later versions of the standard.
The format is explained on page 66, in sections 7.5.2 and 7.5.3.
The example in appendix 2, page 93, is much clearer:
Each DIE references a corresponding entry in .debug_abbrev which defines a given DIE "signature", i.e.:
its type (DW_TAG_*)
whether it has child DIEs
its attributes (DW_AT_*) and their forms (DW_FORM_*).
The format of the DIE is:
a reference to an abbreviation (LEB128, i.e. variable length);
0 is used for ending a list of children;
one value per attribute (using the encoding associated with the given form).
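To make the byte-level encoding concrete, here is a minimal C sketch of decoding the unsigned LEB128 abbreviation code that starts each DIE in .debug_info. The byte buffer is a made-up fragment, not real compiler output; a real reader would then look up the abbreviation in .debug_abbrev to know which attribute values follow.

    #include <stddef.h>
    #include <stdint.h>
    #include <stdio.h>

    /* Decode one unsigned LEB128 value (the variable-length encoding DWARF
       uses for the abbreviation code at the start of each DIE, among other
       things). Each byte carries 7 payload bits; the high bit says whether
       another byte follows. */
    static uint64_t read_uleb128(const uint8_t *buf, size_t *pos)
    {
        uint64_t result = 0;
        unsigned shift = 0;
        uint8_t byte;

        do {
            byte = buf[(*pos)++];
            result |= (uint64_t)(byte & 0x7F) << shift;
            shift += 7;
        } while (byte & 0x80);

        return result;
    }

    int main(void)
    {
        /* Made-up fragment of .debug_info: abbreviation code 1, then the
           attribute values that abbreviation declares; a code of 0 would
           mark the end of a sibling list. */
        const uint8_t debug_info[] = { 0x01, /* ... attribute values ... */ 0x00 };
        size_t pos = 0;

        uint64_t abbrev_code = read_uleb128(debug_info, &pos);
        printf("abbreviation code: %llu\n", (unsigned long long)abbrev_code);
        return 0;
    }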

Why Does a .dll file allow programs to be modularized?

Excerpt from Microsoft's "What is a .dll?":
"By using a DLL, a program can be modularized into separate
components. For example, an accounting program may be sold by module.
Each module can be loaded into the main program at run time if that
module is installed. Because the modules are separate, the load time
of the program is faster, and a module is only loaded when that
functionality is requested. Additionally, updates are easier to apply
to each module without affecting other parts of the program. For
example, you may have a payroll program, and the tax rates change each
year. When these changes are isolated to a DLL, you can apply an
update without needing to build or install the whole program again."
Ref: http://support.microsoft.com/kb/815065
DLLs are:
loaded at runtime
can be "dynamically loaded" (by multiple programs at the same time)
- which allows resources to be saved
- lowers disk space requirements
But why do they promote "modularizing" programs? What would happen if there weren't .dll files? Could someone provide/expand on the example?
Modular programs provide a way of making a particular functionality available to many programs without having to include the same code in all of them. Also, they allow greater compatibility between programs since they would essentially use the same methods in common DLLs to obtain the same results.
One would write a program in a modular fashion such that different parts of the program could be maintained separately. Say you had some clever way of reading and writing your own data format to files. Say you make improvements to that technique. If the code for reading and writing the files lived in a DLL, you would only need to update the DLL. The program itself would remain unchanged.
If you have one monolithic EXE, you have to
pay for all the extra time relinking it, even if 1 source file changed (this is painful if it's > 80 MB, as is the case in large projects),
ship the entire EXE, when you could instead ship only a single DLL that is a fraction of the size (for patches/updates).
Breaking it up into DLLs, you
have pluggability: The EXE is the host application and others can write DLLs that "plug into" the host via a well-defined interface. DLLs can be interchanged as long as they conform to the interface.
can share code across other DLLs and EXEs.
can have some DLLs be optionally loaded on demand, only if they're used, and unloaded when they're not needed (see the sketch after this list)
similar to above, have optional functionality. With a single EXE you have to download everything, even if some components are rarely used. With DLLs, you could have a system that downloads and installs features as needed.
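As a rough sketch of the pluggability and load-on-demand points above, the C snippet below loads a module only when its feature is requested and calls it through an agreed interface. The DLL name "payroll.dll" and its exported Calculate function are hypothetical, chosen to echo the payroll example in the Microsoft excerpt.

    #include <stdio.h>
    #include <windows.h>

    /* The EXE only knows this interface; any DLL exporting a matching
       "Calculate" function could be dropped in. */
    typedef int (*calculate_fn)(int year);

    int main(void)
    {
        HMODULE mod = LoadLibraryA("payroll.dll");   /* load on demand */
        if (!mod) {
            printf("payroll module not installed; feature disabled\n");
            return 0;
        }

        calculate_fn calc = (calculate_fn)GetProcAddress(mod, "Calculate");
        if (calc)
            printf("tax for 2024: %d\n", calc(2024));

        FreeLibrary(mod);                            /* unload when done */
        return 0;
    }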
The biggest advantage of dlls is probably during development of the original program. Without dlls you wouldn't be able to integrate with existing libraries without including the original source code. By including an existing library as a dll you don't need the source since it's all encapsulated in the dll. It would be a nightmare to develop in frameworks like .Net without dlls since you constantly include other libraries...
The alternative to breaking your program down in n > 1 pieces is to keep it in n == 1 piece. Why is this bad? Well it isn't always bad (maybe the BIOS is a good example?). But for user programs it usually is. Why? First we need to define what a program is.
What is a program?
A simple "program", roughly speaking, consists of an entry point (i.e. offset to the main function), functions and global variables. A function consists of instructions and information about what local variables are needed to run the function. To be executed a program must be loaded in primary memory/RAM (the aforementioned information). Because our program has functions (and not just jump statements), that implies the existence of a stack, which implies the existence of a containing environment managing the stack. (I suppose you could have a program that manages its own stack but I'd argue then your program is not a program anymore but an environment.) This environment contains the program, starts in the entry point and executes each instruction, be it "go to this part of the RAM and add it to whatever is in this register" or "If this register is all 0 then jump ahead this many instructions and resume execution there" indefinitely or until the program gives control back to its environment. (This is somewhat simplified - context switches in multi-process environments, illegal memory access, illegal instructions, etc. can also cause control to be taken from the program.)
Anyway, so we have two options: either load the entire program at once or have it stored and loaded in pieces.
n == 1
There are some advantages to doing it all at once:
Once the program is in memory no disk access is required to execute further (unless the program explicitly asks there to be).
Since the program is compiled/linked before execution begins you can do everything without any sort of string names/comparisons - go directly to the address (or an offset).
Functions are never out of sync with one another.
n > 1
There are some disadvantages, though, which mirror the advantages:
Most programs don't execute all code paths most of the time. I think there are studies showing that in most programs, most of the execution time is spent in a small fraction of the instructions present in the program. In other words, something like 20% of the program is executed 80% of the time (I just made that particular figure up - but you get the idea). If we divide our program up enough and only load instruction sets (i.e. functions) as they are needed, then we won't waste time loading the 80% we'll never use in this execution of the program. Along these lines we can ultimately fit more concurrently executing programs in our RAM at once if we only end up loading the fraction of the program we need.
Most programs share similar functions (i.e. storing data/trees/hashes/sorting/etc., reading input, writing output, etc.) and if each program has its own local copy then you can't reuse instruction code.
Many programs depend on the existence of others and are maintained by separate companies/groups/individuals. By releasing versioned modules we don't have to synchronize releases all the time.
Conclusion
These aren't the only points to consider, but they are the first ones that came to my mind. I'd recommend reading about compilers, linkers and operating systems. That will answer this question more thoroughly than I can, as well as other questions I'm sure this has brought up. To recap: DLLs aren't the "best" way of packaging executable programs in all situations and circumstances - they have a particular use, with advantages and disadvantages.

ELF files - What is a section and why do we need it?

I've been reading ELF standard here. From what I understand, each ELF contains ELF header, program headers (why more than one?) and section headers. Can anyone please explain:
How are ELF files generated? Is it the compiler's responsibility?
What are sections and why do we need them?
What are program headers and why do we need them?
Inside program headers, what's the meaning of the fields p_vaddr and p_paddr?
Does each section have its own section header?
Alternatively, does any one have a link to a more friendly documenation of ELF?
How are ELF files generated? Is it the compiler's responsibility?
They can be generated by a compiler, an assembler, or any other tool that can generate them. Even your own program you wrote for generating ELF files ;) They're just streams of bytes after all, so they can be generated by just writing bytes into a file in binary mode. You can do that too.
What are sections and why do we need them?
ELF files are subdivided into sections. Sections are the smallest contiguous regions in the file. You can think of them as pages in an organizer, each with its own name and type that describes what it contains. Linkers use this information to combine different parts of the program coming from different modules into one executable file or a library, by merging sections of the same type (gluing pages together, if you will).
In executable files, sections are optional, but they're usually there to describe what's in the file, where it begins, and how many bytes it takes.
What are program headers and why do we need them?
They're mostly for making executable files. In order to run a program, sections aren't enough, because you have to specify not only what's there in the file, but also where it should be loaded into memory in the running process. Program headers are just for that purpose: they describe segments, which are regions of memory in the running process, with different access privileges & stuff.
Each program header describes one segment. It tells the loader where it should load a certain region of the file into memory and what permissions it should set for that region (e.g. should it be allowed to execute code from it? Should it be writable, or read-only?)
Segments can be further subdivided into sections. For example, you might specify that your code segment is further subdivided into code and static read-only strings for the messages the program displays. Or that your data segment is subdivided into funky data and hardcore data :J It's for you to decide.
In executable files, sections are optional, but it's nice to have them, because they describe what's in the file and allow for dumping selected parts of it (e.g. with the objdump tool). Sometimes they are needed, though, for storing dynamic linking information, symbol tables, debugging information, stuff like that.
Inside program headers, what's the meaning of the fields p_vaddr and p_paddr?
Those are the addresses at which the data in the file will be loaded. They map the contents of the file into their corresponding memory locations. The first one is a virtual address, the second one is physical address.
Physical addresses are the "raw" memory addresses. On modern operating systems, those are no longer used in the userland. Instead, userland programs use virtual addresses. The operating system deceives the userland program that it is alone in memory, and that the entire address space is available for it. Under the hood, the operating system maps those virtual addresses to physical ones in the actual memory, and it does it transparently to the program.
Of course, not every address in the virtual address space is available at the same time. There are limitations imposed by the actual physical memory available. So the operating system just maps the memory for the segments the program actually uses (here's where the "segments" part from the ELF file's program headers comes into play). If the process tries to access some unmapped memory, the operating system steps in and says, "sorry, chap, this memory doesn't belong to you". (The program can address it, but it cannot access it.)
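If you want to see those fields for yourself, here is a minimal sketch that walks the program header table using the definitions from <elf.h> and prints p_vaddr and p_paddr for each segment. It assumes a well-formed 64-bit ELF of the native endianness and skips most error checking.

    #include <elf.h>
    #include <stdio.h>

    /* Print the segments described by an ELF file's program headers. */
    int main(int argc, char **argv)
    {
        if (argc != 2) return 1;

        FILE *f = fopen(argv[1], "rb");
        if (!f) return 1;

        Elf64_Ehdr ehdr;
        fread(&ehdr, sizeof ehdr, 1, f);

        for (int i = 0; i < ehdr.e_phnum; i++) {
            Elf64_Phdr phdr;
            fseek(f, (long)(ehdr.e_phoff + (Elf64_Off)i * ehdr.e_phentsize), SEEK_SET);
            fread(&phdr, sizeof phdr, 1, f);

            printf("segment %d: type=%u offset=0x%llx vaddr=0x%llx paddr=0x%llx\n",
                   i, (unsigned)phdr.p_type,
                   (unsigned long long)phdr.p_offset,
                   (unsigned long long)phdr.p_vaddr,
                   (unsigned long long)phdr.p_paddr);
        }

        fclose(f);
        return 0;
    }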
Does each section have its own section header?
Yes. If it doesn't have an entry in the Section Headers Table, it's not a section :q Because the only way to tell if some part of the file is a section is by looking into the Section Headers Table, which tells you what sections are defined in the file and where you can find them.
You can think of the Section Headers Table as a table of contents in a book. Without the table of contents, there aren't any chapters after all, because they're not listed anywhere. The book may have headings, but the content is not subdivided into logical chapters that can be found through the table of contents. Same goes with sections in ELF files: there can be some regions of data, but you can't tell without the "table of contents" which is the SHT.
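Continuing the table-of-contents analogy, this small sketch lists every entry in the Section Headers Table together with its name from the section name string table (itself just another section). Again it assumes a well-formed, native-endian 64-bit ELF and omits error handling.

    #include <elf.h>
    #include <stdio.h>
    #include <stdlib.h>

    /* List the "table of contents" of an ELF file: its section headers. */
    int main(int argc, char **argv)
    {
        if (argc != 2) return 1;

        FILE *f = fopen(argv[1], "rb");
        if (!f) return 1;

        Elf64_Ehdr ehdr;
        fread(&ehdr, sizeof ehdr, 1, f);

        Elf64_Shdr *shdrs = malloc(ehdr.e_shnum * sizeof *shdrs);
        fseek(f, (long)ehdr.e_shoff, SEEK_SET);
        fread(shdrs, sizeof *shdrs, ehdr.e_shnum, f);

        /* The section name string table is referenced by e_shstrndx. */
        Elf64_Shdr strtab = shdrs[ehdr.e_shstrndx];
        char *names = malloc(strtab.sh_size);
        fseek(f, (long)strtab.sh_offset, SEEK_SET);
        fread(names, 1, strtab.sh_size, f);

        for (int i = 0; i < ehdr.e_shnum; i++)
            printf("section %2d: %s\n", i, names + shdrs[i].sh_name);

        free(names);
        free(shdrs);
        fclose(f);
        return 0;
    }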
This link includes a better explanation.
How are ELF files generated? Is it the compiler's responsibility?
It is architecture dependent.
What are sections and why do we need them?
Different sections contain different information, such as code, initialized data, uninitialized data, etc. This information is used by the compiler and linker.
What are program headers and why do we need them?
Program headers are used by the operating system when it loads the executable. These headers contain information about the segments (contiguous memory blocks with certain permissions), such as which parts need to be loaded, interpreter info, etc.
Inside program headers, what's the meaning of the fields p_vaddr and p_paddr?
In general, the virtual address and the physical address are the same, but they can differ depending on the system.
Does each section have its own section header?
Yes. Each section has a section header entry in the section header table.
This is the best documentation I've found: http://www.skyfree.org/linux/references/ELF_Format.pdf
Each section has only one section header, but there can be section headers without sections
2 - There are many different sections, e.g. the relocation section records information for relocation symbols. I use that information to load an ELF object and run/relocate the object.
Another example: the debug section records debug information; gdb uses that data for showing debug messages.
The symbol section records symbol information.
3 - The program header is used by the loader; the loader loads an ELF executable file by looking up the program header.

What form is DLL & what makes it processor dependent

I know a DLL contains one or more exported functions that are compiled, linked, and stored separately..
My question is not about how to create it.. it is all about what form it is stored in.. Is it going to be in the form of 0's & 1's.. or in assembly commands ADD, MUL, DIV, MOV, CALL, RETURN etc..
Also, what makes it processor dependent.. (like x86, x87, the IBM 700 instruction set)..
Can someone please explain it briefly..!
First of all, everything in a computer is in the form of "0's & 1's". The fact that the computer can display some of these as text, pictures, sounds, 3D models, etc. is just a matter of how you interpret them. But down there, at the metal, it's all just "0's & 1's" (also known as bits). Note though that they are always grouped together in groups of 8, and these are called "bytes". It's really for the sake of efficiency, because operating with every bit individually would be too tedious. Actually, today's computers don't even operate on single bytes anymore (or rather - they do it very rarely). Mostly you operate with 4 or 8 bytes at a time, depending on whether you have a 32-bit or 64-bit CPU (that's in layman's terms, it's actually a bit more complicated than that).
As for a .DLL file - like an .EXE file, it contains bytes that describe instructions that a CPU can execute. The CPU takes these bytes directly from the .DLL/.EXE and executes them without any further modifications. That's why these files are CPU-specific. In different CPU architectures the same combination of bytes means different things, so a .DLL/.EXE will run correctly only on the CPU for which it was designed. On other CPUs these bytes will mean some other instructions, and when run, the program will most likely do some utter nonsense and crash immediately.
The assembly commands you mentioned also deserve an explanation. "Assembler" is not a language that a CPU can understand. It's a language a human can understand. It was created because writing directly in machine code (the bytes that the CPU actually understands) is very difficult. What you get is utter gibberish on the screen (try opening some .EXE file in Notepad!) but every bit has to be precisely set for it to work.
So assembly language is basically the same thing, except these instructions are written in text that humans can read. For every machine code that a CPU can understand, there is an instruction with a human-friendly name. An assembler simply reads these instructions and replaces them with the bytes that represent the actual instructions for the CPU to execute. It's a 1:1 operation. Every command in assembly language matches a single machine instruction (again, in layman's terms).
So you see, there isn't even a single assembly language. Every CPU architecture has its own assembly language, because they each have different instructions.
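As a small illustration of that 1:1 mapping (assuming an x86-64 CPU), the byte array in this C sketch is the machine encoding of two assembly instructions; the program just dumps the raw bytes rather than executing them.

    #include <stdio.h>

    /* The raw bytes an x86-64 CPU would execute for two instructions.
       On another architecture the same bytes would mean something
       entirely different (or nothing valid at all). */
    static const unsigned char code[] = {
        0xB8, 0x2A, 0x00, 0x00, 0x00,   /* mov eax, 42 */
        0xC3                            /* ret         */
    };

    int main(void)
    {
        for (size_t i = 0; i < sizeof code; i++)
            printf("%02X ", code[i]);
        printf("\n");
        return 0;
    }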
Note though that all this applies to native .DLL/.EXE files. .NET files are different - they don't contain machine code, but rather instructions for an abstract, nonexistent CPU. It's like Java bytecodes. When a .NET .DLL/.EXE is run, the .NET runtime translates it from the abstract instructions to the instructions that the specific CPU can understand. They use a lot of tricks to make this very fast, so these files run almost as fast as simple .DLL/.EXE files.
Does this clear things up? :)
Native DLLs (not .NET assemblies) usually contain machine code that can only be run on a certain platform. The machine code is a sequence of bytes that the processor treats as instructions (ADD, MOV, etc.).
In Windows, DLLs are stored in the PE format, which is basically a collection of sections that holds the information about how to map the file into memory. Some sections contain the program's code (which is of course processor dependent), others contain the program's data, others the exported and imported functions, and so on.
Managed code is compiled to an intermediate language that is JITed by the run-time as it is executed. Therefore, your DLL won't contain any processor-dependent code and you'll be able to execute your program on any platform with the relevant run-time.
It depends on your DLL. Generally, a DLL contains executable code just as an EXE file does. Those code DLLs are processor dependent, since the code can only be executed on a specific platform. The code is stored using the same "format" as in an EXE file (binary machine code).
However, a DLL can sometimes contain only data: it is then called a "resource DLL" and is not processor dependent at all. It acts as a container for data files used by applications.
Note that many DLLs are hybrids: they contain both code and resources. For example, most DLLs which comprise the user part of the Windows operating system are hybrids: you can open them using Visual Studio or a resource explorer to see the resources (the data segments) they contain, or open them with Dependency Walker or dumpbin to see the functions (the code segments) they contain.
(Of course, this answer is really Windows specific; I don't know about .so files, which are the Linux equivalent of a DLL.)
Both a DLL and an EXE contain executable code.
In the case of a DLL, it doesn't have the necessary parts to be directly executable. It must be called from another piece of executable code. One DLL can call another, but all must ultimately be called from an EXE.
So the rules about processor compatibility that apply to EXEs also apply to DLLs.

Process for reducing the size of an executable

I'm producing a hex file to run on an ARM processor which I want to keep below 32K. It's currently a lot larger than that and I wondered if someone might have some advice on what's the best approach to slim it down?
Here's what I've done so far
So I've run 'size' on it to determine how big the hex file is.
Then I ran 'size' again to see how big each of the object files that are linked to create the hex file is. It seems the majority of the size comes from external libraries.
Then I used 'readelf' to see which functions take up the most memory.
I searched through the code to see if I could eliminate calls to those functions.
Here's where I get stuck: there are some functions which I don't call directly (e.g. _vfprintf) and I can't find what calls them so that I can remove the call (as I think I don't need it).
So what are the next steps?
Response to answers:
As far as I can see, there are functions being called which take up a lot of memory. However, I cannot find what is calling them.
I want to omit those functions (if possible) but I can't find what's calling them! They could be called from any number of library functions, I guess.
The linker is working as desired, I think, it only includes the relevant library files. How do you know if only the relevant functions are being included? Can you set a flag or something for that?
I'm using GCC
General list:
Make sure that you have the compiler and linker debug options disabled
Compile and link with all size options turned on (-Os in gcc)
Run strip on the executable
Generate a map file and check your function sizes. You can either get your linker to generate your map file (-M when using ld), or you can use objdump on the final executable (note that this will only work on an unstripped executable!) This won't actually fix the problem, but it will let you know of the worst offenders.
Use nm to investigate the symbols that are called from each of your object files. This should help in finding who's calling functions that you don't want called.
In the original question was a sub-question about including only relevant functions. gcc will include all functions within every object file that is used. To put that another way, if you have an object file that contains 10 functions, all 10 functions are included in your executable even if only 1 is actually called.
The standard libraries (eg. libc) will split functions into many separate object files, which are then archived. The executable is then linked against the archive.
By splitting into many object files the linker is able to include only the functions that are actually called. (this assumes that you're statically linking)
There is no reason why you can't do the same trick. Of course, you could argue that if the functions aren't called then you can probably remove them yourself.
If you're statically linking against other libraries you can run the tools listed above over them too to make sure that they're following similar rules.
Another optimization that might save you work is -ffunction-sections, -Wl,--gc-sections, assuming you're using GCC. A good toolchain will not need to be told that, though.
Explanation: GNU ld links sections, and GCC emits one section per translation unit unless you tell it otherwise. But in C++, the nodes in the dependency graph are objects and functions.
On deeply embedded projects I always try to avoid using any standard library functions. Even simple functions like "strtol()" blow up the binary size. If possible just simply avoid those calls.
In most deeply embedded projects you don't need a versatile "printf()" or dynamic memory allocation (many controllers have 32kb or less RAM).
Instead of just using "printf()" I use a very simple custom "printf()", this function can only print numbers in hexadecimal or decimal format not more. Most data structures are preallocated at compile time.
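Here is a rough sketch of what such a stripped-down print routine might look like. The function names are made up; for this host-side demo the output hook just forwards to putchar(), where on a real target you would point it at your UART instead.

    #include <stdio.h>

    /* Minimal number printer in the spirit described above: decimal or
       hex only, no field widths, no floats. Replace tiny_putchar() with
       your UART transmit routine on the target. */
    static void tiny_putchar(char c) { putchar(c); }

    static void tiny_print_num(unsigned long value, unsigned base)
    {
        char buf[2 + 8 * sizeof value];   /* worst case: one char per bit */
        int  i = 0;

        do {
            unsigned digit = (unsigned)(value % base);
            buf[i++] = (char)(digit < 10 ? '0' + digit : 'a' + (digit - 10));
            value /= base;
        } while (value != 0);

        while (i > 0)                      /* digits were built backwards */
            tiny_putchar(buf[--i]);
    }

    int main(void)
    {
        tiny_print_num(1234, 10);   tiny_putchar('\n');
        tiny_print_num(0xDEAD, 16); tiny_putchar('\n');
        return 0;
    }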
Andrew EdgeCombe has a great list, but if you really want to scrape every last byte, sstrip is a good tool that is missing from the list and can shave off a few more kB.
For example, when run on strip itself, it can shave off ~2kB.
From an old README (see the comments at the top of this indirect source file):
sstrip is a small utility that removes the contents at the end of an
ELF file that are not part of the program's memory image.
Most ELF executables are built with both a program header table and a
section header table. However, only the former is required in order
for the OS to load, link and execute a program. sstrip attempts to
extract the ELF header, the program header table, and its contents,
leaving everything else in the bit bucket. It can only remove parts of
the file that occur at the end, after the parts to be saved. However,
this almost always includes the section header table, and occasionally
a few random sections that are not used when running a program.
Note that due to some of the information that it removes, a sstrip'd executable is rumoured to have issues with some tools. This is discussed more in the comments of the source.
Also... for an entertaining/crazy read on how to make the smallest possible executable, this article is worth a read.
Just to double-check and document for future reference: do you use Thumb instructions? They're 16-bit versions of the normal instructions. Sometimes you might need two 16-bit instructions, so it won't save 50% in code space.
A decent linker should take just the functions needed. However, you might need compiler & linker settings to package functions for individual linking.
Ok so in the end I just reduced the project to its simplest form, then slowly added files one by one until the function that I wanted to remove appeared in the 'readelf' output. Then when I had the file, I commented everything out and slowly added things back in until the function popped up again. So in the end I found out what called it and removed all those calls... Now it works as desired... sweet!
Must be a better way to do it though.
To answer this specific need:
• I want to omit those functions (if possible) but I can't find what's calling them! Could be called from any number of library functions I guess.
If you want to analyze your code base to see who calls what, by whom a given function is being called and things like that, there is a great tool out there called "Understand C" provided by SciTools.
https://scitools.com/
I have used it very often in the past to perform static code analysis. It can really help to determine the library dependency tree. It allows you to easily browse up and down the calling tree, among other things.
They provide a limited time evaluation, then you must purchase a license.
You could look at something like executable compression.