Comparing Object Files with ObjDump - elf

So I am trying to compare two ELF files original/gen_twiddle_fft16x16_imre.oe674 and new/gen_twiddle_fft16x16_imre.oe674 to see if they are the same thing. I have a hunch that they are, but, I can't tell exactly.
I can't just compare the code size and hope for the best, because their sizes are a few hundred bytes.
I run the disassembler:
/ti_packages/all_packages/ccs1200/ccs/tools/compiler/ti-cgt-armllvm_2.1.0.LTS/bin/tiarmobjdump -D directory/gen_twiddle_fft16x16_imre.oe674
and I get:
gen_twiddle_fft16x16_imre.oe674: file format elf32-unknown
/ti_packages/all_packages/ccs1200/ccs/tools/compiler/ti-cgt-armllvm_2.1.0.LTS/bin/tiarmobjdump: error: 'gen_twiddle_fft16x16_imre.oe674': can't find target: : error: unable to get target for 'unknown--', see --version and --triple
for both files. I look at the symbol table, and they are exactly the same, except for this part at the top:
original/:
00000000 l df *ABS* 00000000 .hidden **TIsUkimy7q4**
new/:
00000000 l df *ABS* 00000000 .hidden **TIgFVpK1DaG**
Questions:
What could be the reason for this difference / what does this difference mean?
Is there anything else I should try?

What could be the reason for this difference / what does this difference mean?
Looks like two randomly-generated symbol names. The difference likely doesn't mean anything.
Is there anything else I should try?
Comparing symbol tables isn't going to tell you anything useful.
You should compare disassembly of the two objects (objdump -dr output), and also compare .data and .rodata in them (which you can dump with objdump -sj.data ..., etc.).
If they have identical .text, .data and .rodata, it's likely that the two objects are effectively the same.
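If objdump does not recognize the target (as in the question), the raw section contents can still be compared directly. The following is a minimal sketch in Python, not a replacement for a real disassembly diff: it reads the section header table by hand and reports which of the named sections differ byte for byte. The function names read_sections and differing_sections are made up for this illustration, the section names are the ones mentioned above (a given toolchain may use other names), and relocation entries are not compared.

```python
import struct

SHT_NOBITS = 8  # sections such as .bss occupy no bytes in the file

def read_sections(data: bytes) -> dict:
    """Map section name -> raw file contents for an ELF32 or ELF64 image."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    end = "<" if data[5] == 1 else ">"          # EI_DATA: 1 = little-endian
    if data[4] == 1:                            # EI_CLASS: 1 = ELF32
        (shoff,) = struct.unpack_from(end + "I", data, 32)
        shentsize, shnum, shstrndx = struct.unpack_from(end + "HHH", data, 46)
        fmt = end + "IIIIII"
    else:                                       # ELF64
        (shoff,) = struct.unpack_from(end + "Q", data, 40)
        shentsize, shnum, shstrndx = struct.unpack_from(end + "HHH", data, 58)
        fmt = end + "IIQQQQ"
    # Each tuple is (sh_name, sh_type, sh_flags, sh_addr, sh_offset, sh_size).
    headers = [struct.unpack_from(fmt, data, shoff + i * shentsize)
               for i in range(shnum)]
    str_off, str_size = headers[shstrndx][4], headers[shstrndx][5]
    strtab = data[str_off:str_off + str_size]
    sections = {}
    for name_off, sh_type, _flags, _addr, off, size in headers:
        name = strtab[name_off:strtab.index(b"\0", name_off)].decode()
        sections[name] = b"" if sh_type == SHT_NOBITS else data[off:off + size]
    return sections

def differing_sections(a: bytes, b: bytes, names=(".text", ".data", ".rodata")):
    """Return the section names from `names` whose contents differ."""
    sa, sb = read_sections(a), read_sections(b)
    return [n for n in names if sa.get(n) != sb.get(n)]
```

Usage would be along the lines of differing_sections(open("original/x.o", "rb").read(), open("new/x.o", "rb").read()), where an empty list means the listed sections match.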

Related

Filter the output of GNU nm by section

I'm trying to identify the largest symbols in an .elf file for each memory section (.text, .data, .bss). So far I'm using GNU nm to get the largest symbols:
nm foo.elf --size-sort --reverse-sort --radix=d --demangle --line-numbers
Is there a builtin way in nm to filter the output by section or do I need to resort to text filtering?
nm outputs a section type for every symbol as a single-letter code (B: .bss, D: .data, T: .text), but there seems to be no way to filter by symbol type.
Background: The code runs on a microcontroller which is able to execute instructions directly from flash memory. The instructions from the .text section stay in the flash memory during execution, while .bss and .data are loaded into the RAM. That's why I would like to be able to identify the largest symbols in each section independently.
there seems no way to filter by symbol type.
Just use grep to perform any filtering you may need.
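If more than a one-line grep is needed, the same text filtering can be sketched in Python. This is only an illustration: filter_nm is a made-up helper, and it assumes the output of nm --size-sort --radix=d, where each line has a decimal size, a one-letter type, and then the symbol name. Lowercase letters (local symbols) are matched together with their uppercase counterparts.

```python
def filter_nm(output: str, letters: str):
    """Return (size, type_letter, name) for symbols whose type is in `letters`."""
    wanted = letters.upper()
    rows = []
    for line in output.splitlines():
        fields = line.split()
        # The type letter is assumed to be the first single alphabetic token.
        idx = next((i for i, f in enumerate(fields)
                    if len(f) == 1 and f.isalpha()), None)
        if idx is None or idx == 0:
            continue  # no type letter, or no size column (e.g. undefined symbols)
        if fields[idx].upper() in wanted:
            size = int(fields[idx - 1])          # decimal because of --radix=d
            name = " ".join(fields[idx + 1:])    # demangled names may contain spaces
            rows.append((size, fields[idx], name))
    return rows

# Hypothetical sample output, not taken from a real binary.
sample = """\
00000512 T main
00000128 B rx_buffer
00000064 d lookup_table
00000016 t helper
"""
text_symbols = filter_nm(sample, "T")
```

Calling filter_nm(sample, "BD") in the same way would keep only the symbols that end up in RAM.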
You may also want to look at Bloaty McBloatface: a size profiler for binaries.

Syntax error in INQUIRE(inpunit,flen=iflen) in gfortran but not in Lahey

I try to compile my code using gfortran. I got this error:
INQUIRE(inpunit,flen=iflen)
1
Error: Syntax error in INQUIRE statement at (1)
This code was compiled before with Lahey. After some quick research I found that the parameters of INQUIRE have different meanings in gfortran compared to Lahey.
inpunit is a scalar INTEGER expression that evaluates to the
input/output unit number of an external file.
flen is a scalar default INTEGER variable having the length of the file in bytes.
My question is: when using gfortran, is this statement correct to get the same functionality as in Lahey?
INQUIRE(inpunit,RECL=iflen)
Are these two statements similar?
Thanks
No, these two are completely different.
flen= is a nonstandard extension specific to the Lahey compiler and returns the length of the file.
recl= is the maximum record length in the file (if the file is connected - opened, otherwise it is 0) https://software.intel.com/en-us/fortran-compiler-developer-guide-and-reference-inquire-recl-specifier
To be standard conforming you should use size=. Be aware that the result will be in file storage units. Gfortran uses bytes, but other compilers may use 4-byte words.
See "What is a good way to get file size in bytes using Fortran (ifort)?" and "Find input file size in fortran90".

Strange variables with gdb

I am using gdb to debug a program in x86 assembly. Some variables behave strangely and I can't understand why.
This is how I define and view them:
section .data
CountDied: dd 0000
OnesFound: db 00
section .text
global _start
_start:
nop
... code
When I run gdb step by step, I check whether the variables have the correct values at the very first instruction, and I get the following:
print CountDied
$1=0
print OnesFound
$2=167772672
Though in the next instructions OnesFound seems to behave in a correct way. I'm really puzzled. Thanks for your suggestions.
An assembly "variable" is just a label for a specific point in memory. GDB doesn't know how big it is supposed to be, so it assumes that it's a 32-bit value.
The hex representation of the number you're getting is 0x0A000200. x86 is a little-endian platform, so that will actually be stored in memory as 00 02 00 0A. Only the first byte is actually part of the value you set, and it is set correctly.
You can view just the specific byte you want by using the command x/b &OnesFound instead of using print.
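The byte arithmetic above can be checked with a few lines of Python. This is only a worked illustration of the little-endian decoding; the four bytes are the ones quoted in the answer, not a dump taken from the asker's program.

```python
import struct

# The four bytes starting at OnesFound, in memory order.
memory = bytes([0x00, 0x02, 0x00, 0x0A])

# What "print OnesFound" shows: the bytes read as one little-endian 32-bit value.
as_dword = struct.unpack("<I", memory)[0]

# What "x/b &OnesFound" shows: only the first byte, which is the db 00 itself.
as_byte = memory[0]

print(hex(as_dword), as_dword, as_byte)
```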

How do I delete a program header from an ELF binary

I want to write a utility to remove a program header from an ELF binary. For example, when I run readelf -l /my/elf I get a listing of all the program headers: PHDR INTERP ... GNU_STACK GNU_RELRO. When I run my utility, I would like to get all the same program headers back in the same order, minus the one I deleted. Is there any easier way to do this than recreating the entire ELF from scratch, skipping the unwanted header?
Is there any easier way to do this than recreating the entire ELF from scratch
Sure: program headers form a fixed-record table at an offset given by ehdr.e_phoff, containing .e_phnum entries of .e_phentsize bytes.
To delete one entry, simply copy the rest of entries over it, and decrement .e_phnum. That's all there is to it.
Beware: deleting some entries will likely cause the dynamic loader to crash. GNU_STACK is about the only header that can be deleted without too much harm (that I can think of).
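The copy-and-decrement approach can be sketched in Python as follows. This is a minimal illustration working on the file contents as bytes, not a hardened tool: delete_phdr is a made-up name, the field offsets are those of the standard ELF32 and ELF64 file headers, and the freed slot at the end of the table is simply zeroed so the file size stays the same.

```python
import struct

def delete_phdr(image: bytes, index: int) -> bytes:
    """Return a copy of an ELF image with program header `index` removed."""
    if image[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    end = "<" if image[5] == 1 else ">"      # EI_DATA: 1 = little-endian
    if image[4] == 2:                        # EI_CLASS: 2 = ELF64
        phoff = struct.unpack_from(end + "Q", image, 32)[0]
        size_off, num_off = 54, 56           # e_phentsize, e_phnum
    else:                                    # ELF32
        phoff = struct.unpack_from(end + "I", image, 28)[0]
        size_off, num_off = 42, 44
    phentsize = struct.unpack_from(end + "H", image, size_off)[0]
    phnum = struct.unpack_from(end + "H", image, num_off)[0]
    if not 0 <= index < phnum:
        raise IndexError("no such program header")

    buf = bytearray(image)
    start = phoff + index * phentsize
    table_end = phoff + phnum * phentsize
    # Copy the later entries down over the deleted one.
    buf[start:table_end - phentsize] = buf[start + phentsize:table_end]
    # Zero the now-unused last slot and decrement e_phnum.
    buf[table_end - phentsize:table_end] = bytes(phentsize)
    struct.pack_into(end + "H", buf, num_off, phnum - 1)
    return bytes(buf)
```

The index is the position of the entry as listed by readelf -l, counting from zero.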
Update:
Yes, setting .p_type to PT_NULL is another (and simpler) approach. But such entries are generally not expected to be present, and you may find some systems where PT_NULL will trigger an assertion in the loader (or in some other program).
Finally, adding a new Phdr might be tricky. Usually there is no space to expand the table (as it is immediately followed by some other data, e.g. .text). You can relocate the table to the end of the file, and set .e_phoff and .e_phnum to correspond to the new table, but many programs expect the entire Phdr table to be loaded and available at runtime, and that is not easy to arrange, as the new location at the end of the file will not be "covered" by any PT_LOAD segment.
The GNU Binary File Descriptor library (libbfd) may be helpful.

In an ELF file, how does the address for _start get determined?

I've been reading the ELF specification and cannot figure out where the program entry point and _start address come from.
It seems like they should have to be in a pretty consistent place, but I made a few trivial programs, and _start is always in a different place.
Can anyone clarify?
The _start symbol may be defined in any object file. Normally it is supplied automatically by the C runtime startup code, which in turn calls main in a C program. You can also define it yourself, for instance in an assembler source file:
.globl _start
_start:
// assembly here
When the linker has processed all object files, it looks for the _start symbol and puts its value in the e_entry field of the ELF header. The loader takes the address from this field and transfers control to it after it has finished loading the file into memory and is ready to execute it.
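To see the value the linker stored, e_entry can be read straight from the first bytes of the file. The following is a small Python sketch under the standard ELF header layout (e_entry sits at byte offset 24 for both ELF32 and ELF64); the function name elf_entry is made up for this example. readelf -h reports the same field as "Entry point address".

```python
import struct

def elf_entry(data: bytes) -> int:
    """Return the e_entry field from the start of an ELF image."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    is64 = data[4] == 2                      # EI_CLASS: 1 = 32-bit, 2 = 64-bit
    end = "<" if data[5] == 1 else ">"       # EI_DATA: 1 = little-endian
    # e_ident (16) + e_type (2) + e_machine (2) + e_version (4) = offset 24
    return struct.unpack_from(end + ("Q" if is64 else "I"), data, 24)[0]
```

For a real file this would be called as elf_entry(open(path, "rb").read(64)).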
Take a look at the linker script ld is using:
ld -verbose
The format is documented at: https://sourceware.org/binutils/docs-2.25/ld/Scripts.html
It determines basically everything about how the executable will be generated.
On Binutils 2.24 Ubuntu 14.04 64-bit, it contains the line:
ENTRY(_start)
which sets the entry point to the _start symbol (goes to the ELF header as mentioned by ctn)
And then:
. = SEGMENT_START("text-segment", 0x400000) + SIZEOF_HEADERS;
which sets the address of the first headers to 0x400000 + SIZEOF_HEADERS.
I have modified that address to 0x800000, passed my custom script with ld -T and it worked: readelf -s says that _start is at that address.
Another way to change it is to use the -Ttext-segment=0x800000 option.
The choice of 0x400000 (4 MiB) rather than address zero is discussed at: Why is the ELF execution entry point virtual address of the form 0x80xxxxx and not zero 0x0?
A question describes how to set _start from the command line: Why is the ELF entry point 0x8048000 not changeable with the "ld -e" option?
SIZEOF_HEADERS is the size of the ELF + program headers, which are at the beginning of the ELF file. That data gets loaded into the very beginning of the virtual memory space by Linux (TODO why?) In a minimal Linux x86-64 hello world with 2 program headers it is worth 0xb0, so that the _start symbol comes at 0x4000b0.
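The 0xb0 figure can be reproduced from the fixed sizes of the ELF64 structures: the file header is 64 bytes and each program header is 56 bytes. A short Python check, assuming the headers are laid out back to back at the start of the file as in the minimal example described above (sizeof_headers is just an illustrative helper name):

```python
ELF64_EHDR_SIZE = 64   # sizeof(Elf64_Ehdr)
ELF64_PHDR_SIZE = 56   # sizeof(Elf64_Phdr)

def sizeof_headers(phnum: int) -> int:
    """Size of the ELF64 file header plus `phnum` program headers."""
    return ELF64_EHDR_SIZE + phnum * ELF64_PHDR_SIZE

headers = sizeof_headers(2)            # two program headers, as in the example
start_address = 0x400000 + headers     # where .text, and so _start, begins

print(hex(headers), hex(start_address))
```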
I'm not sure, but try this link: http://www.docstoc.com/docs/23942105/UNIX-ELF-File-Format
On page 8 it shows where the entry point is for an executable. Basically, you need to calculate the offset and you have it.
Make sure to remember the little-endianness of x86 (I guess that is what you use) and reorder the bytes if you read bytewise. Edit: or maybe not; I'm not quite sure about this, to be honest.