Inconsistency between program header and section header - ELF

I am facing an issue with a file generated by a tool that uses gcc in the background. The output is an ELF file with a segment in the program header that doesn't exist in the section header.
For example: in the section header the section is at address 0x08001000, but in the program header the segment is at 0x08000000 (no section in the section header has this address).
I need to understand whether this is normal. Should the file be treated as a bad file or handled normally?
Here is the section header
And here is the bad program header

Related

How to convert ELF file to binary file?

My understanding is that a binary file is the hex codes of the processor's instructions (it can be loaded into memory and start executing from the entry point), and an ELF file is the same with NO-Fixed memory addresses assigned for data etc.
Now, how can I convert ELF to binary?
How does the conversion work? I mean, how are the memory addresses assigned?
In general
An ELF file does not need to use "NO-Fixed memory addresses". In fact, the typical ELF executable file (ET_EXEC) is using a fixed address.
A binary file is usually understood as a file containing non-text data. In the context of programs, it is usually understood to mean the compiled form of the program (as opposed to the source form, which is usually a bunch of text files). ELF files are binary files.
Now you might want to know how the ELF file is transformed into the in-memory-representation of the program: the ELF file contains additional information such as where in the program (virtual) address-space each segment of the program should be loaded, which dynamic-libraries should be loaded, how to link the main program and the dynamic libraries together, how to initialise the program, where is the entry point of the program, etc.
One important part of an executable or shared-object is the location of the segments which must be loaded into the program address space. You can look at them using readelf -l:
$ readelf -l /bin/bash
Elf file type is EXEC (Executable file)
Entry point 0x4205bc
There are 9 program headers, starting at offset 64
Program Headers:
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
  PHDR           0x0000000000000040 0x0000000000400040 0x0000000000400040
                 0x00000000000001f8 0x00000000000001f8  R E    8
  INTERP         0x0000000000000238 0x0000000000400238 0x0000000000400238
                 0x000000000000001c 0x000000000000001c  R      1
      [Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
  LOAD           0x0000000000000000 0x0000000000400000 0x0000000000400000
                 0x00000000000f1a74 0x00000000000f1a74  R E    200000
  LOAD           0x00000000000f1de0 0x00000000006f1de0 0x00000000006f1de0
                 0x0000000000009068 0x000000000000f298  RW     200000
  DYNAMIC        0x00000000000f1df8 0x00000000006f1df8 0x00000000006f1df8
                 0x0000000000000200 0x0000000000000200  RW     8
  NOTE           0x0000000000000254 0x0000000000400254 0x0000000000400254
                 0x0000000000000044 0x0000000000000044  R      4
  GNU_EH_FRAME   0x00000000000d6af0 0x00000000004d6af0 0x00000000004d6af0
                 0x000000000000407c 0x000000000000407c  R      4
  GNU_STACK      0x0000000000000000 0x0000000000000000 0x0000000000000000
                 0x0000000000000000 0x0000000000000000  RW     10
  GNU_RELRO      0x00000000000f1de0 0x00000000006f1de0 0x00000000006f1de0
                 0x0000000000000220 0x0000000000000220  R      1
Each LOAD (PT_LOAD) entry describes a segment which must be loaded in the program address-space.
Reading and processing this information is the job of the ELF loaders: on your typical OS this is done in part by the kernel and in part by the dynamic-linker (ld.so, also called "program interpreter" in ELF parlance).
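To make the table above less magical, here is a minimal sketch of how a tool could read the same fields directly from the file. It is not how readelf is implemented; it assumes a little-endian ELF64 file, and the function name read_program_headers is made up for this illustration.

```python
import struct

# Field order of Elf64_Phdr as laid out in the file (56 bytes per entry).
PHDR_FIELDS = ("type", "flags", "offset", "vaddr", "paddr", "filesz", "memsz", "align")


def read_program_headers(data: bytes):
    """Return (e_entry, list of program header dicts) for a little-endian ELF64 image."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if data[4] != 2 or data[5] != 1:
        raise ValueError("this sketch only handles little-endian ELF64")
    # In the ELF64 header, e_entry is at byte 24 and e_phoff at byte 32;
    # e_phentsize is at byte 54 and e_phnum at byte 56.
    e_entry, e_phoff = struct.unpack_from("<QQ", data, 24)
    e_phentsize, e_phnum = struct.unpack_from("<HH", data, 54)
    headers = []
    for i in range(e_phnum):
        fields = struct.unpack_from("<IIQQQQQQ", data, e_phoff + i * e_phentsize)
        headers.append(dict(zip(PHDR_FIELDS, fields)))
    return e_entry, headers
```

Calling it on the bytes of an executable (for example open("/bin/bash", "rb").read()) should give the entry point and one dictionary per row of the readelf -l table; entries whose type is 1 are the PT_LOAD segments.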
ARM plain binary files
(I don't really know much about ARM.)
You're apparently talking about embedded platforms. On ARM, a plain binary file contains the raw content of the initial memory of the program. It does not contain things such as string tables, symbol tables, relocation tables or debug information, but only the data of the (PT_LOAD) segments.
It is a binary file, not hex-encoded. The vhx files are hex-encoded.
Plain binary files can be generated from the ELF files with fromelf.
The basic idea here is that each PT_LOAD entry of an ELF file is dumped at its correct position in the file and remaining gaps (if any) between them are filled with zeros.
The ELF file already has addresses assigned in the p_vaddr field of each segment so this conversion process does not need to determine addresses: this has already been done by the link editor (and the linker script).
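The "dump each segment at its position and zero-fill the gaps" idea can be sketched in a few lines. This is only an illustration of the principle, not what fromelf or objcopy actually do internally; the function name flatten_segments and the choice of taking ready-made (address, bytes) pairs are assumptions of this sketch, and real tools may use a different address field than p_vaddr.

```python
def flatten_segments(segments, fill=0x00):
    """Build a raw memory image from (load_address, data) pairs.

    The image starts at the lowest load address; any gap between
    segments is filled with the `fill` byte (zero by default).
    If segments overlap, the one with the higher address wins.
    """
    segments = sorted(segments, key=lambda seg: seg[0])
    if not segments:
        return b""
    base = segments[0][0]
    end = max(addr + len(data) for addr, data in segments)
    image = bytearray([fill]) * (end - base)
    for addr, data in segments:
        start = addr - base
        image[start:start + len(data)] = data
    return bytes(image)
```

For example, two segments at 0x08001000 (two bytes) and 0x08001004 (one byte) produce a five-byte image with two zero bytes in the middle. Note that the resulting file no longer records the base address, so whoever loads the image has to know it.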
References
ARM ELF file format
I came here while searching for "convert .elf into binary file" (with arm files in mind, though).
It turned out that the easiest way in my case was to use
arm-none-eabi-objcopy -O binary kernel.elf kernel.bin
I don't understand exactly what you are asking, but ELF (Executable and Linkable Format) is an executable file format. Of course its sections, including .text, need to be mapped into memory for execution. If you want to convert ELF into binary, check "What is the difference between ELF files and bin files"; some of its answers explain how to convert ELF into other binary formats.
To see how ELF is loaded into memory, check http://www.gsp.com/cgi-bin/man.cgi?topic=elf. If you still have problems, come back with a specific question.

ELF program header segments sizes and offsets

I am trying to understand the ELF format and right now there are some things that I don't get about the segments defined in the program header. I have this little piece of code that I compile to an ELF file with g++ (x86_64 on Linux):
#include <stdlib.h>
#include <iostream>
using namespace std;
int main(int argc, char *argv[])
{
    if (argc == 1)
    {
        cout << "Hello world!" << endl;
    }
    return 0;
}
With g++ -c -m64 -D ACIS64 main.cpp -o main.o and g++ -s -O1 -o Main main.o.
Now, with readelf I get this list of segments:
Program Headers:
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
  PHDR           0x0000000000000040 0x0000000000400040 0x0000000000400040
                 0x00000000000001f8 0x00000000000001f8  R E    8
  INTERP         0x0000000000000238 0x0000000000400238 0x0000000000400238
                 0x000000000000001c 0x000000000000001c  R      1
      [Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
  LOAD           0x0000000000000000 0x0000000000400000 0x0000000000400000
                 0x0000000000000afc 0x0000000000000afc  R E    200000
  LOAD           0x0000000000000df8 0x0000000000600df8 0x0000000000600df8
                 0x0000000000000270 0x00000000000003a0  RW     200000
  DYNAMIC        0x0000000000000e18 0x0000000000600e18 0x0000000000600e18
                 0x00000000000001e0 0x00000000000001e0  RW     8
  NOTE           0x0000000000000254 0x0000000000400254 0x0000000000400254
                 0x0000000000000044 0x0000000000000044  R      4
  GNU_EH_FRAME   0x00000000000009a4 0x00000000004009a4 0x00000000004009a4
                 0x0000000000000044 0x0000000000000044  R      4
  GNU_STACK      0x0000000000000000 0x0000000000000000 0x0000000000000000
                 0x0000000000000000 0x0000000000000000  RW     10
  GNU_RELRO      0x0000000000000df8 0x0000000000600df8 0x0000000000600df8
                 0x0000000000000208 0x0000000000000208  R      1
With Bless Hex Editor I am looking at the code and try to find each one of these segments.
I find the PHDR segment just after the ELF header and having the size of this entire program header. It has an alignment of 8 bytes and is readable/executable. I don't understand why executable.
I find the segment where the interpreter is declared, just after the PHDR. It has the size of the interpreter's path and an alignment of 1 byte. Correct
Now I have a segment that is readable and executable, which I suppose is the code segment. I don't understand why it starts at 0x0000000000000000. Shouldn't this start where the entry point is located? Why does it have a size of 0xafc bytes? Isn't the size only the size of the code? How much of the file is executable? Also, I don't understand why the alignment is 0x200000 bytes. Is that how much space is reserved for a LOAD segment in memory? This is where this segment ends and an amount of 764 0x0 bytes follows it:
The next one (readable and writable) I suppose is a segment where variables are stored. It ends just where something like the section headers might be starting.
Now the next one is a DYNAMIC header. It starts at 0xe18, which is inside the one above. I thought this was a segment where references to external functions and variables are stored but I am not sure. It is readable and writable. I just don't know what segment this is and why it is "inside" the LOAD segment above.
A NOTE segment, containing some info that I suppose is not important right now
GNU specific segments, one of them having all offsets and sizes equal to 0x0000000000000000, others overlapping with other segments, which I don't get, either.
I come from the PE world, where each thing has its own well defined offset and size and here I see these weird addresses and sizes and I am confused.
The readelf output displays the program header table. It contains the list of segments (which may be loadable or non-loadable) in the ELF file. It is common for a segment to contain other segments, as seen here.
I find the PHDR segment just after the ELF header and having the size
of this entire program header. It has an alignment of 8 bytes and is
readable/executable. I don't understand why executable.
If you read the readelf output carefully, you will notice that PHDR is actually a part of the code segment (notice the VirtAddr and the MemSiz fields). That explains why it shares the same permissions as the code segment (RX).
Now I have a segment that is readable and executable, which I
suppose is the code segment. I don't understand why it starts at
0x0000000000000000. Shouldn't this start where the entry point is
located? Why does it have a size of 0xafc bytes? Isn't the size only
the size of the code? How much of the file is executable? Also, I
don't understand why the alignment is 0x200000 bytes. Is that how much
space is reserved for a LOAD segment in memory? This is where this
segment ends and an amount of 764 0x0 bytes follows it:
Yes, this is the code segment. It begins at the beginning of the file (i.e. offset 0) and extends for 0xafc bytes. The header specifies that this part of the file is mapped to 0x0000000000400000 in memory when the ELF is loaded. The segment does not consist only of main() from the C++ file; it also covers the ELF header, the program headers and other executable code added by the compiler and linker. Align does not give the amount of space reserved for the segment; it constrains where the segment is placed. Loadable segments should have congruent values of the VirtAddr and Offset fields modulo the page size (or modulo the Align field, if Align != 0 && Align != 1). That explains why VirtAddr for the data segment is 0x0000000000600df8 ((0x0000000000600df8 - 0x0000000000000df8) % 0x200000 == 0). The region in the file between the text segment and the data segment (i.e. between 0xafc and 0xdf8, which is 0x2fc = 764 bytes) is filled with zeroes.
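The alignment rule and the size of the zero-filled gap are easy to check by hand. A small sketch using the numbers from the readelf output in the question; the helper name is_congruent is made up for this example.

```python
def is_congruent(offset: int, vaddr: int, align: int) -> bool:
    """True if the file offset and the virtual address agree modulo `align`.

    An alignment of 0 or 1 means no alignment constraint.
    """
    if align in (0, 1):
        return True
    return offset % align == vaddr % align


# The two LOAD entries from the question: (Offset, VirtAddr, FileSiz, Align).
text = (0x0, 0x400000, 0xafc, 0x200000)
data = (0xdf8, 0x600df8, 0x270, 0x200000)

text_ok = is_congruent(text[0], text[1], text[3])
data_ok = is_congruent(data[0], data[1], data[3])

# Bytes in the file between the end of the text segment and the start of the data segment.
gap = data[0] - (text[0] + text[2])

print(text_ok, data_ok, gap)  # True True 764
```

The 764 bytes match the run of 0x0 bytes the question reports after the end of the code segment.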
The next one (readable and writable) I suppose is a segment where
variables are stored. It ends just where something like the section
headers might be starting.
Correct, this is the data segment that stores the global and static variables (among other stuff). It ends just before the section headers.
Now the next one is a DYNAMIC header. It starts at 0xe18, which is
inside the one above. I thought this was a segment where references
to external functions and variables are stored but I am not sure. It
is readable and writable. I just don't know what segment this is and
why it is "inside" the LOAD segment above.
Just like the PHDR segment is a part of the code segment, the DYNAMIC segment is a part of the data segment, which is why it has the same permissions (RW). It contains the .dynamic section, which holds an array of structures with entries such as the addresses of the symbol and string tables.
GNU specific segments, one of them having all offsets and sizes equal
to 0x0000000000000000, others overlapping with other segments, which I
don't get, either.
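GNU_EH_FRAME is a part of the code segment and GNU_RELRO is a part of the data segment (see the VirtAddr and MemSiz fields). GNU_STACK is just a program header which tells the system how to set up the stack when the ELF is loaded into memory; it occupies no space in the file or in memory (FileSiz and MemSiz are 0), and its flags (RW here, with no E) indicate that the stack does not need to be executable.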
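The "segment inside a segment" relationship can be verified from the VirtAddr and MemSiz columns alone. Below is a rough sketch with the values copied from the question's readelf output; the table SEGMENTS and the helper enclosing_load are names invented for this illustration, and GNU_STACK is left out because it has no address range.

```python
# (name, VirtAddr, MemSiz) taken from the readelf output in the question.
SEGMENTS = [
    ("PHDR", 0x400040, 0x1f8),
    ("INTERP", 0x400238, 0x1c),
    ("LOAD", 0x400000, 0xafc),
    ("LOAD", 0x600df8, 0x3a0),
    ("DYNAMIC", 0x600e18, 0x1e0),
    ("NOTE", 0x400254, 0x44),
    ("GNU_EH_FRAME", 0x4009a4, 0x44),
    ("GNU_RELRO", 0x600df8, 0x208),
]


def enclosing_load(vaddr, memsz, segments):
    """Return the VirtAddr of the LOAD segment that fully contains the range, or None."""
    for name, load_vaddr, load_memsz in segments:
        if name == "LOAD" and load_vaddr <= vaddr and vaddr + memsz <= load_vaddr + load_memsz:
            return load_vaddr
    return None


# Map every non-LOAD entry to the LOAD segment it lives in.
containing = {
    name: enclosing_load(vaddr, memsz, SEGMENTS)
    for name, vaddr, memsz in SEGMENTS
    if name != "LOAD"
}

for name, load in containing.items():
    print(f"{name:13s} is inside the LOAD segment at {load:#x}")
```

PHDR, INTERP, NOTE and GNU_EH_FRAME all fall inside the R E segment at 0x400000, while DYNAMIC and GNU_RELRO fall inside the RW segment at 0x600df8, which is why each shares the permissions of its enclosing segment.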
References:
ELF File format specification
Linkers and Loaders, by John R. Levine

Difference between #import header file with <filename> and "filename" [duplicate]

I'm wondering what decides whether you're allowed to use <Header.h> or "Header.h" when you're importing files in Objective-C. So far my observation has been that you use the quote marks "" for files in your project that you've got the implementation source to, and angle brackets <> when you're referencing a library or framework.
But how exactly does that work? What would I have to do to get my own classes to use the brackets? Right now Xcode will not allow me to do that for my own headers.
Also, by looking in some frameworks headers, I see that the headers reference each other with <frameworkname/file.h>. How does that work? It looks a lot like packages in Java, but as far as I know, there is no such thing as a package in Objective-C.
Objective-C has this in common with C/C++; the quoted form is for "local" includes of files (you need to specify the relative path from the current file, e.g. #include "headers/my_header.h"), while the angle-bracket form is for "global" includes -- those found somewhere on the include path passed to the compiler (e.g. #include <math.h>).
So to have your own headers use < > not " " you need to pass either the relative or the absolute path for your header directory to the compiler. See "How to add a global include path for Xcode" for info on how to do that in Xcode.
See this MSDN page for more info.
In C, the convention is that header files in <> brackets are searched for in 'system' directories and those in "" in user or local directories.
The definition of system and local is a bit vague, I guess. I believe <header.h> is looked up in the directories on the include path (those given with -I or via CPPFLAGS, followed by the system directories), while "header.h" is looked up in the directory of the including file first and then along the same include path.
I assume it works similarly for Objective-C.
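To make the difference between the two forms concrete, here is a toy model of the lookup order. It is loosely based on how GCC documents its search path and is only a simplification: the function resolve_include, its parameters and the example paths are all invented for this sketch, and Xcode/Clang add more categories (user header search paths, framework directories) that are not modelled here.

```python
import posixpath


def resolve_include(name, quoted, current_dir, include_dirs, system_dirs, existing_files):
    """Return the first matching path for an include, or None if nothing matches.

    quoted=True models #include "name": the directory of the including file
    is tried first. Both forms then try the -I style directories, followed
    by the system directories.
    """
    search = []
    if quoted:
        search.append(current_dir)
    search += list(include_dirs) + list(system_dirs)
    for directory in search:
        candidate = posixpath.join(directory, name)
        if candidate in existing_files:
            return candidate
    return None


# A hypothetical project layout.
files = {"/proj/src/Person.h", "/proj/include/Person.h", "/usr/include/math.h"}

print(resolve_include("Person.h", True, "/proj/src", ["/proj/include"], ["/usr/include"], files))
print(resolve_include("Person.h", False, "/proj/src", ["/proj/include"], ["/usr/include"], files))
```

In this model the quoted form picks /proj/src/Person.h and the angle-bracket form picks /proj/include/Person.h, which is why adding your own directory to the include path is enough to make <Person.h> work.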
To import your own classes using "< >" you have to put the header files (*.h) in the lib folder of the compiler or set a system variable pointing to your lib folder.
#import <> vs ""
<Name.h> - Angle brackets tell the preprocessor to search in special pre-designated system directories. For example, you import system headers like <UIKit/UIKit.h> or added frameworks.
"Name.h" - Quotation marks tell the preprocessor to search in the current directory. If the header is not found there, the preprocessor falls back to searching as for <Name.h>. Usually you should use this form for your project's files.
Just stumbled upon the same problem. There are 2 types of search paths in Xcode:
User Header Search Paths
Header Search Paths
If you add your own include folders into Header Search Paths, you can use angled brackets without any problem.
Or set Always Search User Paths to YES so you can use angle brackets.
With angle brackets e.g. <Foundation/Foundation.h> you import system files.
You use double quotes "Person.h" to import local files (files that you created) and to tell the compiler where to look for them.
If this is an Xcode project and you want to include the header in a framework, open the header file you want to include. Then open Xcode's rightmost tab and, under "Target Membership", click on the framework you want your file to be available from.
E.g. if your framework is AlphaTools and your header is AceHeader, then you'll select AlphaTools under Target Membership so you can access <AlphaTools/AceHeader.h>.
WHAT IS A HEADER FILE?
Header files contain declarations of functions and variables which can be incorporated into any C program by using the preprocessor #include directive. Standard header files are provided with each compiler and cover a range of areas: string handling, mathematics, data conversion, and printing and reading of variables.
Ex: #include <stdio.h> contains the declarations of input functions like scanf() and output functions like printf().
INCLUDE
1) #include:-
It is a preprocessor directive; it is processed before the program is compiled, not when main runs.
The main work of the preprocessor here is to prepare the source for compilation, i.e. to combine the program with its header files.
2) .h:-
(Header file) A header file is a file with the extension .h which contains C function declarations and macro definitions to be shared between several source files.
Q) There are two types of header files: the files that the programmer writes and the files that come with your compiler?
A) Angle brackets:- <header.h>
The angle-bracket form is for "global" includes -- those found somewhere on the include path passed to the compiler (e.g. #include <math.h>).
It is used for library functions that are already provided with the compiler.
In C the convention is that header files in <> brackets are searched for in 'system' directories.
B) Quote marks:- "header.h"
The quoted form is for "local" includes of files (you need to specify the relative path from the current file, e.g. #include "headers/my_header.h").
In C the convention is that header files in " " are searched for in user or local directories.
With it, one file is included in another (file inclusion).
It can be used in two cases:
Case 1: If we have a very large program, the code is best divided into several different files, each containing a set of related functions.
Case 2: There are some functions and macro definitions that we need in almost all the programs that we write.
Ex

In an ELF file, how does the address for _start get determined?

I've been reading the ELF specification and cannot figure out where the program entry point and _start address come from.
It seems like they should have to be in a pretty consistent place, but I made a few trivial programs, and _start is always in a different place.
Can anyone clarify?
The _start symbol may be defined in any object file. Normally it is supplied automatically by the C runtime startup code, which in turn calls main. You can also define it yourself, for instance in an assembler source file:
.globl _start
_start:
// assembly here
When the linker has processed all object files it looks for the _start symbol and puts its value in the e_entry field of the ELF header. The loader takes the address from this field and jumps to it after it has finished loading all segments into memory and is ready to execute the file.
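If you want to see the value the linker stored, e_entry can be read straight from the first bytes of the file. A minimal sketch, assuming only the standard header layout (e_entry sits at byte offset 24 in both ELF32 and ELF64); the function name entry_point is made up here.

```python
import struct


def entry_point(header: bytes) -> int:
    """Read e_entry from the start of an ELF file (ELF32 or ELF64, either byte order)."""
    if header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    ei_class, ei_data = header[4], header[5]
    # EI_DATA: 1 = little-endian, 2 = big-endian.
    endian = "<" if ei_data == 1 else ">"
    if ei_class == 2:  # ELFCLASS64: e_entry is 8 bytes wide
        return struct.unpack_from(endian + "Q", header, 24)[0]
    if ei_class == 1:  # ELFCLASS32: e_entry is 4 bytes wide
        return struct.unpack_from(endian + "I", header, 24)[0]
    raise ValueError("unknown ELF class")
```

Reading the first 64 bytes of an executable and passing them to entry_point should give the same number that readelf -h prints as "Entry point address". The byte order is taken from the header itself, so no manual reordering is needed.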
Take a look at the linker script ld is using:
ld -verbose
The format is documented at: https://sourceware.org/binutils/docs-2.25/ld/Scripts.html
It determines basically everything about how the executable will be generated.
On Binutils 2.24 Ubuntu 14.04 64-bit, it contains the line:
ENTRY(_start)
which sets the entry point to the _start symbol (goes to the ELF header as mentioned by ctn)
And then:
. = SEGMENT_START("text-segment", 0x400000) + SIZEOF_HEADERS;
which sets the location counter, and therefore the address of the first output section, to 0x400000 + SIZEOF_HEADERS.
I have modified that address to 0x800000, passed my custom script with ld -T and it worked: readelf -s says that _start is at that address.
Another way to change it is to use the -Ttext-segment=0x800000 option.
The reason for using 0x400000 = 4 MiB as the base address is discussed at: Why is the ELF execution entry point virtual address of the form 0x80xxxxx and not zero 0x0?
A question describes how to set _start from the command line: Why is the ELF entry point 0x8048000 not changeable with the "ld -e" option?
SIZEOF_HEADERS is the size of the ELF + program headers, which are at the beginning of the ELF file. That data gets loaded into the very beginning of the virtual memory space by Linux (TODO why?) In a minimal Linux x86-64 hello world with 2 program headers it is worth 0xb0, so that the _start symbol comes at 0x4000b0.
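The 0xb0 figure can be reproduced from the structure sizes: an ELF64 file header is 64 bytes and each ELF64 program header entry is 56 bytes. The sketch below just redoes that arithmetic; it assumes _start is the very first thing in the first output section, with no padding in front of it, and the function name sizeof_headers is invented for the example.

```python
ELF64_EHDR_SIZE = 64  # size of the ELF64 file header
ELF64_PHDR_SIZE = 56  # size of one ELF64 program header entry


def sizeof_headers(num_program_headers: int) -> int:
    """Size of the ELF header plus the program header table for an ELF64 file."""
    return ELF64_EHDR_SIZE + num_program_headers * ELF64_PHDR_SIZE


base = 0x400000
start = base + sizeof_headers(2)

print(hex(sizeof_headers(2)))  # 0xb0
print(hex(start))              # 0x4000b0
```

With the 9 program headers of the dynamically linked examples earlier, the same formula gives 64 + 9 * 56 = 568 = 0x238, which is exactly where the INTERP segment starts in those readelf listings.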
I'm not sure, but try this link: http://www.docstoc.com/docs/23942105/UNIX-ELF-File-Format
Page 8 shows where the entry point is stored if the file is executable. Basically you need to calculate the offset and you've got it.
Make sure to remember the little-endianness of x86 (I guess you're using it) and reorder the bytes if you read bytewise. Edit: or maybe not, I'm not quite sure about this to be honest.
