ELF unmapped region in a segment - elf

Below is the output of readelf -l test:
Program Headers:
Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align
LOAD 0x000000 0x00008000 0x00008000 0x00148 0x00148 R E 0x8000
LOAD 0x000148 0x00010148 0x00010148 0x00000 0x00004 RW 0x8000
NOTE 0x0000b4 0x000080b4 0x000080b4 0x00024 0x00024 R 0x4
GNU_STACK 0x000000 0x00000000 0x00000000 0x00000 0x00000 RW 0x4
Section to Segment mapping:
Segment Sections...
00 .note.gnu.build-id .text
01 .bss
02 .note.gnu.build-id
03
My question is about the first LOAD segment. It spans [8000-8148] and maps the .note and .text sections. My readelf -S output shows that the .note section starts at 80b4 and .text starts at 80d8. That means the loadable segment contains a region [8000-80b3] that is not mapped to any section but will still be loaded into memory by the loader.
Is there any harm in deleting this segment and creating a new one that covers only [80b4-8148]?

That means the loadable segment contains a region [8000-80b3] that is not mapped to any section but will still be loaded into memory by the loader.
Correct. You will find the Elf32_Ehdr, and likely a set of Elf32_Phdrs in that segment.
Note: for the main binary, it's actually the kernel that does the loading, and not the dynamic linker. You are not wrong in calling it "loader", but usually people use "loader" for the dynamic linker, and not the "part of the kernel that maps in the main binary".
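As a rough check of the claim that the ELF header and program headers fill the bytes before .note, the sizes can be added up. This is only a sketch: it assumes the program header table immediately follows the ELF header (the usual layout, but e_phoff is not shown in the output above) and uses the standard Elf32 structure layouts from elf.h.

```python
import struct

# Standard little-endian ELF32 layouts (see /usr/include/elf.h).
ELF32_EHDR = struct.Struct("<16sHHIIIIIHHHHHH")  # Elf32_Ehdr
ELF32_PHDR = struct.Struct("<8I")                # Elf32_Phdr

# Four program headers appear in the readelf output: LOAD, LOAD, NOTE, GNU_STACK.
NUM_PHDRS = 4

# Assumption: e_phoff == sizeof(Elf32_Ehdr), i.e. no padding between the two.
headers_end = ELF32_EHDR.size + NUM_PHDRS * ELF32_PHDR.size

print(f"Elf32_Ehdr size: {ELF32_EHDR.size:#x}")
print(f"Elf32_Phdr size: {ELF32_PHDR.size:#x}")
print(f"headers end at file offset: {headers_end:#x}")
```

Under that assumption the headers end at file offset 0xb4, which is the offset of the NOTE segment in the listing, so the "unmapped" region would be fully accounted for by the headers.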
Is there any harm in deleting this segment and creating a new one that covers only [80b4-8148]?
The segment has to be page-aligned. A segment with .p_vaddr that is not page-aligned (as I believe you are proposing) will be rejected by the kernel.

Related

Is there any way to run 32-bit .ELF files that call for .bin Files?

The Lexibook TV Games/Yeno 200-in-1 Console has a large set of 32-bit games. On the MicroSD card, the 32-bit games are a set of folders, each with one .elf file and a folder of many .bin files. Is there any way to run these properly? Opening the .elf files in online viewers shows the files called upon, and the beginning is as follows:
ELF header
==========
Name Offset NumValue Value
EI_MAG: 0x00000000 0x7F454C46 ELF
EI_CLASS 0x00000004 0x01 32 BIT
EI_DATA 0x00000005 0x01 DATA2LSB (Little-Endian)
EI_VERSION 0x00000006 0x01 EV_CURRENT
EI_OSABI 0x00000007 0x00 UNIX System V ABI
EI_OSABIVER 0x00000008 0x00
E_TYPE 0x00000010 0x0002 ET_EXEC (Executable file)
E_MACHINE 0x00000012 0x0087 Unknown
E_VERSION 0x00000014 0x00000001 EV_CURRENT
E_ENTRY 0x00000018 0xA0081000
E_PHOFF 0x0000001C 0x00000034
E_SHOFF 0x00000020 0x00266A18
E_FLAGS 0x00000024 0x00070000
E_EHSIZE 0x00000028 0x0034
E_PHENTSIZE 0x0000002A 0x0020
E_PHNUM 0x0000002C 0x0001
E_SHENTSIZE 0x0000002E 0x0028
E_SHNUM 0x00000030 0x001D
E_SHSTRNDX 0x00000032 0x001A
Program header tables
=====================
Type Offset VAddr PAddr FileSz MemSz Align Flags
PT_LOAD 0x00000000 0xA0080000 0xA0080000 0x001DE034 0x0025EB20 0x00008000 Execute|Write|Read
"UNIX System V ABI" Is written, which may be info that's important.
P.S. - Sorry if my question isn't very clear, I don't know much about this kind of stuff. Thank you!

In which segment is shstrtable?

I'm working with ELF64 files and I was wondering two things. First, in which segment is the section header string table (.shstrtab) stored? It doesn't appear in the readelf -l output. Second (following from the first), is it possible for a section not to be inside any segment?
I also noticed some 'gaps' between some segments. What is inside those gaps?
I am using the following example, built from a hello_world.c:
readelf -lW hello
Elf file type is DYN (Shared object file)
Entry point 0x1040
There are 11 program headers, starting at offset 64
Program Headers:
Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align
PHDR 0x000040 0x0000000000000040 0x0000000000000040 0x000268 0x000268 R 0x8
INTERP 0x0002a8 0x00000000000002a8 0x00000000000002a8 0x00001c 0x00001c R 0x1
[Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
LOAD 0x000000 0x0000000000000000 0x0000000000000000 0x000560 0x000560 R 0x1000
LOAD 0x001000 0x0000000000001000 0x0000000000001000 0x0001e5 0x0001e5 R E 0x1000
LOAD 0x002000 0x0000000000002000 0x0000000000002000 0x000118 0x000118 R 0x1000
LOAD 0x002de8 0x0000000000003de8 0x0000000000003de8 0x000248 0x000250 RW 0x1000
DYNAMIC 0x002df8 0x0000000000003df8 0x0000000000003df8 0x0001e0 0x0001e0 RW 0x8
NOTE 0x0002c4 0x00000000000002c4 0x00000000000002c4 0x000044 0x000044 R 0x4
GNU_EH_FRAME 0x00200c 0x000000000000200c 0x000000000000200c 0x000034 0x000034 R 0x4
GNU_STACK 0x000000 0x0000000000000000 0x0000000000000000 0x000000 0x000000 RW 0x10
GNU_RELRO 0x002de8 0x0000000000003de8 0x0000000000003de8 0x000218 0x000218 R 0x1
Section to Segment mapping:
Segment Sections...
00
01 .interp
02 .interp .note.gnu.build-id .note.ABI-tag .gnu.hash .dynsym .dynstr .gnu.version .gnu.version_r .rela.dyn .rela.plt
03 .init .plt .text .fini
04 .rodata .eh_frame_hdr .eh_frame
05 .init_array .fini_array .dynamic .got .got.plt .data .bss
06 .dynamic
07 .note.gnu.build-id .note.ABI-tag
08 .eh_frame_hdr
09
10 .init_array .fini_array .dynamic .got
in which segment the shstrtable is stored, because reviewing readelf -l doesn't appear.
It doesn't appear in any segment.
And the other question (coming from the first one) is it possible for a section not be inside a segment?
Yes.
Also i noticed some 'gaps' between some segments. What is inside those gaps?
Nothing. Segments tell the kernel or the runtime loader how to mmap the on-disk binary into memory. Since mmap operates on whole pages (4096 bytes here), the memory between 0x560 and 0xFFF will contain whatever happens to be in the file at offsets 0x560 through 0xFFF, but the program shouldn't access it and its contents are effectively undefined.
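To see where the gaps are, the address ranges between consecutive LOAD segments can be listed. The following is a minimal sketch that uses the VirtAddr and MemSiz values copied from the readelf output in the question. The helper name gaps_between is made up for illustration.

```python
# (p_vaddr, p_memsz) of the four LOAD segments from the readelf output above.
LOADS = [
    (0x0000, 0x560),
    (0x1000, 0x1E5),
    (0x2000, 0x118),
    (0x3DE8, 0x250),
]

def gaps_between(loads):
    """Return (start, end) address ranges not covered by consecutive segments."""
    ordered = sorted(loads)
    gaps = []
    for (vaddr, memsz), (next_vaddr, _) in zip(ordered, ordered[1:]):
        end = vaddr + memsz
        if end < next_vaddr:
            gaps.append((end, next_vaddr))  # end is exclusive
    return gaps

for start, end in gaps_between(LOADS):
    print(f"gap: {start:#x}..{end - 1:#x} ({end - start} bytes)")
```

The first gap it reports is the 0x560 through 0xFFF range discussed above.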

Location of DW_FORM_strp values

I'm trying to understand where DW_FORM_strp attribute values are actually stored in an ELF file (can be found here: https://filebin.net/77bb8359o0ibqu67).
I've found sections .debug_info, .debug_abbrev and .debug_str. I've then parsed the compilation unit header in .debug_info, and found the abbreviation table entry for the compile unit and iterated over its abbreviations. The first abbreviation is DW_AT_producer with form DW_FORM_strp. What I'm wondering is how to find where this offset is located?
From the DWARF4 spec I read: Each debugging information entry begins with a code that represents an entry in a separate abbreviations table. This code is followed directly by a series of attribute values. My understanding of this is that if I go back to the compilation unit header, skip over its content, I should end up at the compilation unit. It starts with a ULEB128 (which I parse), after which the attribute values should come. However, in my ELF file those bytes are all 0. I've run readelf -w on the file, and I see the following:
Contents of the .debug_info section:
Compilation Unit # offset 0x0:
Length: 0xf6 (32-bit)
Version: 4
Abbrev Offset: 0x0
Pointer Size: 8
<0><b>: Abbrev Number: 1 (DW_TAG_compile_unit)
<c> DW_AT_producer : (indirect string, offset: 0x62): GNU C11 7.5.0 -mtune=generic -march=x86-64 -g -O0 -fstack-protector-strong
<10> DW_AT_language : 12 (ANSI C99)
<11> DW_AT_name : (indirect string, offset: 0xd9): elf.c
<15> DW_AT_comp_dir : (indirect string, offset: 0xad): /home//struct_analyzer
<19> DW_AT_low_pc : 0x0
<21> DW_AT_high_pc : 0x39
<29> DW_AT_stmt_list : 0x0
This tells me that the offset into the string table is 0x62, and the name is at an offset 0xd9. However, after parsing the ULEB128 which is the first part of any DIE, the next 4 bytes (the first attribute's value) are 0x00 00 00 00. This I don't understand?
Edit to Employed Russian:
Yes, I understand that the offset 0x62 points into the .debug_str section. However, what I'm wondering is where I find this 0x62 value?
Each DIE starts with a ULEB128 value (the abbreviation table entry code), and is followed by the attributes. The first attribute in the corresponding abbreviation table entry is a DW_AT_producer of form DW_FORM_strp. This means that the next 4 bytes in the DIE are supposed to be the offset into .debug_str. However, the next 4 bytes are 0x00 00 00 00, and not 0x62 00 00 00, which is the value I'm looking for. 0x62 resides at offset 0x5c8 in the ELF file, whereas the DIE's attributes start at offset 0x85 as far as I can tell (see attached image for a hexdump (little endian) - highlighted byte is the ULEB128, and the following bytes are what I expect to be the offset into .debug_str).
Edit 2
I've been able to determine that the actual attribute values of form DW_FORM_strp are located in the .rela.debug_info section in the ELF file, so I'll have to read more about that.
The specific ELF file posted for this question also has a .rela.debug_info section, which contains relocation entries for the .debug_info section. From the ELF spec:
.relaNAME
This section holds relocation information as described below. If the file has a loadable segment that includes relocation, the section's attributes will include the SHF_ALLOC bit. Otherwise, the bit will be off. By convention, "NAME" is supplied by the section to which the relocations apply. Thus a relocation section for .text normally would have the name .rela.text. This section is of type SHT_RELA.
Each relocation entry in this section (of type Elf64_Rela in this particular case) has to be applied: the value computed from the entry's symbol and r_addend is written at the entry's r_offset within .debug_info. That is why those bytes are zero in the file on disk.
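Applying the relocations might look roughly like the sketch below. It assumes an x86-64 relocatable object whose DW_FORM_strp values use R_X86_64_32 relocations (value = symbol + addend) and that the referenced symbol is the .debug_str section symbol with value 0. The data is made up for illustration (a zero-filled stand-in for .debug_info and a single entry at offset 0xc with addend 0x62) and is not read from the posted file.

```python
import struct

ELF64_RELA = struct.Struct("<QQq")  # r_offset, r_info, r_addend
R_X86_64_32 = 10  # relocation type: 32-bit value of S + A

def apply_relocations(section, rela, symbol_values):
    """Return a copy of `section` with every R_X86_64_32 entry applied."""
    patched = bytearray(section)
    for r_offset, r_info, r_addend in ELF64_RELA.iter_unpack(rela):
        sym_index, rel_type = r_info >> 32, r_info & 0xFFFFFFFF
        if rel_type != R_X86_64_32:
            continue  # other relocation types are out of scope for this sketch
        value = (symbol_values[sym_index] + r_addend) & 0xFFFFFFFF
        struct.pack_into("<I", patched, r_offset, value)
    return bytes(patched)

# Hypothetical data: 32 zero bytes standing in for .debug_info, and one entry
# that patches offset 0xc (where DW_AT_producer lives) with addend 0x62.
debug_info = bytes(32)
rela = ELF64_RELA.pack(0xC, (1 << 32) | R_X86_64_32, 0x62)
symbols = {1: 0}  # assumed: symbol 1 is the .debug_str section symbol, value 0

patched = apply_relocations(debug_info, rela, symbols)
print(patched[0xC:0x10].hex())
```

A real parser would read the symbol values from .symtab and handle the other relocation types that appear in .rela.debug_info.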
This tells me that the offset into the string table is 0x62, and the name is at an offset 0xd9.
Correct. These offsets are into the .debug_str section, which starts at offset 0x289 in the file.
readelf -WS elf.o | grep debug_str
[12] .debug_str PROGBITS 0000000000000000 000289 0000e4 01 MS 0 0 1
dd if=elf.o bs=1 skip=$((0x289+0x62)) count=75 2>/dev/null
GNU C11 7.5.0 -mtune=generic -march=x86-64 -g -O0 -fstack-protector-strong
dd if=elf.o bs=1 skip=$((0x289+0xd9)) count=5 2>/dev/null
elf.c
P.S.
I've found sections .dwarf_info, .dward_abbrev and .dwarf_str.
None of the above sections exist in your file. It helps to be precise when asking questions.

Make all pages readable/writable/executable

I would like to grant full permissions (read, write, and execute) to all memory pages in an ELF binary. Ideally, I'd like to be able to do this as a transformation on a binary or object file, in the same way that symbols can be changed with objcopy. I have not yet found a good way to do this. I would also be okay with a solution that involves running code at startup that calls mprotect on every page with the flags PROT_READ | PROT_WRITE | PROT_EXEC. I've tried this briefly, but I haven't found a good way to know which pages are mapped, and therefore which pages need to be mprotected.
It isn't required that dynamically allocated pages have all permissions, only the pages mapped at program startup.
The following script implements Employed Russian's answer in code:
sets the p_type of the RELRO segment to PT_NULL
sets Flags on LOAD segments to PF_X|PF_W|PF_R.
It depends on pyelftools for python3, which can be installed with pip3 install pyelftools.
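#!/usr/bin/env python3
"""Make every LOAD segment RWX and neutralize PT_GNU_RELRO (edits the file in place)."""
import sys

from elftools.elf.elffile import ELFFile

PT_NULL = 0
PF_RWX = 0x7  # PF_X | PF_W | PF_R

if len(sys.argv) != 2:
    sys.exit("Usage: elf_rwe <elf>")
name = sys.argv[1]

with open(name, "rb") as f:
    elf = ELFFile(f)
    byteorder = "little" if elf.little_endian else "big"
    # p_flags sits at offset 4 in Elf64_Phdr but at offset 24 in Elf32_Phdr.
    p_flags_offset = 4 if elf.elfclass == 64 else 24
    rwe_offsets = []
    relro_offsets = []
    for i in range(elf["e_phnum"]):
        program_offset = elf["e_phoff"] + i * elf["e_phentsize"]
        f.seek(program_offset)
        program_header = elf.structs.Elf_Phdr.parse_stream(f)
        if program_header["p_type"] == "PT_LOAD":
            rwe_offsets.append(program_offset)
        elif program_header["p_type"] == "PT_GNU_RELRO":
            relro_offsets.append(program_offset)
    f.seek(0)
    b = bytearray(f.read())

# Zap RELRO: p_type is the first 4-byte word of the header in both ELF classes,
# so all four bytes must be overwritten, not just the first one.
for off in relro_offsets:
    b[off:off + 4] = PT_NULL.to_bytes(4, byteorder)

# Fix permissions: overwrite the whole 4-byte p_flags word.
for off in rwe_offsets:
    flags_at = off + p_flags_offset
    b[flags_at:flags_at + 4] = PF_RWX.to_bytes(4, byteorder)

with open(name, "wb") as f:
    f.write(b)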
I would like to grant full permissions (read, write, and execute) to all memory pages in an ELF binary.
Note that some security policies, such as W^X in SELinux, will prevent your binary from running.
Ideally, I'd like to be able to do this as a transformation on a binary or object file
Run readelf -Wl on your binary. You'll see something similar to:
$ readelf -Wl /bin/date
Elf file type is EXEC (Executable file)
Entry point 0x4021cf
There are 9 program headers, starting at offset 64
Program Headers:
Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align
PHDR 0x000040 0x0000000000400040 0x0000000000400040 0x0001f8 0x0001f8 R E 0x8
INTERP 0x000238 0x0000000000400238 0x0000000000400238 0x00001c 0x00001c R 0x1
[Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
LOAD 0x000000 0x0000000000400000 0x0000000000400000 0x00dde4 0x00dde4 R E 0x200000
LOAD 0x00de10 0x000000000060de10 0x000000000060de10 0x0004e4 0x0006b0 RW 0x200000
DYNAMIC 0x00de28 0x000000000060de28 0x000000000060de28 0x0001d0 0x0001d0 RW 0x8
NOTE 0x000254 0x0000000000400254 0x0000000000400254 0x000044 0x000044 R 0x4
GNU_EH_FRAME 0x00cb8c 0x000000000040cb8c 0x000000000040cb8c 0x0002f4 0x0002f4 R 0x4
GNU_STACK 0x000000 0x0000000000000000 0x0000000000000000 0x000000 0x000000 RW 0x10
GNU_RELRO 0x00de10 0x000000000060de10 0x000000000060de10 0x0001f0 0x0001f0 R 0x1
What you want to do then is to change Flags on LOAD segments to have PF_X|PF_W|PF_R. The flags are part of the Elf{32,64}_Phdr table, and the offset to the table is stored in e_phoff of the Elf{32,64}_Ehdr (which is stored at the start of every ELF file).
Look in /usr/include/elf.h. Parsing fixed-sized ELF structures involved here isn't complicated.
You are unlikely to find any standard tool that would do this for you (given that this is such an unusual and insecure thing to do), but a program to change flags is trivial to write in C, Python or Perl.
P.S. You may also need to "zap" the RELRO segment, which could be done by changing its p_type to PT_NULL.
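To illustrate how little parsing is involved, here is a sketch that handles a single little-endian Elf64_Phdr using only the standard struct module. The field layout is the one from elf.h and the sample values are the first LOAD segment of the /bin/date listing above. The helper name make_rwx is made up for this example.

```python
import struct

# p_type, p_flags, p_offset, p_vaddr, p_paddr, p_filesz, p_memsz, p_align
ELF64_PHDR = struct.Struct("<IIQQQQQQ")
PT_LOAD = 1
PF_X, PF_W, PF_R = 0x1, 0x2, 0x4

def make_rwx(phdr_bytes):
    """Return the header with p_flags forced to RWX if it is a PT_LOAD entry."""
    fields = list(ELF64_PHDR.unpack(phdr_bytes))
    if fields[0] == PT_LOAD:
        fields[1] = PF_X | PF_W | PF_R
    return ELF64_PHDR.pack(*fields)

# The first LOAD segment of /bin/date from the listing above (flags R E = 0x5).
original = ELF64_PHDR.pack(
    PT_LOAD, PF_R | PF_X, 0x0, 0x400000, 0x400000, 0xDDE4, 0xDDE4, 0x200000
)
patched = make_rwx(original)
print(hex(ELF64_PHDR.unpack(patched)[1]))
```

For a whole file, the same function would be applied to each of the e_phnum entries starting at e_phoff. A 32-bit file needs the Elf32_Phdr layout instead, where p_flags comes after p_memsz.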
I haven't found a good way to know which pages are mapped, and therefore which pages need to be mprotected.
On Linux, you could parse /proc/self/maps to get that info. Other OSes may offer a different way to achieve the same.
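A minimal sketch of that parsing step, assuming the line format described in proc(5) (address range, permissions, offset, device, inode, optional path). The sample line is hypothetical and the 4096-byte page size is an assumption. The startup code would then call mprotect on each returned range.

```python
def parse_maps(text):
    """Yield (start, end, perms) for each mapping line in /proc/<pid>/maps format."""
    for line in text.splitlines():
        if not line.strip():
            continue
        addr_range, perms = line.split()[:2]
        start, end = (int(part, 16) for part in addr_range.split("-"))
        yield start, end, perms

# Hypothetical sample line; on Linux, read the text from /proc/self/maps instead.
sample = "00400000-0040e000 r-xp 00000000 08:01 1234 /bin/date\n"
for start, end, perms in parse_maps(sample):
    print(f"{start:#x}-{end:#x} {perms} ({(end - start) // 4096} pages)")
```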

Why are elf segments not page aligned?

readelf -l /bin/ls:
LOAD 0x000000 0x08048000 0x08048000 0x18ff8 0x18ff8 R E 0x1000
LOAD 0x019eec 0x08061eec 0x08061eec 0x003f4 0x01014 RW 0x1000
So the boundary page between the two segments would be both read-only and read-write. How is this possible?
Assuming a page size of 4096 (0x1000) bytes and rounding addresses to page granularities:
The first loadable segment would use the address range [0x8048000--0x8060FFF], both ends inclusive.
The second loadable segment would use the address range [0x8061000--0x8062FFF], of which 0x3F4 bytes starting at address 0x8061EEC would come from the executable, with the rest being zero-filled at load time.
There is no overlap.
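The rounding can be reproduced with a few lines of arithmetic. This sketch assumes the same 4096-byte page size and takes VirtAddr and MemSiz from the two LOAD lines in the question. The helper name page_range is made up for illustration.

```python
PAGE_SIZE = 0x1000

def page_range(vaddr, memsz, page_size=PAGE_SIZE):
    """Return the inclusive [first, last] byte addresses of the pages a segment touches."""
    first = vaddr & ~(page_size - 1)
    last = ((vaddr + memsz + page_size - 1) & ~(page_size - 1)) - 1
    return first, last

text = page_range(0x08048000, 0x18FF8)  # R E segment
data = page_range(0x08061EEC, 0x01014)  # RW segment
print(f"R E: {text[0]:#x}..{text[1]:#x}")
print(f"RW : {data[0]:#x}..{data[1]:#x}")
print("overlap:", text[1] >= data[0])
```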