Format of 64 bit symbol table entry in ELF

So I am trying to learn about the ELF format by taking a close look at how everything relates, and I can't understand why the symbol table entries are the size they are.
When I run readelf -W -S tiny.o I get:
Section Headers:
[Nr] Name Type Address Off Size ES Flg Lk Inf Al
[ 0] NULL 0000000000000000 000000 000000 00 0 0 0
[ 1] .bss NOBITS 0000000000000000 000200 000001 00 WA 0 0 4
[ 2] .text PROGBITS 0000000000000000 000200 00002a 00 AX 0 0 16
[ 3] .shstrtab STRTAB 0000000000000000 000230 000031 00 0 0 1
[ 4] .symtab SYMTAB 0000000000000000 000270 000090 18 5 5 4
[ 5] .strtab STRTAB 0000000000000000 000300 000015 00 0 0 1
[ 6] .rela.text RELA 0000000000000000 000320 000030 18 4 2 4
This shows the symbol table having 0x18 (24) bytes per entry and a total size of 0x90 (0x300 - 0x270), giving us 6 entries.
This matches with what readelf -W -s tiny.o says:
Symbol table '.symtab' contains 6 entries:
Num: Value Size Type Bind Vis Ndx Name
0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND
1: 0000000000000000 0 FILE LOCAL DEFAULT ABS tiny.asm
2: 0000000000000000 0 SECTION LOCAL DEFAULT 1
3: 0000000000000000 0 SECTION LOCAL DEFAULT 2
4: 0000000000000000 0 NOTYPE LOCAL DEFAULT 1 str
5: 0000000000000000 0 NOTYPE GLOBAL DEFAULT 2 _start
So clearly the 24 byte size is correct, but that would correspond to a 32 bit table entry as described in this 32 bit spec.
Given that I am on a 64 bit system and the ELF file is 64 bit, I would expect the entry to be as described in this 64 bit spec.
Upon looking at a hex dump of the file, I found that the layout of the fields in the file seems to be according to this 64 bit pattern.
So then why is the ELF file seemingly using undersized symbol table entries despite using the 64 bit layout and being a 64 bit file?

So then why is the ELF file seemingly using undersized symbol table entries
What makes you believe they are undersized?
In Elf64_Sym, we have:
int   st_name;
char  st_info;
char  st_other;
short st_shndx;   <--- 8 bytes
long  st_value;   <--- 8 bytes
long  st_size;    <--- 8 bytes
That's 24 bytes total, exactly as you'd expect.
To convince yourself that everything is in order, compile this program:
#include <elf.h>
#include <stdio.h>

int main()
{
    Elf64_Sym s64;
    Elf32_Sym s32;
    printf("%zu %zu\n", sizeof(s32), sizeof(s64));
    return 0;
}
Running it produces 16 24. You can also run it under GDB, and look at offsets of various fields, e.g.
(gdb) p (char*)&s64.st_value - (char*)&s64
$1 = 8
(gdb) p (char*)&s64.st_size - (char*)&s64
$2 = 16
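For what it's worth, the same two offsets can be checked without a debugger using offsetof from <stddef.h>; a minimal sketch along the lines of the program above:

#include <elf.h>
#include <stddef.h>
#include <stdio.h>

int main()
{
    /* Should print 8 16, matching the GDB session above. */
    printf("%zu %zu\n",
           offsetof(Elf64_Sym, st_value),
           offsetof(Elf64_Sym, st_size));
    return 0;
}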


how to zlib inflate a gzip/deflate archive

I have an archive encoded with gzip 1.5. I'm unable to decode it using the C zlib library: inflate() returns error code -3 (Z_DATA_ERROR) with stream.msg = "unknown compression method".
$ gzip --list --verbose vmlinux.z
method crc date time compressed uncompressed ratio uncompressed_name
defla 12169518 Apr 29 13:00 4261643 9199404 53.7% vmlinux
The first 32 bytes of the file are:
00000000 1f 8b 08 08 29 f4 8a 60 00 03 76 6d 6c 69 6e 75 |....)..`..vmlinu|
00000010 78 00 ec 9a 7f 54 1c 55 96 c7 6f 75 37 d0 fc 70 |x....T.U..ou7..p|
I see the first 18 bytes are the RFC-1952 gzip header.
After the NULL, I expect the next byte to be RFC-1951 deflate or RFC-1950 zlib (I'm not sure which).
So, I pass zlib inflate() a z_stream with next_in pointing to the byte at #0x12.
If this were deflate encoded, then I would expect the next byte #0x12 to be 0aabbbbb (BFINAL=0 and BTYPE=some compression)
If this were zlib encoded, I would expect the next byte #0x12 to take the form 0aaa1000 bbbccccc
Instead, I see that byte #0x12 is EC = 1110 1100, which fits neither of those.
For my code, I took the uncompress() source and modified it slightly, with allocators appropriate to my environment, and ran several experiments with the window bits (including 15+16, -MAX_WBITS, and MAX_WBITS).
int ZEXPORT unzip (dest, destLen, source, sourceLen)
    Bytef *dest;
    uLongf *destLen;
    const Bytef *source;
    uLong sourceLen;
{
    z_stream stream;
    int err;

    stream.next_in = (Bytef*)source;
    stream.avail_in = (uInt)sourceLen;
    /* Check for source > 64K on 16-bit machine: */
    if ((uLong)stream.avail_in != sourceLen) return Z_BUF_ERROR;

    stream.next_out = dest;
    stream.avail_out = (uInt)*destLen;
    if ((uLong)stream.avail_out != *destLen) return Z_BUF_ERROR;

    stream.zalloc = (alloc_func)my_alloc;
    stream.zfree = (free_func)my_free;

    /*err = inflateInit(&stream);*/
    err = inflateInit2(&stream, 15 + 16);
    if (err != Z_OK) return err;

    err = inflate(&stream, Z_FINISH);
    if (err != Z_STREAM_END) {
        inflateEnd(&stream);
        return err == Z_OK ? Z_BUF_ERROR : err;
    }
    *destLen = stream.total_out;

    err = inflateEnd(&stream);
    return err;
}
How can I correct my decoding of this file?
That should work fine, assuming that my_alloc and my_free do what they need to do. You should verify that you are actually giving unzip() the data that you think you are giving it. The data you give it needs to start with the 1f 8b.
(Side comment: "unzip" is a lousy name for the function. It does not unzip, since zip is an entirely different format than either gzip or zlib. "gunzip" or "ungzip" would be appropriate.)
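For instance, a minimal sanity check along those lines (a sketch; looks_like_gzip is a made-up helper, and the parameter types mirror those of unzip() above) could be:

#include <zlib.h>

/* Sketch: return nonzero if the buffer plausibly starts a gzip stream,
   i.e. begins with the magic bytes 1f 8b. */
static int looks_like_gzip(const Bytef *source, uLong sourceLen)
{
    return sourceLen >= 2 && source[0] == 0x1f && source[1] == 0x8b;
}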
You are manually reading the bits in the deflate stream in the wrong order. The least significant bits are first. The low three bits of ec are 100, indicating a non-last dynamic block. 0 for non-last, then 10 for dynamic.
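To make the bit order concrete, here is a minimal sketch that pulls BFINAL and BTYPE out of that first byte, least significant bits first:

#include <stdio.h>

int main(void)
{
    unsigned char b = 0xec;     /* byte at offset 0x12 in the dump above */
    int bfinal = b & 1;         /* bit 0: BFINAL = 0, not the last block */
    int btype = (b >> 1) & 3;   /* bits 1-2: BTYPE = 2, dynamic Huffman */
    printf("BFINAL=%d BTYPE=%d\n", bfinal, btype);  /* prints BFINAL=0 BTYPE=2 */
    return 0;
}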
You can use infgen to disassemble a deflate stream. Its output for the 14 bytes provided is this initial portion of a dynamic block:
dynamic
count 286 27 16
code 0 5
code 2 7
code 3 7
code 4 5
code 5 5
code 6 4
code 7 4
code 8 2
code 9 3
code 10 2
code 11 4
code 12 4
code 16 7
code 17 7
lens 4 6 7 7 7 8 8 8 7 8
repeat 3
lens 10

Why is the vector number of INT0 1, and not 2 as in the datasheet?

I am using an ATmega32 and working with interrupts.
While writing a driver for external interrupt 0, I ran into a problem.
Interrupt Vectors Table in ATmega32
Interrupt Vectors code in ISR(vector)
In iom32.h we see that INT0_vect is defined as _VECTOR(1), i.e. number 1, but in the datasheet we see that the number is 2. Why?
The datasheet starts numbering with the reset vector. But there is no need for an explicit define (like RESET_vect) for the reset vector, since it will not be used in conjunction with ISR(). So in the header/AVRGCC implementation it is omitted.
If you compile this
ISR(INT0_vect) { }
and look at the interrupt vector table
00000000 <__vectors>:
0: 0c 94 46 00 jmp 0x8c ; 0x8c <__ctors_end>
4: 0c 94 5f 00 jmp 0xbe ; 0xbe <__vector_1>
you can see that __vector_1 is placed at byte address 4, which corresponds to the word address 2 from the data sheet.
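The arithmetic, as a small host-side sketch (assuming two-word, i.e. 4-byte, vector slots as on the ATmega32):

#include <stdio.h>

int main(void)
{
    int gcc_n = 1;                          /* INT0_vect = _VECTOR(1) */
    int datasheet_n = gcc_n + 1;            /* datasheet counts reset as vector 1 */
    int byte_addr = (datasheet_n - 1) * 4;  /* 4 bytes (one jmp) per slot */
    int word_addr = byte_addr / 2;          /* the datasheet lists word addresses */
    printf("datasheet vector %d -> byte address 0x%x, word address 0x%x\n",
           datasheet_n, byte_addr, word_addr);  /* 2 -> 0x4, 0x2 */
    return 0;
}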

What is wrong with this LDAP filter packet?

I am trying to port a program which queries an LDAP server from Perl to Go, and with the Go version I am receiving a response that the filter is malformed:
00000057: LdapErr: DSID-0C0C0968, comment: The server was unable to decode a search request filter, data 0, v1db1\x00
I have used tcpdump to capture the data transmitted to the server with both the Perl and Go versions of my program, and have found that they are sending slightly different filter packets. This question is not about any possible bugs in the Go program, but simply about understanding the contents of the LDAP filter packets.
The encoded filter is:
(objectClass=*)
And the Perl-generated packet (which the server likes) looks like this:
ASCII        .  .  o  b  j  e  c  t  C  l  a  s  s
Hex         87 0b 6f 62 6a 65 63 74 43 6c 61 73 73
Byte#        0  1  2  3  4  5  6  7  8  9 10 11 12
The Go-generated packet (which the server doesn't like) looks like this:
ASCII        .  .  .  .  o  b  j  e  c  t  C  l  a  s  s
Hex         a7 0d 04 0b 6f 62 6a 65 63 74 43 6c 61 73 73
Byte#        0  1  2  3  4  5  6  7  8  9 10 11 12 13 14
This is my own breakdown of the packets:
Byte 0: Tag
When I dissect Byte 0 from both packets, I see they are identical, except for the Primitive/Constructed bit, which is set to Primitive in the Perl version, and Constructed in the Go version. See DER encoding for details.
Bit# 87 6 54321
Perl 10 0 00111
Go 10 1 00111
Bits 87: In both packets, 10 = Context Specific
Bit 6: In the Perl version 0 = Primitive, in the Go version 1 = Constructed
Bits 54321: 00111 = 7 = Object descriptor
Byte 1: Length
11 bytes for the Perl version, 13 for the Go version
Bytes 2-3 for the Go version
Byte 2: Tag 04: Substring Filter (See section 4.5.1 of RFC 4511)
Byte 3: Length of 11 bytes
Remainder: Payload
For both packets this is simply the ASCII text objectClass
My reading of RFC 4511 section 4.5.1 suggests that the Go version is "more" correct, yet the Perl version is the one that works with the server. What gives?
Wireshark is able to parse both packets, and interprets them both equally.
The Perl version is correct, and the Go version is incorrect.
As you point out, RFC 4511 section 4.5.1 specifies encoding for the filter elements, like:
Filter ::= CHOICE {
     and             [0] SET SIZE (1..MAX) OF filter Filter,
     or              [1] SET SIZE (1..MAX) OF filter Filter,
     not             [2] Filter,
     equalityMatch   [3] AttributeValueAssertion,
     substrings      [4] SubstringFilter,
     greaterOrEqual  [5] AttributeValueAssertion,
     lessOrEqual     [6] AttributeValueAssertion,
     present         [7] AttributeDescription,
     approxMatch     [8] AttributeValueAssertion,
     extensibleMatch [9] MatchingRuleAssertion,
     ... }
And in this case, the relevant portion is:
present [7] AttributeDescription,
The AttributeDescription element is defined in section 4.1.4 of the same specification:
AttributeDescription ::= LDAPString
-- Constrained to <attributedescription>
-- [RFC4512]
And from section 4.1.2:
LDAPString ::= OCTET STRING -- UTF-8 encoded,
-- [ISO10646] characters
So this means that the present filter component is an octet string, which is a primitive element. Go is incorrectly converting it to a constructed element, and the directory server is correctly rejecting that malformed request.
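To spell out the tag arithmetic, here is a minimal sketch (the constants follow the BER identifier-octet layout dissected in the question):

#include <stdio.h>

int main(void)
{
    unsigned char context_specific = 0x80;  /* bits 8-7 = 10 */
    unsigned char constructed = 0x20;       /* bit 6 */
    unsigned char present_tag = 7;          /* present [7] */

    /* Primitive, as Perl sends it: prints 87 */
    printf("%02x\n", context_specific | present_tag);
    /* Constructed, as Go sends it: prints a7 (rejected by the server) */
    printf("%02x\n", context_specific | constructed | present_tag);
    return 0;
}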

Understanding the relocation table output from readelf

For example, running the command:
readelf -r /bin/ls | head -n 20
I get the following output:
Relocation section '.rela.dyn' at offset 0x15b8 contains 7 entries:
Offset Info Type Sym. Value Sym. Name + Addend
000000619ff0 003e00000006 R_X86_64_GLOB_DAT 0000000000000000 __gmon_start__ + 0
00000061a580 006f00000005 R_X86_64_COPY 000000000061a580 __progname + 0
00000061a590 006c00000005 R_X86_64_COPY 000000000061a590 stdout + 0
00000061a5a0 007800000005 R_X86_64_COPY 000000000061a5a0 optind + 0
00000061a5a8 007a00000005 R_X86_64_COPY 000000000061a5a8 optarg + 0
00000061a5b0 007400000005 R_X86_64_COPY 000000000061a5b0 __progname_full + 0
00000061a5b8 007700000005 R_X86_64_COPY 000000000061a5b8 stderr + 0
Relocation section '.rela.plt' at offset 0x1660 contains 105 entries:
Offset Info Type Sym. Value Sym. Name + Addend
00000061a018 000100000007 R_X86_64_JUMP_SLO 0000000000000000 __ctype_toupper_loc + 0
00000061a020 000200000007 R_X86_64_JUMP_SLO 0000000000000000 getenv + 0
00000061a028 000300000007 R_X86_64_JUMP_SLO 0000000000000000 sigprocmask + 0
00000061a030 000400000007 R_X86_64_JUMP_SLO 0000000000000000 raise + 0
00000061a038 007000000007 R_X86_64_JUMP_SLO 00000000004020a0 free + 0
00000061a040 000500000007 R_X86_64_JUMP_SLO 0000000000000000 localtime + 0
00000061a048 000600000007 R_X86_64_JUMP_SLO 0000000000000000 __mempcpy_chk + 0
I do not understand this output and wanted some clarification.
Does the 1st column, Offset, indicate where these symbolic references are in the .text segment? What is meant by the Info and Type columns? I thought relocations just mapped a symbol reference to a definition, so I don't understand how there can be different types. Why do certain symbol names have 0 as the address for their value? I can't imagine they all map to the same spot in the text segment. Finally, why does the relocation table even exist in the final executable? Doesn't it take up extra space, and haven't all the references already been resolved by the final link command that generated the executable?
Here is a (hopefully clear) explanation of the readelf output:
Offset is the location at which the relocated value should be written (for an executable, a virtual address).
Info packs two things: the relocation type (which determines the exact calculation to perform, and depends on the architecture) and the index of the symbol in the symbol table.
Type is the relocation type decoded from Info, with its name defined by the ABI.
Sym. Value is the value of the referenced symbol (0 for symbols that are still undefined and will only be resolved at run time).
Sym. Name + Addend is a pretty-printing of the symbol name plus the addend.
As for why the table exists at all in the final executable: these are dynamic relocations, which are applied by the dynamic linker at load time, once the addresses of shared-library symbols are known.
See this for a calculation example:
https://web.archive.org/web/20150324024617/http://mylinuxbook.com/readelf-command/
more info:
http://docs.oracle.com/cd/E23824_01/html/819-0690/chapter6-54839.html
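For instance, the Info value 003e00000006 of the __gmon_start__ entry above splits into symbol index and type with the ELF64_R_SYM and ELF64_R_TYPE macros from <elf.h>; a minimal sketch:

#include <elf.h>
#include <stdio.h>

int main(void)
{
    Elf64_Xword info = 0x003e00000006;  /* Info column of the first entry */
    /* High 32 bits hold the symbol table index, low 32 bits the type. */
    printf("symbol index %lu, type %lu\n",
           (unsigned long)ELF64_R_SYM(info),
           (unsigned long)ELF64_R_TYPE(info));  /* 62, 6 (R_X86_64_GLOB_DAT) */
    return 0;
}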

PE Export Directory Table's OrdinalBase field ignored?

In my experience and that of others (http://webster.cs.ucr.edu/Page_TechDocs/pe.txt), the PE/COFF specification document incorrectly claims that the Export Address Table indices that are contained in the Ordinal Table are relative to the Ordinal Base, and even gives an incorrect example (Section 5.3). In actuality, the indices in the Ordinal Table are 0-based indices into the Address Table for the normal case in which Ordinal Base = 1. I have seen this in VS Studio generated PE libraries and in system libraries like Kernel32.dll.
My question is, have you ever observed a binary with an Ordinal Base that was not equal to 1? I want to know if this an off-by-one error, or if the Ordinal Base is never applied to Ordinal Table entries.
Here's a dump for mfc42.dll, version 6.06.8064.0.
Microsoft (R) COFF/PE Dumper Version 9.00.30729.01
Copyright (C) Microsoft Corporation. All rights reserved.
Dump of file mfc42.dll
File Type: DLL
Section contains the following exports for MFC42.dll
00000000 characteristics
4D79A4A3 time date stamp Fri Mar 11 05:27:15 2011
0.00 version
5 ordinal base
6939 number of functions
6 number of names
ordinal hint RVA name
5 0 0000ED7C ?classCCachedDataPathProperty@CCachedDataPathProperty@@2UCRuntimeClass@@B
6 1 0000ED44 ?classCDataPathProperty@CDataPathProperty@@2UCRuntimeClass@@B
7 2 000DEEAC DllCanUnloadNow
8 3 000DEE6C DllGetClassObject
9 4 000DED0A DllRegisterServer
10 5 000DEEDE DllUnregisterServer
256 0004F84F [NONAME]
[...]
6943 0003B412 [NONAME]
Here's how it looks in the binary:
;
; Export directory for MFC42.dll
;
dd 0 ; Characteristics
dd 4D79A4A3h ; TimeDateStamp: Fri Mar 11 05:27:15 2011
dw 0 ; MajorVersion
dw 0 ; MinorVersion
dd rva aMfc42_dll ; Name
dd 5 ; Base
dd 1B1Bh ; NumberOfFunctions
dd 6 ; NumberOfNames
dd rva functbl ; AddressOfFunctions
dd rva nametbl ; AddressOfNames
dd rva nameordtbl ; AddressOfNameOrdinals
;
; Export Address Table for MFC42.dll
;
functbl dd rva ?classCCachedDataPathProperty@CCachedDataPathProperty@@2UCRuntimeClass@@B; 0
dd rva ?classCDataPathProperty@CDataPathProperty@@2UCRuntimeClass@@B; 1
dd rva DllCanUnloadNow ; 2
dd rva DllGetClassObject; 3
dd rva DllRegisterServer; 4
dd rva DllUnregisterServer; 5
dd 0F5h dup(rva __ImageBase); 6
dd rva ??0_AFX_CHECKLIST_STATE@@QAE@XZ; 251
[...]
;
; Export Names Table for MFC42.dll
;
nametbl dd rva a?classccachedd, rva a?classcdatapat, rva aDllcanunloadno
dd rva aDllgetclassobj, rva aDllregisterser, rva aDllunregisters
;
; Export Ordinals Table for MFC42.dll
;
nameordtbl dw 0, 1, 2, 3, 4, 5
So yes, it seems you're right and the indexes in the ordinal table are 0-based.
It's not an off-by-one error, and the Ordinal Base is not applied to the Ordinal Table entries but to the calculation of the ordinal itself. And yes, the Microsoft PE specification (http://msdn.microsoft.com/en-us/library/windows/hardware/gg463119.aspx, section 5.3.4) is wrong. This is how the calculations should be done:
i = Search_ExportNamePointerTable(ExportName);
ordinal = ExportOrdinalTable[i] + OrdinalBase; // The "+ OrdinalBase" is missing in the official PE specification
SymbolRVA = ExportAddressTable[ordinal - OrdinalBase];
Or, expressed in a different way:
i = Search_ExportNamePointerTable(ExportName);
offset = ExportOrdinalTable[i];
SymbolRVA = ExportAddressTable[offset];
ordinal = OrdinalBase + offset;
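Here is the same logic as a minimal C sketch (illustrative only: it assumes the three tables have already been located from their RVAs and mapped into memory, and find_export_rva is a made-up name, not a Windows API):

#include <stdint.h>
#include <string.h>

/* Sketch of export-by-name resolution with a non-1 OrdinalBase. */
uint32_t find_export_rva(const char *want,
                         uint32_t number_of_names,
                         const char *const *nametbl,  /* name pointer table */
                         const uint16_t *nameordtbl,  /* export ordinal table */
                         const uint32_t *functbl,     /* export address table */
                         uint32_t ordinal_base,
                         uint32_t *ordinal_out)
{
    for (uint32_t i = 0; i < number_of_names; i++) {
        if (strcmp(nametbl[i], want) == 0) {
            uint16_t offset = nameordtbl[i];  /* 0-based, NOT biased by Base */
            if (ordinal_out)
                *ordinal_out = ordinal_base + offset;  /* the biased ordinal */
            return functbl[offset];  /* index directly; no "- Base" here */
        }
    }
    return 0;  /* not found */
}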
If I dump my mfc42.dll...
dumpbin mfc42.dll /exports |more
...this is what I get:
Microsoft (R) COFF/PE Dumper Version 12.00.20827.3
Copyright (C) Microsoft Corporation. All rights reserved.
Dump of file mfc42.dll
File Type: DLL
Section contains the following exports for MFC42.dll
00000000 characteristics
4D798B26 time date stamp Fri Mar 11 03:38:30 2011
0.00 version
5 ordinal base
6888 number of functions
8 number of names
ordinal hint RVA name
1452 0 000EF5D8 ?AfxFreeLibrary@@YAHPEAUHINSTANCE__@@@Z
1494 1 000EF5A4 ?AfxLoadLibrary@@YAPEAUHINSTANCE__@@PEBD@Z
1497 2 000F8344 ?AfxLockGlobals@@YAXH@Z
1587 3 000F83DC ?AfxUnlockGlobals@@YAXH@Z
7 4 000FC83C DllCanUnloadNow
8 5 000FC7E0 DllGetClassObject
9 6 000FC870 DllRegisterServer
10 7 000FC87C DllUnregisterServer
5 0001C910 [NONAME]
6 0001C8E8 [NONAME]
256 0005DEC0 [NONAME]
257 000423C0 [NONAME]
258 00042400 [NONAME]
259 00042440 [NONAME]
[...]
The 7th function (for example) above is DllRegisterServer, which corresponds to the 7th word (0x0004) in the export ordinal table in the hex dump of mfc42.dll below. The ordinal table starts at the bytes A7 05.
59 CC 12 00 6B CC 12 00 A7 05 D1 05 D4 05 2E 06
02 00 03 00 04 00 05 00 4D 46 43 34 32 2E 64 6C
The calculations:
i = Search_ExportNamePointerTable("DllRegisterServer") = 7 - 1 = 6 // zero-based
offset = ExportOrdinalTable[6] = 4
SymbolRVA = ExportAddressTable[4] = ...
ordinal = OrdinalBase + offset = 5 + 4 = 9
NO, PE Export Directory Table's OrdinalBase field is NOT ignored!
The sample provided above (mfc42.dll) is a good one (since its Ordinal Base is not 1).
Here are two remarks about this issue:
- The output of the Dump tool is correct as far as the ordinal field is concerned. It shows that the Base field is 5. This means that, when importing an exported function from mfc42.dll by name, the computed offset into the Export Address Table will be x - 5. The Microsoft specification, Section 5.3, is correct.
- The output of the Dump tool is NOT correct as far as the Hint is concerned. Export tables have NO Hint field; ONLY import tables have a Hint field.
As a matter of fact, the Ordinal Base is applied NOT to the Ordinal Table entries BUT when deriving the Address Table index from the ordinal!