PE Export Directory Table's OrdinalBase field ignored?

In my experience and that of others (http://webster.cs.ucr.edu/Page_TechDocs/pe.txt), the PE/COFF specification document incorrectly claims that the Export Address Table indices contained in the Ordinal Table are relative to the Ordinal Base, and even gives an incorrect example (Section 5.3). In actuality, the indices in the Ordinal Table are 0-based indices into the Address Table for the normal case in which Ordinal Base = 1. I have seen this in Visual Studio-generated PE libraries and in system libraries like Kernel32.dll.
My question is: have you ever observed a binary with an Ordinal Base that was not equal to 1? I want to know whether this is an off-by-one error, or whether the Ordinal Base is never applied to Ordinal Table entries.

Here's a dump for mfc42.dll, version 6.06.8064.0.
Microsoft (R) COFF/PE Dumper Version 9.00.30729.01
Copyright (C) Microsoft Corporation. All rights reserved.
Dump of file mfc42.dll
File Type: DLL
Section contains the following exports for MFC42.dll
00000000 characteristics
4D79A4A3 time date stamp Fri Mar 11 05:27:15 2011
0.00 version
5 ordinal base
6939 number of functions
6 number of names
ordinal hint RVA name
5 0 0000ED7C ?classCCachedDataPathProperty#CCachedDataPathProperty##2UCRuntimeClass##B
6 1 0000ED44 ?classCDataPathProperty#CDataPathProperty##2UCRuntimeClass##B
7 2 000DEEAC DllCanUnloadNow
8 3 000DEE6C DllGetClassObject
9 4 000DED0A DllRegisterServer
10 5 000DEEDE DllUnregisterServer
256 0004F84F [NONAME]
[...]
6943 0003B412 [NONAME]
Here's how it looks in the binary:
;
; Export directory for MFC42.dll
;
dd 0 ; Characteristics
dd 4D79A4A3h ; TimeDateStamp: Fri Mar 11 05:27:15 2011
dw 0 ; MajorVersion
dw 0 ; MinorVersion
dd rva aMfc42_dll ; Name
dd 5 ; Base
dd 1B1Bh ; NumberOfFunctions
dd 6 ; NumberOfNames
dd rva functbl ; AddressOfFunctions
dd rva nametbl ; AddressOfNames
dd rva nameordtbl ; AddressOfNameOrdinals
;
; Export Address Table for MFC42.dll
;
functbl dd rva ?classCCachedDataPathProperty#CCachedDataPathProperty##2UCRuntimeClass##B; 0
dd rva ?classCDataPathProperty#CDataPathProperty##2UCRuntimeClass##B; 1
dd rva DllCanUnloadNow ; 2
dd rva DllGetClassObject; 3
dd rva DllRegisterServer; 4
dd rva DllUnregisterServer; 5
dd 0F5h dup(rva __ImageBase); 6
dd rva ??0_AFX_CHECKLIST_STATE##QAE#XZ; 251
[...]
;
; Export Names Table for MFC42.dll
;
nametbl dd rva a?classccachedd, rva a?classcdatapat, rva aDllcanunloadno
dd rva aDllgetclassobj, rva aDllregisterser, rva aDllunregisters
;
; Export Ordinals Table for MFC42.dll
;
nameordtbl dw 0, 1, 2, 3, 4, 5
So yes, it seems you're right and the indexes in the ordinal table are 0-based.

It's not an off-by-one error: the Ordinal Base is not applied to the Ordinal Table entries but to the calculation of the ordinal itself. And yes, the Microsoft PE specification (http://msdn.microsoft.com/en-us/library/windows/hardware/gg463119.aspx, section 5.3.4) is wrong. This is how the calculations should be done:
i = Search_ExportNamePointerTable(ExportName);
ordinal = ExportOrdinalTable[i] + OrdinalBase; // The "+ OrdinalBase" is missing in the official PE specification
SymbolRVA = ExportAddressTable[ordinal - OrdinalBase];
Or, expressed in a different way:
i = Search_ExportNamePointerTable(ExportName);
offset = ExportOrdinalTable[i];
SymbolRVA = ExportAddressTable[offset];
ordinal = OrdinalBase + offset;
If I dump my mfc42.dll...
dumpbin mfc42.dll /exports |more
...this is what I get:
Microsoft (R) COFF/PE Dumper Version 12.00.20827.3
Copyright (C) Microsoft Corporation. All rights reserved.
Dump of file mfc42.dll
File Type: DLL
Section contains the following exports for MFC42.dll
00000000 characteristics
4D798B26 time date stamp Fri Mar 11 03:38:30 2011
0.00 version
5 ordinal base
6888 number of functions
8 number of names
ordinal hint RVA name
1452 0 000EF5D8 ?AfxFreeLibrary##YAHPEAUHINSTANCE__###Z
1494 1 000EF5A4 ?AfxLoadLibrary##YAPEAUHINSTANCE__##PEBD#Z
1497 2 000F8344 ?AfxLockGlobals##YAXH#Z
1587 3 000F83DC ?AfxUnlockGlobals##YAXH#Z
7 4 000FC83C DllCanUnloadNow
8 5 000FC7E0 DllGetClassObject
9 6 000FC870 DllRegisterServer
10 7 000FC87C DllUnregisterServer
5 0001C910 [NONAME]
6 0001C8E8 [NONAME]
256 0005DEC0 [NONAME]
257 000423C0 [NONAME]
258 00042400 [NONAME]
259 00042440 [NONAME]
[...]
The 7th named function above (for example) is DllRegisterServer, which corresponds to the 7th word (0x0004) of the export ordinal table in the hex dump of mfc42.dll below. The table starts at A7 05.
59 CC 12 00 6B CC 12 00 A7 05 D1 05 D4 05 2E 06
02 00 03 00 04 00 05 00 4D 46 43 34 32 2E 64 6C
The calculations:
i = Search_ExportNamePointerTable("DllRegisterServer") = 7 - 1 = 6 // zero-based
offset = ExportOrdinalTable[6] = 4
SymbolRVA = ExportAddressTable[4] = ...
ordinal = OrdinalBase + offset = 5 + 4 = 9
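The calculation above can be replayed on the 16 ordinal-table bytes from the hex dump. This is only a sketch: NAMES stands in for the Export Name Pointer Table (copied from the dumpbin listing), and resolve() is a hypothetical helper, not part of any PE library.

```python
import struct

# Export Ordinal Table bytes from the hex dump (starting at A7 05):
# eight little-endian 16-bit words, one per exported name.
ORDINAL_TABLE_BYTES = bytes.fromhex("A705D105D4052E060200030004000500")
ORDINAL_BASE = 5  # "ordinal base" reported by dumpbin for this mfc42.dll

ordinal_table = struct.unpack("<8H", ORDINAL_TABLE_BYTES)

# Names in Export Name Pointer Table order, as listed by dumpbin.
NAMES = [
    "?AfxFreeLibrary##YAHPEAUHINSTANCE__###Z",
    "?AfxLoadLibrary##YAPEAUHINSTANCE__##PEBD#Z",
    "?AfxLockGlobals##YAXH#Z",
    "?AfxUnlockGlobals##YAXH#Z",
    "DllCanUnloadNow",
    "DllGetClassObject",
    "DllRegisterServer",
    "DllUnregisterServer",
]


def resolve(name):
    """Return (name index, Export Address Table index, biased ordinal)."""
    i = NAMES.index(name)            # stand-in for Search_ExportNamePointerTable
    offset = ordinal_table[i]        # unbiased, 0-based index into the address table
    ordinal = ORDINAL_BASE + offset  # the ordinal as printed by dumpbin
    return i, offset, ordinal


print(resolve("DllRegisterServer"))
print([ORDINAL_BASE + w for w in ordinal_table])
```

The second print reproduces the ordinal column of the dumpbin listing (1452, 1494, 1497, 1587, 7, 8, 9, 10), which is only possible if the stored words are unbiased.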

NO, PE Export Directory Table's OrdinalBase field is NOT ignored!
The sample provided above (mfc42.dll) is a good one (since its Ordinal Base is not 1).
Here are two remarks about this issue:
. The output of the Dump tool is correct as far as the ordinal field is concerned. It shows that the Base field is 5. This means that, when importing an exported function from mfc42.dll by name, the computed offset in the Export Address Table will be x-5, where x is the function's ordinal. The Microsoft specification Section 5.3 is correct.
. The output of the Dump tool is NOT correct as far as the Hint is concerned. Export tables have NO Hint field; ONLY import tables have a Hint field.
As a matter of fact, the Ordinal Base is applied NOT in the Ordinal Table BUT when retrieving the index into the Address Table!

How to compute custom timestamp in COBOL85 Tandem?

I want to calculate a timestamp for a custom date and time,
e.g. 23/09/2022 4:30:45.
To get the Julian timestamp of the current date and time you can call JULIANTIMESTAMP via ENTER TAL, but for a custom date and time there is the COMPUTETIMESTAMP Guardian procedure call (GPC).
Syntax from the GPC Reference Manual:
jval := COMPUTETIMESTAMP ( date-n-time,
[error-mask] );
Data types:
jval is a 64-bit FIXED Julian timestamp.
date-n-time is an integer array holding the date and time [YYYY, MM, DD, HH, MM, SS, MIL, MIC]; all elements of the array are required.
error-mask is an integer array of length 8 whose entries are the bits 1 or 0.
So, to the main question of how to calculate a custom timestamp in COBOL85: here is a small example.
?ANSI
?save param
?symbols
?inspect
IDENTIFICATION DIVISION.
PROGRAM-ID. HELLO.
ENVIRONMENT DIVISION.
DATA DIVISION.
WORKING-STORAGE SECTION.
01 USER-FLD-CUST PIC X(50).
01 ARR.
03 DT PIC S9(4) COMP OCCURS 8 TIMES.
01 VAL PIC 9(16).
01 JTIME PIC S9(18) VALUE 0.
01 CER.
03 ERR PIC S9(1) COMP OCCURS 8 TIMES.
PROCEDURE DIVISION.
PROGRAM-BEGIN.
MOVE 2022 TO DT(1).
MOVE 99 TO DT(2).
MOVE 30 TO DT(3).
MOVE 10 TO DT(4).
MOVE 00 TO DT(5).
MOVE 00 TO DT(6).
MOVE 000 TO DT(7).
MOVE 000 TO DT(8).
ENTER TAL "COMPUTETIMESTAMP" USING ARR , CER
GIVING JTIME.
IF JTIME = -1
DISPLAY "INVALID DATE"
ELSE
DISPLAY JTIME.
STOP RUN.

how to zlib inflate a gzip/deflate archive

I have an archive compressed with gzip 1.5. I'm unable to decode it using the C zlib library: inflate() returns error code -3 (Z_DATA_ERROR) with stream.msg = "unknown compression method".
$ gzip --list --verbose vmlinux.z
method crc date time compressed uncompressed ratio uncompressed_name
defla 12169518 Apr 29 13:00 4261643 9199404 53.7% vmlinux
The first 32 bytes of the file are:
00000000 1f 8b 08 08 29 f4 8a 60 00 03 76 6d 6c 69 6e 75 |....)..`..vmlinu|
00000010 78 00 ec 9a 7f 54 1c 55 96 c7 6f 75 37 d0 fc 70 |x....T.U..ou7..p|
I see the first 18 bytes are the RFC 1952 gzip header.
After the NUL that terminates the file name, I expect the next byte to begin either an RFC 1951 deflate stream or an RFC 1950 zlib stream (I'm not sure which).
So, I pass zlib inflate() a z_stream whose next_in points to byte 0x12.
If this were deflate encoded, then I would expect byte 0x12 to be 0aabbbbb (BFINAL=0 and BTYPE=some compression).
If this were zlib encoded, I would expect the bytes starting at 0x12 to take the form 0aaa1000 bbbccccc.
Instead, I see byte 0x12 is EC = 1110 1100, which fits neither of those.
For my code, I took the uncompress() code and modified it slightly with allocators appropriate to my environment and several different experiments with the window bits (including 15+16, -MAX_WBITS, and MAX_WBITS).
int ZEXPORT unzip (dest, destLen, source, sourceLen)
    Bytef *dest;
    uLongf *destLen;
    const Bytef *source;
    uLong sourceLen;
{
    z_stream stream;
    int err;

    stream.next_in = (Bytef*)source;
    stream.avail_in = (uInt)sourceLen;
    /* Check for source > 64K on 16-bit machine: */
    if ((uLong)stream.avail_in != sourceLen) return Z_BUF_ERROR;

    stream.next_out = dest;
    stream.avail_out = (uInt)*destLen;
    if ((uLong)stream.avail_out != *destLen) return Z_BUF_ERROR;

    stream.zalloc = (alloc_func)my_alloc;
    stream.zfree = (free_func)my_free;

    /*err = inflateInit(&stream);*/
    err = inflateInit2(&stream, 15 + 16);
    if (err != Z_OK) return err;

    err = inflate(&stream, Z_FINISH);
    if (err != Z_STREAM_END) {
        inflateEnd(&stream);
        return err == Z_OK ? Z_BUF_ERROR : err;
    }
    *destLen = stream.total_out;

    err = inflateEnd(&stream);
    return err;
}
How can I correct my decoding of this file?
That should work fine, assuming that my_alloc and my_free do what they need to do. You should verify that you are actually giving unzip() the data that you think you are giving it. The data you give it needs to start with the 1f 8b.
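As a quick cross-check of the window-bits settings, here is a minimal sketch using Python's zlib module, which wraps the same library. The payload is made up for illustration and the stream is built in memory rather than read from vmlinux.z.

```python
import zlib

payload = b"hello, gzip " * 100  # made-up test data

# Build a gzip (RFC 1952) stream in memory: wbits = 16 + 15 selects the gzip wrapper.
comp = zlib.compressobj(wbits=16 + 15)
gz = comp.compress(payload) + comp.flush()

# Case 1: hand over the whole stream, starting at the 1f 8b magic, with gzip decoding.
whole = zlib.decompress(gz, wbits=16 + 15)

# Case 2: skip the header by hand and decode as raw deflate (negative wbits).
# The header written above is the minimal 10-byte one (no file name field).
raw = zlib.decompressobj(wbits=-15)
body = raw.decompress(gz[10:])
trailer = raw.unused_data  # the 8-byte gzip trailer (CRC-32 and length) is left over

# Case 3: mixing the two (gzip decoding, input starting after the header) is rejected.
try:
    zlib.decompress(gz[10:], wbits=16 + 15)
    mixed_ok = True
except zlib.error:
    mixed_ok = False

print(whole == payload, body == payload, len(trailer), mixed_ok)
```

So the window-bits value and the starting offset have to agree: either 15 + 16 with the data starting at 1f 8b, or -MAX_WBITS with the data starting at the first deflate byte.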
(Side comment: "unzip" is a lousy name for the function. It does not unzip, since zip is an entirely different format than either gzip or zlib. "gunzip" or "ungzip" would be appropriate.)
You are manually reading the bits in the deflate stream in the wrong order. The least significant bits are first. The low three bits of ec are 100, indicating a non-last dynamic block. 0 for non-last, then 10 for dynamic.
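To make the bit order concrete, the sketch below walks the RFC 1952 header of the 32 bytes quoted in the question and then reads BFINAL and BTYPE from the low bits of the first deflate byte. The helper name deflate_offset is hypothetical.

```python
# First 32 bytes of vmlinux.z, copied from the hex dump in the question.
HEAD = bytes.fromhex(
    "1f 8b 08 08 29 f4 8a 60 00 03 76 6d 6c 69 6e 75"
    "78 00 ec 9a 7f 54 1c 55 96 c7 6f 75 37 d0 fc 70"
)


def deflate_offset(buf):
    """Return the offset of the first deflate byte after the gzip header."""
    if buf[0:2] != b"\x1f\x8b" or buf[2] != 8:
        raise ValueError("not a gzip/deflate stream")
    flg = buf[3]
    pos = 10                                  # fixed part of the header
    if flg & 0x04:                            # FEXTRA: 2-byte length + data
        pos += 2 + int.from_bytes(buf[pos:pos + 2], "little")
    if flg & 0x08:                            # FNAME: NUL-terminated string
        pos = buf.index(b"\x00", pos) + 1
    if flg & 0x10:                            # FCOMMENT: NUL-terminated string
        pos = buf.index(b"\x00", pos) + 1
    if flg & 0x02:                            # FHCRC: 2-byte header CRC
        pos += 2
    return pos


start = deflate_offset(HEAD)
first = HEAD[start]

# Deflate packs header bits starting from the least significant bit.
bfinal = first & 1          # bit 0
btype = (first >> 1) & 3    # bits 1-2: 0 = stored, 1 = fixed, 2 = dynamic

print(hex(start), hex(first), bfinal, btype)
```

For this file the header ends at offset 0x12, the byte there is 0xEC, and its low bits give BFINAL = 0 and BTYPE = 2 (dynamic Huffman).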
You can use infgen to disassemble a deflate stream. Its output for the 14 bytes provided is this initial portion of a dynamic block:
dynamic
count 286 27 16
code 0 5
code 2 7
code 3 7
code 4 5
code 5 5
code 6 4
code 7 4
code 8 2
code 9 3
code 10 2
code 11 4
code 12 4
code 16 7
code 17 7
lens 4 6 7 7 7 8 8 8 7 8
repeat 3
lens 10

Format of 64 bit symbol table entry in ELF

So I am trying to learn about ELF by taking a close look at how everything relates, and I can't understand why the symbol table entries are the size they are.
When I run readelf -W -S tiny.o I get:
Section Headers:
[Nr] Name Type Address Off Size ES Flg Lk Inf Al
[ 0] NULL 0000000000000000 000000 000000 00 0 0 0
[ 1] .bss NOBITS 0000000000000000 000200 000001 00 WA 0 0 4
[ 2] .text PROGBITS 0000000000000000 000200 00002a 00 AX 0 0 16
[ 3] .shstrtab STRTAB 0000000000000000 000230 000031 00 0 0 1
[ 4] .symtab SYMTAB 0000000000000000 000270 000090 18 5 5 4
[ 5] .strtab STRTAB 0000000000000000 000300 000015 00 0 0 1
[ 6] .rela.text RELA 0000000000000000 000320 000030 18 4 2 4
Which shows the symbol table having 0x18 or 24 bytes per entry and a total size of (0x300-0x270) or 0x90 giving us 6 entries.
This matches with what readelf -W -s tiny.o says:
Symbol table '.symtab' contains 6 entries:
Num: Value Size Type Bind Vis Ndx Name
0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND
1: 0000000000000000 0 FILE LOCAL DEFAULT ABS tiny.asm
2: 0000000000000000 0 SECTION LOCAL DEFAULT 1
3: 0000000000000000 0 SECTION LOCAL DEFAULT 2
4: 0000000000000000 0 NOTYPE LOCAL DEFAULT 1 str
5: 0000000000000000 0 NOTYPE GLOBAL DEFAULT 2 _start
So clearly the 24-byte size is correct, but that would correspond to a 32 bit table entry as described in this 32 bit spec.
Given that I am on a 64 bit system and the ELF file is 64 bit, I would expect the entry to be as described in this 64 bit spec.
Upon looking at a hex dump of the file, I found that the layout of the fields in the file seems to be according to this 64 bit pattern.
So then why is the ELF file seemingly using undersized symbol table entries despite using the 64 bit layout and being a 64 bit file?
"So then why is the ELF file seemingly using undersized symbol table entries"
What makes you believe they are undersized?
In Elf64_Sym, we have:
int st_name
char st_info
char st_other
short st_shndx
<--- 8 bytes
long st_value
<--- 8 bytes
long st_size
<--- 8 bytes.
That's 24 bytes total, exactly as you'd expect.
To convince yourself that everything is in order, compile this program:
#include <elf.h>
#include <stdio.h>

int main()
{
    Elf64_Sym s64;
    Elf32_Sym s32;
    printf("%zu %zu\n", sizeof(s32), sizeof(s64));
    return 0;
}
Running it produces 16 24. You can also run it under GDB, and look at offsets of various fields, e.g.
(gdb) p (char*)&s64.st_value - (char*)&s64
$1 = 8
(gdb) p (char*)&s64.st_size - (char*)&s64
$2 = 16
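The same sizes can be cross-checked without a compiler. The sketch below describes both layouts with Python's struct module; the format strings are my transcription of the field order in the ELF specifications, and little-endian packing is assumed.

```python
import struct

# Elf32_Sym: st_name, st_value, st_size (4 bytes each), st_info, st_other, st_shndx
ELF32_SYM = struct.Struct("<IIIBBH")
# Elf64_Sym: st_name (4), st_info, st_other, st_shndx, then st_value and st_size (8 each)
ELF64_SYM = struct.Struct("<IBBHQQ")

print(ELF32_SYM.size, ELF64_SYM.size)

# Offsets of st_value and st_size inside Elf64_Sym, matching the GDB session.
st_value_off = struct.calcsize("<IBBH")
st_size_off = struct.calcsize("<IBBHQ")
print(st_value_off, st_size_off)

# .symtab is 0x90 bytes with an entry size (ES) of 0x18, hence the 6 entries.
print(0x90 // ELF64_SYM.size)
```

The 64-bit layout moves the small fields ahead of st_value so that the 8-byte members are naturally aligned, which is why the field order differs from Elf32_Sym.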

What is wrong with this LDAP filter packet?

I am trying to port a program which queries an LDAP server from Perl to Go, and with the Go version I am receiving a response that the filter is malformed:
00000057: LdapErr: DSID-0C0C0968, comment: The server was unable to decode a search request filter, data 0, v1db1\x00
I have used tcpdump to capture the data transmitted to the server with both the Perl and Go versions of my program, and have found that they are sending slightly different filter packets. This question is not about any possible bugs in the Go program, but simply about understanding the contents of the LDAP filter packets.
The encoded filter is:
(objectClass=*)
And the Perl-generated packet (which the server likes) looks like this:
ASCII . . o b j e c t C l a s s
Hex 87 0b 6f 62 6a 65 63 74 43 6c 61 73 73
Byte# 0 1 2 3 4 5 6 7 8 9 10 11 12
The Go-generated packet (which the server doesn't like) looks like this:
ASCII . . . . o b j e c t C l a s s
Hex a7 0d 04 0b 6f 62 6a 65 63 74 43 6c 61 73 73
Byte# 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14
This is my own breakdown of the packets:
Byte 0: Tag
When I dissect Byte 0 from both packets, I see they are identical, except for the Primitive/Constructed bit, which is set to Primitive in the Perl version, and Constructed in the Go version. See DER encoding for details.
Bit# 87 6 54321
Perl 10 0 00111
Go 10 1 00111
Bits 87: In both packets, 10 = Context Specific
Bit 6: In the Perl version 0 = Primitive, in the Go version 1 = Constructed
Bits 54321: 00111 = 7 = Object descriptor
Byte 1: Length
11 bytes for the Perl version, 13 for the Go version
Bytes 2-3 for the Go version
Byte 2: Tag 04: Substring Filter (See section 4.5.1 of RFC 4511)
Byte 3: Length of 11 bytes
Remainder: Payload
For both packets this is simply the ASCII text objectClass
My reading of RFC 4511 section 4.5.1 suggests that the Go version is "more" correct, yet the Perl version is the one that works with the server. What gives?
Wireshark is able to parse both packets, and interprets them both equally.
The Perl version is correct, and the Go version is incorrect.
As you point out, RFC 4511 section 4.5.1 specifies encoding for the filter elements, like:
Filter ::= CHOICE {
and [0] SET SIZE (1..MAX) OF filter Filter,
or [1] SET SIZE (1..MAX) OF filter Filter,
not [2] Filter,
equalityMatch [3] AttributeValueAssertion,
substrings [4] SubstringFilter,
greaterOrEqual [5] AttributeValueAssertion,
lessOrEqual [6] AttributeValueAssertion,
present [7] AttributeDescription,
approxMatch [8] AttributeValueAssertion,
extensibleMatch [9] MatchingRuleAssertion,
... }
And in this case, the relevant portion is:
present [7] AttributeDescription,
The AttributeDescription element is defined in section 4.1.4 of the same specification:
AttributeDescription ::= LDAPString
-- Constrained to <attributedescription>
-- [RFC4512]
And from section 4.1.2:
LDAPString ::= OCTET STRING -- UTF-8 encoded,
-- [ISO10646] characters
So this means that the present filter component is an octet string, which is a primitive element. Go is incorrectly converting it to a constructed element, and the directory server is correctly rejecting that malformed request.
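To illustrate, here is a small sketch that decodes the identifier octets from both captures and builds the primitive encoding of the present filter. The helper names are hypothetical, and only short-form lengths (under 128 bytes) are handled.

```python
TAG_CLASSES = ("universal", "application", "context-specific", "private")


def decode_tag(octet):
    """Split a single-byte BER identifier into (class, constructed?, tag number)."""
    return TAG_CLASSES[octet >> 6], bool(octet & 0x20), octet & 0x1F


def encode_present(attribute):
    """Encode a 'present' filter: [7] IMPLICIT OCTET STRING, primitive form."""
    value = attribute.encode("utf-8")
    if len(value) > 127:
        raise ValueError("long-form lengths are not handled in this sketch")
    # 0x80 = context-specific class, constructed bit clear, tag number 7
    return bytes([0x80 | 7, len(value)]) + value


print(decode_tag(0x87))  # Perl: context-specific, primitive, [7]
print(decode_tag(0xA7))  # Go:   context-specific, constructed, [7]
print(decode_tag(0x04))  # Go's inner element: universal, primitive, 4 (OCTET STRING)
print(encode_present("objectClass").hex(" "))
```

The encoder reproduces the Perl capture byte for byte. The Go capture instead wraps a universal OCTET STRING (04 0b ...) inside a constructed [7], which is what explicit tagging would produce, whereas the LDAP ASN.1 module uses implicit tags.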

How to find the "lexical file" in Wordnet?

If you look at the original Wordnet search and select "Display options: Show Lexical File Info", you'll see an extremely useful classification of words called the lexical file. E.g. for "filling" we have:
<noun.substance>S: (n) filling, fill (any material that fills a space or container)
<noun.process>S: (n) filling (flow into something (as a container))
<noun.food>S: (n) filling (a food mixture used to fill pastry or sandwiches etc.)
<noun.artifact>S: (n) woof, weft, filling, pick (the yarn woven across the warp yarn in weaving)
<noun.artifact>S: (n) filling ((dentistry) a dental appliance consisting of ...)
<noun.act>S: (n) filling (the act of filling something)
The first thing in angle brackets is the "lexical file". Unfortunately, I have not been able to find a SPARQL endpoint that provides this info.
The latest RDF translation of Wordnet 3.0 points to two things:
Talis SPARQL endpoint. Use eg this query to check there's no such info:
DESCRIBE <http://purl.org/vocabularies/princeton/wn30/synset-chair-noun-1>
W3C's mapping description. Appendix D "Conversion details" describes something useful: wn:classifiedByTopic.
But it's not the same as lexical file, and is quite incomplete. Eg "chair" has nothing, while one of the senses of "completion" is in the topic "American Football"
DESCRIBE <http://purl.org/vocabularies/princeton/wn30/synset-completion-noun-1> ->
<j.1:classifiedByTopic rdf:resource="http://purl.org/vocabularies/princeton/wn30/synset-American_football-noun-1"/>
The question: is there a public Wordnet query API, or a database, that provides the lexical file information?
Using the Python NLTK interface:
from nltk.corpus import wordnet as wn
for synset in wn.synsets('can'):
    # lexname() is a method in NLTK 3; it returns e.g. 'noun.artifact'
    print(synset.lexname())
I don't think you can find it in the RDF/OWL Representation of WordNet. It's in the WordNet distribution though: dict/lexnames. Here is the content of the file as of WordNet 3.0:
00 adj.all 3
01 adj.pert 3
02 adv.all 4
03 noun.Tops 1
04 noun.act 1
05 noun.animal 1
06 noun.artifact 1
07 noun.attribute 1
08 noun.body 1
09 noun.cognition 1
10 noun.communication 1
11 noun.event 1
12 noun.feeling 1
13 noun.food 1
14 noun.group 1
15 noun.location 1
16 noun.motive 1
17 noun.object 1
18 noun.person 1
19 noun.phenomenon 1
20 noun.plant 1
21 noun.possession 1
22 noun.process 1
23 noun.quantity 1
24 noun.relation 1
25 noun.shape 1
26 noun.state 1
27 noun.substance 1
28 noun.time 1
29 verb.body 2
30 verb.change 2
31 verb.cognition 2
32 verb.communication 2
33 verb.competition 2
34 verb.consumption 2
35 verb.contact 2
36 verb.creation 2
37 verb.emotion 2
38 verb.motion 2
39 verb.perception 2
40 verb.possession 2
41 verb.social 2
42 verb.stative 2
43 verb.weather 2
44 adj.ppl 3
For each entry of dict/data.*, the second number is the lexical file info. For example, this filling entry contains the number 13, which is noun.food.
07883031 13 n 01 filling 0 002 # 07882497 n 0000 ~ 07883156 n 0000 | a food mixture used to fill pastry or sandwiches etc.
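The lookup can be sketched in a few lines. Here LEXNAMES_TEXT is just an excerpt of the table above, and parse_lexnames and lexname_of are hypothetical helpers; in practice you would read the whole dict/lexnames file instead.

```python
# Excerpt of dict/lexnames (file number, lexical file name, part-of-speech code).
LEXNAMES_TEXT = """\
04 noun.act 1
06 noun.artifact 1
13 noun.food 1
22 noun.process 1
27 noun.substance 1
"""

DATA_LINE = (
    "07883031 13 n 01 filling 0 002 # 07882497 n 0000 ~ 07883156 n 0000 "
    "| a food mixture used to fill pastry or sandwiches etc."
)


def parse_lexnames(text):
    """Map lexical file number -> lexical file name."""
    table = {}
    for line in text.splitlines():
        number, name, _pos = line.split()
        table[int(number)] = name
    return table


def lexname_of(data_line, table):
    """Return the lexical file name for one line of a dict/data.* file."""
    fields = data_line.split()
    return table[int(fields[1])]  # the second field is the lexical file number


lexnames = parse_lexnames(LEXNAMES_TEXT)
print(lexname_of(DATA_LINE, lexnames))
```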
It can be done through MIT JWI (MIT Java Wordnet Interface), a Java API to query Wordnet. There's a topic in this link showing how to implement a Java class to access lexicographic
This is what worked for me,
Synset[] synsets = database.getSynsets(wordStr);
ReferenceSynset referenceSynset = (ReferenceSynset) synsets[i];
int lexicalCode =referenceSynset.getLexicalFileNumber();
Then use the table above to look up the lexname, e.g. noun.time.
If you're on Windows, chances are it is under your AppData directory. To get there, open your file browser, click the address bar at the top, and type %appdata%
That takes you to the Roaming folder; find the nltk_data directory there, which contains the corpora directory. The full path is something like:
C:\Users\yourname\AppData\Roaming\nltk_data\corpora
and lexnames will be present under
C:\Users\yourname\AppData\Roaming\nltk_data\corpora\wordnet.