Extracting data from a .DLL: unknown file offsets

I'm currently trying to extract some data from a .DLL library. I've figured out the file structure: there are 1039 data blocks compressed with zlib, starting at offset 0x3c00, the last one being the FAT table. The FAT table itself is divided into 1038 "blocks" (8 bytes plus a base64-encoded string, the filename). As far as I've seen, bytes 5-8 hold the length of the filename (little-endian; in this example only byte 5 is non-zero).
My problem is that I can't figure out what bytes 1-4 are used for. My first guess was that they were an offset used to locate the file block inside the .DLL (mainly because the values increase throughout the table), but, for instance, in this case the first "block" is:
Supposed offset: 2E 78 00 00
Filename length: 30 00 00 00
Base64 encoded filename: 59 6D 46 30 64 47 78 6C 58 32 6C 75 64 47 56 79 5A 6D 46 6A 5A 56 78 42 59 33 52 70 64 6D 56 51 5A 58 4A 72 63 31 4E 6F 62 33 63 75 59 77 3D 3D
yet, as I said earlier, the first block itself is at 0x3c00, so things don't match. The same goes for the second block (it starts at 0x3f0b, whereas the table's supposed offset is 0x167e).
Any ideas?

Answering my own question lol
Anyway, those numbers are the actual offsets of the file blocks, except that the first one starts from some arbitrary base value rather than from the actual location of the first block. Aside from that, though, the differences between consecutive offsets do match the lengths of the corresponding blocks, so each block's real position can be reconstructed by rebasing the offsets onto the known start at 0x3c00.
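For anyone hitting the same layout: below is a rough Python sketch of the reconstruction. It is my own code and naming, based purely on the structure described above, and it assumes both header fields are little-endian 32-bit values and that the name length counts the base64 characters (0x30 = 48 in the first entry, whose name decodes to battle_interface\ActivePerksShow.c).

import base64
import struct
import zlib

DATA_START = 0x3C00   # position of the first data block, per the question
NUM_ENTRIES = 1038    # file entries described by the FAT table

def parse_fat(fat):
    # fat = the already-decompressed FAT block (the last of the 1039 blocks);
    # each entry: 4-byte LE raw offset, 4-byte LE name length, base64 name
    entries, pos = [], 0
    for _ in range(NUM_ENTRIES):
        raw_offset, name_len = struct.unpack_from('<II', fat, pos)
        pos += 8
        name = base64.b64decode(fat[pos:pos + name_len]).decode('ascii')
        pos += name_len
        entries.append((raw_offset, name))
    return entries

def real_positions(entries):
    # the raw offsets share an arbitrary base, but deltas between consecutive
    # offsets equal the block lengths, so rebase everything onto DATA_START
    base = entries[0][0]
    return [(DATA_START + raw - base, name) for raw, name in entries]

def read_block(dll, pos):
    # each block is a zlib stream; decompressobj stops at the end of the stream
    return zlib.decompressobj().decompress(dll[pos:])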

Generating PDF user password hash

Currently, I am attempting to generate the hash of a user password for a PDF, given the encrypted PDF file and the plain password. I followed the instructions of this article. However, the hash I've computed is different from the hash stored in the PDF file.
The hashed user password (/U entry) is simply the 32-byte padding
string above, encrypted with RC4, using the 5-byte file key. Compliant
PDF viewers will check the password given by the user (by attempting
to decrypt the /U entry using the file key, and comparing it against
the padding string) and allow or refuse certain operations based on
the permission settings.
First, I padded my password "123456" using a hardcoded 32-byte string, which gives me
31 32 33 34 35 36 28 BF 4E 5E 4E 75 8A 41 64 00
4E 56 FF FA 01 08 2E 2E 00 B6 D0 68 3E 80 2F 0C
I tried to compute the hash with RC4 using the 5-byte file key as the key. According to the article:
The encryption key is generated as follows:
1. Pad the user password out to 32 bytes, using a hardcoded
32-byte string:
28 BF 4E 5E 4E 75 8A 41 64 00 4E 56 FF FA 01 08
2E 2E 00 B6 D0 68 3E 80 2F 0C A9 FE 64 53 69 7A
If the user password is null, just use the entire padding
string. (I.e., concatenate the user password and the padding
string and take the first 32 bytes.)
2. Append the hashed owner password (the /O entry above).
3. Append the permissions (the /P entry), treated as a four-byte
integer, LSB first.
4. Append the file identifier (the /ID entry from the trailer
dictionary). This is an arbitrary string of bytes; Adobe
recommends that it be generated by MD5 hashing various pieces
of information about the document.
5. MD5 hash this string; the first 5 bytes of output are the
encryption key. (This is a 40-bit key, presumably to meet US
export regulations.)
I appended the hashed owner key to the padded password, which gives me
31 32 33 34 35 36 28 BF 4E 5E 4E 75 8A 41 64 00
4E 56 FF FA 01 08 2E 2E 00 B6 D0 68 3E 80 2F 0C
C4 31 FA B9 CC 5E F7 B5 9C 24 4B 61 B7 45 F7 1A
C5 BA 42 7B 1B 91 02 DA 46 8E 77 12 7F 1E 69 D6
Then, I appended the /P entry (-4), treated as a four-byte integer encoded little-endian (LSB first), which gives me
31 32 33 34 35 36 28 BF 4E 5E 4E 75 8A 41 64 00
4E 56 FF FA 01 08 2E 2E 00 B6 D0 68 3E 80 2F 0C
C4 31 FA B9 CC 5E F7 B5 9C 24 4B 61 B7 45 F7 1A
C5 BA 42 7B 1B 91 02 DA 46 8E 77 12 7F 1E 69 D6
FC FF FF FF
Last, I appended the file identifier to it. The trailer of my PDF is:
trailer
<<
/Size 13
/Root 2 0 R
/Encrypt 1 0 R
/Info 4 0 R
/ID [<B5185D941CC0EA39ACA809F661EF36D4> <393BE725532F9158DC9E6E8EA97CFBF0>]
>>
and the result is
31 32 33 34 35 36 28 BF 4E 5E 4E 75 8A 41 64 00
4E 56 FF FA 01 08 2E 2E 00 B6 D0 68 3E 80 2F 0C
C4 31 FA B9 CC 5E F7 B5 9C 24 4B 61 B7 45 F7 1A
C5 BA 42 7B 1B 91 02 DA 46 8E 77 12 7F 1E 69 D6
FC FF FF FF B5 18 5D 94 1C C0 EA 39 AC A8 09 F6
61 EF 36 D4 39 3B E7 25 53 2F 91 58 DC 9E 6E 8E
A9 7C FB F0
MD5 hashing this block of data returns 942c5e7b2020ce57ce4408f531a65019. I RC4-ed the padded password with cryptii using the first 5 bytes of the MD5 hash as the key. However, it returns
90 e2 b5 21 2a 7d 53 05 70 d9 5d 26 95 c7 c2 05
6e 2a 28 40 63 e7 4a d4 e9 05 86 71 43 d1 39 d6
while the hash in PDF is
58 81 CA 74 65 DC 2E A7 5D D2 39 D4 43 9C 0D DE
28 BF 4E 5E 4E 75 8A 41 64 00 4E 56 FF FA 01 08
Which step am I doing wrong? I suspect that the problem happens because:
I am appending the file identifier in a wrong format,
I am using the wrong drop bytes with RC4,
the hash function is not the one for PDF 1.6,
I made some mistake somewhere in the process,
or maybe the article is actually wrong.
Files: Original PDF dummy.pdf, dummy-protected.pdf (Password: 123456)
Please help
There are two issues in your calculation:
The article you use refers to the PDF encryption algorithms available for PDF 1.3, but your document is encrypted using an algorithm introduced with PDF 1.5.
You make an error when appending the file identifier: only the first entry of the ID array shall be appended, not both (which is not really clear from the article you use).
In a comment you asked accordingly
where can I find the password hashing detail for >V1.3 PDF?
I would propose using the PDF specification, ISO 32000.
As ISO specifications go, they are not free, but Adobe used to provide a version of ISO 32000-1 with merely the ISO header removed on their web site. Some days ago it was removed (by design? by error? I don't know yet), but you can still find copies of it by googling for "PDF32000".
The relevant section in ISO 32000-1 is 7.6 Encryption and in particular 7.6.3 Standard Security Handler.
Following that information you should be able to correctly calculate the value in question.
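To make the difference concrete, here is a rough, self-contained Python sketch of the revision 3/4 computation as I read Algorithms 2 and 5 of ISO 32000-1. All names are mine, and the key length (/Length divided by 8; 16 bytes assumed here for a 128-bit key) and the actual revision must be taken from your document's encryption dictionary.

import hashlib

PAD = bytes([
    0x28, 0xBF, 0x4E, 0x5E, 0x4E, 0x75, 0x8A, 0x41,
    0x64, 0x00, 0x4E, 0x56, 0xFF, 0xFA, 0x01, 0x08,
    0x2E, 0x2E, 0x00, 0xB6, 0xD0, 0x68, 0x3E, 0x80,
    0x2F, 0x0C, 0xA9, 0xFE, 0x64, 0x53, 0x69, 0x7A])

def rc4(key, data):
    # plain RC4 (KSA + PRGA); PDF uses it as-is, with no dropped bytes
    S = list(range(256))
    j = 0
    for i in range(256):
        j = (j + S[i] + key[i % len(key)]) % 256
        S[i], S[j] = S[j], S[i]
    out, i, j = bytearray(), 0, 0
    for c in data:
        i = (i + 1) % 256
        j = (j + S[i]) % 256
        S[i], S[j] = S[j], S[i]
        out.append(c ^ S[(S[i] + S[j]) % 256])
    return bytes(out)

def file_key_rev3(password, o_entry, p_entry, id0, key_len=16):
    # Algorithm 2: pad the password, append /O, /P (LSB first) and only
    # the FIRST element of /ID, hash, then 50 extra MD5 rounds (rev >= 3);
    # for /R 4 with /EncryptMetadata false, also append b'\xff' * 4 here
    md = hashlib.md5((password + PAD)[:32] + o_entry
                     + (p_entry & 0xFFFFFFFF).to_bytes(4, 'little')
                     + id0).digest()
    for _ in range(50):
        md = hashlib.md5(md[:key_len]).digest()
    return md[:key_len]

def u_value_rev3(key, id0):
    # Algorithm 5: MD5(padding + first /ID element), then 20 RC4 passes,
    # each with the key bytes XORed by the pass number
    data = hashlib.md5(PAD + id0).digest()
    for n in range(20):
        data = rc4(bytes(b ^ n for b in key), data)
    return data

# with the values from the question:
# id0 = bytes.fromhex('B5185D941CC0EA39ACA809F661EF36D4')
# o   = bytes.fromhex('C431FAB9CC5EF7B59C244B61B745F71A'
#                     'C5BA427B1B9102DA468E77127F1E69D6')
# key = file_key_rev3(b'123456', o, -4, id0)
# print(u_value_rev3(key, id0).hex())

Only the first 16 bytes of a revision 3 or greater /U value are meaningful; the remaining 16 are arbitrary padding, which is why the second half of the /U stored in your file is exactly the first half of the hardcoded padding string.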
(Alternatively you can also use old Adobe PDF references, the editions for PDF 1.5, 1.6, and 1.7 should also give you the information required for decrypting your document. But these references have been characterized as not normative in nature by prominent Adobe employees, so I would go for the ISO norm.)
Beware, though: After ISO 32000-1 had been published, Adobe introduced an AES-256 encryption scheme as an extension which obviously is not included in ISO 32000-1. You can find a specification in "Adobe Supplement to ISO 32000, base version 1.7, extension level 3".
Furthermore, with ISO 32000-2 that Adobe AES-256 encryption scheme and all older schemes became deprecated; the only encryption scheme to use with PDF 2.0 is a new AES-256 scheme described in ISO 32000-2, which is based on the Adobe scheme but introduces some extra hashing iterations.

Does AVRO schema also get encoded in the binary part?

An Avro file contains the schema in plain text followed by the data in binary format. I'd like to know whether the schema (or some part of it) also exists in the binary part. I have a hunch that the schema (or at least the field names) is also encoded in the binary part, because when I make some changes in the plain schema part of an Avro file, I get an error message when exporting the schema using avro-tools.jar.
When the binary encoding is used, the whole file uses a binary format.
The file starts with a 4-byte header, immediately followed by a map containing some metadata. This map contains an "avro.schema" entry; the value of this entry is the schema, stored as a string. After the map you will find your data.
If you edit the schema manually and thereby change its size, the length prefix stored just before this string becomes incoherent and the file is corrupted.
See the binary encoding specification to learn how the various types are binary encoded.
I'm not sure what you are trying to achieve, and I'm quite sure it should not be done. But for fun, let's try to edit the schema in place.
For this example I will use the weather.avro file from the Avro source tree:
$ java -jar avro-tools-1.8.0.jar getmeta weather-orig.avro
avro.codec null
avro.schema {"type":"record","name":"Weather","namespace":"test","fields":[{"name":"station","type":"string"},{"name":"time","type":"long"},{"name":"temp","type":"int"}],"doc":"A weather reading."}
$ java -jar avro-tools-1.8.0.jar getschema weather-orig.avro
{
"type" : "record", "name" : "Weather", "namespace" : "test", "doc" : "A weather reading.",
"fields" : [
{"name" : "station", "type" : "string"},
{"name" : "time", "type" : "long"},
{"name" : "temp", "type" : "int"}
]
}
$ java -jar avro-tools-1.8.0.jar tojson weather-orig.avro
{"station":"011990-99999","time":-619524000000,"temp":0}
{"station":"011990-99999","time":-619506000000,"temp":22}
{"station":"011990-99999","time":-619484400000,"temp":-11}
{"station":"012650-99999","time":-655531200000,"temp":111}
{"station":"012650-99999","time":-655509600000,"temp":78}
OK. This is our source file: plain and simple, two metadata entries, and the schema defines three fields. Now we will try to understand how things are stored in binary, and how we can edit the file to rename station into station-id.
$ hexdump weather-orig.avro -n 256 -C
00000000 4f 62 6a 01 04 14 61 76 72 6f 2e 63 6f 64 65 63 |Obj...avro.codec|
00000010 08 6e 75 6c 6c 16 61 76 72 6f 2e 73 63 68 65 6d |.null.avro.schem|
00000020 61 f2 02 7b 22 74 79 70 65 22 3a 22 72 65 63 6f |a..{"type":"reco|
00000030 72 64 22 2c 22 6e 61 6d 65 22 3a 22 57 65 61 74 |rd","name":"Weat|
00000040 68 65 72 22 2c 22 6e 61 6d 65 73 70 61 63 65 22 |her","namespace"|
00000050 3a 22 74 65 73 74 22 2c 22 66 69 65 6c 64 73 22 |:"test","fields"|
00000060 3a 5b 7b 22 6e 61 6d 65 22 3a 22 73 74 61 74 69 |:[{"name":"stati|
00000070 6f 6e 22 2c 22 74 79 70 65 22 3a 22 73 74 72 69 |on","type":"stri|
00000080 6e 67 22 7d 2c 7b 22 6e 61 6d 65 22 3a 22 74 69 |ng"},{"name":"ti|
00000090 6d 65 22 2c 22 74 79 70 65 22 3a 22 6c 6f 6e 67 |me","type":"long|
000000a0 22 7d 2c 7b 22 6e 61 6d 65 22 3a 22 74 65 6d 70 |"},{"name":"temp|
000000b0 22 2c 22 74 79 70 65 22 3a 22 69 6e 74 22 7d 5d |","type":"int"}]|
000000c0 2c 22 64 6f 63 22 3a 22 41 20 77 65 61 74 68 65 |,"doc":"A weathe|
000000d0 72 20 72 65 61 64 69 6e 67 2e 22 7d 00 b0 81 b3 |r reading."}....|
000000e0 c4 0a 0c f6 62 fa c9 38 fd 7e 52 00 a7 0a cc 01 |....b..8.~R.....|
000000f0 18 30 31 31 39 39 30 2d 39 39 39 39 39 ff a3 90 |.011990-99999...|
The first four bytes, 4f 62 6a 01, are the header.
The next thing is a long describing the number of entries in the first block of the "metadata" map. Longs are encoded using variable-length zig-zag coding, so here 04 means 2, which is coherent with the output of getmeta (remember to read the Avro specification to learn how the various data types are encoded).
Just after that comes the first key of the map. A key is a string, and a string is prefixed by its length in bytes. Here 0x14 zig-zag-decodes to 10, which is the length of "avro.codec" encoded in UTF-8.
You can then skip those 10 bytes and move on to the next element, and so on, until you spot the avro.schema key.
Just after this string is the length of the map value (which is a string, since it is our schema). That is what you want to modify. We are renaming station into station-id, so you want to add 3 to the current length: f2 02 (zig-zag for 185) should become f8 02 (zig-zag for 188).
You can now update the schema string itself to add "-id" (a sketch of the complete edit follows below).
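To make that concrete, here is a rough Python sketch of the whole in-place edit. It is my own code, not part of avro-tools, and it assumes an uncompressed data file like this one, in which the byte sequence avro.schema occurs exactly once.

def zigzag_decode(buf, pos):
    # read one variable-length zig-zag long; return (value, next position)
    shift = result = 0
    while True:
        b = buf[pos]
        pos += 1
        result |= (b & 0x7F) << shift
        shift += 7
        if not b & 0x80:
            break
    return (result >> 1) ^ -(result & 1), pos

def zigzag_encode(n):
    # inverse of the above: zig-zag, then 7 bits per byte with a marker bit
    n = (n << 1) ^ (n >> 63)
    out = bytearray()
    while n & ~0x7F:
        out.append((n & 0x7F) | 0x80)
        n >>= 7
    out.append(n)
    return bytes(out)

data = bytearray(open('weather-orig.avro', 'rb').read())
pos = data.index(b'avro.schema') + len(b'avro.schema')  # value follows the key
old_len, val_pos = zigzag_decode(data, pos)              # 185 for this file
schema = bytes(data[val_pos:val_pos + old_len])
new_schema = schema.replace(b'"station"', b'"station-id"')
patched = (data[:pos] + zigzag_encode(len(new_schema))   # fixed length prefix
           + new_schema + data[val_pos + old_len:])
open('weather.avro', 'wb').write(patched)

Everything after the metadata map (the sync marker and the data blocks) is left untouched, which is why avro-tools still reads the file happily afterwards.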
Enjoy
java -jar /home/cmathieu/Sources/avro-trunk/lang/java/tools/target/avro-tools-1.8.0-SNAPSHOT.jar tojson weather.avro
{"station-id":"011990-99999","time":-619524000000,"temp":0}
{"station-id":"011990-99999","time":-619506000000,"temp":22}
{"station-id":"011990-99999","time":-619484400000,"temp":-11}
{"station-id":"012650-99999","time":-655531200000,"temp":111}
{"station-id":"012650-99999","time":-655509600000,"temp":78}
But as I said, you most likely don't want to do that.

What are the parts of an ECDSA entry in the 'known_hosts' file?

I'm trying to extract an ECDSA public key from my known_hosts file that ssh uses to verify a host. I have one below as an example.
This is the entry for "127.0.0.1 ecdsa-sha2-nistp256" in my known_hosts file:
AAAAE2VjZHNhLXNoYTItbmlzdHAyNTYAAAAIbmlzdHAyNTYAAABBBF3QCzKPRluwunLRHaFVEZNGCPD/rT13yFjKiCesA1qoU3rEp9syhnJgTbaJgK70OjoT71fDGkwwcnCZuJQPFfo=
I ran it through a Base64 decoder to get this:
���ecdsa-sha2-nistp256���nistp256���A]2F[rUF=wXʈ'ZSzħ2r`M::WL0rp
So I'm assuming those question marks are some kind of separator (as it turns out, they are length fields). I figured out that nistp256 is the elliptic curve used, but what exactly is that last value?
From what I've been reading, the public key for ECDSA is a pair of values, x and y, which represent a point on the curve. Is there some way to extract x and y from this entry?
I'm trying to convert it into a Java public key object, but I need x and y in order to do so.
Not all of the characters are shown, since they are binary. Write the Base64-decoded value to a file and open it in a hex editor.
The public key for a P-256 curve should be a 65-byte array, starting with a byte of value 4 (which means an uncompressed point). The next 32 bytes are the x value, and the following 32 bytes the y value.
Here is the result in hexadecimal:
Signature algorithm:
00 00 00 13
65 63 64 73 61 2d 73 68 61 32 2d 6e 69 73 74 70 32 35 36
(ecdsa-sha2-nistp256)
Name of domain parameters:
00 00 00 08
6e 69 73 74 70 32 35 36
(nistp256)
Public key value:
00 00 00 41
04
5d d0 0b 32 8f 46 5b b0 ba 72 d1 1d a1 55 11 93 46 08 f0 ff ad 3d 77 c8 58 ca 88 27 ac 03 5a a8
53 7a c4 a7 db 32 86 72 60 4d b6 89 80 ae f4 3a 3a 13 ef 57 c3 1a 4c 30 72 70 99 b8 94 0f 15 fa
So you first have the name of the digital signature algorithm to use, then the name of the curve and then the public component of the key, represented by an uncompressed EC point. Uncompressed points start with 04, then the X coordinate (same size as the key size) and then the Y coordinate.
As you can see, each field value is preceded by four bytes indicating the size of the field, and all values and fields use big-endian notation.
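If you want to extract the values programmatically, here is a small sketch (in Python for brevity, though the question mentions Java; the helper name is mine) that walks those length-prefixed fields and returns x and y as integers:

import base64, struct

blob = base64.b64decode(
    'AAAAE2VjZHNhLXNoYTItbmlzdHAyNTYAAAAIbmlzdHAyNTYAAABBBF3QCzKPRluwunLRHaFVEZNGCP'
    'D/rT13yFjKiCesA1qoU3rEp9syhnJgTbaJgK70OjoT71fDGkwwcnCZuJQPFfo=')

def read_field(buf, pos):
    # each field: a 4-byte big-endian length, then that many bytes
    (n,) = struct.unpack_from('>I', buf, pos)
    return buf[pos + 4:pos + 4 + n], pos + 4 + n

algo, pos = read_field(blob, 0)       # b'ecdsa-sha2-nistp256'
curve, pos = read_field(blob, pos)    # b'nistp256'
point, pos = read_field(blob, pos)    # 65 bytes: 04 || X || Y
assert point[0] == 4                  # uncompressed point
x = int.from_bytes(point[1:33], 'big')
y = int.from_bytes(point[33:65], 'big')
print(hex(x), hex(y))

On the Java side, x and y can then go into a java.security.spec.ECPoint and an ECPublicKeySpec for the named curve.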

How to return LOW VALUES HEX '00' in an SQL statement?

I need to write LOW VALUES HEX '00' into a file (in the middle of a string).
I could do it with the utl_file package, using utl_file.put_raw(v_file, hextoraw('000000')), but that only lets me do it at the beginning and end of the file, not in the middle of a string.
So my question is: how do I produce LOW VALUES HEX '00' in a select statement?
I tried some variants like
Select 'blablabla' Q, hextoraw('000000'), 'blablabla' w from dual;
saved the result into a .dat file and opened it in a hex editor, but the result was different from the one produced with utl_file.
Could anybody (if it's possible) write a correct sql statement?
If I understand you correctly, you're trying to add a null/binary zero to your output. If so, you can just use chr(0). In a select statement that is simply select 'blablabla' || chr(0) || 'blablabla' from dual; with utl_file it looks like:
utl_file.putf(l_file, 'This is a binary zero' || chr(0));
Looking at that in a hex editor will show you:
00000000 54 68 69 73 20 69 73 20 61 20 62 69 6e 61 72 79 |This is a binary|
00000010 20 7a 65 72 6f 00 0a | zero..|

How does Opera Turbo compress the data (cache)? [closed]

I have the Opera browser with "Opera Turbo" enabled. It is a proxy which recompresses HTML into a smaller format. I have a file from the Opera cache which was compressed by Turbo from 2000 kB to 500 kB. How can I decompress this file into readable form? (The original file has almost no HTML tags: just 8-bit text, "<p>" tags, and an HTML header/footer.)
Here is an example of such file:
.opera$ hexdump -C cache/turbo/g_0000/opr00003.tmp
00000000 78 da 6c 8f bf 4e c4 30 0c c6 67 fa 14 26 48 6c |xзl▐©Nд0.фgЗ.&Hl|
00000010 a1 1c 12 d3 25 1d f8 37 82 54 f1 02 69 63 48 74 |║..с%.Ь7┌TЯ.icHt|
00000020 69 52 12 97 d2 b7 ed 88 40 80 b8 05 06 06 7a 57 |iR.≈р╥М┬#─╦...zW|
00000030 09 21 84 27 fb f3 cf 9f 6d 61 a8 71 45 26 0c 2a |.!└'ШСо÷ma╗qE&.*|
00000040 5d 64 3b a2 41 52 60 88 5a 8e 77 9d bd 97 ec 34 |]d;╒AR`┬Z▌w²╫≈Л4|
00000050 78 42 4f fc 7a 68 91 41 3d 57 92 11 3e 50 be 99 |xBOЭzh▒A=W▓.>P╬≥|
00000060 5d 42 6d 54 4c 48 b2 b7 5e 87 3e f1 c5 d1 f1 82 |]BmTLH╡╥^┤>ЯеяЯ┌|
00000070 fd 78 79 d5 a0 64 1a 53 1d 6d 4b 36 f8 5f 26 ef |Щxyу═d.S.mK6Ь_&О|
00000080 eb 71 fd f5 f8 97 5d e1 d0 87 a8 d3 ff 20 59 72 |КqЩУЬ≈]Ап┤╗сЪ Yr|
00000090 58 94 5d 4a 56 41 f0 40 06 e1 12 09 f6 1b ad 92 |X■]JVAП#.А..Ж.╜▓|
000000a0 59 c2 8c 8a 7c e6 32 91 cf 9f 09 67 fd 0a 22 3a |Yб▄┼|Ф2▒о÷.gЩ.":|
...
and here is a part of the original file (I'm not sure whether it really is the original file, but very likely it is):
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=windows-1251">
<meta name="description" content="статьи">
<meta name="keywords" content="статьи">
<title>Russia on the Net — статьи</title>
</head>
<link rel="stylesheet" href="/rus/style.css">
<body bgcolor="#FFFFFF">
<center>
...
The size of the compressed file is 3397 bytes and of the original ~8913 bytes. The original file is compressible by bzip2 to 3281 bytes; by gzip to 3177 bytes; by lzma to 2990 bytes; by 7z to 3082 bytes; by zip to 3291 bytes.
Update: I have information (from the Chrome opera-mini extension http://ompd-proxy.narod.ru/distrib/opera_mini_proxy.crx - unpack it with 7-zip) that Opera Mini uses webodf/src/core_RawInflate.js to unpack the data. Can this file help me?
The first two bytes, 78 DA, are a valid 2-byte zLib header (see section 2.2 on CMF and FLG) of the kind that precedes deflate-compressed data. So the file could be compressed using zLib/deflate.
For a first quick test, you can use my command-line tool Precomp like this:
precomp -v -c- -slow opr00003.tmp
It will report zLib-compressed streams and how big they are when decompressed ("... can be decompressed to ... bytes"). If this is successful (it returns a decompressed size close to the original file size you know), use your favourite programming language along with the zLib library to decompress your data.
Also note that if you're lucky, the stream (or a part of it) can be recompressed bit-for-bit identically by Precomp, and the output file opr00003.pcf will contain (a part of) the decompressed data preceded by a small header.
EDIT: As osgx commented and further analysis showed, the data cannot be decompressed using zLib/deflate, so this is still an unsolved case.
EDIT2: The update and especially the linked JS show that it is deflate, but it seems to be some custom variant. Comparing it with the original code could help, as could a comparison with the original zLib source code.
Additionally, the JS code could of course be used to try to decompress the data. It doesn't seem to handle the 2-byte header, though, so that header probably has to be skipped.
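If you want to experiment along those lines, here is a quick probe in Python that tests both interpretations (a regular zLib stream, and raw deflate with the 2-byte header skipped). Judging by the edits above, both attempts will apparently fail on this particular file, but the same probe works for any suspected deflate data:

import zlib

data = open('opr00003.tmp', 'rb').read()

for label, blob, wbits in (
        ('zlib stream', data, zlib.MAX_WBITS),        # expects the 78 DA header
        ('raw deflate', data[2:], -zlib.MAX_WBITS)):  # 2-byte header skipped
    try:
        out = zlib.decompressobj(wbits).decompress(blob)
        print(label, 'OK:', len(out), 'bytes')
    except zlib.error as e:
        print(label, 'failed:', e)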
There are different file types in the Opera Turbo cache. The first one is cited in the question; some files are stored unpacked (css and js), and there is a Z-packed multi-file tar-like archive for images (VP8, detected by the plain-text RIFF, WEBP, and VP8 magics):
Example of a Z-packed file header:
5a 03 01 1c 90 02 0a 22 03 18 2a (RIFF data of first img) (RIFF data of second img) (RIFF data of third img)
The RIFF container is clearly visible, and it has a length field, so I suggest this description:
5a - magic of the format
03 - number of files
01 - first file (riff size=0x1c90)
1c 90 - big-endian length of first file
02 - second file (riff size=0x0a22)
0a 22 - length of second file
03 - third file (riff size=0x182a)
18 2a - length of third file
52 49 46 46 == "RIFF" magic of first file
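If that suggested layout is right (it matches both dumps shown here, but it is reverse-engineered rather than documented), a minimal parser sketch in Python could look like this:

import struct

def split_z_archive(buf):
    # split a Z-packed Opera Turbo container into its member files, assuming:
    # 'Z' magic, a file count, then per file a 1-byte index and a 2-byte
    # big-endian length, followed by the concatenated file data
    assert buf[0] == 0x5A                # 'Z' magic
    count = buf[1]                       # number of files
    sizes = []
    pos = 2
    for _ in range(count):
        # buf[pos] is the 1-based file index; the length follows it
        (size,) = struct.unpack_from('>H', buf, pos + 1)
        sizes.append(size)
        pos += 3
    files = []
    for size in sizes:                   # member data is concatenated
        files.append(buf[pos:pos + size])
        pos += size
    return files

For the JPEG example below, the data section would start at offset 2 + 8*3 = 0x1a, which is exactly where the ff d8 marker appears.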
Another example of a Z-file with JPGs (the "JFIF" magic is visible, while the ff d8 ff JPEG marker is not visible as text; 8 files inside):
0000000: 5a08 0118 de02 1cab 0308 0804 162c 0531 Z............,.1
0000010: 4d06 080f 070a 4608 0964"ffd8 ffe0 0010 M.....F..d......
0000020: 4a46 4946 0001 0101 0060 0060 0000 ffdb JFIF.....`.`....
Another type of file detected by file is the "<000" file, with an example header of (hex) 1f 8b 08 00 00 00 00 00 02 ff ec 52 cb 6a c3 30 10 fc 15 63.
file says it is "gzip compressed data, max compression", and it can simply be unpacked by any gzip.