Gzip deflate non-compressed data format

After reading RFC 1951, I manually wrote a simple gzip file that contains non-compressed data. The uncompressed data file holds only the single character 'a', with no additional spaces or line breaks. The content of the gzip file is
1f 8b 08 00 00 00 00 00 00 03 01 80 00 7f ff 86 43 be b7 e8 01 00 00 00.
When I tried to unzip it under Linux, it gave me the error "gzip: xxx.gz: unexpected end of file".
I think I followed the deflate format for non-compressed blocks described in section 3.2.4. After the 10-byte gzip header:
01: BFINAL=1 and BTYPE=00
8000: LEN=1
7fff: NLEN
86: a
Followed by CRC and Size.
Can anyone point out anything wrong or missing in the gzip file? Thanks a lot.

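8000 is length 128, not 1, because LEN is stored least-significant byte first. 0100 would be length 1. (Interestingly, you managed to correctly represent the total uncompressed length at the end as 01 00 00 00.)
NLEN is the one's complement of LEN, so for LEN = 0100 it has to be feff.
Also, an 'a' is hex 61, not 86.
So the correct stream would be: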
1f 8b 08 00 00 00 00 00 00 03 01 01 00 fe ff 61 43 be b7 e8 01 00 00 00
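As a cross-check, here is a minimal Python sketch that builds the same kind of file: a 10-byte gzip header, one stored deflate block, then CRC-32 and size. The helper name `gzip_stored` is made up for this illustration, and it assumes the data fits in a single stored block (at most 65535 bytes).

```python
import gzip
import struct
import zlib


def gzip_stored(data: bytes) -> bytes:
    """Wrap data in a gzip member holding one stored (BTYPE=00) deflate block."""
    if len(data) > 0xFFFF:
        raise ValueError("a single stored block holds at most 65535 bytes")
    # 10-byte gzip header: magic, CM=8 (deflate), FLG=0, MTIME=0, XFL=0, OS=3 (Unix)
    header = bytes([0x1F, 0x8B, 0x08, 0x00, 0, 0, 0, 0, 0x00, 0x03])
    # First byte: BFINAL=1 and BTYPE=00 in the low 3 bits, rest is padding.
    # LEN and NLEN (one's complement of LEN) are little-endian 16-bit values.
    block = b"\x01" + struct.pack("<HH", len(data), len(data) ^ 0xFFFF) + data
    # Trailer: CRC-32 of the uncompressed data, then its length, both little-endian.
    trailer = struct.pack("<II", zlib.crc32(data), len(data))
    return header + block + trailer


stream = gzip_stored(b"a")
print(stream.hex(" "))           # hex dump of the generated file
print(gzip.decompress(stream))   # round-trip through Python's gzip module
```

Writing `stream` to a file should give something that `gzip -d` accepts, since the bytes match the corrected stream above.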

jbig2 data in pdf is not valid jbig2 data. Wrong magic

I would like to take some jbig2 data out of a pdf file and load it using libjbig2dec (http://sourceforge.net/projects/jbig2dec)
For some reason the jbig2 data in the pdf file starts with this:
00000000 00 00 00 00 30 01 01 00 00 00 13 00 00 0a 5e 00
00000010 00 0f c3 00 00 2e 23 00 00 2e 23 00 00 00 00 00
00000020 00 01 26 01 01 ff ff ff ff 00 00 0a 5e 00 00 0f
00000030 c3 00 00 00 00 00 00 00 00 00 00 03 ff fd ff 02
00000040 fe fe fe ab f3 d0 fe 9e 92 d8 9f 63 ae 67 79 b8
00000050 81 ff 57 33 90 a4 ee c2 af c8 80 dc 0d 60 1e 86
But a valid jbig2 file should start with this magic:
0x97, 0x4a, 0x42, 0x32, 0x0d, 0x0a, 0x1a, 0x0a
What's going on here?
The PDF format strips the header and the tail of the jbig2 file, as specified in PDF, Version 1.7 (ISO 32000-1:2008), section 7.4.7 JBIG2Decode Filter.
Further, some PDF files contain jbig2 streams whose last segment has an unspecified size (ff ff ff ff). libjbig2dec cannot handle this.
Some PDFs are missing the JBIG2 header; here is a well-known header stream for the jb2 file format:
974A42320D0A1A0A0100000001000000003E00010000006820000000536F7572636500506F776572204A4249472D3220456E636F646572202D2054686520556E6976657273697479206F66204272697469736820436F6C756D626120616E6420496D61676520506F77657220496E632E0056657273696F6E00312E302E3000000000
I prepended the above stream to the raw data, and it decoded fine.
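The idea of checking for the magic and prepending a file header can be sketched in Python as below. This is only an illustration: the function names are made up, and it prepends just the first 13 bytes of the stream quoted above (magic, a flags byte of 0x01, and a page count of 1), leaving out the extra segment that follows them there. Whether a given decoder accepts the result is not guaranteed.

```python
JBIG2_MAGIC = bytes([0x97, 0x4A, 0x42, 0x32, 0x0D, 0x0A, 0x1A, 0x0A])


def has_jbig2_file_header(data: bytes) -> bool:
    """True if the data starts with the JBIG2 file magic."""
    return data.startswith(JBIG2_MAGIC)


def add_file_header(embedded: bytes, pages: int = 1) -> bytes:
    """Prepend magic + flags + page count to an embedded (PDF-style) stream."""
    if has_jbig2_file_header(embedded):
        return embedded  # already a standalone file, leave it alone
    flags = b"\x01"  # same flags byte as in the header stream quoted above
    return JBIG2_MAGIC + flags + pages.to_bytes(4, "big") + embedded


# First bytes of the stream from the question, as extracted from the PDF
pdf_stream = bytes.fromhex("00 00 00 00 30 01 01 00 00 00 13")
print(has_jbig2_file_header(pdf_stream))
print(add_file_header(pdf_stream)[:13].hex())
```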

ssl client_hello, unidentified data

I am trying to make sense of an SSL Client Hello packet, but I am stuck on the last few bytes.
0000 16 03 00 00 58 01 00 00 54 03 03 52 f3 8a b2 f6 ....X...T..R....
0010 35 b8 08 39 25 5f 61 73 d5 b6 af 4d 3c 1a 2d 70 5..9%_as...M<.-p
0020 58 2e be 8a 89 b6 5c e1 9a 3f 81 00 00 18 00 35 X.....\..?.....5
0030 00 2f 00 0a 00 05 00 04 00 38 00 32 00 13 00 66 ./.......8.2...f
0040 00 39 00 33 00 16 01 00 00 13 ff 01 00 01 00 00 .9.3............
0050 0d 00 0a 00 08 04 02 04 01 02 01 02 02 .............
What I got so far:
16: msg type
03 00: SSL version
00 58: Record Length
01: Handshake Type - Client_Hello
00 00 54: Message Length
03 03: Client preferred version
52 f3 8a b2 f6 35 ... 5c e1 9a 3f 81: random data/ timestamp
00: Session ID Length 0
00 18: Cipher suites length
00 35 .. 00 16: cipher suites
01: compression method length
00: compression method
00 13 ff 01 00 01 00 00 0d 00 0a 00 08 04 02 04 01 02 01 02 02: what is this ?
At first I thought it was challenge data, but it seems to be constant across all the packets.
My main guide for deciphering the packet was: http://www.ntu.edu.sg/home/ehchua/programming/webprogramming/HTTP_SSL.html (under Client_Hello)
(sorry for the bad formatting)
The bytes after the compression method are TLS extensions (see RFC 5246, section 7.4.1.2 Client Hello).
0x00 0x13 length of extensions (19 bytes)
The first one is the renegotiation_info extension (see RFC 5746, Section 3.2 Extension Definition):
0xff 0x01 renegotiation_info
0x00 0x01 length
0x00 empty renegotiated_connection (length 0), as sent for initial handshakes
The other one is the signature_algorithms extension (RFC 5246, section 7.4.1.4.1):
0x00 0x0d signature_algorithms
0x00 0x0a length
0x00 0x08 length of the supported_signature_algorithms list (8 bytes, i.e. 4 pairs)
0x04 0x02 HashAlgorithm: sha-256, SignatureAlgorithm: dsa
0x04 0x01 HashAlgorithm: sha-256, SignatureAlgorithm: rsa
0x02 0x01 HashAlgorithm: sha-1, SignatureAlgorithm: rsa
0x02 0x02 HashAlgorithm: sha-1, SignatureAlgorithm: dsa
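The same breakdown can be reproduced with a short Python sketch that walks the extension block from the capture. The function names are made up for this example; the hash and signature code tables follow the TLS 1.2 numbering in RFC 5246, and no bounds checking is done.

```python
import struct

HASHES = {0: "none", 1: "md5", 2: "sha1", 3: "sha224",
          4: "sha256", 5: "sha384", 6: "sha512"}
SIGNATURES = {0: "anonymous", 1: "rsa", 2: "dsa", 3: "ecdsa"}


def parse_extensions(blob: bytes) -> list:
    """Split a ClientHello extension block into (type, data) tuples."""
    (total,) = struct.unpack_from(">H", blob, 0)  # 2-byte total length
    pos, end = 2, 2 + total
    extensions = []
    while pos < end:
        ext_type, ext_len = struct.unpack_from(">HH", blob, pos)
        pos += 4
        extensions.append((ext_type, blob[pos:pos + ext_len]))
        pos += ext_len
    return extensions


def parse_signature_algorithms(data: bytes) -> list:
    """Decode signature_algorithms data into (hash, signature) name pairs."""
    (list_len,) = struct.unpack_from(">H", data, 0)  # 2-byte list length
    pairs = data[2:2 + list_len]
    return [(HASHES.get(pairs[i], hex(pairs[i])),
             SIGNATURES.get(pairs[i + 1], hex(pairs[i + 1])))
            for i in range(0, len(pairs), 2)]


# The trailing bytes of the Client Hello from the question
tail = bytes.fromhex("00 13 ff 01 00 01 00 00 0d 00 0a 00 08 04 02 04 01 02 01 02 02")
for ext_type, data in parse_extensions(tail):
    print(hex(ext_type), data.hex())
    if ext_type == 0x000D:
        print(parse_signature_algorithms(data))
```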

Why is there information missing in objdump?

I can't manage to find out why there are sometimes some .words missing in my assembly code when I run objdump. What do the "..." alone on a line represent?
In the output of objdump -d or -D (disassemble), there will often be lines containing only an ellipsis. This simply means that all the bytes between the line above and the line below are null (0x00).
Below is the output of a disassembled 32-bit program. Between file offsets 0x238 (0x234 + 4) and 0x240, the executable contains only 0x00 bytes.
40022c: 00000034 0x34
400230: 0000016a 0x16a
400234: 000001ac 0x1ac
...
400240: 00000098 0x98
400244: 00000000 nop
400248: 000000a9 0xa9
...
400254: 000000cf 0xcf
Looking at the application we disassembled, you can see that where the ellipsis occurs is all null bytes. There is no point in outputting these to the user multiple times, so objdump simply removes them. In the hex dump below, the ellipses correspond to the zero bytes at offsets 0x238-0x23F and 0x24C-0x253. I should also note that if there is only a single word (32 or 64 bits) of null bytes, objdump will show it, as nop or similar depending on the machine.
Offset(h) 00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F
00000220 34 00 00 00 4...
00000230 6A 01 00 00 AC 01 00 00 00 00 00 00 00 00 00 00 j...¬...........
00000240 98 00 00 00 00 00 00 00 A9 00 00 00 00 00 00 00 ˜.......©.......
00000250 00 00 00 00 CF 00 00 00 ....Ï...
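To make the mapping between the two dumps concrete, here is a small Python sketch that mimics the observed output: it reads the bytes as little-endian 32-bit words and replaces any run of two or more zero words with "...". This is only an illustration of what the listing shows, not objdump's actual algorithm; the function name and the `min_run` threshold are assumptions.

```python
import struct


def collapse_zero_words(base: int, blob: bytes, min_run: int = 2) -> list:
    """List 32-bit words, replacing runs of >= min_run zero words with '...'."""
    words = struct.unpack("<%dI" % (len(blob) // 4), blob)
    lines = []
    i = 0
    while i < len(words):
        if words[i] == 0:
            j = i
            while j < len(words) and words[j] == 0:
                j += 1
            if j - i >= min_run:
                lines.append("...")
                i = j
                continue
        # A non-zero word, or a zero run too short to collapse
        lines.append("%x: %08x" % (base + 4 * i, words[i]))
        i += 1
    return lines


# Bytes from file offset 0x22c onward, taken from the hex dump above
blob = bytes.fromhex(
    "34000000 6A010000 AC010000 00000000 00000000 98000000"
    " 00000000 A9000000 00000000 00000000 CF000000"
)
for line in collapse_zero_words(0x40022C, blob):
    print(line)
```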
I've used the -z argument to objdump, which stops it from hiding the blocks of zeroes. You should then see the .word entries with zeroes.
This seems to be useful when you're passing the output of objdump to another program.

understand hexedit of an elf

Consider the following hexedit display of an ELF file.
00000000 7F 45 4C 46 01 01 01 00 00 00 00 00 .ELF........
0000000C 00 00 00 00 02 00 03 00 01 00 00 00 ............
00000018 30 83 04 08 34 00 00 00 50 14 00 00 0...4...P...
00000024 00 00 00 00 34 00 20 00 08 00 28 00 ....4. ...(.
00000030 24 00 21 00 06 00 00 00 34 00 00 00 $.!.....4...
0000003C 34 80 04 08 34 80 04 08 00 01 00 00 4...4.......
00000048 00 01 00 00 05 00 00 00 04 00 00 00 ............
How many section headers does it have?
Is it an object file or an executable file?
How many program headers does it have?
If there are any program headers, what does the first program header do?
If there are any section headers, at what offset is the section header table?
Strange, this hexdump looks like your homework to me...
There are 36 section headers.
It is an executable.
It has 8 program headers.
As you can tell from the first word of the first program header (offset 0x34: 0x00000006), it is of type PT_PHDR, which just describes the location and size of the program header table itself.
The section header table begins at byte 5200 (which is 0x1450 in hex).
How do I know this stuff? By dumping the hex into a binary and reading it with readelf -a (because I am lazy). Except for question no. 4, which I had to figure out manually by reading man 5 elf.
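If you would rather not go through readelf, the same numbers can be read straight from the hex dump with Python's struct module. This is a minimal sketch that assumes a 32-bit little-endian ELF (which the 01 01 bytes after the magic indicate); the field names follow man 5 elf.

```python
import struct

# The 84 bytes shown in the question: 52-byte ELF header + first program header
dump = bytes.fromhex("""
7F 45 4C 46 01 01 01 00 00 00 00 00
00 00 00 00 02 00 03 00 01 00 00 00
30 83 04 08 34 00 00 00 50 14 00 00
00 00 00 00 34 00 20 00 08 00 28 00
24 00 21 00 06 00 00 00 34 00 00 00
34 80 04 08 34 80 04 08 00 01 00 00
00 01 00 00 05 00 00 00 04 00 00 00
""")

EHDR = struct.Struct("<16sHHIIIIIHHHHHH")  # Elf32_Ehdr, 52 bytes
PHDR = struct.Struct("<8I")                # Elf32_Phdr, 32 bytes

(e_ident, e_type, e_machine, e_version, e_entry, e_phoff, e_shoff, e_flags,
 e_ehsize, e_phentsize, e_phnum, e_shentsize, e_shnum, e_shstrndx) = EHDR.unpack_from(dump, 0)

(p_type, p_offset, p_vaddr, p_paddr,
 p_filesz, p_memsz, p_flags, p_align) = PHDR.unpack_from(dump, e_phoff)

print("type:", e_type)                      # 2 is ET_EXEC (executable)
print("program headers:", e_phnum)
print("section headers:", e_shnum)
print("section header table offset:", e_shoff, hex(e_shoff))
print("first program header type:", p_type)  # 6 is PT_PHDR
```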

Who's Messing Up this TCP Connection?

I'm responsible for some embedded software that has to work with a customer's proprietary TCP interface (also embedded, but running under a well known and well regarded RTOS), but it's not getting through the three-way handshake, even though the HTTP interface, etc., all work fine, and I can communicate using the custom protocol with a program running on my PC.
Looking at the WireShark captures, his side initiates by sending a SYN, I send a SYN-ACK, and then he immediately sends a RST, so it looks like the problem is on his end. Is my analysis correct?
Here's a typical three packet example of the problem, with the MAC IDs anonymized (the real MAC IDs are valid). Sorry about pasting the raw hex, if anybody's got a better idea of how to put the WireShark capture up, I'm certainly amenable.
63 2009-06-29 13:07:49.685057 10.13.91.2 10.13.92.3 TCP 1024 > 49151 [SYN] Seq=0 Win=8192 Len=0 MSS=1460 WS=0 TSV=194 TSER=0
0000 f1 f1 f1 00 03 09 ab ab ab 60 10 89 08 00 45 00
0010 00 3c 00 68 40 00 40 06 6f 35 0a 0d 5b 02 0a 0d
0020 5c 03 04 00 bf ff 7d b3 81 44 00 00 00 00 a0 02
0030 20 00 9c 2f 00 00 02 04 05 b4 01 03 03 00 01 01
0040 08 0a 00 00 00 c2 00 00 00 00
64 2009-06-29 13:07:49.685375 10.13.92.3 10.13.91.2 TCP 49151 > 1024 [SYN, ACK] Seq=0 Ack=1 Win=1460 Len=0
0000 ab ab ab 60 10 89 f1 f1 f1 00 03 09 08 00 45 00
0010 00 28 00 02 00 00 64 06 8b af 0a 0d 5c 03 0a 0d
0020 5b 02 bf ff 04 00 d4 ff ff ff 7d b3 81 45 50 12
0030 05 b4 47 07 00 00 00 00 00 00 00 00
65 2009-06-29 13:07:49.685549 10.13.91.2 10.13.92.3 TCP 1024 > 49151 [RST] Seq=1 Win=0 Len=0
0000 f1 f1 f1 00 03 09 ab ab ab 60 10 89 08 00 45 00
0010 00 28 00 6a 00 00 40 06 af 47 0a 0d 5b 02 0a 0d
0020 5c 03 04 00 bf ff 7d b3 81 45 00 00 00 00 50 04
0030 00 00 21 c9 00 00 00 00 00 00 00 00
If both of you are using standard RTOS implementations, it is unlikely that the TCP stack has a problem. Or did you say the TCP stack is locally implemented?
If his client sends a SYN properly and you can reply with a SYN+ACK, it would appear that either your SYN+ACK is not well formed (though I could not see anything wrong with it yet) or, as you suspect, his TCP stack did not accept the SYN+ACK properly. However, if these are standard implementations, that is unlikely.
So, what more can you do?
Since it is the TCP handshake we are checking, you can have him connect to any other machine at your end that is listening on the desired port. This will check his implementation (it's good if the 3-way handshake completes).
You can check your TCP stack with a telnet connection to the port from another local machine. This will check your implementation (good if the 3-way handshake completes).
If both of these are fine, we need to suspect the network path. For example, could there be a firewall that is not allowing the communication and is actively sending a RST to you?
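To look at the raw header fields in the three frames from the question without Wireshark, a Python sketch along these lines can help. It assumes Ethernet II framing with an IPv4 payload; the function name is made up, and the sequence numbers it prints are the raw values, not the relative ones Wireshark displays.

```python
import struct

FRAMES = {
    63: """f1 f1 f1 00 03 09 ab ab ab 60 10 89 08 00 45 00
           00 3c 00 68 40 00 40 06 6f 35 0a 0d 5b 02 0a 0d
           5c 03 04 00 bf ff 7d b3 81 44 00 00 00 00 a0 02
           20 00 9c 2f 00 00 02 04 05 b4 01 03 03 00 01 01
           08 0a 00 00 00 c2 00 00 00 00""",
    64: """ab ab ab 60 10 89 f1 f1 f1 00 03 09 08 00 45 00
           00 28 00 02 00 00 64 06 8b af 0a 0d 5c 03 0a 0d
           5b 02 bf ff 04 00 d4 ff ff ff 7d b3 81 45 50 12
           05 b4 47 07 00 00 00 00 00 00 00 00""",
    65: """f1 f1 f1 00 03 09 ab ab ab 60 10 89 08 00 45 00
           00 28 00 6a 00 00 40 06 af 47 0a 0d 5b 02 0a 0d
           5c 03 04 00 bf ff 7d b3 81 45 00 00 00 00 50 04
           00 00 21 c9 00 00 00 00 00 00 00 00""",
}

FLAG_NAMES = [(0x01, "FIN"), (0x02, "SYN"), (0x04, "RST"),
              (0x08, "PSH"), (0x10, "ACK"), (0x20, "URG")]


def decode_tcp(frame: bytes) -> dict:
    """Pull the basic TCP header fields out of an Ethernet II / IPv4 frame."""
    ip = frame[14:]                 # skip the 14-byte Ethernet header
    ihl = (ip[0] & 0x0F) * 4        # IPv4 header length in bytes
    sport, dport, seq, ack, off_flags, window = struct.unpack_from(">HHIIHH", ip, ihl)
    flags = [name for bit, name in FLAG_NAMES if off_flags & bit]
    return {"sport": sport, "dport": dport, "seq": seq, "ack": ack,
            "flags": flags, "window": window}


decoded = {n: decode_tcp(bytes.fromhex(h)) for n, h in FRAMES.items()}
for number, fields in decoded.items():
    print(number, fields)
```

In particular, this lets you compare the acknowledgment number in the SYN+ACK with the sequence number of the SYN plus one.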
First of all, those aren't valid unicast MAC addresses: if the least-significant bit of the first octet is set (first octet & 0x1), it's a multicast MAC, and both f1 and ab have that bit set. See http://en.wikipedia.org/wiki/MAC_address
If you're not using fancy stuff on your side like custom tcp stack or raw sockets, I'd suspect the "proprietary TCP interface".
Has this ever worked with that client?
Does it work with other clients?
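The multicast check mentioned above comes down to one bit test. A minimal Python sketch, with a made-up helper name and assuming colon-separated notation:

```python
def is_multicast(mac: str) -> bool:
    """True if the group bit (lowest bit of the first octet) is set."""
    first_octet = int(mac.split(":")[0], 16)
    return bool(first_octet & 0x01)


# The two anonymized addresses from the capture
for mac in ("f1:f1:f1:00:03:09", "ab:ab:ab:60:10:89"):
    print(mac, is_multicast(mac))
```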