Weird pcap header of byte sequence 0a 0d 0d 0a created on Mac? - cross-platform

I have a PCAP file, created on a Mac with mergecap, that can be parsed on a Mac with Apple's libpcap but cannot be parsed on a Linux system. The combined file has an extra 16-byte header that contains 0a 0d 0d 0a 78 00 00 00 before the 4d 3c 2b 1a intro that's common in pcap files. Here is a hex dump:
0000000: 0a0d 0d0a 7800 0000 4d3c 2b1a 0100 0000 ....x...M<+.....
0000010: ffff ffff ffff ffff 0100 4700 4669 6c65 ..........G.File
0000020: 2063 7265 6174 6564 2062 7920 6d65 7267 created by merg
0000030: 696e 673a 200a 4669 6c65 313a 2037 2e70 ing: .File1: 7.p
0000040: 6361 7020 0a46 696c 6532 3a20 362e 7063 cap .File2: 6.pc
0000050: 6170 200a 4669 6c65 333a 2034 2e70 6361 ap .File3: 4.pca
0000060: 7020 0a00 0400 0800 6d65 7267 6563 6170 p ......mergecap
Does anybody know what this is? or how I can read it on a Linux system with libpcap?

I have a PCAP file
No, you don't. You have a pcap-ng file.
that can be parsed on a Mac with Apple's libpcap
libpcap 1.1.0 and later can also read some pcap-ng files (the pcap API only allows a file to have one link-layer header type, one snapshot length, and one byte order, so only pcap-ng files where all sections have the same byte order and all interfaces have the same link-layer header type and snapshot length are supported), and OS X Snow Leopard and later have a libpcap based on 1.1.x, so they can read those files.
(OS X Mountain Lion and later have tweaked libpcap to allow it to write pcap-ng files as well; the -P flag makes tcpdump write out pcap-ng files, with text comments attached to some outgoing packets indicating the process ID and process name of the process that sent them - pcap-ng allows text comments to be attached to packets.)
but cannot be parsed on a Linux system
Your Linux system probably has an older libpcap version. (Note: do not be confused by Debian and Debian derivatives calling the libpcap package "libpcap0.8" - they're not still using libpcap 0.8.)
combined file has an extra 16-byte header that contains 0a 0d 0d 0a 78 00 00 00
A pcap-ng file is a sequence of "blocks" that start with a 4-byte block type and a 4-byte length, both in the byte order of the host that wrote them.
They're divided into "sections", each one beginning with a "Section Header Block" (SHB). The block type for the SHB is 0x0a0d0d0a, which is byte-order-independent (so that you don't have to know the byte order to read the SHB) and contains carriage returns and line feeds, so that if the file is, for example, transferred between a UN*X system and a Windows system by a tool that thinks it's transferring a text file and "helpfully" tries to fix line endings, the SHB magic number will be damaged and it will be obvious that the file was corrupted by being transferred in that fashion; think of it as the equivalent of a shock indicator.
The 78 00 00 00 is the block length (0x78 = 120 bytes, read in the section's byte order); what follows it is the "byte-order magic" field, which is 0x1A2B3C4D (which is not the same as the 0xA1B2C3D4 magic number for pcap files), and which serves the same purposes as the pcap magic number, namely:
it lets code identify that the file is a pcap-ng file
it lets code determine the byte order of the section.
(No, you don't need to know the length before looking for the magic number; once you've found the magic number, you then check the length to make sure it's at least 28 and, if it's less than 28, you reject the block as not being valid.)
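A minimal sketch of that check in Python (my own illustration, not part of the original answer; "merged.pcap" stands in for whatever your capture file is called):

import struct

with open("merged.pcap", "rb") as f:
    header = f.read(12)

# The SHB block type is the same in either byte order.
if header[0:4] != bytes.fromhex("0a0d0d0a"):
    raise SystemExit("no Section Header Block at the start, so not a pcap-ng file")

# The byte-order magic tells us how to read every other field in the section.
bom = header[8:12]
if bom == bytes.fromhex("4d3c2b1a"):
    endian = "<"          # 0x1A2B3C4D stored little-endian, as in the dump above
elif bom == bytes.fromhex("1a2b3c4d"):
    endian = ">"          # big-endian section
else:
    raise SystemExit("bad byte-order magic")

block_len, = struct.unpack(endian + "I", header[4:8])
if block_len < 28:
    raise SystemExit("SHB length too small to be valid")

print("pcap-ng,", "little-endian" if endian == "<" else "big-endian",
      "section, SHB length", block_len)

For the file in the question this reports a little-endian section with an SHB length of 120 (0x78).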
Does anybody know what this is?
A (little-endian) pcap-ng file.
or how I can read it on a Linux system with libpcap?
Either read it on a Linux system with a newer version of libpcap (which may mean a newer version of whatever distribution you're using, or may just mean doing an update if that will get you a 1.1.0 or later version of libpcap), read it with Wireshark or TShark (which have their own library for reading capture files, which supports the native pcap and pcap-ng formats, as well as a lot of other formats), or download a newer version of libpcap from tcpdump.org, build it, install it, and then build whatever other tools need to read pcap-ng files with that version of libpcap rather than the one that comes with the system.
Newer versions of Wireshark write pcap-ng files by default, including in tools such as mergecap; you can get them to write pcap files with a flag argument of -F pcap.

iMX6 get U-Boot to temporarily boot another U-Boot

First some background:
We have the following setup in our iMX6-based embedded system. There are two U-Boot partitions and two system (Linux) partitions. Currently we use only the first U-Boot partition and it uses a standard method for selecting, running and (if need be) rolling back the system partitions.
We are now looking into a similar scheme for upgrading U-Boot itself (this will happen very rarely but we do want the ability to do this without having to return the devices to base).
However, this is more fraught with danger because, once you tell the iMX6 device to boot from the alternate U-Boot partition, that's it - there's no U-Boot/watchdog combo that will revert to the previous one if boot fails, so a bad update runs a serious risk of bricking the device until we can return it to base for repair (a costly option which is why we're trying to mitigate it as much as possible).
The method chosen is a two-step U-Boot install procedure, consisting of 'write' and 'activate'. It relies on our ability to successfully figure out which U-Boot partition will be run if the device reboots (the selected one) and which is currently being run (the booted one). We've got this bit sorted out already.
But the bit that we're missing is the ability for UBoot to transfer control to the other UBoot partition under some circumstances. We've got it doing different actions based solely on the UBoot environment, as follows:
First, mmcboot has a prefix added so that it checks for the control transfer; specifically, it's set to run ub_xfer_chk ; <original content of mmcboot>.
Secondly, we have a variable ub_xfer_flag normally set to 0.
Thirdly, we have the checking function ub_xfer_chk, defined as:
if test ${ub_xfer_flag} -eq 1 ; then
    echo Soft-booting other UBoot...
    setenv ub_xfer_flag 0
    saveenv
    weave_magic
fi
The weave_magic code is where we are having trouble :-) The idea is that this will load the other UBoot partition into memory (at our CONFIG_SYS_TEXT_BASE of 0x17800000) and execute it as if the actual device had done it.
We've tested the meat of this solution by using reset in place of weave_magic and it successfully restarts the device once, so we're certain we can make it safe.
My specific question then is: how can I convince U-Boot to load a second copy from another partition and run it?
The two UBoot partitions live in the /dev/mmcblk3boot0 and /dev/mmcblk3boot1 devices accessible from the system partition and are 2M files, including the 1K lead-in header and a fair bit of padding at the end.
Update:
We have actually had some success and managed to load an IMX image from the boot partition with the command:
ext4load mmc ${mmcdev}:${mmcpart} 0x17800000 ${bootdir}/u-boot.imx
but, when trying to execute it with:
go 0x17800000
we get an illegal instruction and immediate reboot:
pc : [<17800070>] lr : [<4ff83c64>]
sp : 4f579ac0 ip : 00000030 fp : 4f57be58
r10: 00000002 r9 : 4f579efc r8 : 4ffbe2b0
r7 : 4f57be68 r6 : 17800000 r5 : fffff200 r4 : 000002cc
r3 : 17800000 r2 : 4f57be6c r1 : 4f57be6c r0 : 00000000
Flags: nZCv IRQs off FIQs off Mode SVC_32
Resetting CPU ...
So I'm guessing that's not executable code at the start of that file. Any ideas on where to go from here?
The actual code in the IMX file is not at the beginning. You can discover this fact by using the excellent on-line disassembler with an ARMv5, little-endian, no-Thumb architecture to figure out that the bytes at the beginning give you invalid and/or not-very-sensible code:
ldtdmi a1, [a1], -a2 ; <UNPREDICTABLE>
strne a1, [a1, a1]
andeq a1, a1, a1
ldrbne pc, [pc, -ip, lsr #8]! ; <UNPREDICTABLE>
In any case, the data at the start of the IMX file is known to be header information (the d1 at the start is a "magic" marker indicating the IVT header, and there should also be a DCD block after that). However, even beyond the IVT and DCD blocks (based on their purported lengths in the header fields), the code is not sensible.
However, there's viable information at offset 0xc00 following a large chunk of 0x00 bytes:
00000be0: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00000bf0: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00000c00: 0f00 00ea 14f0 9fe5 14f0 9fe5 14f0 9fe5 ................
00000c10: 14f0 9fe5 14f0 9fe5 14f0 9fe5 14f0 9fe5 ................
Putting the hex bytes at offset 0xc00 into the disassembler, and adjusting for areas that are skipped by branches, shows both valid and sane ARM code.
And, indeed, stripping the IMX file with:
dd if=u-boot.imx bs=1 skip=3072 of=ub-at-c00.imx
should give you a file you can boot with:
ext4load mmc ${mmcdev}:${mmcpart} 0x17800000 ${bootdir}/ub-at-c00.imx
go 0x17800000
When we do this, it outputs:
U-Boot 2014.04 (Nov 07 2018 - 19:05:32)
CPU: Freescale i.MX6Q rev1.5 at 792 MHz
CPU: Temperature 32 C, calibration data: 0x5764e169
Reset cause: unknown reset
Board: DTI BRD0208 (Spitfire I) 05/01/2017
I2C: ready
DRAM: 1 GiB
We know this is the newer UBoot simply because the normal one we're using outputs an October date rather than a November one.
Unfortunately, it hangs at that point, with the watchdog timer eventually kicking in and rebooting back to the original UBoot, but I suspect that has to do with UBoot not liking the current state of the device (i.e., it doesn't like initialising it twice).
So we'll have to figure out how to convince it to do so but at least we've gotten it booting another copy of itself, which is what the question was about.
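As an aside, if you'd rather not hard-code the 3072 (0xc00) offset, a small script can locate the start of the code by searching for the repeated 14 f0 9f e5 ("ldr pc, [pc, #20]") vector-table entries visible in the dump above. This is only a heuristic sketch of my own, not part of the procedure described here, and the file names are placeholders:

import sys

# The image proper starts with a branch (0f 00 00 ea) followed by the classic
# ARM exception vectors, each encoded as 14 f0 9f e5 ("ldr pc, [pc, #20]").
VECTORS = bytes.fromhex("14f09fe5") * 3

with open("u-boot.imx", "rb") as f:
    data = f.read()

pos = data.find(VECTORS)
if pos < 4:
    sys.exit("vector-table pattern not found")

start = pos - 4                       # back up over the leading branch
print("probable code start at offset 0x%x" % start)

with open("ub-stripped.imx", "wb") as out:
    out.write(data[start:])

For the image above this should report 0xc00, matching the dd skip=3072 used earlier.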

Why do 7zip and gzip add 0x0A at the end of gzip compressed data

Wikipedia states (wrongly, apparently, at least for real-world files) that the gzip format demands that the last 4 bytes are the uncompressed size (mod 4 GB).
I have found a credible answer on SO that explains that sometimes there is junk at the end of the gzip data, so you cannot rely on the last 4 bytes being the size.
Unfortunately, this matches my experiments (both the terminal gzip and the 7zip archiver add a 0x0A byte to my small test example).
My question is: what is the reason for gzip and 7zip doing this?
Obviously they do it like that because they are written to do that, but I wonder about the motivation to break the format specification.
I know that some formats have padding requirements, but I found nothing for gzip.
Edit: the process:
echo "Testing rocks:) Debugging sucks :(" >> test_data
rm test_data.gz
gzip -6 test_data
vim -c "noautocmd edit test_data.gz"
in vim: :%!xxd -c 4
and the last 5 bytes are the size (35 = 0x23, stored little-endian as 23 00 00 00) followed by 0x0a.
The 7zip process is just using the GUI to make an archive.
Your testing process is wrong. Vim is what adds 0x0A to the end of the file. Here is a simpler test, using xxd directly (why did you even use Vim?):
echo "Testing rocks:) Debugging sucks :(" >> test_data
gzip -6 test_data
xxd -c 4 test_data.gz
Output:
0000000: 1f8b 0808 ....
0000004: 453c 5d59 E<]Y
0000008: 0003 7465 ..te
000000c: 7374 5f64 st_d
0000010: 6174 6100 ata.
0000014: 0b49 2d2e .I-.
0000018: c9cc 4b57 ..KW
000001c: 28ca 4fce (.O.
0000020: 2eb6 d254 ...T
0000024: 7049 4d2a pIM*
0000028: 4d4f 0789 MO..
000002c: 1497 0245 ...E
0000030: 14ac 34b8 ..4.
0000034: 00f4 a724 ...$
0000038: 5623 0000 V#..
000003c: 00 .
As you can see, there is no 0x0A at the end. Vim adds a newline to the end of a file by default if one is not present (you can avoid that by editing in binary mode, e.g. vim -b or :set binary noeol).
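If you want to confirm this programmatically rather than by eyeballing the dump, here is a small check of my own (not part of the original answer): for a clean single-member gzip file, the last 8 bytes are the CRC-32 and ISIZE (uncompressed size mod 2^32), both little-endian.

import struct
import zlib

with open("test_data.gz", "rb") as f:
    data = f.read()

# gzip trailer: 4-byte CRC-32 followed by 4-byte ISIZE, both little-endian.
crc32, isize = struct.unpack("<II", data[-8:])
print("stored CRC-32: 0x%08x" % crc32)
print("stored size:   %d" % isize)
print("actual size:   %d" % len(zlib.decompress(data, 16 + zlib.MAX_WBITS)))

With the file produced by the commands above, both size lines report 35, i.e. the ISIZE really is the last 4 bytes as long as nothing has appended junk to the file.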

Decompress gzip file that contains multiple blocks

I have a gzip file that has multiple blocks. Every block starts with
1F 8B 08
And ends with
00 00 FF FF
I tried to decompress the file using 7-Zip and the gzip tool in Linux, but I always get an error saying that the file is invalid.
So I wrote this Python script:
import zlib

CHUNKSIZE = 1

f = open("file.gz", "rb")
buffer = f.read(CHUNKSIZE)
data = b""
r = CHUNKSIZE
d = zlib.decompressobj(16 + zlib.MAX_WBITS)
while buffer:
    data += d.decompress(buffer)   # raises zlib.error when it reaches the second gzip header
    print(r)                       # number of bytes fed so far
    buffer = f.read(CHUNKSIZE)
    r = r + CHUNKSIZE
data += d.flush()
f.close()
I have noticed that when it reaches the header of the second block,
00 00 00 FF FF 1F 8B 08
at the point between FF and 1F,
the script returns
zlib.error: Error -3 while decompressing data: invalid block type
I made the chunk size 1 so that I would know exactly where the problem is.
I know that the problem is not in the file because I have multiple files constructed the same way and they show exactly the same error.
I know that the problem is not in the file because I have multiple files constructed the same way and they show exactly the same error.
The conclusion is not that the problem is not in the file, but rather that the problem is in all of your files. Someone either inadvertently or deliberately constructed invalid gzip files. It looks like they did that by using Z_SYNC_FLUSH or Z_FULL_FLUSH instead of Z_FINISH to end each stream before starting another faux gzip stream. A gzip stream ends with a last block followed by an eight-byte gzip trailer containing two check values on the integrity of the uncompressed data.
You can nevertheless continue with decompression, though without the comfort of any integrity checking of the data, by simply picking up with a new instance of decompressobj when you get an error and see a new gzip header, 1f 8b 08.
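For example, here is a rough sketch of that approach (my own illustration, not code from the answer). It splits the file at every 1f 8b 08 occurrence, which can in principle also match bytes inside compressed data, so treat it purely as a salvage tool:

import zlib

MAGIC = b"\x1f\x8b\x08"

with open("file.gz", "rb") as f:
    data = f.read()

# Find every apparent gzip header; each piece gets a fresh decompressor.
starts = []
pos = data.find(MAGIC)
while pos != -1:
    starts.append(pos)
    pos = data.find(MAGIC, pos + 1)
starts.append(len(data))

output = b""
for begin, end in zip(starts, starts[1:]):
    d = zlib.decompressobj(16 + zlib.MAX_WBITS)
    try:
        output += d.decompress(data[begin:end])
        output += d.flush()
    except zlib.error:
        pass      # unterminated stream: keep whatever was recovered before the error

print(len(output), "bytes recovered (integrity not verified)")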
More importantly you should locate and contact the source of these files and say "Hey, WTF?"

Apache mod_speling falsely "correcting" URLs?

I've been tasked with moving an old dynamic website from a Windows server to Linux. The site was initially written with no regard to character case. Some filenames were all upper-case, some lower-case, and some mixed. This was never a problem in Windows, of course, but now we're moving to a case-sensitive file system.
A quick find/rename command (thanks to another tutorial) got the filenames all to lowercase.
However, many of the URL references in the code still point to the mixed-case filenames, so I enabled mod_speling to overcome this issue. It seems to work OK for the most part, with the exception of one page: I have a file named haematobium.html and, every time a link points to .../haematobium.html, it gets rewritten as .../hæmatobium.html in the browser.
I don't know how this strange character made its way into the filename in the first place, but I've corrected the code in the HTML document to now link to haematobium.html, then renamed the haematobium.html file itself to match.
When I request .../haematobium.html in Chrome, it "corrects" the URL to .../hæmatobium.html in the address bar and shows an error saying "The requested URL .../hæmatobium.html was not found on this server."
In IE9, I'm prompted for the login (this is a .htaccess-protected page); I enter it, and then it forwards the URL to .../h%C3%A6matobium.html, which again doesn't load.
In my frustration I even copied haematobium.html to both hæmatobium.html and hæmatobium.html, still, none of the three pages actually load.
So my question: I read somewhere that mod_speling tries to "learn" misspelled URLs. Does it actually rename files (is that where the odd character might have come from)? Does it keep a cache of what's been called for, and what it was forwarded to (a cache I could clear)?
PS. there are also many mixed-case references to MySQL database tables and fields, but that's a whole 'nother nightmare.
[Cannot comment yet, therefore answering.]
Your question doesn't make it entirely clear which of the two names (two characters ae [ASCII], or one ligature character æ [Unicode]) for haematobium.html actually exists in your Apache's file system.
Try the following in your shell:
$ echo -n h*matobium.html | hd
The output should be either one of the following two alternatives. This is ASCII, with 61 and 65 for a and e, respectively:
00000000 68 61 65 6d 61 74 6f 62 69 75 6d 2e 68 74 6d 6c |haematobium.html|
00000010
And this is Unicode, with c3 a6 for the single character æ:
00000000 68 c3 a6 6d 61 74 6f 62 69 75 6d 2e 68 74 6d 6c |h..matobium.html|
00000010
I would recommend using the ASCII version; it makes life considerably easier.
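If you prefer to check from a script instead of the shell, something along these lines (my own suggestion, run from the directory that holds the file) prints the exact bytes of every candidate name:

import os

for name in os.listdir("."):
    if name.endswith("matobium.html"):
        # os.fsencode() recovers the raw bytes as stored in the file system.
        print(os.fsencode(name).hex(), repr(name))

Bytes 61 65 in the output correspond to the ASCII "ae" spelling, c3 a6 to the Unicode ligature.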
Now to your actual question. mod_speling neither "learns", nor renames files, nor caches its data. Any caching is done either by your browsers or by proxies in between your browsers and the server.
It's actually best practice to test these cases with command-line tools like wget or curl, which should already be available or easily installable on any Linux system.
Use wget -S or curl -i to actually see the response headers sent by your web server.
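If you'd rather script that check as well, the same information is available from Python's standard library (a sketch; the host name and path are placeholders for your own virtual host, and note that urlopen follows redirects, so curl -i remains the most direct way to see the mod_speling 301 itself):

import urllib.error
import urllib.request

req = urllib.request.Request("http://example.com/haematobium.html", method="HEAD")
try:
    with urllib.request.urlopen(req) as resp:
        print(resp.status, resp.reason)
        for name, value in resp.getheaders():
            print(f"{name}: {value}")
except urllib.error.HTTPError as err:
    # 4xx/5xx responses also carry headers worth inspecting.
    print(err.code, err.reason)
    for name, value in err.headers.items():
        print(f"{name}: {value}")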

How to save and retrieve string with accents in redis?

I can't manage to set and retrieve strings with accents in my Redis DB.
Chars with accents come back encoded; how can I retrieve them as they were set?
redis> set test téléphone
OK
redis> get test
"t\xc3\xa9l\xc3\xa9phone"
I know this has already been asked
(http://stackoverflow.com/questions/6731450/redis-problem-with-accents-utf-8-encoding) but there is no detailed answer.
The Redis server itself stores all data as binary objects, so it is not dependent on the encoding. The server will just store what is sent by the client (including UTF-8 chars).
Here are a few experiments:
$ echo téléphone | hexdump -C
00000000 74 c3 a9 6c c3 a9 70 68 6f 6e 65 0a |t..l..phone.|
c3 a9 is the UTF-8 representation of the 'é' char.
$ redis-cli
> set t téléphone
OK
> get t
"t\xc3\xa9l\xc3\xa9phone"
Actually the data is correctly stored in the Redis server. However, when it is launched in a terminal, the Redis client interprets the output and applies the sdscatrepr function to transform non-printable chars (whose definition is locale-dependent, and may be broken for multibyte chars anyway).
A simple workaround is to launch redis-cli with the 'raw' option:
$ redis-cli --raw
> get t
téléphone
Your own application will probably use one of the client libraries rather than redis-cli, so it should not be a problem in practice.
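For instance, with the redis-py client (my own example; the answer does not name a specific library), asking for decoded responses gives the string back intact:

import redis   # the redis-py package: pip install redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)
r.set("test", "téléphone")
print(r.get("test"))    # prints: téléphone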