I have used pgpdump on an encrypted file (via BouncyCastle) to get more information about it and found several lines about partial start, partial continue and partial end.
So I was wondering what exactly this was describing. Is it some sort of fragmentation of plain text?
Furthermore, what does the bit count after the RSA algorithm stand for? In this case it's 1022 bits, but I've seen files with 1023 and 1024 bits.
Partial body lengths are pretty well explained by this tumblr post. OpenPGP messages are composed of packets of a given length. Sometimes, for large outputs (or, in the case of packets from GnuPG, short messages), there will be partial body lengths, which specify that another header will show up to tell the reader to continue reading. From the post:
A partial body length tells the parser: “I know there are at least N more bytes in this packet. After N more bytes, there will be another header to tell if how many more bytes to read.” The idea being, I guess, that you can encrypt a stream of data as it comes in without having to know when it ends. Maybe you are PGP encrypting a speech, or some off-the-air TV. I don’t know. It can be infinite length — you can just keep throwing more partial body length headers in there, each one can handle up to a gigabyte in length. Every gigabyte it informs the parser: “yeah, there’s more coming!”
So in the case of your screenshot, pgpdump reads 8192 bytes, then encounters another header that says to read another 2048 bytes. After those 2048 bytes, it hits another header for 1037 bytes, and so on until the last continue header. The message ends 489 bytes after that.
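As a rough illustration of how a parser follows those headers, here is a minimal sketch of the new-format length decoding described in RFC 4880, section 4.2.2. The function names read_body_length and read_body are made up for this example and are not part of pgpdump or BouncyCastle.

```python
def read_body_length(data: bytes, pos: int = 0):
    """Decode one new-format OpenPGP length header starting at data[pos].

    Returns (length, is_partial, next_pos).
    """
    first = data[pos]
    if first < 192:                       # one-octet length
        return first, False, pos + 1
    if first < 224:                       # two-octet length
        return ((first - 192) << 8) + data[pos + 1] + 192, False, pos + 2
    if first == 255:                      # five-octet length (4-byte big endian)
        return int.from_bytes(data[pos + 1:pos + 5], "big"), False, pos + 5
    # 224..254: partial body length, encoded as a power of two
    return 1 << (first & 0x1F), True, pos + 1


def read_body(data: bytes, pos: int = 0) -> bytes:
    """Concatenate all chunks of one packet body, following partial headers."""
    body = bytearray()
    while True:
        length, is_partial, pos = read_body_length(data, pos)
        body += data[pos:pos + length]
        pos += length
        if not is_partial:                # a non-partial header ends the packet
            return bytes(body)
```

A header octet of 0xED decodes to a partial length of 8192 bytes and 0xEB to 2048 bytes, which matches the first two chunk sizes mentioned above.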
The 1022 bits is the length of the public modulus. It will always be close to 1024 bits (if you have a 1024-bit key), but it can end up slightly shorter depending on the initial selection of the RSA parameters. Such keys are still called "1024-bit keys", even though they are slightly shorter than that.
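To see why the modulus length can vary, here is a toy sketch with 5-bit primes instead of real key sizes. The helper modulus_bits is hypothetical. The product of two k-bit numbers has either 2k-1 or 2k bits, so the exact length depends on which primes were picked.

```python
from itertools import combinations


def modulus_bits(p: int, q: int) -> int:
    """Bit length of the RSA modulus n = p * q."""
    return (p * q).bit_length()


# All primes that are exactly 5 bits long (16..31).
five_bit_primes = [17, 19, 23, 29, 31]

for p, q in combinations(five_bit_primes, 2):
    # Each product has either 9 or 10 bits, depending on the pair.
    print(p, q, p * q, modulus_bits(p, q))
```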
After lengthy research on the web, I'm still not able to find any code or algorithm that shows how shared-key authentication works in WEP, and in particular how the response is computed.
The general concept is clear:
The mobile station (MB) sends a connect request to the access point (AP).
The AP replies with a challenge.
The MB encrypts this challenge (it has to prove that it has the shared key) and sends it back to the AP.
The AP verifies the ciphertext and allows access.
Now:
The challenge is 128 bytes.
How is the response computed? When I open the traffic in Wireshark, the response is usually 136 bytes, meaning that the encryption also includes something else.
This should be something like:
RC4 ( IV + challenge + CRC32(challenge))
Where can I verify if this expression is the correct one?
Furthermore:
the IV is 6 hex digits (so 3 bytes), meaning that maybe there is an extension of one byte. How is this extension computed?
the challenge is 128 bytes
is the CRC-32 computed on the challenge text only? Does it include also the IV?
Could you please refer to any official document where I can find the complete specification of the fields involved in the computation?
Thanks
I'm currently using Erlang for a big project, but I have a question regarding the proper way to proceed.
I receive bytes over a TCP socket. The bytes follow a fixed protocol, and the sender is a Python client. The Python client uses class inheritance to create bytes from the objects.
Now I would like to take the bytes (in Erlang) and convert them to their equivalent messages; they all have a common message header.
How can I do this as generically as possible in Erlang?
Kind Regards,
Me
Use pattern matching/binary header consumption with Erlang's binary syntax. But you will need to know either exactly what bytes or bits you are expecting to receive, or the field sizes in bytes or bits.
For example, let's say that you are expecting a string of bytes that will either begin with the equivalent of the ASCII strings "PUSH" or "PULL", followed by some other data you will place somewhere. You can create a function head that matches those, and captures the rest to pass on to a function that does "push()" or "pull()" based on the byte header:
operation_type(<<"PUSH", Rest/binary>>) -> push(Rest);
operation_type(<<"PULL", Rest/binary>>) -> pull(Rest).
The bytes after the first four will now be in Rest, leaving you free to interpret whatever subsequent headers or data remain in turn. You could also match on the whole binary:
operation_type(Bin = <<"PUSH", _/binary>>) -> push(Bin);
operation_type(Bin = <<"PULL", _/binary>>) -> pull(Bin).
In this case the "_" variable works like it always does -- you're just checking for the lead, essentially peeking the buffer and passing the whole thing on based on the initial contents.
You could also skip around in it. Say you knew you were going to receive a binary with 4 bytes of fluff at the front, 6 bytes of type data, and then the rest you want to pass on:
filter_thingy(<<_:4/binary, Type:6/binary, Rest/binary>>) ->
    % Do stuff with Rest based on Type. handle_type/2 is only a placeholder
    % for your own dispatch function.
    handle_type(Type, Rest).
It becomes very natural to split binaries in function headers (whether the data equates to character strings or not), letting the "Rest" fall through to appropriate functions as you go along. If you are receiving Python pickle data or something similar, you would want to write the parsing routine in a recursive way, so that the conclusion of each data type returns you to the top to determine the next type, with an accumulated tree that represents the data read so far.
I only covered 8-bit bytes above, but there is also a pure bitstring syntax, which lets you go as far into the weeds with bits and bytes as you need with the same ease of syntax. Matching is a real lifesaver here.
Hopefully this informed more than confused. Binary syntax in Erlang makes this the most pleasant binary parsing environment in a general programming language I've yet encountered.
http://www.erlang.org/doc/programming_examples/bit_syntax.html
I use AES128 crypto in CTR mode for encryption, implemented for different clients (Android/Java and iOS/ObjC). The 16 byte IV used when encrypting a packet is formatted like this:
<11 byte nonce> | <4 byte packet counter> | 0
The packet counter (included in a sent packet) is increased by one for every packet sent. The last byte is used as block counter, so that packets with fewer than 256 blocks always get a unique counter value. I was under the assumption that the CTR mode specified that the counter should be increased by 1 for each block, using the 8 last bytes as counter in a big endian way, or that this at least was a de facto standard. This also seems to be the case in the Sun crypto implementation.
I was a bit surprised when the corresponding iOS implementation (using CommonCryptor, iOS 5.1) failed to decode every block except the first when decoding a packet. It seems that CommonCryptor defines the counter in some other way. The CommonCryptor can be created in both big endian and little endian mode, but some vague comments in the CommonCryptor code indicate that this is not (or at least has not been) fully supported:
http://www.opensource.apple.com/source/CommonCrypto/CommonCrypto-60026/Source/API/CommonCryptor.c
/* corecrypto only implements CTR_BE. No use of CTR_LE was found so we're marking
this as unimplemented for now. Also in Lion this was defined in reverse order.
See <rdar://problem/10306112> */
By decoding block by block, each time setting the IV as specified above, it works nicely.
My question: is there a "right" way of implementing the CTR/IV mode when decoding multiple blocks in a single go, or should I expect interoperability problems when using different crypto libs? Is CommonCrypto bugged in this regard, or is it just a question of implementing the CTR mode differently?
The definition of the counter is (loosely) specified in NIST recommendation SP 800-38A, Appendix B. Note that NIST only specifies how to use CTR mode with regard to security; it does not define one standard algorithm for the counter.
To answer your question directly, whatever you do you should expect the counter to be incremented by one each time. The counter should represent a 128 bit big endian integer according to the NIST specifications. It may be that only the least significant (rightmost) bits are incremented, but that will usually not make a difference unless you pass the 2^32 - 1 or 2^64 - 1 value.
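As a minimal sketch of that interpretation, the following treats the whole 16-byte IV as a 128-bit big endian integer and adds the block index. The IV layout is the one from the question; build_iv and counter_block are hypothetical helper names, and a real library may only increment the rightmost bits.

```python
def build_iv(nonce: bytes, packet_counter: int) -> bytes:
    """<11 byte nonce> | <4 byte packet counter> | 0, as in the question."""
    if len(nonce) != 11:
        raise ValueError("nonce must be 11 bytes")
    return nonce + packet_counter.to_bytes(4, "big") + b"\x00"


def counter_block(iv: bytes, block_index: int) -> bytes:
    """Counter value for a given block: IV + block_index as a 128-bit big endian integer."""
    value = (int.from_bytes(iv, "big") + block_index) % (1 << 128)
    return value.to_bytes(16, "big")


iv = build_iv(bytes(11), 7)
for i in range(3):
    print(i, counter_block(iv, i).hex())
```

With this layout, block 256 of packet 7 gets the same counter value as block 0 of packet 8, which is why the scheme in the question only stays unique for packets with fewer than 256 blocks.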
For the sake of compatibility you could decide to use the first (leftmost) 12 bytes as a random nonce, leave the remaining ones at zero, and let the CTR implementation do the increments. In that case you simply use a 96 bit / 12 byte random value at the start, and there is no need for a packet counter.
You are, however, limited to 2^32 * 16 bytes of plaintext before the counter uses up all the available bits. It is implementation specific whether the counter returns to zero or whether the nonce itself is included in the counter, so you may want to limit yourself to messages of 68,719,476,736 bytes = ~68 GB (yes that's base 10, Giga means 1,000,000,000).
There are two things to keep in mind:
- because of the birthday problem, a nonce collision becomes likely after about 2^48 messages (48 = 96 / 2); the nonce is required for each message, not each block, so you should limit the number of messages;
- if some attacker tricks you into decrypting 2^32 packets for the same nonce, you run out of counter.
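The two limits above come down to simple arithmetic. As a quick check, assuming a 96-bit random nonce and a 32-bit block counter (the constant names are only for illustration):

```python
BLOCK_BYTES = 16     # AES block size in bytes
COUNTER_BITS = 32    # bits left for the block counter
NONCE_BITS = 96      # bits used for the random nonce

# Largest message before the 32-bit counter runs out.
max_message_bytes = (2 ** COUNTER_BITS) * BLOCK_BYTES

# Rough number of messages after which a random nonce collision becomes likely.
birthday_bound = 2 ** (NONCE_BITS // 2)

print(max_message_bytes)   # 68719476736
print(birthday_bound)      # 281474976710656
```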
In case this is still incompatible (test!) then use the initial 8 bytes as nonce. Unfortunately that does mean that you need to limit the number of messages because of the birthday problem.
Further investigation sheds some light on the CommonCrypto problem:
In iOS 6.0.1 the little endian option is now unimplemented. Also, I have verified that CommonCrypto is bugged in that the CCCryptorReset method does not in fact change the IV as it should, instead using the pre-existing IV. The behaviour in 6.0.1 is different from 5.x.
This is potentially a security risk if you initialize CommonCrypto with a nulled IV and reset it to the actual IV right before encrypting. This would lead to all your data being encrypted with the same (nulled) IV, and multiple streams (that perhaps should have different IVs but use the same key) would leak data via a simple XOR of packets with the corresponding counter.
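To illustrate that leak, here is a small self-contained sketch. It does not use CommonCrypto or AES at all. The keystream function is a toy stand-in built from SHA-256, used only to show that two messages encrypted with the same key and the same IV reveal the XOR of their plaintexts.

```python
import hashlib


def keystream(key: bytes, iv: bytes, length: int) -> bytes:
    """Toy keystream (SHA-256 of key, IV and a counter); a stand-in for AES-CTR."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + iv + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(out[:length])


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


key = b"k" * 16
null_iv = bytes(16)                 # the IV that was never actually replaced
p1 = b"attack at dawn!!"
p2 = b"retreat at nine!"

c1 = xor(p1, keystream(key, null_iv, len(p1)))
c2 = xor(p2, keystream(key, null_iv, len(p2)))

# The keystream cancels out: an eavesdropper learns p1 XOR p2 without the key.
leak = xor(c1, c2)
print(leak == xor(p1, p2))
```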
I've been doing some preliminary research in the area of message digests. Specifically collision attacks of cryptographic hash functions such as MD5 and SHA-1, such as the Postscript example and X.509 certificate duplicate.
From what I can tell in the case of the postscript attack, specific data was generated and embedded within the header of the postscript file (which is ignored during rendering) which brought about the internal state of the md5 to a state such that the modified wording of the document would lead to a final MD value equivalent to the original postscript file.
The X.509 attack took a similar approach, whereby data was injected within the comment/whitespace sections of the certificate.
Ok so here is my question, and I can't seem to find anyone asking this question:
Why isn't the length of ONLY the data being consumed added as a final block to the MD calculation?
In the case of X.509 - Why is the whitespace and comments being taken into account as part of the MD?
Wouldn't a simple process such as one of the following be enough to resolve the proposed collision attacks:
MD(M + |M|) = xyz
MD(M + |M| + |M| * magicseed_0 +...+ |M| * magicseed_n) = xyz
where :
M : is the message
|M| : size of the message
MD : is the message digest function (eg: md5, sha, whirlpool etc)
xyz : is the pairing of the actual message digest value for the message M and |M|. <M,|M|>
magicseed_{i}: Is a set of random values generated with seed based on the internal-state prior to the size being added.
This technique should work, as to date all such collision attacks rely on adding more data to the original message.
In short, the level of difficulty involved in generating a collision message such that:
It not only generates the same MD
But is also comprehensible/parsable/compliant
and is also the same size as the original message,
is immensely difficult if not near impossible. Has this approach ever been discussed? Any links to papers etc would be nice.
Further Question: What is the lower bound for collisions of messages of common length for a hash function H chosen randomly from U, where U is the set of universal hash functions ?
Is it 1/N (where N is 2^(|M|)) or is it greater? If it is greater, that implies there is more than 1 message of length N that will map to the same MD value for a given H.
If that is the case, how practical is it to find these other messages? Brute force would be O(2^N); is there a method with time complexity lower than brute force?
Can't speak for the rest of the questions, but the first one is fairly simple: adding length data to the input of the md5, at any stage of the hashing process (1st block, Nth block, final block), just changes the output hash. You couldn't retrieve that length from the output hash string afterwards. It's also quite conceivable that a collision could be produced from another string with the exact same length in the first place, so saying "the original string was 17 bytes" is meaningless, because the colliding string could also be 17 bytes.
e.g.
md5("abce(17bytes)fghi") = md5("abdefghi<long sequence of text to produce collision>")
is still possible.
In the case of X.509 certificates specifically, the "comments" are not comments in the programming language sense: they are simply additional attributes with an OID that indicates they are to be interpreted as comments. The signature on a certificate is defined to be over the DER representation of the entire tbsCertificate ('to be signed' certificate) structure which includes all the additional attributes.
Hash function design is pretty deep theory, though, and might be better served on the Theoretical CS Stack Exchange.
As @Marc points out, though, as long as more bits can be modified than the output of the hash function contains, then by the pigeonhole principle a collision must exist for some pair of inputs. Because cryptographic hash functions are in general designed to behave pseudo-randomly over their inputs, collisions will tend toward being uniformly distributed over possible inputs.
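The pigeonhole argument is easy to try out on a deliberately weakened hash. The sketch below truncates MD5 to 2 bytes (an assumption made only to keep the search small; it says nothing about how hard full MD5 collisions are) and searches among inputs that all have the same length. The function names are made up for this example.

```python
import hashlib


def truncated_md5(data: bytes, nbytes: int = 2) -> bytes:
    """First nbytes of the MD5 digest: a toy hash with a tiny output space."""
    return hashlib.md5(data).digest()[:nbytes]


def find_equal_length_collision(nbytes: int = 2):
    """Return two different 4-byte messages with the same truncated digest."""
    seen = {}
    i = 0
    while True:
        msg = i.to_bytes(4, "big")          # every candidate is exactly 4 bytes
        tag = truncated_md5(msg, nbytes)
        if tag in seen:
            return seen[tag], msg
        seen[tag] = msg
        i += 1


a, b = find_equal_length_collision()
print(a.hex(), b.hex(), truncated_md5(a).hex())
```

With a 16-bit output there are only 65,536 possible digests, so the loop must find a repeat after at most 65,537 inputs, all of equal length.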
EDIT: Incorporating the message length into the final block of the hash function would be equivalent to appending the length of everything that has gone before to the input message, so there's no real need to modify the hash function to do this itself; rather, specify it as part of the usage in a given context. I can see where this would make some types of collision attacks harder to pull off, since if you change the message length there's a changed field "downstream" of the area modified by the attack. However, this wouldn't necessarily impede the X.509 intermediate CA forgery attack since the length of the tbsCertificate is not modified.
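In other words, the length can be added by the caller rather than by the hash function. A minimal sketch, assuming an 8-byte big endian length encoding (that choice and the helper name digest_with_length are arbitrary for illustration):

```python
import hashlib


def digest_with_length(message: bytes) -> str:
    """MD(M + |M|): hash the message followed by its length as 8 big endian bytes."""
    return hashlib.md5(message + len(message).to_bytes(8, "big")).hexdigest()


print(digest_with_length(b"abc"))
print(hashlib.md5(b"abc").hexdigest())   # differs from the length-suffixed digest
```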
I'm trying to encrypt some data using a public key derived from the exchange key pair made with the CALG_RSA_KEYX key type. I determined the block size was 512 bits using CryptGetKeyParam with KP_BLOCKLEN. It seems the maximum number of bytes I can feed CryptEncrypt is 53 (424 bits), for which I get an encrypted length of 64 back. How can I determine how many bytes I can feed into CryptEncrypt? If I feed in more than 53 bytes, the call fails.
RSA using the usual PKCS#1 v1.5 mode can encrypt a message that is at most k-11 bytes, where k is the length of the modulus in bytes. So a 512 bit key can encrypt up to 53 bytes and a 1024 bit key can encrypt up to 117 bytes.
RSA using OAEP can encrypt a message up to k-2*hLen-2, where k is the modulus byte-length and hLen is the length of the output of the underlying hash-function. So using SHA-1, a 512 bit key can encrypt up to 22 bytes and a 1024 bit key can encrypt up to 86 bytes.
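The two formulas above can be checked with a few lines of Python. This is just the arithmetic, not a call into CryptoAPI. The function names are made up for the example, and hashlib is used only to look up the digest size.

```python
import hashlib


def max_pkcs1_v15_bytes(modulus_bits: int) -> int:
    """Largest message for PKCS#1 v1.5 encryption: k - 11."""
    k = (modulus_bits + 7) // 8          # modulus length in bytes
    return k - 11


def max_oaep_bytes(modulus_bits: int, hash_name: str = "sha1") -> int:
    """Largest message for OAEP: k - 2*hLen - 2."""
    k = (modulus_bits + 7) // 8
    h_len = hashlib.new(hash_name).digest_size
    return k - 2 * h_len - 2


for bits in (512, 1024):
    print(bits, max_pkcs1_v15_bytes(bits), max_oaep_bytes(bits))
```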
You should not normally use an RSA key to encrypt your message directly. Instead you should generate a random symmetric key (e.g. an AES key), encrypt your message with the symmetric key, encrypt the key with the RSA key and transmit both encryptions to the recipient. This is usually called hybrid encryption.
EDIT: Although this response is marked as accepted by the OP, please see Rasmus Faber's response instead, as it is a much better response. Posted 24 hours later, Rasmus's response corrects factual errors, in particular a mis-characterization of OAEP as a block cipher; OAEP is in fact a scheme used atop PKCS-1's Encoding Primitive for the purpose of key-encryption. OAEP is more secure and puts an even bigger limit on the maximum message length; this limit is also bound to a hash algorithm and its key length.
Another shortcoming of the following reply is its failure to stress that CALG_RSA_KEYX should be used exclusively for the key exchange, after which transmission of messages of any length can take place with whatever symmetric key encryption algorithm is desired. The OP was aware of this, he was merely trying to "play" with the PK, and I did cover that much, albeit deep in the long remarks thread.
For the time being, I'm leaving this response here, for the record, and also as Mike D may want to refer to it, but do remark-me-in if you think that it would be better to remove it altogether; I don't mind doing so for the sake of clarity!
-mjv- Sept 29, 2009
Original reply:
Have you checked the error code from GetLastError(), following CryptEncrypt()'s false return?
I suspect it might be NTE_BAD_LEN, unless there's some other issue.
Maybe you can post the code that surrounds your call to CryptEncrypt().
Bingo, upon seeing the CryptEncrypt() call.
You do not seem to be using the RSAES w/ OAEP scheme, since you do not have the CRYPT_OAEP flag on. This OAEP scheme is a block cipher based upon RSAES. This latter encryption algorithm, however, can only encrypt messages slightly less than its key size (expressed in bytes). This is due to the minimum padding size defined in PKCS#1; such padding helps protect the algorithm from some key attacks (I think the ones based on known cleartext).
Therefore you have three options:
- Use the CRYPT_OAEP flag in the Flag parameter to CryptEncrypt()
- Extend the key size to, say, 1024 (if you have control over it; beware that longer keys will increase the time to encode/decode...)
- Limit yourself to clear-text messages shorter than 54 bytes.
For documentation purposes, I'd like to make note of a few online resources.
- The RSA Labs web site, which is very useful in all things crypto.
- Wikipedia articles on the subject are also quite informative, easier to read and yet quite factual (I think).
When in doubt, however, do consult a real crypto specialist, not someone like me :-)