How to extract encryption and MAC keys using KDF (X9.63) defined by javacardx.security.derivation - cryptography

As per Java Card v3.1 new package is defined javacardx.security.derivation
https://docs.oracle.com/en/java/javacard/3.1/jc_api_srvc/api_classic/javacardx/security/derivation/package-summary.html
KDF X9.63 works on three inputs: input secret, counter and shared info.
Depends on length of generated key material, multiple rounds on hash is carried out to generated final output.
I am using this KDF via JC API to generated 64 bytes of output (which is carried out by 2 rounds of SHA-256) for a 16 bytes-Encryption Key, a 16 bytes-IV, and a 32 bytes-MAC Key.
Note: This is just pseudo code to put my question with necessary details.
DerivationFunction df = DerivationFunction.getInstance(DerivationFunction.ALG_KDF_ANSI_X9_63, false);
df.init(KDFAnsiX963Spec(MessageDigest.ALG_SHA_256, input, sharedInfo, (short) 64);
SecretKey encKey = KeyBuilder.buildKey(KeyBuilder.TYPE_AES, (short)16, false);
SecretKey macKey = KeyBuilder.buildKey(KeyBuilder.TYPE_HMAC, (short)32, false);
df.nextBytes(encKey);
df.nextBytes(IVBuffer, (short)0, (short)16);
df.lastBytes(macKey);
I have the following questions:
When rounds of KDF are performed? Are these performed during df.init() or during df.nextBytes() & df.lastBytes()?
One KDF round will generate 32 bytes output (considering SHA-256 algorithm) then how API's df.nextBytes() & df.lastBytes() will work with any output expected length < 32 bytes?
In this KDF counter is incremented in every next round then how counter will be managed between df.nextBytes() & df.lastBytes() API's?

When rounds of KDF are performed? Are these performed during df.init() or during df.nextBytes() & df.lastBytes()?
That seems to be implementation specific to me. It will probably be faster to perform all the calculations at one time, but in that case it still makes sense to wait for the first request of the bytes. On the other hand RAM is also often an issue, so on demand generation also makes some sense. That requires a somewhat trickier implementation though.
The fact that the output size is pre-specified probably indicates that the simpler method of generating all the key material at once is at least foreseen by the API designers (they probably created an implementation before subjecting it to peer review in the JCF).
One KDF round will generate 32 bytes output (considering SHA-256 algorithm) then how API's df.nextBytes() & df.lastBytes() will work with any output expected length < 32 bytes?
It will commonly return the leftmost bytes (of the hash output) and likely leave the rest of the bytes in a buffer. This buffer will likely be destroyed together with the rest of the state when lastBytes is called (so don't forget to do so).
Note that the API clearly states that you have to re-initialize the DerivationFunction instance if you want to use it again. So that is a very strong indication that they though of destruction of key material (something that is required by FIPS and Common Criteria certification, not just common sense).
Other KDF's could have a different way of returning bytes, but using the leftmost bytes and then add rounds to the right is so common you can call it universal. For the ANSI X9.63 KDF this is certainly the case and it is clearly specified in the standard that way.
In this KDF counter is incremented in every next round then how counter will be managed between df.nextBytes() & df.lastBytes() API's?
These are methods of the same class and cannot be viewed separately, so they are not separate API's. Class instances can keep state in anyway they want. It might simply hold the counter as class variable, but if it decided to generate the bytes during init or the first nextBytes / lastBytes call then the counter is not even required anymore.

Related

How does libgcrypt increment the counter for CTR mode?

I have a file encrypted with AES-256 using libgcrypt's CTR mode implementation.
I want to be able to decrypt the file in parts (e.g. decrypting blocks 5-10 out of 20 blocks without decrypting the whole file).
I know that by using CTR mode, I should be able to do it. All I need is to know the correct counter.
The problem lies in the fact that all I have is the initial counter for block 0. If I want to decrypt block 5 for example, I need another counter, one that is achieved by doing some action on the initial counter for each block from 0 to 5.
I can't seem to find an API that libgcrypt exposes in order to calculate counter for later blocks given the initial counter.
How can I calculate the counter of later blocks (e.g. block #5) given the counter of block #0?
When in doubt, go to the source. Here's the code in gcrypt's generic CTR mode implementation (_gcry_cipher_ctr_encrypt() in cipher-ctr.c) that increments the counter:
for (i = blocksize; i > 0; i--)
{
c->u_ctr.ctr[i-1]++;
if (c->u_ctr.ctr[i-1] != 0)
break;
}
There are other, more optimized implementations of counter incrementing found in other places in the libgcrypt source, e.g. in the various cipher-specific fast bulk CTR encryption implementations, but this generic one happens to be nice and readable. (Of course, all those alternative implementations need to produce the same sequence of counter values anyway, so that gcrypt stays compatible with itself.)
OK, so what does it actually do?
Well, looking at the context (or, more specifically, cipher-internal.h), it's clear that c->u_ctr.ctr is an array of blocksize unsigned bytes (where blocksize equals 16 bytes for AES). The code above increments its last byte by one, and checks if the result wrapped around to zero. If it didn't, it stops; if it did wrap, the code then moves to the second-to-last byte, increments it, checks to see if it wrapped, and keeps looping until it either finds a byte that doesn't wrap around when incremented, or it has incremented all of the blocksize bytes.
So, for example, if your original counter value was {0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0}, then after incrementing it would become {0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1}. If incremented again, it would become {0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2}, then {0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,3}, and so on up to {0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,255}, after which the next counter value would be {0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0} (and after that {0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1}, {0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,2}, {0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,3}, etc.).
Of course, what this is really doing is just arithmetically incrementing a single (blocksize × 8)-bit integer, stored in memory in big-endian byte order.

how to represent message as an integer between 1 and n-1?

I am trying to implement simple El-Gamal cryptosystem.
And I can't understand how to represent message as an integer between 1 and n-1.
The only thing that comes to my mind is:
if n bit length is k, then divide input message m on t | t < k bits and each piece of bits use as integer number.
I think It is wrong.
So how to represent message as an integer between 1 and n-1?
You could do that which is essentially the equivalent of using ECB mode in block ciphers, but there are attacks on this. An attacker may reorder the different blocks of the ciphertext and you would decrypt it without problem, but the received plaintext would be broken without you knowing this. This may also open the door for replay attacks since the blocks are all encrypted independently. You would need some kind of authenticated encryption.
Back to your original question. Such a problem is usually solved by using hybrid encryption. A block cipher like AES is used to encrypt the whole plaintext with a random key. This random key is in turn encrypted through ElGamal since the key is small enough to be represented in < k bits.
Now depending on the mode of operation of the block cipher this could still be malleable. You would either need to put a hash of the ciphertext/plaintext next to the random key as an integrity check. Or otherwise use an authenticated mode of operation like GCM and add the resulting tag next to the random key. Depending on k, this should fit.
Note that you should use some kind of padding for random key | hash/tag if it doesn't reach k.

Interoperability of AES CTR mode?

I use AES128 crypto in CTR mode for encryption, implemented for different clients (Android/Java and iOS/ObjC). The 16 byte IV used when encrypting a packet is formated like this:
<11 byte nonce> | <4 byte packet counter> | 0
The packet counter (included in a sent packet) is increased by one for every packet sent. The last byte is used as block counter, so that packets with fewer than 256 blocks always get a unique counter value. I was under the assumption that the CTR mode specified that the counter should be increased by 1 for each block, using the 8 last bytes as counter in a big endian way, or that this at least was a de facto standard. This also seems to be the case in the Sun crypto implementation.
I was a bit surprised when the corresponding iOS implementation (using CommonCryptor, iOS 5.1) failed to decode every block except the first when decoding a packet. It seems that CommonCryptor defines the counter in some other way. The CommonCryptor can be created in both big endian and little endian mode, but some vague comments in the CommonCryptor code indicates that this is not (or at least has not been) fully supported:
http://www.opensource.apple.com/source/CommonCrypto/CommonCrypto-60026/Source/API/CommonCryptor.c
/* corecrypto only implements CTR_BE. No use of CTR_LE was found so we're marking
this as unimplemented for now. Also in Lion this was defined in reverse order.
See <rdar://problem/10306112> */
By decoding block by block, each time setting the IV as specified above, it works nicely.
My question: is there a "right" way of implementing the CTR/IV mode when decoding multiple blocks in a single go, or can I expect it to be interoperability problems when using different crypto libs? Is CommonCrypto bugged in this regard, or is it just a question of implementing the CTR mode differently?
The definition of the counter is (loosely) specified in NIST recommendation sp800-38a Appendix B. Note that NIST only specifies how to use CTR mode with regards to security; it does not define one standard algorithm for the counter.
To answer your question directly, whatever you do you should expect the counter to be incremented by one each time. The counter should represent a 128 bit big endian integer according to the NIST specifications. It may be that only the least significant (rightmost) bits are incremented, but that will usually not make a difference unless you pass the 2^32 - 1 or 2^64 - 1 value.
For the sake of compatibility you could decide to use the first (leftmost) 12 bytes as random nonce, and leave the latter ones to zero, then let the implementation of the CTR do the increments. In that case you simply use a 96 bit / 12 byte random at the start, in that case there is no need for a packet counter.
You are however limited to 2^32 * 16 bytes of plaintext until the counter uses up all the available bits. It is implementation specific if the counter returns to zero or if the nonce itself is included in the counter, so you may want to limit yourself to messages of 68,719,476,736 = ~68 GB (yes that's base 10, Giga means 1,000,000,000).
because of the birthday problem you've got a 2^48 chance (48 = 96 / 2) of creating a collision for the nonce (required for each message, not each block), so you should limit the amount of messages;
if some attacker tricks you into decrypting 2^32 packets for the same nonce, you run out of counter.
In case this is still incompatible (test!) then use the initial 8 bytes as nonce. Unfortunately that does mean that you need to limit the number of messages because of the birthday problem.
Further investigations sheds some light on the CommonCrypto problem:
In iOS 6.0.1 the little endian option is now unimplemented. Also, I have verified that CommonCrypto is bugged in that the CCCryptorReset method does not in fact change the IV as it should, instead using pre-existing IV. The behaviour in 6.0.1 is different from 5.x.
This is potentially a security risc, if you initialize CommonCrypto with a nulled IV, and reset it to the actual IV right before encrypting. This would lead to all your data being encrypted with the same (nulled) IV, and multiple streams (that perhaps should have different IV but use same key) would leak data via a simple XOR of packets with corresponding ctr.

Collision Attacks, Message Digests and a Possible solution

I've been doing some preliminary research in the area of message digests. Specifically collision attacks of cryptographic hash functions such as MD5 and SHA-1, such as the Postscript example and X.509 certificate duplicate.
From what I can tell in the case of the postscript attack, specific data was generated and embedded within the header of the postscript file (which is ignored during rendering) which brought about the internal state of the md5 to a state such that the modified wording of the document would lead to a final MD value equivalent to the original postscript file.
The X.509 took a similar approach where by data was injected within the comment/whitespace sections of the certificate.
Ok so here is my question, and I can't seem to find anyone asking this question:
Why isn't the length of ONLY the data being consumed added as a final block to the MD calculation?
In the case of X.509 - Why is the whitespace and comments being taken into account as part of the MD?
Wouldn't a simple processes such as one of the following be enough to resolve the proposed collision attacks:
MD(M + |M|) = xyz
MD(M + |M| + |M| * magicseed_0 +...+ |M| * magicseed_n) = xyz
where :
M : is the message
|M| : size of the message
MD : is the message digest function (eg: md5, sha, whirlpool etc)
xyz : is the pairing of the acutal message digest value for the message M and |M|. <M,|M|>
magicseed_{i}: Is a set of random values generated with seed based on the internal-state prior to the size being added.
This technqiue should work, as to date all such collision attacks rely on adding more data to the original message.
In short, the level of difficulty involved in generating a collision message such that:
It not only generates the same MD
But is also comprehensible/parsible/compliant
and is also the same size as the original message,
is immensely difficult if not near impossible. Has this approach ever been discussed? Any links to papers etc would be nice.
Further Question: What is the lower bound for collisions of messages of common length for a hash function H chosen randomly from U, where U is the set of universal hash functions ?
Is it 1/N (where N is 2^(|M|)) or is it greater? If it is greater, that implies there is more than 1 message of length N that will map to the same MD value for a given H.
If that is the case, how practical is it to find these other messages? bruteforce would be of O(2^N), is there a method of time complexity less than bruteforce?
Can't speak for the rest of the questions, but the first one is fairly simple - adding length data to the input of the md5, at any stage of the hashing process (1st block, Nth block, final block) just changes the output hash. You couldn't retrieve that length from the output hash string afterwards. It's also not inconceivable that a collision couldn't be produced from another string with the exact same length in the first place, so saying "the original string was 17 bytes" is meaningless, because the colliding string could also be 17 bytes.
e.g.
md5("abce(17bytes)fghi") = md5("abdefghi<long sequence of text to produce collision>")
is still possible.
In the case of X.509 certificates specifically, the "comments" are not comments in the programming language sense: they are simply additional attributes with an OID that indicates they are to be interpreted as comments. The signature on a certificate is defined to be over the DER representation of the entire tbsCertificate ('to be signed' certificate) structure which includes all the additional attributes.
Hash function design is pretty deep theory, though, and might be better served on the Theoretical CS Stack Exchange.
As #Marc points out, though, as long as more bits can be modified than the output of the hash function contains, then by the pigeonhole principle a collision must exist for some pair of inputs. Because cryptographic hash functions are in general designed to behave pseudo-randomly over their inputs, collisions will tend toward being uniformly distributed over possible inputs.
EDIT: Incorporating the message length into the final block of the hash function would be equivalent to appending the length of everything that has gone before to the input message, so there's no real need to modify the hash function to do this itself; rather, specify it as part of the usage in a given context. I can see where this would make some types of collision attacks harder to pull off, since if you change the message length there's a changed field "downstream" of the area modified by the attack. However, this wouldn't necessarily impede the X.509 intermediate CA forgery attack since the length of the tbsCertificate is not modified.

crypto api - block mode encryption determining input byte count

I'm trying to encrypt some date using a public key derived form the exchange key pair made with the CALG_RSA_KEYX key type. I determined the block size was 512 bits using cryptgetkeyparam KP_BLOCKLEN. It seems the maximum number of bytes I can feed cryptencrypt in 53 (424 bits) for which I get an encrypted length of 64 back. How can I determine how many bytes I can feed into cryptencrypt? If I feed in more than 53 bytes, the call fails.
RSA using the usual PKCS#1 v.1.5 mode can encrypt a message that is at most k-11 bytes, where k is the length of the modulus in bytes. So a 512 bit key can encrypt up to 53 bytes and a 1024 bit key can encrypt up to 117 bytes.
RSA using OAEP can encrypt a message up to k-2*hLen-2, where k is the modulus byte-length and hLen is the length of the output of the underlying hash-function. So using SHA-1, a 512 bit key can encrypt up to 22 bytes and a 1024 bit key can encrypt up to 86 bytes.
You should not normally use a RSA key to encrypt your message directly. Instead you should generate a random symmetric key (f.x. an AES key), encrypt your message with the symmetric key, encrypt the key with the RSA key and transmit both encryptions to the recipient. This is usually called hybrid encryption.
EDIT: Although this response is marked as accepted by the OP, please see Rasmus Faber response instead, as this is a much better response. Posted 24 hours later, Rasmus's response corrects factual errors,in particular a mis-characterization of OAEP as a block cipher; OAEP is in fact a scheme used atop PKCS-1's Encoding Primitive for the purpose of key-encryption. OAEP is more secure and puts an even bigger limit on the maximum message length, this limit is also bound to a hash algorithm and its key length.
Another shortcoming of the following reply is its failure to stress that CALG_RSA_KEYX should be used exclusively for the key exchange, after which transmission of messages of any length can take place with whatever symmetric key encryption algorithm desired. The OP was aware of this, he was merely trying to "play" with the PK, and I did cover that much, albeit deep in the the long remarks thread.
Fore the time being, I'm leaving this response here, for the record, and also as Mike D may want to refer to it, but do remark-me-in, if you think that it would be better to remove it altogether; I don't mind doing so for sake of clarity!
-mjv- Sept 29, 2009
Original reply:
Have you check the error code from GetLastError(), following cryptencrypt()'s false return?
I suspect it might be NTE_BAD_LEN, unless there's be some other issue.
Maybe you can post the code that surrounds your calling criptencryt().
Bingo, upon seeing the CryptEncrypt() call.
You do not seem to be using the RSAES w/ OAEP scheme, since you do not have the CRYPT_OAEP flag on. This OAEP scheme is a block cipher based upon RSAES. This latter encryption algorihtm, however, can only encrypt messages slightly less than its key size (expressed in bytes). This is due to the minimum padding size defined in PKCS#1; such padding helps protect the algorithm from some key attacks, I think the ones based on known cleartext).
Therefore you have three options:
use the CRYPT_OAEP in the Flag parameter to CryptEncrypt()
extend the key size to say 1024 (if you have control over it, beware that longer keys will increase the time to encode/decode...)
Limit yourself to clear-text messages shorter than 54 bytes.
For documentation purposes, I'd like to make note of a few online resources.
- The [RSA Labs][1] web site which is very useful in all things crypto.
- Wikipedia articles on the subject are also quite informative, easier to read
and yet quite factual (I think).
When in doubt, however, do consult a real crypto specialist, not someone like me :-)