Length extension attack doubts - cryptography

So I've been studying length extension attacks, and there are a few things I noticed along the way that are not very clear to me.
1. Research papers explain how you can append some data to the end of a message to form new data. For example:
Desired New Data: count=10&lat=37.351&user_id=1&long=-119.827&waffle=eggo&waffle=liege
(notice the 2 waffles). My question is: if a parser function on the server side can detect duplicate attributes, is the entire length extension attack then pointless? After all, the server would notice the duplicate attributes. Is a proper parser that checks for duplicates a good defense against length extension attacks? I'm aware of the HMAC approach and other protections, but I'm asking specifically about parsers here.
2. Research says that only H(key|message) is vulnerable. They claim that H(message|key) won't work for the attacker because we would have to append a new key (which we obviously don't know). My question is: why would we have to append a new key? We don't do it when we are attacking H(key|message). Why can't we rely on the fact that we will pass the verification test (we would create the correct hash) and that, if the parser tries to extract the key, it would take the only key in the block we send and continue from there? Why would we have to send 2 keys? Why doesn't the attack against H(message|key) work?

My question is if a parser function on the server side can track duplicate attributes, could the entire length extension attack then be nonsense?
You are talking about a well-written parser. Writing software is hard, and writing correct software is very hard.
In that example, you have seen an overwritten attribute. Can you say whether a good parser must take the last value or the first one? What is the rule? There can be situations where the last one must be taken. Whether the attack applies depends on the situation. Knowledge of the length extension attack goes back to the 1990s, so finding a place where it still applies should be surprising. And yet it was applied in the wild against the Flickr API in 2009, almost 20 years later:
Flickr's API Signature Forgery by Thai Duong and Juliano Rizzo Published on Sep. 28, 2009.
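To make the duplicate-attribute idea from the question concrete, here is a minimal sketch of a parser that refuses repeated attributes. The function name parse_strict is made up for this illustration, and the query strings are the simplified ones from the question; a real forged message would also carry the hash padding bytes in front of the appended data.

```python
from urllib.parse import parse_qsl


def parse_strict(query: str) -> dict:
    """Parse a query string, refusing any attribute that appears twice."""
    params = {}
    for name, value in parse_qsl(query, keep_blank_values=True):
        if name in params:
            raise ValueError(f"duplicate attribute: {name!r}")
        params[name] = value
    return params


original = "count=10&lat=37.351&user_id=1&long=-119.827&waffle=eggo"
forged = original + "&waffle=liege"  # simplified: padding bytes omitted

print(parse_strict(original)["waffle"])
try:
    parse_strict(forged)
except ValueError as err:
    print("rejected:", err)
```

Note that this only catches overwritten attributes. An attacker who appends an attribute that was not present before would pass such a check, which is one reason the parser alone is not a defense.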
My question is why would we have to append a new key? We don't do it when we are attacking H(key|message). Why can't we rely on the fact that we will pass the verification test (we would create the correct hash) and that if the parser tries to extract the key from it, that it would take the only key in the block we send out and resume from there? Why would we have to send 2 keys? Why doesn't the attack against H(message|key) work?
The attack is a signature forgery. The key is not known to the attacker, but they can still forge new signatures. The new message and its signature (the extended hash) are sent to the server. The server then combines its key with the received message and runs the usual verification: it recomputes the hash and, if the result matches the one received, accepts the signature as valid.
The parser doesn't extract the key; the server already knows it. The real question is whether you can tell that the data has been extended. The padding rule is simple: append a 1 bit, then fill with zeros so that the last 64 bits (128 for SHA-512) encode the message length. (This is very simplified; for example, the padded length must be a multiple of 512 bits for SHA-256.) To see that there is another padding inside the message, you must check every block, and only then can you claim that there is an extension attack. Yes, you can do this. However, one of the aims of cryptography is to reduce such dependencies. If we can create a better signature scheme that eliminates the checking, then we recommend dropping the others. This lets software developers write secure implementations more easily.
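The padding rule described above can be sketched for SHA-256 as follows. The helper name sha256_padding is an assumption for this example; it works on whole bytes, so the single 1 bit shows up as the byte 0x80.

```python
def sha256_padding(message_len: int) -> bytes:
    """Return the padding SHA-256 appends to a message of message_len bytes."""
    # 0x80 is the 1 bit followed by seven 0 bits; then zero bytes until the
    # total is 8 bytes short of a 64-byte (512-bit) block boundary.
    zeros = (55 - message_len) % 64
    # The last 64 bits hold the message length in bits, big-endian.
    bit_length = (message_len * 8).to_bytes(8, "big")
    return b"\x80" + b"\x00" * zeros + bit_length


for n in (0, 55, 56, 100):
    pad = sha256_padding(n)
    # The padded length is always a multiple of 64 bytes.
    print(n, len(pad), (n + len(pad)) % 64)
```

In a forged message this padding sits in the middle of the data, which is what a server would have to scan for if it wanted to detect the extension by inspection.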
Why doesn't attack against H(message|key) work?
Simple: you take the extended message message|extended and send it to the server together with the extended hash H(message|key|extended). The server then takes the message message|extended, appends the key to get message|extended|key, and hashes it as H(message|extended|key). Clearly this is not equal to the extended hash H(message|key|extended).
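The mismatch can be shown with a small sketch. It does not perform the attack itself (that needs a SHA-256 implementation whose internal state can be resumed); instead it hashes, with the key known for demonstration purposes only, the two inputs being compared above. The key, message, and extension values are made up, and sha256_padding is a helper written for this example.

```python
import hashlib


def sha256_padding(message_len: int) -> bytes:
    """Padding SHA-256 appends to a message of message_len bytes."""
    zeros = (55 - message_len) % 64
    return b"\x80" + b"\x00" * zeros + (message_len * 8).to_bytes(8, "big")


key = b"secret-key"        # hypothetical; a real attacker does not know it
message = b"user_id=1"
extension = b"&admin=1"

# What a length extension of H(message|key) produces: the hash continues
# after message|key|padding, so the key is stuck in the middle of the input.
glue = sha256_padding(len(message) + len(key))
extended_hash = hashlib.sha256(message + key + glue + extension).hexdigest()

# What the server computes for the forged message: the key goes last.
forged_message = message + extension
server_hash = hashlib.sha256(forged_message + key).hexdigest()

print(extended_hash == server_hash)  # the two inputs differ, so no match
```

The attacker has no way to move the key to the end of the input, because the only thing they can do is keep hashing after it.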
Note that the truncated versions of the SHA-2 series, like SHA-512/256, are resistant to length extension attacks. SHA-3 is immune by design, which enables the simple KMAC construction. BLAKE2 is also immune since it is built on the HAIFA construction.
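As a rough illustration of the alternatives, the sketch below uses Python's standard hmac and hashlib modules: HMAC-SHA256 (the approach the question already mentions) and BLAKE2b in its built-in keyed mode. The key and messages are made up, and sign/verify are names chosen for this example only.

```python
import hashlib
import hmac

key = b"secret-key"  # hypothetical shared secret


def sign(message: bytes) -> str:
    """Tag a message with HMAC-SHA256 instead of a bare H(key|message)."""
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def verify(message: bytes, tag: str) -> bool:
    """Recompute the tag and compare it in constant time."""
    return hmac.compare_digest(sign(message), tag)


tag = sign(b"user_id=1")
print(verify(b"user_id=1", tag))
print(verify(b"user_id=1&admin=1", tag))

# BLAKE2 takes the key as a parameter rather than as part of the message.
blake_tag = hashlib.blake2b(b"user_id=1", key=key).hexdigest()
print(len(blake_tag))
```

With either construction the server needs no padding inspection or parser rules to rule out extension.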

Related

Writing a kama sutra cipher

I took up cryptography recently, and one of my tasks was to create a Kama Sutra cipher. Up to the point of generating the keys, I have no problems. However, due to the nature of the Kama Sutra cipher, I believe that the keys are not supposed to be hard-coded into the program, but rather generated for each plain text it takes in.
What I understand is that the cipher text's length should be the same as the plain text's. However, where do I place the key so that, as long as the cipher text was generated by my program, the program can decipher it even after it has been closed? Given that this is an algorithm, I am sure that I should not be looking at storing the key in a flat file or database.
There is not much related information online regarding this cipher. What I saw are implementations that let you randomise a key set and generate a cipher text based on that key set. When decrypting, you also need to provide the same key set. Is this the correct way to implement it?
For those who have knowledge about this, please guide me along.
If you want to be able to decrypt the cyphertext, then you need to be able to recover the key whenever you need. For a classical cypher, this was usually done by using the same key for multiple messages, see the Caesar Cypher for an example. Caesar used a constant key, a -3/+3 shift while Augustus used a +1/-1 shift.
You may want to consult your instructor as to whether a fixed key or a varying key is required.
It will be simpler to develop a fixed key version, and then to add varying key functionality on top. That way you can get the rest of the program working correctly.
You may also want to look at classical techniques for using a keyphrase to mix an alphabet.
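A minimal sketch of the fixed-key approach, assuming the usual description of the Kama Sutra cipher (the 26 letters are paired up at random and each letter is swapped with its partner). Deriving the pairing from a key phrase via a seeded shuffle is a choice made for this example, not something the assignment prescribes, and Python's random module is not cryptographically secure.

```python
import random
import string


def make_key(keyphrase: str) -> dict:
    """Pair up the 26 letters; the same keyphrase always gives the same pairing."""
    letters = list(string.ascii_uppercase)
    random.Random(keyphrase).shuffle(letters)
    key = {}
    for a, b in zip(letters[:13], letters[13:]):
        key[a], key[b] = b, a  # each letter maps to its partner
    return key


def transform(text: str, key: dict) -> str:
    """Encrypt or decrypt: swapping partners twice restores the original."""
    return "".join(key.get(ch, ch) for ch in text.upper())


key = make_key("my key phrase")
ciphertext = transform("ATTACK AT DAWN", key)
print(ciphertext)
print(transform(ciphertext, key))
```

Because the pairing is regenerated from the key phrase, nothing has to be stored in the cipher text, and its length equals the plain text's length. The person decrypting only needs the same key phrase.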

are there any multiple encryption standards?

I'm looking to encrypt some data using multiple ciphers (i.e., AES, Serpent, Twofish...), and I want the user to be able to choose which ciphers are used and in what order. Are there any standards available for defining the metadata? My understanding is that I don't want to prefix each layer with a magic number indicating the type of cipher and parameters used in the next layer, because that would expose me to a known-plaintext attack. I took a peek at the PKCS #8 RFC, and it appears that only a single layer of encryption is supported there:
EncryptedPrivateKeyInfo ::= SEQUENCE {
    encryptionAlgorithm  EncryptionAlgorithmIdentifier,
    encryptedData        EncryptedData }
I suppose I could just define encryptionAlgorithm to be an array of values, but I want to make sure there isn't already a standard defined somewhere that I have missed.
PKCS#7 and its successor CMS allow for multiple layers. The EncryptedData contains an EncryptedContentInfo that, when decrypted, can contain another EncryptedData. This is usually used to combine encryption and signing, but there is no reason that it cannot be used for multiple layers of encryption (though support in other implementations may vary).
XML Encryption is another common standard for cryptographic metadata. It has no direct support for nesting encryption layers, but since it relies on the specification of the enclosing schema to specify the expected format of the encrypted data, there is no reason it could not specify multiple layers.
The OpenPGP Message Format is the final standardized format I can think of. Like CMS it supports nested layers of encryption (in theory - implementations might or might not support it).
None of these formats supports specifying nested encryption layers upfront: the metadata for the nested layers will be encrypted, so you do not avoid the known-plaintext weakness. However, since you should always choose an algorithm that is safe against known-plaintext attacks anyway, I do not see that as a big problem.
Not that I'm aware of, because this isn't a cryptographic best-practice. Select a single, well known and peer reviewed cipher, and use that. Build your code so that you (or your users, rather) can easily swap out existing ciphers for a new one if a compromise is found, but don't expect to nest ciphers.

Password strength check: comparing to previous passwords

Every now and then I come across applications that force you to change passwords once in a while. Almost universally, they have this strange requirement for the new password: it has to be "significantly" different from your previous password(s).
While at first this sounds logical, next thing I think is: how do they do that? Do they store my passwords in plain text? I would have accepted the answer that they do, if it wasn't for the fact that these are kinds of applications that pretend to care about security so much they force you to change your password if it is expired! Microsoft Exchange is one example of this.
I'm not very good at cryptography and hash functions, so my question is this: Is it possible to enforce this kind of policy without storing passwords in plain text?
Do you know how this policy is implemented in real world applications?
UPDATE: An Example.
I was recently changing my Microsoft Exchange password. I only use Web Access, so it might be a little different -- I have no idea.
So, it forces me to change my password. What I do sometimes is change it to something new and then change it back almost immediately. The freaky part is that it did not allow me to change it back. I tried changing it a little, by adding a letter in front of it or changing one symbol -- no luck, it was still complaining.
With a typical hash, the best you can do is see if the new password is exactly equal to previous ones. You can break the password into multiple hashes in order to get more flexible with comparison, for example 3 hashes:
Alpha characters only
Numeric characters only
All other characters
You could for example require all the hashes to change to be accepted, to prevent users from just changing their password from SecretPassword01 to SecretPassword02.
A cryptographic expert may weigh in here on if this could be made as secure as a single hash.
NOTE that this is not as secure as a single hash, so before you go implementing this, make sure you have really done your research.
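The three-hash idea could look roughly like this. It is a sketch of the scheme described in this answer, not a recommendation: the function names, the PBKDF2 parameters, and the shared salt are assumptions made for the example, and as noted above each part can be attacked separately.

```python
import hashlib
import os


def class_hashes(password: str, salt: bytes) -> list:
    """Hash the alphabetic, numeric, and remaining characters separately."""
    alpha = "".join(c for c in password if c.isalpha())
    digits = "".join(c for c in password if c.isdigit())
    other = "".join(c for c in password if not c.isalpha() and not c.isdigit())
    return [
        hashlib.pbkdf2_hmac("sha256", part.encode(), salt, 100_000)
        for part in (alpha, digits, other)
    ]


def all_parts_changed(old_hashes: list, new_password: str, salt: bytes) -> bool:
    """Accept the new password only if every one of the three hashes differs."""
    new_hashes = class_hashes(new_password, salt)
    return all(old != new for old, new in zip(old_hashes, new_hashes))


salt = os.urandom(16)  # stored next to the hashes so they can be compared later
stored = class_hashes("SecretPassword01", salt)
print(all_parts_changed(stored, "SecretPassword02", salt))  # alpha part unchanged
print(all_parts_changed(stored, "Different99#", salt))
```

Note that the comparison only works if the new password is hashed with the same salt as the old one.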
When changing password you're usually asked for the old one to confirm your identity. It's then trivial to compare the old one and the new one to see how much they differ. TBH I don't know how to compare to several previous passwords without storing them, but that's getting into the territory of ridiculous policies anyway.
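Comparing the old and new password at change time, while both are available in plain text, might look like the sketch below. The similarity measure (difflib's ratio) and the 0.8 threshold are arbitrary choices for illustration; I don't know what Exchange actually uses.

```python
from difflib import SequenceMatcher


def too_similar(old: str, new: str, threshold: float = 0.8) -> bool:
    """Reject the new password if it shares too much with the old one."""
    return SequenceMatcher(None, old, new).ratio() >= threshold


old = "SecretPassword01"  # typed by the user to confirm their identity
print(too_similar(old, "SecretPassword02"))       # one symbol changed
print(too_similar(old, "xSecretPassword01"))      # one letter added in front
print(too_similar(old, "correct horse battery"))  # unrelated passphrase
```

Nothing needs to be stored beyond the usual hash, since the old password is only held in memory for the duration of the change request.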

Isn't it difficult to recognize a successful decryption?

When I hear about methods for breaking encryption algorithms, I notice the focus is often on how to decrypt very rapidly and how to reduce the search space. However, I always wonder how you can recognize a successful decryption, and why this doesn't form a bottleneck. Or is it often assumed that an encrypted/decrypted pair is known?
From Cryptonomicon:
There is a compromise between the two extremes of, on the one hand, not knowing any of the plaintext at all, and, on the other, knowing all of it. In the Cryptonomicon that falls under the heading of cribs. A crib is an educated guess as to what words or phrases might be present in the message. For example if you were decrypting German messages from World War II, you might guess that the plaintext included the phrase "HEIL HITLER" or "SIEG HEIL." You might pick out a sequence of ten characters at random and say, "Let's assume that this represented HEIL HITLER. If that is the case, then what would it imply about the remainder of the message?"
...
Sitting down in his office with the fresh Arethusa intercepts, he went to work, using FUNERAL as a crib: if this group of seven letters decrypts to FUNERAL, then what does the rest of the message look like? Gibberish? Okay, how about this group of seven letters?
Generally, you have some idea of the format of the file you expect to result from the decryption, and most formats provide an easy way to identify them. For example, nearly all binary formats such as images, documents, zipfiles, etc, have easily identifiable headers, while text files will contain only ASCII, or only valid UTF-8 sequences.
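A check along these lines could be sketched as follows. The table of headers is a small hand-picked sample (PNG, ZIP, and PDF signatures), and the function name identify is made up for this example; a real tool would know many more formats.

```python
from typing import Optional

# A few well-known file signatures ("magic numbers").
MAGIC_HEADERS = {
    b"\x89PNG\r\n\x1a\n": "PNG image",
    b"PK\x03\x04": "ZIP archive",
    b"%PDF-": "PDF document",
}


def identify(candidate: bytes) -> Optional[str]:
    """Guess whether a candidate decryption looks like real data."""
    for magic, name in MAGIC_HEADERS.items():
        if candidate.startswith(magic):
            return name
    try:
        text = candidate.decode("utf-8")
    except UnicodeDecodeError:
        return None  # not valid UTF-8, so probably a wrong key
    if text and all(ch.isprintable() or ch in "\r\n\t" for ch in text):
        return "text"
    return None


print(identify(b"PK\x03\x04rest-of-archive"))
print(identify(b"Attack at dawn\n"))
print(identify(b"\xff\xfe\x00\x80\x13"))
```

A test like this is cheap compared to a trial decryption, which is why recognizing success is usually not the bottleneck.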
In asymmetric cryptography you usually have access to the public key. Therefore, a candidate decryption can be re-encrypted using the public key and compared to the original ciphertext, which reveals whether the decryption was successful (provided the encryption is deterministic; schemes with randomized padding produce a different ciphertext each time).
The same is true for symmetric encryption. If you think you have decrypted a cipher, you must also think that you have found the key. Therefore, you can use that key to encrypt your, presumably correct, decrypted text and see if the encrypted result is identical to the original ciphertext.
For symmetric encryption where the key length is shorter than the cipher-text length, you're guaranteed to not be able to produce every possible plain-text. You can probably guess what form your plain-text will take, to some degree -- you probably know whether it's an image, or XML, or if you don't even know that much then you can assume you'll be able to run the file utility on it and not get 'data'. You have to hope that there are only a few keys which would give you even a vaguely sensible decryption and only one which matches the form you are looking for.
If you have a sample plain-text (or partial plain-text) then this gets a lot easier.

How to store sensitive data (e.g. passwords, API keys) in Cocoa app?

I need to provide some passwords, API keys and similar sensitive data in my code. What are best practices in that regard? Hard-coded? SQlite? Some cryptographic framework?
Like the others said, you can't both secure an API key and use it in your app. However, you can do simple obfuscation relatively easily, and if the payoff to the cracker is low then you may not get burned.
One simple technique is to break your API key into several sub-strings. Make sure you put them in your code in some random order. For instance, if your API key is 12345678901234567890 you might break it up into 5 sub-strings like this:
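/* Declared out of order on purpose; joined as part1..part5 at runtime. */
static const char *part1 = "12345";
static const char *part5 = "7890";
static const char *part3 = "890123";
static const char *part2 = "67";
static const char *part4 = "456";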
If you run /usr/bin/strings on the resulting binary then you should not see the API key in order. Instead you'll see the API substrings in the order listed in your C file. With 5 substrings like this, that is 5*4*3*2*1=120 permutations. If you break it into 13 substrings you're looking at over 6 billion permutations.
However, that won't stop someone who knows what they're doing from getting your API key if they want it. Eventually you'll have to combine the strings together and pass it to one of your methods, at which point a cracker could use a debugger to set a breakpoint and inspect memory.
Use the Mac OS X Keychain:
Keychain Services Reference
Mac Dev Center: Security Overview
Update:
If your goal is to conceal information from your end users, then I'm not aware of a built-in way to do this.
Hard-coding is a start, but a user with a debugger can read the string out of your binary. To combat this, I've heard of developers that store the data as many separate strings and then combine them at the last minute. YMMV
You can use any one of the POSIX-compliant C cryptographic libraries, but as noted above, anyone with the skills to crack your code can defeat the encryption by finding the key.
There are a few tricks you can use to slow down a cracker: (1) Use gibberish names for classes, methods and variables to obscure the code handling encryption, e.g. -(void) qwert asdf:(NSString *) lkj; (2) Put in duplicate routines and branches that don't actually do anything. (3) Hide data in unexpected places, such as within images.
To add to the direct answers: It's all for naught if you don't use a secure method of transport, such as TLS or SSH. If you're sending the reconstituted API key in clear text, it's not hard for someone to use something like Wireshark or tcpdump (or, a bit more difficultly, a customized router) to capture it after it leaves your app.
If whatever API you're using doesn't offer a method of encrypted access, then there's nothing you can do about that (besides ask for one), but if it does, then you should use it.
You cannot secure them. You can only try to hide them so that they are not too obvious.
Security by obscurity that is. But I don't think there is a way to keep someone who is willing to get his hands dirty from finding them.