As there have been significant advances in the cryptanalysis of SHA1, it's supposed to be phased out in favor of SHA2 (Wikipedia).
For use as underlying hash function in PBKDF2, however, it's basically used as a PRNG. As such it should be still secure to use SHA1 as hash for PBKDF2, right?
None of the currently known weaknesses of SHA-1 has any impact on its security when used in HMAC, a fortiori when used in PBKDF2. For that matter, MD5 would be fine too (but not MD4).
However, SHA-1 is not good for public relations: if, in 2011, you use SHA-1, then you must prepare yourself to have to justify that choice. On the other hand, SHA-256 is a fine "default function" and nobody will question it.
There is no performance issue in PBKDF2 (PBKDF2 includes an "iteration count" meant to make it exactly as slow as needed) so there is very little reason to prefer SHA-1 over SHA-256 here. However, if you have an existing, deployed system which uses PBKDF2-with-SHA-1, then there is no immediate need to "fix" it.
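As a rough illustration of the point about the iteration count, here is a minimal Python sketch using the standard library's hashlib.pbkdf2_hmac. The helper name derive_key, the sample password, and the iteration counts are assumptions chosen for the example, not recommendations from the answer above.

```python
import hashlib
import os


def derive_key(password: str, salt: bytes, hash_name: str = "sha256",
               iterations: int = 100_000, length: int = 32) -> bytes:
    """Derive `length` bytes from a password with PBKDF2-HMAC-<hash_name>."""
    return hashlib.pbkdf2_hmac(hash_name, password.encode("utf-8"),
                               salt, iterations, length)


# A fresh random salt per password; it is stored next to the derived value.
salt = os.urandom(16)

# Same password and salt, two different underlying hashes.
# The iteration count is the knob that makes either variant as slow as needed.
key_sha1 = derive_key("correct horse", salt, "sha1", iterations=1000, length=20)
key_sha256 = derive_key("correct horse", salt, "sha256", iterations=1000, length=32)
```

Switching from SHA-1 to SHA-256 is a one-argument change here, which is part of why there is little reason to prefer SHA-1 in a new design.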
Sure. SHA-256, or larger, might be more efficient if you want to generate more key material.
But PBKDF2-HMAC-SHA1 is fine. Also standard HMAC use has not been compromised, but again, longer hashes are in principle more secure in that scenario.
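To make the remark about generating more key material concrete: PBKDF2 produces output in blocks the size of the hash, and every block costs the full iteration count. The sketch below (sample password, salt, and iteration count are made up for the demonstration) shows that asking PBKDF2-HMAC-SHA1 for 32 bytes needs two blocks, whereas SHA-256 covers 32 bytes with one.

```python
import hashlib

password, salt, iterations = b"hunter2", b"fixed-demo-salt", 1000

# SHA-1 yields 20 bytes per PBKDF2 block, SHA-256 yields 32.
short = hashlib.pbkdf2_hmac("sha1", password, salt, iterations, 20)  # one block
long_ = hashlib.pbkdf2_hmac("sha1", password, salt, iterations, 32)  # two blocks

# The output is the concatenation of independently computed blocks,
# so the first 20 bytes of the longer output equal the single-block output.
first_block_matches = long_[:20] == short

# Ceiling division: number of blocks (and full iteration runs) for 32 bytes.
blocks_needed_sha1 = -(-32 // hashlib.sha1().digest_size)
blocks_needed_sha256 = -(-32 // hashlib.sha256().digest_size)
```

An attacker who only needs the first block to test a password guess pays for one block, while the legitimate user pays for all of them, which is the usual argument for picking a hash at least as wide as the key you need.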
The attacks on SHA1 which caused a lot of public turmoil make it possible to construct a message which has the same hash as a different message. This is of course always possible (in principle) for every hash function, since a hash function has fewer output bits than input bits. However, it is normally not likely to happen by accident, and doing it on purpose should be computationally not feasible.
From an "ensure message integrity" point of view, this can be seen as a disaster.
On the other hand, for the purpose of generating random numbers, this has absolutely no bearing.
I'm trying to optimize my signature and verification scheme for an embedded device and I'm finding race conditions at just 0.5s/verification. Instead of making the device compute the SHA-256 hash of the data, could I just use an AES encryption and sign that with PSS to accelerate the process, or does it need to be a hashing algorithm?
If you’re asking if the concept could work: sure. You could pigeonhole the tag as a 128-bit hash output and, provided you’re calling an API that accepts the pre-computed hash, everything would work (provided you told the PSS operations they were using a 128-bit hash algorithm).
But no one else would be able to verify your signature, because that’s not a predefined way of doing RSASSA-PSS. And you’d have a “public only” verification problem… the only way someone can know if the tag matched the data was to also have the encryption key, so you would have to embed the key and nonce in the signature parameters (a really bad idea) or just be a private/application protocol.
So, it could be done, but it won’t interoperate, and it’s almost guaranteed never to be a standard because it can’t be used by implementations that don’t accept pre-computed hashes (without forcing the scheme to plaintext transport the content encryption key).
In general, RSA-PSS requires a hash algorithm for the mask generation function and other operations, so it's doubtful that you could actually make RSA-PSS work in any functional way with something that is not a hash algorithm.
The idea you're proposing is also likely insecure, so even if you could get it to work, it wouldn't be effective as a signature scheme, since it could probably be forged. That's because AES, unlike a hash function, allows users to invert the operation (that is, decrypt), so an attacker who knows the key (which, if you hard-code it, they do) can likely create arbitrary messages to sign.
For a secure digital signature, you really need a secure hash algorithm, which means that you need something like SHA-2, SHA-3, or similar (MD5 and SHA-1 are not secure and should not be used). If possible, I would investigate a possibly more performant SHA-256 implementation here. You could also try the hash algorithm BLAKE2s, which is both cryptographically secure and much faster in software than SHA-256, and may meet your needs better.
I'm experimenting with PBKDF2 for my passwords right now, and it dawned on me that if I were to ever upgrade to a faster machine in the future, I would want to increase the number of PBKDF2 iterations. However, this would invalidate all the current passwords that I have stored. One idea I've seen was to store the PBKDF2 settings along with the password (similar to how you store the salt) such as the iteration count and the PRF used (SHA-256, SHA-512) at the time of the hash creation. It sounds like a good idea in terms of backwards compatibility, but I wanted to know if there are any drawbacks to doing this. Any insight into this would be appreciated.
You are definitely taking the right direction here. Many systems store just the salt but where is the rest of the parameters required to perform PBKDF2? Hardcoded! And hardcoding parameters of cryptographic functions is almost never a good idea.
Only drawback I see is that when you store all the parameters your database will probably take a little more space but your future upgrades will be much easier and straightforward.
BTW RFC 2898 defines a structure called PBKDF2-params which was designed as a data holder for all the public parameters of the PBKDF2 algorithm. Use it at least as an inspiration so you won't forget any important parameter.
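One possible way to store the parameters with each password is sketched below in Python. The "$"-separated record layout and the function names (hash_password, verify_password, needs_upgrade) are assumptions made up for this example; they are not the RFC 2898 PBKDF2-params encoding, just the same idea in a simple text form.

```python
import base64
import hashlib
import hmac
import os


def hash_password(password: str, prf: str = "sha256",
                  iterations: int = 200_000) -> str:
    """Return a record that carries every public PBKDF2 parameter."""
    salt = os.urandom(16)
    dk = hashlib.pbkdf2_hmac(prf, password.encode("utf-8"), salt, iterations)
    return "$".join([
        "pbkdf2", prf, str(iterations),
        base64.b64encode(salt).decode("ascii"),
        base64.b64encode(dk).decode("ascii"),
    ])


def verify_password(password: str, record: str) -> bool:
    """Recompute the hash using the parameters stored in the record."""
    _, prf, iterations, salt_b64, dk_b64 = record.split("$")
    salt = base64.b64decode(salt_b64)
    expected = base64.b64decode(dk_b64)
    dk = hashlib.pbkdf2_hmac(prf, password.encode("utf-8"), salt,
                             int(iterations), len(expected))
    return hmac.compare_digest(dk, expected)  # constant-time comparison


def needs_upgrade(record: str, prf: str = "sha256",
                  iterations: int = 200_000) -> bool:
    """True if the record was made with weaker settings than the current ones."""
    _, rec_prf, rec_iterations, _, _ = record.split("$")
    return rec_prf != prf or int(rec_iterations) < iterations
```

With this layout old records keep verifying after the defaults change, and a record flagged by needs_upgrade can be re-hashed with the new settings the next time the user logs in successfully, since that is the only moment the plaintext password is available.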
Say I have some data and a password, and I want to encrypt the data in such a way that it can only be recovered with the right password.
How does this technically work (i.e. how to implement this)? I often hear people use bitshifting for encryption, but how do you base that on a password? How does password-based encryption work?
An example is Mac OS X FileVault
Thanks.
If you give sample code, preferably in C, Objective-C or pseudocode.
For (symmetric) encryption you need a secret key for encryption and decryption.
Usually, the password you supply is used as the source of this key. For various security reasons, the password is not (and often cannot, due to requirements of the cipher used) directly used as the key. Instead, a key derivation function is used to generate the key from the password.
This is why passwords for encryption must be long and fairly random: Otherwise the resulting key will only come from a very small subset of possible keys, and these can then simply all be tried, thus brute-forcing the encryption.
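A back-of-the-envelope calculation shows how small the effective key space can get. The sketch below assumes every character is picked uniformly at random from the given alphabet, which real human-chosen passwords usually are not, so these figures are upper bounds.

```python
import math


def password_bits(alphabet_size: int, length: int) -> float:
    """Bits of entropy for a uniformly random password of the given length."""
    return length * math.log2(alphabet_size)


# 8 random lowercase letters: far below the 128 bits of an AES-128 key.
lowercase_8 = password_bits(26, 8)

# 20 random characters from the 94 printable ASCII symbols (space excluded).
random_printable_20 = password_bits(94, 20)
```

The first value comes out between 37 and 38 bits, so an attacker only has to try the keys reachable from those passwords instead of the cipher's full key space; the second is just above 128 bits.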
As to code examples, there are several possibilities:
look at the source code of a crypto library, such as OpenSSL
look at the source code of a program that implements encryption, such as GnuPG
google some sample source code for a simple encryption algorithm, or a key derivation function, and try to understand it
This depends on what you want to learn.
You'll need to look to other resources for a deep explanation, as this question is extremely broad.
Speaking generally: you use a password as a "seed" for an encryption key, as sleske pointed out. Then you use this key to apply a two-way encryption algorithm (i.e. one that can be applied once to encrypt and again to decrypt). When you apply the algorithm to a piece of data, it becomes encrypted in such a way that you could never get the data back out again without using the same key, and you can't practically produce the same key without having the same password as a seed.
If you're interested in crypto, read Applied Cryptography by Bruce Schneier. Excellent read, lots of examples. It goes through many different cryptography types.
An easy way, but not exactly secure, is to rotate each byte by a number determined by the password. You can use a hash code from a string, or count the number of characters, or whatever for the number.
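One reading of that idea, as a toy Python sketch: add a password-derived number to every byte value, modulo 256. The function names are made up for the example, and this is for illustration only. There are just 256 possible shifts, so it can be broken by trying them all.

```python
def shift_from_password(password: str) -> int:
    """Turn the password into a number in 0..255 (here: sum of its bytes)."""
    return sum(password.encode("utf-8")) % 256


def rotate(data: bytes, password: str, decrypt: bool = False) -> bytes:
    """Shift every byte by the password-derived amount; reverse to decrypt."""
    shift = shift_from_password(password)
    if decrypt:
        shift = -shift
    return bytes((b + shift) % 256 for b in data)


secret = rotate(b"attack at dawn", "hunter2")
recovered = rotate(secret, "hunter2", decrypt=True)
```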
What you are probably thinking of, though, is public key encryption. Here is a link to a document that will tell you the math for it - you'll have to work out the implementation details yourself, but it's not that hard once you understand the math.
http://mathaware.org/mam/06/Kaliski.pdf
The basic building block of most block ciphers is a construction called a Feistel Network. It's reasonably easy to understand.
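For a feel of how a Feistel network works, here is a toy Python sketch. The round function (a truncated SHA-256 of round key plus half-block) and the hard-coded round keys are assumptions picked to keep the example short; a real cipher has a carefully designed round function and key schedule. The point is only that the structure is invertible even though the round function itself is not.

```python
import hashlib


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def round_function(half: bytes, round_key: bytes) -> bytes:
    """Toy round function: it does not need to be invertible."""
    return hashlib.sha256(round_key + half).digest()[:len(half)]


def feistel_encrypt(block: bytes, round_keys: list) -> bytes:
    half = len(block) // 2
    left, right = block[:half], block[half:]
    for key in round_keys:
        # New left is the old right; new right is old left XOR F(old right).
        left, right = right, xor(left, round_function(right, key))
    return left + right


def feistel_decrypt(block: bytes, round_keys: list) -> bytes:
    half = len(block) // 2
    left, right = block[:half], block[half:]
    for key in reversed(round_keys):
        # Undo one round: same round function, round keys in reverse order.
        left, right = xor(right, round_function(left, key)), left
    return left + right


keys = [b"k1", b"k2", b"k3", b"k4"]
plaintext = b"sixteen byte msg"          # one 16-byte block
ciphertext = feistel_encrypt(plaintext, keys)
```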
Stream ciphers are even simpler - they're essentially just pseudo-random number generators, albeit with some important security properties, where the initial internal state is derived from the key.
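The stream cipher idea can be sketched in a few lines of Python. The keystream generator below (SHA-256 over key, nonce, and a counter) is an assumption made for illustration and has not been vetted as a cipher; use a standard one such as ChaCha20 or AES-CTR from a crypto library in practice.

```python
import hashlib


def keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy pseudo-random generator whose state comes from the key and nonce."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def stream_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Encrypt or decrypt: XOR the data with the keystream."""
    stream = keystream(key, nonce, len(data))
    return bytes(d ^ k for d, k in zip(data, stream))


key, nonce = b"0123456789abcdef", b"unique-per-message"
ciphertext = stream_xor(key, nonce, b"hello, stream cipher")
plaintext = stream_xor(key, nonce, ciphertext)
```

Because encryption and decryption are the same XOR, reusing a key and nonce pair for two messages leaks the XOR of the plaintexts, which is one of the "important security properties" a real design has to handle.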
Password based encryption IS symmetric. The input usually consists of a salt in addition to the password. FooBabel has a cool app where you can play around with this... currently they hard code the Salt to an array of eight bytes (zero to seven) for simplicity. I put in a request to see that they let users input the salt. Anyway, here it is - PBECrypto
In the current project I would like to create my own hash function but so far haven't gained much theoretical background on hashing principle.
I would be very thankful if anyone of you could suggest any useful resource about the theory of hashing, cryptography and practical implementations of hash functions.
Thank you!
P.S. As hashing blocks of information in this case is a part of a larger research project, I would like to create a hash function on my own and this way learn the principle rather than use the existing libraries. The information I am working on will stay in house so there is no need to worry about possible attacks.
Don't. Existing encryption and hashing algorithms (as pointed out in the comments above, they have little to do with each other) have been designed by experts and extensively peer-reviewed. Anything you write from scratch will suck in comparison. Guaranteed. Really. The only thing you'll gain is a false sense of security -- your algorithm won't be peer-reviewed, so you'll think it's more secure than it actually is.
But if you do want to know more about the theory (and gain an appreciation for why you shouldn't do it yourself), read "Applied Cryptography" by Bruce Schneier. You won't find a better resource.
Brush up on your math first.
First of all, if you use the right terminology, you'll be better able to find helpful resources.
"Encryption" is performed with ciphers, not cryptographic hash functions. You'll never find a reliable reference that mentions a hash as an "encryption function". So, if you are trying to learn about hashes, leave "encryption" out.
Another term for "cryptographic hash" is "message digest," so keep that in mind as you search.
Many chapters of an excellent book, The Handbook of Applied Cryptography are available for free online. Especially check out Chapter 9, "Hash Functions and Data Integrity."
Instead of writing your own hashing function have you considered using a standard hashing function from a library and then salting the data you're hashing? That is common practice and ensures that anyone with software that decrypts data with standard encryption functions doesn't intercept your data and decipher it.
Like the others said, do not make a new kind of hash (the code will get complicated and you might as well reinvent SHA1 or MD5.) Study cryptography first. But if you are willing to, look at existing hashes (most are based on another). Or you can look at the hash model. The hash model looks like:
A mixing stage (mix up the contents and modify)
A combining stage (combine the data in the mixing stage with the initial state [the original hash])
Or maybe start with something simple and build up from it (to make a secure hash).
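To show the shape of the mixing and combining stages described above, here is a deliberately simple Python sketch. The constants, the 32-bit state, and the padding are arbitrary choices made for the example; this is a non-cryptographic toy for learning the structure and should not be used where attacks matter.

```python
MASK = 0xFFFFFFFF  # keep everything within 32 bits


def mix(block: int) -> int:
    """Mixing stage: scramble the bits of one 32-bit block."""
    block ^= block >> 16
    block = (block * 0x45D9F3B) & MASK
    block ^= block >> 16
    return block


def combine(state: int, mixed: int) -> int:
    """Combining stage: fold the mixed block into the running state."""
    return ((state * 31) ^ mixed) & MASK


def toy_hash(data: bytes, initial_state: int = 0x12345678) -> int:
    # Pad with 0x80 and zeros so the input splits into whole 4-byte blocks.
    padded = data + b"\x80" + b"\x00" * (-(len(data) + 1) % 4)
    state = initial_state
    for i in range(0, len(padded), 4):
        block = int.from_bytes(padded[i:i + 4], "big")
        state = combine(state, mix(block ^ state))
    return state
```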
The Skein hash proposed for SHA-3 boasts some impressive speed results, which I suspect would be applicable for the Threefish block cipher at its heart - but, if Skein is approved for SHA-3, would this imply that Threefish is considered secure as well? That is, would any vulnerability in Threefish imply a vulnerability in SHA-3? (and thus, a lack of known issues and a general trust in SHA-3 imply the same for Threefish)
Nope. The security of Skein does not imply the security of Threefish. Putting it positively, if someone finds a weakness in Threefish then this does not imply that Skein is also insecure.
The question, however, is quite interesting and applies to other hash functions too.
Skein uses a Davies-Meyer construction with some modification. MD5, SHA1 and many other hash functions are also using this Davies-Meyer construction and hence they are in principle based on a block cipher. It's just that in the case of MD5 or SHA1 that block cipher does not have a name, and I'm not aware of much research on how suitable these constructs are.
The requirements for a good block cipher and for a good hash function are different. Somewhat simplified, if E is a block cipher and it is not feasible to find two keys K, K' and two messages M, M' (with (K, M) different from (K', M')) such that E_K(M) xor M = E_K'(M') xor M' then E is suitable for constructing a hash function using Davies-Meyer. But to be secure as a block cipher E would need other properties. E would have to resist chosen-ciphertext attacks, chosen-plaintext attacks etc.
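The construction itself can be sketched as follows: each message block is used as the cipher key, the running hash value is the cipher input, and the output is XORed back into the hash value. In the notation above, K is the message block and M is the chaining value. The toy_cipher below is a made-up 4-round Feistel stand-in for E, and the padding omits the message length that real designs append; both are simplifications for illustration.

```python
import hashlib

BLOCK = 16  # block size of the toy cipher, in bytes


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def toy_cipher(key: bytes, block: bytes) -> bytes:
    """Stand-in for a block cipher E_key(block); NOT a real cipher."""
    left, right = block[:8], block[8:]
    for r in range(4):
        f = hashlib.sha256(key + bytes([r]) + right).digest()[:8]
        left, right = right, xor(left, f)
    return left + right


def davies_meyer(message: bytes, iv: bytes = bytes(BLOCK)) -> bytes:
    # Simplified padding: 0x80 then zeros up to a whole number of blocks.
    padded = message + b"\x80" + bytes(-(len(message) + 1) % BLOCK)
    state = iv
    for i in range(0, len(padded), BLOCK):
        m = padded[i:i + BLOCK]
        # Compression step: H_i = E_{m_i}(H_{i-1}) xor H_{i-1}
        state = xor(toy_cipher(m, state), state)
    return state
```

Note that the attacker chooses the cipher key here (it is the message), which is a setting ordinary block cipher analysis does not have to consider.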
Furthermore, if E is a good block cipher, that also does not mean it gives a good hash function. Microsoft had to learn this the hard way with the hash they used in the XBOX. This hash was based on the block cipher TEA, which had a weakness that was insignificant for a block cipher but proved fatal when used for a hash function.
To be fair, there are some relations between being a good block cipher and being suitable for a hash function. E.g., in both cases differential attacks need to be avoided. Hence some design methods used for constructing good block ciphers can be used to construct good hash functions.
Let me also add that some of the proposals for SHA-3 are based on AES. So far, I haven't seen much support for favoring AES based hash functions, just because AES is already a standard. These hash functions are analyzed just like any other SHA-3 proposal.
Disregard my previous answer. I misunderstood the relationship between Skein and Threefish. I still don't think Skein being approved absolutely proves Threefish is generally secure (it's possible Threefish is only secure when used in a particular manner), but it would be an indication.