I'm using bcrypt to store passwords in my database, using a work factor of 7, which takes about 0.02s to hash a single password on my reasonably modern laptop.
Coda Hale says that using bcrypt allows you to 'keep up with Moore's law' by tweaking the work factor. But there's no way to re-hash a user's password after the fact, since I'm not storing the plaintext. How can I keep my database up-to-date and difficult to crack (assuming it hangs around for the 5+ years it would take for this to become an issue)?
Re-hash on login. See Optimal bcrypt work factor.
Remember that the work factor is stored inside the hash value itself: $2a$(2 chars work factor)$(22 chars salt)(31 chars hash). It is not a fixed value.
If you find the load is too high, just make it so that the next time they log in, you re-hash with something faster to compute. Similarly, as time goes on and you get better servers, if load isn't an issue, you can upgrade the strength of their hash when they log in.
The trick is to keep it taking roughly the same amount of time forever into the future, in step with Moore's Law. The cost parameter is logarithmic (base 2), so every time computers double in speed, add 1 to the default value...
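A minimal sketch of that upgrade-on-login flow in Python, assuming the pyca `bcrypt` package and a hypothetical `save_password_hash` persistence helper (neither is specified in the original question):

```python
import bcrypt

TARGET_COST = 12  # raise this over time as hardware gets faster (log2 work factor)

def verify_and_upgrade(password: str, stored_hash: bytes) -> bool:
    """Check the password; if it matches but was hashed with a lower cost, re-hash it."""
    if not bcrypt.checkpw(password.encode("utf-8"), stored_hash):
        return False
    # The cost is embedded in the hash itself: $2a$<cost>$<22-char salt><31-char hash>
    stored_cost = int(stored_hash.split(b"$")[2])
    if stored_cost < TARGET_COST:
        new_hash = bcrypt.hashpw(password.encode("utf-8"),
                                 bcrypt.gensalt(rounds=TARGET_COST))
        save_password_hash(new_hash)  # hypothetical: persist the upgraded hash
    return True
```

Because the plaintext is only available during login, the upgrade can only happen then; accounts that never log in keep their old cost, which is the limitation discussed further down.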
Related
What is the difference between password entropy and min-entropy?
Is there a recommended amount of min-entropy, or a standard, to ensure strong passwords?
How do you convert min-entropy to a number of days/months/years?
For example: a min-entropy of 30 bits corresponds to a brute-force attack time of 2 years.
More information is needed about the attack that is being defended against.
The problem is not brute-forcing the entire password space but rather running through a list of frequently used passwords ordered by popularity, say the top 10,000,000; that will not take 2 years. Then there is dedicated cracking hardware (arrays of GPUs) combined with fuzzing software.
There is also the difference between cracking a particular password and cracking 90% of a million passwords for sale on the dark web.
What is necessary is to use an iterated key derivation function such as PBKDF2 with 100,000 iterations (about 100 ms).
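For reference, a short sketch of that kind of call using Python's standard-library `hashlib.pbkdf2_hmac`; the 100,000-iteration figure is the one from the answer above, and the roughly 100 ms target depends entirely on your hardware:

```python
import hashlib
import os
import time

password = b"correct horse battery staple"
salt = os.urandom(16)        # random per-user salt, stored alongside the hash
iterations = 100_000         # tune so the call takes ~100 ms on your servers

start = time.perf_counter()
derived = hashlib.pbkdf2_hmac("sha256", password, salt, iterations, dklen=32)
print(f"pbkdf2 took {time.perf_counter() - start:.3f}s -> {derived.hex()}")
```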
For more information see:
Password list at SecLists.
Infosec password-cracking-tools
Arstechnica How I became a password cracker
Advanced Password Recovery hashcat
[DRAFT NIST Special Publication 800-63B Digital Authentication Guideline](https://pages.nist.gov/800-63-3/sp800-63b.html)
A colleague implemented our password-hashing code after a fair amount of research, including taking advice from https://crackstation.net/hashing-security.htm
The resulting password hash includes the salt (which is supposed to be OK, and is necessary to validate the password), and also includes the iteration count, which is high for key stretching.
It's nice that the iteration count is saved in the database, because lower counts can be used in unit tests, and if we change the counts then existing saved password hashes can still be validated. But I wonder if it's safe to include the number: wouldn't a brute-force attack be easier if the iteration count were known? It seems to me that knowing it spares the attacker a lot of extra checks against each candidate iteration count tested incrementally.
It is ok to include the iteration count in the resulting hash.
Most importantly, this allows you to increase the number of iterations as future hardware becomes faster. It is necessary to be able to adapt to faster hardware without losing older hash values.
It wouldn't help much to hide this number. If it is not known, an attacker would assume a reasonable number, maybe a bit higher. He could compare not only the last iteration with the hash value, but also every step in between. In the case of bcrypt, with its logarithmic cost parameter, this would mean only about 3-5 extra compare operations (too-small numbers are not reasonable); that's not a big deal.
Well-known APIs like PHP's password_hash() include the cost parameter as well.
Edit:
Hiding the number of iterations would add a kind of secret to the hashing process; an attacker would have to guess this number. There are much better ways, though, to add a server-side secret; I tried to explain this in my tutorial about safely storing passwords (have a look at the part about a pepper and encrypting the hash value).
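To make the "self-describing hash" idea concrete, here is a hedged sketch; the record layout (algorithm$iterations$salt$hash) is invented for illustration and is not the colleague's actual format, but it shows why storing the count lets you raise it later without breaking old records:

```python
import hashlib
import hmac
import os

def hash_password(password: str, iterations: int = 100_000) -> str:
    salt = os.urandom(16)
    dk = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)
    # Self-describing record: algorithm$iterations$salt$hash
    return f"pbkdf2_sha256${iterations}${salt.hex()}${dk.hex()}"

def verify_password(password: str, record: str) -> bool:
    algorithm, iterations, salt_hex, hash_hex = record.split("$")
    assert algorithm == "pbkdf2_sha256"
    # Re-derive with the iteration count stored in the record itself,
    # so old records keep validating after the default is raised.
    dk = hashlib.pbkdf2_hmac("sha256", password.encode(),
                             bytes.fromhex(salt_hex), int(iterations))
    return hmac.compare_digest(dk.hex(), hash_hex)
```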
I'm working on my site's authentication and was thinking of using bcrypt and randomly creating a salt that's stored in my user's login row in the database. I want my site to be fast, but anything over 15 to generate (takes about 1 second) is too slow, so I was thinking of randomly generating a salt between, say, 5-14. But is that secure, or is there a better way?
If it helps, I'm using py-bcrypt.
One major reason to use bcrypt is to prevent brute-force attacks by requiring a lot of CPU time to calculate hashes. For your problem I would use a constant-length salt with random values; this way each password takes the same amount of time to calculate.
From this you can tailor your salt length and number of stretching iterations to whatever you feel is secure enough, though I personally like to make sure the hash takes at least half a second to generate on a really beefy server.
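One way to pick that fixed cost is to benchmark a few candidate values on the target server and take the smallest one over your time budget; a rough sketch assuming the pyca `bcrypt` package (the half-second budget echoes the answer above, and real numbers vary by machine):

```python
import time
import bcrypt

def pick_cost(min_seconds: float = 0.5, max_cost: int = 20) -> int:
    """Return the smallest bcrypt cost whose hashing time meets the budget."""
    for cost in range(4, max_cost + 1):          # 4 is bcrypt's minimum cost
        start = time.perf_counter()
        bcrypt.hashpw(b"benchmark-password", bcrypt.gensalt(rounds=cost))
        if time.perf_counter() - start >= min_seconds:
            return cost
    return max_cost
```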
OK, so it seems the salt length and work factor are linked. bcrypt is already rather secure, but the issue is that no matter what kind of hash you use, the password strength itself is at least as important. So you should try for the best you can handle on your server with a minimum cost (strength) of 12.
Note that a cryptographically secure but fast (and regularly reseeded) RNG is needed for generating the salts, or you might run out of randomness.
More importantly: make sure that the passwords have sufficient strength. Finding the password "password" takes no time at all, even with bcrypt.
No, there is no better way, except finding a faster implementation of the password hashing. An attacker will of course use the fastest implementation that can be found.
It seems that the current best practice for storing passwords on the web is to use bcrypt as opposed to SHA-256 or any other hashing algorithm. bcrypt seems fantastic, with one flaw as I see it: if I have a database filled with passwords using a work factor of 10 and I want to increase that work factor to 12 because computational power has increased, then I have no way of doing this without knowing the user's password, meaning waiting until they log in again. This causes problems for users who have abandoned their accounts.
It seems to me, then, that an alternative solution would be to use SHA-256 and do a number of passes equal to 2^(work factor). If I do this, then when I want to increase the work factor I can just apply the difference in the number of passes to every stored password.
I've written a bit of code to do exactly that, and I'd like to get feedback from everyone on whether this is a good idea or not.
https://github.com/rbrcurtis/pcrypt
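To illustrate the idea in the question (not an endorsement; the answer below explains why bcrypt or scrypt remains preferable): because the passes compose, a stored hash can be lifted from work factor 10 to 12 by applying only the difference in passes, without knowing the password. A toy sketch:

```python
import hashlib

def stretch(data: bytes, passes: int) -> bytes:
    """Apply SHA-256 repeatedly `passes` times."""
    for _ in range(passes):
        data = hashlib.sha256(data).digest()
    return data

# Initial hash at work factor 10: 2**10 passes over salt + password.
salt = b"random-per-user-salt"          # illustration only; use os.urandom in practice
stored = stretch(salt + b"correct horse battery staple", 2 ** 10)

# Later, upgrade every stored hash from work factor 10 to 12 without the
# password: apply only the difference in passes.
upgraded = stretch(stored, 2 ** 12 - 2 ** 10)

# Verification at work factor 12 reproduces the same value.
assert upgraded == stretch(salt + b"correct horse battery staple", 2 ** 12)
```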
Did a lot of digging and reading papers on these various algorithms. What finally gave me a sort-of answer was this question on crypto.stackexchange.com. My algorithm is somewhat similar to shacrypt, which I hadn't heard of previously, but it is still not as good as bcrypt. The reason is that bcrypt, in addition to the work factor, also requires more memory to process than the SHA-2 family. This means it cannot be parallelized as effectively on GPUs (although to some extent it can be, and more easily on an FPGA), while SHA-2 can be (and easily). As such, no matter how many passes of SHA-2 one does, it will still not be as effective as bcrypt.
As an aside, scrypt is significantly better still, because it has both a work factor for CPU and a memory factor (and as such is essentially impossible to parallelize on a GPU or FPGA). The only issue is that the Node.js library for scrypt is essentially unusable at present, so that might be something to put some effort into.
A potential solution for upping the number of bcrypt passes (or the work factor; I don't actually use bcrypt, but this is an algorithm-agnostic answer):
For each entry in the table where your passwords are stored, also store the number of passes it was hashed with. When you move up to more passes, save all new passwords with that number of passes, and set all passwords with fewer passes to expire in 7 days. When users make a new password, hash it with the right number of passes.
Alternatively, you can skip the reset: the next time they try to log in, re-hash their password and store it in the table. This does mean that if people haven't logged in for a long time, their passwords are more susceptible to breach in the event of a DB compromise. That being said, it's more worth it for the attacker to attack the mass of people with more passes than the few with fewer passes (never mind: because of salts, this last sentence is wrong).
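A rough sketch of the bookkeeping both variants need, with invented field names (`passes`, `expires_at`) and hypothetical `verify`/`rehash` wrappers around whatever KDF is in use:

```python
from datetime import datetime, timedelta

CURRENT_PASSES = 12                 # today's target work factor
EXPIRY_GRACE = timedelta(days=7)    # grace period from the answer above

def on_login(user, password, verify, rehash):
    """Illustrative only: `user` is a dict-like row; `verify`/`rehash` wrap your KDF."""
    if not verify(password, user["hash"], user["passes"]):
        return False
    if user["passes"] < CURRENT_PASSES:
        # Variant 2: upgrade transparently on the next successful login.
        user["hash"] = rehash(password, CURRENT_PASSES)
        user["passes"] = CURRENT_PASSES
        user["expires_at"] = None
    return True

def on_work_factor_bump(users):
    # Variant 1: schedule a password reset for rows still at the old strength.
    for user in users:
        if user["passes"] < CURRENT_PASSES and user.get("expires_at") is None:
            user["expires_at"] = datetime.utcnow() + EXPIRY_GRACE
```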
I saw some guy who encrypts users' passwords multiple times with MD5 to improve security. I'm not sure if this works, but it doesn't look good. So, does it make sense?
Let's assume the hash function you use is a perfect one-way function. Then you can view its output like that of a "random oracle": its output values lie in a finite range (2^128 for MD5).
Now what happens if you apply the hash multiple times? The output will still stay in the same range (2^128). It's like saying "Guess my random number!" twenty times, each time thinking of a new number - that doesn't make it harder or easier to guess. There isn't anything "more random" than random. That's not a perfect analogy, but I think it helps to illustrate the problem.
Considering brute-forcing a password, your scheme doesn't add any security at all. Even worse, the only thing you could "accomplish" is to weaken security by introducing some possibility of exploiting the repeated application of the hash function. It's unlikely, but at least it's guaranteed that you won't gain anything.
So why is not all lost with this approach? It's because of the point the others made about having thousands of iterations instead of just twenty. Why is slowing the algorithm down a good thing? Because most attackers will try to gain access using a dictionary (or a rainbow table of often-used passwords), hoping that one of your users was negligent enough to use one of those (I'm guilty; at least Ubuntu told me so upon installation). But on the other hand it's inhumane to require your users to remember, say, 30 random characters.
That's why we need some form of trade-off between passwords that are easy to remember and making it as hard as possible for attackers to guess them. There are two common practices: salts, and slowing the process down by applying lots of iterations of some function instead of a single iteration. PKCS#5 is a good example to look into.
In your case, applying MD5 20,000 times instead of 20 would slow attackers using a dictionary down significantly, because each of their input passwords would have to go through the ordinary procedure of being hashed 20,000 times in order to still be useful as an attack. Note that this procedure does not affect brute-forcing as illustrated above.
But why is using a salt still better? Because even if you apply the hash 20,000 times, a resourceful attacker could pre-compute a large database of passwords, hashing each of them 20,000 times, effectively generating a customized rainbow table specifically targeted at your application. Having done this they could quite easily attack your application or any other application using your scheme. That's why you also need a salt, to force that high cost to be paid per password and make such rainbow tables impractical to use.
If you want to be on the really safe side, use something like PBKDF2 illustrated in PKCS#5.
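As a toy illustration of the salted, many-iteration scheme described above (MD5 and the hand-rolled loop only mirror the discussion; in practice reach for PBKDF2 as the answer says):

```python
import hashlib
import os

def slow_md5_hash(password: bytes, salt: bytes, iterations: int = 20_000) -> bytes:
    """Toy salted, iterated MD5 -- mirrors the scheme discussed, not production advice."""
    digest = salt + password
    for _ in range(iterations):
        # Folding the salt in on every round keeps precomputed tables useless.
        digest = hashlib.md5(salt + digest).digest()
    return digest

salt = os.urandom(16)                    # unique per user
stored = (salt, slow_md5_hash(b"hunter2", salt))
```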
Hashing a password is not encryption. It is a one-way process.
Check out security.stackexchange.com and the password-related questions. They are so popular we put together this blog post specifically to help individuals find useful questions and answers.
This question specifically discusses using md5 20 times in a row - check out Thomas Pornin's answer. Key points in his answer:
20 is too low; it should be 20,000 or more - password processing is still too fast.
There is no salt: an attacker may attack passwords with very low per-password cost, e.g. rainbow tables - which can be created for any number of MD5 cycles.
Since there is no sure test for knowing whether a given algorithm is secure or not, inventing your own cryptography is often a recipe for disaster. Don't do it.
There is such a question on crypto.SE but it is NOT public now. The answer by Paŭlo Ebermann is:
For password-hashing, you should not use a normal cryptographic hash, but something made specially to protect passwords, like bcrypt.
See How to safely store a password for details.
The important point is that password crackers don't have to brute-force the hash output space (2^160 for SHA-1), but only the password space, which is much, much smaller (depending on your password rules - and often dictionaries help). Thus we don't want a fast hash function, but a slow one. Bcrypt and friends are designed for this.
And a similar question has these answers:
The question is "Guarding against cryptanalytic breakthroughs: combining multiple hash functions"
Answer by Thomas Pornin:
Combining is what SSL/TLS does with MD5 and SHA-1, in its definition of its internal "PRF" (which is actually a Key Derivation Function). For a given hash function, TLS defines a KDF which relies on HMAC which relies on the hash function. Then the KDF is invoked twice, once with MD5 and once with SHA-1, and the results are XORed together. The idea was to resist cryptanalytic breaks in either MD5 or SHA-1. Note that XORing the outputs of two hash functions relies on subtle assumptions. For instance, if I define SHB-256(m) = SHA-256(m) XOR C, for a fixed constant C, then SHB-256 is as good a hash function as SHA-256; but the XOR of both always yields C, which is not good at all for hashing purposes. Hence, the construction in TLS is not really sanctioned by the authority of science (it just happens not to have been broken). TLS-1.2 does not use that combination anymore; it relies on the KDF with a single, configurable hash function, often SHA-256 (which is, in 2011, a smart choice).
As @PulpSpy points out, concatenation is not a good generic way of building hash functions. This was published by Joux in 2004 and then generalized by Hoch and Shamir in 2006, for a large class of constructions involving iterations and concatenations. But mind the fine print: this is not really about surviving weaknesses in hash functions, but about getting your money's worth. Namely, if you take a hash function with a 128-bit output and another with a 160-bit output, and concatenate the results, then collision resistance will be no worse than the strongest of the two; what Joux showed is that it will not be much better either. With 128+160 = 288 bits of output, you could aim at 2^144 resistance, but Joux's result implies that you will not go beyond about 2^87.
So the question becomes: is there a way, if possible an efficient way, to combine two hash functions such that the result is as collision-resistant as the strongest of the two, but without incurring the output enlargement of concatenation? In 2006, Boneh and Boyen have published a result which simply states that the answer is no, subject to the condition of evaluating each hash function only once. Edit: Pietrzak lifted the latter condition in 2007 (i.e. invoking each hash function several times does not help).
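The SHB-256 counterexample above is easy to check concretely; a small sketch (the constant C is arbitrary and chosen here purely for illustration):

```python
import hashlib
import os

C = bytes(range(32))                     # arbitrary fixed 32-byte constant

def sha256(m: bytes) -> bytes:
    return hashlib.sha256(m).digest()

def shb256(m: bytes) -> bytes:
    # SHB-256(m) := SHA-256(m) XOR C -- individually still a fine hash function.
    return bytes(a ^ b for a, b in zip(sha256(m), C))

# ...but XORing the two outputs together is useless: it is always C.
for m in (b"hello", b"world", os.urandom(16)):
    combined = bytes(a ^ b for a, b in zip(sha256(m), shb256(m)))
    assert combined == C
```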
And by PulpSpy:
I'm sure @Thomas will give a thorough answer. In the interim, I'll just point out that the collision resistance of your first construction, H1(m)||H2(M), is surprisingly not that much better than just H1(M). See section 4 of this paper:
http://web.cecs.pdx.edu/~teshrim/spring06/papers/general-attacks/multi-joux.pdf
No, it's not a good practice; you must use a salt for your hashing, because otherwise the password can be cracked with rainbow tables.