We are currently storing plain text passwords for a web app that we have.
I keep advocating moving to a password hash but another developer said that this would be less secure -- more passwords could match the hash and a dictionary/hash attack would be faster.
Is there any truth to this argument?
Absolutely none. But it doesn't matter. I've posted a similar response before:
It's unfortunate, but people, even programmers, are just too emotional to be easily swayed by argument. Once he's invested in his position (and, if you're posting here, he is) you're not likely to convince him with facts alone. What you need to do is switch the burden of proof. You need to get him out looking for data that he hopes will convince you, and in so doing learn the truth. Unfortunately, he has the benefit of the status quo, so you've got a tough road there.
From Wikipedia
Some computer systems store user passwords, against which to compare user log on attempts, as cleartext. If an attacker gains access to such an internal password store, all passwords and so all user accounts will be compromised. If some users employ the same password for accounts on different systems, those will be compromised as well.
More secure systems store each password in a cryptographically protected form, so access to the actual password will still be difficult for a snooper who gains internal access to the system, while validation of user access attempts remains possible.
A common approach stores only a "hashed" form of the plaintext password. When a user types in a password on such a system, the password handling software runs through a cryptographic hash algorithm, and if the hash value generated from the user's entry matches the hash stored in the password database, the user is permitted access. The hash value is created by applying a cryptographic hash function to a string consisting of the submitted password and, usually, another value known as a salt. The salt prevents attackers from building a list of hash values for common passwords. MD5 and SHA1 are frequently used cryptographic hash functions.
There is much more that you can read on the subject on that page. In my opinion, and in everything I've read and worked with, hashing is a better scenario unless you use a very small (< 256 bit) algorithm.
There is absolutely no excuse for keeping plain text passwords in the web app. Use a standard hashing algorithm (SHA-1, not MD5!) with a salt value, so that rainbow table attacks become impractical.
I don't understand how your other developer thinks 'more passwords could match the hash'.
There is an argument that a 'hash attack would be faster', but only if you're not salting the passwords as they're hashed. Normally, hashing functions allow you to provide a salt, which makes the use of a known hash table a waste of time.
Personally, I'd say 'no', based on the above as well as the fact that if the stored data does somehow get exposed, a salted, hashed value is of little use to someone trying to get in. Hashing also provides the benefit of making all passwords 'look' the same length.
i.e., if hashing any string always results in a 20 character hash, then if you have only the hash to look at, you can't tell whether the original password was eight characters or sixteen, for example.
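A quick way to see the fixed-length point for yourself is the minimal Python sketch below. SHA-1 is used only because its 20-byte digest matches the "20 character" example above; it is not meant as a recommendation, and the helper name is made up for illustration.

```python
import hashlib


def sha1_digest_length(password: str) -> int:
    """Return the length in bytes of the SHA-1 digest of the given password."""
    return len(hashlib.sha1(password.encode("utf-8")).digest())


# An 8-character and a 16-character password both yield a digest of the same size,
# so the stored value alone does not reveal how long the original password was.
short_len = sha1_digest_length("abcdefgh")
long_len = sha1_digest_length("abcdefghijklmnop")
print(short_len, long_len)
```

Note that the hex representation of the same digest is 40 characters, since each byte is written as two hex digits.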
I encountered this exact same issue in my workplace. What I did to convince him that hashing was more secure was to write a SQL injection that returned the list of users and passwords from the public section of our site. It was escalated right away as a major security issue :)
To protect against dictionary/hash attacks, be sure to hash against a token that's unique to each user and static (username/join date/userguid works well).
If you do not salt your passwords, you're susceptible to rainbow table attacks (precompiled dictionaries that have valid inputs for a given hash).
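To make the precomputed-table attack concrete, here is a small Python sketch. It uses a plain dictionary as a stand-in for a real rainbow table (which is a more compact structure), and plain SHA-256 purely for illustration; the word list and function names are assumptions, not anything from a real tool.

```python
import hashlib
import secrets

# Hypothetical attacker word list; real lists contain millions of entries.
COMMON_PASSWORDS = ["password", "12345", "letmein", "qwerty"]


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


# Built once by the attacker, then reusable against every unsalted database.
LOOKUP = {sha256_hex(p.encode("utf-8")): p for p in COMMON_PASSWORDS}


def crack_unsalted(stored_hash: str):
    """Return the password if its unsalted hash is in the table, else None."""
    return LOOKUP.get(stored_hash)


def hash_with_salt(password: str, salt: bytes) -> str:
    """Hash the password together with a per-user salt."""
    return sha256_hex(salt + password.encode("utf-8"))


unsalted = sha256_hex(b"letmein")
salted = hash_with_salt("letmein", secrets.token_bytes(16))
print(crack_unsalted(unsalted))  # the table contains this hash
print(crack_unsalted(salted))    # the table was built without this salt
```

With a per-user salt the attacker has to redo the hashing work for each user instead of doing one lookup.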
The other developer should stop talking about security if you're storing passwords in plaintext and start reading about security.
Collisions are possible, but not a big problem for password apps usually (they are mainly a problem in areas where hashes are used as a way to verify the integrity of files).
So: Salt your passwords (by adding the salt to the right side of the password*) and use a good hashing algorithm like SHA-1 or preferably SHA-256 or SHA-512.
PS: A bit more detail about Hashes here.
*I'm a bit unsure whether the salt should go at the beginning or at the end of the string. The problem is that if you have a collision (two inputs with the same hash), adding the salt to the "wrong" side will not change the resulting hash. Either way, you won't have big problems with rainbow tables, only with collisions.
There is an old saying about programmers pretending to be cryptographers :)
Jeff Atwood has a good post on the subject: You're Probably Storing Passwords Incorrectly
To reply more extensively: I agree with all of the above. In theory, the hash makes it easier to get in with the user's password, since multiple passwords match the same hash. However, this is much less likely to happen than someone getting access to your database.
There is truth in that if you hash something, yes, there will be collisions so it would be possible for two different passwords to unlock the same account.
From a practical standpoint though, that's a poor argument - a good hashing function (md5 or sha1 would be fine) can pretty much guarantee that for all meaningful strings, especially short ones, there will be no collisions. Even if there were, having two passwords match for one account isn't a huge problem - if someone is in a position to randomly guess passwords fast enough that they are likely to be able to get in, you've got bigger problems.
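To put a rough number on "pretty much guarantee", the Python sketch below computes the birthday bound. It assumes an idealized hash whose outputs are uniformly distributed, so it only covers accidental collisions between ordinary passwords, not deliberate collision attacks on a particular algorithm. The function name and the figure of one billion passwords are illustrative assumptions.

```python
from fractions import Fraction


def collision_probability_bound(num_passwords: int, hash_bits: int) -> Fraction:
    """Upper bound on the chance that any two of num_passwords distinct inputs
    share the same hash_bits-bit digest (number of pairs / number of digests)."""
    pairs = num_passwords * (num_passwords - 1) // 2
    return Fraction(pairs, 2 ** hash_bits)


# One billion distinct passwords hashed with a 160-bit function such as SHA-1.
bound = collision_probability_bound(10**9, 160)
print(float(bound))
```

Even across a billion passwords the bound stays below one in 10^30, which is why accidental collisions are not a practical concern here.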
I would argue that storing the passwords in plain text represents a much greater security risk than hash collisions in the password matching.
I'm not a security expert, but I have a feeling that if plain text were more secure, hashing wouldn't exist in the first place.
In theory, yes. Passwords can be longer (more information) than a hash, so there is a possibility of hash collisions. However, most attacks are dictionary-based, and the probability of collisions is infinitely smaller than a successful direct match.
It depends on what you're defending against. If it's an attacker pulling down your database (or tricking your application into displaying the database), then plaintext passwords offer no protection at all. There are many attacks that rely on convincing the application to disgorge its private data: SQL injection, session hijacking, etc. It's often better not to keep the data at all, but to keep the hashed version so bad guys can't easily use it.
As your co-worker suggests, this can be trivially defeated by running the same hash algorithm against a dictionary and using rainbow tables to pull the info out. The usual solution is to use a secret salt plus additional user information to make the hashed results unique- something like:
String hashedPass = CryptUtils.MD5("alsdl;ksahglhkjfsdkjhkjhkfsdlsdf" + user.getCreateDate().toString() + user.getPassword());
As long as your salt is secret, or your attacker doesn't know the precise creation date of the user's record, a dictionary attack will fail- even in the event that they are able to pull down the password field.
Nothing is less secure than storing plain-text passwords. If you're using a decent hashing algorithm (at least SHA-256, but even SHA-1 is better than nothing) then yes, collisions are possible, but it doesn't matter because given a hash, it's impossible* to calculate what strings hash to it. If you hash the username WITH the password, then that possibility goes out the window as well.
* - technically not impossible, but "computationally infeasible"
If the username is "graeme" and the password is "stackoverflow", then create a string "graeme-stackoverflow-1234" where 1234 is a random number, then hash it and store "hashoutput1234" in the database. When it comes to validating a password, take the username, the supplied password and the number from the end of the stored value (the hash has a fixed length so you can always do this) and hash them together, and compare it with the hash part of the stored value.
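The scheme just described can be sketched in Python roughly as follows. This is only an illustration under assumptions: SHA-256 is swapped in as the hash (so the fixed-length part is 64 hex characters), the random number is drawn with the secrets module, and both function names are invented here.

```python
import hashlib
import secrets

HASH_HEX_LEN = 64  # length of a SHA-256 hex digest


def make_stored_value(username: str, password: str) -> str:
    """Build the value to store: hash of 'user-password-number' followed by the number."""
    number = str(secrets.randbelow(10**8))  # plays the role of "1234"
    digest = hashlib.sha256(f"{username}-{password}-{number}".encode("utf-8")).hexdigest()
    return digest + number


def check_password(username: str, password: str, stored: str) -> bool:
    """Split the stored value into hash and number, re-hash, and compare."""
    digest, number = stored[:HASH_HEX_LEN], stored[HASH_HEX_LEN:]
    candidate = hashlib.sha256(f"{username}-{password}-{number}".encode("utf-8")).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched.
    return secrets.compare_digest(candidate, digest)


stored = make_stored_value("graeme", "stackoverflow")
print(check_password("graeme", "stackoverflow", stored))
```

Because the hash part has a fixed length, the number can always be recovered from the end of the stored string without a separate column.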
more passwords could match the hash and a dictionary/hash attack would be faster.
Yes and no. Use a modern hashing algorithm, like an SHA variant, and that argument gets very, very weak. Do you really need to be worried if that brute force attack is going to take only 352 years instead of 467 years? (Anecdotal joke there.) The value to be gained (not having the password stored in plain text on the system) far outstrips your colleague's concern.
Hope you forgive me for plugging a solution I wrote on this, using client side JavaScript to hash the password before it's transmitted: http://blog.asgeirnilsen.com/2005/11/password-authentication-without.html
Related
So I've been looking at hashing passwords in vb.net and came across this thread (https://security.stackexchange.com/questions/17421/how-to-store-salt/17435#17435). It explained that, if the salt is known to the intruder, the salt only increases the time needed for a brute force attack, because they need to make a new rainbow table. Could this be made more secure by making the salt a hash of the plaintext?
As an example, to hash "plaintext" you would add a salt to the string, this salt being a hash of "plaintext", giving "32nfdw213123" for example, and then hash the total "plaintext32nfdw213123". In this case the salt is different for every value used, but when used for verification, doing the same process to a correct check string should produce the same salt and therefore the same hash value, and verify. Is this actually more secure?
Thanks
TLDR: not really.
Longer answer:
Let's say some baddie has your database with all the passwords in it. He can now start brute-forcing passwords. Your goal is trying to make the brute-forcing as hard as possible.
So, theoretically, given that he has your database, he probably also knows how you're hashing your passwords (remember that security through obscurity is a bad form of defense). Your salts are no longer random, so our baddie can create a new rainbow table.
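A small Python sketch shows why a salt derived from the plaintext is no longer doing a salt's job. SHA-256 and the 12-character truncation are arbitrary choices made here for illustration, and both function names are hypothetical.

```python
import hashlib
import secrets
from typing import Tuple


def hash_with_derived_salt(password: str) -> str:
    """The proposed scheme: the 'salt' is itself a hash of the password."""
    pw = password.encode("utf-8")
    derived_salt = hashlib.sha256(pw).hexdigest()[:12].encode("ascii")
    return hashlib.sha256(pw + derived_salt).hexdigest()


def hash_with_random_salt(password: str) -> Tuple[bytes, str]:
    """Conventional scheme: a fresh random salt, stored alongside the hash."""
    salt = secrets.token_bytes(16)
    return salt, hashlib.sha256(salt + password.encode("utf-8")).hexdigest()


# Two different users who happen to choose the same password.
alice = hash_with_derived_salt("plaintext")
bob = hash_with_derived_salt("plaintext")
print(alice == bob)  # the scheme is just a deterministic function of the password

_, alice_random = hash_with_random_salt("plaintext")
_, bob_random = hash_with_random_salt("plaintext")
print(alice_random == bob_random)
```

Since the derived-salt scheme maps each password to exactly one output, an attacker who knows the scheme can precompute one table and use it against every user.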
I've been doing some research about storing salts, and apparently the most common way to do it is to store it in a separate column in the same table as the username and password. I've seen that all over this and other websites, but to me this is like putting the key right next to the safe. If anyone ever gets access to the authentication table the hackers would win. If they do but the salt isn't found there they wouldn't have as much to go on.
I operate a three tiered system and would prefer some method of storing the salt somewhere on Java operated middle-tier that is behind a firewall and not accessible directly from the internet. Perhaps some XML or something that none of the other parts of the application will touch?
Let's go over what the salt really is, then.
It's a way of making sure
Two users with the same password aren't obvious.
"password" with PBKDF2-HMAC-SHA-512, 10000 iterations, keeping only the first 32 bytes of output, stored in Base64, and a salt of empty string is ALWAYS
9MpQfAfQvTG8d5oIdWgmpv2d2X1DrCXkspoJM6vqA/M=
Thus, if you have 5% of your users with that as their password hash, you can be pretty sure it's either "password" or "12345" or one of the other worst passwords.
Attackers cannot precompute attack lists in advance of leaks and then nearly-instantly match their "rainbow table" of results to the leak to eliminate that entire precomputed list, then get on with cracking the hard passwords without wasting precious time on the easy ones.
So, if we have an attacker with a password list of "password" and "12345", and you use no salt, they have already figured out that those results with the setup above are:
9MpQfAfQvTG8d5oIdWgmpv2d2X1DrCXkspoJM6vqA/M=
and
I2bEyBbaxTBvHdJ7rIu7kdR2liwGMCg62lyuoj41NB8=
Thus, however the attacker gets your password list, they nearly instantly avoid spending any computation time on the MANY users who chose terrible passwords, which means they have more time, and combinations, left to try on the higher difficulty targets.
If you use a 16 byte cryptographically random salt for each userid, then instead of needing to perform the hash algorithm once for each password*rule on their list to precompute results, they would have to perform it 2^128 times for each (once per possible salt), which is computationally infeasible at this time, let alone the storage requirements.
There's no point in keeping the salt secret - it serves those two purposes without any need for more secrecy than the password hash itself.
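Both purposes can be checked with a short Python sketch using the standard library's PBKDF2. The parameters mirror the setup described above (HMAC-SHA-512, 10000 iterations, first 32 bytes, Base64), but the helper name is an assumption and I have not compared its output against the Base64 strings quoted earlier.

```python
import base64
import hashlib
import secrets

ITERATIONS = 10_000
KEY_LEN = 32  # keep only the first 32 bytes of output


def pbkdf2_b64(password: str, salt: bytes) -> str:
    """PBKDF2-HMAC-SHA-512 of the password, Base64-encoded."""
    derived = hashlib.pbkdf2_hmac(
        "sha512", password.encode("utf-8"), salt, ITERATIONS, dklen=KEY_LEN
    )
    return base64.b64encode(derived).decode("ascii")


# With an empty salt, every user with this password gets the same stored value.
no_salt_a = pbkdf2_b64("password", b"")
no_salt_b = pbkdf2_b64("password", b"")
print(no_salt_a == no_salt_b)

# With a 16-byte random salt per user, identical passwords look unrelated.
salted_a = pbkdf2_b64("password", secrets.token_bytes(16))
salted_b = pbkdf2_b64("password", secrets.token_bytes(16))
print(salted_a == salted_b)
```

The salts themselves would be stored next to the hashes; nothing in this comparison depends on keeping them secret.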
I am developing my first web app that requires a login, and it has come to the point when I must decide how to store the passwords. I have been doing a lot of reading on the proper way to hash the password and add a salt. It occurred to me that most of the ways that are recommended would rely on some variation of information that is stored in the database with the password hash, be it some variation of using all or part of the username as a salt or some other random value.
Instead I was thinking of using the user's own password as a salt on the password, using an algorithm to jumble the password and adding it to itself in some way as the salt. Of course this too would be compromised if an attacker got access to both the stored hashes and the source code of the algorithm, but any salt would be compromised in such a situation. My application really probably does not need this level of security, but it was just something that I started to think about when reading.
I just wanted to get some feedback from some more experienced developers. Any feedback is appreciated.
If you derive the salt from the password itself, you will lose the whole benefit of salting. You can then build a single rainbow table to get all passwords, and equal passwords will result in equal hash values.
The main reason to use a salt is that an attacker cannot build one single rainbow table and get all the passwords stored in your database. That's why you should add a random unique salt for each password; then an attacker would have to build a rainbow table for each password separately. Building a rainbow table for a single password makes no sense, because brute forcing is faster (why not just stop when the password is found?).
Don't be afraid to do it right; programming environments often have support for creating safe hashes and will handle salting for you (e.g. password_hash() for PHP). The salt is often combined with the hash for storage, which makes it easy to store in a single database field.
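To illustrate the "salt combined with the hash in a single field" idea, here is a hedged Python sketch. The "$"-separated layout and the function names are made up for this example and are not the format PHP's password_hash() produces; PBKDF2 from the standard library stands in for whatever algorithm your environment provides.

```python
import base64
import hashlib
import hmac
import secrets


def hash_password(password: str, iterations: int = 100_000) -> str:
    """Return one string holding algorithm name, iteration count, salt and hash."""
    salt = secrets.token_bytes(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return "$".join([
        "pbkdf2-sha256",
        str(iterations),
        base64.b64encode(salt).decode("ascii"),
        base64.b64encode(digest).decode("ascii"),
    ])


def verify_password(password: str, stored: str) -> bool:
    """Parse the stored string, recompute with the embedded salt, and compare."""
    _algo, iterations, salt_b64, digest_b64 = stored.split("$")
    salt = base64.b64decode(salt_b64)
    expected = base64.b64decode(digest_b64)
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, int(iterations))
    return hmac.compare_digest(candidate, expected)


record = hash_password("hunter2")
print(record)
print(verify_password("hunter2", record))
```

Keeping the parameters inside the stored string also means the iteration count can be raised later without breaking old records.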
I wrote a small tutorial about securely storing passwords, maybe you want to have a look at it.
Simply duplicating the password may still be vulnerable to dictionary attacks, e.g. the password "hello" becomes "hellohello", and thus might be part of a dictionary.
Using a scrambled password as the salt enables the attacker to use a dictionary and then generate a rainbow table for all entries by adding the scrambled password to every entry.
Why change a proven algorithm which can be understood by any developer? Just do it the default way and your code will be maintainable by anyone else.
"My application really probably does not need this level of security" - until the point in time when it gets hacked. Use a salt, it takes almost no additional effort. Do it now.
"eliminate the need of storing the password salt at all": the salt can be very small (6 bytes). It will hardly affect performance.
I just wanted to get some feedback from some more experienced developers. Any feedback is appreciated.
John Steven of OWASP performed an analysis, including threat models, of password storage systems. It explains the components and their purpose, like the hash, the iteration count, the salt, the HMACs, the HSMs, etc. See the Secure Password Storage Cheat Sheet and Secure Password Storage paper.
Cracking is not the only threat here. More than likely, the guy trying to break into your organization is going to be using one of the top passwords from the millions of passwords gathered from the Adobe breach, the LinkedIn breach, the Last.fm breach, the eHarmony breach, the <favorite here> breach.... For example:
25 most-used passwords revealed: Is yours one of them?
The 30 Most Popular Passwords Stolen From LinkedIn
Top 100 Adobe Passwords with Count
Why bother brute forcing when you have a list of thousands of top rated passwords to use?
So your FIRST best defense is to use a word list that filters a user's bad password choices. That is, don't allow users to pick weak or known passwords in the first place.
If someone gets away with your password database, then he or she is going to use those same password lists to try and guess your users' passwords. He or she is probably not even going to bother brute forcing because he or she will have recovered so many passwords using a password list.
As I understand it, these word lists are quite small when implemented as a Bloom Filter. They are only KB in size even though there are millions of passwords. See Peter Gutmann's Engineering Security for an in depth discussion.
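For readers who have not met a Bloom filter before, here is a toy Python sketch of the idea. The sizes, the number of hash functions, and the four-entry word list are all illustrative assumptions; a real deployment would size the filter for millions of entries and a chosen false-positive rate.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter: may report false positives, never false negatives."""

    def __init__(self, size_bits: int = 8192, num_hashes: int = 4):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits // 8 + 1)

    def _positions(self, item: str):
        # Derive several bit positions by hashing the item with different prefixes.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: str) -> bool:
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item))


banned = BloomFilter()
for pw in ["password", "123456", "letmein", "qwerty"]:
    banned.add(pw)


def is_allowed(candidate: str) -> bool:
    """Reject any password the filter reports as (probably) on the banned list."""
    return candidate not in banned


print(is_allowed("letmein"))
print(is_allowed("correct horse battery staple"))
```

The filter stores only bits, not the passwords themselves, which is where the space saving comes from. The price is an occasional false positive, which here just means rejecting a password that was not actually on the list.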
I've read on SO (and from other websites found on Google after I tried to look into it a little bit more) that the correct secure way to store passwords in a database is to store the hashed + salted value of a password. On top of that, the salt should be different for each user so hackers can't do harm even if they have the encrypted values.
I'm not quite sure what salting means. From my understanding, you hash the password, then you use another value that you hash (the salt) and combine those two together so the algorithm to retrieve the original password is different for every user.
So basically, what I'd have to do is hash a password, then use a different hash on a different value for each user (ie: the user name or email address) and then I can do a simple math operation on those two values to get the encoded password.
Is that correct or did I just not understand anything about password hashing + salting?
A simple explanation or example would prove to be helpful as the sites I've found don't quite explain clearly what salting a password is.
Edit: After reading comments and answers left so far, I understand that I didn't really understand what a salt was because I'm missing some key concepts and I was making false assumptions.
What I'd like to know is: how do you consistently get the same salt if it is randomly generated? If the salt is stored in the database like some people have mentioned, then I can see how you keep getting the same salt, but that brings another question: how does it make the passwords more secure if anyone with access to the database has access to the salt? Couldn't they just append the (known) salt to all the passwords they try, and the result would be the same (bar some minor time loss) as not having one at all?
Let me try and clarify a little bit with a somewhat oversimplified example. (md5() is used for example purposes only - you should not use it in practice.)
A salt is just a random string of characters that is appended to the password before it is hashed. Let's say you have the password letmein, and you hash it like this...
echo md5('letmein');
...you'll get the output 0d107d09f5bbe40cade3de5c71e9e9b7. If you google this, you'll get a number of pages telling you that this is the MD5 hash for letmein. A salt is intended to help prevent this sort of thing.
Let's suppose you have a function, randomStringGenerator() that generates a random $x-character string. To use it to salt a password, you'd do something like this:
$password = 'letmein';
$salt = randomStringGenerator(64); //let's pretend this is 747B517C80567D86906CD28443B992209B8EC601A74A2D18E5E80070703C5F49
$hash = md5($password . $salt);
You'd be then performing md5(letmein747B517C80567D86906CD28443B992209B8EC601A74A2D18E5E80070703C5F49), which returns af7cbbc1eacf780e70344af1a4b16698, which can't be "looked up" as easily as letmein without a salt.
You'd then store BOTH the hash and the salt, and when the user types in their password to log in, you'd repeat the process above and see if the password the user entered with the stored salt appended hashes to the same thing as the stored hash.
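The store-then-verify flow looks roughly like this in Python. This is a sketch under assumptions: a dictionary stands in for the database table, PBKDF2 from the standard library is swapped in for md5(), and the function names are invented for illustration.

```python
import hashlib
import hmac
import secrets

# Stand-in for a database table: username -> (salt, hash).
users = {}


def _derive(password: str, salt: bytes) -> bytes:
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)


def register(username: str, password: str) -> None:
    """Generate a random salt once and store it next to the resulting hash."""
    salt = secrets.token_bytes(16)
    users[username] = (salt, _derive(password, salt))


def login(username: str, password: str) -> bool:
    """Look up the stored salt, repeat the hashing, and compare the results."""
    record = users.get(username)
    if record is None:
        return False
    salt, stored_hash = record
    return hmac.compare_digest(_derive(password, salt), stored_hash)


register("alice", "letmein")
print(login("alice", "letmein"))
print(login("alice", "letmeout"))
```

The salt is random only at registration time; after that it is read back from storage, which is how the same salt is obtained on every login.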
However! Since general hashing algorithms like MD5 and SHA2 are so fast, you shouldn't use them for storing passwords. Check out phpass for a PHP implementation of bcrypt.
Hope that helps!
One uses a salt to avoid the attacker creating a rainbow table, e.g. a table containing all (usual) passwords and the corresponding hashes, sorted (or somehow easily accessible) by hash. If the attacker has such a table or can create it, and then gets your password database with unsalted hashes, he can easily look up the passwords, even for all of your users at once.
If the hashes are salted (and the attacker gets the salt with the hashes), he will still be able to do the same attack (with only slightly more work to input the salt) - but now this work of building a rainbow table is useless for the next hash with another salt, which means this will need to be done for each user again. This alone is the goal of the salt. A dictionary attack on your single account still needs the same time as before, just the rainbow table is useless. (To do something against the dictionary attack, see below.)
How exactly the salt is used depends on the algorithm in use. Some hash algorithms (for example bcrypt, which is specially made for password hashing) have a special salt input parameter (or generate the salt themselves and include it in the output):
H = bcrypt(password, hardness) or H = bcrypt(salt, password, hardness)
(The first variant generates the salt itself, while the second takes it from the outside. Both include the hash and the hardness parameter in the output.)
Others need to be used in some special mode to use the salt.
A simple variant which works for most hash algorithms would be using HMAC, with the salt as "message" input, the password as key:
HMAC(password, salt) = Hash(password ⊕ opad || Hash(ipad ⊕ password || salt) )
where opad and ipad are some constant padding values.
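For the curious, the construction can be written out by hand and checked against Python's hmac module. SHA-256 is an arbitrary choice of underlying hash here, and in practice you would call the library function rather than this hand-rolled version.

```python
import hashlib
import hmac

BLOCK_SIZE = 64  # SHA-256 processes input in 64-byte blocks


def manual_hmac_sha256(key: bytes, message: bytes) -> bytes:
    """HMAC as defined in RFC 2104: Hash((K xor opad) || Hash((K xor ipad) || message))."""
    if len(key) > BLOCK_SIZE:
        key = hashlib.sha256(key).digest()  # overly long keys are hashed first
    key = key.ljust(BLOCK_SIZE, b"\x00")    # then padded with zeros to the block size
    ipad = bytes(b ^ 0x36 for b in key)
    opad = bytes(b ^ 0x5C for b in key)
    inner = hashlib.sha256(ipad + message).digest()
    return hashlib.sha256(opad + inner).digest()


# Password as the key, salt as the message, as in the formula above.
by_hand = manual_hmac_sha256(b"password", b"salt")
by_library = hmac.new(b"password", b"salt", hashlib.sha256).digest()
print(by_hand == by_library)
```

The constants 0x36 and 0x5C are the ipad and opad byte values from the HMAC specification, repeated to fill one block.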
Then you store the salt together with the hash. (For a slightly higher barrier, you could store the hash in another location than the salt. But you will still need both for login.) For login, you then will give the password and the stored salt to your hash function, and compare the result with the stored hash. (Most bcrypt libraries have a "password verification" function built in, which does this.)
For password storage it is important to use a slow hash algorithm, not a fast one, to avoid (or really: slow down) brute force or dictionary attacks on the passwords, as most people will have quite short passwords. bcrypt is an algorithm which was made just for this goal (its slowness is adaptable by a parameter).
If you use a fast hash function, be sure to repeat it often enough to be slow again. (But better, really: use bcrypt.)
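A naive version of "repeat it often enough" might look like the Python sketch below. It is only meant to show the idea; the exact chaining used here is an assumption of this example, and a standardized construction such as PBKDF2 (hashlib.pbkdf2_hmac in Python) is the safer way to get the same effect.

```python
import hashlib


def stretched_hash(password: str, salt: bytes, rounds: int = 100_000) -> bytes:
    """Apply SHA-256 repeatedly so that each password guess costs `rounds` hash calls."""
    digest = hashlib.sha256(salt + password.encode("utf-8")).digest()
    for _ in range(rounds - 1):
        # Feed the previous digest back in, mixing the salt into every round.
        digest = hashlib.sha256(digest + salt).digest()
    return digest


print(stretched_hash("letmein", b"per-user-salt").hex())
```

The round count has to be stored (or fixed in configuration) so that verification repeats exactly the same number of iterations.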
Although #Chris and #Pualo have very good answers, I wanted to add one more thing about salting passwords that hasn't been expressed.
Salting a password is not a real protection mechanism. It doesn't matter if you are using bcrypt or any other mechanism. It is simply a delaying tactic, nothing more.
By using a different salt value per password you are forcing the hacker to create a rainbow table per password in order to crack them. This increases the amount of time it takes, but by no means does it make it impossible. Bear in mind that with cloud based computing you can spin up a large number of machines to create the rainbow tables and you can see that the delay is pretty small.
Further, most of the zombie machines out there are available for rent...
That said, the reason why you go through the trouble is to buy time. Time to notice that you've been breached, repair it and inform your users of the breach. That's it.
If an attacker obtained enough access to your database to pull the list of passwords, then it is pretty much guaranteed that they've obtained everything else. So, by this point you've already lost everything. The only question is how long does it take you to plug the hole, reset everyone's password and tell them that they should reset the passwords on any other account they may have where they used the same one. If you're Sony, then this time is apparently measured in months, if not years... ;) Try to be a little faster than that.
So, although it is the responsible thing to do it is only one part of your defensive tool belt. If you've been breached then you can bet those usernames and passwords will show up on a site somewhere at some point in the near future. Hopefully before then you've already cleaned up your house.
Using a salt prevents the use of precomputed rainbow tables. As an example, if a user uses "Password" as a password, then MD5("Password"), SHA1("Password"), or WhatEver("Password") may be well-known results stored in some rainbow tables.
If you use a different salt value per person - called a nonce - you'll get MD5(HMAC("Password","RandomSaltValue")), SHA1("Password","AnotherRandomSaltValue"), ... which means two different hashed password values for the same initial password.
Now, the question about storing these salt values... I think they can be stored in the database. The idea of salts is to prevent rainbow-style attacks, not to address the compromised-database issue.
Although bcrypt slows the process significantly down, it still would probably be feasible to attack your scheme if lots of computations can be made in parallel. I know it's unlikely and this would have to be a quite resourceful attacker indeed, but let's imagine the site you protect would contain photos and documents from Area 51 :) In that case, given enough parallelization, you could still be in trouble even if using bcrypt.
That's why I like the approach of scrypt - not only does it involve computational cost, but also it imposes memory constraints, specifically to introduce cost in terms of space and to make these kinds of parallel attacks infeasible. I can only recommend reading the paper that is linked on that site, it illustrates the concept really well.
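Python's standard library exposes scrypt directly, so the memory cost is easy to experiment with. The parameter values below (n=2**14, r=8, p=1) are a commonly seen starting point rather than a recommendation from this answer, and the wrapper name is made up. hashlib.scrypt is only available when Python is built against a sufficiently recent OpenSSL.

```python
import hashlib
import secrets


def scrypt_hash(password: str, salt: bytes) -> bytes:
    """Derive a 32-byte key with scrypt.

    n is the CPU/memory cost, r the block size, p the parallelization factor.
    Memory use is roughly 128 * n * r bytes, i.e. about 16 MiB with these values.
    """
    return hashlib.scrypt(
        password.encode("utf-8"), salt=salt, n=2**14, r=8, p=1, dklen=32
    )


salt = secrets.token_bytes(16)
print(scrypt_hash("letmein", salt).hex())
```

Raising n increases both the time and the memory each guess needs, which is what makes massively parallel cracking hardware more expensive.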
Although, it seems that bcrypt and, even more so, scrypt get less attention in terms of cryptanalysis than PBKDF2, outlined in RSA's PKCS#5. See this discussion for details.
I'd say first of all that security is very hard to do right, and that you really should rely on existing libraries to do as much as possible for you. For basic operations like password storage and validation that's definitely true.
EDIT: Removed erroneous info. I'll stick with the only good advice I had, which was not to roll your own.
What about Secure hash and salt for PHP passwords? It even has examples in PHP.
Wondering whether it matters if a salt is unique for a single given user each time the password is changed, or whether it's not a big deal to reuse the same salt each time.
I currently generate a new random string as the salt each time a given user updates the password. This way, each time the user has a new password there is also a salt change. It's easy to do, so why not.
Well... here's why. I need to store the previous X passwords to ensure a password is not reused. In the old days (the last time I wrote code for this), I could just store previous MD5 hashes, and compare new ones to that list. Well, now that I am using salted hashes where the salt is unique each time, those comparisons are no longer possible as the previous salts are no longer known.
To make that system work, I have two choices: store a history of the salts in addition to the final hashes, or reuse the same salt for any one given user with each password update. Either of these would allow me to build values that could be compared to a history.
The latter is less work, but does it lose any strength? From a practical standpoint, I don't see that it does. Thought I'd get a second opinion here. Thanks.
To keep the question "answerable" -- would reusing the same salt for any one user have an acceptably minimal reduction of protection in order to maintain a searchable password history (to prevent pswd recycling)?
Reusing the same salt means that if a user is explicitly targeted by a hacker, they could produce a "password to hash" dictionary using "the user's salt" - so that even if the user changes their password, the hacker will still immediately know the new password without any extra work.
I'd use a different salt each time.
As for storing the MD5 hash plus salt - presumably you're already storing the salt + hash, in order to validate the user's current password. Why can't you just keep that exact same information for historical checks? That way you can use one piece of code to do the password checking, instead of separating out the current and historical paths. They're doing the same thing, so it makes sense for them to use the same code.
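A sketch of that idea in Python: the history is just a list of (salt, hash) pairs, and the reuse check runs the same verification code over each entry. PBKDF2 is swapped in for MD5, and the names and the default of keeping five entries are assumptions for illustration.

```python
import hashlib
import hmac
import secrets
from typing import List, Tuple

ITERATIONS = 50_000
History = List[Tuple[bytes, bytes]]  # (salt, hash) pairs, newest last


def _derive(password: str, salt: bytes) -> bytes:
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)


def was_used_before(password: str, history: History) -> bool:
    """Re-hash the candidate with each historical salt and compare."""
    return any(hmac.compare_digest(_derive(password, salt), old) for salt, old in history)


def change_password(password: str, history: History, keep: int = 5) -> bool:
    """Append a freshly salted hash unless the password appears in the history."""
    if was_used_before(password, history):
        return False
    salt = secrets.token_bytes(16)
    history.append((salt, _derive(password, salt)))
    del history[:-keep]  # retain only the most recent `keep` entries
    return True


history: History = []
print(change_password("first", history))
print(change_password("second", history))
print(change_password("first", history))
```

The cost is one hash computation per retained entry on each password change, which is negligible for a history of a handful of passwords.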
EDIT: To explain what I mean, consider a 4 character salt, prepended to the password... and for the sake of argument, imagine that someone only uses A-Z, a-z and 0-9 in their password (and the salt).
If you don't know the salt ahead of time (when preparing a dictionary attack) then in order to prepare a dictionary for all 8 character "human" passwords, you need to hash 62^12 concatenated passwords. If, however, you always know what the first 4 characters of the concatenated password will be (because you know the salt ahead of time) then you can get away with only hashing 62^8 values - all those beginning with the salt. It renders the salt useless against that particular attack.
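The arithmetic behind those two figures, worked through in Python under the same assumptions as above (62 possible characters, a 4-character salt, an 8-character password):

```python
ALPHABET_SIZE = 62   # A-Z, a-z, 0-9
SALT_LEN = 4
PASSWORD_LEN = 8

# Salt unknown in advance: every salt+password combination must be hashed.
unknown_salt = ALPHABET_SIZE ** (SALT_LEN + PASSWORD_LEN)

# Salt known in advance: only the password part varies.
known_salt = ALPHABET_SIZE ** PASSWORD_LEN

print(unknown_salt)
print(known_salt)
print(unknown_salt // known_salt)  # factor saved by knowing the salt: 62**4
```

Knowing the salt ahead of time cuts the work by a factor of 62^4, a little under 15 million.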
This only works with a targeted user of course - and only if the attacker can get at the hash list both before and after the password change. It basically makes changing the password less effective as a security measure.
Another reason for using salt in password hashes is to hide the fact that two users use the same password (not unusual). With different hashes an attacker won't see that.
Firstly, stop using MD5 (if you are using it), and use SHA-2. MD5, SHA-0, and SHA-1 are all dead hashes.
-- Edit:
I now agree with Jon Skeet, and suggest you consider generating a new salt with each password change. It covers a small case where the attacker may get the salt+hash, then not be able to gain access again, but will still allow him (with some guessing of how you combine them), to calculate what the hashes could be for all future passwords. It's very small, and is not so important, because the password sizes will need to be significantly small (say, 8 chars) for even calculating them all offline to be practical. Yet it exists.
Secondly, to consider whether or not it matters, we need to think about the purpose of salts. It is to prevent offline attacks against someone who has a complete listing of only the passwords.
On this basis, if the salt is equally "difficult" to obtain before and after password changes, I see no use in a new salt (it's just as at-risk as it was before). It adds additional complexity, and implementing complexity is where most security problems occur.
I might be being incredibly dim here, but where would you store the salt so that it would be inaccessible to someone with enough access to get the hashed password?