What is the "simple" hashing algorithm that CouchDB uses to encrypt passwords?

(TL;DR: I have two questions at the bottom.)
I have been looking through CouchDB's documentation to learn about its hashing algorithm, but I'm unable to find important details.
The most information I've gathered has been from this page: 1.5.2. Authentication Database
Here's my problem:
I have a bunch of users in a _users database in my CouchDB instance on Cloudant.
I need to be able to migrate users from CouchDB to Firebase.
Firebase offers a super-handy-dandy auth migration tool for this. However, in order to utilize its auth migration tooling, I need to know exactly which hashing algorithm is being used for the "simple" password_scheme.
For every user in my _users database, I have the "salt" and "password_sha" available.
Given the name "password_sha", I assume that the "simple" "password_scheme" uses either SHA1, SHA256, SHA512, PBKDF_SHA1, or PBKDF2_SHA256.
None of the users' docs in my database have a "derived_key". Almost all of them do not have a defined "password_scheme". If any of them do have a defined "password_scheme", it is always "simple" (and never "pbkdf2").
Once I know exactly which hashing algorithm CouchDB uses, I then need to know how many rounds or iterations were used to hash the password.
The Firebase docs say:
"you must provide the number of rounds (between 1 and 8192 for SHA1, SHA256 and SHA512, and between 0 and 120000 for PBKDF_SHA1 and PBKDF2_SHA256) used to hash the password."
However, I cannot find any documentation/information on this.
So my questions are:
What is the hashing algorithm CouchDB uses for the "simple" "password_scheme"? (Is it SHA1, SHA256, SHA512, PBKDF_SHA1, PBKDF2_SHA256, or something else?)
How many rounds or iterations are used to hash the passwords?

I posted this same question to a few chats linked in the CouchDB homepage, and Robert Newson, owner of CouchDB, told me the following in Slack:
"simple" is a one round of SHA-1 (with salt)
https://github.com/apache/couchdb/blob/main/src/couch/src/couch_passwords.erl#L26
So to directly answer these two questions:
The "simple" CouchDB hashing algorithm uses SHA1.
And it is only one round of hashing.
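For illustration, here is a minimal Python sketch of one round of salted SHA-1. The concatenation order (password first, then salt) and the hex encoding are assumptions based on a reading of the linked couch_passwords.erl, so check the result against one user whose password you know before relying on it. The helper names `simple_hash` and `matches_simple` are hypothetical.

```python
import hashlib
import hmac


def simple_hash(password: str, salt: str) -> str:
    """One round of SHA-1 over password followed by salt, hex-encoded.

    Assumption: the salt is appended to the password. Verify this
    against a known user document before trusting it.
    """
    return hashlib.sha1((password + salt).encode("utf-8")).hexdigest()


def matches_simple(password: str, salt: str, password_sha: str) -> bool:
    """Compare a candidate password with a stored password_sha value."""
    return hmac.compare_digest(simple_hash(password, salt), password_sha.lower())
```

If `matches_simple` returns True for a known password, the salt, and that user's stored "password_sha", the assumed order is right for your data.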

Just for the sake of completeness: this simple scheme is not what more recent versions of CouchDB (certainly not 2.0 and later) use. The current default is pbkdf2 with the following values:
iterations = 10
keylen = 20
size = 16
encoding = 'hex'
digest = 'SHA1'
See couch-pwd if you need to generate or validate CouchDB-style passwords.
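A rough Python equivalent of those parameters, using only the standard library. This is a sketch, not couch-pwd itself: it assumes the salt is used as the raw bytes of its hex string and that the derived key is stored hex-encoded. `couch_pbkdf2` and `new_salt` are hypothetical names.

```python
import hashlib
import secrets


def new_salt(size: int = 16) -> str:
    """Generate `size` random bytes, hex-encoded (32 characters for 16 bytes)."""
    return secrets.token_hex(size)


def couch_pbkdf2(password: str, salt: str, iterations: int = 10, keylen: int = 20) -> str:
    """PBKDF2-HMAC-SHA1 with the values listed above, hex-encoded.

    Assumption: the salt string is fed in as UTF-8 bytes, not hex-decoded.
    """
    derived = hashlib.pbkdf2_hmac(
        "sha1", password.encode("utf-8"), salt.encode("utf-8"), iterations, dklen=keylen
    )
    return derived.hex()
```

With keylen = 20 the hex-encoded result is 40 characters long, which is what you would compare against a stored "derived_key".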

Related

How do websites store passwords long term?

So how do websites keep passwords long term?
I mean really important websites, say a government or a big ecommerce or social networking website.
Sure, they store a hash (or salted hash) of the password in the webserver-connected datastore that is used for authentication, but is that it?
NOTE: I am not asking about hashing or salting. I'm asking where they store the metadata (e.g., the hash or salted hash) so that it's always available.
In fact, how do websites like Facebook store passwords? I'm guessing they would have multiple copies of the hash spread out over the world? And backed up to tape once in a while?
You only need the hash of the user's password for most applications. Usually, the actual password isn't stored, for security reasons. If the datastore is compromised, you wouldn't want the hacker to be able to gain the actual passwords for the users.
That's actually why the hashes are usually salted in the first place. Salting makes it much harder to use a rainbow table (a precomputed table mapping candidate passwords to their hashes for a given hash function) to recover the original password, which the user may be using on other sites.
This was answered in more depth here: Best way to store password in database
The question is too broad to answer. It isn't just relevant for government web pages; storing clear-text passwords would be a real security issue on any site. Depending on security needs, password hashes are used in most cases. If users need a certificate (e.g. stored on a card, or obtained through another process), the server might store the user's public key instead of a hash.
Your question also touches on completely different topics. A web backend database certainly needs backups (not only for passwords), and there are several load-balancing techniques, some of which take geolocation into account.

Now that I know how to salt & hash passwords, a few more questions

So, let's assume I have read every article/post about appropriately salting and hashing passwords in order to secure user credentials.
This means I am not wondering what hashing algorithm to use (SHA1 vs. SHA2 vs. PBKDF2), how to generate the salt, how to store the salt, how to append the salt, or whether I should be writing the code myself vs. leveraging well-established libraries like bcrypt. Please avoid rambling about these issues here, as I have read 50+ other pages of that already.
Just assume the following is my approach (I understand this is not flawless, or likely sufficient for applications like financial services; I am really just wondering whether this is an acceptable minimum bar to claim that I "do the right thing").
User comes to my amazing website (www.myamazingwebsite.com) and logs in with email and pass.
I pull her salt and hash from my database. Assume the salt is lengthy enough, unique per-user, and created using a CSPRNG upon user registration.
I prepend the salt to her input password, hash it using SHA-512, run 1,000 iterations, then compare it to the hashed value pulled from the db:
// sha512() stands for any function that returns the SHA-512 digest of a string
var hash = sha512(salt + password);
for (var i = 0; i < 1000; i++) {
    hash = sha512(salt + password + hash);
}
If they match, the user is authenticated. Otherwise, they are not.
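The loop and the comparison step can be sketched in runnable form as follows. This is a Python rendering under stated assumptions: it works on raw bytes rather than strings, and it uses a constant-time comparison, which the description above does not mention. `iterated_sha512` and `is_authenticated` are hypothetical names.

```python
import hashlib
import hmac


def iterated_sha512(salt: bytes, password: bytes, rounds: int = 1000) -> bytes:
    """Hash salt + password once, then re-hash `rounds` more times,
    feeding the previous digest back in on each round."""
    digest = hashlib.sha512(salt + password).digest()
    for _ in range(rounds):
        digest = hashlib.sha512(salt + password + digest).digest()
    return digest


def is_authenticated(salt: bytes, password: bytes, stored_hash: bytes) -> bool:
    """Recompute the hash for the submitted password and compare it
    with the stored value in constant time."""
    return hmac.compare_digest(iterated_sha512(salt, password), stored_hash)
```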
Now, my question is how secure is my above approach. The questions I would like help answering:
Do I need to change the salt periodically? For example, perhaps I could re-compute and store a new hash using a newly created random salt after every successful login. This seems like it would be more secure but I am not sure what standard practice is here.
The request to the server will be done via https. Does that mean I can assume that I can process all of the hashing and validation logic server side? Would most folks consider this sufficient, or do I need to consider some hybrid both on client and server side?
Anything else I am overlooking or need to consider?
Thanks in advance, I appreciate the help.
1) Assuming you've done the right thing and do not store their password, you can't change the salt unless they are logging in. I suppose you could change their salt every time they do log in, but it doesn't really help (and might hurt).
Here's why: having a unique salt for everyone simply makes it harder for an attacker who has access to your database to guess the passwords. If you've done things correctly, he has to use a different salt for each person. He can't just start guessing passwords using a site-wide salt and see if the result matches anyone. As long as you have a unique salt for each user, you are doing the best you can.
In fact, changing the salt does nothing but give an attacker with access to your database over time MORE information. Now he knows what their password looks like salted two different ways. That could (theoretically) help crack it. For this reason, it would actually be ill advised to change the salt.
2) HTTPS is sufficient. If someone can compromise HTTPS, then any additional client-side hashing or the like will not help. The client's computer is compromised.
3) I think you have a fair understanding of best password practices. Don't overlook other security issues like sql-injection and cross-site scripting.
Do I need to change the salt periodically?
No. The salt is a per-user public parameter that serves two purposes. First, it ensures that an attacker cannot build an offline dictionary of passwords to hashes. Second, it ensures two users with the same password have different hashed password entries in the database.
See the Secure Password Storage Cheat Sheet and Secure Password Storage paper by John Steven of OWASP. It takes you through the entire threat model, and explains why things are done in particular ways.
The request to the server will be done via https. Does that mean I can assume that I can process all of the hashing and validation logic server side?
This is standard practice, but it's a bad idea because of all the problems with SSL/TLS and PKI in practice. Though this is common, here's how it fails: the SSL/TLS channel is set up with any server that presents a certificate. The web application then puts the {username, password} on the wire in plain text using a basic_auth scheme. Now the bad guy has the username and password.
There are lots of other problems with doing things this way. Peter Gutmann talks about this problem (and more) in his Engineering Security book. He's got a witty sense of humor, so the book is cleverly funny at times too, even though it's a technical book.
Would most folks consider this sufficient, or do I need to consider some hybrid both on client and server side?
If possible, use TLS-PSK (Preshared Key) or TLS-SRP (Secure Remote Password). Both overcome the problems of basic_auth schemes, both properly bind the channel, and both provide mutual authentication. There are 80 cipher suites available for TLS-PSK and TLS-SRP, so there's no shortage of algorithms.
Anything else I am overlooking or need to consider?
Cracking is not the only threat here. More than likely, the guy trying to break into your organization is going to be using one of the top passwords from the millions of passwords gathered from the Adobe breach, the LinkedIn breach, the Last.fm breach, the <favorite here> breach.... For example:
25 most-used passwords revealed: Is yours one of them?
The 30 Most Popular Passwords Stolen From LinkedIn
Top 100 Adobe Passwords with Count
Why bother brute forcing when you have a list of thousands of top rated passwords to use?
So your FIRST best defense is to use a word list that filters out users' bad password choices. That is, don't allow users to pick weak or known passwords in the first place.
If someone gets away with your password database, then he or she is going to use those same password lists to try to guess your users' passwords. He or she is probably not even going to bother brute forcing, because he or she will have recovered so many passwords using a password list.
As I understand it, these word lists are quite small when implemented as a Bloom Filter. They are only KB in size even though there are millions of passwords. See Peter Gutmann's Engineering Security for an in depth discussion.
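To make the Bloom filter idea concrete, here is a minimal sketch in Python. It is not taken from Gutmann's book; the sizes (8192 bits, 4 hash functions) and the use of SHA-256 to derive bit positions are arbitrary choices for illustration, and `PasswordBloomFilter` is a hypothetical name. A real deployment would size the filter for its word list and target false-positive rate.

```python
import hashlib


class PasswordBloomFilter:
    """Compact set membership test: no false negatives, rare false positives."""

    def __init__(self, num_bits: int = 8192, num_hashes: int = 4) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, password: str):
        # Derive num_hashes bit positions by hashing the password with a counter.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{password}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, password: str) -> None:
        for pos in self._positions(password):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, password: str) -> bool:
        # True means "probably on the list"; False means "definitely not".
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(password))
```

At registration time you would load the known-bad list once with `add` and reject any new password for which `might_contain` returns True. The filter stores only bits, never the passwords themselves, which is why it stays small.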

The proper way of implementing user login system

I want to make a user login system for the purpose of learning. I have several questions.
I did some research and found that the proper way of implementing a user login system is to store the user name/id and the encrypted/hashed version of the password in the database. When a user logs in, the password is encrypted client side (MD5, SHA-1 etc.) and sent to the server, where it is compared with the one in the database. If they match, the user logs in successfully.
This implementation prevents DBAs or programmers seeing the cleartext of the password in the database. It can also prevent hackers intercepting the real password in transit.
Here is where I'm confused:
1) What if hackers learn the hashed/encrypted version of the password (by hacking the database), or DBAs and programmers get it simply by reading the text in the database? They could then easily write a program that sends this hashed version of the password to the server, allowing them to log in successfully. If they can do that, encrypting the password doesn't seem very useful. I think I am misunderstanding something here.
2) Is this (the way I described above) the most popular way to implement user login functionality? Does it follow current best practices? Do I have to do everything manually, or do some databases have the built-in ability to do the same thing? Is there a most common way/method of doing this for a website or a web app? If so, please provide me with details.
3) My former company used CouchDB to store user login info, including passwords. They did not do too much with the encryption side of things. They said CouchDB will automatically encrypt the password and store it in the documents. I am not sure if this is a safe way. If so, then it is pretty convenient for programmers because it saves lots of work.
4) Is this way (point 3) secure enough for normal use? Do other database systems such as MySQL have this kind of ability? If so, does it mean that using MySQL's built-in method is secure enough?
I am not looking for a very super secure way of implementing user login functionality. I am rather looking for a way that is popular, easy-to-implement, proper, secure enough for most web applications. Please give me some advice. Details provided will be really appreciated.
When a user logs in, client side code will encrypt the password by MD5 or SHA-1 or something like that, and then send this encrypted password to the server side, where it is compared with the one in the database. If they match, the user logs in successfully.
No, no, the client needs to send the unhashed password over. If you hash the password on the client side then that hash is effectively the password. This would nullify the security of the cryptographic hashing. The hashing has to be done on the server side.
To secure the plaintext password in transit it needs to be sent over a secure channel, such as an encrypted TLS (SSL) connection.
Passwords should be salted with a piece of extra data that is different for each account. Salting inhibits rainbow table attacks by eliminating the direct correlation between plaintext and hash. Salts do not need to be secret, nor do they need to be extremely large. Even 4 random bytes of salt will increase the complexity of a rainbow table attack by a factor of 4 billion.
The industry gold standard right now is Bcrypt. In addition to salting, bcrypt adds further security by designing in a slowdown factor.
Besides incorporating a salt to protect against rainbow table attacks, bcrypt is an adaptive function: over time, the iteration count can be increased to make it slower, so it remains resistant to brute-force search attacks even with increasing computation power.... Cryptotheoretically, this is no stronger than the standard Blowfish key schedule, but the number of rekeying rounds is configurable; this process can therefore be made arbitrarily slow, which helps deter brute-force attacks upon the hash or salt.
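A small sketch of the salting point above, in Python. It shows only the effect of the salt: the same password hashed under two different salts gives unrelated results, and 4 random bytes give 256^4 possible salts. Plain SHA-256 is used purely to keep the example in the standard library; it is fast and is not a substitute for a slow function such as bcrypt. `new_salt` and `salted_hash` are hypothetical names.

```python
import hashlib
import os


def new_salt(num_bytes: int = 4) -> bytes:
    """Random per-account salt; it does not need to be secret."""
    return os.urandom(num_bytes)


def salted_hash(password: str, salt: bytes) -> str:
    """Hash the salt followed by the password (illustration only)."""
    return hashlib.sha256(salt + password.encode("utf-8")).hexdigest()


# Number of distinct 4-byte salts a precomputed table would have to cover.
SALT_SPACE_4_BYTES = 256 ** 4
```

`SALT_SPACE_4_BYTES` is 4,294,967,296, which is where the "factor of 4 billion" comes from.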
A few clarifications:
Don't use MD5. It's considered broken. Use SHA but I'd recommend something a little better than SHA1. - https://en.wikipedia.org/wiki/MD5
You don't mention anything about salting the password. This is essential to protect against Rainbow tables. - https://en.wikipedia.org/wiki/Rainbow_tables
The idea of salting/hashing passwords isn't really to protect your own application. It's because most users have a few passwords that they use for a multitude of sites. Hashing/salting prevents anyone who gains access to your database from learning what these passwords are and using them to log into their banking application or something similar. Once someone gains direct access to the database your application's security has already been fully compromised. - http://nakedsecurity.sophos.com/2013/04/23/users-same-password-most-websites/
Don't use the database's built in security to handle your logins. It's hacky and gives them way more application access than they should have. Use a table.
You don't mention anything about SSL. Even a well designed authentication system is useless if the passwords are sent across the wire in plain text. There are other approaches like Challenge/Response but unfortunately the password still has to be sent in plain text to the server when the user registers or changes their password. SSL is the best way to prevent this.

SHA1-hashing for web authentication in place of Blowfish

Being unable to locate a working php/javascript implementation of blowfish, I'm now considering using SHA1 hashing to implement web-based authentication, but the lack of knowledge in this particular field makes me unsure of whether the chosen method is secure enough.
The planned roadmap:
User's password is stored on the server as an MD5 hash.
Server issues a public key (MD5 hash of current time in milliseconds)
Client javascript function takes user password as input, and calculates its MD5 hash
Client then concatenates public key and password hash from above, and calculates SHA1 of the resulting string
Client sends SHA1 hash to the server, where similar calculations are performed with public key and user's password MD5 hash
Server compares the hashes, a match indicates successful authentication.
A mismatch indicates authentication failure, and server issues a new public key, effectively expiring the one already used.
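The roadmap above can be sketched as follows, in Python for brevity rather than PHP/JavaScript. One deliberate difference, labeled as an assumption: the "public key" here is a random value from `secrets` instead of an MD5 hash of the current time. All function names are hypothetical.

```python
import hashlib
import hmac
import secrets


def md5_hex(text: str) -> str:
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def issue_nonce() -> str:
    """Server side: one-time value sent to the client.
    Assumption: random, not derived from the clock."""
    return secrets.token_hex(16)


def client_response(nonce: str, password: str) -> str:
    """Client side: SHA1 of the nonce concatenated with MD5(password)."""
    return hashlib.sha1((nonce + md5_hex(password)).encode("utf-8")).hexdigest()


def server_verify(nonce: str, stored_md5: str, response: str) -> bool:
    """Server side: repeat the calculation from the stored MD5 hash and compare."""
    expected = hashlib.sha1((nonce + stored_md5).encode("utf-8")).hexdigest()
    return hmac.compare_digest(expected, response)
```

Note that `server_verify` needs only the nonce and the stored MD5 hash, never the password itself.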
Now, the problematic part is about concatenating two keys before SHA1, could that be prone to some kind of statistical or other attacks?
Is there any specific order in which keys should be concatenated to improve the overall quality (i.e. higher bits being more important to reliability of encryption)?
Thank you in advance.
The 'public key' isn't actually a public key; it's a nonce. It should really be random, unless you want it to be usable over a certain timeframe, in which case make sure you generate it using HMAC with a secret key so an adversary cannot predict it. If you're only using it to prevent replay attacks, and it's a fixed size, then concatenation might not be a problem.
That said, I'm a bit concerned that you might not have a well-thought-out security model. What attack is this trying to prevent, anyway? The user's password hash is unsalted, so a break of your password database will reveal plaintext passwords easily enough anyway, and although having a time-limited nonce will mitigate replay attacks from a passive sniffer, such a passive sniffer could just steal the user's session key anyway. Speaking of which, why not just use the session key as the nonce instead of a timestamp-based system?
But really, why not just use SSL? Cryptography is really hard to get right, and people much smarter than you or I have spent decades reviewing SSL's security to get it right.
Edit: If you're worried about MITM attacks, then nothing short of SSL will save you. Period. Mallory can just replace your super-secure login form with one that sends the password in plaintext to him. Game over. And even a passive attacker can see everything going over the wire - including your session cookie. Once Eve has the session cookie, she just injects it into her browser and is already logged in. Game over.
If you say you can't use SSL, you need to take a very hard look at exactly what you're trying to protect, and what kinds of attacks you will mitigate. You're going to probably need to implement a desktop application of some sort to do the cryptography - if MITMs are going around, then you cannot trust ANY of your HTML or Javascript - Mallory can replace them at will. Of course, your desktop app will need to implement key exchange, encryption and authentication on the data stream, plus authentication of the remote host - which is exactly what SSL does. And you'll probably use pretty much the same algorithms as SSL to do it, if you do it right.
If you decide MITMs aren't in scope, but you want to protect against passive attacks, you'll probably need to implement some serious cryptography in Javascript - we're talking about a Diffie-Hellman exchange to generate a session key that is never sent across the wire (HTML5 Web storage, etc), AES in Javascript to protect the key, etc. And at this point you've basically implemented half of SSL in Javascript, only chances are there are more bugs in it - not least of which is the problem that it's quite hard to get secure random numbers in Javascript.
Basically, you have the choice between:
Not implementing any real cryptographic security (apparently not a choice, since you're implementing all these complex authentication protocols)
Implementing something that looks an awful lot like SSL, only probably not as good
Using SSL.
In short - if security matters, use SSL. If you don't have SSL, get it installed. Every platform that I know of that can run JS can also handle SSL, so there's really no excuse.
bdonlan is absolutely correct. As pointed out, an adversary only needs to replace your Javascript form with evil code, which will be trivial over HTTP. Then it's game over.
I would also suggest looking at moving your passwords to SHA-2 with salts, generated using a suitable cryptographic random number generator (i.e. NOT seeded using the server's clock). Also, perform the hash multiple times. See http://www.jasypt.org/howtoencryptuserpasswords.html sections 2 and 3.
MD5 is broken. Do not use MD5.
Your secure scheme needs to be similar to the following:
Everything happens on SSL. The authentication form, the server-side script that verifies the form, the images, etc. Nothing fancy needs to be done here, because SSL does all the hard work for you. Just a simple HTML form that submits the username/password in "plaintext" is all that is really needed, since SSL will encrypt everything.
User creates new password: you generate a random salt (NOT based off the server time, but from good crypto random source). Hash the salt + the new password many times, and store the salt & resulting hash in your database.
Verify password: your script looks up salt for the user, and hashes the salt + entered password many times. Check for match in database.
The only thing that should be stored in your database is the salt and the hash/digest.
Assuming you have a database of MD5 hashes that you need to support, then the solution might be to add database columns for new SHA-2 hashes & salts. When the user logs in, you check against the MD5 hash as you have been doing. If it works, then follow the steps in "user creates new password" to convert it to SHA-2 & salt, and then delete the old MD5 hash. User won't know what happened.
Anything that really deviates from this is probably going to have some security flaws.
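The migration step can be sketched like this in Python. It is one possible reading of the steps above, not a definitive implementation: "hash many times" is done here with PBKDF2-HMAC-SHA256 from the standard library (my substitution), the round count is an assumed value, and the user record is a plain dict with hypothetical field names.

```python
import hashlib
import hmac
import os

ROUNDS = 100_000  # assumed work factor; tune for your hardware


def strong_hash(password: str, salt: bytes) -> bytes:
    """Salted, iterated SHA-2 hash (PBKDF2-HMAC-SHA256)."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ROUNDS)


def set_password(user: dict, password: str) -> None:
    """Store a fresh random salt and the new hash; drop any legacy MD5 hash."""
    user["salt"] = os.urandom(16)
    user["sha2_hash"] = strong_hash(password, user["salt"])
    user.pop("md5_hash", None)


def login(user: dict, password: str) -> bool:
    if user.get("sha2_hash") is not None:
        return hmac.compare_digest(strong_hash(password, user["salt"]), user["sha2_hash"])
    # Legacy account: check the old unsalted MD5 hash, then upgrade silently.
    legacy = hashlib.md5(password.encode("utf-8")).hexdigest()
    if user.get("md5_hash") and hmac.compare_digest(legacy, user["md5_hash"]):
        set_password(user, password)
        return True
    return False
```

After the first successful login the record holds only the salt and the new hash, so the old MD5 value disappears from the database one user at a time.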

Replacing plain text password for app

We are currently storing plain text passwords for a web app that we have.
I keep advocating moving to a password hash but another developer said that this would be less secure -- more passwords could match the hash and a dictionary/hash attack would be faster.
Is there any truth to this argument?
Absolutely none. But it doesn't matter. I've posted a similar response before:
It's unfortunate, but people, even programmers, are just too emotional to be easily swayed by argument. Once he's invested in his position (and, if you're posting here, he is) you're not likely to convince him with facts alone. What you need to do is switch the burden of proof. You need to get him out looking for data that he hopes will convince you, and in so doing learn the truth. Unfortunately, he has the benefit of the status quo, so you've got a tough road there.
From Wikipedia
Some computer systems store user passwords, against which to compare user log on attempts, as cleartext. If an attacker gains access to such an internal password store, all passwords and so all user accounts will be compromised. If some users employ the same password for accounts on different systems, those will be compromised as well.
More secure systems store each password in a cryptographically protected form, so access to the actual password will still be difficult for a snooper who gains internal access to the system, while validation of user access attempts remains possible.
A common approach stores only a "hashed" form of the plaintext password. When a user types in a password on such a system, the password handling software runs through a cryptographic hash algorithm, and if the hash value generated from the user's entry matches the hash stored in the password database, the user is permitted access. The hash value is created by applying a cryptographic hash function to a string consisting of the submitted password and, usually, another value known as a salt. The salt prevents attackers from building a list of hash values for common passwords. MD5 and SHA1 are frequently used cryptographic hash functions.
There is much more that you can read on the subject on that page. In my opinion, and in everything I've read and worked with, hashing is a better scenario unless you use a very small (< 256 bit) algorithm.
There is absolutely no excuse for keeping plain text passwords in the web app. Use a standard hashing algorithm (SHA-1, not MD5!) with a salt value, so that rainbow table attacks are impractical.
I don't understand how your other developer thinks 'more passwords could match the hash'.
There is an argument that a 'hash attack would be faster', but only if you're not salting the passwords as they're hashed. Normally, hashing functions allow you to provide a salt, which makes the use of a known hash table a waste of time.
Personally, I'd say 'no', based on the above, as well as the fact that if the stored values are somehow exposed, a salted, hashed value is of little use to someone trying to get in. Hashing also provides the benefit of making all passwords 'look' the same length.
i.e., if hashing any string always results in a 20 character hash, then if you have only the hash to look at, you can't tell whether the original password was eight characters or sixteen, for example.
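A quick way to see the fixed-length property, using SHA-1 from Python's standard library as the example hash (the choice of SHA-1 and the name `sha1_length` are just for illustration). SHA-1 digests are always 20 bytes, or 40 characters when hex-encoded, whatever the input length.

```python
import hashlib


def sha1_length(password: str) -> int:
    """Length in bytes of the SHA-1 digest of a password."""
    return len(hashlib.sha1(password.encode("utf-8")).digest())


short_len = sha1_length("8charpwd")
long_len = sha1_length("a much longer sixteen-plus character passphrase")
# Both lengths are equal to hashlib.sha1().digest_size.
```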
I encountered this exact same issue in my workplace. What I did to convince him that hashing was more secure was to write a SQL injection that returned the list of users and passwords from the public section of our site. It was escalated right away as a major security issue :)
To prevent against dictionary/hash attacks be sure to hash against a token that's unique to each user and static (username/join date/userguid works well)
If you do not salt your passwords, you're susceptible to rainbow table attacks (precompiled dictionaries that have valid inputs for a given hash).
The other developer should stop talking about security if you're storing passwords in plaintext and start reading about security.
Collisions are possible, but not a big problem for password apps usually (they are mainly a problem in areas where hashes are used as a way to verify the integrity of files).
So: salt your passwords (by adding the salt to the right side of the password*) and use a good hashing algorithm like SHA-1 or preferably SHA-256 or SHA-512.
PS: A bit more detail about Hashes here.
*I'm a bit unsure whether the salt should go at the beginning or at the end of the string. The problem is that if you have a collision (two inputs with the same hash), adding the salt to the "wrong" side will not change the resulting hash. Either way, you won't have big problems with rainbow tables, only with collisions.
There is an old saying about programmers pretending to be cryptographers :)
Jeff Atwood has a good post on the subject: You're Probably Storing Passwords Incorrectly
To reply more extensively: I agree with all of the above. In theory, the hash makes it easier to get the user's password, since multiple passwords match the same hash. However, this is much less likely to happen than someone getting access to your database.
There is truth in that if you hash something, yes, there will be collisions so it would be possible for two different passwords to unlock the same account.
From a practical standpoint though, that's a poor argument. A good hashing function (md5 or sha1 would be fine) can pretty much guarantee that for all meaningful strings, especially short ones, there will be no collisions. Even if there were, having two passwords match for one account isn't a huge problem. If someone is in a position to randomly guess passwords fast enough that they are likely to be able to get in, you've got bigger problems.
I would argue that storing the passwords in plain text represents a much greater security risk than hash collisions in the password matching.
I'm not a security expert, but I have a feeling that if plain text were more secure, hashing wouldn't exist in the first place.
In theory, yes. Passwords can be longer (more information) than a hash, so there is a possibility of hash collisions. However, most attacks are dictionary-based, and the probability of collisions is infinitely smaller than a successful direct match.
It depends on what you're defending against. If it's an attacker pulling down your database (or tricking your application into displaying the database), then plaintext passwords are useless. There are many attacks that rely on convincing the application to disgorge its private data: SQL injection, session hijack, etc. It's often better not to keep the data at all, but to keep the hashed version so bad guys can't easily use it.
As your co-worker suggests, this can be trivially defeated by running the same hash algorithm against a dictionary and using rainbow tables to pull the info out. The usual solution is to use a secret salt plus additional user information to make the hashed results unique- something like:
String hashedPass = CryptUtils.MD5("alsdl;ksahglhkjfsdkjhkjhkfsdlsdf" + user.getCreateDate().toString() + user.getPassword());
As long as your salt is secret, or your attacker doesn't know the precise creation date of the user's record, a dictionary attack will fail- even in the event that they are able to pull down the password field.
Nothing is less secure than storing plain-text passwords. If you're using a decent hashing algorithm (at least SHA-256, but even SHA-1 is better than nothing) then yes, collisions are possible, but it doesn't matter because given a hash, it's impossible* to calculate what strings hash to it. If you hash the username WITH the password, then that possibility goes out the window as well.
* - technically not impossible, but "computationally infeasible"
If the username is "graeme" and the password is "stackoverflow", then create a string "graeme-stackoverflow-1234" where 1234 is a random number, then hash it and store "hashoutput1234" in the database. When it comes to validating a password, take the username, the supplied password and the number from the end of the stored value (the hash has a fixed length so you can always do this) and hash them together, and compare it with the hash part of the stored value.
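That scheme can be sketched in Python as follows. SHA-256 is an assumption on my part (the answer only asks for a decent hashing algorithm), so the hash part of the stored value is 64 hex characters. The function names and the range of the random number are hypothetical.

```python
import hashlib
import hmac
import secrets

HASH_HEX_LEN = 64  # length of a hex-encoded SHA-256 digest


def _digest(username: str, password: str, number: str) -> str:
    return hashlib.sha256(f"{username}-{password}-{number}".encode("utf-8")).hexdigest()


def make_stored_value(username: str, password: str) -> str:
    """Hash "username-password-number" and append the number to the hash."""
    number = str(secrets.randbelow(10**8))
    return _digest(username, password, number) + number


def check_password(username: str, password: str, stored: str) -> bool:
    """Split the stored value at the fixed hash length, then re-hash and compare."""
    stored_hash, number = stored[:HASH_HEX_LEN], stored[HASH_HEX_LEN:]
    return hmac.compare_digest(_digest(username, password, number), stored_hash)
```

For example, `check_password("graeme", "stackoverflow", make_stored_value("graeme", "stackoverflow"))` returns True, because the number can always be recovered from the end of the stored value.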
more passwords could match the hash and a dictionary/hash attack would be faster.
Yes and no. Use a modern hashing algorithm, like an SHA variant, and that argument gets very, very weak. Do you really need to be worried if that brute force attack is going to take only 352 years instead of 467 years? (Anecdotal joke there.) The value to be gained (not having the password stored in plain text on the system) far outstrips your colleague's concern.
Hope you forgive me for plugging a solution I wrote on this, using client side JavaScript to hash the password before it's transmitted: http://blog.asgeirnilsen.com/2005/11/password-authentication-without.html