One often-overlooked reason that website designers chose simple MD5 or SHA1 hashing (and why many still do) is portability. Years ago this was more of an issue than it is today. They may wish to change web frameworks, programming languages, service providers, etc., and when doing so they want a simple, easy way to reuse the hashes they have.
Also, many big service providers still use MD5 or SHA1. I saw a company migrate to Google Mail from a legacy mail system, and part of that was integrating some authentication services too. This was several years ago, but at that time, Google accepted hashes in two formats only... MD5 or SHA1.
It's pretty easy to migrate users from one hashing algorithm to another when they next log in:
if( user has an old style hash ){
    if( password verifies against old style hash ){
        add a new style hash
        delete the old style hash
        log them in
    }
} else {
    if( password verifies against new style hash ){
        log them in
    }
}
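In concrete form, that flow might look something like the sketch below, assuming Node, the npm bcrypt package, and a user record with legacySha1 and bcryptHash columns (all of the names are invented):

import * as bcrypt from 'bcrypt';
import { createHash } from 'crypto';

interface User { legacySha1: string | null; bcryptHash: string | null; }

function tryLogIn(user: User, password: string): boolean {
  if (user.legacySha1 !== null) {                       // user still has an old style hash
    const sha1 = createHash('sha1').update(password).digest('hex');
    if (sha1 === user.legacySha1) {
      user.bcryptHash = bcrypt.hashSync(password, 12);  // add a new style hash
      user.legacySha1 = null;                           // delete the old style hash
      return true;                                      // log them in
    }
    return false;
  }
  return user.bcryptHash !== null && bcrypt.compareSync(password, user.bcryptHash);
}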
Eh, why not just bcrypt the SHA1/MD5 hashes? Your auth check will just become bcrypt(SHA1(pass)) rather than bcrypt(pass). You can convert all passwords right away and I don't see any significant downside to it.
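For what it's worth, a minimal sketch of that wrap-in-place idea, again assuming Node with the npm bcrypt package and an existing column of unsalted hex SHA1 digests (the function names are made up):

import * as bcrypt from 'bcrypt';
import { createHash } from 'crypto';

// One-off conversion: wrap each stored SHA1 digest in bcrypt right away,
// no plaintext password needed.
function wrapLegacyHash(sha1Hex: string): string {
  return bcrypt.hashSync(sha1Hex, 12);    // replaces the bare SHA1 in the database
}

// The auth check then becomes bcrypt(SHA1(pass)) instead of bcrypt(pass).
function verify(password: string, wrappedHash: string): boolean {
  const sha1Hex = createHash('sha1').update(password).digest('hex');
  return bcrypt.compareSync(sha1Hex, wrappedHash);
}

The trade-off, as discussed below, is that SHA1 stays baked into the scheme.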
That would be fine if you wanted to retain the legacy hashing function. I was describing a migration from one hashing algorithm to another. You're describing modifying the existing hashing algorithm, which is something different.
It depends if your goal is to have security for everyone right away or if it's to only use one hashing function. I don't see anything wrong with a round of SHA1 in your setup, especially when it allows you to secure everyone's password immediately...
Will there not be a time lapse whilst you run bcrypt on all the existing hashes to update your users table? Given that bcrypt is intentionally 'slow', this might be a problem for applications with large numbers of users (albeit a one-off cost).
I agree that for small sites, changing the auth check and updating all the existing rows (and deleting/overwriting the old hashes) is probably the best solution though.
> Will there not be a time lapse whilst you run bcrypt on all the existing hashes to update your users table?
No, just SHA1 the password and check, and then bcrypt(SHA1) and check, until the re-hashing process finishes. Or just check the length, or store the hash type along with the hash, a la Django. These problems are trivial, really.
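A rough sketch of the "store the hash type along with the hash" variant, with a Django-style algorithm prefix on each stored value (Node + the npm bcrypt package; the exact storage format here is invented):

import * as bcrypt from 'bcrypt';
import { createHash } from 'crypto';

// Stored values look like "sha1$<hex digest>" or "bcrypt$<bcrypt hash>",
// so the checker can dispatch on the prefix during the migration window.
function checkPassword(password: string, stored: string): boolean {
  const sep = stored.indexOf('$');
  const algo = stored.slice(0, sep);
  const rest = stored.slice(sep + 1);
  if (algo === 'bcrypt') {
    return bcrypt.compareSync(password, rest);
  }
  if (algo === 'sha1') {
    return createHash('sha1').update(password).digest('hex') === rest;
  }
  return false;   // unknown prefix: fail closed
}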
If you assume 6 million users and one second to compute a password hash, that's 6 million CPU-seconds, or roughly 70 days (a couple of months) on a single CPU. Takes a while, but no big deal, and it can proceed while your site stays live.
I like this idea for sites that e.g. use Microsoft ASP.NET Membership, where all you can do is set a machineKey attribute specifying what kind of password hashing to use, with no support for any "old hash vs. new hash" alternatives when changing. Anyone know of an implementation that would do this and "drop in" to an existing ASP.NET site? Or pointers to how to approach developing this?
The problem is that you may have to keep this code for a VERY long time (people can go 3+ years without logging into a site and still expect their old passwords to work).
Lugging around legacy code is never a good idea.
4. Expire all passwords and ask your users to come reset them. - No legacy code necessary
Regular users should have no problems if there's an accompanying blog post
Non-regular users might just remember your site/service and come back :)
5. Use bcrypt/password stretching. Store the work factor alongside the password and upgrade it as people log in.
To me that's not really keeping legacy code around; just an extra variable...
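One way to do the upgrade-as-people-log-in part, assuming the npm bcrypt package; note that bcrypt already encodes the cost factor in the hash string, so there isn't even a separate column to manage:

import * as bcrypt from 'bcrypt';

const TARGET_ROUNDS = 12;   // raise this as hardware gets faster

function logIn(user: { hash: string }, password: string): boolean {
  if (!bcrypt.compareSync(password, user.hash)) {
    return false;
  }
  // The plaintext is only available at login time, so this is the one
  // chance to re-hash at the stronger work factor.
  if (bcrypt.getRounds(user.hash) < TARGET_ROUNDS) {
    user.hash = bcrypt.hashSync(password, TARGET_ROUNDS);   // persist this
  }
  return true;
}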
It's not great, though - it means that someone who hacks your server can get users' passwords a lot more easily. I read a couple of weeks ago that Blizzard's games don't send passwords to the server; they send a hash. Things are necessarily crappier on the web, of course.
Edit: there's more detail at the link below. It looks clear that in at least some of their schemes they deliberately do not send the client password to the server, which sounds like a decent idea.
Sending a hash is no different from sending a plaintext password, because the attacker has complete control at their end and can just hack together a client that sends that same hash even if they don't know the original password.
Exactly - "Things are necessarily crappier on the web, of course."
I wonder, though. Could there be a "code has changed" warning from the client? I mean, authentication should be pretty damn stable, and maybe even universal. If someone does modify the page, it'd be nice to know if that change was reflected on other sites, and it'd be nice to know that someone I trust had signed off on it (cryptographically).
A simple alternative is to build it into browsers. A password field could use a per-domain salt and automatically hash whatever gets submitted from it. The server doesn't even need to know about it. You'd have to be more than a little careful building it, obviously, and you'd have to find a way to deal with passwords used on more than one site, but it could work.
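Roughly what that derivation could look like, sketched with the Web Crypto API using PBKDF2 with the domain as the salt (the iteration count is arbitrary):

// Derive a per-domain password so the site never sees the real one.
async function derivedPassword(password: string, domain: string): Promise<string> {
  const enc = new TextEncoder();
  const keyMaterial = await crypto.subtle.importKey(
    'raw', enc.encode(password), 'PBKDF2', false, ['deriveBits']);
  const bits = await crypto.subtle.deriveBits(
    { name: 'PBKDF2', salt: enc.encode(domain), iterations: 100000, hash: 'SHA-256' },
    keyMaterial, 256);
  // Hex-encode the derived bits; this is what would be sent instead of the password.
  return Array.from(new Uint8Array(bits))
    .map(b => b.toString(16).padStart(2, '0')).join('');
}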
Note that getting the client to send only the hashed password is incredibly silly. If there's a leak, the hackers do not even have to crack those passwords.
What would be the point of hashing on the client? Let's say you're using MD5, and you hash the password before you send it. Chances are, that unsalted hash can easily be reversed with any number of lookup tables available online.
You can't salt the hash client side, because anyone can look at your JS and find your salting tactic.
The best approach to securing information going from client to server is SSL.
However, any salting scheme that gets pushed to and used on the client side would have a hard time using a per-user salt. This means that if you could salt it client side, you would need a static salt, which is significantly less secure than a unique salt per user.
How many web login forms (as an example) are doing client-side hashing? The only ones I know are of the recent "test your password against the LinkedIn leak" variety.
How do you degrade for NoScript users?
And where's the harm in that anyway (assuming TLS)?
Graceful degradation is pretty easy. Override the form submit handler to hash the password, set a flag that the password is hashed, empty the original password, and submit the form. Users with JS blocked will just submit the unhashed password as usual.
Of course, if you're salting the hash uniquely for each user, then this approach isn't very helpful.
Finally, if your server will accept hashed passwords, then getting the hash is as good as getting the password for access to your site. The only benefit is that the password is hidden so you may avoid compromising security on other sites.
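Concretely, the submit-handler trick might look something like this (browser-side TypeScript; the hidden password_hash and is_hashed fields are assumptions about the form):

const form = document.querySelector('form#login') as HTMLFormElement;

form.addEventListener('submit', async (event) => {
  event.preventDefault();
  const pwField = form.elements.namedItem('password') as HTMLInputElement;
  // Hash the password client-side (SHA-1 here only to keep the example short).
  const digest = await crypto.subtle.digest('SHA-1', new TextEncoder().encode(pwField.value));
  const hex = Array.from(new Uint8Array(digest))
    .map(b => b.toString(16).padStart(2, '0')).join('');
  (form.elements.namedItem('password_hash') as HTMLInputElement).value = hex;
  (form.elements.namedItem('is_hashed') as HTMLInputElement).value = '1';  // flag for the server
  pwField.value = '';   // empty the original password field
  form.submit();        // form.submit() does not re-fire this handler
});

With JavaScript blocked the handler never runs, so the plain password is submitted as usual and the server accepts either form based on the flag.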
> Of course, if you're salting the hash uniquely for each user, then this approach isn't very helpful.
I've seen this advised a lot. However, where are you going to put the per-user salt? I presume in your users table, or somewhere in your db. Does this not mean that the salts are just as likely to be stolen as the password hashes?
Are you not better off with either a single salt that's stored somewhere outside your db or with some other scheme for algorithmically picking a salt based on the user id?
Salts aren't meant to be secret (or rather, no more secret than the password hashes themselves). The goal is to prevent a precomputed brute force attack.
If you store straight unsalted hashes, you can precompute the hash for every likely password, store the precomputed hash -> password mappings in a nice efficient data structure (http://en.wikipedia.org/wiki/Rainbow_table) and use the same precomputed tables to reverse every hash in the system with a very fast lookup. And every other system using the same unsalted hash scheme.
If you have a salt, this doesn't work -- you'd need a set of precomputed tables for each salt value, which rather defeats the object of precomputation.
If you take it a step further, and use a storage scheme like bcrypt or PBKDF2, not only are you protected from the precomputation attacks, but testing each password candidate takes much longer than a straight cryptographic hash -- so brute force attacks become much slower, too.
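To illustrate the stretching part: with PBKDF2 the iteration count is the knob that makes each candidate password expensive to test (Node crypto; the numbers are arbitrary):

import { pbkdf2Sync, randomBytes } from 'crypto';

const salt = randomBytes(16);                    // per-user random salt
// Hundreds of thousands of HMAC iterations per guess, versus a single
// SHA1 call for a plain hash, so brute force slows down proportionally.
const derived = pbkdf2Sync('hunter2', salt, 600000, 32, 'sha256');
console.log(salt.toString('hex') + '$' + derived.toString('hex'));   // store both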
To put it differently, hashing without salting lets an attacker crack the entire database at once. Try a password, see if the hash is in the database, repeat.
With salts, the attacker can only attack a single user at a time. Try a password with one user's salt, see if the hash is in the database, try the next user's salt, see if that hash is in the database, repeat.
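A toy demonstration of the difference (Node crypto, plain SHA1 purely for illustration):

import { createHash, randomBytes } from 'crypto';

const sha1 = (s: string) => createHash('sha1').update(s).digest('hex');

// Unsalted: identical passwords give identical hashes, so one precomputed
// table (or one guess) cracks every user who picked that password.
console.log(sha1('hunter2') === sha1('hunter2'));                      // true

// Salted: the same password hashes differently for each user, so the
// attacker has to run their guesses against each salt separately.
const saltA = randomBytes(16).toString('hex');
const saltB = randomBytes(16).toString('hex');
console.log(sha1(saltA + 'hunter2') === sha1(saltB + 'hunter2'));      // false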
My point was that sending the hash to the server doesn't work if the hash is salted per user, since you don't know what the salt is. (I suppose you could asynchronously fetch the salt value after the user puts in a username, but at that point the solution as a whole starts to fall apart.)
" … at that time, Google accepted hashes in two formats only... MD5 or SHA1."
That's, ummm, _concerning_…
I don't suppose anyone knows how securely Google are storing my gmail password? I _hope_ it's not unsalted MD5 or SHA1. (Especially since google search is probably the best general purpose md5 hash reverser most people have access to.)
Last migration I did (maybe 12 months ago), Google only accepted passwords as a SHA1 hash IIRC. I was told, however, that this was only an intermediate step, and Google runs that hash through another KDF for internal use. The SHA1 of the password is really only used for transport to their API, as an alternative to sending it in plain text (which is what your browser does when you log in to Google services), so I don't see it as a huge problem. Sure, it could be more secure, but interoperability and keeping a relatively low barrier to entry are very important to them too.
But in that case the password has only been compromised on that one website, as opposed to every other website where it's being used (likely a non-zero number of them given the average user's password habits). I'm no security expert but I think client-side password hashing with the domain name as the salt seems like a pretty good idea, especially for sites without HTTPS logins (but it also helps in the case of a database leak even for sites with HTTPS logins). Of course, for non-HTTPS logins a network attacker could modify the HTML form code to remove the client-side hashing without the user's knowledge, but it's still at least as secure as the alternative, modulo 'false sense of security' type arguments.
Edit: never mind the last parenthetical; it pretty much wouldn't help in a database leak at all (just adds one extra hashing step to the cracking process), sorry. Still helps for non-HTTPS logins though.
Edit 2: ... though maybe if the client-side hash were something strong like bcrypt, it would help in the case of database leaks on HTTPS sites that refuse to use strong hashing on the server side for performance reasons. Sorry for the rambly disorganized post.
For MITM attacks, sure. But it's a fairly portable, inexpensive way to make it more difficult to use your password with other protocols. For example, intercepting a plaintext Google password has a decent chance of making someone's bank account vulnerable.
Right. If someone is able to infect the browser with malware, MitM your HTTPS connection, or even just load mixed HTTP/HTTPS content, then they are able to run JavaScript in the login page. If they are able to run JavaScript in the login page, then they are able to monitor the keystrokes as the password is typed in.
This is not theoretical. This is what Tunisia did to Facebook and it's what online banking trojans (e.g. Zeus) do every day.
> Google accepted hashes in two formats only... MD5 or SHA1.
It's not surprising that they don't support importing custom formats. They support two formats for bulk import and then presumably convert to their own standard from there.