I've asked these questions in interviews, and a lot of programmers seem to know the basics (what a hash is) but fall short when it comes to things like what a salt is, modern storage techniques (bcrypt, scrypt, PBKDF2), and how rainbow tables can be used to crack hashes.
It's a bit tricky to fail someone in an interview because they don't know security. I think the better way to handle security in a company is to hire a programmer that knows it and teaches it to other programmers during code reviews.
Is there a good article or breakdown of modern security best practices? I am a relatively experienced programmer and know what a hash and a salt are, and I have read the Wikipedia article on rainbow tables (though I now forget what they are, other than that they make certain attacks easier), but that's the extent of my knowledge.
That site recommends libsodium, which by default uses Argon2. The issue is that Argon2 is not very mature yet, and if you are using Python there is no good library out there. Ideally you want something well-tested, since it is possible for libraries to have bugs as well.
How is this a failure? That's correct. (If your password hashing library doesn't handle this automatically, of course. But it does the same internally.)
The big thing about this is that it is perfectly "OK" to store the algorithm, cost, and salt alongside the hash.
Most people seem to think, myself included when I was new to this, that storing all those things together would compromise the security. But the point of the hash is that it is (almost) impossible to produce a matching hash without the user's password, and there is no way to recover the password from the entire string you posted.
I'm naive about these things, but I was under the impression that salt just thwarted pre-computed hash tables? I guess "just" should be in quotes.
So somebody with resources and motive could still brute-force that string. It seems that storing the salt somewhere other than next to the hash would add about as much security as the salt itself does. That seems prudent, along the lines of "don't put all your eggs in one basket."
> but I was under the impression that salt just thwarted pre-computed hash tables?
Yes. Without a salt, two users with the password 'dadada' would hash to the same value.
Now 1234:dadada hashes differently than 1326:dadada, preventing the use of a precomputed table. (You could still run through all the salts for the common passwords, but that usually takes quite a while as well.)
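A minimal sketch of the idea in Python (illustration only: a single SHA-256 round stands in here for a real password hash like bcrypt/scrypt/PBKDF2):

    import hashlib

    def toy_salted_hash(salt: str, password: str) -> str:
        # Illustration only: real systems should use a deliberately
        # slow hash (bcrypt/scrypt/PBKDF2), not one SHA-256 round.
        return hashlib.sha256(f"{salt}:{password}".encode()).hexdigest()

    # Same password, different salts -> completely different digests,
    # so one precomputed table can't cover both users.
    print(toy_salted_hash("1234", "dadada"))
    print(toy_salted_hash("1326", "dadada"))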
Rather than expecting the password hash library to store something into your application DB, you should be managing the access to that DB yourself.
In our case, we use an immutable attribute of each user as their hash. This might be an internal identifier, or the timestamp on which their account was created, or something like that.
> Rather than expecting the password hash library to store something into your application DB, you should be managing the access to that DB yourself.
You do manage it yourself. The password hashing library doesn't access your database; it produces a string that you store, which includes the salt and the password hash.
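For example, with the Python bcrypt package (a sketch; names and parameters as commonly used), the returned string already bundles everything:

    import bcrypt

    # gensalt() picks a random salt and encodes the cost factor.
    hashed = bcrypt.hashpw(b"correct horse battery staple",
                           bcrypt.gensalt(rounds=12))

    # `hashed` looks like b"$2b$12$<salt><hash>": algorithm ($2b$),
    # cost (12), salt, and hash, all in one string. Store just this
    # one value in your database.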
> In our case, we use an immutable attribute of each user as their hash
What? You really need to talk to security-competent people.
> In our case, we use an immutable attribute of each user as their hash.
I assume you mean "as their salt". And even then, why the half-measure? Just laziness? Sure, a guessable/computable salt is better than no salt, but it's not nearly as good as a random salt.
> why the half-measure? Just laziness? Sure, a guessable/computable salt is better than no salt, but it's not nearly as good as a random salt.
But isn't the salt essentially safe to make public anyway? That being the case, how does it matter what value you use, so long as it differs between users?
Ideal salt is a large (e.g. 16 bytes or more) random byte string generated for each password.
If there's a reason for it (in most cases, there is none), some trade-offs are possible, e.g.:

Salt is a large random string unique per user, not per password: given two hashes of passwords for the same user, this reveals whether the passwords are the same.

Salt is a small random string or some predictable value: attackers can precompute guesses and then look them up.
If you use some immutable identifier per user as salt, both of these attacks are possible. Is there a reason for that? I'm 100% certain there isn't: since you already store the password hash in your database, you can just as well generate a large random salt for each password hash and store it alongside.
As for "safe to make public": there are many things in crypto called "public" where "public" doesn't mean the whole world is free to get it, but rather the opposite of "private", or, as I like to call them, "non-secret". Yes, the salt can be made public, but it shouldn't be (unless there's a reason for it, like a kind of client-side crypto where the server stores the salt and sends it to clients), to avoid precomputation.
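A sketch of that in Python using only the standard library (the iteration count is illustrative; pick it per current guidance):

    import hashlib, os

    def hash_password(password: str):
        salt = os.urandom(16)  # fresh 16 random bytes per password
        digest = hashlib.pbkdf2_hmac("sha256", password.encode(),
                                     salt, 600_000)
        return salt, digest    # store both next to the user record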
> Salt is a large random string unique per user, not per password.
Of course it's per user.
But "large" makes some sense. My current implementation has maybe 20-22 bits of uniqueness in the salt, certainly less than 16 bytes.
I don't think 16 bytes is necessary even as insurance against the future. Rainbow tables are still expensive to build.
On the other hand, a small table covering just the stupidest passwords ("password", "12345678", etc.) would be cheap to build, so maybe it's worth making that harder.
> I don't think 16 bytes is necessary even as insurance against the future.
The birthday problem comes into play here.
If you have 22 bits of entropy in your salt, after 2048 users (2^11) you will find two with the same salt, with 50% probability. If they also use the same password, this makes attacking your users much easier.
Don't make it easy for attackers. Use 16 bytes from a CSPRNG. Better yet: Use a password hashing library that takes care of this for you.
If you use a 128-bit (16-byte) salt, you have a 50% chance of a collision after 2^64 passwords.
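For reference, the usual birthday-bound approximation says a collision hits ~50% probability after about n ≈ sqrt(2·ln 2) × 2^(k/2) samples for k-bit salts:

    import math

    def collision_threshold(bits: int) -> float:
        # Sample count at which a collision reaches ~50% probability.
        return math.sqrt(2 * math.log(2)) * 2 ** (bits / 2)

    print(round(collision_threshold(22)))  # ~2400 users for 22-bit salts
    print(collision_threshold(128))        # ~2.2e19 (about 2^64) for 16-byte salts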
It being unique goes most of the way, you're right (though hopefully it actually is unique!). I was being dramatic when I said "not nearly as good". But making the salt easily guessable does allow an attacker to precompute rainbow tables, etc. So if there was a breach and an attacker got a dump of your password hashes, it might mean the difference between you having time to invalidate those passwords or not.
I'm not too familiar with this library, but on inspection this approach seems to have a couple of drawbacks that libraries like bcrypt solve for you:
1) You need to store the salt alongside the password hash.
2) If you want to futureproof the stretching factor (e.g. change from 100000 to 1000000), you need to store that alongside the password hash as well.
3) If you want to futureproof the hashing algorithm, you need to store that alongside the password hash.
The value of the *crypt solutions is that they store the input parameters as part of the stored secret. So you can make adjustments later on without invalidating existing stored passwords, or having to resort to annoying "double-hashes" to migrate to a new approach.
I don't understand your comment about the ORM needing to handle passwords. It's a simple fetch of a field from the DB, which you then pass as an input to your password validator. How is that any harder than fetching a salt and a hash and passing those to your validator?
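For what it's worth, the verify path with a self-describing format like bcrypt's is a one-liner (a sketch, assuming a single stored password-hash column):

    import bcrypt

    def check_login(stored_hash: bytes, attempt: str) -> bool:
        # checkpw reads the cost and salt back out of the stored string
        # itself, so hashes created under older parameters keep verifying
        # even after you raise the cost for new ones.
        return bcrypt.checkpw(attempt.encode(), stored_hash)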
You should use the md5 function to produce a string, and later compare it against the md5 of the password the user enters.
md5 is obsolete, does not offer much security, and should not be used in new programs except for the purpose of interoperating with old programs, and even in that case one should weigh the risks of interoperation against the costs of replacing it.
If you are looking for a suitable replacement for md5, you should instead be looking at specific use-cases, such as content verification (links...), authenticated content verification (links...), password verification (links...), random number generation (links...), shared secret encryption (links...)
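A rough sketch of modern stand-ins in Python, one per use-case (illustrative choices, not the only options):

    import hashlib, hmac, secrets

    data = b"payload"
    key = secrets.token_bytes(32)

    digest = hashlib.sha256(data).hexdigest()           # content verification
    tag = hmac.new(key, data, hashlib.sha256).digest()  # authenticated verification
    token = secrets.token_urlsafe(32)                   # random tokens
    # Password verification: use bcrypt/scrypt/PBKDF2/Argon2, never a bare hash.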
Personally, I think it's quite enough if a programmer at least knows that he doesn't know certain aspects of security — and instead of importing standard library's md5 module, spends at least 20 minutes on Google when faced with such a task.
Recruiters should be administering work sample tests tuned to the signal of what the company makes, and the company should be doing security audits at a regular interval.
It's strange that candidates would be expected to have any knowledge of security when security is very hard to get right. This is the domain of a dedicated specialist, not a generalist who's usually pressed for time and whose work usually isn't audited.
If you want to get security 90% right, you can spend a few days doing the cryptopals challenges. But you need to get security 100% right, because 90% right means your systems can be compromised. A situation like that pretty much demands security audits.
(It doesn't help that the go-to answer of "Who should we contact to get a security audit?" is "Someone that will charge half a year's salary of a decent engineer," though.)
Oh come on. Asking someone about password encryption is basic computer science and general intelligence. It's not about writing bcrypt. It's about thinking for five minutes about why we have passwords and what protection they need to be meaningful.
That you would deny someone employment at your company based on their lack of bcrypt knowledge -- knowledge which can be gained in five minutes on Google, which candidates don't have access to during an interview -- is evidence of how broken our hiring processes are.
Auditing isn't enough in my experience. Most audits just focus on paper security. They'll run Nessus and some other automated tools, do no filtering of the output, leave it to your developers to actually check the results, and then go through an extensive box-checking exercise that is incapable of handling any app-specific context.
In the worst case, they may recommend outdated or harmful practices that will actually lower security. For example, complex password rules and rotations instead of strength checking and password managers; this leads to users just writing things down, since they can't memorize ever-changing passwords. Or they may force you to install antivirus on systems that don't need it, even though antivirus products have been shown to be full of security flaws that actually weaken a system when installed into privileged or sensitive subsystems.
Paper security is distinct from genuine security. They overlap certainly, but paper security is not enough to make you secure.
I think you underestimate the difficulty of this concept. It's natural to me now, but when I was first learning about this stuff, it took me months of on-and-off study to understand why you do things, how you do them, and what's dumb to do, and that's just the basics. If you hang out with someone who knows what they're doing, they can teach you what to do in a few pithy sentences, but the stuff about entropy and the differences between encryption, authentication, and hashing takes time, as does understanding the nature of the likely attacks against them.
That is to say, you can tell someone to google and they'll find "use bcrypt", but they'll still feel scared and confused, because they don't even know whether that's a quality suggestion, let alone what bcrypt is doing and why it's better than MD5.
That would be useful, but won't solve the problem alone. It's highly likely there were any number of engineers in these companies fuming about poor practice, but lacking management support to prioritise it.
Look at how TalkTalk ignored a security researcher's warnings a year before suffering a major attack. They were also storing passwords in plain text, available to support staff, and came up with comments like "We're squeaky-clean on security". That's a PR response, not one from the engineers who know what's going wrong.
https://paul.reviews/value-security-avoid-talktalk/
In theory, everyone knows what's the best (or at least a good enough) way to do something like this.
In practice, there are a lot more things involved, leading to stupid decisions like this: something that was supposed to be temporary made permanent by growing technical debt, unclear responsibilities, shifting priorities, etc.
Security is never any company's top priority, simply because it's not visible. Changing the color of a button is normally more important.
Yeah, I'm currently taking a computer security class for master's students, and we spend quite some time on passwords. It made me think that this should be part of an obligatory course for bachelor students (or some stripped-down version that teaches the basics).
It's mind-boggling how many well-known companies stored passwords badly.
As another comment mentioned, I think the issue is far less that computer scientists don't know good password practice and far more that management doesn't care to give them time to work on it.
My guess: someone was registering thousands of bots using common swear words plus some number as their password, and this particular spam campaign happened to be particularly large.
Sending your password in plain text in email doesn't mean it's stored in plain text; it could be copied from memory into the email before being discarded at the end of execution of the initial request.
Yep. Pavel Durov was forced out by various Kremlin connected folk for daring to stand up to the FSB and the Kremlin. He also went public about their unlawful requests. He then fled Russia.
Just yesterday I clicked on the "Forgot Password" link on the AutoTrader.com website. I was expecting a reset link in my email and instead they just emailed me my password in cleartext. This is a huge website! Not some small business. It completely baffles me.
I wish that there was some way to shame these companies. I've seen some websites that list some of these offenders but they don't appear to be effective enough. I want news articles written about these companies in the magazines that the CIOs care about, with their photo right next to the article.
Yes, it's 2016, but they say that passwords were collected in 2011-2012/2013. Also, phrasing in Russian sources suggests that passwords were not dumped from DB, but actively collected by other means.
Or PBKDF2, which would be my preference in any stuffy corporate environment. Nobody ever got fired for following NIST/FIPS standards.
I'd use bcrypt if I were working on my own project though. I think scrypt goes too far, I would actually be concerned about its speed and memory utilization hurting responsiveness for any sort of web server or other latency-sensitive system.
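A PBKDF2 sketch with just the standard library (iteration count illustrative), including a constant-time comparison on verify:

    import hashlib, hmac

    def verify(password: str, salt: bytes, stored: bytes,
               iterations: int = 600_000) -> bool:
        candidate = hashlib.pbkdf2_hmac("sha256", password.encode(),
                                        salt, iterations)
        # compare_digest avoids leaking timing information.
        return hmac.compare_digest(candidate, stored)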
I just logged into VK to change my password. It happily accepted a 30-character generated password (alphanumeric with upper/lowercase and special characters). However, I was not able to log in with it after changing the password. I tried using their password recovery tool that texts you an MFA code. It never came. Attempting to send another one gave me a "You exceeded daily attempts limit" error message. I guess tomorrow I'm going to TRY to log in just to delete the damn account altogether.
Judging from that 'gram shot, it looks like VK didn't bother actually deleting user profiles (à la Ashley Madison). It will be interesting to see the fallout from this, if confirmed.
OT question: VK is the "Russian Facebook". The article claims it has 280 million users, but a quick Google shows Russia has a population of 143 million.
What gives? Russian speaking countries? Multiple accounts per user?
VK is actually available in essentially all modern languages and is very popular in Belarus, Ukraine, and Kazakhstan; and sometimes used in countries with large numbers of immigrants from those four countries (Canada, Israel, etc) to check on family members or friends.
It's crazy how passwords are stored at sites with millions of users. Secure password storage is one of my top priorities when I am training interns or teaching someone web app dev.