I think SHA-2 should be "minor weakness discovered" (if not outright "unbroken"), not "weakened".
At the onset of the SHA-3 competition, everyone was nervous about SHA-2: it appeared as though a good attack was inevitable, what with the cryptanalytic attacks on SHA-1.
But as the competition went on, things got calmer. The attacks against SHA-2 that everyone expected simply weren't coming [1]. So the status quo now is that SHA-2 seems pretty darn safe, and the real focus of the SHA-3 competition shifted: not necessarily a drop-in replacement for SHA-2 in the performance sense, but a design different enough that attacks on SHA-2 wouldn't carry over to it. And Keccak is just that: quite different.
Anyway, my point is that SHA-2 is mislabeled. Honestly, I think cryptographers currently recommend it more than any other hash function; SHA-3's software performance is rather... lacking.
[1] Some may argue that this is because cryptographers were focused on the SHA-3 candidates, but I'm not so sure
The attacks are on reduced-round versions of SHA-256 and SHA-512. None of them describes an attack faster than (dumb) brute force against the full-round functions.
I think labeling SHA-256 as orange is highly misleading. The SHA-2 family of hashes is going nowhere unless the partial-round attacks get a lot closer to the full-round versions. They're still about 20+ rounds away.
If the SHA-2 family has weaknesses, and SHA-2 is used for generating Bitcoin blocks, whoever breaks it first will be an overnight millionaire. Just make sure you break blocks slowly (about 20 a day max) to avoid raising suspicion that the hashing is compromised. Sell as much as possible and then release your paper.
Breaking SHA2 (i.e. developing an economically-feasible preimage attack) would indeed crash bitcoin, but the converse is not true.
A near-collision attack on double SHA256 (if you treat it as a single hash not a pair of independent hashes) would also crash bitcoin, but would not necessarily be a threat to use of SHA256 for authentication purposes.
A bitcoin block solution just needs the hash to include enough leading zeros. Authentication (nearly always being automated) requires every bit to match - hitting 255 of 256 bits is no better than hitting 0 bits, as either way your message will be rejected.
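To make the difference concrete, here's a rough sketch of the two kinds of checks in Python (glossing over Bitcoin's exact byte-order and target-encoding conventions; only the shape of the comparison matters here):

    import hashlib, hmac

    def pow_check(header: bytes, target: int) -> bool:
        # Proof-of-work style check: the double SHA-256 of the header just has
        # to fall below a numeric target, i.e. start with enough zero bits.
        digest = hashlib.sha256(hashlib.sha256(header).digest()).digest()
        return int.from_bytes(digest, "big") < target

    def mac_check(key: bytes, message: bytes, tag: bytes) -> bool:
        # Authentication check: every bit of the tag has to match, or the
        # message is rejected. A near miss is worth nothing.
        expected = hmac.new(key, message, hashlib.sha256).digest()
        return hmac.compare_digest(expected, tag)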
Break any of these and the value of bitcoins will crash hard.
You can't take millions out of a system which doesn't actually have millions' worth of liquidity backing it. The total worth of all bitcoins is actually far smaller than their market capitalization, because trading is based on the assumption that many, if not most, coins are dead coins.
Well, MtGox monthly volume is over 1 million BTC [1], or $100 million at current prices. I think you should be able to take out at least a few million without raising suspicion.
In the case of a SHA-2 compromise in particular, generating, say, 20% of daily blocks would hardly be noticed (and could easily be explained as a new shipment of ASICs). That's about 720 BTC, or $72,000, daily.
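For the record, the arithmetic behind those figures, assuming the usual ~144 blocks per day and the 25 BTC reward in effect since the November 2012 halving:

    blocks_per_day = 24 * 6                  # one block every ~10 minutes
    reward = 25                              # BTC per block
    btc = blocks_per_day * reward * 0.20     # 20% share = 720 BTC/day
    usd = btc * 100                          # ~= $72,000/day at ~$100 per BTC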
Use of SHA-1 for digital signature generation has been deprecated by NIST since 2011. It's disallowed after 2013, which is important for software aiming for government use.
Indeed it was. Wang's breakthrough work [1,2] broke most of the common hash functions at the time, and later also SHA-1 [3]. The SHA-3 competition was motivated by this streak of new successful attacks.
This is actually a pretty interesting question. The answer, at least for Merkle-Damgard hash functions (MD5, SHA-1, SHA-2, etc.), is that concatenating (or "cascading") hash functions doesn't really improve the strength of the resulting construction.
Merkle-Damgard hash functions look like this:
    function MD(M, H, C):
        for M[i] in pad(M):
            H := C(M[i], H)
        return H
For message M, initial state H, and compression function C. In other words, pad the message, break it into blocks, and use the compression function to "mix" each block into the hash function's state. The final state is the result.
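If it helps to see it run, here's a minimal Python sketch of the same loop. The compression function is just truncated SHA-256 and the padding is a simplified version of MD-strengthening, both purely for illustration; a real MD hash uses a purpose-built compression function.

    import hashlib

    BLOCK_SIZE = 8   # toy block size in bytes
    STATE_SIZE = 4   # toy state/digest size in bytes

    def pad(M):
        # Simplified MD-strengthening: append 0x80, zero-fill, then the length.
        M += b"\x80" + b"\x00" * ((-len(M) - 9) % BLOCK_SIZE) + len(M).to_bytes(8, "big")
        return [M[i:i + BLOCK_SIZE] for i in range(0, len(M), BLOCK_SIZE)]

    def C(block, state):
        # Toy compression function: truncated SHA-256, for illustration only.
        return hashlib.sha256(state + block).digest()[:STATE_SIZE]

    def MD(M, H=b"\x00" * STATE_SIZE):
        # The Merkle-Damgard loop from the pseudocode above.
        for block in pad(M):
            H = C(block, H)
        return H

    print(MD(b"hello world").hex())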
Antoine Joux showed that for any MD hash function, generating many collisions is not much more difficult than generating one. Here's how:
1. Take the initial state H[0].
2. Find two single-block messages that collide under C with H[0] as the input state. Call the result H[1].
3. Now find another pair of single-block messages that collide under C with H[1] as the input state. Call the result H[2].
4. Iterate as needed.
Here's the trick: each single-block collision you find actually doubles your set of colliding messages, because at each block position you can select either of the two candidate blocks. So if the effort required to find a collision in a b-bit hash function is 2^(b/2), the effort to find 2^n messages that all collide (a 2^n-way multicollision) is only n * 2^(b/2).
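Here's a toy sketch of that attack in Python. I've shrunk the state to 16 bits so each single-block collision search is instant, the compression function is just truncated SHA-256, and padding/finalization is ignored - all purely for illustration:

    import hashlib
    from itertools import count, product

    STATE_SIZE = 2  # 16-bit toy state, so each block collision takes ~2^8 work

    def C(block, state):
        # Toy compression function (truncated SHA-256), for illustration only.
        return hashlib.sha256(state + block).digest()[:STATE_SIZE]

    def find_block_collision(state):
        # Brute-force two distinct single blocks that collide under C from `state`.
        seen = {}
        for i in count():
            block = i.to_bytes(4, "big")
            out = C(block, state)
            if out in seen:
                return seen[out], block, out
            seen[out] = block

    def joux_multicollision(n, state=b"\x00" * STATE_SIZE):
        # n successive single-block collisions -> 2^n messages, one final state.
        choices = []
        for _ in range(n):
            a, b, state = find_block_collision(state)
            choices.append((a, b))
        return choices, state

    def chain(msg, state=b"\x00" * STATE_SIZE):
        for i in range(0, len(msg), 4):
            state = C(msg[i:i + 4], state)
        return state

    choices, final_state = joux_multicollision(10)
    messages = [b"".join(picks) for picks in product(*choices)]
    assert len(set(messages)) == 2 ** 10
    assert all(chain(m) == final_state for m in messages)
    print(len(messages), "distinct messages all hash to", final_state.hex())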
What does this mean for cascaded hash functions? Well, consider a hypothetical construction that simply concatenates a 128-bit hash function with a 160-bit function. A plan of attack might look like this:
1. Find 2^80 colliding messages under the 128-bit function. This should take roughly 80 * 2^64 ~= 2^70.3 units of "effort".
2. Evaluate each message under the 160-bit function. There's probably a collision in there somewhere. This will take around 2^80 work, dwarfing what we did in the first step.
The effort to find a collision under both functions is thus only about what it takes to find a collision under the stronger of the two.
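Here's the same plan end to end, as a toy Python sketch with the same caveats as above (a 16-bit MD hash and a 24-bit hash built from truncated SHA-256 and SHA-1, so it finishes in well under a second; the real attack has the same shape, just with 2^64-sized numbers):

    import hashlib
    from itertools import count, product

    F_SIZE, G_SIZE = 2, 3   # 16-bit "weak" MD hash F, 24-bit "stronger" hash G

    def C(block, state):
        return hashlib.sha256(state + block).digest()[:F_SIZE]

    def F(msg, state=b"\x00" * F_SIZE):
        # Toy Merkle-Damgard hash built on C (padding omitted).
        for i in range(0, len(msg), 4):
            state = C(msg[i:i + 4], state)
        return state

    def G(msg):
        # Toy stand-in for the stronger half of the cascade.
        return hashlib.sha1(msg).digest()[:G_SIZE]

    def find_block_collision(state):
        seen = {}
        for i in count():
            block = i.to_bytes(4, "big")
            out = C(block, state)
            if out in seen:
                return seen[out], block, out
            seen[out] = block

    # Step 1: build a 2^16 multicollision under F (16 cheap block collisions).
    state, choices = b"\x00" * F_SIZE, []
    for _ in range(16):
        a, b, state = find_block_collision(state)
        choices.append((a, b))

    # Step 2: birthday-search those 2^16 messages under G.
    seen = {}
    for picks in product(*choices):
        msg = b"".join(picks)
        g = G(msg)
        if g in seen:
            m1, m2 = seen[g], msg
            break
        seen[g] = msg

    assert m1 != m2 and F(m1) == F(m2) and G(m1) == G(m2)
    print("collision under F||G:", (F(m1) + G(m1)).hex())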
Shameless plug: we will have a few problems exploring the consequences of Joux collisions in set seven of the Matasano crypto challenges.
Ah, so in fact the naive concatenating solution I gave, in addition to being just as easy because the attacker only has to break half of it, is actually even easier because the attacker has two targets to collide with.
What about non-concatenative methods? Could you do something like "shift and xor"? For example, say you have a 4-byte hash function, so that:
And then you could iterate that to extend to as many bytes as you want? And then maybe you'd want to xor the first and last byte together so that nothing from the first hash remains - although now I'm thinking intuitively, which generally seems to be a bad idea with cryptography.
> Ah, so in fact the naive concatenating solution I gave, in addition to being just as easy because the attacker only has to break half of it, is actually even easier because the attacker has two targets to collide with.
I wouldn't say it's easier. Remember that we need to find a single message that generates a collision under both hash functions. So our strategy is to generate a massive number of collisions for the shorter function and hope that there's one pair of messages in there that collide under the longer function.
> What about non-concatenative methods?
I think this will again boil down to finding a single message that generates a collision under both hash functions. It won't matter too much whether you XOR the hashes or concatenate them.
Hmm. It still seems to me that the simple concatenative method could still be broken at least slightly more quickly, since each function can be collision tested separately, but my brain is vaguely gesturing at comprehension about why xoring is still weak.
I feel like we should be teaching the principles of crypto to young children so that we end up with some humans that can grok it as easily as the rest of us do algebra. But there are a great many things I would want young children to be taught if I were made the Benevolent Dictator of All School Boards.
Hmm. That seems like it's still susceptible to the weakness sdevlin mentioned. For c = 1, it is the same thing that I suggested initially, and for larger c, it merely makes the hash more expensive to compute by iterating it, but the concatenation weakness would still hold, wouldn't it? It seems to me that this is perhaps useful for generating keys from passphrases, but not for lengthening a hash used to store a password in a database.
Yes, I think so. IIUC, with this construction, the difficulty only increases linearly with the length of the output, rather than exponentially, as one might expect.
I don't think there's a generically secure way to extend short hash functions to get an exponential difficulty increase. Otherwise, we could just construct arbitrary-length hash functions using small (e.g. 32-bit) building blocks without needing to cryptanalyze the result.
But then again, I haven't been paying attention to the literature lately.
It's used in Bitcoin as well (after SHA-2) to shorten up addresses. Come to think of it, I think Tor uses it for .onion addresses... which they shorten to 80 bits (of preimage resistance!)
But it does matter, because if you can exploit a weakness in the hash function you can figure out the salt, strip it, and then use your precomputed dictionary.
That's not really how it works. Either way, you shouldn't be using just a regular "hash function" anyway. Even basic constructions like PBKDF2 use HMAC constructions where SHA1 and even MD5 are still pretty safe to use (although not very computationally expensive.)
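For what it's worth, a minimal sketch of that approach using only the Python standard library (the SHA-256 choice and the iteration count here are placeholders, not recommendations):

    import hashlib, hmac, os

    def hash_password(password, iterations=200_000):
        salt = os.urandom(16)   # random per-password salt
        dk = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)
        return salt, iterations, dk

    def verify_password(password, salt, iterations, expected):
        dk = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)
        return hmac.compare_digest(dk, expected)   # constant-time comparison

    salt, iters, stored = hash_password("correct horse battery staple")
    assert verify_password("correct horse battery staple", salt, iters, stored)
    assert not verify_password("wrong guess", salt, iters, stored)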
It'd be nice to have Whirlpool in the list - I remember when it was seen as the great new hope for a good hash, but I haven't heard anything about it in recent years.
These aren't cryptographic hash functions exactly, though, at least not in the sense that a cryptographer would think. I mean, they will fit just about any definition of a cryptographic hash function you can think of, but really it's not that useful to label them as such. Instead, they're usually called key derivation functions.
On top of that, even if we were to include these in a hash function list, they're decidedly not the most important ones. Most important for password storage and other key derivation, perhaps, but the applications of hash functions are far more general. The preimage resistance of scrypt is reliant on SHA256, for instance.
Cryptographic hash functions must be efficient to compute. Those examples (scrypt, bcrypt, etc.) were designed to be difficult to compute. Those are password hash functions, not cryptographic hash functions. Two totally different things with different purposes.
Well, in complexity theory (which theoretical cryptography uses heavily), an efficiently computable function is one that has a polynomial-time algorithm to compute it.
I mean, wouldn't you say that scrypt is efficient to compute? For instance, is 5 seconds not a relatively quick function evaluation? Compare that to super-polynomial-time attacks, some of which wouldn't succeed before our Sun burned out and Earth died. And if you ramp up the security parameters to an insane degree, the user can no longer compute the function themselves. That's the reason for the "efficient to compute" clause in most definitions.
So, while you're right that the fact that KDFs are designed to be much slower than hashes is what really separates them, that doesn't disqualify KDFs from (technically) being cryptographic hash functions. At least, not if you view the definition in a theoretical sense, which is the appropriate way to do so. Still, I agree with your premise; in a practical sense, KDFs shouldn't feel like they are cryptographic hashes, since their purpose is markedly different.
Can you explain why a 40 char text is 1000 times easier to crack than a 10 char password according to the graphic?
Is it assumed you don't use any numbers/symbols and the attacker knows your dictionary?
The "40 char text" is based on NIST guidelines for estimating the entropy in English text -- i.e., dictionary words which make grammatical sense together. The "10 char password" is for 10 random printable ASCII characters.
You're correct. See the original scrypt paper by Percival [1], halfway down page 13, for a description of the categories. The table itself is at the top of page 14.