This sounds like pretty much the same argument that @eganist and I made to Googl...

tialaramex · on Sept 8, 2018

Do you really feel that deprecation of the HPKP self-ransoming foot gun came out of nowhere?

buu700 · on Sept 8, 2018

Well, the timing wasn't 100% random since the newly supported Expect-CT header was HPKP's "replacement", but I do think three years is a ridiculously small turnaround time between initially adding support for the feature and killing it with low adoption as a stated reason.

I'd also say the footgun aspects of HPKP are a weak excuse to kill it, given that nothing really new about them has been discovered that wasn't acknowledged as a consideration in the original spec. If anything, I think it would've made more sense to improve the UX for both end users and admins/devs to reduce the likelihood of deployment mistakes (better documentation and tooling) and the potential for damage when mistakes did happen (e.g. make HPKP error screens skippable like any other TLS errors).

dagenix · on Sept 8, 2018

> make HPKP error screens skippable like any other TLS errors

That largely defeats the point. Almost no one knows what to make of those errors. And just training everyone to ignore them makes HPKP pointless.

The problem with HPKP was that it could be used to attack any site on the internet with no way for websites to opt out. Basically, the same problem as with certificate authorities - but worse.

Those issues we're know when the spec was written, true. But, it was still a dangerous and extremely difficult to deploy feature. Good riddance.

buu700 · on Sept 8, 2018

That largely defeats the point. Almost no one knows what to make of those errors. And just training everyone to ignore them makes HPKP pointless.

How so? Are you suggesting that TLS as a whole is pointless because browsers allow users to skip the error screens ("Proceed to balls.com (unsafe)")?

Even if the error screen is skippable, it makes it clear to the user that something is very wrong and that they're advised to abort their usage of the site.

The problem with HPKP was that it could be used to attack any site on the internet with no way for websites to opt out. Basically, the same problem as with certificate authorities - but worse.

Definitely agreed on that. (One of our BH/DC demos, RansomPKP, showed that you could actually pivot from a server compromise or MitM to deploying ransomware that would brick an entire domain and hold it hostage until you got paid.)

I just think there are much better ways of going about addressing that. Specifically, my proposal to the Chromium team was as follows:

1. Short-term (Chrome 67): Any time dynamic PKP is used, print a console warning that additional requirements are planned to be attached to the use of dynamic PKP with a relevant link.

2. Medium-to-long-term (ecosystem-wide collaboration): Disregard dynamic PKP headers unless the domain in question has some kind of new indicator in the certificate to show that the CA has validated that the site owner is really really sure they want to use HPKP and understands the risks involved (i.e. offload the whitelisting/validation responsibility from individual browser vendors to the broader CA industry).

(And I had some other ideas about roping CAA into the mix to address some specific concerns, but it wasn't critical or the meat of the idea.) The response was kind of handwavy — not so much caring about the footgun aspect (i.e. accidental self-bricking and hostile pinning), but more an entirely unrelated concern about the HPKP implementation being hard to maintain for some unspecified reasons.

Those issues we're know when the spec was written, true. But, it was still a dangerous and extremely difficult to deploy feature. Good riddance.

I hear this repeated a lot, and frankly I think it's nonsense. I just can't see how anyone with a basic knowledge of deploying TLS would be confused about how HPKP works. The idea that only a veteran sysadmin or crypto expert can understand how to use certs, public keys, and hashes just seems really elitist to me.

Obviously people have occasionally screwed up in the wild, but in those cases I think the fault lies more in the tooling and documentation than in the existence of the standard itself. Further, if we do collectively feel that everyone's hands need to be held, attaching additional requirements to its usage as in my proposal would neatly accomplish that while minimally imposing on all of us who already depend on HPKP in production.

dagenix · on Sept 8, 2018

> Even if the error screen is skippable, it makes it clear to the user that something is very wrong and that they're advised to abort their usage of the site.

If users do abort their usage of the site, the site is effectively bricked. If they don't, then HPKP accomplished nothing because users are using the site despite a possible mitm.

I guess a user could use the site, but more cautiously - such as not entering passwords. That's possible - but I'm skeptical that many users would actually do so.

> I hear this repeated a lot, and frankly I think it's nonsense. I just can't see how anyone with a basic knowledge of deploying TLS would be confused about how HPKP works.

It's not that it's hard to understand, it's that it's hard to actually implement it. You need to have multiple certs in case one of them gets compromised. And if you mess that up, then you either self brick yourself or you need to keep using a known compromised cert.

HPKP just wasn't worth it for most websites - a reduction in the risk of someone presenting a forged cert in exchange for the risk of accidentally self bricking your website.

buu700 · on Sept 8, 2018

If users do abort their usage of the site, the site is effectively bricked. If they don't, then HPKP accomplished nothing because users are using the site despite a possible mitm.

That's exactly the same situation as any other TLS failure, not at all unique to HPKP in any way that I'm seeing.

It's still effectively bricked for non-advanced users and partially bricked for careful advanced users in the way you noted, but at least users can choose for themselves how to proceed, and admins of bricked sites can give them guidance that doesn't involve following convoluted instructions to navigate about:config or chrome://net-internals.

It's not that it's hard to understand, it's that it's hard to actually implement it. You need to have multiple certs in case one of them gets compromised. And if you mess that up, then you either self brick yourself or you need to keep using a known compromised cert.

More accurately, multiple keys, not multiple certs. All you need is to back up the spare key somewhere without throwing it out, which is a minor annoyance but not at all technically difficult.

If users are having trouble with understanding and/or following through with this, I would start with building a better interface than the openssl CLI (possibly as a certbot command) before deciding that the entire concept of key pinning is somehow inherently too difficult to be useful.

HPKP just wasn't worth it for most websites - a reduction in the risk of someone presenting a forged cert in exchange for the risk of accidentally self bricking your website.

Yeah, it should certainly be highly discouraged for almost everyone, but getting rid of it after we already have it is a huge step backwards for the 1% of sites with strict enough security requirements to justify it.

dagenix · on Sept 8, 2018

> That's exactly the same situation as any other TLS failure, not at all unique to HPKP in any way that I'm seeing.

Yup. But the feeling I'm getting is that browser vendors see this behavior as non-ideal since it trains users basically ignore the error. Yeah, in theory the user gets to make their own decision. My theory is that almost no user is actually equipped to make such a decision.

> admins of bricked sites can give them guidance that doesn't involve following convoluted instructions to navigate about:config or chrome://net-internals.

I see this as a worst case outcome - explicitly telling users its ok to bypass a security warning.

> All you need is to back up the spare key somewhere without throwing it out, which is a minor annoyance but not at all technically difficult.

Not technically hard, but still plenty of ways to mess it up. And once it's messed up, there isn't much of a good way to fix it.

buu700 · on Sept 8, 2018

Ah, well that's fair, and I think I'd generally agree with that. I don't have an alternative proposal for handling TLS failures in general, but I think it's silly to arbitrarily make HPKP's UX a special case, and then cite that special case UX as a reason for deprecating it.

tialaramex · on Sept 9, 2018

How is this UX behaviour a special case? HSTS also requires the brickwall UX, so does the OpenSSH key change scenario.

The original sin the Browsers had is that the initial SSL UI was built by people who had no security UX background because almost nobody had any security UX background. This was the era when PGP was considered usable security technology.

So when HCI studies start being done (e.g. at Microsoft) and they come back with the scary result that real users just perceive TLS error dialogs and interstitials as noise to be skipped, there is a problem. Lots of real world systems depend upon skipping these errors. I worked for a large Credit Reference Agency which had an install of Splunk, but for whatever insane reason it was issued a cert for like 'splnkserver.internal' and the only HTTP host name that it accepted was 'splunkserver.internal'. So every single user of that log service had to skip an interstitial saying the name doesn't match. For years. Probably still happens today.

Browsers couldn't just say "OK, that was bad, flag day, now all TLS errors are unskippable" because of the terrible user experience induced, so what happened instead is a gradual shift, one step at a time, from what we know was a bad idea, to what we think is a better idea. That means e.g. "Not Secure" messages in the main browser UI replacing some interstitials, and brick walls ("unskippable errors") in other places where we're sure users shouldn't be seeing this unless they're being attacked.

HPKP was new, so like HSTS it does not get grandfathered into the "skippable because this is already so abused we can't salvage it" state. If you went back and asked HPKP designers "Should we do this, but with skippable UI?" they would have been unequivocal, "No, that's pointless". HPKP and HSTS only improve security if the users don't just ignore them, and the only way we've found to make the user actually pay any attention is to make the error unskippable.

Yes that means "badidea" and subsequent magic phrases in Chrome were, as they say themselves, a bad idea. Because users who know them just skip the unskippable errors and end up back in the same bad place.

buu700 · on Sept 9, 2018

Thanks for all the interesting context and backstory; I wasn't aware of any of that.

In any case, if it was unclear, my point here wasn't that I necessarily dislike the brickwall UI. In light of the studies you've referenced, I definitely prefer it, and if it were up to me it would be enabled for all of TLS regardless of how many existing services with broken deployments are out there.

My point is that, if the more secure UX is part of the reason for Google's decision, I would rather have HPKP with a less secure UX than not have it at all.

dagenix · on Sept 8, 2018

Fair enough. The way that HPKP was deployed and designed and then undeployed was really quite awkward.

0xbadcafebee · on Sept 9, 2018

The more awkward point to me was that HPKP and HSTS were invented to begin with. It's like everyone is sitting on top of a pink elephant, going "Well everyone, we can't acknowledge the pink elephant in the room, but we can make it a nice hat."