I’ve always been confused about this too. It seems to me like asymmetric encrypt...

PeterWhittaker · on March 10, 2022

The basic idea is simple: A certificate is just a signed document. The document has been encrypted with a private key so that anyone can decrypt it with the public key and know which private key was used to encrypt it.

The complexity comes from the many layers of detail that get added.

First, we don't encrypt the document directly, because public key operations on arbitrarily sized data can be slow, require padding, etc. So we hash the document and encrypt the hash: hashes are fast, they produce fixed-size output (mostly), etc.
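To make the hash-then-sign idea concrete, here is a toy sketch in Python. It assumes textbook RSA with tiny primes and no padding, which is hopelessly insecure and only meant to show the shape of the operation; the names `sign`, `verify`, and `digest_as_int` are mine, not from any library.

```python
import hashlib

# Toy RSA key pair built from tiny primes (illustration only; real keys are
# thousands of bits long and real signatures use a padding scheme).
P, Q = 61, 53
N = P * Q                           # public modulus
E = 17                              # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))   # private exponent (needs Python 3.8+)


def digest_as_int(document: bytes) -> int:
    """Hash the document, then reduce the hash so it fits the toy modulus."""
    h = hashlib.sha256(document).digest()
    return int.from_bytes(h, "big") % N


def sign(document: bytes) -> int:
    """Apply the private-key operation to the hash, not to the document."""
    return pow(digest_as_int(document), D, N)


def verify(document: bytes, signature: int) -> bool:
    """Undo the signature with the public key and compare with a fresh hash."""
    return pow(signature, E, N) == digest_as_int(document)


doc = b"subject=alice; key=(alice's public key)"
sig = sign(doc)
print(verify(doc, sig))                 # True: the signature matches the hash
print(verify(doc, (sig + 1) % N))       # False: a tampered signature fails
```

However long the document is, the private-key operation only ever touches one fixed-size value. Reducing the hash modulo `N` is an artifact of the tiny toy modulus; a real scheme pads the full hash instead.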

Second, the document itself isn't an arbitrary document, it contains very specific, structured pieces of information. The two most important things are a) the public key being certified and b) the identity of the owner of that key.

We need "b", the identity, so that when we decrypt a signature using "a", the certified public key, we can have some assurance as to who the holder/owner of the private key was.

Identity gets complicated: A person? A machine? An email address? Etc. The original X.509 spec had directory names (think LDAP names) only; later revisions added email addresses, machine identifiers, etc.

Third, why should we trust a certificate? And what for?

Above, I wrote that the important things were "a", the certified public key, and "b", the owner's/holder's identity. Well, we also need "c", the certifier's identity.

The original X.509, e.g., basically had a, b, and c, plus some date information (during what period is/was the certified key, a, considered valid). We date-limit keys for security assurance, because we cannot guarantee that key pairs generated today won't be cracked tomorrow or next year or next decade. The bigger we make the keys, the more resistant they are, but the slower the crypto, so we strike a balance and go for smaller keys that are generated and certified more frequently, unless we really need long-term validity, in which case we use bigger keys. But hash algorithms get cracked too, so, again, a tradeoff.
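As a rough picture of that minimal document, here is a sketch of a, b, c, and the dates as a Python dataclass. The class and field names are hypothetical (real certificates are ASN.1 structures with many more fields), and the directory names are made up.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Certificate:
    public_key: bytes      # "a": the key being certified
    subject: str           # "b": identity of the key's owner
    issuer: str            # "c": identity of the certifier
    not_before: datetime   # start of the validity period
    not_after: datetime    # end of the validity period

    def is_within_validity(self, at: datetime) -> bool:
        """True if `at` falls inside the certificate's validity period."""
        return self.not_before <= at <= self.not_after


cert = Certificate(
    public_key=b"\x01\x02",                  # placeholder bytes, not a real key
    subject="cn=Alice,o=Example",
    issuer="cn=Example CA,o=Example",
    not_before=datetime(2022, 1, 1, tzinfo=timezone.utc),
    not_after=datetime(2023, 1, 1, tzinfo=timezone.utc),
)

print(cert.is_within_validity(datetime(2022, 3, 10, tzinfo=timezone.utc)))  # True
print(cert.is_within_validity(datetime(2024, 1, 1, tzinfo=timezone.utc)))   # False
```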

So I've got a certifier's identity. Great. How do I verify the certificate they issued? For that, I need THEIR public key. Now we start getting into chains.

Somehow, I have to have a trustworthy copy of the public key of the certifier, the so-called certification authority (CA). The CA's certificate looks like mine and yours, a document containing a (key), b (identity), and c (CA), and some date info, etc.

I use that trustworthy copy of the CA's public key to verify a certificate, and that lets me know that I can rely on a, the key in that certificate, for some purpose.
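Here is a toy sketch of that verification step, extended to a short chain. It assumes the same kind of insecure textbook RSA with tiny primes, and every name in it (`Cert`, `issue`, `signed_by`, `verify_chain`) is hypothetical. The point is only the control flow: check each signature with the issuer's key, then check that the top of the chain is something I already trust.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class KeyPair:
    n: int  # public modulus
    e: int  # public exponent
    d: int  # private exponent


def make_key(p: int, q: int, e: int = 17) -> KeyPair:
    """Toy RSA key from two small primes (insecure, illustration only)."""
    return KeyPair(n=p * q, e=e, d=pow(e, -1, (p - 1) * (q - 1)))


def digest(data: bytes, n: int) -> int:
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % n


@dataclass(frozen=True)
class Cert:
    subject: str    # "b"
    issuer: str     # "c"
    key_n: int      # "a", modulus part
    key_e: int      # "a", exponent part
    signature: int  # issuer's signature over the fields above

    def to_be_signed(self) -> bytes:
        return f"{self.subject}|{self.issuer}|{self.key_n}|{self.key_e}".encode()


def issue(subject: str, subject_key: KeyPair, issuer: str, issuer_key: KeyPair) -> Cert:
    unsigned = Cert(subject, issuer, subject_key.n, subject_key.e, 0)
    sig = pow(digest(unsigned.to_be_signed(), issuer_key.n), issuer_key.d, issuer_key.n)
    return Cert(subject, issuer, subject_key.n, subject_key.e, sig)


def signed_by(cert: Cert, issuer_cert: Cert) -> bool:
    """Check cert's signature using the public key in issuer_cert."""
    n, e = issuer_cert.key_n, issuer_cert.key_e
    return (cert.issuer == issuer_cert.subject
            and pow(cert.signature, e, n) == digest(cert.to_be_signed(), n))


def verify_chain(chain: list, trusted: set) -> bool:
    """chain[0] is the end certificate; each later entry issued the one before."""
    if not chain:
        return False
    for cert, issuer_cert in zip(chain, chain[1:]):
        if not signed_by(cert, issuer_cert):
            return False
    return chain[-1] in trusted  # the chain must end at a trusted copy


root_key, ca_key, alice_key = make_key(61, 53), make_key(67, 71), make_key(89, 97)
root = issue("cn=Root", root_key, "cn=Root", root_key)  # self-signed
ca = issue("cn=CA", ca_key, "cn=Root", root_key)
alice = issue("cn=Alice", alice_key, "cn=CA", ca_key)

print(verify_chain([alice, ca, root], trusted={root}))  # True
print(verify_chain([alice, ca, root], trusted=set()))   # False: nothing trusted
```

Validity dates, usage restrictions, and revocation are all left out here; a real path validation checks those at every step too.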

But what purpose? In the original X.509, it was "fill your boots", whatever purpose you like.

Does that mean I can verify with that key? Encrypt with it? Use it for email, ssh, TLS, or even acting as a CA myself?

Later versions of X.509 addressed these questions by adding various markers that a CA includes to say to relying parties "I say this certificate is good only for these purposes". You are free to use it for other purposes, but if you get burned, you cannot go back to the CA, because those markers basically told you the certificate wasn't to be relied on for that.

Three of the most common restrictions are:

1) is/isNot a CA: does the CA recognize the certificate subject, the b, as a CA themselves? Mostly the answer is no, this is an end-user or end-device; while there is nothing we can do to stop them from issuing certificates on their own, we DO NOT recognize them. The chain ends here.

2) encrypt/verify, used in cryptosystems with two key pairs: the CA is saying that THIS key is good for verification, but don't encrypt with it, while THAT one is good for encryption, don't verify with it (there can be policy, security, and operational reasons for all of those).

3) purpose: e.g., this certificate is good for email only, that certificate is good for TLS only, this certificate represents a person, that certificate represents a machine, etc.
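A relying party's check of those three markers can be sketched as below. The names are hypothetical and the model is much simplified; in X.509 v3 the corresponding extensions are basicConstraints, keyUsage, and extendedKeyUsage, which carry more detail than these flags.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Restrictions:
    is_ca: bool = False                # 1) may the subject act as a CA?
    key_usage: frozenset = frozenset() # 2) e.g. {"verify"} or {"encrypt"}
    purposes: frozenset = frozenset()  # 3) e.g. {"email"}, {"tls"}


def may_use(r: Restrictions, *, usage: str, purpose: str) -> bool:
    """True only if the CA marked the certificate for this usage and purpose."""
    return usage in r.key_usage and purpose in r.purposes


def may_issue(r: Restrictions) -> bool:
    """True only if the CA recognizes the subject as a CA itself."""
    return r.is_ca


tls_server = Restrictions(
    is_ca=False,
    key_usage=frozenset({"verify"}),
    purposes=frozenset({"tls"}),
)

print(may_use(tls_server, usage="verify", purpose="tls"))    # True
print(may_use(tls_server, usage="encrypt", purpose="tls"))   # False
print(may_use(tls_server, usage="verify", purpose="email"))  # False
print(may_issue(tls_server))                                 # False: chain ends here
```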

Fourth, what about that trusted copy of the CA's public key? Where does that come from? That requires a specific action OUTSIDE the overall infrastructure, outside the chain of trust. In many cases, it is a certificate import: We specifically say to our system "trust this specific certificate". How it gets used, the purposes for which it gets used, depend upon where it was imported (into a root certificate store, an intermediate store, a machine store, a user store) and upon the certificate properties previously discussed.

Fifth, and probably last, what if something goes wrong along the way? What if, despite my best precautions, a private key is compromised during the valid lifetime of the certificate?

How do I let people know? That's where revocation lists and online status checkers come in. I cannot remember who said it, but they described revocation lists as wandering anti-matter for certificates.

The basic idea of an RL is that it is a signed list of zero or more certificate identifiers - turns out certificates also include serial numbers which are intended to be unique within the set of all certificates issued by a specific CA. If the certificate is compromised, the CA writes the certificate's serial number on an RL.

Part of verifying a certificate is verifying that it isn't on an RL. Like certificates, RLs have validity periods, so one has to verify that the RL is also valid.
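A minimal sketch of that check, assuming the RL's signature has already been verified elsewhere. The class and function names are hypothetical; `this_update` and `next_update` mirror the thisUpdate/nextUpdate fields of a real CRL.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class RevocationList:
    issuer: str                 # the CA that signed this list
    this_update: datetime       # when the list was issued
    next_update: datetime       # when a newer list is due
    revoked_serials: frozenset  # serial numbers, unique per issuing CA


def is_revoked(serial: int, issuer: str, rl: RevocationList, at: datetime) -> bool:
    """Look the serial up on the RL, refusing to use a wrong or stale list."""
    if rl.issuer != issuer:
        raise ValueError("RL was issued by a different CA")
    if not (rl.this_update <= at <= rl.next_update):
        raise ValueError("RL is outside its validity period")
    return serial in rl.revoked_serials


rl = RevocationList(
    issuer="cn=Example CA",
    this_update=datetime(2022, 3, 1, tzinfo=timezone.utc),
    next_update=datetime(2022, 4, 1, tzinfo=timezone.utc),
    revoked_serials=frozenset({1001, 1007}),
)
now = datetime(2022, 3, 10, tzinfo=timezone.utc)

print(is_revoked(1007, "cn=Example CA", rl, now))  # True
print(is_revoked(1002, "cn=Example CA", rl, now))  # False
```

Raising on a stale list, rather than returning False, is a deliberate choice in this sketch: "I could not check" should not be confused with "not revoked".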

And that's pretty much what the original X.509 said about RLs. The expectation was that they would be in the directory (the whole point of X.500), in the CA's entry (the CA's certificate contains their directory name as their "b"; cf "a" and "b" way above). Original X.509 did make one distinction, between an RL for end-users, a CRL, and one for authorities, an ARL, but that distinction wasn't well described.

What if I am not using a directory? What if I want to have multiple RLs, maybe for different purposes? What if they get very large? How do I segment them? How do I tell people where to look for the RL that would apply to a given certificate?

And how do I know I have the right RL? If I am using a directory, what if an attacker replaces the RL I am expecting with another valid one? E.g., puts the ARL where I expect to find the CRL? Absent other information, the ARL will validate and of course the end-user certificate I am validating won't appear on it (it's on the CRL, the one the attacker hid), so it will appear valid.

Finally, what if the reason for revocation doesn't affect my purpose? The simplest example would be revoking certificates because of a change in business purpose. Maybe the business went out of business in July, so anything signed by them after July is invalid, but anything signed before then might still be valid.

Later versions of X.509 addressed all of these questions by including reason markers, revocation time markers, and a few other things.

Most importantly, these later versions of X.509 included the distribution point concept, the most important applications of which are 1) segmenting RLs so that instead of having a single monolithic CRL, a CA can maintain many smaller, perhaps more frequently issued RLs, and 2) tying a certificate to a specific RL BEFORE it is compromised.

In other words, the certificate contains a marker that says "if ever I am compromised, the CA will list me on an RL in this place", wherever that is. It goes one step further (since we trust signatures, not places): RLs themselves contain issuing distribution points (IDPs) that identify where they are from or are/were expected to be. The IDP on an RL should/must match the RL distribution point identified in the certificate itself.

That way you can know for sure that this is the right RL: If this certificate is NOT listed on this specific RL, then the CA has NOT revoked it.
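The matching rule can be sketched like this. The names and the `ldap://` locations are made up for illustration, and signature and date checks are assumed to have been done already. The key point is that a missing serial number only means "not revoked" when the RL's issuing distribution point matches the one named in the certificate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cert:
    serial: int
    rl_distribution_point: str       # where the CA promised to list this cert


@dataclass(frozen=True)
class RevocationList:
    issuing_distribution_point: str  # signed statement of which list this is
    revoked_serials: frozenset


def revocation_status(cert: Cert, rl: RevocationList) -> str:
    """Return 'wrong-list', 'revoked', or 'not-revoked'."""
    if rl.issuing_distribution_point != cert.rl_distribution_point:
        return "wrong-list"          # absence from this list proves nothing
    return "revoked" if cert.serial in rl.revoked_serials else "not-revoked"


# Hypothetical locations; the end-user certificate is tied to the CRL.
cert = Cert(serial=1007, rl_distribution_point="ldap://dir.example/cn=CA?crl")
crl = RevocationList("ldap://dir.example/cn=CA?crl", frozenset({1007}))
arl = RevocationList("ldap://dir.example/cn=CA?arl", frozenset())

print(revocation_status(cert, crl))  # revoked
print(revocation_status(cert, arl))  # wrong-list: the swapped-in ARL is rejected
```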

This gets around attacks on the storage place, e.g., the directory, etc.

And that's just the management side of PKI, Public Key Infrastructure, and doesn't touch the crypto side, which is just as complex.

So, yeah, it's a complex ball of wax.

(I worked for a PKI company for many years in the 1990s, did some PKI work before then, some X.500 work before that, and an awful lot of PKI, security, compliance, audit, and directory consulting in the first two decades of this millennium, but right now I mostly write code, either for high assurance network security devices or for an integrated risk and compliance management software product; this quarter is mostly the hardware product....)