Digital signatures and PKI

How signatures work, what certificates actually certify, chains of trust, and how the CA system fails.

Digital signatures

A digital signature proves three things about a message: integrity (not modified), authenticity (made by the keyholder) and non-repudiation (the signer can't deny it, since only their private key could have produced it).

Mechanics worth knowing precisely, because "describe how A signs a message to B" is a staple question:

A hashes the message (the hash is what gets signed, not the message itself, because it is fixed-size and fast to process).
A encrypts/transforms the hash with their private key → the signature.
A sends message + signature.
B hashes the received message themselves, runs the signature through A's public key, and compares. Match → genuine and intact.

("Encrypting with the private key" is the standard teaching model and fine for exams; real schemes such as RSA-PSS, ECDSA, and Ed25519 are mathematically more particular.)
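The four steps above can be sketched with textbook RSA, which is the teaching model rather than a real scheme. This is a minimal illustration only: the key is built from two small, publicly known primes, there is no padding, and the names `sign`, `verify` and `digest_as_int` are invented for the example.

```python
import hashlib

# Toy key built from two known Mersenne primes. Far too small for real use,
# and real schemes (RSA-PSS, ECDSA, Ed25519) add padding or different maths.
P = 2**61 - 1
Q = 2**89 - 1
N = P * Q                            # public modulus
E = 65537                            # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))    # private exponent (needs Python 3.8+)


def digest_as_int(message: bytes) -> int:
    """Step 1: hash the message, then reduce the digest into the range of N."""
    digest = hashlib.sha256(message).digest()
    return int.from_bytes(digest, "big") % N


def sign(message: bytes) -> int:
    """Step 2: transform the hash with the private exponent."""
    return pow(digest_as_int(message), D, N)


def verify(message: bytes, signature: int) -> bool:
    """Step 4: re-hash the message, undo the signature with the public key, compare."""
    return pow(signature, E, N) == digest_as_int(message)


message = b"Pay B 10 pounds"
signature = sign(message)                        # step 3: A sends both
print(verify(message, signature))                # unmodified message
print(verify(b"Pay B 1000 pounds", signature))   # message altered in transit
```

The first check succeeds and the second fails, because the altered message hashes to a different value than the one locked into the signature.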

A subtlety that catches people: signatures are computed over hashes, so a broken hash breaks the signature scheme. If an attacker can make two documents collide (hash to the same value), one signature validates both. Collision attacks of this kind are what forced the retirement of SHA-1, and earlier the Flame malware forged a Microsoft code-signing certificate via an MD5 collision.
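To see why a collision is fatal, here is a sketch that uses a deliberately crippled hash (the first two bytes of SHA-256, an assumption made purely so a collision can be found in a fraction of a second). The function names and the "contract" messages are hypothetical.

```python
import hashlib
from itertools import count


def weak_hash(message: bytes) -> bytes:
    """Deliberately broken hash: only the first 2 bytes of SHA-256."""
    return hashlib.sha256(message).digest()[:2]


def find_collision() -> tuple[bytes, bytes]:
    """Birthday search: remember each digest until a second message hits it."""
    seen: dict[bytes, bytes] = {}
    for i in count():
        candidate = b"contract #%d" % i
        digest = weak_hash(candidate)
        if digest in seen:
            return seen[digest], candidate
        seen[digest] = candidate


doc_a, doc_b = find_collision()
print(doc_a, doc_b)
print(doc_a != doc_b, weak_hash(doc_a) == weak_hash(doc_b))
```

Any signature computed over `weak_hash(doc_a)` would verify equally well for `doc_b`, since the verifier only ever compares hashes. Real attacks on MD5 and SHA-1 are far more expensive but exploit the same dependency.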

The key authenticity problem

Signatures shift the problem rather than ending it: B verified the message with "A's public key", but who says that key belongs to A? An attacker who can swap their own public key in place of A's wins everything. Binding identities to public keys is the entire purpose of PKI (public key infrastructure).

Certificates

A digital certificate (X.509) is a signed statement by a certificate authority (CA): "this public key belongs to this identity." Contents to know: subject (who), subject's public key, issuer (which CA), validity period, allowed usages, and the CA's signature over all of it.

Chain of trust: your browser/OS ships with a store of trusted root CA certificates. Roots sign intermediate CAs (roots are kept offline for safety); intermediates sign end-entity (leaf) certificates for actual websites. Verification walks the chain: leaf signed by intermediate, intermediate signed by root, root in the trust store, plus checks that names match, dates are valid, and nothing is revoked.
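The chain walk can be sketched with a toy model. Everything here is an assumption for illustration: `Cert` is a stripped-down stand-in for X.509, signatures are textbook RSA over small known primes, the name check is an exact string match, and revocation is left out entirely.

```python
import hashlib
from dataclasses import dataclass
from datetime import date

E = 65537  # public exponent shared by every toy key


def make_key(p: int, q: int) -> tuple[int, int]:
    """Return (public modulus, private exponent) for a toy RSA key."""
    return p * q, pow(E, -1, (p - 1) * (q - 1))


@dataclass
class Cert:
    subject: str
    issuer: str
    public_n: int        # the subject's public key (modulus only)
    not_before: date
    not_after: date
    signature: int = 0   # issuer's signature over the fields above

    def tbs_hash(self, signer_n: int) -> int:
        """Hash the to-be-signed fields, reduced into the signer's modulus."""
        data = f"{self.subject}|{self.issuer}|{self.public_n}|{self.not_before}|{self.not_after}"
        return int.from_bytes(hashlib.sha256(data.encode()).digest(), "big") % signer_n


def issue(cert: Cert, issuer_n: int, issuer_d: int) -> Cert:
    """Sign a certificate with the issuer's private key."""
    cert.signature = pow(cert.tbs_hash(issuer_n), issuer_d, issuer_n)
    return cert


def verify_chain(chain: list[Cert], trust_store: dict[str, int],
                 hostname: str, today: date) -> bool:
    """chain is ordered leaf first; the last entry must be signed by a trusted root."""
    if chain[0].subject != hostname:
        return False                                  # name mismatch
    for i, cert in enumerate(chain):
        if not (cert.not_before <= today <= cert.not_after):
            return False                              # expired or not yet valid
        if i + 1 < len(chain):
            if cert.issuer != chain[i + 1].subject:
                return False                          # chain is not linked
            signer_n = chain[i + 1].public_n
        else:
            signer_n = trust_store.get(cert.issuer)   # must end at a trusted root
            if signer_n is None:
                return False
        if pow(cert.signature, E, signer_n) != cert.tbs_hash(signer_n):
            return False                              # bad signature
    return True


root_n, root_d = make_key(2**127 - 1, 2**107 - 1)
inter_n, inter_d = make_key(2**89 - 1, 2**61 - 1)
leaf_n, _ = make_key(2**31 - 1, 2**19 - 1)

valid = (date(2024, 1, 1), date(2030, 1, 1))
inter = issue(Cert("Toy Intermediate", "Toy Root", inter_n, *valid), root_n, root_d)
leaf = issue(Cert("www.example.com", "Toy Intermediate", leaf_n, *valid), inter_n, inter_d)
trust_store = {"Toy Root": root_n}
today = date(2025, 6, 1)

print(verify_chain([leaf, inter], trust_store, "www.example.com", today))
```

Removing the root from `trust_store`, changing the hostname, or swapping the leaf's public key without re-signing each makes the walk fail at a different check.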

Certificate types you may be asked to compare: DV (domain validated: automated proof of domain control; what Let's Encrypt issues for free) and OV/EV (organisation/extended validation: paperwork verifying the legal entity; EV's special browser UI was removed years ago, so its practical value is debated). A wildcard certificate for *.example.com covers any single label in place of the asterisk (www.example.com, but not a.b.example.com or the bare example.com). A self-signed certificate has no chain at all: fine for testing, meaningless for trust on the open internet.
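Wildcard scope is a common source of mistakes, so here is a small sketch of the usual rule that the asterisk stands in for exactly one left-most label. `wildcard_matches` is a hypothetical helper, simplified from what real TLS libraries do (it ignores internationalised names and partial-label wildcards).

```python
def wildcard_matches(pattern: str, hostname: str) -> bool:
    """Compare a certificate name with a hostname, label by label."""
    p_labels = pattern.lower().split(".")
    h_labels = hostname.lower().split(".")
    if len(p_labels) != len(h_labels):
        return False                      # "*" never spans more than one label
    if p_labels[1:] != h_labels[1:]:
        return False                      # everything after the first label must match exactly
    if p_labels[0] == "*":
        return h_labels[0] != ""          # any single non-empty label
    return p_labels[0] == h_labels[0]


for host in ("www.example.com", "a.b.example.com", "example.com"):
    print(host, wildcard_matches("*.example.com", host))
```

Only `www.example.com` matches; the deeper subdomain and the bare domain both need their own names on the certificate.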

Revocation

Keys get stolen; certificates must be cancellable before expiry. Two mechanisms, both flawed:

CRL (certificate revocation list): the CA publishes a signed list of revoked serial numbers. Big, slow to propagate.
OCSP (Online Certificate Status Protocol): ask the CA about one certificate in real time. Adds latency and a privacy leak (the CA sees your browsing); a soft-fail OCSP check (one that ignores timeouts) is trivially defeated by an attacker who simply blocks the request. OCSP stapling fixes part of this: the server fetches a signed, time-stamped OCSP response and staples it into the TLS handshake.
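The soft-fail weakness is easiest to see as a decision table. The sketch below is a simplified model, not any browser's actual logic; `OcspResult` and `accept_certificate` are names invented for the example.

```python
from enum import Enum


class OcspResult(Enum):
    GOOD = "good"
    REVOKED = "revoked"
    NO_RESPONSE = "no response"   # a timeout, or an attacker dropping the request


def accept_certificate(result: OcspResult, hard_fail: bool) -> bool:
    """Decide whether to continue the connection given the OCSP outcome."""
    if result is OcspResult.GOOD:
        return True
    if result is OcspResult.REVOKED:
        return False
    # No answer at all: soft-fail carries on, hard-fail refuses.
    return not hard_fail


for policy in (False, True):
    print("hard_fail =", policy,
          [accept_certificate(r, policy) for r in OcspResult])
```

Under soft-fail, an attacker holding a revoked certificate only has to turn REVOKED into NO_RESPONSE by blocking the lookup. Hard-fail closes that gap but makes every site depend on the CA's responder being reachable.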

In practice, browsers increasingly ship aggregated revocation data themselves (e.g. CRLSets/CRLite-style mechanisms) because classic per-lookup revocation failed. Short-lived certificates (Let's Encrypt's 90 days, with shorter lifetimes coming industry-wide) reduce how much revocation matters, because a stolen key stops being useful once its certificate expires.

When CAs go wrong

Any trusted CA can issue for any domain; that's the system's structural weakness. The most-cited case study is DigiNotar (2011): the Dutch CA was hacked, fraudulent *.google.com certificates were issued and used to intercept Iranian users' Gmail, and the CA was distrusted out of existence.

Certificate Transparency (CT) is the modern mitigation: CAs must log every issued certificate to public, append-only, cryptographically verifiable logs, and major browsers reject certificates that lack proof of logging (signed certificate timestamps). Domain owners can watch the logs and spot rogue issuance for their names. (CT logs are also a recon goldmine: they enumerate subdomains, which is why they appear again in the CTF topic.)
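"Cryptographically verifiable" here means the log is a Merkle tree: one root hash commits to every entry. The sketch below follows the tree-hash construction described in RFC 6962 (0x00 prefix for leaves, 0x01 for interior nodes, split at the largest power of two below the entry count). It omits inclusion and consistency proofs, and the entry contents are placeholders.

```python
import hashlib


def leaf_hash(entry: bytes) -> bytes:
    """Leaves are hashed with a 0x00 prefix so they cannot be confused with nodes."""
    return hashlib.sha256(b"\x00" + entry).digest()


def merkle_root(entries: list[bytes]) -> bytes:
    """Root hash committing to every entry and to their order."""
    n = len(entries)
    if n == 0:
        return hashlib.sha256(b"").digest()
    if n == 1:
        return leaf_hash(entries[0])
    k = 1
    while k * 2 < n:
        k *= 2                      # largest power of two strictly less than n
    left = merkle_root(entries[:k])
    right = merkle_root(entries[k:])
    return hashlib.sha256(b"\x01" + left + right).digest()


log = [b"cert for a.example", b"cert for b.example", b"cert for c.example"]
original = merkle_root(log)

tampered = [log[0], b"cert for evil.example", log[2]]
print(merkle_root(tampered) == original)               # rewriting history changes the root
print(merkle_root(log + [b"cert for d.example"]) == original)  # so does appending
```

Both comparisons are false. Because monitors remember earlier roots, a log that quietly rewrites or drops an entry cannot produce a root consistent with what it previously published.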

Code signing and the wider uses

The same machinery signs software (so your OS can check an update came from the vendor), email (S/MIME), documents, and container images. The rec