TLS Certificate Management: From Handshake to Automated Rotation
Every HTTPS connection starts with a TLS handshake. Behind that handshake is a certificate — a small file that binds a public key to a domain identity. Managing those certificates at scale is one of the most operationally critical tasks in modern infrastructure.
The TLS Handshake
The TLS 1.3 handshake completes in a single round trip:
Client                               Server
  │                                     │
  │──── ClientHello + key share ───────▶│
  │                                     │
  │◀─── ServerHello + key share ────────│
  │◀─── {Certificate} ──────────────────│
  │◀─── {CertificateVerify} ────────────│
  │◀─── {Finished} ─────────────────────│
  │                                     │
  │──── {Finished} ────────────────────▶│
  │                                     │
  │◀═══ Application Data ══════════════▶│
Key improvements over TLS 1.2: handshake messages after the ServerHello are encrypted, a full handshake takes one round trip instead of two, and legacy mechanisms such as static RSA key exchange are removed entirely.
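As a rough illustration, a client can refuse anything older than TLS 1.3 and report what was negotiated. This is a minimal sketch using Python's standard ssl module; the function names are made up for this example, and the host you pass in is up to you.

```python
import socket
import ssl


def make_tls13_client_context() -> ssl.SSLContext:
    """Build a client context that only accepts TLS 1.3."""
    # create_default_context() turns on certificate and hostname verification.
    ctx = ssl.create_default_context()
    ctx.minimum_version = ssl.TLSVersion.TLSv1_3
    return ctx


def negotiated_version(host: str, port: int = 443) -> str:
    """Connect to host and return the negotiated protocol name."""
    ctx = make_tls13_client_context()
    with socket.create_connection((host, port), timeout=5) as raw:
        # wrap_socket performs the handshake before returning.
        with ctx.wrap_socket(raw, server_hostname=host) as tls:
            return tls.version()
```

Calling negotiated_version("example.com") against a server without TLS 1.3 support should fail during the handshake rather than silently fall back.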
Certificate Chain of Trust
A certificate is only trusted if the client can build a chain from the leaf certificate back to a trusted root CA:
Root CA (self-signed, in trust store)
└── Intermediate CA (signed by Root)
    └── Leaf certificate (signed by Intermediate, bound to your domain)
Common pitfalls:
- Missing intermediate — The server must send the full chain minus the root. If you skip the intermediate, some clients will fail verification.
- Expired intermediate — Even if the leaf is valid, an expired intermediate breaks the chain.
- Cross-signed roots — Some CAs use cross-signatures for backward compatibility. This creates multiple valid chain paths.
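The first two pitfalls can be reproduced with a toy model of chain building. The sketch below is deliberately simplified: it follows issuer names and expiry dates only and does not verify signatures, which a real validator must do. The Cert type and build_chain function are hypothetical names for this example.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Cert:
    subject: str
    issuer: str
    not_after: datetime


def build_chain(leaf: Cert, presented: list, trust_store: dict, now: datetime) -> list:
    """Walk issuer links from the leaf to a trusted root.

    presented   -- certificates sent by the server alongside the leaf
    trust_store -- root certificates keyed by subject name
    Raises ValueError if the chain is incomplete or contains an expired cert.
    """
    by_subject = {c.subject: c for c in presented}
    chain = [leaf]
    current = leaf
    # A valid path cannot be longer than the presented certificates plus one.
    for _ in range(len(presented) + 1):
        if current.not_after <= now:
            raise ValueError(f"expired: {current.subject}")
        root = trust_store.get(current.issuer)
        if root is not None:
            if root.not_after <= now:
                raise ValueError(f"expired: {root.subject}")
            chain.append(root)
            return chain
        issuer = by_subject.get(current.issuer)
        if issuer is None:
            raise ValueError(f"missing issuer: {current.issuer}")
        chain.append(issuer)
        current = issuer
    raise ValueError("no path to a trusted root")
```

With the intermediate left out of presented, the walk stops with "missing issuer", which is the failure a client reports when the server skips the intermediate.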
Let's Encrypt Automation
Let's Encrypt issues domain-validated (DV) certificates for free using the ACME protocol:
- Account registration — Your ACME client generates a key pair and registers with the CA.
- Order — Request a certificate for one or more domain names.
- Challenge — Prove domain control via HTTP-01 (place a file at /.well-known/acme-challenge/), DNS-01 (create a TXT record), or TLS-ALPN-01.
- Finalize — Submit a CSR. The CA signs and returns the certificate.
- Renewal — Certificates expire after 90 days. Automate renewal at the 60-day mark.
# Certbot with automatic Nginx configuration
certbot --nginx -d example.com -d www.example.com
# DNS challenge for wildcard certificates
certbot certonly --dns-cloudflare \
  --dns-cloudflare-credentials ~/.secrets/cloudflare.ini \
  -d "*.example.com"
The 90-day lifetime is intentional. Short-lived certificates reduce the window of compromise and force operators to automate.
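To make the Challenge step more concrete, here is a sketch of the values an ACME client derives for HTTP-01 and DNS-01, following the key-authorization scheme in RFC 8555 and the JWK thumbprint in RFC 7638. The helper names are invented for this example, and any JWK you feed it for experimentation is a stand-in for a real account key. A real client such as Certbot does all of this for you.

```python
import base64
import hashlib
import json


def b64url(data: bytes) -> str:
    """Base64url without padding, as ACME uses throughout."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def jwk_thumbprint(jwk: dict) -> str:
    """SHA-256 over the required JWK members, sorted, with no whitespace."""
    required = {"RSA": ("e", "kty", "n"), "EC": ("crv", "kty", "x", "y")}[jwk["kty"]]
    canonical = json.dumps(
        {k: jwk[k] for k in required}, sort_keys=True, separators=(",", ":")
    )
    return b64url(hashlib.sha256(canonical.encode("utf-8")).digest())


def key_authorization(token: str, jwk: dict) -> str:
    """The challenge token joined to the account key thumbprint."""
    return f"{token}.{jwk_thumbprint(jwk)}"


def http01_path(token: str) -> str:
    """Path where the key authorization must be served for HTTP-01."""
    return f"/.well-known/acme-challenge/{token}"


def dns01_txt_value(token: str, jwk: dict) -> str:
    """TXT record value for DNS-01: a digest of the key authorization."""
    digest = hashlib.sha256(key_authorization(token, jwk).encode("ascii")).digest()
    return b64url(digest)
```

For HTTP-01 the server returns key_authorization(...) as the body at http01_path(...); for DNS-01 the value goes into a TXT record under the _acme-challenge label of the domain being validated.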
cert-manager in Kubernetes
cert-manager is the de facto standard for automating the certificate lifecycle in Kubernetes. It watches Certificate resources, as well as annotated Ingress resources, and provisions certificates from configured issuers.
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: ops@example.com
    privateKeySecretRef:
      name: letsencrypt-prod-key
    solvers:
      - http01:
          ingress:
            class: nginx
---
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: api-tls
  namespace: production
spec:
  secretName: api-tls-secret
  issuerRef:
    name: letsencrypt-prod
    kind: ClusterIssuer
  dnsNames:
    - api.example.com
    - api-internal.example.com
  renewBefore: 720h # 30 days before expiry
cert-manager stores the certificate and private key in a Kubernetes Secret. It handles renewal automatically and updates the Secret in place.
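The effect of renewBefore is simple date arithmetic: renewal becomes due at the certificate's expiry time minus that duration. The sketch below only illustrates that calculation with hypothetical helper names; it does not talk to a cluster or reproduce cert-manager's actual scheduling code.

```python
from datetime import datetime, timedelta, timezone


def renewal_time(not_after: datetime, renew_before: timedelta) -> datetime:
    """Moment at which renewal becomes due: expiry minus renewBefore."""
    return not_after - renew_before


def is_renewal_due(now: datetime, not_after: datetime, renew_before: timedelta) -> bool:
    """True once the renewal moment has been reached."""
    return now >= renewal_time(not_after, renew_before)


# A 90-day certificate with renewBefore: 720h (30 days).
issued = datetime(2025, 1, 1, tzinfo=timezone.utc)
expires = issued + timedelta(days=90)
due = renewal_time(expires, timedelta(hours=720))
print(due - issued)  # time from issuance until renewal is due
```

For a 90-day certificate, 720h puts renewal at day 60, which matches the renewal point recommended earlier for Let's Encrypt certificates.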
Mutual TLS (mTLS)
Standard TLS authenticates the server to the client. mTLS adds the reverse — the server also verifies the client certificate. This is foundational for zero-trust service meshes.
How it works:
- The server sends a CertificateRequest message during the handshake.
- The client responds with its own certificate and a CertificateVerify signature.
- The server validates the client certificate against a trusted CA bundle.
Use cases:
- Service-to-service authentication in Kubernetes (Istio, Linkerd)
- API authentication without tokens
- IoT device identity
Service A ──── mTLS ────▶ Service B
    │                         │
    ├── presents client cert  ├── presents server cert
    └── verifies server cert  └── verifies client cert
Service meshes like Istio automate mTLS by injecting sidecar proxies that handle certificate issuance, rotation, and verification transparently.
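Outside a mesh, the server side of mTLS can be set up by hand. The following is a minimal sketch with Python's standard ssl module; the function names and the file-path parameters are placeholders for this example, and a mesh sidecar would normally own this configuration instead.

```python
import ssl


def enforce_client_auth(ctx: ssl.SSLContext) -> ssl.SSLContext:
    """Require a client certificate that verifies against the loaded CAs.

    With CERT_REQUIRED on a server context, the handshake fails if the
    client sends no certificate or one that does not validate.
    """
    ctx.verify_mode = ssl.CERT_REQUIRED
    return ctx


def make_mtls_server_context(cert_file: str, key_file: str, client_ca_file: str) -> ssl.SSLContext:
    """Server context that presents its own cert and demands one from clients."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    # The server's own identity (leaf plus intermediates) and private key.
    ctx.load_cert_chain(certfile=cert_file, keyfile=key_file)
    # CA bundle used to validate client certificates.
    ctx.load_verify_locations(cafile=client_ca_file)
    return enforce_client_auth(ctx)
```

The client mirrors this by calling load_cert_chain on its own context so that it has a certificate to present when asked.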
Certificate Rotation
Certificate rotation replaces an active certificate before it expires, without downtime. The strategy depends on your infrastructure:
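Graceful reload — The server process reloads the certificate from disk without restarting. Nginx does this with nginx -s reload. Envoy can pick up new certificates without a restart when they are delivered through its Secret Discovery Service (SDS), including file-based SDS that watches for changes on disk; statically configured certificates are not reloaded.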
Blue-green rotation — Deploy new instances with the new certificate, shift traffic, then drain old instances.
Dual-certificate overlap — Keep the old and new certificates valid and deployable side by side during a transition window, so the old one can stay in service where it is still needed. Clients that have pinned the old certificate continue to work until they receive the new pin.
Rotation timeline:
Day 0        Day 60          Day 80          Day 90
  │            │               │               │
  ▼            ▼               ▼               ▼
Issue        Renew &         Old cert        Old cert
cert         deploy new      still valid     expires
Always monitor certificate expiration, for example with Prometheus and an exporter that exposes a metric such as x509_cert_not_after. Alert at 30 days remaining, page at 7 days.
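The alerting thresholds can be expressed as a small classification function. This is a sketch of the logic only, with hypothetical names; in practice the same rule would more likely live in a Prometheus alerting rule than in application code.

```python
import ssl
from datetime import datetime, timedelta, timezone

ALERT_AT = timedelta(days=30)  # open a ticket
PAGE_AT = timedelta(days=7)    # wake someone up


def parse_not_after(cert_time: str) -> datetime:
    """Convert the notAfter string from getpeercert() to an aware datetime."""
    seconds = ssl.cert_time_to_seconds(cert_time)
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


def expiry_severity(not_after: datetime, now: datetime) -> str:
    """Classify a certificate by how much validity it has left."""
    remaining = not_after - now
    if remaining <= timedelta(0):
        return "expired"
    if remaining <= PAGE_AT:
        return "page"
    if remaining <= ALERT_AT:
        return "alert"
    return "ok"
```

The notAfter string comes from the dictionary returned by SSLSocket.getpeercert() on a verified connection.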
OCSP Stapling
The Online Certificate Status Protocol (OCSP) lets clients check whether a certificate has been revoked. Without stapling, the client contacts the CA's OCSP responder directly — adding latency and leaking browsing data.
With OCSP stapling, the server fetches the OCSP response from the CA periodically and includes it in the TLS handshake:
# Nginx configuration
ssl_stapling on;
ssl_stapling_verify on;
ssl_trusted_certificate /etc/ssl/chain.pem;
resolver 8.8.8.8;
Benefits:
- Eliminates the extra round trip to the OCSP responder
- Improves privacy — the CA never sees which clients visit your site
- Reduces CA infrastructure load
OCSP Must-Staple is a certificate extension that tells clients to reject the certificate if no stapled response is present. This prevents downgrade attacks where an attacker strips the OCSP response.
Certificate Pinning
Certificate pinning binds a host to a specific certificate or public key, preventing man-in-the-middle attacks even if a trusted CA is compromised.
Types of pinning:
- Leaf pinning — Pin the exact certificate. Requires rotation coordination.
- Public key pinning — Pin the SPKI hash. Survives certificate renewal as long as the key pair stays the same.
- CA pinning — Pin the issuing CA. Most flexible but least secure.
Where to pin:
- Mobile apps (via network security config on Android or URLSession pinning on iOS)
- Internal services with known CA infrastructure
- CLI tools that communicate with a single backend
Avoid HTTP Public Key Pinning (HPKP) — it was deprecated because misconfiguration could permanently lock users out of a site. Modern alternatives include Certificate Transparency (CT) log monitoring.
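For a CLI tool or internal client, leaf pinning can be sketched with the standard library alone. The example below pins the SHA-256 fingerprint of the whole DER-encoded certificate; it is not SPKI pinning, which needs a certificate parser to extract the public key. The function names are hypothetical, and the pin set is assumed to hold both the current and the next certificate's fingerprint so rotation does not break clients.

```python
import hashlib
import hmac


def cert_fingerprint(der_cert: bytes) -> str:
    """Hex SHA-256 fingerprint of a DER-encoded certificate."""
    return hashlib.sha256(der_cert).hexdigest()


def matches_pin(der_cert: bytes, pinned_fingerprints: set) -> bool:
    """True if the certificate's fingerprint is one of the pinned values."""
    fingerprint = cert_fingerprint(der_cert)
    # compare_digest avoids leaking how many leading characters matched.
    return any(hmac.compare_digest(fingerprint, pin) for pin in pinned_fingerprints)
```

On a live connection the DER bytes come from SSLSocket.getpeercert(binary_form=True); the check should run in addition to normal chain validation, not instead of it.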
Key Takeaways
- The TLS 1.3 handshake encrypts most of the exchange and completes in one round trip.
- Certificate chains must be complete — always serve the intermediate certificate.
- Let's Encrypt and the ACME protocol make free, automated DV certificates the default.
- cert-manager automates the full certificate lifecycle in Kubernetes.
- mTLS provides mutual authentication and is essential for service mesh architectures.
- Rotate certificates well before expiry and monitor with automated alerts.
- OCSP stapling eliminates revocation-check latency and improves privacy.
- Pin certificates in controlled environments but avoid HPKP for public websites.