Internal PKI & Certificate Lifecycle Design Prompt
Design a private PKI for internal services — CA hierarchy, HSM-backed roots, automated issuance and rotation via cert-manager/Vault, and revocation — so certificates never expire unexpectedly or outlive trust.
- Target user: Platform and security engineers running internal certificate authority infrastructure
- Difficulty: Advanced
- Tools: Claude, ChatGPT
The prompt
You are a PKI architect who has built internal certificate authorities that issue thousands of short-lived certs without the 3 a.m. expiry outage.

I will provide:
- What needs certs (internal mTLS, ingress, service mesh, device/client auth)
- Current state (self-signed sprawl, manual OpenSSL, public CA, existing Vault/cert-manager)
- Key-protection options available (cloud KMS, HSM, PKCS#11)
- Compliance constraints (key lengths, algorithms, audit, validity caps)

Your job:
1. **CA hierarchy**: recommend a root → intermediate(s) structure. Justify keeping the root offline/air-gapped and issuing from intermediates. Map intermediates to trust boundaries (per-env, per-use-case).
2. **Root protection**: where the root key lives (HSM/KMS, offline, M-of-N quorum). Explain why a compromised root is catastrophic and unrecoverable without re-rooting every consumer.
3. **Issuance automation**: pick the issuer plane (Vault PKI engine, cert-manager + Vault/ACME, step-ca). Show roles/policies that constrain what SANs, key usages, and TTLs each workload can request. Flag over-broad issuing roles.
4. **Short-lived certs & rotation**: recommend validity (hours to days for service certs, longer for intermediates), and automated renewal at a fraction of TTL. Explain why short TTLs reduce reliance on revocation.
5. **Revocation strategy**: CRL vs OCSP vs short-TTL-instead-of-revocation. Be honest about CRL/OCSP operational pain and when short lifetimes are the better answer.
6. **Trust distribution**: how the root/intermediate bundle reaches every consumer (OS trust store, mesh trust bundle, k8s secret) and how to rotate the bundle without an outage (overlap old+new).
7. **Observability & expiry safety**: certificate inventory, expiry alerting (alert well before expiry), and detecting certs issued outside the automated path.
8. **Failure modes**: intermediate expiry, HSM unavailable, clock skew, issuer outage, with the runbook for each.

Output: (a) CA hierarchy diagram in text, (b) issuer roles/policies, (c) issuance + auto-renew config, (d) trust-bundle rotation plan, (e) expiry-monitoring + revocation strategy.

Bias toward: offline HSM-backed roots, short-lived leaf certs over revocation, constrained issuing roles, and overlap-based bundle rotation.
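
Items 4 and 7 of the prompt (renewal at a fraction of TTL, alerting well before expiry) can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: `CertWindow`, `renew_at`, and `status` are hypothetical helpers, and the two-thirds renewal fraction and 6-hour alert margin are placeholder values, not defaults taken from Vault, cert-manager, or any other tool.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class CertWindow:
    """Validity window of one certificate (hypothetical inventory record)."""
    name: str
    not_before: datetime
    not_after: datetime


def renew_at(cert: CertWindow, fraction: float = 2 / 3) -> datetime:
    """Return the time at which renewal should start.

    `fraction` is the share of the total lifetime that may elapse before
    renewing; 2/3 is an assumed placeholder, tune it to your issuer's SLOs.
    """
    if not 0 < fraction < 1:
        raise ValueError("fraction must be strictly between 0 and 1")
    lifetime = cert.not_after - cert.not_before
    if lifetime <= timedelta(0):
        raise ValueError("not_after must be later than not_before")
    return cert.not_before + lifetime * fraction


def status(
    cert: CertWindow,
    now: datetime,
    fraction: float = 2 / 3,
    alert_margin: timedelta = timedelta(hours=6),  # assumed paging threshold
) -> str:
    """Classify a cert as 'expired', 'alert', 'renew', or 'ok'.

    'alert' means automated renewal should already have happened but the
    cert is still close to expiry, so a human should look at it.
    """
    if now >= cert.not_after:
        return "expired"
    if cert.not_after - now <= alert_margin:
        return "alert"
    if now >= renew_at(cert, fraction):
        return "renew"
    return "ok"


if __name__ == "__main__":
    issued = datetime(2024, 1, 1, tzinfo=timezone.utc)
    cert = CertWindow("payments-api", issued, issued + timedelta(hours=24))
    for hours in (1, 17, 20, 25):
        print(hours, status(cert, issued + timedelta(hours=hours)))
```

The alert threshold is checked before the renewal threshold on purpose: a certificate inside the alert margin indicates that the automated path failed, which is a different signal from a routine renewal being due.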