Postmortem Latent Risk Extractor Prompt
Mine a postmortem for the latent, systemic risks an incident exposed but didn't directly trigger, so the team fixes the conditions that made the failure possible rather than only the immediate cause.
- Target user: SRE leads and reliability engineers
- Difficulty: Advanced
- Tools: Claude, ChatGPT, Cursor
The prompt
You are a staff reliability engineer who reads completed postmortems to surface latent risks: the dormant conditions, missing safeguards, and brittle assumptions that the incident revealed but that were not the proximate trigger.

I will provide:
- The full postmortem text (or draft), including timeline, impact, and root-cause sections
- Any architecture or dependency notes referenced
- Known constraints (team size, tech debt, upcoming changes) if relevant

Your tasks:
1. **Separate trigger from terrain**: restate the proximate trigger in one line, then focus the rest of your analysis on the latent conditions that allowed it to escalate.
2. **Extract latent risks**: list dormant problems the incident exposed: missing rate limits, absent timeouts, no circuit breaker, untested failover, silent retries, single points of failure, stale runbooks, or alerting blind spots.
3. **Classify each risk** as design, operational, observability, or organizational, and rate likelihood of recurrence (low/medium/high) and blast radius (contained/service/platform).
4. **Find counterfactual near-misses**: note where the incident could have been materially worse and what coincidence (time of day, low traffic, an alert someone happened to see) limited it.
5. **Trace each latent risk to other services** that likely share the same weakness, so fixes generalize beyond the one system.
6. **Propose** one concrete, testable safeguard per high-priority latent risk, framed as a system change not a person change.

Output a ranked latent-risk register (risk, class, likelihood, blast radius, shared-by, proposed safeguard) plus a short "what saved us this time" paragraph. Stay blameless: describe conditions and controls, not decisions of named individuals.
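The register columns named in the prompt map naturally onto a small record type, which can help if the model's output is later stored or re-ranked by a script. The following is a minimal sketch and not part of the prompt itself. The prompt asks for a ranked register but does not say how to rank, so the scoring rule here (likelihood multiplied by blast radius, ties broken by how many services share the weakness) is an assumption. The class names, the `score` property, and the sample rows are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum, IntEnum


class RiskClass(Enum):
    DESIGN = "design"
    OPERATIONAL = "operational"
    OBSERVABILITY = "observability"
    ORGANIZATIONAL = "organizational"


class Likelihood(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


class BlastRadius(IntEnum):
    CONTAINED = 1
    SERVICE = 2
    PLATFORM = 3


@dataclass
class LatentRisk:
    # One row of the register; fields mirror the columns the prompt requests.
    risk: str
    risk_class: RiskClass
    likelihood: Likelihood
    blast_radius: BlastRadius
    shared_by: list[str]
    proposed_safeguard: str

    @property
    def score(self) -> int:
        # Assumed ranking rule (not from the prompt): likelihood x blast radius.
        return int(self.likelihood) * int(self.blast_radius)


def rank_register(risks: list[LatentRisk]) -> list[LatentRisk]:
    # Highest score first; ties go to the risk shared by more services.
    return sorted(risks, key=lambda r: (r.score, len(r.shared_by)), reverse=True)


# Hypothetical sample rows, for illustration only.
register = rank_register([
    LatentRisk("Runbook references a retired dashboard", RiskClass.OPERATIONAL,
               Likelihood.LOW, BlastRadius.CONTAINED, [],
               "Validate runbook links in CI"),
    LatentRisk("Failover path never exercised", RiskClass.OPERATIONAL,
               Likelihood.MEDIUM, BlastRadius.PLATFORM, ["checkout"],
               "Schedule a recurring failover drill"),
    LatentRisk("No timeout on calls to the auth dependency", RiskClass.DESIGN,
               Likelihood.HIGH, BlastRadius.SERVICE, ["checkout", "search"],
               "Add a client timeout and cover it with a fault-injection test"),
])

for row in register:
    print(row.score, row.risk_class.value, row.risk)
```

A multiplicative score is only one option. A team that weights platform-wide blast radius more heavily could swap in its own rule without changing the record shape.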
Related prompts
- Observability Gap Analysis From Incidents Prompt
Mine recent incidents to find where missing logs, metrics, or traces slowed detection and diagnosis, then prioritize the observability investments that would have shortened them most.
- Postmortem Counterfactual Analysis Prompt
Rigorously explore what would have detected or prevented this incident sooner, testing each counterfactual against what was actually knowable in the moment, so you avoid hindsight-driven action items.