
Runbook Gap Analysis From Incidents Prompt

Mine past incidents to find where responders lacked a runbook or where an existing runbook failed, then produce a prioritized list of runbooks to write or fix, with the specific steps each one needs.

Target user: SREs and on-call engineers maintaining operational runbooks
Difficulty: Intermediate
Tools: Claude, ChatGPT

The prompt

You are an SRE who believes every painful incident should leave behind a runbook so the next person never suffers the same fumbling.

I will provide:
- A set of past incidents (timelines, chat logs, what responders actually did)
- The current runbook inventory (titles, links, last-updated dates)
- Any feedback that existing runbooks were wrong, stale, or missing

Perform a runbook gap analysis:

1. **Reconstruct what responders needed** — for each incident, list the diagnostic questions they had to answer and the actions they took. Note where they guessed, escalated for tribal knowledge, or wasted time hunting for information.

2. **Map to existing runbooks** — classify each incident as: covered (a runbook existed and worked), partially covered (existed but was stale/wrong/incomplete), or uncovered (no runbook). For anything not fully covered, cite the gap precisely: which runbook, which step, and what was missing or wrong.

3. **Find recurring needs** — identify failure modes that appeared in multiple incidents with no runbook. These are the highest-value gaps.

4. **Prioritize** — rank missing/broken runbooks by frequency of the underlying failure, MTTR impact, and severity. Recommend the top 5 to create or fix first.

5. **Draft the skeleton** for each top runbook: trigger/symptoms, prechecks, diagnosis steps (with the exact commands or dashboards observed in the incidents), mitigation steps, verification, rollback, and escalation path. Pull concrete details from the incident logs rather than inventing them.

6. **Fix the stale ones** — for partially covered cases, list the exact corrections needed and state how often each runbook should be re-validated, recorded as a "last validated" date.

7. **Recommend a freshness process** — how runbooks get validated (e.g., during GameDays or as a postmortem action item) so this gap does not silently reopen.

Output: a coverage table, the ranked gap list, the drafted runbook skeletons, and the freshness recommendation. Be specific — generic "add a runbook" advice is worthless; cite the incident that proves the need.
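
Outside the prompt itself, the ranking asked for in step 4 can be sanity-checked with a small script. The following is a minimal sketch, not part of the original prompt. It assumes each incident has already been tagged by hand with a failure mode and a coverage class from step 2. The `Incident` fields, the 1-to-4 severity scale, and the multiplicative score are all illustrative assumptions; the prompt does not prescribe a formula.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Incident:
    # All field names are hypothetical; adapt them to your incident tracker.
    incident_id: str
    failure_mode: str    # e.g. "db-failover"
    coverage: str        # "covered", "partial", or "uncovered" (step 2)
    mttr_minutes: float  # time to resolve this incident
    severity: int        # assumed scale: 1 = most severe, 4 = least severe


def rank_gaps(incidents, top_n=5):
    """Rank failure modes lacking a working runbook (step 4 of the prompt)."""
    # Group only the incidents where the runbook was missing or failed.
    groups = defaultdict(list)
    for inc in incidents:
        if inc.coverage != "covered":
            groups[inc.failure_mode].append(inc)

    ranked = []
    for mode, incs in groups.items():
        frequency = len(incs)
        mean_mttr = sum(i.mttr_minutes for i in incs) / frequency
        worst_severity = min(i.severity for i in incs)
        # Assumed weighting: SEV1 counts 4x, SEV4 counts 1x.
        severity_weight = 5 - worst_severity
        ranked.append({
            "failure_mode": mode,
            "frequency": frequency,
            "mean_mttr_minutes": mean_mttr,
            "worst_severity": worst_severity,
            "score": frequency * mean_mttr * severity_weight,
            # Keep the incident IDs so each gap cites the evidence for it.
            "evidence": [i.incident_id for i in incs],
        })

    ranked.sort(key=lambda g: g["score"], reverse=True)
    return ranked[:top_n]


# Made-up sample data, for illustration only.
sample = [
    Incident("INC-1", "db-failover", "uncovered", 90, 1),
    Incident("INC-2", "db-failover", "partial", 60, 2),
    Incident("INC-3", "cert-expiry", "uncovered", 30, 3),
    Incident("INC-4", "disk-full", "covered", 20, 2),
]

for gap in rank_gaps(sample):
    print(gap["failure_mode"], gap["score"], gap["evidence"])
```

Comparing this ranking with the model's ranked gap list is a quick way to spot cases where the model over-weighted a single dramatic incident. The `evidence` list mirrors the prompt's requirement that every recommended runbook cite the incidents that prove the need.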