AI for Incident Response

On-Call Runbook Authoring Standard Prompt

Define a house style and quality bar for operational runbooks so that every alert page links to a clear, copy-pasteable, low-ambiguity procedure that an exhausted on-call engineer can follow at 3 a.m.

Target user
SRE and platform teams standardizing how runbooks are written across services
Difficulty
Intermediate
Tools
Claude, ChatGPT

The prompt

You are a staff SRE who has rewritten hundreds of runbooks after watching responders fail to use bad ones during real incidents. You believe a runbook is a safety-critical document, not wiki prose.

I will provide:
- Two or three existing runbooks of varying quality
- The alerts that link to them
- The tools and access on-call actually have (CLI, dashboards, kill-switches)
- Known pain points responders have reported

Your job:

1. **Define the required structure** — specify the mandatory sections: when this fires, severity guidance, prerequisites/access, diagnosis steps, mitigation steps, verification of recovery, rollback, and escalation. Justify each.

2. **Write the style rules** — imperative voice, one action per step, every command copy-pasteable with placeholders clearly marked, expected output shown after risky commands, no unexplained jargon.

3. **Encode decision points** — show how to write branch points ("if X, go to step 7; else step 9") rather than ambiguous prose, and require a stated time budget per phase.

4. **Build safety guardrails into the doc** — require explicit call-outs before any destructive or irreversible action, plus the back-out for each.

5. **Specify the verification section** — mandate a concrete "how you know it's fixed" check, not "confirm the issue is resolved."

6. **Set a freshness contract** — define ownership, a review cadence, and a last-validated date, plus how a runbook gets retired.

7. **Rewrite one example** — take the weakest runbook I provided and transform it fully to the standard as a worked exemplar.

Output as: (a) the authoring standard as a one-page checklist, (b) a fill-in runbook template, (c) the fully rewritten exemplar, (d) a scoring rubric to grade existing runbooks against the standard.

Optimize for a tired responder under pressure: minimize reading, maximize unambiguous next action.
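
Once the model returns a standard, parts of it can be checked mechanically rather than by eye. The sketch below is one possible way to do that, not part of the prompt itself: it assumes runbooks are Markdown files whose section headings match the names listed in step 1, and that each runbook carries a line of the form `Last validated: YYYY-MM-DD`. The heading names, the date-line format, the function names, and the 90-day window are all illustrative assumptions; adjust them to whatever the generated standard actually specifies.

```python
import re
from datetime import date

# Section names taken from step 1 of the prompt; the exact heading text is an assumption.
REQUIRED_SECTIONS = [
    "When this fires",
    "Severity guidance",
    "Prerequisites/access",
    "Diagnosis steps",
    "Mitigation steps",
    "Verification of recovery",
    "Rollback",
    "Escalation",
]

# Hypothetical convention: a line such as "Last validated: 2024-05-01".
LAST_VALIDATED = re.compile(
    r"^Last validated:\s*(\d{4}-\d{2}-\d{2})\s*$",
    re.MULTILINE | re.IGNORECASE,
)


def missing_sections(runbook_text):
    """Return required section names that never appear as a Markdown heading."""
    headings = {
        m.group(1).strip().lower()
        for m in re.finditer(r"^#+\s*(.+?)\s*$", runbook_text, re.MULTILINE)
    }
    return [s for s in REQUIRED_SECTIONS if s.lower() not in headings]


def is_stale(runbook_text, today, max_age_days=90):
    """Return True if the last-validated date is missing or older than max_age_days."""
    match = LAST_VALIDATED.search(runbook_text)
    if match is None:
        return True
    validated = date.fromisoformat(match.group(1))
    return (today - validated).days > max_age_days
```

A check like this could run in CI over the runbook repository, failing the build when `missing_sections` returns a non-empty list or `is_stale` returns True. It covers only the structural and freshness rules; style rules such as "one action per step" still need a human reviewer or the scoring rubric from output (d).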