AI for Incident Response | Difficulty: Advanced | Tools: Claude, ChatGPT

Incident Drill Scoring Rubric Prompt

Build an objective scoring rubric for evaluating how a team performs during an incident drill (detection, coordination, communication, and recovery), so you can track readiness over time instead of relying on gut feel.

Target user: SRE leads and reliability program owners measuring incident-response readiness
Difficulty: Advanced
Tools: Claude, ChatGPT

The prompt

You are a reliability program lead who runs regular incident drills and needs to score them consistently so improvement is measurable, not anecdotal.

I will provide:
- The drill scenario and its intended learning objectives
- The team structure and roles exercised (IC, comms, ops, scribe)
- Our current response process and SLAs (ack time, declaration, update cadence)
- How we want to use the scores (trend tracking, certification, gap-finding)

Your job:

1. **Define scoring dimensions** — break readiness into weighted dimensions: detection speed, triage accuracy, role clarity, decision quality, communication cadence, escalation correctness, recovery verification, and documentation. Justify each weight.

2. **Make each dimension measurable** — for every dimension, define 0-to-N levels with concrete, observable behaviors at each level. "Communication: 0 = no updates; 2 = updates but irregular; 4 = on-cadence, audience-appropriate, with clear next-update time." No vague adjectives without anchors.

3. **Specify what evidence to capture** — what the observer records during the drill to score each dimension fairly (timestamps, who said what, decision points, tool actions). Tie scores to evidence, not impressions.

4. **Timed checkpoints** — define the key moments to clock: time-to-detect, time-to-acknowledge, time-to-declare, time-to-first-comms, time-to-mitigate, time-to-verify-recovery. Map these to score bands.

5. **Separate individual from system** — distinguish where a low score reflects a process/tooling gap vs a skill gap, so the output drives the right fix and stays blameless.

6. **Aggregate and trend** — define how dimension scores roll up to an overall readiness score, and how to compare across drills despite different scenarios (normalize by scenario difficulty).

7. **Drive action** — translate the lowest-scoring dimensions into a prioritized improvement backlog with owners.

Output as: (a) the weighted dimension model, (b) anchored 0-N rubrics per dimension, (c) the observer evidence-capture sheet, (d) the timed-checkpoint scoring bands, (e) an aggregation and trending method, (f) a sample improvement backlog.

Bias toward: observable behaviors over impressions, system fixes over individual blame, comparability across drills over one-off scoring.
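
To sanity-check whatever rubric the model returns, it can help to prototype the timed-checkpoint bands and the roll-up yourself. The following is a minimal Python sketch under stated assumptions, not part of the prompt: the weights, the 0-4 scale, the acknowledgement thresholds, and the difficulty multiplier are all hypothetical placeholders, and the function names (`band_score`, `readiness`, `normalize`) are illustrative. Multiplying by a difficulty factor is only one possible way to compare drills across scenarios.

```python
# Hypothetical weights for the eight dimensions named in the prompt.
# Replace with the weights (and justifications) your rubric settles on.
WEIGHTS = {
    "detection_speed": 0.15,
    "triage_accuracy": 0.15,
    "role_clarity": 0.10,
    "decision_quality": 0.15,
    "communication_cadence": 0.15,
    "escalation_correctness": 0.10,
    "recovery_verification": 0.10,
    "documentation": 0.10,
}
MAX_LEVEL = 4  # assumed top of the 0-to-N scale

# Hypothetical bands for time-to-acknowledge:
# (upper bound in seconds, level awarded), ordered fastest first.
ACK_BANDS = [(120, 4), (300, 3), (600, 2), (900, 1)]


def band_score(elapsed_seconds, bands):
    """Map a clocked checkpoint to a level using the first band it fits."""
    for upper_bound, level in bands:
        if elapsed_seconds <= upper_bound:
            return level
    return 0  # slower than every band


def readiness(levels, weights=WEIGHTS, max_level=MAX_LEVEL):
    """Weighted roll-up of per-dimension levels to a 0-100 score."""
    if set(levels) != set(weights):
        raise ValueError("levels must cover exactly the weighted dimensions")
    weighted = sum(weights[dim] * levels[dim] for dim in weights)
    return 100.0 * weighted / (sum(weights.values()) * max_level)


def normalize(raw_score, difficulty):
    """Scale a raw score by an assumed scenario-difficulty multiplier.

    difficulty = 1.0 is a baseline scenario; values above 1.0 mark harder
    scenarios. The result is capped at 100.
    """
    return min(100.0, raw_score * difficulty)


if __name__ == "__main__":
    levels = {dim: 3 for dim in WEIGHTS}
    # Score detection from a clocked checkpoint instead of an impression.
    levels["detection_speed"] = band_score(450, ACK_BANDS)
    raw = readiness(levels)
    print(f"raw={raw:.1f} normalized={normalize(raw, 1.2):.1f}")
```

Keeping the bands and weights as plain data makes it easy to version them alongside drill results, so a change in thresholds is visible when trends are compared later.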