DevOps AI ToolKit

AI Incident-Triage Cheat Sheet

Everything on this page fits on one printed sheet. Pin it by your on-call desk.

The one rule

AI reads and reasons. Humans run commands.

Let the model summarize, hypothesize, and draft commands. You read every command and you run it. Never let AI execute against production.

1 · Classify severity → priority

Severity   Priority   Looks like
critical   P1         Customer-impacting outage, data-loss risk, full unavailability
warning    P2         Degraded service, rising errors, growing backlog
info       P3 / P4    Capacity trend, maintenance signal, non-urgent notice
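
If alerts arrive with a severity label, the table above can be encoded as a small lookup. This is a minimal sketch: the names `SEVERITY_TO_PRIORITY` and `priority_for` are hypothetical, and keeping "P3/P4" as a single value is an assumption, since the table does not say how to choose between the two.

```python
# Hypothetical lookup built from the severity table; names are illustrative.
SEVERITY_TO_PRIORITY = {
    "critical": "P1",
    "warning": "P2",
    "info": "P3/P4",  # assumption: the table does not split P3 from P4
}


def priority_for(severity: str) -> str:
    """Return the priority for an alert severity label.

    Unknown labels raise instead of defaulting, so a typo in an alert
    rule is noticed rather than silently downgraded.
    """
    key = severity.strip().lower()
    if key not in SEVERITY_TO_PRIORITY:
        raise ValueError(f"unknown severity: {severity!r}")
    return SEVERITY_TO_PRIORITY[key]


print(priority_for("Critical"))
```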

2 · Work the command ladder — safest first, stop at diagnosis

safe · read-only

kubectl get · journalctl · ss · ip (show forms only) · cat · grep · promtool query

caution · small change

kubectl exec · docker exec · edit non-prod config

destructive · last resort

restart · delete · scale-to-zero · firewall · migrate · restore
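
One way to apply the ladder to a command the model drafted is to label it with a rung before a human reads it. The sketch below is illustrative only: the `(tool, subcommand)` entries are assumptions drawn from the examples above, `ip` is left out because it can also change state, and anything unrecognized or containing shell metacharacters is reported as `unknown`. It labels commands and never runs them.

```python
import shlex

# (tool, subcommand) pairs based on the ladder; None matches any subcommand.
# The exact entries are assumptions for illustration, not a complete policy.
LADDER = {
    "safe": {
        ("kubectl", "get"), ("journalctl", None), ("ss", None),
        ("cat", None), ("grep", None), ("promtool", "query"),
    },
    "caution": {("kubectl", "exec"), ("docker", "exec")},
    "destructive": {
        ("kubectl", "delete"), ("kubectl", "scale"),
        ("docker", "restart"), ("systemctl", "restart"),
    },
}

# Pipes, chaining, redirects, and substitutions can hide a second command.
SHELL_META = set("|;&<>`$")


def classify(command: str) -> str:
    """Return 'safe', 'caution', 'destructive', or 'unknown' for a draft."""
    if any(ch in SHELL_META for ch in command):
        return "unknown"
    tokens = shlex.split(command)
    if not tokens:
        return "unknown"
    tool = tokens[0]
    # Assumes the subcommand is the second token (no global flags first).
    sub = tokens[1] if len(tokens) > 1 else None
    # Check the riskiest rung first so a match there always wins.
    for rung in ("destructive", "caution", "safe"):
        entries = LADDER[rung]
        if (tool, sub) in entries or (tool, None) in entries:
            return rung
    return "unknown"


for cmd in ["kubectl get pods", "kubectl exec -it web -- sh", "kubectl delete pod web"]:
    print(f"{classify(cmd):12} {cmd}")
```

Treating `unknown` the same as `destructive` keeps the check conservative: a command the table does not recognize gets the closest reading.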

3 · Copy-paste prompts

Summarize the firehose

"Here are the alerts, logs, and recent changes for an active production incident. Summarize what's happening in 5 bullets, list the top 3 hypotheses ordered by likelihood, and for each give the single read-only command to confirm or rule it out. Suggest no command that changes state."

Correlate what changed

"The spike started at <TIME> UTC. Here is the deploy and config-change history for the last 6 hours. What changed closest to that time, and by what mechanism could it cause this symptom?"

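Filling the <TIME> placeholder by hand is easy to get wrong under pressure, especially with local time zones. A minimal sketch of a helper that does it, assuming a hypothetical function name `build_correlate_prompt` and a made-up history entry used only for the demo:

```python
from datetime import datetime, timezone

# Prompt text copied from the cheat sheet; <TIME> is its placeholder.
CORRELATE_PROMPT = (
    "The spike started at <TIME> UTC. Here is the deploy and config-change "
    "history for the last 6 hours. What changed closest to that time, and by "
    "what mechanism could it cause this symptom?"
)


def build_correlate_prompt(spike_start: datetime, history: str) -> str:
    """Fill <TIME> with a UTC timestamp and append the change history."""
    if spike_start.tzinfo is None:
        # A naive datetime would be silently misread as local time.
        raise ValueError("spike_start must be timezone-aware")
    stamp = spike_start.astimezone(timezone.utc).strftime("%Y-%m-%d %H:%M")
    return CORRELATE_PROMPT.replace("<TIME>", stamp) + "\n\n" + history.strip()


prompt = build_correlate_prompt(
    datetime(2026, 6, 6, 14, 5, tzinfo=timezone.utc),
    "13:58 deploy of service X (hypothetical entry)",
)
print(prompt)
```
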
Draft comms

"Write a customer-facing status-page update for a degraded-<SERVICE> incident — no jargon, no root-cause speculation, ~3 sentences. Then a one-line internal update with current severity and what we're checking."

4 · Never

  • Paste secrets, tokens, or customer data into a model
  • Let it invent metric names — give real ones or placeholders
  • Trust a confident command without reading it
  • Run the "obvious" destructive fix before confirming cause
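
For the first rule, a redaction pass before pasting can catch the most common slips. This is a sketch under stated assumptions: the patterns below are illustrative, cover only a few well-known shapes (bearer tokens, `key=value` credentials, AWS access key IDs, email addresses), and will miss anything else. Read the output yourself before it goes into a model.

```python
import re

# Illustrative patterns only; they will miss secrets in formats not listed here.
PATTERNS = [
    # "Bearer <token>" in HTTP headers
    (re.compile(r"(?i)\b(bearer)\s+[A-Za-z0-9._~+/=-]+"), r"\1 [REDACTED]"),
    # password=..., token: ..., api_key=..., and similar key/value pairs
    (
        re.compile(r"(?i)\b(password|passwd|token|secret|api[_-]?key)(\s*[=:]\s*)\S+"),
        r"\1\2[REDACTED]",
    ),
    # AWS access key IDs: "AKIA" followed by 16 uppercase letters or digits
    (re.compile(r"\bAKIA[0-9A-Z]{16}\b"), "[REDACTED_AWS_KEY_ID]"),
    # Email addresses, as a rough stand-in for customer identifiers
    (re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"), "[REDACTED_EMAIL]"),
]


def redact(text: str) -> str:
    """Apply every pattern in order and return the scrubbed text."""
    for pattern, replacement in PATTERNS:
        text = pattern.sub(replacement, text)
    return text


print(redact("db password=hunter2 host=db1"))
```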

DevOps AI ToolKit · devopsaitoolkit.com · updated 2026-06-06 · Verify all AI output before running anything in production.