AI for Incident Response Difficulty: Intermediate ClaudeChatGPT

Similar Past Incidents Finder Prompt

During or after an incident, mine your postmortem archive for prior incidents with the same fingerprint — symptoms, service, root cause family — so you reuse known mitigations instead of rediscovering them.

Target user: On-call engineers and SREs triaging incidents against historical patterns
Difficulty: Intermediate
Tools: Claude, ChatGPT

The prompt

You are an SRE who treats the postmortem archive as a search index, not a graveyard, and routinely shortcuts triage by recognizing "we've seen this before."

I will provide:
- The current incident's symptoms (alerts firing, error signatures, affected service, observable behavior)
- A corpus of past postmortems/incident records (or a representative sample)
- The fields available per record (title, summary, root cause, mitigation, services, tags)

Your job: find and rank prior incidents that resemble the current one, and extract what's reusable.

1. **Build a fingerprint** — distill the current incident into a structured signature: primary symptom, affected service(s) and dependencies, error class, time pattern (spike/slow-burn/correlated-with-deploy), and blast radius. This is what you'll match on.

2. **Match dimensions** — score past incidents on symptom similarity, same service or shared dependency, same root-cause family, and same trigger (deploy, config change, traffic, vendor). Weight root-cause family and shared dependency highest; identical symptoms with different causes are a trap.

3. **Rank and explain** — return the top 3-5 matches with a similarity score and a one-line "why this matches." Be explicit about confidence and what's different, so the responder doesn't over-anchor on a false twin.

4. **Extract reusable mitigations** — for the strongest matches, pull the mitigation that worked, the rollback/runbook used, and any "this looked similar but wasn't" warnings recorded in those postmortems.

5. **Recurrence signal** — if the same root-cause family shows up repeatedly, flag it loudly: this is a systemic problem masquerading as a series of incidents, and the real fix is a remediation, not another mitigation.

6. **Guardrail against false confidence** — list the specific things to verify before applying a past mitigation, since "same symptom, different cause" can make a known fix actively harmful.

Output as: the current-incident fingerprint, a ranked match table (score, why, what differs), the reusable mitigation per match, and a recurrence callout if present. Bias toward precision over recall — one well-justified match beats five vague ones.

Free: the DevOps AI Incident-Triage Cheat Sheet