Similar Past Incidents Finder Prompt
During or after an incident, mine your postmortem archive for prior incidents with the same fingerprint — symptoms, service, root cause family — so you reuse known mitigations instead of rediscovering them.
- Target user: On-call engineers and SREs triaging incidents against historical patterns
- Difficulty: Intermediate
- Tools: Claude, ChatGPT
The prompt
You are an SRE who treats the postmortem archive as a search index, not a graveyard, and routinely shortcuts triage by recognizing "we've seen this before."

I will provide:
- The current incident's symptoms (alerts firing, error signatures, affected service, observable behavior)
- A corpus of past postmortems/incident records (or a representative sample)
- The fields available per record (title, summary, root cause, mitigation, services, tags)

Your job: find and rank prior incidents that resemble the current one, and extract what's reusable.

1. **Build a fingerprint**: distill the current incident into a structured signature: primary symptom, affected service(s) and dependencies, error class, time pattern (spike/slow-burn/correlated-with-deploy), and blast radius. This is what you'll match on.
2. **Match dimensions**: score past incidents on symptom similarity, same service or shared dependency, same root-cause family, and same trigger (deploy, config change, traffic, vendor). Weight root-cause family and shared dependency highest; identical symptoms with different causes are a trap.
3. **Rank and explain**: return the top 3-5 matches with a similarity score and a one-line "why this matches." Be explicit about confidence and what's different, so the responder doesn't over-anchor on a false twin.
4. **Extract reusable mitigations**: for the strongest matches, pull the mitigation that worked, the rollback/runbook used, and any "this looked similar but wasn't" warnings recorded in those postmortems.
5. **Recurrence signal**: if the same root-cause family shows up repeatedly, flag it loudly: this is a systemic problem masquerading as a series of incidents, and the real fix is a remediation, not another mitigation.
6. **Guardrail against false confidence**: list the specific things to verify before applying a past mitigation, since "same symptom, different cause" can make a known fix actively harmful.

Output as: the current-incident fingerprint, a ranked match table (score, why, what differs), the reusable mitigation per match, and a recurrence callout if present. Bias toward precision over recall — one well-justified match beats five vague ones.
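If you would rather pre-filter a large archive before pasting records into the prompt, the fingerprint, weighted matching, and recurrence steps can be approximated in a few lines of Python. This is a minimal sketch, not part of the prompt itself: the `Fingerprint` fields, the numeric weights, the `min_score` cutoff, and the incident IDs are all illustrative assumptions. The only thing taken from the prompt is the ordering, where root-cause family and shared dependency outweigh symptoms and trigger.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Optional, Tuple


@dataclass(frozen=True)
class Fingerprint:
    """Hypothetical structured signature for one incident."""
    symptoms: FrozenSet[str]
    services: FrozenSet[str]           # affected services plus dependencies
    root_cause_family: Optional[str]   # suspected family; None if unknown
    trigger: Optional[str]             # deploy, config, traffic, vendor


# Assumed weights: root cause and shared dependency rank highest.
WEIGHTS = {"root_cause": 0.35, "dependency": 0.30, "symptom": 0.20, "trigger": 0.15}


def jaccard(a: FrozenSet[str], b: FrozenSet[str]) -> float:
    """Set overlap in [0, 1]; two empty sets count as no overlap."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def score(current: Fingerprint, past: Fingerprint) -> float:
    """Weighted similarity across the four match dimensions."""
    parts = {
        "root_cause": float(current.root_cause_family is not None
                            and current.root_cause_family == past.root_cause_family),
        "dependency": jaccard(current.services, past.services),
        "symptom": jaccard(current.symptoms, past.symptoms),
        "trigger": float(current.trigger is not None
                         and current.trigger == past.trigger),
    }
    return round(sum(WEIGHTS[k] * v for k, v in parts.items()), 2)


def rank(current: Fingerprint, archive: Dict[str, Fingerprint],
         top_n: int = 5, min_score: float = 0.4) -> List[Tuple[float, str]]:
    """Best matches first; weak matches are dropped (precision over recall)."""
    scored = [(score(current, fp), incident_id) for incident_id, fp in archive.items()]
    return sorted((m for m in scored if m[0] >= min_score), reverse=True)[:top_n]


def recurring_families(archive: Dict[str, Fingerprint],
                       matches: List[Tuple[float, str]],
                       threshold: int = 2) -> List[str]:
    """Root-cause families that appear in at least `threshold` matches."""
    counts = Counter(archive[i].root_cause_family for _, i in matches)
    return [fam for fam, n in counts.items() if fam is not None and n >= threshold]


current = Fingerprint(frozenset({"5xx", "latency"}),
                      frozenset({"checkout", "payments-db"}),
                      "connection-pool-exhaustion", "deploy")
archive = {
    "INC-101": Fingerprint(frozenset({"5xx", "latency"}),
                           frozenset({"checkout", "payments-db"}),
                           "connection-pool-exhaustion", "deploy"),
    # Same symptoms, different service, cause, and trigger: a false twin.
    "INC-087": Fingerprint(frozenset({"5xx", "latency"}),
                           frozenset({"search"}), "bad-config", "config"),
    "INC-120": Fingerprint(frozenset({"timeouts"}),
                           frozenset({"payments-db", "billing"}),
                           "connection-pool-exhaustion", "traffic"),
}

matches = rank(current, archive)
recurring = recurring_families(archive, matches)
```

With this sample data, `INC-087` falls below the cutoff even though its symptoms are identical, which is the "false twin" case the prompt warns about, while `INC-120` survives on root-cause family and a shared dependency alone. A score like this is only a shortlist: the "why this matches" and "what differs" judgments still belong to the model and the responder.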