Using AI to Generate Incident Hypotheses Without Anchoring the Team
A murky incident is where teams tunnel on the wrong cause. Here's how to use AI to broaden your hypothesis list without letting its first guess anchor everyone.
- #incident-response
- #ai
- #troubleshooting
- #sre
- #on-call
The most dangerous moment in an incident isn’t when you don’t know what’s wrong; it’s when you think you know what’s wrong and you’re wrong. Anchoring is the silent killer of incident response. Someone says “it’s probably the deploy” in the first two minutes, and for the next forty minutes the whole team investigates the deploy, ignoring the actual cause sitting in plain sight, because the group has tunneled. By the time someone questions the assumption, you’ve spent most of the incident chasing a ghost, and your time to recovery shows it.
I’ve started using AI specifically to fight this — to broaden the hypothesis space at the start of an incident rather than narrow it. It’s a genuinely useful application, and it’s also one where the tool can make the problem worse if you misuse it. The difference is entirely in how you frame the ask.
Anchoring, and why humans are bad at the opening
Under stress, humans reach for the most available explanation — the recent change, the thing that broke last time, the system they’re most worried about. That’s efficient when it’s right and catastrophic when it’s wrong, and stress makes us more likely to commit early and less likely to revisit. The opening minutes of an incident are exactly when a broad, calm differential-diagnosis mindset matters most and is hardest to maintain.
A model doesn’t get stressed and doesn’t have a pet theory. Asked the right way, it’ll generate a wide list of plausible causes for a symptom — including the boring, unglamorous ones humans skip (“is it a certificate expiry? a disk filling up? an upstream dependency?”). That breadth is the antidote to tunnel vision.
The right framing: breadth, then ranking, never a single answer
The way you prompt this determines whether it helps or hurts. The wrong prompt is “what’s causing this incident?” — that invites a single confident answer, and a single confident answer from an authoritative-sounding tool is a fresh anchor, possibly a worse one than the human would’ve picked.
The right prompt deliberately asks for a spread: “Checkout requests are timing out, error rate climbing, started ~02:00, no obvious recent deploy. Generate a broad list of at least eight distinct possible causes across categories — application, dependencies, infrastructure, data, configuration, external. For each, give the quickest signal I could check to confirm or rule it out. Do not rank them yet; I want breadth.”
Now the AI is doing what it’s good at — generating a comprehensive differential — and explicitly not doing the thing that anchors. The output is a checklist of things to rule out, which is exactly the calm, systematic posture you want and struggle to hold at 3am.
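One way to keep that framing consistent is to build the prompt from a template rather than retyping it mid-incident. The following is a minimal sketch, not a prescribed implementation: the function name `build_differential_prompt`, its parameters, and the validation rules are assumptions made for illustration, while the wording and the six categories come from the prompt above.

```python
from typing import Sequence

# Categories taken from the example prompt; adjust to your own stack.
DEFAULT_CATEGORIES = (
    "application", "dependencies", "infrastructure",
    "data", "configuration", "external",
)


def build_differential_prompt(
    symptoms: str,
    min_hypotheses: int = 8,
    categories: Sequence[str] = DEFAULT_CATEGORIES,
) -> str:
    """Return a breadth-first hypothesis prompt (hypothetical helper)."""
    if not symptoms.strip():
        raise ValueError("describe the observed symptoms first")
    if min_hypotheses < 2:
        # Asking for one cause is the anchoring prompt we want to avoid.
        raise ValueError("ask for a spread, not a single answer")
    return (
        f"{symptoms.strip()}\n\n"
        f"Generate a broad list of at least {min_hypotheses} distinct "
        f"possible causes across categories: {', '.join(categories)}. "
        "For each, give the quickest signal I could check to confirm "
        "or rule it out. Do not rank them yet; I want breadth."
    )


prompt = build_differential_prompt(
    "Checkout requests are timing out, error rate climbing, "
    "started ~02:00, no obvious recent deploy."
)
print(prompt)
```

The responder only supplies the symptoms; the breadth request, the per-hypothesis check, and the "do not rank" instruction are fixed by the template, so they are hard to drop under pressure.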
Pro Tip: Always ask for the cheapest disconfirming check alongside each hypothesis, not just the hypothesis. “Could be DNS — check resolution time on the affected host” turns a list of guesses into an investigation plan. A hypothesis you can’t quickly test is just a distraction; a hypothesis paired with a 10-second check is a step forward.
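To make the hypothesis-plus-check pairing concrete, here is a small sketch of how the model's output might be held once you have copied it out of the chat. The `Hypothesis` class and both helper functions are hypothetical names, and the sample leads are illustrative rather than real incident data. Leads without a check are set aside, and the rest are sorted alphabetically by category so that list order carries no implied ranking.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hypothesis:
    cause: str
    category: str
    check: str  # cheapest disconfirming check; empty if none was given


def split_testable(hypotheses):
    """Separate leads that come with a quick check from those that don't."""
    testable = [h for h in hypotheses if h.check.strip()]
    untestable = [h for h in hypotheses if not h.check.strip()]
    return testable, untestable


def render_checklist(hypotheses):
    """Render testable leads as an unranked checklist."""
    testable, _ = split_testable(hypotheses)
    # Sort by category, then cause: a neutral order, not a likelihood ranking.
    ordered = sorted(testable, key=lambda h: (h.category, h.cause))
    return "\n".join(
        f"[ ] ({h.category}) {h.cause}: {h.check}" for h in ordered
    )


# Illustrative leads only.
leads = [
    Hypothesis("DNS resolution slow", "infrastructure",
               "check resolution time on the affected host"),
    Hypothesis("Certificate expiry", "configuration",
               "check the certificate's expiry date"),
    Hypothesis("Something in the network", "infrastructure", ""),
]
checklist = render_checklist(leads)
print(checklist)
```

The vague third lead has no check attached, so it stays off the checklist until someone can pair it with a concrete signal.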
Hypotheses are leads, the team does the diagnosis
Here’s the discipline that keeps this safe: every item the model produces is a lead to investigate, never a conclusion. The AI broadens the search; the humans run the checks and read the real signals. The model saying “could be connection-pool exhaustion” means “go look at the pool metrics,” not “it’s the pool, apply the pool fix.”
This matters because the failure mode of over-trusting AI hypotheses is just a higher-tech version of the anchoring you were trying to escape. If the team treats the model’s first listed item as the answer, you’ve replaced human tunnel vision with machine tunnel vision. So the rule holds: AI for generating the space of possibilities, humans for testing them against reality and deciding what’s actually true. The model never touches production to “check” its own guess — it suggests the check; a human runs it.
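The same rule can be reflected in whatever structure you use to track leads. The sketch below is one possible shape, with hypothetical names throughout (`Lead`, `Status`, `record_check`): a lead starts open, and it can only be closed when a named person records the signal they actually observed. Nothing in it calls a model or touches production.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    OPEN = "open"
    RULED_OUT = "ruled out"
    CONFIRMED = "confirmed"


@dataclass
class Lead:
    cause: str
    suggested_check: str  # what the model proposed; a human runs it
    status: Status = Status.OPEN
    checked_by: str = ""
    evidence: str = ""

    def record_check(self, status: Status, checked_by: str, evidence: str) -> None:
        """Close a lead only with a named human and the evidence they saw."""
        if status is Status.OPEN:
            raise ValueError("record a result: RULED_OUT or CONFIRMED")
        if not checked_by.strip() or not evidence.strip():
            raise ValueError("a lead needs a human checker and observed evidence")
        self.status = status
        self.checked_by = checked_by
        self.evidence = evidence


def open_leads(leads):
    """Return the leads nobody has tested against real signals yet."""
    return [lead for lead in leads if lead.status is Status.OPEN]


# Illustrative leads and evidence, not real incident data.
leads = [
    Lead("connection-pool exhaustion", "look at the pool metrics"),
    Lead("certificate expiry", "check the certificate's expiry date"),
]
leads[0].record_check(
    Status.RULED_OUT, "on-call engineer", "pool usage well below its limit"
)
print([lead.cause for lead in open_leads(leads)])
```

Requiring both a checker and evidence is a design choice: it makes "the model said so" an invalid reason to confirm or dismiss a lead.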
Keeping the team un-anchored, together
There’s a team-dynamics benefit too. When a broad AI-generated differential is on the screen, it’s easier for a junior engineer to say “did we rule out the certificate thing?” without feeling like they’re second-guessing the senior who confidently blamed the deploy. The list depersonalizes the hypothesis space — it’s not “challenging Sarah’s theory,” it’s “working through the checklist.” That social cover is quietly valuable for keeping an incident’s thinking honest.
The incident commander still runs the show — deciding which leads to pursue, in what order, with whom. The AI just makes sure the menu of leads is complete. This pairs naturally with the IC role and the systematic triage approaches covered across the incident-response category.
When not to reach for this
If the cause is genuinely obvious — the deploy went out two minutes before everything broke and rolling it back fixes it — don’t ceremonially consult an AI for hypotheses. This technique is for the murky incidents, the ones where you’re staring at a symptom with no obvious trigger and you can feel the team starting to tunnel. That’s where breadth pays. Knowing when you actually need it is its own judgment call — a human one.
Tooling and habit
A strong reasoning model like Claude or ChatGPT does this well, since it’s a reasoning-over-symptoms task. The free AI Incident Response Assistant can structure the symptom-to-hypotheses step as a standard early move in your response flow, and keeping a consistent “broad differential” prompt in your prompt workspace means every responder gets the same un-anchoring benefit. For more reusable framings, see the prompt library.
The real point
AI won’t diagnose your incident. It doesn’t know your systems, and it shouldn’t be trusted to. What it can do is widen your field of view at the exact moment human psychology wants to narrow it. Used as a breadth engine and a check-generator, with every lead confirmed against real signals by real people, it makes your team harder to fool. The diagnosis stays human. The decisions stay human. The actions stay human. The model just keeps you from missing the obvious thing while you tunnel on the wrong one.