Post Mortems with AI · By James Joyner IV · 9 min read

Writing an Internal Incident Review With AI (For Engineers, Not Execs)

Exec updates and engineer reviews need opposite things. Here's how to use AI to draft the deep technical incident review engineers learn from.

  • #incident-response
  • #ai
  • #postmortem
  • #engineering
  • #sre

There are two documents that come out of a serious incident, and people keep trying to write them as one. There’s the executive update — short, business-impact-focused, light on mechanism. And there’s the internal engineering review — the deep technical account of exactly how the system failed, the kind of thing another engineer reads to genuinely understand the failure mode. They serve opposite audiences and they require opposite things, and when you smush them together you get a document that’s too shallow for engineers and too technical for execs.

I’ve leaned on AI for both, but they need different handling. The exec update is mostly a compression-and-tone problem. The internal review is the hard one, because here the technical nuance is the entire value, and nuance is exactly what AI is most likely to quietly sand off. Let me focus on the hard one.

What makes an engineering review valuable

The internal review exists so that engineers who weren’t in the incident can learn the failure mode well enough to recognize, prevent, or fix it. That means it has to carry real depth:

  • The precise mechanism — not “the database had issues” but “connection-pool exhaustion under a thundering-herd retry storm after the timeout config change.”
  • The contributing factors and how they combined — failures are rarely one thing.
  • The dead ends — what you thought it was and why you were wrong, because that’s where the real learning lives.
  • The systemic angle — what about the system allowed this, beyond the immediate trigger.

Every one of those is a place where over-summarizing destroys the value. A review that says “we had a database problem and fixed it” is worse than useless: it signals that someone understood the failure while passing none of that understanding on to the reader.

Using AI without flattening the nuance

The instinct is to ask AI to “summarize the incident into a review.” Resist it. Summarization is the enemy of a good engineering review, because the details you’d compress away are the point.

Instead I use AI as a structuring and drafting partner over material I provide, with explicit instructions to preserve technical specificity: “Below are my notes, the timeline, and the technical findings. Draft an internal engineering review with these sections: summary, detailed mechanism, contributing factors and how they combined, investigation path including dead ends, and systemic factors. Preserve all specific technical details — service names, config values, error signatures. Do not generalize or simplify the mechanism. Where my notes are vague, mark a TODO rather than inventing detail.”
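If you want that prompt to be repeatable rather than retyped each time, a small helper can assemble it. What follows is a minimal sketch, not a prescribed tool: the function name `build_review_prompt`, the `REVIEW_SECTIONS` constant, and the "## Notes" style labels are all hypothetical names chosen for illustration. The wording of the instructions is the prompt quoted above.

```python
from textwrap import dedent

# Section names taken from the prompt quoted above.
REVIEW_SECTIONS = [
    "summary",
    "detailed mechanism",
    "contributing factors and how they combined",
    "investigation path including dead ends",
    "systemic factors",
]


def build_review_prompt(notes: str, timeline: str, findings: str) -> str:
    """Assemble the review-drafting prompt around material the human supplies.

    Hypothetical helper: it only formats text and calls no model API.
    """
    instructions = dedent(
        """\
        Below are my notes, the timeline, and the technical findings.
        Draft an internal engineering review with these sections: {sections}.
        Preserve all specific technical details (service names, config values,
        error signatures). Do not generalize or simplify the mechanism.
        Where my notes are vague, mark a TODO rather than inventing detail."""
    ).format(sections=", ".join(REVIEW_SECTIONS))
    # Labeled blocks keep the three inputs separate so they are not blended.
    return (
        f"{instructions}\n\n"
        f"## Notes\n{notes.strip()}\n\n"
        f"## Timeline\n{timeline.strip()}\n\n"
        f"## Technical findings\n{findings.strip()}\n"
    )
```

The source material is passed through untouched, so the exact values you recorded during the incident are what the model sees.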

The model is genuinely good at organizing my scattered technical notes into a coherent narrative while keeping the specifics intact. What it must not do is decide which details “don’t matter” — that’s a judgment that depends on understanding the system, which it doesn’t have.

Pro Tip: Explicitly forbid the model from generalizing the root cause, and ask it to quote your exact technical findings rather than paraphrasing them. Models reflexively smooth specifics into vaguer, “cleaner” prose: “a configuration issue” instead of “maxIdleConns was set to 2.” For an engineering audience, the specific value is the lesson.
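One cheap way to catch that smoothing is to list the exact findings you refuse to lose and check that each appears verbatim in the draft. The sketch below assumes you maintain that list by hand; `missing_specifics` is a hypothetical helper, and a verbatim match only proves the string survived, not that the sentence around it is correct.

```python
def missing_specifics(draft: str, must_keep: list[str]) -> list[str]:
    """Return the exact findings that do not appear verbatim in the draft.

    Whitespace is collapsed on both sides so line wrapping in the draft
    does not cause false alarms.
    """
    normalized_draft = " ".join(draft.split())
    return [
        finding
        for finding in must_keep
        if " ".join(finding.split()) not in normalized_draft
    ]
```

For example, a draft that says "a configuration issue in the DB client" would be flagged as missing "maxIdleConns was set to 2" if that string is on your list.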

The human owns the mechanism

This is where the AI-drafts-humans-own principle gets its sharpest test. In an exec update, if the model’s phrasing is a little off, the cost is low. In an engineering review, if the model misstates the mechanism — says the cause was the cache when it was the connection pool, or implies a clean single cause when reality was a messy combination — it teaches the wrong lesson to every engineer who reads it. That’s actively harmful. A confidently wrong engineering review is worse than none, because it spreads false confidence.

So my review pass is rigorous and technical. I verify the mechanism is stated correctly and completely. I check that contributing factors are presented as a combination, not collapsed into a single tidy cause (real incidents almost never have one). I make sure the dead ends survived — the model loves to drop them because they don’t fit a clean narrative, but they’re often the most instructive part. The model assembles the draft; I own the technical truth of it.
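Part of that pass can be mechanized as a pre-check before the real technical read. This is a rough sketch under stated assumptions: the heading names, the dead-end keywords, and the `review_checklist` function are all hypothetical, and keyword matching cannot tell you whether the mechanism is stated correctly. It only points at drafts that dropped a section, lost the dead ends, or still carry open TODO markers.

```python
# Assumed heading text; adjust to whatever your review template uses.
REQUIRED_HEADINGS = [
    "Summary",
    "Detailed mechanism",
    "Contributing factors",
    "Investigation path",
    "Systemic factors",
]

# Assumed phrases that suggest a dead end was written up.
DEAD_END_HINTS = ("dead end", "ruled out", "initially suspected")


def review_checklist(draft: str) -> dict:
    """Flag structural gaps in a draft; a human still verifies the content."""
    lower = draft.lower()
    return {
        "missing_sections": [
            h for h in REQUIRED_HEADINGS if h.lower() not in lower
        ],
        "mentions_dead_ends": any(hint in lower for hint in DEAD_END_HINTS),
        "open_todos": [
            line.strip() for line in draft.splitlines() if "TODO" in line
        ],
    }
```

An empty `missing_sections` list and a true `mentions_dead_ends` are where the human review starts, not where it ends.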

Generating the exec version from the engineering one

Here’s a nice efficiency: write the deep engineering review first (with the care above), then have AI generate the exec update from it. “From this technical review, write a 150-word executive summary focused on customer/business impact, duration, and what we’re doing to prevent recurrence. No deep technical mechanism.” Going depth-first-then-compress is far safer than the reverse, because the compression is happening from a human-verified source of truth, and compression is a task models generally handle well.
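The compress step can be scripted the same way as the drafting step. Again a minimal sketch: `build_exec_prompt` and `over_word_limit` are hypothetical helpers, and the word count is a plain whitespace split, which is an approximation rather than an exact measure.

```python
def build_exec_prompt(verified_review: str, max_words: int = 150) -> str:
    """Wrap the human-verified review in the exec-summary prompt quoted above."""
    return (
        f"From this technical review, write a {max_words}-word executive "
        "summary focused on customer/business impact, duration, and what "
        "we're doing to prevent recurrence. No deep technical mechanism.\n\n"
        f"{verified_review.strip()}\n"
    )


def over_word_limit(summary: str, max_words: int = 150) -> bool:
    """Return True when the summary exceeds the limit (whitespace-split count)."""
    return len(summary.split()) > max_words
```

Feeding only the verified review into this prompt keeps the exec version tied to the one source you have already checked.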

This connects to the broader comms practice covered across the incident-response category — different audiences, one verified source, no contradictions.

Where to learn the failure modes

The best engineering reviews become reference material — the corpus that future incidents get matched against and that new on-call engineers read to ramp up. That’s a quiet argument for doing them well, and for keeping a consistent structure so they’re comparable over time. A standard review-drafting prompt in your prompt workspace helps, and the free AI Incident Response Assistant supports the draft-then-verify loop these documents demand.

Tooling

For deeply technical drafting, a strong reasoning model like Claude handles the organize-without-simplifying task well, and ChatGPT is a fine alternative. If your reviews live alongside code and you want the assistant to reference the actual implementation, an in-editor tool like Cursor is useful. For reusable structures, browse the prompt library.

The takeaway

AI makes the internal incident review faster to write without making it shallower — but only if you fight its instinct to summarize and you personally own the mechanism. The whole value of this document is the depth that transfers real understanding to the next engineer. Use AI to organize and draft; keep the technical truth firmly human. Done right, you get a review that’s both honest and quick, which is exactly the combination that makes teams actually write them.
