Turn Teams Meeting Transcripts Into Postmortems With AI

The worst part of an incident isn’t the incident. It’s the postmortem nobody wants to write three days later, when the timeline has already rotted in everyone’s memory. I’ve sat in plenty of retros where the first ten minutes were just “wait, when did we actually page the database team?” The answer was sitting in a Teams meeting transcript the whole time — we just never used it.

After 25 years of running on-call, I’ve come around to treating the incident bridge transcript as the raw material for the postmortem. Teams records and transcribes the bridge automatically if you turn it on, Microsoft Graph will hand you that transcript over an API, and an LLM is genuinely good at turning a messy spoken timeline into a structured first draft. The model is a fast junior engineer here: it writes the boring scaffolding, and a human edits it before it goes anywhere near a leadership channel.

Get the transcript out of Graph

The first job is fetching the transcript. Teams stores meeting transcripts behind the onlineMeetings resource in Microsoft Graph. You need the meeting’s ID, then you list its transcripts and download the content as VTT.

GET https://graph.microsoft.com/v1.0/users/{organizerId}/onlineMeetings/{meetingId}/transcripts
Authorization: Bearer {token}

That returns transcript metadata. Each entry has an id and a transcriptContentUrl. To get the actual captions as WebVTT, request that transcript’s content with the $format=text/vtt query parameter:

GET https://graph.microsoft.com/v1.0/users/{organizerId}/onlineMeetings/{meetingId}/transcripts/{transcriptId}/content?$format=text/vtt
Authorization: Bearer {token}

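The application permission you want is OnlineMeetingTranscript.Read.All, and it requires admin consent. That permission reads transcripts across the tenant, so scope it carefully and gate it behind a service principal that only your incident tooling uses. Graph also expects a tenant admin to set up an application access policy before an app can read meeting artifacts on a user’s behalf, which gives you a natural place to limit which organizers the tooling can reach. Do not grant this to a general-purpose automation account.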
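To make the two requests concrete, here is a minimal Python sketch using only the standard library. The function names are hypothetical, token acquisition is assumed to happen elsewhere, and picking the newest transcript by createdDateTime is my assumption about what you want when a bridge produced more than one.

```python
import json
import urllib.parse
import urllib.request

GRAPH = "https://graph.microsoft.com/v1.0"


def transcripts_url(organizer_id: str, meeting_id: str) -> str:
    """URL that lists transcript metadata for one meeting."""
    org = urllib.parse.quote(organizer_id, safe="")
    mtg = urllib.parse.quote(meeting_id, safe="")
    return f"{GRAPH}/users/{org}/onlineMeetings/{mtg}/transcripts"


def content_url(organizer_id: str, meeting_id: str, transcript_id: str) -> str:
    """URL that downloads one transcript as WebVTT."""
    tid = urllib.parse.quote(transcript_id, safe="")
    base = transcripts_url(organizer_id, meeting_id)
    return f"{base}/{tid}/content?$format=text/vtt"


def _get(url: str, token: str) -> bytes:
    # The token is only ever sent to Graph; it never goes near the model.
    req = urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})
    with urllib.request.urlopen(req, timeout=30) as resp:
        return resp.read()


def fetch_latest_vtt(organizer_id: str, meeting_id: str, token: str) -> str:
    """Download the most recently created transcript for a meeting."""
    listing = json.loads(_get(transcripts_url(organizer_id, meeting_id), token))
    entries = listing.get("value", [])
    if not entries:
        raise LookupError("no transcripts found for this meeting")
    # Assumption: createdDateTime values are ISO 8601 strings that sort as text.
    latest = max(entries, key=lambda e: e.get("createdDateTime", ""))
    return _get(content_url(organizer_id, meeting_id, latest["id"]), token).decode("utf-8")
```

Paging through long transcript lists and retrying throttled requests are left out to keep the sketch short.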

Pro Tip: VTT is timestamped, which is the whole point. Keep the timestamps when you pass the transcript to the model — they become the postmortem timeline almost for free.

Clean the VTT before the model sees it

Raw VTT is noisy: cue numbers, overlapping speaker labels, filler words. You do not need a fancy parser. A small pre-processing pass that strips cue IDs and collapses each caption into [HH:MM:SS] Speaker: text lines cuts your token count substantially and makes the model’s job easier.

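import re

# Start time of a cue timing line, e.g. "00:01:23.450 --> 00:01:27.000".
TIMING = re.compile(r"(\d{2}:\d{2}:\d{2})(?:\.\d+)?\s*-->")

def vtt_to_lines(vtt: str) -> str:
    """Collapse WebVTT cues into '[HH:MM:SS] Speaker: text' lines."""
    lines = []
    # Normalize CRLF so splitting on blank lines works for every file.
    for block in re.split(r"\n{2,}", vtt.replace("\r\n", "\n").strip()):
        rows = block.splitlines()
        # Blocks without a timing line (the WEBVTT header, notes) are skipped.
        idx = next((i for i, row in enumerate(rows) if TIMING.search(row)), None)
        if idx is None:
            continue
        ts = TIMING.search(rows[idx]).group(1)
        payload = " ".join(rows[idx + 1:])  # keep multi-line captions whole
        speaker = re.search(r"<v ([^>]+)>", payload)
        text = re.sub(r"<[^>]+>", "", payload).strip()
        if text:
            who = speaker.group(1) if speaker else "Unknown"
            lines.append(f"[{ts}] {who}: {text}")
    return "\n".join(lines)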

This is the kind of glue code AI writes well — describe the input and output format and let it draft the regex, then you sanity-check the edge cases against a real transcript.
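If the cleaned output is still too long, a second pass can drop filler words and fold back-to-back captions from the same speaker into one line. The sketch below is one way to do that, under my own assumptions: the function name, the filler list, and the 10-second merge window are all illustrative, and the input is expected in the [HH:MM:SS] Speaker: text shape described above.

```python
import re

LINE = re.compile(r"^\[(\d{2}):(\d{2}):(\d{2})\] ([^:]+): (.*)$")
# Deliberately crude filler list; tune it against your own transcripts.
FILLER = re.compile(r"\b(?:um+|uh+|erm+)\b[,.]?\s*", re.IGNORECASE)


def compact(lines: str, max_gap: int = 10) -> str:
    """Strip filler words and merge same-speaker captions spoken close together."""
    merged = []  # each entry: [timestamp, speaker, text, last_seen_seconds]
    for raw in lines.splitlines():
        m = LINE.match(raw)
        if not m:
            continue  # ignore anything that is not a cleaned caption line
        h, mi, s, who, text = m.groups()
        text = FILLER.sub("", text).strip()
        if not text:
            continue  # the caption was nothing but filler
        now = int(h) * 3600 + int(mi) * 60 + int(s)
        prev = merged[-1] if merged else None
        if prev and prev[1] == who and now - prev[3] <= max_gap:
            prev[2] += " " + text  # keep the first caption's timestamp
            prev[3] = now
        else:
            merged.append([f"{h}:{mi}:{s}", who, text, now])
    return "\n".join(f"[{ts}] {who}: {text}" for ts, who, text, _ in merged)
```

The merge window is kept short on purpose: a wider window saves more tokens but throws away timestamps the timeline depends on.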

Write a prompt that produces a real timeline

The prompt matters more than the model. A vague “summarize this incident” gives you marketing copy. What you want is a structured extraction tied to a postmortem template. I anchor the model to a blameless format and ask for specific sections.

You are drafting an internal, blameless postmortem from an incident
bridge transcript. Use only facts present in the transcript. If a detail
is missing, write "UNKNOWN — confirm with team" rather than guessing.

Produce these sections:
1. Timeline — bulleted, each line "HH:MM — what happened", in order.
2. Detection — how the issue was first noticed.
3. Contributing factors — technical only, no blame on individuals.
4. Remediation — what actions resolved or mitigated the issue.
5. Open follow-ups — anything the team said they'd do later.

Refer to people by role (on-call, DB lead) not by name. Do not invent
metrics, error rates, or root causes not stated in the transcript.

Transcript:
---
{lines}
---

The “write UNKNOWN rather than guessing” instruction is the single most important line. LLMs will happily confabulate a clean root cause that never appeared on the call. Forcing the model to flag gaps turns would-be hallucinations into a checklist for the human editor.
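Here is a small sketch of both ends of the model call: filling the template, then pulling the UNKNOWN flags out of the draft so the editor gets them as a list. The names are hypothetical, and the template string is a stand-in for the full prompt above, not a replacement for it.

```python
# Stand-in template: paste the full prompt text in place of the first line.
PROMPT_TEMPLATE = (
    "<postmortem drafting instructions go here>\n\n"
    "Transcript:\n---\n{lines}\n---\n"
)


def build_prompt(lines: str, template: str = PROMPT_TEMPLATE) -> str:
    """Insert the cleaned transcript into the prompt template."""
    # str.replace rather than str.format, so stray braces in the
    # template or the transcript are left untouched.
    return template.replace("{lines}", lines)


def unknown_checklist(draft: str) -> list[str]:
    """Return every draft line the model flagged as UNKNOWN."""
    return [ln.strip(" -*\t") for ln in draft.splitlines() if "UNKNOWN" in ln]
```

An empty checklist is not proof that the draft is complete. It only means the model did not admit to any gaps, which is a reason to read more carefully, not less.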

Wire it together without leaking tenant data

The flow is: your service fetches the transcript from Graph and cleans it, an LLM drafts the postmortem, and a human reviews. Keep the API credentials out of the model’s context entirely: the model never sees your Graph token, your tenant ID, or the service principal secret. It sees cleaned transcript text and nothing else. If you’re using a hosted model, treat the transcript itself as sensitive: incident bridges contain customer names, internal hostnames, and sometimes credentials people read aloud (they shouldn’t, but they do). Run a redaction pass for obvious secrets before the model call if your data policy requires it.
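A redaction pass can start as a handful of regular expressions. The patterns below are illustrative assumptions, not a vetted secret scanner: they catch spoken phrases like “the password is …” and pasted bearer tokens, and they do nothing for customer names or internal hostnames, which need a list specific to your environment.

```python
import re

# Illustrative patterns only; extend them to match your own secret formats.
PATTERNS = [
    # "password is hunter2", "token: abc123", "api key = xyz"
    (
        re.compile(
            r"(?i)\b(password|passwd|secret|token|api[ _-]?key)\b"
            r"(\s*(?:is\b|=|:)\s*)\S+"
        ),
        r"\1\2[REDACTED]",
    ),
    # Bearer tokens pasted into chat or read out character by character
    (re.compile(r"(?i)\bbearer\s+[A-Za-z0-9._~+/=-]{16,}"), "Bearer [REDACTED]"),
]


def redact(text: str) -> str:
    """Replace obvious secrets with a placeholder before the model call."""
    for pattern, replacement in PATTERNS:
        text = pattern.sub(replacement, text)
    return text
```

This will over-redact harmless sentences such as “the token is expired”. For a postmortem draft that is the right direction to be wrong in.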

I post the resulting draft back into the incident channel as an adaptive card with an “Open editable doc” action rather than dumping a wall of text. The card carries a clear “AI draft — review before sharing” label so nobody mistakes it for a finished artifact.

{
  "type": "AdaptiveCard",
  "version": "1.5",
  "body": [
    {
      "type": "TextBlock",
      "text": "📝 Postmortem draft (AI-generated — review before sharing)",
      "weight": "Bolder",
      "color": "Warning"
    },
    {
      "type": "TextBlock",
      "text": "Generated from the incident bridge transcript. Timeline and follow-ups extracted automatically.",
      "wrap": true
    }
  ],
  "actions": [
    { "type": "Action.OpenUrl", "title": "Open editable draft", "url": "https://example.com/pm/INC-4821" }
  ]
}
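If you build the card in code, a sketch like the following keeps the review label and the link validation in one place. It assumes you post through a Teams webhook that accepts a message envelope with an Adaptive Card attachment; a bot or a Graph call would wrap the card differently. The function name and the HTTPS check are my additions.

```python
import json


def build_card_message(draft_url: str) -> dict:
    """Wrap the postmortem draft card in a webhook-style message envelope."""
    if not draft_url.startswith("https://"):
        raise ValueError("draft link must be an https URL")
    card = {
        "type": "AdaptiveCard",
        "version": "1.5",
        "body": [
            {
                "type": "TextBlock",
                "text": "Postmortem draft (AI-generated, review before sharing)",
                "weight": "Bolder",
                "color": "Warning",
            },
            {
                "type": "TextBlock",
                "text": "Generated from the incident bridge transcript.",
                "wrap": True,
            },
        ],
        "actions": [
            {"type": "Action.OpenUrl", "title": "Open editable draft", "url": draft_url}
        ],
    }
    return {
        "type": "message",
        "attachments": [
            {"contentType": "application/vnd.microsoft.card.adaptive", "content": card}
        ],
    }


# Serialize before sending; the HTTP call itself is left to your tooling.
payload = json.dumps(build_card_message("https://example.com/pm/INC-4821"))
```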

Keep a human in the loop, always

This workflow saves the worst hour of postmortem writing — the blank-page hour. It does not replace the engineer who owns the writeup. The model can mislabel who did what, miss the real contributing factor because it was never said out loud, or soften a hard truth that the team needs to confront. Treat its output as a first pass from a fast but inexperienced colleague: useful, fast, and absolutely requiring review before it represents your team.

The discipline that keeps this safe is the same discipline that keeps any AI workflow safe. The AI moves quickly and a human signs off before anything reaches a tenant, a channel, or a leadership audience. Verify what it claims against the transcript, never hand it real credentials, and keep the Graph permissions tightly scoped.
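Part of that verification can be automated before the reviewer ever opens the draft. The sketch below flags timeline entries whose HH:MM never occurs in the cleaned transcript. It is a hypothetical helper, and it assumes the draft reuses the transcript’s own timestamps; if the model converted them to wall-clock time, every entry would be flagged.

```python
import re


def unverified_timeline_entries(draft: str, transcript_lines: str) -> list[str]:
    """Return draft timeline lines whose HH:MM does not occur in the transcript."""
    # Collect the HH:MM prefix of every "[HH:MM:SS]" transcript timestamp.
    seen = {
        m.group(1)
        for m in re.finditer(r"^\[(\d{2}:\d{2}):\d{2}\]", transcript_lines, re.MULTILINE)
    }
    suspicious = []
    for line in draft.splitlines():
        # Timeline bullets are expected to start with an optional bullet and HH:MM.
        m = re.match(r"^\s*[-*]?\s*(\d{2}:\d{2})\b", line)
        if m and m.group(1) not in seen:
            suspicious.append(line.strip())
    return suspicious
```

A clean result only shows that the timestamps exist in the transcript. Whether the event described at each one actually happened still takes a human reading both.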

If you’re building out incident tooling around Teams more broadly, our incident-response dashboard and the Microsoft Teams category cover the surrounding pieces, and the prompt library has starting points you can adapt for the extraction prompt above. For the model side, Claude handles long transcripts well thanks to its large context window.

Where this pays off

The first time you run this on a real incident, expect the draft to be roughly 80 percent right and 20 percent wrong, and the 20 percent is exactly the stuff you’d have argued about in the retro anyway. That’s the win: the AI surfaces the disagreements faster, and your team spends the meeting deciding what to do instead of reconstructing what happened. Start with low-severity incidents, tune the prompt against a few real transcripts, and only then point it at the events that matter.