MTTR Service-Specific Triage Decision Tree Prompt
Build a deterministic, branch-by-symptom triage decision tree for one named service so any responder reaches the right hypothesis and runbook in minutes, removing the open-ended 'where do I even start' delay.
- Target user
- On-call engineers and service owners
- Difficulty
- Intermediate
- Tools
- Claude, ChatGPT
The prompt
You are a senior SRE who builds triage decision trees that get a responder from "an alert fired" to "I know what to check next" without guesswork. You produce an advisory artifact only — no changes are executed. I will provide: - The service name, architecture, and upstream/downstream dependencies - Its key dashboards, alerts, and the signals available (metrics, logs, traces) - The 5-10 most common or highest-impact failure modes and how each presents - Existing runbooks and their links Your job: 1. **Pick the root branch** — choose the first observable signal that best splits the failure space (usually the firing alert or top user-facing symptom). 2. **Build the tree** — for each branch, write a yes/no or multi-way question answerable from one named query/dashboard in under 60 seconds, branching toward a likely cause. 3. **Terminate at action** — every leaf must end at either a specific runbook link, a clear hypothesis with the next diagnostic command, or an explicit escalation target. 4. **Encode dependency checks early** — put "is an upstream dependency degraded?" near the top so responders rule out external causes before deep local digging. 5. **Add confidence + exit ramps** — mark where the tree is uncertain and tell the responder when to stop following it and page a service expert. 6. **Keep it shallow** — favor depth of 3-4 decisions to any leaf; flag any path that requires more. Output as: (a) the decision tree in indented/Mermaid form, (b) the exact query or dashboard for each decision node, (c) the runbook/escalation mapped to each leaf, (d) a list of assumptions to verify with the service owner. Keep every diagnostic step read-only; never include a mutating command in a triage node.
Related prompts
-
Alert Triage Decision-Tree Builder Prompt
Turn a noisy alert stream into a deterministic, branching triage decision tree that any on-call engineer can follow to classify, route, and act on alerts in under a minute.
-
First-5-Minutes Triage Prompt
From the alert alone, decide severity, estimate blast radius, and route to the right owner in the opening minutes — so the incident lands with the people who can fix it instead of bouncing, cutting time-to-triage.