DevOps AI ToolKit

MTTR Diagnosis Dashboard Design Prompt

Design a purpose-built incident-diagnosis dashboard that answers 'what is broken and where' in the first minute, so responders stop tab-hopping across a dozen dashboards during an active incident.

Target user: SREs and observability engineers
Difficulty: Intermediate
Tools: Claude, ChatGPT

The prompt

You are a senior observability engineer who designs dashboards for fast incident diagnosis, not for browsing. The dashboard you design should let a responder localize the fault in under a minute. You produce a design spec only — you do not build or modify dashboards.

I will provide:
- The service, its dependencies, and the SLIs that define "healthy"
- The existing dashboards responders currently jump between during incidents
- The metric/log/trace sources available and any naming conventions
- Recent incidents where diagnosis was slow because the data was scattered or unclear

Your job:

1. **Define the diagnostic question order** — list the questions a responder asks in sequence (Is it us or upstream? Which component? Which dependency? Which deploy?) and design panels to answer them top to bottom.
2. **Lead with the answer-first panels** — put RED/USE summary tiles and a clear "service health vs upstream health" comparison at the top.
3. **Make causality visible** — include panels that correlate the symptom with deploys, config changes, traffic shifts, and dependency latency on a shared time axis.
4. **Cut clutter** — recommend which existing panels to drop or move to a drill-down, and justify each removal by diagnostic value.
5. **Annotate for context** — specify deploy/change annotations, threshold markers, and links from each panel to the relevant runbook or drill-down.
6. **Specify defaults** — set the default time window, refresh, and template variables so the dashboard opens incident-ready.

Output as: (a) the diagnostic-question sequence, (b) a panel-by-panel layout spec (top to bottom) with the query intent for each, (c) panels to remove/demote, (d) annotation and linking plan.

Keep all queries read-only and call out any panel that could be expensive to render during an incident.
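
To illustrate the kind of output the prompt asks for (a diagnostic-question sequence, a top-to-bottom panel layout with query intent, flagged expensive panels, and incident-ready defaults), here is a minimal Python sketch. It is not a dashboarding API: the class names, fields, the `checkout` service, the panel titles, and the default window and refresh values are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Panel:
    title: str
    question: str            # the diagnostic question this panel answers
    query_intent: str        # plain-language intent, not an actual query
    expensive: bool = False  # True if the panel may be slow to render mid-incident


@dataclass
class DashboardSpec:
    service: str
    default_window: str = "1h"  # assumed default; tune per service
    refresh: str = "30s"        # assumed default; tune per service
    template_vars: list[str] = field(default_factory=list)
    panels: list[Panel] = field(default_factory=list)  # ordered top to bottom

    def question_sequence(self) -> list[str]:
        """Return the diagnostic questions in the order the panels answer them."""
        seen: list[str] = []
        for panel in self.panels:
            if panel.question not in seen:
                seen.append(panel.question)
        return seen

    def expensive_panels(self) -> list[str]:
        """Return titles of panels flagged as costly to render during an incident."""
        return [panel.title for panel in self.panels if panel.expensive]


# Hypothetical spec for an imaginary "checkout" service.
spec = DashboardSpec(
    service="checkout",
    template_vars=["environment", "region"],
    panels=[
        Panel("RED summary tiles", "Is it us or upstream?",
              "request rate, error ratio, and latency for the service"),
        Panel("Service vs upstream health", "Is it us or upstream?",
              "service SLIs next to upstream SLIs on one time axis"),
        Panel("Errors by component", "Which component?",
              "error ratio broken down by component"),
        Panel("Dependency latency", "Which dependency?",
              "latency per downstream dependency", expensive=True),
        Panel("Deploy and config overlay", "Which deploy?",
              "symptom metric with deploy and config-change annotations"),
    ],
)

print(spec.question_sequence())
print(spec.expensive_panels())
```

Keeping the panels in a single ordered list means the question sequence is derived from the layout itself, so the two cannot drift apart when panels are reordered or demoted.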
