Grafana Incident Timeline Dashboard Prompt
Build a single-pane incident timeline dashboard in Grafana correlating annotations, deploys, alerts, and key signals on one shared time axis.
- Target user: SREs building an incident war-room dashboard
- Difficulty: Intermediate
- Tools: Claude, ChatGPT
The prompt
You are a senior SRE who builds war-room incident timeline dashboards that make root cause obvious.

I will provide:
- The service and its key signals (latency, errors, saturation)
- Sources of events (deploys, alerts, feature flags)
- The incident types you respond to

Your job:
1. **Shared time axis**: put the golden-signal panels (error rate, latency p95, saturation) in a row so they align on one time range.
2. **Annotations**: add annotation queries for deploys, alert state changes (`ALERTS`), and manual incident markers so events overlay every panel.
3. **Region annotations**: mark incident start/end as a region so the whole window is visually bounded.
4. **Alert overlay**: query firing alerts and render them as a state-timeline so you can see what fired when.
5. **Deploy correlation**: overlay deploy events (from CI or a deploys table) to spot deploy-induced regressions instantly.
6. **Drill-downs**: add data links from panels to logs/traces filtered to the incident window.
7. **Time controls**: default to a tight relative window and make it easy to zoom to the incident.
8. **Provisioning**: deliver the dashboard JSON with annotation queries and data links encoded.

Mark DESTRUCTIVE: none for viewers; avoid pointing annotation queries at expensive sources on short refresh during an active incident.

---
Service/signals: [DESCRIBE]
Event sources: [DESCRIBE]
Incident types: [DESCRIBE]
Why this prompt works
During an incident, the fastest path to root cause is a single pane where signals, deploys, and alerts share one time axis. Most dashboards miss the deploy and alert annotations that reveal causality. This prompt builds the golden-signal row plus overlaid annotations and time-filtered drill-downs, so “what changed at 14:03” answers itself.
How to use it
- List the golden signals so the assistant lays out the aligned panel row.
- Name event sources (CI deploys, alert rules) so it wires annotation queries.
- Point at your logs/traces sources so drill-down links carry the time filter.
- Ask for dashboard JSON with annotations and data links encoded.
Useful commands
# Add a deploy annotation via the Grafana Annotations API (from CI)
curl -X POST http://localhost:3000/api/annotations \
  -H "Authorization: Bearer $GRAFANA_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"tags":["deploy","checkout"],"text":"Deploy v2.4.1","time":1720000000000}'

# List annotations in a window to verify overlays
curl -s -H "Authorization: Bearer $GRAFANA_TOKEN" \
  "http://localhost:3000/api/annotations?tags=deploy&from=1719990000000&to=1720010000000"
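The prompt asks for incident start/end to be marked as a region, but the commands above only create point annotations. The sketch below builds a region payload by setting `timeEnd` alongside `time`, both in epoch milliseconds. The helper names, tags, and sample timestamps are hypothetical; the resulting JSON would be posted to the same `/api/annotations` endpoint used above.

```python
import json
from datetime import datetime, timezone


def to_epoch_ms(ts: datetime) -> int:
    """Convert a timezone-aware datetime to epoch milliseconds."""
    if ts.tzinfo is None:
        # Naive datetimes are a common cause of mislabeled incident windows.
        raise ValueError("timestamp must be timezone-aware")
    return int(ts.timestamp() * 1000)


def region_annotation(start: datetime, end: datetime, text: str, tags: list[str]) -> dict:
    """Build an annotation body whose time/timeEnd pair bounds the incident window."""
    start_ms, end_ms = to_epoch_ms(start), to_epoch_ms(end)
    if end_ms < start_ms:
        raise ValueError("incident end precedes incident start")
    return {"time": start_ms, "timeEnd": end_ms, "tags": tags, "text": text}


if __name__ == "__main__":
    # Hypothetical incident window, for illustration only.
    payload = region_annotation(
        datetime(2024, 7, 3, 14, 3, tzinfo=timezone.utc),
        datetime(2024, 7, 3, 14, 41, tzinfo=timezone.utc),
        "Incident: checkout latency",
        ["incident", "checkout"],
    )
    print(json.dumps(payload))
```

Rejecting naive datetimes up front is a deliberate choice here: it guards against the "region annotation timestamps wrong" finding listed further down.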
Example config
// Dashboard annotations: deploys + firing alerts overlaid on all panels
{
  "annotations": {
    "list": [
      {
        "name": "Deploys",
        "datasource": { "type": "grafana", "uid": "-- Grafana --" },
        "iconColor": "blue",
        "tags": ["deploy"],
        "type": "tags",
        "enable": true
      },
      {
        "name": "Firing Alerts",
        "datasource": { "type": "prometheus", "uid": "prom-main" },
        "expr": "ALERTS{alertstate=\"firing\", service=\"checkout\"}",
        "iconColor": "red",
        "titleFormat": "{{alertname}}",
        "enable": true
      }
    ]
  }
}
// Panel data link: jump to logs filtered to the incident window
{
  "title": "p95 latency",
  "links": [
    {
      "title": "Logs for this window",
      "url": "/explore?left={\"datasource\":\"loki-main\",\"queries\":[{\"expr\":\"{service=\\\"checkout\\\"} |= \\\"error\\\"\"}],\"range\":{\"from\":\"${__from}\",\"to\":\"${__to}\"}}"
    }
  ]
}
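Hand-escaping the Explore state inside a JSON string is easy to get wrong. As a rough sketch, the link can be generated instead: serialize the state, percent-encode it, then restore the `${__from}` and `${__to}` variables so Grafana can still interpolate them. This assumes the same `left` parameter shape as the example above, which may differ between Grafana versions; `explore_link` and the placeholder tokens are hypothetical names.

```python
import json
from urllib.parse import quote

# Placeholder tokens that survive percent-encoding unchanged (letters and underscores only).
FROM_TOKEN = "__FROM_PLACEHOLDER__"
TO_TOKEN = "__TO_PLACEHOLDER__"


def explore_link(datasource: str, logql: str) -> str:
    """Build an Explore URL that carries the dashboard time range."""
    state = {
        "datasource": datasource,
        "queries": [{"expr": logql}],
        "range": {"from": FROM_TOKEN, "to": TO_TOKEN},
    }
    # Compact JSON, then percent-encode every reserved character.
    encoded = quote(json.dumps(state, separators=(",", ":")), safe="")
    # Put the Grafana time variables back in literal form after encoding.
    encoded = encoded.replace(FROM_TOKEN, "${__from}").replace(TO_TOKEN, "${__to}")
    return "/explore?left=" + encoded


if __name__ == "__main__":
    print(explore_link("loki-main", '{service="checkout"} |= "error"'))
```

The returned string can be pasted into the link's `url` field with no further escaping, since it contains no double quotes or backslashes.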
Common findings this catches
- Hidden root cause → no deploy annotations overlaid.
- Broken causality → panels and annotations rendered in different time zones.
- Cluttered timeline → too many annotation sources at once.
- Unbounded drill-down → log link missing ${__from}/${__to}.
- Mislabeled window → region annotation timestamps wrong.
- Added load → expensive annotation query on short refresh.
- Missing alert context → no ALERTS overlay.
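Several of these findings can be checked mechanically against the dashboard JSON before an incident. The function below is a minimal sketch, assuming annotation entries carry top-level `tags` and `expr` keys and panels carry a `links` list, as in the example config; `lint_dashboard` and the `MAX_ANNOTATION_SOURCES` threshold are hypothetical, not part of Grafana.

```python
# Arbitrary illustrative threshold for "too many annotation sources at once".
MAX_ANNOTATION_SOURCES = 4


def lint_dashboard(dashboard: dict) -> list[str]:
    """Return human-readable findings for a dashboard JSON model."""
    findings = []
    annotations = dashboard.get("annotations", {}).get("list", [])
    enabled = [a for a in annotations if a.get("enable", True)]

    if not any("deploy" in a.get("tags", []) for a in enabled):
        findings.append("Hidden root cause: no deploy annotations overlaid")
    if not any("ALERTS" in a.get("expr", "") for a in enabled):
        findings.append("Missing alert context: no ALERTS overlay")
    if len(enabled) > MAX_ANNOTATION_SOURCES:
        findings.append("Cluttered timeline: too many annotation sources at once")

    for panel in dashboard.get("panels", []):
        panel_title = panel.get("title", "")
        for link in panel.get("links", []):
            url = link.get("url", "")
            # A drill-down is bounded only if both time variables are present.
            if "${__from}" not in url or "${__to}" not in url:
                link_title = link.get("title", "")
                findings.append(
                    f"Unbounded drill-down: link '{link_title}' on panel "
                    f"'{panel_title}' is missing ${{__from}}/${{__to}}"
                )
    return findings
```

Run it on the exported dashboard model (for example, the result of `json.load` on the provisioned file); an empty list means none of the checked findings apply. Time zone mismatches, wrong region timestamps, and query cost are not covered by this sketch.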
When to escalate
- Recurring incident patterns visible in the timeline → post-incident review.
- Annotation/alert datasource unreliable during incidents → reliability of the monitoring stack itself.
- Cross-service incident correlation → build a higher-level service-map view.
Related prompts
- Grafana Data Links and Drilldowns Prompt: Wire Grafana data links and drilldowns so panels jump to related dashboards, Explore, or external tools carrying filter context.
- Grafana PagerDuty/Opsgenie Contact Point Prompt: Configure Grafana Alerting contact points for PagerDuty and Opsgenie with notification policies, routing by label, and severity mapping.
- Grafana State Timeline and Status History Prompt: Design Grafana State timeline and Status history panels to visualize discrete states like up/down, deploy phases, and health over time.