Alertmanager Routing Tree Matcher Design Review Prompt
Design or review an Alertmanager routing tree — receivers, matchers, group_by, continue, and timers — so every alert reaches the right team exactly once without falling through to a catch-all.
- Target user: SREs and on-call platform owners
- Difficulty: Advanced
- Tools: Claude, ChatGPT
The prompt
You are a senior SRE who designs and audits Alertmanager routing trees for multi-team environments.

I will provide:
- The current `route` tree and `receivers` (or the teams/services that need routing if greenfield)
- The labels available on alerts (team, service, severity, env, cluster) and which should drive routing
- The notification targets per tier (page vs Slack vs ticket) and quiet-hours expectations
- Symptoms: alerts hitting the wrong team, double-notifies, or everything landing in the default receiver

Your job:
1. **Map intent to labels**: confirm which alert labels exist and are reliable enough to route on; flag any routing key that isn't consistently set by the alerting rules.
2. **Order the tree**: structure sibling routes most-specific first, with a top-level `group_by` and per-route overrides, and a deliberate default receiver as the safety net.
3. **Get matchers right**: use the current `matchers:` syntax (not the deprecated `match`/`match_re`), account for regex matchers being fully anchored (`=~ "api"` does not match `api-gateway`; use `api.*`), and explain when `continue: true` is needed vs harmful.
4. **Tune timers**: set `group_wait`, `group_interval`, and `repeat_interval` per route to balance batching against alert latency.
5. **Diagnose the symptom**: trace why an alert reaches the wrong or default receiver (matcher miss, missing `continue`, fall-through ordering).
6. **Prove it**: provide `amtool config routes test` invocations with sample label sets to assert each alert lands where intended.

Output as:
(a) the corrected `route`/`receivers` YAML,
(b) a label-to-receiver routing table,
(c) the `amtool` test commands,
(d) notes on every `continue` you set and why.
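For illustration only (this is not part of the prompt text): a minimal sketch of the kind of tree that output (a) might contain, with the matching `amtool` checks from output (c) shown as comments. The team names, receiver names, label values, timer values, and the `alertmanager.yml` file name are all hypothetical assumptions, and the receivers are left without integrations to keep the fragment short.

```yaml
# Hypothetical sketch: names and timer values are illustrative, not recommendations.
route:
  receiver: default-catchall          # deliberate safety net; should rarely fire
  group_by: ['alertname', 'cluster', 'service']
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 4h
  routes:
    # Most specific first: critical prod alerts for the payments team page on-call.
    - matchers:
        - team = "payments"
        - severity = "critical"
        - env = "prod"
      receiver: payments-pager
      group_wait: 10s                 # shorter batching window for pages
      repeat_interval: 1h
      continue: true                  # keep evaluating siblings so Slack also gets a copy
    # All other payments alerts (and the copy from above) go to Slack.
    - matchers:
        - team = "payments"
      receiver: payments-slack
    # Regex matchers are fully anchored: "db" alone would not match "db-replica".
    - matchers:
        - team = "platform"
        - service =~ "db(-.*)?"
      receiver: platform-slack

receivers:
  - name: default-catchall
  - name: payments-pager
  - name: payments-slack
  - name: platform-slack

# Checks one might run against this file (assumed to be saved as alertmanager.yml):
#
#   amtool config routes test --config.file=alertmanager.yml \
#     --verify.receivers=payments-pager,payments-slack \
#     team=payments severity=critical env=prod
#
#   amtool config routes test --config.file=alertmanager.yml \
#     --verify.receivers=platform-slack \
#     team=platform service=db-replica severity=warning
#
#   amtool config routes test --config.file=alertmanager.yml \
#     --verify.receivers=default-catchall \
#     team=unknown severity=warning
```

In this sketch the single `continue: true` is intentional: a critical production payments alert notifies both the pager and the Slack channel, while every other route stops at its first match. The third check shows that an alert with an unrecognized `team` label falls through to the default receiver, which is the symptom the prompt asks the model to diagnose.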
Related prompts
- Alertmanager Routing, Grouping & Receivers Prompt
  Design Alertmanager routes: receivers (Slack, PagerDuty), grouping, inhibition, repeat intervals, mute timings.
- Alertmanager Routing Tree Dry-Run Testing Prompt
  Validate an Alertmanager routing tree before deploy by simulating sample alerts through `amtool config routes test`, catching misrouted pages and unreachable receivers.