Skip to content
CloudOps
Newsletter
All prompts
AI for Prometheus & Monitoring Difficulty: Intermediate ClaudeChatGPT

Alertmanager Time Intervals & Mute Schedules Prompt

Design business-hours routing, maintenance-window muting, and follow-the-sun on-call handoffs in Alertmanager using time_intervals, mute_time_intervals, and active_time_intervals — without dropping real pages.

Target user
Platform and on-call engineers tuning when (not just where) alerts route
Difficulty
Intermediate
Tools
Claude, ChatGPT

The prompt

You are an SRE who has built time-aware Alertmanager routing for global teams, balancing "don't wake people at 3am for a warning" against "never silently swallow a SEV1."

I will provide:
- My current Alertmanager route tree and receivers
- Team timezones and on-call rotation hours
- Recurring maintenance windows (e.g., DB patching Sundays 02:00–04:00 UTC)
- Which alert classes are business-hours-only vs. always-page

Your job:

1. **time_intervals primer** — explain the modern `time_intervals` block and how `mute_time_intervals` and `active_time_intervals` attach to routes. Note the deprecation of the old top-level `mute_time_intervals` form and the exact UTC/`location` semantics.

2. **Build the named intervals** I need: `business-hours`, `weekends`, `holidays`, and per-region windows. Use `weekdays`, `times`, `months`, `days_of_month`, and `location` correctly, and explain the inclusive/exclusive edge cases (e.g., `times` end is exclusive).

3. **Maintenance muting** — wire a `maintenance-window` interval onto the DB-patching subtree so expected churn is muted, but show how to keep a dead-man's-switch and SEV1 escalation OUTSIDE the mute so a real outage during the window still pages.

4. **Business-hours vs. 24/7 split** — restructure the route tree so low-severity/warning alerts only notify during `business-hours` (via `active_time_intervals`), while criticals always route. Show the matcher logic.

5. **Follow-the-sun** — route the same alert to APAC, EMEA, or AMER receivers based on the current active interval, with a fallback receiver so no window gap leaves alerts unrouted.

6. **Holiday & DST traps** — call out daylight-saving pitfalls with `location`, and recommend how to handle floating holidays without editing config weekly.

7. **Validation** — provide `amtool` commands (or a test harness) to assert that a sample alert at a given timestamp routes/mutes exactly as intended, plus a CI check.

Output as: (a) the full `time_intervals` + route tree YAML, (b) an annotated diff from my current config, (c) a routing-decision table (alert × time → receiver/muted), (d) amtool test commands.

Bias toward: criticals are never muted by accident; every mute paired with an explicit escape hatch; explicit timezone reasoning.
Newsletter

Free: the DevOps AI Incident-Triage Cheat Sheet

Subscribe and we’ll send you the one-page cheat sheet — plus weekly AI prompts, automation ideas, and tool reviews for infrastructure engineers. One email a week. No spam, unsubscribe anytime.

  • AI Incident-Triage Cheat Sheet (PDF)
  • Access to 1,603 DevOps AI prompts
  • One practical workflow email per week