Grafana Alert Silences and Mute Timings Prompt
Suppress Grafana alert noise during maintenance and off-hours using silences and mute timings without dropping real incidents.
- Target user: On-call engineers and release managers running maintenance windows
- Difficulty: Intermediate
- Tools: Claude, ChatGPT
The prompt
You are a senior on-call engineer who tames Grafana alert noise with silences (ad-hoc, matcher-based) and mute timings (recurring schedules attached to notification policies).

I will provide:
- The alerts firing and their labels
- The window (one-off maintenance vs recurring off-hours)
- Who must still be paged for true emergencies

Your job:
1. **Pick the mechanism**: silences for one-off, time-bounded suppression by label matcher; mute timings for recurring schedules (nights, weekends, deploy windows) referenced from notification policies.
2. **Write tight matchers**: match on `alertname`, `namespace`, `cluster`, or a `maintenance="true"` label; never a broad `severity=~".*"` that mutes everything.
3. **Bound the time**: set explicit `startsAt`/`endsAt` on silences; short windows over open-ended. Prefer expiry over manual cleanup.
4. **Define mute timings**: use `time_intervals` with `times`, `weekdays`, `days_of_month`, `months`, and `location` (IANA tz) so schedules honor local time.
5. **Attach to policy**: reference the mute timing from a notification policy route via `mute_time_intervals`, scoped to the matching child route only.
6. **Preserve emergencies**: keep a sibling route WITHOUT the mute timing for `severity=critical` so pages still fire during muted windows.
7. **Audit**: list active silences, flag any expiring never or created by departed users; document who/why in the silence comment.
8. **Provision as code**: move recurring mute timings into provisioning YAML so they survive restarts and reviews.

Mark DESTRUCTIVE: a broad silence matcher that mutes critical pages, an open-ended silence with no endsAt, or a mute timing on the root policy suppressing everything.

---

Alerts firing: [DESCRIBE]
Window (one-off / recurring): [DESCRIBE]
Must still page for: [DESCRIBE]
Why this prompt works
Silences and mute timings solve different problems: one is ad-hoc and matcher-based, the other is a recurring schedule bound to routing. Engineers routinely reach for a broad silence and accidentally mute production pages. This prompt separates the two, forces narrow matchers with hard expiry, and insists on a non-muted critical path so maintenance windows never swallow a real outage.
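To make "narrow matchers" concrete, here is a minimal Python sketch of a pre-flight check that could run before a silence is created. The function names (`is_catch_all`, `lint_matchers`) and the probe-value heuristic are assumptions made for illustration, not part of Grafana. The sketch also uses Python's `re` module, whose syntax differs slightly from the Go regular expressions Alertmanager uses, so treat the result as a hint rather than a guarantee.

```python
import re

# Hypothetical probe values: if an anchored regex accepts all of these,
# the sketch treats it as a catch-all. This is a heuristic, not a proof.
_PROBES = ["", "a", "critical", "warning", "zz-unlikely-value-93"]


def is_catch_all(matcher: dict) -> bool:
    """Return True if a positive regex matcher accepts every probe value."""
    if not matcher.get("isRegex", False):
        return False
    if not matcher.get("isEqual", True):
        return False  # negative matchers are out of scope for this sketch
    pattern = re.compile(matcher["value"])
    # Alertmanager regex matchers are anchored, so fullmatch is the closest analogue.
    return all(pattern.fullmatch(p) is not None for p in _PROBES)


def lint_matchers(matchers: list) -> list:
    """Return findings for a proposed silence's matchers (empty list means OK)."""
    if not matchers:
        return ["DESTRUCTIVE: no matchers, silence would apply to every alert"]
    catch_all = [m["name"] for m in matchers if is_catch_all(m)]
    if len(catch_all) == len(matchers):
        names = ", ".join(catch_all)
        return [f"DESTRUCTIVE: every matcher is a catch-all ({names})"]
    # Matchers are ANDed, so a catch-all next to a narrow matcher only adds noise.
    return [f"WARNING: {name} matcher is a catch-all and adds no scoping"
            for name in catch_all]


broad = [{"name": "severity", "value": ".*", "isRegex": True}]
narrow = [{"name": "namespace", "value": "payments", "isRegex": False}]
print(lint_matchers(broad))
print(lint_matchers(narrow))
```

Because matchers in a silence are combined with AND, the check only escalates to DESTRUCTIVE when nothing in the list narrows the scope.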
How to use it
- State whether the window is one-off or recurring to pick silence vs mute timing.
- List the labels so matchers are precise.
- Name the alerts that must still page so a critical route stays live.
- Provision recurring timings as code rather than creating them by hand in the UI for each incident.
Useful commands
# List active silences (Alertmanager-compatible API)
curl -s -H "Authorization: Bearer $TOKEN" \
  http://localhost:3000/api/alertmanager/grafana/api/v2/silences | jq

# Create a silence via API
curl -X POST -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  http://localhost:3000/api/alertmanager/grafana/api/v2/silences -d '{
    "matchers":[{"name":"namespace","value":"payments","isRegex":false}],
    "startsAt":"2026-07-03T22:00:00Z","endsAt":"2026-07-04T02:00:00Z",
    "createdBy":"jjoyner","comment":"DB migration maintenance"}'

# Fetch mute timings from provisioning
curl -s -H "Authorization: Bearer $TOKEN" \
  http://localhost:3000/api/v1/provisioning/mute-timings | jq
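The JSON body for the create call can also be generated rather than typed by hand, which makes it harder to forget `endsAt`. The following is a minimal Python sketch using only the standard library. The helper name `build_silence` and the 12-hour cap in `MAX_HOURS` are assumptions standing in for whatever limit a team agrees on; the field names mirror the curl example above.

```python
import json
from datetime import datetime, timedelta, timezone

MAX_HOURS = 12  # assumed team policy, not a Grafana limit


def build_silence(matchers, start, hours, created_by, comment):
    """Build a silence payload that always carries an explicit, bounded endsAt."""
    if start.tzinfo is None:
        raise ValueError("start must be timezone-aware")
    if not 0 < hours <= MAX_HOURS:
        raise ValueError(f"duration must be between 0 and {MAX_HOURS} hours")
    if not comment.strip():
        raise ValueError("comment is required: record who and why")
    end = start + timedelta(hours=hours)
    fmt = "%Y-%m-%dT%H:%M:%SZ"  # UTC timestamps, as in the curl example
    return {
        "matchers": matchers,
        "startsAt": start.astimezone(timezone.utc).strftime(fmt),
        "endsAt": end.astimezone(timezone.utc).strftime(fmt),
        "createdBy": created_by,
        "comment": comment,
    }


payload = build_silence(
    [{"name": "namespace", "value": "payments", "isRegex": False}],
    datetime(2026, 7, 3, 22, 0, tzinfo=timezone.utc),
    4,
    "jjoyner",
    "DB migration maintenance",
)
print(json.dumps(payload, indent=2))
```

The printed JSON can be passed to the same endpoint with `-d`. Rejecting naive datetimes up front avoids silently treating a local time as UTC.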
Example config
Provisioned mute timing plus a routing policy that spares criticals:
apiVersion: 1
muteTimes:
  - orgId: 1
    name: off-hours
    time_intervals:
      # A time range cannot cross midnight (end_time must be later than
      # start_time), so the overnight window is split into two ranges.
      - times:
          - start_time: "20:00"
            end_time: "24:00"
          - start_time: "00:00"
            end_time: "08:00"
        weekdays: ["monday:friday"]
        location: "America/New_York"
policies:
  - orgId: 1
    receiver: slack-default
    routes:
      - receiver: slack-noncritical
        matchers: ["severity = warning"]
        mute_time_intervals: ["off-hours"]
      - receiver: pagerduty-critical
        matchers: ["severity = critical"] # NOT muted, always pages
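To reason about what this configuration does at a given moment, a small simulation can help. The Python sketch below is a simplified model, not Grafana's implementation: it assumes first-match routing over exact label equality, represents the overnight window as two same-day ranges, and uses a fixed UTC-4 offset as a stand-in for `America/New_York` in July. All names (`MUTE_TIMINGS`, `ROUTES`, `resolve`) are hypothetical.

```python
from datetime import datetime, timedelta, timezone, tzinfo

# Simplified model of the off-hours mute timing: two same-day ranges,
# Monday (0) through Friday (4).
MUTE_TIMINGS = {
    "off-hours": {
        "times": [("20:00", "24:00"), ("00:00", "08:00")],
        "weekdays": {0, 1, 2, 3, 4},
    }
}

# Child routes in order; the first route whose labels all match wins.
ROUTES = [
    {"receiver": "slack-noncritical", "match": {"severity": "warning"}, "mute": ["off-hours"]},
    {"receiver": "pagerduty-critical", "match": {"severity": "critical"}, "mute": []},
]


def _minutes(hhmm: str) -> int:
    hours, minutes = hhmm.split(":")
    return int(hours) * 60 + int(minutes)


def in_interval(now: datetime, interval: dict, tz: tzinfo) -> bool:
    """True if `now`, converted to the schedule's timezone, is inside the interval."""
    local = now.astimezone(tz)
    if local.weekday() not in interval["weekdays"]:
        return False
    minute = local.hour * 60 + local.minute
    return any(_minutes(s) <= minute < _minutes(e) for s, e in interval["times"])


def resolve(labels: dict, now: datetime, tz: tzinfo):
    """Return (receiver, notifies) for an alert's labels at time `now`."""
    for route in ROUTES:
        if all(labels.get(k) == v for k, v in route["match"].items()):
            muted = any(in_interval(now, MUTE_TIMINGS[n], tz) for n in route["mute"])
            return route["receiver"], not muted
    return "slack-default", True  # root policy receiver, never muted here


EDT = timezone(timedelta(hours=-4))  # stand-in for America/New_York in summer
now = datetime(2026, 7, 6, 21, 0, tzinfo=timezone.utc)  # Monday 17:00 in New York

print(resolve({"severity": "warning"}, now, EDT))           # evaluated in local time
print(resolve({"severity": "warning"}, now, timezone.utc))  # evaluated as if UTC
print(resolve({"severity": "critical"}, now, timezone.utc))
```

The first two calls differ only in the timezone used to evaluate the schedule, which is the effect a wrong `location` has. In real use, `zoneinfo.ZoneInfo("America/New_York")` would replace the fixed offset so daylight saving is handled.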
Common findings this catches
- Muted critical pages → matcher too broad or root-level mute timing.
- Permanent blind spots → silence with no `endsAt`.
- Timezone drift → wrong `location` shifts the window.
- Orphaned silences → created by departed users, never expire.
- False confidence → silence hides notification but alert still fires.
- Maintenance leaks → no `maintenance="true"` label to target cleanly.
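Several of these findings can be checked mechanically against the output of the silence listing call. The Python sketch below assumes each silence is a dict with `id`, `status.state`, `createdBy`, `endsAt`, and `comment` fields, as returned by the Alertmanager-compatible API. The `audit` function, the seven-day threshold, and the sample data are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def audit(silences, now, departed, max_remaining=timedelta(days=7)):
    """Return (silence_id, finding) pairs for silences that need a human look."""
    findings = []
    for s in silences:
        if s.get("status", {}).get("state") == "expired":
            continue  # already gone, nothing to clean up
        ends = datetime.fromisoformat(s["endsAt"].replace("Z", "+00:00"))
        if ends - now > max_remaining:
            findings.append((s["id"], "expires far in the future"))
        if s.get("createdBy") in departed:
            findings.append((s["id"], "created by departed user"))
        if not s.get("comment", "").strip():
            findings.append((s["id"], "missing comment"))
    return findings


# Hypothetical sample data in the shape of the silences listing.
sample = [
    {"id": "a1", "status": {"state": "active"}, "createdBy": "jjoyner",
     "endsAt": "2026-07-04T02:00:00Z", "comment": "DB migration maintenance"},
    {"id": "b2", "status": {"state": "active"}, "createdBy": "old-user",
     "endsAt": "2099-01-01T00:00:00Z", "comment": ""},
]
now = datetime(2026, 7, 3, 23, 0, tzinfo=timezone.utc)
for silence_id, finding in audit(sample, now, departed={"old-user"}):
    print(silence_id, finding)
```

A silence that "never expires" usually shows up as an `endsAt` far in the future, which is why the sketch compares remaining lifetime against a threshold instead of looking for a missing field.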
When to escalate
- Recurring maintenance calendars spanning many teams — alerting platform owner.
- Compliance requirements on suppression audit trails — security/GRC.
- Repeated noise that needs rule tuning rather than muting — service owner.
Related prompts
- Grafana Alerting Notification Templates Prompt: Author custom Grafana alert notification message templates with Go templating for contact points (Slack, email, PagerDuty).
- Grafana Incident Timeline Dashboard Prompt: Build a single-pane incident timeline dashboard in Grafana correlating annotations, deploys, alerts, and key signals on one shared time axis.
- Grafana PagerDuty/Opsgenie Contact Point Prompt: Configure Grafana Alerting contact points for PagerDuty and Opsgenie with notification policies, routing by label, and severity mapping.