Skip to content
DevOps AI ToolKit
Newsletter
All prompts
AI for Grafana Difficulty: Intermediate ClaudeChatGPT

Grafana Alert Silences and Mute Timings Prompt

Suppress Grafana alert noise during maintenance and off-hours using silences and mute timings without dropping real incidents.

Target user
On-call engineers and release managers running maintenance windows
Difficulty
Intermediate
Tools
Claude, ChatGPT

The prompt

You are a senior on-call engineer who tames Grafana alert noise with silences (ad-hoc, matcher-based) and mute timings (recurring schedules attached to notification policies).

I will provide:
- The alerts firing and their labels
- The window (one-off maintenance vs recurring off-hours)
- Who must still be paged for true emergencies

Your job:

1. **Pick the mechanism**: silences for one-off, time-bounded suppression by label matcher; mute timings for recurring schedules (nights, weekends, deploy windows) referenced from notification policies.
2. **Write tight matchers**: match on `alertname`, `namespace`, `cluster`, or a `maintenance="true"` label — never a broad `severity=~".*"` that mutes everything.
3. **Bound the time**: set explicit `startsAt`/`endsAt` on silences; short windows over open-ended. Prefer expiry over manual cleanup.
4. **Define mute timings**: use `time_intervals` with `times`, `weekdays`, `days_of_month`, `months`, and `location` (IANA tz) so schedules honor local time.
5. **Attach to policy**: reference the mute timing from a notification policy route via `mute_time_intervals`, scoped to the matching child route only.
6. **Preserve emergencies**: keep a sibling route WITHOUT the mute timing for `severity=critical` so pages still fire during muted windows.
7. **Audit**: list active silences, flag any expiring never or created by departed users; document who/why in the silence comment.
8. **Provision as code**: move recurring mute timings into provisioning YAML so they survive restarts and reviews.

Mark DESTRUCTIVE: a broad silence matcher that mutes critical pages, an open-ended silence with no endsAt, or a mute timing on the root policy suppressing everything.

---

Alerts firing: [DESCRIBE]
Window (one-off / recurring): [DESCRIBE]
Must still page for: [DESCRIBE]

Why this prompt works

Silences and mute timings solve different problems: one is ad-hoc and matcher-based, the other is a recurring schedule bound to routing. Engineers routinely reach for a broad silence and accidentally mute production pages. This prompt separates the two, forces narrow matchers with hard expiry, and insists on a non-muted critical path so maintenance windows never swallow a real outage.

How to use it

  1. State whether the window is one-off or recurring to pick silence vs mute timing.
  2. List the labels so matchers are precise.
  3. Name the alerts that must still page so a critical route stays live.
  4. Provision recurring timings rather than clicking them in per-incident.

Useful commands

# List active silences (Alertmanager-compatible API)
curl -s -H "Authorization: Bearer $TOKEN" \
  http://localhost:3000/api/alertmanager/grafana/api/v2/silences | jq

# Create a silence via API
curl -X POST -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  http://localhost:3000/api/alertmanager/grafana/api/v2/silences -d '{
    "matchers":[{"name":"namespace","value":"payments","isRegex":false}],
    "startsAt":"2026-07-03T22:00:00Z","endsAt":"2026-07-04T02:00:00Z",
    "createdBy":"jjoyner","comment":"DB migration maintenance"}'

# Fetch mute timings from provisioning
curl -s -H "Authorization: Bearer $TOKEN" \
  http://localhost:3000/api/v1/provisioning/mute-timings | jq

Example config

Provisioned mute timing plus a routing policy that spares criticals:

apiVersion: 1
muteTimes:
  - orgId: 1
    name: off-hours
    time_intervals:
      - times:
          - start_time: "20:00"
            end_time: "08:00"
        weekdays: ["monday:friday"]
        location: "America/New_York"

policies:
  - orgId: 1
    receiver: slack-default
    routes:
      - receiver: slack-noncritical
        matchers: ["severity = warning"]
        mute_time_intervals: ["off-hours"]
      - receiver: pagerduty-critical
        matchers: ["severity = critical"]   # NOT muted — always pages

Common findings this catches

  • Muted critical pages → matcher too broad or root-level mute timing.
  • Permanent blind spots → silence with no endsAt.
  • Timezone drift → wrong location shifts the window.
  • Orphaned silences → created by departed users, never expire.
  • False confidence → silence hides notification but alert still fires.
  • Maintenance leaks → no maintenance="true" label to target cleanly.

When to escalate

  • Recurring maintenance calendars spanning many teams — alerting platform owner.
  • Compliance requirements on suppression audit trails — security/GRC.
  • Repeated noise that needs rule tuning rather than muting — service owner.

Related prompts

Newsletter

Free: the DevOps AI Incident-Triage Cheat Sheet

Subscribe and we’ll send you the one-page cheat sheet — plus weekly AI prompts, automation ideas, and tool reviews for infrastructure engineers. One email a week. No spam, unsubscribe anytime.

  • AI Incident-Triage Cheat Sheet (PDF)
  • Access to 2,104 DevOps AI prompts
  • One practical workflow email per week