You are a senior SRE who has migrated dashboards from Grafana's legacy alerting to unified alerting — and operated alerts at scale. I will provide: - The alert use case - Current rule and notification config - Symptom (alert not firing, too noisy, wrong destination) Your job: 1. **Unified alerting model**: - **Alert rules** in a folder (groups) - **Contact points** (Slack, PagerDuty, etc.) - **Notification policies** route alerts to contact points - **Mute timings** for time-based suppression - **Silences** for ad-hoc suppression 2. **Alert rule types**: - **Grafana managed** — Grafana stores rule; works with any DS - **Mimir/Loki managed** — rule lives in Mimir/Loki ruler - **Prometheus rule** — via Prometheus 3. **For "multi-dimensional"**: - Single rule with label-based instances - Each unique label combo = separate alert 4. **For "alert not firing"**: - Check rule "State" (Normal/Pending/Alerting) - For=2m means must be active 2m - Query returns no data → No Data behavior 5. **For notification policies**: - Tree structure like Alertmanager - Match by labels - Group by labels - Default policy at root 6. **For mute timings**: - Time-based (weekends, business hours) - Multiple intervals per timing 7. **For state history**: - Grafana stores recent state changes - Useful for postmortems 8. **For migration from legacy**: - Legacy alerts attached to panels - Unified alerts standalone - Migration tool Mark DESTRUCTIVE: changing contact point to wrong webhook (alerts misrouted), mute timing too broad (silence everything), rule deletion without backup. --- Alert use case: [DESCRIBE] Rule config: ``` [PASTE] ``` Symptom: [DESCRIBE]

Why this prompt works

Grafana alerting is powerful but has its own model. This prompt walks it.

How to use it

Define contact points first.
Notification policies for routing.
Rules per use case.
Test with manual fire.

Useful UI / API

# API: list rules
curl -u admin:pass http://grafana:3000/api/v1/provisioning/alert-rules | jq

# Test rule
# UI: Alert rule → Preview

# Pause / resume
curl -X POST -u admin:pass http://grafana:3000/api/v1/provisioning/alert-rules/<uid>/pause -d '{"isPaused":true}'

# Contact points
curl -u admin:pass http://grafana:3000/api/v1/provisioning/contact-points

# Notification policies
curl -u admin:pass http://grafana:3000/api/v1/provisioning/policies

Patterns

Multi-dimensional alert

# Provisioning YAML
groups:
- name: cpu-alerts
  folder: SRE
  interval: 1m
  rules:
  - title: HighCPU
    condition: B
    data:
    - refId: A
      datasourceUid: prometheus
      model:
        expr: 'avg by (instance)(rate(node_cpu_seconds_total{mode!="idle"}[5m]))'
    - refId: B
      datasourceUid: __expr__
      model:
        type: threshold
        conditions:
        - evaluator: { params: [0.8], type: gt }
    for: 5m
    labels:
      severity: warning
      team: platform
    annotations:
      summary: "High CPU on {{ $labels.instance }}"
      runbook: "https://runbooks.example.com/high-cpu"

Each instance value produces its own alert instance.

Notification policy

policies:
- receiver: default-slack
  group_by: [alertname, cluster]
  routes:
  - receiver: pagerduty-critical
    matchers:
    - severity = critical
    group_wait: 10s
  - receiver: slack-platform
    matchers:
    - team = platform
    continue: true
  - receiver: slack-payments
    matchers:
    - team = payments

Mute timing

mute_time_intervals:
- name: weekends
  time_intervals:
  - weekdays: [saturday, sunday]
- name: maintenance-window
  time_intervals:
  - times:
    - start_time: 02:00
      end_time: 04:00
    weekdays: [sunday]

Common findings this catches

Alert doesn’t fire → Check rule state in UI; query returns nothing.
Wrong recipient → policy match condition wrong.
Mute timing affects everything → scoped to right policy?
Alert always firing → No Data behavior set wrong.
State lost on restart → ephemeral storage; need DB.
Multi-dim alert explosion → cardinality on labels.
Slow rule eval → query expensive; recording rule.

When to escalate

Migration from legacy at scale — staged.
Cross-team policy design — coordination.
Multi-org Grafana alerting strategy — strategic.

Grafana Unified Alerting Prompt

Why this prompt works

How to use it

Useful UI / API

Patterns

Multi-dimensional alert

Notification policy

Mute timing

Common findings this catches

When to escalate

Related prompts

Alert Fatigue Reduction Strategy Prompt

Alertmanager Routing, Grouping & Receivers Prompt

Prometheus Alert Rule Generator Prompt

Why this prompt works

How to use it

Useful UI / API

Patterns

Multi-dimensional alert

Notification policy

Mute timing

Common findings this catches

When to escalate

Related prompts

Alert Fatigue Reduction Strategy Prompt

Alertmanager Routing, Grouping & Receivers Prompt

Prometheus Alert Rule Generator Prompt

Free: the DevOps AI Incident-Triage Cheat Sheet