Prometheus Multi-Window Multi-Burn-Rate SLO Alert Authoring Prompt
Author a complete multi-window, multi-burn-rate SLO alerting ruleset (paired fast-burn and slow-burn alerts, each with a `for:` duration and a severity) from an SLO target and error-budget window, balancing detection speed against false-page rate.
- Target user: SREs and reliability engineers owning SLOs
- Difficulty: Advanced
- Tools: Claude, ChatGPT
The prompt
You are a senior reliability engineer who authors multi-window, multi-burn-rate alert rules following the Google SRE workbook approach.

I will provide:
- The SLO target (e.g. 99.9% over 30 days) and the SLI as a PromQL good/total ratio
- The metric names for good and total events (or a histogram for latency SLOs)
- The notification tiers available (page vs ticket) and on-call tolerance for false pages
- Any existing recording rules for the SLI

Your job:
1. **Compute the budget**: derive the allowed error ratio and translate target + window into burn-rate thresholds for the standard window pairs (e.g. for a 30-day window: 1h/5m at 14.4x, 6h/30m at 6x, 24h/2h at 3x, 3d/6h at 1x).
2. **Define the SLI rules**: write recording rules emitting `slo:sli_error:ratio_rate<window>` for each required window so alerts read cheap precomputed series.
3. **Pair the windows**: for each burn rate, write an alert that fires only when both the long and the short window exceed the threshold, so transient blips self-clear.
4. **Set severity and for**: assign page vs ticket per burn rate, set a short `for:` to debounce, and add `labels` (severity, slo) plus `annotations` (budget consumed, runbook).
5. **Avoid double-paging**: ensure faster burn rates inhibit or supersede slower ones via Alertmanager inhibition or labeling.
6. **Sanity-check**: show the math for how fast each tier fires at a given sustained error rate.

Output as: (a) recording rules YAML, (b) alerting rules YAML with for/severity, (c) the burn-rate math table, (d) an Alertmanager inhibition note.
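The arithmetic behind steps 1 and 6 can be sketched in Python, which may help when checking the math table a model returns. This is a rough sketch rather than part of the prompt. It assumes a 30-day SLO period, burn rates of 3x and 1x for the 24h/2h and 3d/6h pairs with a page/page/ticket/ticket split (as in the Google SRE workbook example), steady traffic, and an error ratio that jumps from zero to a constant value. The names `Tier`, `threshold`, `budget_consumed`, and `hours_to_fire` are hypothetical helpers, not part of any library.

```python
from dataclasses import dataclass
from typing import Optional

SLO_PERIOD_HOURS = 30 * 24  # assumption: 30-day SLO period


@dataclass(frozen=True)
class Tier:
    long_window_h: float   # long lookback window, in hours
    short_window_h: float  # short lookback window, in hours
    burn_rate: float       # multiple of the sustainable error rate
    severity: str          # "page" or "ticket"


# Window pairs from step 1 of the prompt. The 3x and 1x rates and the
# page/ticket split are assumptions based on the SRE workbook example.
TIERS = [
    Tier(1, 5 / 60, 14.4, "page"),
    Tier(6, 0.5, 6.0, "page"),
    Tier(24, 2, 3.0, "ticket"),
    Tier(72, 6, 1.0, "ticket"),
]


def error_budget(slo_target: float) -> float:
    """Allowed error ratio, e.g. about 0.001 for a 99.9% target."""
    return 1.0 - slo_target


def threshold(slo_target: float, tier: Tier) -> float:
    """Error ratio that both windows must exceed for the tier to fire."""
    return tier.burn_rate * error_budget(slo_target)


def budget_consumed(tier: Tier, period_h: float = SLO_PERIOD_HOURS) -> float:
    """Fraction of the period's budget spent by burning at exactly the
    tier's burn rate for one full long window."""
    return tier.burn_rate * tier.long_window_h / period_h


def hours_to_fire(error_ratio: float, slo_target: float,
                  tier: Tier) -> Optional[float]:
    """Hours until the long-window average crosses the threshold.

    After t hours at a constant error_ratio (zero before that), the
    long-window average is error_ratio * t / long_window_h. The short
    window crosses earlier, so the long window decides the firing time.
    Ignores the `for:` delay and rule evaluation interval. Returns None
    when the error ratio never exceeds the threshold.
    """
    limit = threshold(slo_target, tier)
    if error_ratio <= limit:
        return None
    return tier.long_window_h * limit / error_ratio


if __name__ == "__main__":
    target, sustained = 0.999, 0.05  # 99.9% SLO, 5% of requests failing
    for tier in TIERS:
        t = hours_to_fire(sustained, target, tier)
        fires = "never" if t is None else f"{t * 60:.1f} min"
        print(f"{tier.burn_rate:>5}x  threshold={threshold(target, tier):.4f}  "
              f"budget/window={budget_consumed(tier):.0%}  "
              f"fires in {fires}  ({tier.severity})")
```

Under these assumptions, a sustained 5% error ratio against a 99.9% target crosses the 14.4x threshold after 0.288 hours, roughly 17 minutes, before any `for:` delay is added. Real detection times will differ when traffic is uneven or errors ramp up gradually.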
Related prompts
- Error Budget Burn-Rate Alert Design Prompt
  Design multi-window, multi-burn-rate SLO alerts that page only when the error budget is actually in danger (fast pages for catastrophic burn, tickets for slow leaks), eliminating both flapping and silent budget exhaustion.
- SLO Error Budget & Multi-Window Burn Rate Alerts Prompt
  Design SLO-based alerts: error budgets, multi-burn-rate alerting, SLI selection, and burn-budget calculation.