MTTR Alert Noise Reduction to Surface Real Signal Prompt
Analyze an on-call alert stream to cut the noise that buries real incidents, so responders trust the page and react to genuine signals immediately instead of triaging which alert actually matters.
- Target user: SREs and on-call leads
- Difficulty: Intermediate
- Tools: Claude, ChatGPT
The prompt
You are a senior SRE who reduces alert noise specifically to protect response speed. Noise slows MTTR two ways: it delays detection of the real signal in a storm, and it erodes the trust that makes responders act fast. You advise only; you do not silence or delete rules.

I will provide:
- An export of recent alerts (rule name, count, severity, ack/resolve times, who paged)
- Which alerts led to real action vs were auto-resolved, flapping, or ignored
- The current routing, grouping, and inhibition config
- The most recent incident where a real page was lost or delayed in the noise

Your job:
1. **Rank noise sources**: list the loudest, lowest-action alerts by volume and by false-positive rate, and quantify how much they dilute the signal.
2. **Classify each noisy alert**: flapping, duplicate, too-tight threshold, non-actionable, or wrong-severity, and give the specific fix (threshold, for-duration, grouping, inhibition, demote to ticket).
3. **Protect the real signal**: recommend grouping/inhibition so a root-cause alert is not buried under symptom alerts during a storm, and verify high-severity pages are never suppressed.
4. **Right-size severity**: move non-urgent alerts off the pager to a dashboard or ticket queue, keeping paging for things that need a human now.
5. **Estimate impact**: for each change, state the expected page-volume reduction and the risk of suppressing something real.
6. **Stage safely**: propose an order to roll out changes and metrics to watch (page volume, missed-incident rate, ack time).

Output as: (a) noise leaderboard, (b) per-alert classification and fix, (c) signal-protection plan, (d) staged rollout with watch metrics.

Never recommend blanket-silencing a severity tier; flag any change that could hide a genuine incident.
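Outside the prompt itself, the first two inputs (the alert export and which alerts led to real action) can be summarized before pasting them into the chat. The following is a minimal Python sketch of one way to pre-compute a per-rule noise leaderboard. The field names `rule` and `actioned`, the function `noise_leaderboard`, and the sample rule names are all hypothetical; a real export from your alerting tool will likely need mapping into this shape first.

```python
from collections import defaultdict


def noise_leaderboard(alerts):
    """Rank alert rules by how many of their pages led to no real action.

    `alerts` is a list of dicts with two assumed (hypothetical) keys:
      "rule"     - the alert rule name
      "actioned" - True if a human took real action on that page
    """
    totals = defaultdict(int)
    noise = defaultdict(int)
    for alert in alerts:
        totals[alert["rule"]] += 1
        if not alert["actioned"]:
            noise[alert["rule"]] += 1

    all_pages = len(alerts)
    rows = []
    for rule, count in totals.items():
        rows.append({
            "rule": rule,
            "pages": count,
            "noise_pages": noise[rule],
            # Fraction of this rule's pages that led to no action.
            "false_positive_rate": noise[rule] / count,
            # Fraction of all pages that came from this rule.
            "share_of_volume": count / all_pages,
        })

    # Loudest non-actioned rules first; ties broken by false-positive rate,
    # then by rule name so the ordering is deterministic.
    rows.sort(key=lambda r: (-r["noise_pages"], -r["false_positive_rate"], r["rule"]))
    return rows


# Illustrative sample data only, not taken from a real system.
sample = (
    [{"rule": "DiskLatencyHigh", "actioned": False}] * 3
    + [{"rule": "DiskLatencyHigh", "actioned": True}]
    + [{"rule": "PodRestart", "actioned": False}] * 3
    + [{"rule": "DBPrimaryDown", "actioned": True}]
)

leaderboard = noise_leaderboard(sample)
for row in leaderboard:
    print(
        f'{row["rule"]}: pages={row["pages"]}, '
        f'false_positive_rate={row["false_positive_rate"]:.2f}, '
        f'share_of_volume={row["share_of_volume"]:.2f}'
    )
```

Sorting by non-actioned page count rather than raw volume keeps a loud but genuinely actionable rule from topping the list. Whether that is the right ordering depends on your environment, so treat it as a starting point for the leaderboard the prompt asks for.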
Related prompts
- Alert Fatigue and Pager Noise Reduction Audit Prompt
  Audit your firing alerts to find the noisy, non-actionable, and duplicate pages that erode on-call trust, then cut, tune, or route them so every page that survives demands human action.
- Alert Fatigue Reduction Strategy Prompt
  Reduce alert fatigue: SLO-based alerts vs symptom-based, severity tiers, runbook integration, deprecating noisy alerts.