Is-This-Real Page Triage Prompt
Help a freshly paged on-call engineer decide in the first two minutes whether an alert is a real incident worth waking people for, a transient blip, or pure noise — before they over- or under-react.
- Target user: On-call engineers deciding whether a page warrants declaring an incident
- Difficulty: Beginner
- Tools: Claude, ChatGPT
The prompt
You are a calm, experienced on-call engineer mentoring someone who just got paged at 3 AM. The two failure modes are equally bad: declaring a major incident over a transient blip, or dismissing a real outage as noise and going back to sleep. Your job is to help them decide fast and proportionately.
I will paste the page: the alert name, what it measures, the threshold and current value, how long it's been firing, whether it's self-resolved before, and any quick signal I can pull (a dashboard reading, customer reports, related alerts).
Your job:
1. **Real, transient, or noise?** From the evidence, give your read: a real degradation, a likely transient that will self-clear, or known noise. State the single signal that most drives your read and your confidence.
2. **The two-minute checks.** Give 2 to 4 fast, read-only checks that would confirm or kill the hypothesis (a specific dashboard, a synthetic check, customer-report channel, the related-alert pattern). Order them by speed.
3. **Customer-impact gut check.** Is anything user-facing actually affected, or is this an internal threshold with no external symptom yet? This usually decides whether to escalate.
4. **Proportionate next step.** Recommend one of: declare an incident and escalate, keep watching for N minutes with a specific tripwire, or acknowledge and snooze with a reason. Justify the proportionality.
5. **The tripwire.** If the recommendation is "watch," give the exact condition (metric crossing X, second alert firing, first customer report) that flips it to "declare now," so the engineer isn't re-deciding every minute.
6. **Capture for tuning.** One line noting whether this alert is a candidate for tuning if it turns out to be noise, to feed the follow-up without slowing the decision.
Output as: (a) the real/transient/noise read with confidence, (b) the ordered two-minute checks, (c) the proportionate recommendation, (d) the tripwire that escalates. Propose; the engineer decides.
When the evidence is genuinely ambiguous, lean toward a short watch with a clear tripwire rather than either ignoring it or over-escalating. Never tell someone it's safe to dismiss a page you can't confirm is noise.
Why this prompt works
The first two minutes after a page are where on-call engineers most often get it wrong in both directions: spinning up a war room over a blip that would have self-cleared, or hitting snooze on a real outage because the alert “usually clears itself.” This prompt addresses the actual decision a sleepy responder faces — not “what’s the root cause” but “is this even worth waking up for” — and forces a proportionate answer backed by fast, read-only checks instead of a coin flip.
The tripwire mechanism is what makes “keep watching” a real decision rather than a deferral. Without a concrete escalation condition, “watch it for a bit” becomes re-deciding every sixty seconds while half-asleep. By demanding the exact metric or event that flips the call to “declare now,” the prompt lets the engineer set the trap and stop ruminating, which is both faster and safer.
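The tripwire idea can be sketched in a few lines of Python. This is a minimal illustration, not part of the prompt: the names `Tripwire`, `Observation`, and `evaluate`, and the example numbers, are all hypothetical. It assumes the three escalation conditions the prompt mentions (metric crossing a limit, a second alert firing, a first customer report) and a fixed watch window.

```python
from dataclasses import dataclass


@dataclass
class Tripwire:
    """Hypothetical escalation conditions fixed at the start of a watch."""
    metric_limit: float   # declare if the watched metric reaches this value
    watch_minutes: int    # length of the watch window (the "N minutes")


@dataclass
class Observation:
    """One reading taken during the watch; all fields are illustrative."""
    minute: int                        # minutes since the watch began
    metric_value: float
    second_alert_firing: bool = False
    customer_reports: int = 0


def evaluate(tripwire: Tripwire, obs: Observation) -> str:
    """Return 'declare', 'keep watching', or 'watch expired'."""
    # Any one tripped condition flips the call to "declare now".
    if obs.metric_value >= tripwire.metric_limit:
        return "declare"
    if obs.second_alert_firing or obs.customer_reports > 0:
        return "declare"
    # Window ended with nothing tripped; what happens next is the engineer's call.
    if obs.minute >= tripwire.watch_minutes:
        return "watch expired"
    return "keep watching"
```

For example, with `Tripwire(metric_limit=5.0, watch_minutes=10)`, a reading of 1.2 at minute 2 keeps the watch going, while a single customer report at any point returns `"declare"`. The point of the sketch is that the conditions are decided once, up front, and every later reading is a mechanical comparison rather than a fresh judgment.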
The guardrails are tuned to the asymmetry of the mistake. Dismissing a real incident is usually worse than briefly over-watching a fake one, so the prompt refuses to bless dismissing any page it can’t positively confirm is noise and defaults ambiguity toward a short, tripwired watch. It also captures flapping alerts for tuning instead of just snoozing them, so the same page doesn’t force the same uncertain decision next week.
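The guardrail described above amounts to a small decision rule, sketched here in Python under stated assumptions. The function `next_step` and its parameters are hypothetical, and treating any customer impact as grounds to declare is a simplification of the prompt's "this usually decides whether to escalate." Snoozing is reachable only when the page is positively confirmed as noise; everything else that is not clearly real falls through to a watch.

```python
def next_step(read: str, confirmed_noise: bool, customer_impact: bool) -> str:
    """Map a triage read to one of the prompt's three next steps.

    read: 'real', 'transient', or 'noise' (the three reads named in the prompt).
    confirmed_noise: True only if a check positively confirmed the page is noise.
    customer_impact: True if anything user-facing is affected (assumed to escalate).
    """
    if read not in {"real", "transient", "noise"}:
        raise ValueError(f"unknown read: {read!r}")
    if read == "real" or customer_impact:
        return "declare"
    # Snooze requires positive confirmation, not just a hunch that it is noise.
    if read == "noise" and confirmed_noise:
        return "snooze"
    # Ambiguous or unconfirmed cases default to a short watch with a tripwire.
    return "watch"
```

Note that `next_step("noise", confirmed_noise=False, customer_impact=False)` returns `"watch"`, not `"snooze"`: an unconfirmed suspicion of noise is handled the same way as a likely transient.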
Related prompts
- Alert Triage Decision-Tree Builder Prompt: Turn a noisy alert stream into a deterministic, branching triage decision tree that any on-call engineer can follow to classify, route, and act on alerts in under a minute.
- Incident Severity Classification Rubric Prompt: Design a clear, defensible SEV classification rubric that on-call engineers can apply in seconds under pressure, with crisp boundaries, escalation triggers, and downgrade rules.