Prometheus Error Guide: Alert Stuck 'Pending' and Never Firing
Fix Prometheus alerts stuck in Pending or missing from /alerts: tune for and evaluation_interval, verify the expression returns series, and check rule loading and silences.
- #prometheus-monitoring
- #troubleshooting
- #errors
- #alerting
Exact Error Message
This failure has no error string — it is a behavioral one. An alert you expect to page either sits in Pending forever and never reaches Firing, or it never appears on the /alerts page at all. There is nothing in the log to grep for; the symptom is the absence of a firing alert.
# /alerts shows the rule stuck in Pending and never advancing:
HighErrorRate (0 active) # rule loaded, but never produces a pending/firing instance
HighErrorRate (1 active)
Labels State Active Since Value
... PENDING 2026-06-27 03:10:55 0.42 # stays PENDING, never FIRING
Or the rule is missing entirely from /rules, which means the file was never loaded.
What the Error Means
A Prometheus alerting rule has a lifecycle: each evaluation_interval, Prometheus runs the rule’s expr. If the expression returns one or more series, those become Pending alert instances. An instance stays Pending until the expr has been continuously true for the rule’s for: duration; only then does it transition to Firing and get pushed to Alertmanager.
“Stuck in Pending” means the condition is true sometimes but not continuously long enough to satisfy for:. “Never appears” means the expr returns no series at evaluation time, or the rule file was never loaded. Either way, no alert is sent — even though Alertmanager and notifications are perfectly healthy. This is a rule-evaluation problem, distinct from a delivery problem.
Common Causes
for:longer than the condition holds — the expression is true for 2m butfor: 5m, so it resets to Pending and never fires.- Expression returns no series — a typo, wrong label matcher, or a metric that does not exist means there is nothing to evaluate, so no alert appears.
evaluation_intervaltoo coarse vsfor:— withevaluation_interval: 1mandfor: 90s, the rule effectively needs two consecutive true evaluations; transient spikes never accumulate.- Rule file not loaded —
rule_files:glob does not match, or the file failed validation, so the rule is absent from/rules. - Flapping — the condition resolves between evaluations, resetting the Pending timer each cycle.
- Alertmanager routing/inhibition/silence — the rule fires correctly but a silence, inhibition, or route swallows it (this is the boundary with the delivery guide below).
- Missing labels for routing — the alert fires but lacks the label Alertmanager routes on, so it matches no receiver.
How to Reproduce the Error
Write a rule whose for: outlasts a brief condition:
groups:
- name: demo
rules:
- alert: HighErrorRate
expr: rate(http_requests_total{code="500"}[5m]) > 0.5
for: 10m
labels: { severity: page }
annotations: { summary: "Error rate high" }
Generate a 2-minute error burst. The alert enters Pending, but because the burst ends before 10m elapses, the timer resets and it returns to inactive — never firing.
Diagnostic Commands
Start at the /rules and /alerts pages — they are the source of truth:
# Is the rule even loaded?
curl -s http://localhost:9090/api/v1/rules \
| jq -r '.data.groups[].rules[] | select(.type=="alerting") | [.name, .state, .health] | @tsv'
HighErrorRate pending ok
DiskFilling inactive ok
Run the alert’s expression directly — if it returns nothing, the alert can never fire:
curl -s http://localhost:9090/api/v1/query \
--data-urlencode 'query=rate(http_requests_total{code="500"}[5m]) > 0.5' \
| jq '.data.result | length'
Use the built-in ALERTS and ALERTS_FOR_STATE series to see lifecycle state over time:
ALERTS{alertname="HighErrorRate"} # 1 when pending or firing
ALERTS_FOR_STATE{alertname="HighErrorRate"} # unix ts the condition went active
# How long has the condition actually held? Compare to your for:
time() - ALERTS_FOR_STATE{alertname="HighErrorRate"}
Check evaluation interval and whether evaluations are healthy:
grep -nE 'evaluation_interval|rule_files' /etc/prometheus/prometheus.yml
rate(prometheus_rule_evaluation_failures_total[5m]) > 0
prometheus_rule_group_last_duration_seconds
Validate the rule file and unit-test the firing logic:
promtool check rules /etc/prometheus/rules/*.yml
promtool test rules /etc/prometheus/tests/alerts_test.yml
If the rule is firing but no page arrives, check Alertmanager silences and routing:
curl -s http://localhost:9090/api/v1/alerts | jq -r '.data.alerts[] | [.labels.alertname, .state] | @tsv'
amtool silence query --alertmanager.url=http://localhost:9093
Step-by-Step Resolution
1. Confirm the rule is loaded. If it is missing from /rules, fix the rule_files: glob and reload.
rule_files:
- "/etc/prometheus/rules/*.yml" # ensure this matches your actual path
promtool check config /etc/prometheus/prometheus.yml
curl -s -X POST http://localhost:9090/-/reload
2. Confirm the expression returns series. Run the expr in the query box or via the API. If it returns 0 results, fix the metric name, labels, or threshold — the alert cannot fire on an empty result.
3. Right-size for: against how long the condition really holds. Use time() - ALERTS_FOR_STATE{...} to see the actual hold time, then set for: below it (e.g. for: 2m for a condition that holds ~3m).
4. Align evaluation_interval and for:. for: should be a comfortable multiple of evaluation_interval (e.g. interval 30s, for: 2m = 4 evaluations). Avoid for: values that need only one or two evaluations to fire — they are fragile.
global:
evaluation_interval: 30s
5. For flapping conditions, smooth the expression with a longer range or avg_over_time so brief dips do not reset the timer.
expr: avg_over_time(rate(http_requests_total{code="500"}[5m])[5m:]) > 0.5
6. If it fires but does not page, the problem is Alertmanager routing/silence/inhibition — see the delivery guide below. Check the alert carries the labels your routes match on.
Prevention and Best Practices
- Always
promtool test rulesnew alerts so the Pending→Firing transition is verified before deploy. - Keep
for:a small multiple ofevaluation_interval; document why eachfor:value was chosen. - Alert on the alerting pipeline itself:
prometheus_rule_evaluation_failures_totalandprometheus_rule_group_last_duration_seconds(groups taking longer than the interval skip evaluations). - Use
ALERTSandALERTS_FOR_STATEin a meta-dashboard to spot rules that go Pending but never fire. - Reload with
promtool check configfirst so a bad rule file does not silently drop your rules.
Related Errors
- Prometheus Alertmanager notifications failing — the delivery counterpart. That guide covers alerts that fire but produce no notification (SMTP/webhook/routing failures). This guide covers alerts that never fire in the first place. If
/alertsshows FIRING but no page arrives, go there; if it shows PENDING or nothing, stay here. - Prometheus ‘out of order sample’ ingestion errors — silently dropped samples can make an alert’s
exprevaluate to nothing.
Frequently Asked Questions
Why is my alert stuck in Pending forever? The condition is not staying true for the full for: duration. Either the for: is too long for how long the spike lasts, or the metric flaps. Use time() - ALERTS_FOR_STATE{...} to measure the real hold time and lower for: accordingly.
My alert never even shows up as Pending — why? Its expr returns no series at evaluation time. Run the expression directly in the query box. If you get zero results, the metric name, label matchers, or threshold is wrong, or the data is missing. There is nothing to make Pending.
Is this the same as Alertmanager not sending notifications? No. If /alerts shows the alert Firing but you get no notification, that is a delivery/routing problem covered in the Alertmanager notifications-failing guide. If the alert is stuck Pending or missing, the rule never fired and this guide applies.
How should for: relate to evaluation_interval? for: must be at least one evaluation_interval and is best set to several (e.g. 4×). If for: is smaller than the interval it fires on the first true evaluation; if it is an awkward non-multiple, you effectively round up to the next evaluation, which surprises people.
The rule is missing from /rules entirely — what now? The file was never loaded. Check the rule_files: glob matches the file’s real path, run promtool check rules on it (a syntax error drops the whole group), then POST /-/reload and re-check /rules.
Download the Free 500-Prompt DevOps AI Toolkit
500 battle-tested, copy-paste AI prompts engineered by a senior systems engineer — every one with fill-in placeholders and safety/back-out notes. Drop your email and it's yours.
- 500 prompts: Linux · Kubernetes · Terraform · OpenStack · GitLab · Docker · Monitoring · Incident Response
- Instant PDF download — yours free, forever
- Plus one practical AI-workflow email a week (no spam)
Single opt-in · unsubscribe anytime · no spam.