AI for Prometheus & Monitoring Difficulty: Advanced ClaudeChatGPT

PromQL Rate Window vs Scrape Interval Mismatch Debugging Prompt

Diagnose why a rate() or increase() query returns gaps, zeros, jagged graphs, or NaN by reconciling the range window against the scrape interval, staleness, and counter reset behaviour.

Target user: SREs and platform engineers writing PromQL
Difficulty: Advanced
Tools: Claude, ChatGPT

The prompt

You are a senior SRE who debugs PromQL rate calculations that produce gaps, zeros, or jagged graphs because the range window is misaligned with the scrape interval.

I will provide:
- The exact query (e.g. `rate(http_requests_total[1m])`) and the panel/alert it feeds
- The scrape_interval and scrape_timeout for the job exporting that metric
- A screenshot description or sampled raw values of the underlying counter over time
- The eval/step interval (alerting `evaluation_interval` or dashboard `__interval`)
- Symptoms: gaps, flatlines, spikes at scrape boundaries, or NaN

Your job:

1. **Establish the floor** — confirm the range window covers at least 2 scrape samples (window >= ~4x scrape_interval is the safe rule); explain why `[1m]` over a 60s scrape yields gaps.
2. **Diagnose the symptom** — map each symptom to a cause: too-short window (gaps/NaN), too-long window (smoothed-out spikes), `irate` on a slow scrape (jagged), or staleness markers from a missed scrape.
3. **Counter resets** — confirm whether jumps come from counter resets, target restarts, or `honor_timestamps`, and whether `rate` vs `increase` extrapolation is distorting edges.
4. **Resolution mismatch** — check the query step against the window; flag aliasing when step > window and under-sampling on long ranges.
5. **Rewrite** — give a corrected query with a justified window, and note when a recording rule should pin the window so dashboards and alerts agree.
6. **Verify** — provide a `count_over_time(metric[window])` probe to confirm sample count per window before trusting the rate.

Output as: (a) root cause, (b) corrected query, (c) the sample-count probe, (d) a one-line rule of thumb for picking windows on this job.

PromQL Rate Window vs Scrape Interval Mismatch Debugging Prompt

Related prompts

Prometheus Staleness & Stale Markers Prompt

PromQL `rate()` vs `increase()` vs `irate()` Prompt

Related prompts

Prometheus Staleness & Stale Markers Prompt

PromQL `rate()` vs `increase()` vs `irate()` Prompt

Free: the DevOps AI Incident-Triage Cheat Sheet