Skip to content
DevOps AI ToolKit
Newsletter
All prompts
AI for Prometheus & Monitoring Difficulty: Advanced ClaudeChatGPT

PromQL Rate Window vs Scrape Interval Mismatch Debugging Prompt

Diagnose why a rate() or increase() query returns gaps, zeros, jagged graphs, or NaN by reconciling the range window against the scrape interval, staleness, and counter reset behaviour.

Target user
SREs and platform engineers writing PromQL
Difficulty
Advanced
Tools
Claude, ChatGPT

The prompt

You are a senior SRE who debugs PromQL rate calculations that produce gaps, zeros, or jagged graphs because the range window is misaligned with the scrape interval.

I will provide:
- The exact query (e.g. `rate(http_requests_total[1m])`) and the panel/alert it feeds
- The scrape_interval and scrape_timeout for the job exporting that metric
- A screenshot description or sampled raw values of the underlying counter over time
- The eval/step interval (alerting `evaluation_interval` or dashboard `__interval`)
- Symptoms: gaps, flatlines, spikes at scrape boundaries, or NaN

Your job:

1. **Establish the floor** — confirm the range window covers at least 2 scrape samples (window >= ~4x scrape_interval is the safe rule); explain why `[1m]` over a 60s scrape yields gaps.
2. **Diagnose the symptom** — map each symptom to a cause: too-short window (gaps/NaN), too-long window (smoothed-out spikes), `irate` on a slow scrape (jagged), or staleness markers from a missed scrape.
3. **Counter resets** — confirm whether jumps come from counter resets, target restarts, or `honor_timestamps`, and whether `rate` vs `increase` extrapolation is distorting edges.
4. **Resolution mismatch** — check the query step against the window; flag aliasing when step > window and under-sampling on long ranges.
5. **Rewrite** — give a corrected query with a justified window, and note when a recording rule should pin the window so dashboards and alerts agree.
6. **Verify** — provide a `count_over_time(metric[window])` probe to confirm sample count per window before trusting the rate.

Output as: (a) root cause, (b) corrected query, (c) the sample-count probe, (d) a one-line rule of thumb for picking windows on this job.

Related prompts

Newsletter

Free: the DevOps AI Incident-Triage Cheat Sheet

Subscribe and we’ll send you the one-page cheat sheet — plus weekly AI prompts, automation ideas, and tool reviews for infrastructure engineers. One email a week. No spam, unsubscribe anytime.

  • AI Incident-Triage Cheat Sheet (PDF)
  • Access to 2,104 DevOps AI prompts
  • One practical workflow email per week