
PromQL quantile_over_time vs histogram_quantile Selection Prompt

Decide whether to compute a percentile with quantile_over_time over a gauge or with histogram_quantile over histogram buckets, and avoid the silent accuracy traps of each.

Target user: Engineers computing latency or size percentiles in PromQL
Difficulty: Advanced
Tools: Claude, ChatGPT

The prompt

You are a PromQL expert who knows that quantile_over_time and histogram_quantile compute percentiles in fundamentally different ways, and that picking the wrong one gives a confidently wrong number with no error.

I will provide:
- The metric and its type (a gauge sampled over time, or a histogram with _bucket series, or a native histogram): [METRIC + TYPE]
- What I'm computing (p95 latency, p99 payload size, etc.) and the window: [GOAL]
- The current query and the number that looks wrong or that someone disputes: [QUERY + DISPUTE]
- The scrape interval and how often the underlying event happens vs how often it's sampled: [SAMPLING]

Your job:

1. **Match the function to the data shape** — state the rule plainly:
   - quantile_over_time(0.95, gauge[w]) computes the percentile of the SAMPLED VALUES of a gauge over the window. It only sees scrape-time snapshots, so events between scrapes are invisible.
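   - histogram_quantile(0.95, rate(metric_bucket[w])) interpolates the percentile from bucketed observation COUNTS, so it sees every observation but is limited by bucket boundary resolution. When aggregating across series, keep the le label: histogram_quantile(0.95, sum by (le) (rate(metric_bucket[w]))). For a native histogram, pass rate(metric[w]) with no _bucket suffix.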
   Pick the correct one for my metric type.

2. **Name the accuracy trap of each** — quantile_over_time misses sub-scrape spikes and is meaningless on a counter; histogram_quantile is only as precise as the bucket layout (a p99 that lands in a huge top bucket is interpolated guesswork), and is wrong if bucket boundaries are poorly chosen.

3. **Diagnose the disputed number** — figure out which trap is biting: too few samples for quantile_over_time, or a coarse/clipped top bucket (le="+Inf") for histogram_quantile.

4. **Recommend and rewrite** — give the correct query for my metric type. If the metric is the wrong shape for the goal (e.g. trying to get accurate p99 from a sparsely sampled gauge), say so and recommend instrumenting a histogram instead.

5. **Verify** — provide a cross-check (compare against the +Inf bucket total, or against max_over_time for an upper bound).

Output as: (a) a 2-row table contrasting the two functions, (b) which trap explains the disputed number, (c) the corrected query, (d) a one-line cross-check and, if relevant, an instrumentation recommendation.

State which function sees every observation and which sees only samples, every time. Never apply quantile_over_time to a counter or histogram_quantile to a non-bucketed gauge.

Why this prompt works

Percentiles are where PromQL quietly lies, because two functions with similar-sounding purposes compute fundamentally different things. quantile_over_time takes the percentile of a gauge’s sampled values across a window — it only ever sees the snapshots taken at scrape time, so anything that happens between scrapes is invisible to it. histogram_quantile interpolates from bucketed observation counts, so it sees every observation but is constrained by where the bucket boundaries sit. Engineers reach for whichever they remember, get a plausible number, and ship it. This prompt forces the model to match the function to the data shape first, because using the wrong one is not an error you can catch in review by reading the query — the number just comes out confidently wrong.
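
A small simulation makes the sampling gap concrete. The sketch below is an illustration, not Prometheus code: the gauge values, the 15-second scrape interval, and the `quantile` helper are all assumptions chosen for the example. The helper interpolates linearly between neighbouring ranks, which is meant to mirror quantile_over_time rather than reproduce it exactly.

```python
import math

def quantile(values, q):
    """Quantile with linear interpolation between neighbouring ranks."""
    ordered = sorted(values)
    rank = q * (len(ordered) - 1)
    lower = math.floor(rank)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (rank - lower)

# Hypothetical gauge: one true value per second for 5 minutes.
# It sits at 10 but spikes to 500 for 3 seconds in every minute.
true_values = [500 if second % 60 in (7, 8, 9) else 10 for second in range(300)]

# Assumed scrape interval of 15 s: only these snapshots are ever stored.
scraped_values = true_values[::15]

sampled_p99 = quantile(scraped_values, 0.99)  # what quantile_over_time can see
true_p99 = quantile(true_values, 0.99)        # what actually happened

print(f"p99 of scraped samples: {sampled_p99}")
print(f"p99 of every value:     {true_p99}")
```

In this made-up series no scrape ever lands on a spike, so the scraped p99 stays at the baseline of 10 while the p99 over every value is 500. Neither number raises an error, which is the point.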

Each function has a distinct, silent failure mode, and the prompt names both. quantile_over_time on a sparsely sampled gauge produces a clean p99 that hides real tail spikes, which is dangerous precisely because it looks reassuring. histogram_quantile is only as precise as the bucket layout. When the requested quantile lands in an oversized top bucket, the result is linear interpolation across a wide gap, not a measurement. When it lands in the +Inf bucket, histogram_quantile simply returns the upper bound of the highest finite bucket, so the reported value is a floor on the true percentile rather than an estimate of it. By making the model diagnose which of these traps explains the disputed number, the answer addresses the actual dispute instead of just restating a formula.
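
The bucket trap can be worked through by hand. The following sketch approximates how histogram_quantile treats classic buckets, assuming a non-empty histogram, a quantile strictly between 0 and 1, and a lowest bucket that starts at 0. The function name `classic_histogram_quantile` and the bucket counts are hypothetical, and the code is a simplified model, not the Prometheus implementation.

```python
import math

def classic_histogram_quantile(q, buckets):
    """Estimate a quantile from (upper_bound, cumulative_count) pairs.

    `buckets` must be sorted by upper bound and end with (math.inf, total).
    """
    total = buckets[-1][1]
    rank = q * total
    # Find the first bucket whose cumulative count reaches the rank.
    index = next(i for i, (_, count) in enumerate(buckets) if count >= rank)
    if index == len(buckets) - 1:
        # The quantile falls in +Inf: report the highest finite bound instead.
        return buckets[-2][0]
    upper, count = buckets[index]
    lower, below = buckets[index - 1] if index > 0 else (0.0, 0)
    # Assume observations are spread evenly inside the bucket.
    return lower + (upper - lower) * (rank - below) / (count - below)

# Coarse top bucket: 20 observations lie somewhere between 0.5 s and 10 s.
coarse = [(0.1, 900), (0.5, 980), (10.0, 1000), (math.inf, 1000)]
# Clipped: 20 observations are slower than the highest finite bound of 10 s.
clipped = [(0.1, 900), (0.5, 950), (10.0, 980), (math.inf, 1000)]

coarse_p99 = classic_histogram_quantile(0.99, coarse)
clipped_p99 = classic_histogram_quantile(0.99, clipped)
print(f"coarse layout p99:  {coarse_p99}")
print(f"clipped layout p99: {clipped_p99}")
```

In the coarse layout the p99 is placed halfway through the 0.5 to 10 second bucket, even though all 20 of those observations could have taken 0.6 seconds. In the clipped layout the result is pinned to 10 seconds no matter how slow the tail really was.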

The most valuable move is the willingness to say the metric is the wrong shape for the question. If someone wants an accurate p99 from a gauge that’s sampled every fifteen seconds, no query will deliver it, and the right answer is to instrument a histogram. Pairing that honesty with a concrete cross-check (check what fraction of observations fall above the highest finite bucket, or confirm that a quantile_over_time result does not exceed max_over_time over the same window) keeps the work in the AI-drafts, human-verifies lane: you get a query, a reason, and a falsifiable test before anyone quotes the percentile in an SLO review.
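
One way to make the histogram cross-check mechanical is to ask whether the requested quantile can be resolved by the finite buckets at all. The sketch below assumes cumulative bucket counts have already been fetched. The helper names `finite_coverage` and `quantile_is_clipped` and the counts themselves are hypothetical.

```python
import math

def finite_coverage(buckets):
    """Fraction of observations at or below the highest finite bound."""
    total = buckets[-1][1]
    return buckets[-2][1] / total

def quantile_is_clipped(q, buckets):
    """True when the q-quantile falls in the +Inf bucket."""
    total = buckets[-1][1]
    highest_finite_count = buckets[-2][1]
    return q * total > highest_finite_count

# (upper_bound, cumulative_count) pairs, sorted, ending with +Inf.
buckets = [(0.1, 900), (0.5, 950), (10.0, 980), (math.inf, 1000)]

coverage = finite_coverage(buckets)
p95_clipped = quantile_is_clipped(0.95, buckets)
p99_clipped = quantile_is_clipped(0.99, buckets)

print(f"finite buckets cover {coverage:.0%} of observations")
print(f"p95 clipped: {p95_clipped}, p99 clipped: {p99_clipped}")
```

With these counts the finite buckets cover 98% of observations, so a p95 can be resolved but a p99 cannot. A quoted p99 of 10 seconds would then be a lower bound, and the honest fix is a higher top bucket rather than a different query.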
