PromQL Apdex Score & Latency Satisfaction Prompt

Build an Apdex-style satisfaction score from Prometheus histogram buckets to express latency SLOs in a single user-centric number for dashboards and alerts.

Target user: Engineers turning latency histograms into user-satisfaction metrics
Difficulty: Intermediate
Tools: Claude, Gemini

The prompt

You are a senior observability engineer who builds latency satisfaction
metrics from Prometheus histograms.

I will provide:
- The histogram metric name and its existing le bucket boundaries
- Our target latency T (satisfied) and tolerating threshold (typically 4T)
- The labels we need to slice by (service, route, region)

Your job:

1. **Bucket check** — verify our le boundaries actually include cutoffs at T and 4T; if not, recommend the bucket boundaries to add so the score is accurate.
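2. **Apdex formula** — construct the PromQL for (satisfied + tolerating/2) / total from bucket counts, using rate() over a window and written so it composes cleanly. Histogram buckets are cumulative, so the le=4T bucket already includes the satisfied requests; derive tolerating as the le=4T rate minus the le=T rate.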
3. **Approximation caveat** — explain interpolation error when T falls between buckets and how it skews the score.
4. **Recording rule** — package the satisfied, tolerating, and total components as recording rules so the final score is cheap and consistent.
5. **Threshold mapping** — translate Apdex bands (excellent/good/poor) into Grafana thresholds and an alert rule.
6. **Sanity test** — give a query to validate the score against raw p95 latency so they tell a consistent story.

Output as: (a) bucket recommendation, (b) Apdex PromQL with its approximation caveat, (c) recording rules, (d) alert and threshold mapping, (e) sanity-test query.

If T does not align with an existing bucket boundary, state the resulting inaccuracy explicitly rather than silently interpolating.
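
Worked example (not part of the prompt): the arithmetic behind steps 1 and 2 can be sketched in a few lines of Python. This is a minimal illustration, assuming cumulative bucket counts like those a Prometheus histogram exposes; the function names, bucket boundaries, and request counts below are hypothetical and chosen only to make the numbers easy to follow.

```python
import math
from typing import List


def missing_boundaries(le_bounds: List[float], t: float) -> List[float]:
    """Return whichever of T and 4T has no matching le boundary."""
    wanted = [t, 4 * t]
    return [
        w for w in wanted
        if not any(math.isclose(w, le) for le in le_bounds)
    ]


def apdex(le_t: float, le_4t: float, total: float) -> float:
    """Apdex from cumulative counts.

    le_t  : requests that finished within T (the le=T bucket)
    le_4t : requests that finished within 4T (the le=4T bucket)
    total : all requests (the +Inf bucket, or the _count series)
    """
    if total == 0:
        return float("nan")  # no traffic, so no meaningful score
    satisfied = le_t
    # Buckets are cumulative: subtract to isolate the T..4T band.
    tolerating = le_4t - le_t
    return (satisfied + tolerating / 2) / total


# Hypothetical data: T = 0.5s, so 4T = 2.0s.
print(missing_boundaries([0.1, 0.5, 1.0, 2.5, 5.0], 0.5))  # [2.0]
print(apdex(800, 950, 1000))                               # 0.875
# Equivalent shortcut on cumulative counts: add both buckets, then halve.
print((800 + 950) / 2 / 1000)                              # 0.875
```

The last two lines give the same score because (le_T + le_4T) / 2 equals satisfied + tolerating/2 once the cumulative overlap is accounted for. The first call shows the bucket check failing: with no boundary at 2.0s, the score cannot be computed exactly from these buckets.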
