Grafana Dashboard Query Builder Prompt
Generate PromQL and Grafana panel JSON for service dashboards (RED, USE, golden signals).
- Target user: SREs and platform engineers building Grafana dashboards
- Difficulty: Intermediate
- Tools: Claude, ChatGPT
The prompt
You are a senior observability engineer who has built RED, USE, and golden-signal dashboards for production services using Prometheus + Grafana.

Generate Grafana panel definitions for the service I describe. For each panel:

1. State the **signal type**: request rate, error rate, latency (p50/p95/p99), saturation, traffic, etc.
2. Provide a PromQL query that:
   - Aggregates by appropriate labels (avoid one-line-per-pod chaos)
   - Uses `rate()` over a 5m window for counters
   - Uses `histogram_quantile()` correctly for latency
3. Suggest the **visualization type** (timeseries, stat, gauge, heatmap).
4. Suggest thresholds and unit (s, ms, %, ops/sec).
5. Note any common pitfalls (cardinality, missing labels, divide-by-zero).

Cover the **four golden signals**: latency, traffic, errors, saturation. Add 2–3 service-specific panels.

Service: [DESCRIBE WHAT THE SERVICE DOES]
Available metrics (with labels): [PASTE FROM `curl /metrics` IF POSSIBLE]
Tech stack: [e.g. Go service + Postgres, or Python Flask + Redis]
Why this prompt works
The hard part of dashboards isn’t the visualization — it’s choosing the right queries and aggregations so the dashboard actually answers “is my service healthy?” This prompt anchors generation in the four golden signals and forces the model to consider cardinality.
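To make "the right queries and aggregations" concrete, here is a minimal Python sketch that assembles the kind of PromQL the prompt asks for. The metric names (`http_requests_total`, `http_request_duration_seconds_bucket`) and the `job` and `status` labels are assumptions for illustration; your service may expose different ones, which is why pasting real `/metrics` output matters.

```python
# Hypothetical metric and label names; replace them with what your /metrics shows.
REQUESTS = "http_requests_total"                   # counter, assumed labels: job, status
DURATION = "http_request_duration_seconds_bucket"  # histogram buckets, carries the le label


def golden_signal_queries(job: str, window: str = "5m") -> dict[str, str]:
    """Return one PromQL expression per request-driven golden signal."""
    sel = f'job="{job}"'
    # sum(...) collapses per-pod series into a single line per panel.
    traffic = f"sum(rate({REQUESTS}{{{sel}}}[{window}]))"
    errors_5xx = f'sum(rate({REQUESTS}{{{sel},status=~"5.."}}[{window}]))'
    return {
        "traffic": traffic,
        # Error ratio. With zero traffic this evaluates to 0/0 (NaN) in PromQL,
        # so expect gaps in the panel during idle periods.
        "errors": f"{errors_5xx} / {traffic}",
        # Aggregate bucket rates by le before taking the quantile.
        "latency_p95": (
            "histogram_quantile(0.95, "
            f"sum by (le) (rate({DURATION}{{{sel}}}[{window}])))"
        ),
    }


queries = golden_signal_queries("checkout")
```

Saturation is left out on purpose: it depends on which resource limits the service (CPU, connection pool, queue depth), so there is no generic query for it.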
How to use it
- Always paste a sample of the output from your actual `/metrics` endpoint. Without it, the model has to guess metric and label names, and it will guess wrong.
- Build one panel at a time and validate each PromQL query in Grafana's Explore view before committing.
- For latency, ask explicitly about buckets: `histogram_quantile` interpolates linearly inside a bucket, so if your histogram doesn't have fine-grained buckets near your SLO threshold, the reported percentile can be far from the true value.
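The bucket warning is easier to see with numbers. The sketch below is a simplified Python re-implementation of the linear interpolation that `histogram_quantile` applies to classic histograms; it is not Prometheus' own code, and the bucket layouts and counts are made up for illustration. It assumes 100 requests that each took roughly 0.28 s.

```python
import math


def estimate_quantile(q: float, buckets: list[tuple[float, float]]) -> float:
    """Estimate a quantile from (upper_bound, cumulative_count) pairs.

    Buckets must be sorted by upper bound and end with the +Inf bucket.
    """
    total = buckets[-1][1]
    if total == 0:
        return float("nan")
    rank = q * total
    prev_bound, prev_count = 0.0, 0.0
    for bound, count in buckets:
        if count >= rank:
            if math.isinf(bound):
                # Quantile lands in +Inf: fall back to the highest finite bound.
                return prev_bound
            # Assume observations are spread evenly inside the bucket.
            fraction = (rank - prev_count) / (count - prev_count)
            return prev_bound + (bound - prev_bound) * fraction
        prev_bound, prev_count = bound, count
    return float("nan")


# Same 100 observations (all about 0.28 s), two hypothetical bucket layouts.
COARSE = [(0.1, 0), (1.0, 100), (math.inf, 100)]
FINE = [(0.1, 0), (0.25, 0), (0.3, 100), (0.5, 100), (1.0, 100), (math.inf, 100)]

p95_coarse = estimate_quantile(0.95, COARSE)  # about 0.955 s
p95_fine = estimate_quantile(0.95, FINE)      # about 0.2975 s
```

With the coarse layout the estimate is more than three times the real latency, because every observation falls into one wide 0.1 s to 1.0 s bucket. With a bucket boundary at 0.3 s the estimate stays close to the real value.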
RED vs USE — when to use which
- RED (Rate, Errors, Duration) — request-driven services (APIs, RPC servers, web frontends).
- USE (Utilization, Saturation, Errors) — resources (CPU, memory, disk, network, queues).
- Most service dashboards need both, organized into two sections.
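One way to lay out the two sections is a pair of row panels in the dashboard JSON. The Python sketch below builds a skeleton under stated assumptions: the metric names and the `job="api"` selector are hypothetical, and only a small subset of panel fields is set, so treat it as a starting point and not a complete dashboard model.

```python
import json


def row(title: str, y: int) -> dict:
    """A full-width row header (the Grafana grid is 24 columns wide)."""
    return {"type": "row", "title": title, "collapsed": False,
            "gridPos": {"h": 1, "w": 24, "x": 0, "y": y}}


def timeseries(title: str, expr: str, unit: str, x: int, y: int) -> dict:
    """A half-width time series panel with a single PromQL target."""
    return {
        "type": "timeseries",
        "title": title,
        "gridPos": {"h": 8, "w": 12, "x": x, "y": y},
        "fieldConfig": {"defaults": {"unit": unit}},
        "targets": [{"refId": "A", "expr": expr}],
    }


# Hypothetical metric names; swap in the ones your service exposes.
dashboard = {
    "title": "api service",
    "panels": [
        row("RED: requests", 0),
        timeseries("Request rate",
                   'sum(rate(http_requests_total{job="api"}[5m]))',
                   "reqps", 0, 1),
        timeseries("p95 latency",
                   "histogram_quantile(0.95, sum by (le) "
                   '(rate(http_request_duration_seconds_bucket{job="api"}[5m])))',
                   "s", 12, 1),
        row("USE: resources", 9),
        timeseries("CPU cores used",
                   'sum(rate(process_cpu_seconds_total{job="api"}[5m]))',
                   "short", 0, 10),
    ],
}

dashboard_json = json.dumps(dashboard, indent=2)
```

The resulting JSON can be used as a base for a dashboard import, but validate each query in Explore first, as recommended above.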