CloudOps
AI for Prometheus & Monitoring

Grafana Dashboard Query Builder Prompt

Generate PromQL and Grafana panel JSON for service dashboards (RED, USE, golden signals).

Target user
SREs and platform engineers building Grafana dashboards
Difficulty
Intermediate
Tools
Claude, ChatGPT

The prompt

You are a senior observability engineer who has built RED, USE, and golden-signal dashboards for production services using Prometheus + Grafana.

Generate Grafana panel definitions for the service I describe. For each panel:

1. State the **signal type** — request rate, error rate, latency (p50/p95/p99), saturation, traffic, etc.
2. Provide a PromQL query that:
   - Aggregates by appropriate labels (avoid one-line-per-pod chaos)
   - Uses `rate()` over a 5m window for counters
   - Uses `histogram_quantile()` correctly for latency (apply it to the `rate()` of the `_bucket` series, aggregated with the `le` label preserved)
3. Suggest the **visualization type** (timeseries, stat, gauge, heatmap).
4. Suggest thresholds and unit (s, ms, %, ops/sec).
5. Note any common pitfalls (cardinality, missing labels, divide-by-zero).

Cover the **four golden signals**: latency, traffic, errors, saturation. Add 2–3 service-specific panels.

Service: [DESCRIBE WHAT THE SERVICE DOES]
Available metrics (with labels): [PASTE OUTPUT FROM YOUR `/metrics` ENDPOINT IF POSSIBLE]
Tech stack: [e.g. Go service + Postgres, or Python Flask + Redis]

Why this prompt works

The hard part of dashboards isn’t the visualization — it’s choosing the right queries and aggregations so the dashboard actually answers “is my service healthy?” This prompt anchors generation in the four golden signals and forces the model to consider cardinality.
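
To make the point about queries and aggregations concrete, here is a minimal Python sketch that assembles one PromQL query per golden signal. The metric names (`http_requests_total`, `http_request_duration_seconds_bucket`, `http_requests_in_flight`) and the `service` and `status` labels are assumptions for illustration, not something the prompt guarantees your service exposes; substitute whatever your /metrics output actually contains.

```python
def golden_signal_queries(service: str, window: str = "5m") -> dict[str, str]:
    """Return one PromQL query per golden signal, each aggregated to one series.

    All metric and label names below are hypothetical placeholders.
    """
    sel = f'service="{service}"'
    requests = f"sum(rate(http_requests_total{{{sel}}}[{window}]))"
    errors = f'sum(rate(http_requests_total{{{sel},status=~"5.."}}[{window}]))'
    return {
        "traffic": requests,
        # `> 0` drops a zero denominator, so an idle service shows no data
        # instead of NaN from 0/0.
        "errors": f"{errors} / ({requests} > 0)",
        # Aggregate buckets by `le` only: histogram_quantile needs that label,
        # and dropping pod/instance labels avoids one line per pod.
        "latency_p95": (
            "histogram_quantile(0.95, sum by (le) "
            f"(rate(http_request_duration_seconds_bucket{{{sel}}}[{window}])))"
        ),
        # Saturation is resource-specific; in-flight requests is one possible proxy.
        "saturation": f"sum(http_requests_in_flight{{{sel}}})",
    }


if __name__ == "__main__":
    for signal, query in golden_signal_queries("checkout").items():
        print(f"{signal}: {query}")
```

Every query collapses to a single series per service, which is the kind of aggregation the prompt asks the model to reason about before it emits a panel.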

How to use it

  1. Always paste a sample of your actual /metrics endpoint. Without it, the model has to guess metric and label names, and the resulting queries may not match any series you have.
  2. Build one panel at a time and validate each PromQL query in Grafana’s Explore view before committing.
  3. For latency, ask explicitly about bucket boundaries. histogram_quantile interpolates linearly inside a bucket, so if your histogram doesn’t have fine-grained buckets near your SLO threshold, the reported percentile can be far from the true value.
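
The bucket caveat in step 3 can be illustrated with a small Python sketch. It is a simplified stand-in for the linear interpolation that histogram_quantile performs, not the Prometheus implementation itself, and the bucket counts are made-up numbers chosen only to show the effect.

```python
import math


def estimate_quantile(q: float, buckets: list[tuple[float, float]]) -> float:
    """Estimate a quantile from cumulative (upper_bound, count) buckets.

    Simplified: assumes non-negative bounds, 0 < q <= 1, and at least one
    observation. Real histogram_quantile handles more edge cases.
    """
    buckets = sorted(buckets)
    total = buckets[-1][1]
    if not 0 < q <= 1 or total <= 0:
        raise ValueError("need 0 < q <= 1 and a non-empty histogram")
    rank = q * total
    lower_bound, lower_count = 0.0, 0.0
    for upper_bound, count in buckets:
        if count >= rank:
            if math.isinf(upper_bound):
                # Quantile falls in the +Inf bucket: report the last finite bound.
                return lower_bound
            # Assume observations are spread evenly across the bucket.
            fraction = (rank - lower_count) / (count - lower_count)
            return lower_bound + (upper_bound - lower_bound) * fraction
        lower_bound, lower_count = upper_bound, count
    return lower_bound


# Same hypothetical traffic: every request slower than 0.1s finished within 0.2s.
coarse = [(0.1, 900), (0.5, 990), (1.0, 1000), (math.inf, 1000)]
fine = [(0.1, 900), (0.2, 990), (0.3, 990), (0.4, 990), (0.5, 990),
        (1.0, 1000), (math.inf, 1000)]

if __name__ == "__main__":
    print(f"coarse p95 = {estimate_quantile(0.95, coarse):.3f}s")  # about 0.322s
    print(f"fine   p95 = {estimate_quantile(0.95, fine):.3f}s")    # about 0.156s
```

With the coarse layout the estimate lands above 0.3s even though no request in that range took longer than 0.2s, which is why bucket boundaries near the SLO threshold matter.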

RED vs USE — when to use which

  • RED (Rate, Errors, Duration) — request-driven services (APIs, RPC servers, web frontends).
  • USE (Utilization, Saturation, Errors) — resources (CPU, memory, disk, network, queues).
  • Most service dashboards need both, organized into two sections.
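
As a rough sketch of the two-section layout, the Python below builds a dashboard dictionary with one row for RED panels and one for USE panels. The field names follow the Grafana dashboard JSON model as commonly exported, but treat them as an assumption and compare against a dashboard exported from your own Grafana version. The RED metric and the `service` label are hypothetical; the USE queries assume node_exporter is being scraped.

```python
import json


def panel(title: str, expr: str, unit: str, x: int, y: int) -> dict:
    """Build a minimal timeseries panel with a single Prometheus target."""
    return {
        "type": "timeseries",
        "title": title,
        "gridPos": {"h": 8, "w": 12, "x": x, "y": y},
        "targets": [{"refId": "A", "expr": expr}],
        "fieldConfig": {"defaults": {"unit": unit}, "overrides": []},
    }


def row(title: str, y: int) -> dict:
    """Build a row header that groups the panels placed below it."""
    return {"type": "row", "title": title, "collapsed": False,
            "gridPos": {"h": 1, "w": 24, "x": 0, "y": y}, "panels": []}


def build_dashboard(service: str) -> dict:
    sel = f'service="{service}"'  # hypothetical label name
    rate = f"sum(rate(http_requests_total{{{sel}}}[5m]))"
    errs = f'sum(rate(http_requests_total{{{sel},status=~"5.."}}[5m]))'
    return {
        "title": f"{service} overview",
        "panels": [
            row("RED (requests)", 0),
            panel("Request rate", rate, "reqps", 0, 1),
            panel("Error ratio", f"{errs} / ({rate} > 0)", "percentunit", 12, 1),
            row("USE (resources)", 9),
            # node_exporter metrics; assumes node_exporter is scraped.
            panel("CPU utilization",
                  '1 - avg(rate(node_cpu_seconds_total{mode="idle"}[5m]))',
                  "percentunit", 0, 10),
            panel("Load average (1m)", "avg(node_load1)", "short", 12, 10),
        ],
    }


if __name__ == "__main__":
    print(json.dumps(build_dashboard("checkout"), indent=2))
```

Keeping request-driven and resource panels in separate rows makes it easier to tell at a glance whether a problem shows up in the service's own behavior or in the resources underneath it.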
