
Grafana Query Caching Enterprise Prompt

Configure Grafana Enterprise query caching to cut data source load and speed up dashboards, with per-data-source TTLs and a shared Redis backend.

Target user
Grafana Enterprise admins reducing data source load
Difficulty
Intermediate
Tools
Claude, ChatGPT

The prompt

You are a senior Grafana Enterprise admin who configures query caching to reduce data source load and speed dashboards.

I will provide:
- The data sources under load
- The refresh patterns and dashboard concurrency
- Current caching config

Your job:

1. **Confirm the caching model**:
   - Query caching is a Grafana Enterprise / Cloud feature
   - Caches data source query responses so repeated queries skip the backend
   - Especially helps shared dashboards and public dashboards
2. **Pick a backend**:
   - In-memory (simple, per-instance, lost on restart)
   - Redis or Memcached (shared across HA replicas, recommended)
3. **Configure the backend**:
   - `[caching]` section: `enabled`, `backend`; `[caching.redis]` section: `url`, `prefix`
   - Size limits and connection pool for Redis
4. **Set TTLs**:
   - Global default TTL in `[caching]`
   - Override per data source in its settings (cache TTL ms)
   - Short TTL for volatile metrics, longer for slow-changing data
5. **Choose what to cache**:
   - Enable caching per data source, not blanket
   - Avoid caching alerting queries where staleness matters
6. **Watch invalidation**:
   - Cache keys include query + time range + vars
   - "Now" relative ranges still generate new keys each interval
7. **Measure**:
   - Track cache hit ratio and data source QPS reduction
   - Tune TTL to balance freshness vs load

Mark DESTRUCTIVE: overly long TTLs serve stale data; caching alert queries can hide incidents; flushing the cache spikes data source load momentarily.

---

Data sources under load: [DESCRIBE]
Refresh patterns: [DESCRIBE]
Current config: [DESCRIBE]

Why this prompt works

Query caching is one of the highest-leverage ways to protect a data source from dashboard fan-out, but a careless global TTL serves stale data and can cause alerts to evaluate against stale results. This prompt picks a shared Redis backend for HA, sets per-data-source TTLs matched to volatility, and excludes alerting queries, so you cut load without hiding incidents.

How to use it

  1. Enable caching with a Redis backend for HA.
  2. Set a conservative default TTL, override per data source.
  3. Exclude alerting-critical sources.
  4. Measure hit ratio and tune.
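
Step 4 is easier to act on with numbers in hand. The sketch below is a minimal illustration, not something Grafana ships: it computes a hit ratio from hit and miss counts and estimates the query rate that still reaches the data source. The function names and the sample counts are hypothetical; real counts would come from whatever cache metrics your deployment exposes.

```python
def hit_ratio(hits: int, misses: int) -> float:
    """Fraction of cache lookups served from cache (0.0 when there is no traffic)."""
    total = hits + misses
    return hits / total if total else 0.0


def backend_qps(dashboard_qps: float, ratio: float) -> float:
    """Estimated queries per second that still reach the data source."""
    return dashboard_qps * (1.0 - ratio)


if __name__ == "__main__":
    # Hypothetical counts sampled over one measurement window.
    ratio = hit_ratio(hits=800, misses=200)
    print(f"hit ratio: {ratio:.0%}")
    print(f"backend QPS: {backend_qps(50.0, ratio):.1f}")
```

If the ratio stays low after raising the TTL, the cause is more likely key churn than TTL length, so look at the time ranges and variables before raising the TTL further.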

Useful commands

# Enable caching and set TTLs for one data source (update its cache config)
curl -s -X POST -H "Authorization: Bearer $GRAFANA_TOKEN" \
  -H "Content-Type: application/json" \
  "http://grafana:3000/api/datasources/$DS_UID/cache" \
  -d '{"enabled":true,"ttlQueriesMs":60000,"ttlResourcesMs":300000,"useDefaultTTL":false}'

# Read the caching config for a data source
curl -s -H "Authorization: Bearer $GRAFANA_TOKEN" \
  "http://grafana:3000/api/datasources/$DS_UID/cache" | jq .

# Clean (flush) a data source cache; expect a brief load spike on the data source
curl -s -X POST -H "Authorization: Bearer $GRAFANA_TOKEN" \
  "http://grafana:3000/api/datasources/$DS_UID/cache/clean"

# Sample cache keys in Redis (SCAN-based, so it does not block the server the way KEYS does)
redis-cli -h redis --scan --pattern 'grafana:*' | head

# grafana.ini
[caching]
enabled = true
backend = redis

[caching.redis]
url = redis://redis:6379
prefix = grafana

Example config

// POST /api/datasources/:dataSourceUID/cache
{
  "enabled": true,
  "useDefaultTTL": false,
  "ttlQueriesMs": 60000,
  "ttlResourcesMs": 300000
}
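
When several data sources need different TTLs, it can help to generate these bodies from a small table instead of editing them by hand. The following is a rough sketch under stated assumptions: the volatility classes, the TTL values, and the data source UIDs are all hypothetical, and only the four field names are taken from the example config above.

```python
import json

# Hypothetical volatility classes mapped to query TTLs in milliseconds; tune per SLA.
TTL_BY_VOLATILITY_MS = {
    "volatile": 15_000,
    "normal": 60_000,
    "slow": 600_000,
}


def cache_payload(volatility: str, resources_ttl_ms: int = 300_000) -> dict:
    """Build a cache-config body; sources tagged 'alerting' get caching disabled."""
    enabled = volatility != "alerting"
    # A disabled source keeps a placeholder TTL so every body has the same shape.
    ttl_ms = TTL_BY_VOLATILITY_MS[volatility] if enabled else TTL_BY_VOLATILITY_MS["normal"]
    return {
        "enabled": enabled,
        "useDefaultTTL": False,
        "ttlQueriesMs": ttl_ms,
        "ttlResourcesMs": resources_ttl_ms,
    }


if __name__ == "__main__":
    # Hypothetical data source UIDs and their volatility class.
    plan = {"prom-main": "volatile", "billing-sql": "slow", "alerts-prom": "alerting"}
    for uid, volatility in plan.items():
        print(uid, json.dumps(cache_payload(volatility)))
```

Each printed body can be passed to the update call shown under "Useful commands". An unknown volatility class raises `KeyError`, which is deliberate: a typo should fail loudly rather than silently pick a TTL.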

Common findings this catches

  • High data source QPS → caching not enabled on the hot source.
  • Stale dashboards → TTL too long for volatile metrics.
  • Cache lost on restart → in-memory backend on HA.
  • Low hit ratio → relative-now ranges regenerating keys.
  • Stale alerts → caching applied to alerting queries.
  • Feature missing → running OSS, not Enterprise.
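
The low-hit-ratio finding is easier to reason about with a toy model of a cache key. The sketch below is not Grafana's actual key derivation; it only assumes, as the prompt states, that the key covers the query, the time range, and the variables. The query, variables, and timestamps are hypothetical. It shows that two refreshes a few seconds apart produce different keys when "now" is resolved to the millisecond, and the same key when "now" is snapped to a one-minute bucket.

```python
import hashlib
import json


def cache_key(query: str, from_ms: int, to_ms: int, variables: dict) -> str:
    """Toy key: a hash over the query text, absolute time range, and variables."""
    material = json.dumps(
        {"query": query, "from": from_ms, "to": to_ms, "vars": variables},
        sort_keys=True,
    )
    return hashlib.sha256(material.encode("utf-8")).hexdigest()


def relative_range(now_ms: int, window_ms: int, bucket_ms: int = 1) -> tuple:
    """Resolve 'last <window>' to absolute bounds, snapping 'now' down to a bucket."""
    end = now_ms - (now_ms % bucket_ms)
    return end - window_ms, end


if __name__ == "__main__":
    query, variables = "up", {"job": "api"}  # hypothetical query and variables
    hour = 3_600_000
    refreshes = (1_700_000_000_000, 1_700_000_005_000)  # two refreshes 5 s apart

    raw = {cache_key(query, *relative_range(t, hour), variables) for t in refreshes}
    snapped = {
        cache_key(query, *relative_range(t, hour, 60_000), variables) for t in refreshes
    }
    print("distinct keys without snapping:", len(raw))
    print("distinct keys with 60 s snapping:", len(snapped))
```

In this model the unsnapped refreshes never share a key, which is the pattern to suspect when a hot dashboard shows a low hit ratio despite a generous TTL.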

When to escalate

  • Redis sizing and HA for the cache tier — infra.
  • Deciding acceptable staleness per SLA — service owners.
  • Data source still saturated after caching — query/dashboard optimization.
