
Grafana Loki + Prometheus Correlation Prompt

Correlate metrics and logs in Grafana — exemplars from Prometheus to traces, derived fields from Loki, jump from spike to log line.

Target user
SREs debugging with metrics + logs
Difficulty
Advanced
Tools
Claude, ChatGPT

The prompt

You are a senior SRE who has built dashboards correlating metrics with logs — click on a latency spike, see relevant logs immediately.

I will provide:
- Current setup (Prom + Loki versions)
- Use case
- Symptom (correlation not working)

Your job:

1. **Correlation patterns**:
   - **Time-aligned panels** — metric + log volume + logs on same dashboard
   - **Exemplars** — Prom links to traces from histogram buckets
   - **Derived fields** — Loki extracts traceID from logs, links to Tempo
   - **Split view (Explore)** — drag from metric panel to logs
2. **For exemplars**:
   - Prometheus must support exemplars
   - Apps emit metrics with exemplar (traceID)
   - Grafana renders dots on histogram heatmap
3. **For Loki derived fields**:
   - Regex match in log
   - Extract traceID
   - Link to Tempo datasource
4. **For panel-to-panel sync**:
   - Shared dashboard time range and shared crosshair/tooltip
   - Click drills into specific service
5. **For Tempo / trace correlation**:
   - From metric: exemplar → trace
   - From log: derived field → trace
   - From trace: service graph
6. **For ad-hoc filters**:
   - Variable applied to all panels
   - Useful for narrowing investigation
7. **For Explore mode**:
   - Side-by-side metrics + logs
   - Time linked
8. **For data source UIDs**:
   - Cross-DS links need correct UID
   - DS provisioning sets these

Mark DESTRUCTIVE: removing exemplars from app (loses correlation), changing DS UID (breaks derived field links), overly aggressive derived field regex (false matches).

---

Setup: [DESCRIBE]
Use case: [DESCRIBE]
Symptom: [DESCRIBE]

Why this prompt works

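Correlation only works when every link in the chain is configured: exemplars on the metric, trace IDs in the logs, and matching data source UIDs in Grafana. This prompt makes the model check each link in turn and flags the changes that would break one.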

How to use it

  1. App emits exemplars + traceID in logs.
  2. Prom stores exemplars.
  3. Loki derived fields extract traceID.
  4. Tempo serves traces.

Setup

Prometheus exemplar support

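# prometheus.yml
global:
  scrape_interval: 30s

# Exemplar storage (Prometheus 2.26+) is a fixed-size in-memory circular buffer.
# This block only takes effect when the feature flag below is set.
storage:
  exemplars:
    max_exemplars: 1000000

# Command-line flag (not part of prometheus.yml):
#   prometheus --config.file=prometheus.yml --enable-feature=exemplar-storage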

App instrumentation (Go example)

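import "github.com/prometheus/client_golang/prometheus"

// Create the histogram once at startup and register it.
var histogram = prometheus.NewHistogramVec(prometheus.HistogramOpts{
    Name:    "http_request_duration_seconds",
    Help:    "HTTP request latency in seconds.",
    Buckets: prometheus.DefBuckets,
}, []string{"method"})

func init() { prometheus.MustRegister(histogram) }

// Observe with exemplar. duration is assumed to be a time.Duration (as in the
// logging call below), so convert it to seconds to match the metric name.
// Exemplars are only exposed in the OpenMetrics format: serve /metrics with
// promhttp.HandlerOpts{EnableOpenMetrics: true}.
histogram.WithLabelValues("GET").(prometheus.ExemplarObserver).
    ObserveWithExemplar(duration.Seconds(), prometheus.Labels{"traceID": traceID})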

Loki app: include traceID in logs

logger.Info("request processed",
    zap.String("traceID", traceID),
    zap.Duration("duration", duration))

Loki derived field

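datasources:
- name: Loki
  type: loki
  uid: loki-uid   # referenced by tracesToLogs in the Tempo datasource
  jsonData:
    derivedFields:
    - matcherRegex: 'traceID[=":\s]+(\w+)'
      name: TraceID
      # With datasourceUid set, url is the query sent to that datasource.
      # $$ escapes $ so provisioning does not treat it as an environment variable.
      url: '$${__value.raw}'
      datasourceUid: tempo-uid
      urlDisplayLabel: "View Trace"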
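
Before shipping a derived-field pattern, it can help to run it against a few real log lines. The sketch below does that in Go with the same pattern as above. Two assumptions to keep in mind: Grafana evaluates the pattern in the browser as a JavaScript regex, while Go uses RE2 (this particular pattern behaves the same in both), and the sample log lines and the `extractTraceID` helper are made up for illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// derivedFieldRe mirrors the matcherRegex from the Loki data source config.
var derivedFieldRe = regexp.MustCompile(`traceID[=":\s]+(\w+)`)

// extractTraceID returns the first capture group, which is the value Grafana
// hands to the linked data source as ${__value.raw}.
func extractTraceID(line string) (string, bool) {
	m := derivedFieldRe.FindStringSubmatch(line)
	if m == nil {
		return "", false
	}
	return m[1], true
}

func main() {
	// Hypothetical log lines: logfmt, JSON, no trace ID, and a near miss.
	samples := []string{
		`level=info msg="request processed" traceID=4bf92f3577b34da6 duration=120ms`,
		`{"level":"info","msg":"request processed","traceID":"4bf92f3577b34da6"}`,
		`level=info msg="health check ok"`,
		`level=info upstream_traceID=deadbeef msg="proxied"`, // the unanchored pattern matches this too
	}
	for _, s := range samples {
		id, ok := extractTraceID(s)
		fmt.Printf("matched=%-5t id=%q\n", ok, id)
	}
}
```

The last sample shows the false-match risk: because the pattern is not anchored, any field name ending in traceID is picked up. Whether that matters depends on your log schema.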

Tempo datasource

- name: Tempo
  type: tempo
  uid: tempo-uid
  url: http://tempo:3200
  jsonData:
    tracesToLogs:
      datasourceUid: loki-uid
      filterByTraceID: true
    tracesToMetrics:
      datasourceUid: prometheus-uid
    serviceMap:
      datasourceUid: prometheus-uid
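
Most broken cross-datasource links come down to a UID that does not match. A minimal sketch of a pre-deploy check, assuming you copy the provisioned UIDs and the referenced UIDs into Go by hand (the `link` type and `danglingLinks` helper are hypothetical, not part of Grafana):

```go
package main

import (
	"fmt"
	"sort"
)

// link is one cross-data-source reference, e.g. a derived field or a
// tracesToLogs entry.
type link struct {
	From      string // where the reference is configured (free-form)
	TargetUID string // the datasourceUid it points at
}

// danglingLinks returns the links whose target UID is not provisioned,
// sorted by From for stable output.
func danglingLinks(provisioned map[string]string, links []link) []link {
	var missing []link
	for _, l := range links {
		if _, ok := provisioned[l.TargetUID]; !ok {
			missing = append(missing, l)
		}
	}
	sort.Slice(missing, func(i, j int) bool { return missing[i].From < missing[j].From })
	return missing
}

func main() {
	// UID -> data source name, as set in the provisioning files.
	provisioned := map[string]string{
		"loki-uid":       "Loki",
		"tempo-uid":      "Tempo",
		"prometheus-uid": "Prometheus",
	}
	links := []link{
		{"Loki derivedFields.TraceID", "tempo-uid"},
		{"Tempo tracesToLogs", "loki-uid"},
		{"Tempo tracesToMetrics", "prometheus-uid"},
		{"Tempo serviceMap", "prometheus_uid"}, // deliberate typo to show a failure
	}
	for _, l := range danglingLinks(provisioned, links) {
		fmt.Printf("%s -> %q does not match any provisioned UID\n", l.From, l.TargetUID)
	}
}
```

In practice you would load both sides from the provisioning YAML rather than hard-coding them; the comparison stays the same.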

Correlated dashboard layout

┌─────────────────────────────────────────┐
│ Service Selector: $service              │
├─────────────────────────────────────────┤
│ Request Rate         | Error Rate       │
│ [time series]        | [time series]    │
├─────────────────────────────────────────┤
│ Latency Heatmap (with exemplar dots)     │
│ [click exemplar → trace in Tempo]        │
├─────────────────────────────────────────┤
│ Log Volume           | Logs (filtered)   │
│ [time series]        | [logs panel]      │
│                      | [click TraceID    │
│                      |  → trace in Tempo]│
└─────────────────────────────────────────┘
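
The "Logs (filtered)" panel needs a LogQL query scoped to the selected service, and during an investigation you often narrow it further to one trace ID. A rough sketch of building that query, assuming the stream label is called `service` (use whatever label your log shipper attaches) and that names contain ordinary printable characters:

```go
package main

import (
	"fmt"
	"strconv"
)

// logQLForTrace builds a LogQL line-filter query for one service and trace ID.
// The label name "service" is an assumption, not something Loki mandates.
func logQLForTrace(service, traceID string) string {
	// strconv.Quote produces a double-quoted, backslash-escaped string literal.
	return fmt.Sprintf("{service=%s} |= %s", strconv.Quote(service), strconv.Quote(traceID))
}

func main() {
	fmt.Println(logQLForTrace("checkout", "4bf92f3577b34da6"))
	// prints: {service="checkout"} |= "4bf92f3577b34da6"
}
```

In the dashboard itself the service value would come from the $service variable rather than a function argument.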

Common findings this catches

  • No exemplars visible → app not emitting OR Prom not storing.
  • Derived field not linking → DS UID wrong; regex no match.
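  • Click on exemplar goes nowhere → exemplarTraceIdDestinations missing on the Prometheus data source, or its label name does not match the exemplar label (traceID).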
  • Time skew between metric and log → NTP issues.
  • High exemplar volume → exemplars live in a fixed-size in-memory buffer, so old ones are evicted; tune max_exemplars.
  • Tempo can’t find trace → retention; sampling lost it.
  • Multi-cluster correlation → cluster label propagation.
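
For the first finding, a quick way to tell "app not emitting" from "Prometheus not storing" is to look at the app's own /metrics output in OpenMetrics format (for example by requesting it with an `Accept: application/openmetrics-text` header) and check whether bucket lines carry an exemplar after a ` # ` separator. A small sketch of that check; the `exemplarOf` helper and the sample lines are illustrative, and the parsing is deliberately simplistic:

```go
package main

import (
	"fmt"
	"strings"
)

// exemplarOf returns the exemplar part of one OpenMetrics sample line, i.e.
// everything after the " # " separator, or false if the line carries none.
// Simplification: a label value that itself contains " # {" would confuse it.
func exemplarOf(line string) (string, bool) {
	if strings.HasPrefix(line, "#") { // HELP, TYPE and EOF lines
		return "", false
	}
	i := strings.Index(line, " # {")
	if i < 0 {
		return "", false
	}
	return line[i+3:], true
}

func main() {
	// Hypothetical scrape output from the instrumented app.
	scrape := []string{
		`# TYPE http_request_duration_seconds histogram`,
		`http_request_duration_seconds_bucket{method="GET",le="0.25"} 17`,
		`http_request_duration_seconds_bucket{method="GET",le="0.5"} 21 # {traceID="4bf92f3577b34da6"} 0.43`,
	}
	found := 0
	for _, line := range scrape {
		if ex, ok := exemplarOf(line); ok {
			found++
			fmt.Println("exemplar:", ex)
		}
	}
	if found == 0 {
		fmt.Println("no exemplars in scrape output: check app instrumentation first")
	}
}
```

If the app output has exemplars but Grafana shows none, the next suspects are the exemplar-storage feature flag and the Exemplars toggle on the panel query.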

When to escalate

  • App instrumentation rollout — engage app teams.
  • Tempo / Loki scaling — engineering.
  • Trace sampling design — coordinate.
