
Custom Prometheus Exporter Design Prompt

Design and write a custom Prometheus exporter — client library, metric types, registration, scrape efficiency.

Target user
Engineers writing Prometheus exporters
Difficulty
Advanced
Tools
Claude, ChatGPT

The prompt

You are a senior engineer who has written Prometheus exporters for legacy systems — pulling metrics from custom APIs, databases, files, and exposing them as `/metrics`.

I will provide:
- The data source (API, log file, DB, custom protocol)
- Programming language preference
- Metric semantics needed

Your job:

1. **Pick metric types**:
   - **Counter** — monotonically increasing (requests served)
   - **Gauge** — value can go up/down (current connections)
   - **Histogram** — observations with buckets (latency)
   - **Summary** — observations with quantiles
2. **Pick library**:
   - Go: `github.com/prometheus/client_golang`
   - Python: `prometheus_client`
   - Java: `client_java` (`prometheus-metrics-core` in 1.x, `simpleclient` in the older 0.x releases)
   - Node: `prom-client`
3. **Naming conventions**:
   - `<namespace>_<subsystem>_<name>_<unit>`
   - Lowercase, underscores
   - Use base units (`_seconds`, `_bytes`)
   - Suffix: `_total` for counters; gauges take no `_total`, only a unit suffix where one applies
4. **For label design**:
   - Low cardinality
   - Static or bounded values
   - Avoid request IDs, timestamps
5. **For scrape implementation**:
   - **Push collectors** — values updated on data change
   - **Pull collectors** — collect on each scrape (slower but fresh)
   - Implement the collector interface (`Describe()`/`Collect()` in Go, `collect()` in Python) for on-demand collection
6. **For efficiency**:
   - Cache expensive queries within scrape interval
   - Batch fetch from source
   - Concurrent collection
7. **For health metrics**:
   - `<exporter>_up` (1/0)
   - Error counters
   - Scrape duration self-metric
8. **For HTTP handler**:
   - `promhttp.Handler()` (Go)
   - Custom registry vs default

Mark DESTRUCTIVE: high-cardinality labels (Prometheus OOM), pull collectors running expensive queries (latency on every scrape), and sensitive data exposed in metrics.

---

Data source: [DESCRIBE]
Language: [DESCRIBE]
Metric semantics: [DESCRIBE]

Why this prompt works

Writing an exporter means following Prometheus conventions for metric types, naming, labels, and scrape behavior. This prompt walks through each of them in order.

How to use it

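  1. Match each metric's type to its semantics (counter, gauge, histogram, or summary).
  2. Name metrics by the conventions so they are easy to discover.
  3. Keep label cardinality low and bounded.
  4. Add health metrics for the exporter itself.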
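
As a rough illustration of steps 1 and 2, the naming rules from the prompt can be turned into a small lint check. This is a minimal sketch, not part of any Prometheus library: `check_metric_name` and the list of non-base units are hypothetical names chosen for this example, and the unit list is deliberately incomplete.

```python
import re

# Lowercase letters, digits, and underscores only. Colons are legal in
# Prometheus names but reserved for recording rules, so they are excluded here.
_NAME_RE = re.compile(r"^[a-z_][a-z0-9_]*$")

# Hypothetical, non-exhaustive list of non-base unit suffixes worth flagging.
_NON_BASE_UNITS = ("_milliseconds", "_ms", "_minutes", "_kilobytes", "_megabytes", "_kb", "_mb")


def check_metric_name(name, metric_type):
    """Return a list of convention problems; an empty list means no findings."""
    problems = []
    if not _NAME_RE.match(name):
        problems.append("use lowercase letters, digits, and underscores only")
    if metric_type == "counter" and not name.endswith("_total"):
        problems.append("counters should end in _total")
    if metric_type != "counter" and name.endswith("_total"):
        problems.append("_total is reserved for counters")
    for unit in _NON_BASE_UNITS:
        # Match the unit as a trailing suffix or as a whole inner word.
        if name.endswith(unit) or (unit + "_") in name:
            problems.append(f"prefer base units over {unit}")
    return problems
```

Calling `check_metric_name("myapp_request_latency_ms", "histogram")` flags the `_ms` suffix, while `check_metric_name("myapp_http_requests_total", "counter")` returns an empty list.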

Examples

Go example (using client_golang)

package main

import (
    "log"
    "net/http"
    "time"

    "github.com/prometheus/client_golang/prometheus"
    "github.com/prometheus/client_golang/prometheus/promhttp"
)

var (
    requestsTotal = prometheus.NewCounterVec(
        prometheus.CounterOpts{
            Namespace: "myapp",
            Name:      "http_requests_total",
            Help:      "Total HTTP requests by method and status",
        },
        []string{"method", "status"},
    )

    activeConnections = prometheus.NewGauge(
        prometheus.GaugeOpts{
            Namespace: "myapp",
            Name:      "active_connections",
            Help:      "Currently active connections",
        },
    )

    requestDuration = prometheus.NewHistogramVec(
        prometheus.HistogramOpts{
            Namespace: "myapp",
            Name:      "request_duration_seconds",
            Help:      "Request duration",
            Buckets:   prometheus.DefBuckets,           // 5ms..10s
        },
        []string{"method"},
    )
)

func init() {
    prometheus.MustRegister(requestsTotal)
    prometheus.MustRegister(activeConnections)
    prometheus.MustRegister(requestDuration)
}

func handler(w http.ResponseWriter, r *http.Request) {
    start := time.Now()
    activeConnections.Inc()
    defer activeConnections.Dec()

    // ... do work ...
    status := "200"

    requestsTotal.WithLabelValues(r.Method, status).Inc()
    requestDuration.WithLabelValues(r.Method).Observe(time.Since(start).Seconds())
    w.Write([]byte("OK"))
}

func main() {
    http.HandleFunc("/", handler)
    http.Handle("/metrics", promhttp.Handler())
    log.Fatal(http.ListenAndServe(":8080", nil))
}

Python example (custom collector for external API)

import time

import requests
from prometheus_client import start_http_server
from prometheus_client.core import REGISTRY, GaugeMetricFamily


class CustomCollector:
    def collect(self):
        # Fetch and parse first, so a failure can never follow an
        # already-emitted myapi_up=1 within the same scrape.
        try:
            r = requests.get("http://internal-api/stats", timeout=5)
            r.raise_for_status()
            workers_by_queue = r.json()['workers']
        except Exception:
            # Failure marker: report the API as down and emit nothing else
            yield GaugeMetricFamily('myapi_up', 'API reachable', value=0)
            return

        # Up metric
        yield GaugeMetricFamily('myapi_up', 'API reachable', value=1)

        # Application metrics, one sample per queue
        workers = GaugeMetricFamily(
            'myapi_active_workers',
            'Active worker count',
            labels=['queue'])
        for queue, count in workers_by_queue.items():
            workers.add_metric([queue], count)
        yield workers


REGISTRY.register(CustomCollector())
start_http_server(8000)
while True:    # keep running; the HTTP server lives in a daemon thread
    time.sleep(60)
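
The collector above calls the API on every scrape. When that call is expensive, the prompt's advice to cache within the scrape interval could look roughly like the sketch below. `CachedFetch` is a hypothetical helper written for this example, not part of `prometheus_client`; the injectable `clock` parameter is an assumption added to make the behavior easy to test.

```python
import threading
import time


class CachedFetch:
    """Wrap an expensive fetch so calls within ttl_seconds reuse the last result."""

    def __init__(self, fetch, ttl_seconds, clock=time.monotonic):
        self._fetch = fetch          # zero-argument callable that hits the real source
        self._ttl = ttl_seconds
        self._clock = clock
        self._lock = threading.Lock()
        self._value = None
        self._fetched_at = None

    def get(self):
        # The lock keeps overlapping scrapes from triggering parallel fetches.
        with self._lock:
            now = self._clock()
            if self._fetched_at is None or now - self._fetched_at >= self._ttl:
                # Cache is empty or stale. If the fetch raises, the old
                # value and timestamp are left untouched.
                self._value = self._fetch()
                self._fetched_at = now
            return self._value
```

Inside `collect()`, a call to something like `stats_cache.get()` would then stand in for the direct `requests.get(...)` call. A TTL somewhat shorter than the Prometheus scrape interval keeps values reasonably fresh while absorbing scrapes from more than one Prometheus server.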

Common findings this catches

  • High cardinality from a path label → strip the ID portions.
  • Pull collector slow → cache results or switch to the push model.
  • Counter not monotonic → it is really a gauge.
  • No _up metric → no way to tell whether the exporter is healthy.
  • Histogram buckets cover the wrong range → percentiles are inaccurate.
  • Same name registered with different labels → panic at registration.
  • Secret in a metric value → audit what is exported.
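
For the first finding, stripping ID portions from a path before using it as a label value might be sketched as follows. `normalize_path` and the `:id` placeholder are hypothetical, and the three ID shapes it recognizes (UUIDs, long hex strings, numeric segments) are assumptions about what the paths contain.

```python
import re

# Assumed ID shapes: UUIDs, long hex strings, and purely numeric segments.
_UUID = re.compile(
    r"^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}$"
)
_HEX = re.compile(r"^[0-9a-fA-F]{16,}$")
_NUM = re.compile(r"^\d+$")


def normalize_path(path):
    """Drop the query string and replace ID-like segments with a placeholder."""
    path = path.split("?", 1)[0]
    segments = []
    for seg in path.split("/"):
        if _UUID.match(seg) or _HEX.match(seg) or _NUM.match(seg):
            segments.append(":id")
        else:
            segments.append(seg)
    return "/".join(segments)
```

With this sketch, `/users/12345/orders/987?page=2` becomes `/users/:id/orders/:id`. Arbitrary client-supplied paths can still produce unbounded values, so labeling by the matched route template, where the framework exposes one, is usually the safer choice.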

When to escalate

  • Performance issues at scale — async/cache.
  • Sensitive data — security review.
  • Vendor exporter contribution — community.
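
Before escalating a performance issue, the concurrent collection mentioned in the prompt is often worth trying. The following is a minimal sketch using only the standard library; `fetch_all` is a hypothetical helper, and each fetch callable is assumed to enforce its own timeout.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_all(sources, max_workers=8):
    """Run every fetch callable in a thread pool.

    sources maps a source name to a zero-argument callable.
    Returns (results, errors): results maps name to the returned value,
    errors maps name to the exception that callable raised.
    """
    results, errors = {}, {}
    if not sources:
        return results, errors
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Submit everything first so the fetches overlap, then gather.
        futures = {name: pool.submit(fn) for name, fn in sources.items()}
        for name, future in futures.items():
            try:
                results[name] = future.result()
            except Exception as exc:
                # One failing source should not hide the others.
                errors[name] = exc
    return results, errors
```

The `errors` mapping is a natural input for the health metrics in the prompt: each failed source can increment an error counter and set its own up gauge to 0, while the remaining sources are still exported.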
