Custom Prometheus Exporter Design Prompt
Design and write a custom Prometheus exporter — client library, metric types, registration, scrape efficiency.
- Target user: Engineers writing Prometheus exporters
- Difficulty: Advanced
- Tools: Claude, ChatGPT
The prompt
You are a senior engineer who has written Prometheus exporters for legacy systems, pulling metrics from custom APIs, databases, and files and exposing them as `/metrics`.

I will provide:
- The data source (API, log file, DB, custom protocol)
- Programming language preference
- Metric semantics needed

Your job:

1. **Pick metric types**:
   - **Counter**: monotonically increasing (requests served)
   - **Gauge**: value can go up/down (current connections)
   - **Histogram**: observations with buckets (latency)
   - **Summary**: observations with quantiles
2. **Pick library**:
   - Go: `github.com/prometheus/client_golang`
   - Python: `prometheus_client`
   - Java: `simpleclient`
   - Node: `prom-client`
3. **Naming conventions**:
   - `<namespace>_<subsystem>_<name>_<unit>`
   - Lowercase, underscores
   - Use base units (`_seconds`, `_bytes`)
   - Suffix: `_total` for counters, no suffix for gauges
4. **For label design**:
   - Low cardinality
   - Static or bounded values
   - Avoid request IDs, timestamps
5. **For scrape implementation**:
   - **Push collectors**: values updated on data change
   - **Pull collectors**: collect on each scrape (slower but fresh)
   - Use `Collect()` interface for on-demand
6. **For efficiency**:
   - Cache expensive queries within scrape interval
   - Batch fetch from source
   - Concurrent collection
7. **For health metrics**:
   - `<exporter>_up` (1/0)
   - Error counters
   - Scrape duration self-metric
8. **For HTTP handler**:
   - `promhttp.Handler()` (Go)
   - Custom registry vs default

Mark DESTRUCTIVE: high-cardinality labels (Prom OOM), pull collector with expensive queries (latency on every scrape), exposing sensitive data in metrics.

---

Data source: [DESCRIBE]
Language: [DESCRIBE]
Metric semantics: [DESCRIBE]
Why this prompt works
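Writing a good exporter depends on knowing Prometheus conventions: which metric type fits which semantics, how names and labels are formed, and how collection behaves at scrape time. This prompt walks through them in order.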
How to use it
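- Match each metric's type to its semantics: counters for totals that only increase, gauges for values that move both ways, histograms or summaries for distributions such as latency.
- Follow the naming convention so metrics are easy to discover.
- Keep label cardinality low and bounded.
- Add health metrics: an `_up` gauge, error counters, and scrape duration.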
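The naming rules can be checked mechanically before a metric ever reaches a registry. The sketch below uses plain Python with no client library. It assembles a name from the `<namespace>_<subsystem>_<name>_<unit>` parts listed in the prompt and reports convention problems. The helper names `build_metric_name` and `check_metric_name` are hypothetical, the list of non-base units is only a small illustrative sample, and the regex is a deliberately strict lowercase-and-underscores rule matching the convention above rather than the full character set Prometheus accepts.

```python
import re

# Strict convention from the prompt: lowercase letters, digits, underscores,
# starting with a letter. Prometheus itself accepts a wider character set.
_NAME_RE = re.compile(r"^[a-z][a-z0-9_]*$")

# A few non-base units to flag; an illustrative sample, not a complete list.
_NON_BASE_UNIT_RE = re.compile(r"_(milliseconds|ms|kilobytes|megabytes)(_|$)")


def build_metric_name(namespace, subsystem, name, unit="", counter=False):
    """Join the non-empty parts with underscores; counters get a _total suffix."""
    parts = [p for p in (namespace, subsystem, name, unit) if p]
    if counter:
        parts.append("total")
    return "_".join(parts)


def check_metric_name(metric_name, metric_type):
    """Return a list of convention problems (an empty list means it passes)."""
    problems = []
    if not _NAME_RE.match(metric_name):
        problems.append("use lowercase letters, digits and underscores only")
    if metric_type == "counter" and not metric_name.endswith("_total"):
        problems.append("counters should end in _total")
    if metric_type == "gauge" and metric_name.endswith("_total"):
        problems.append("_total is reserved for counters")
    if _NON_BASE_UNIT_RE.search(metric_name):
        problems.append("use base units such as _seconds or _bytes")
    return problems
```

A check like this could run in a unit test over every metric the exporter defines, so a naming slip fails the build instead of reaching dashboards.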
Examples
Go example (using client_golang)
package main

import (
	"log"
	"net/http"
	"time"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promhttp"
)

var (
	// Counter: only ever increases; exposed as myapp_http_requests_total.
	requestsTotal = prometheus.NewCounterVec(
		prometheus.CounterOpts{
			Namespace: "myapp",
			Name:      "http_requests_total",
			Help:      "Total HTTP requests by method and status",
		},
		[]string{"method", "status"},
	)
	// Gauge: goes up and down with the number of in-flight requests.
	activeConnections = prometheus.NewGauge(
		prometheus.GaugeOpts{
			Namespace: "myapp",
			Name:      "active_connections",
			Help:      "Currently active connections",
		},
	)
	// Histogram: request latency in seconds, bucketed.
	requestDuration = prometheus.NewHistogramVec(
		prometheus.HistogramOpts{
			Namespace: "myapp",
			Name:      "request_duration_seconds",
			Help:      "Request duration",
			Buckets:   prometheus.DefBuckets, // 5ms..10s
		},
		[]string{"method"},
	)
)

func init() {
	// Register with the default registry, which promhttp.Handler() serves.
	prometheus.MustRegister(requestsTotal, activeConnections, requestDuration)
}

func handler(w http.ResponseWriter, r *http.Request) {
	start := time.Now()
	activeConnections.Inc()
	defer activeConnections.Dec()

	// ... do work ...
	status := "200"

	requestsTotal.WithLabelValues(r.Method, status).Inc()
	requestDuration.WithLabelValues(r.Method).Observe(time.Since(start).Seconds())
	w.Write([]byte("OK"))
}

func main() {
	http.HandleFunc("/", handler)
	http.Handle("/metrics", promhttp.Handler())
	log.Fatal(http.ListenAndServe(":8080", nil))
}
Python example (custom collector for external API)
import time

import requests
from prometheus_client import start_http_server
from prometheus_client.core import REGISTRY, GaugeMetricFamily


class CustomCollector:
    def collect(self):
        # Fetch and build everything before yielding, so a failure midway
        # cannot emit myapi_up twice (once as 1, then again as 0).
        try:
            r = requests.get("http://internal-api/stats", timeout=5)
            r.raise_for_status()
            data = r.json()
            # Application metrics
            workers = GaugeMetricFamily(
                'myapi_active_workers',
                'Active worker count',
                labels=['queue'])
            for queue, count in data['workers'].items():
                workers.add_metric([queue], count)
        except Exception:
            # Failure marker
            yield GaugeMetricFamily('myapi_up', 'API reachable', value=0)
            return
        # Up metric
        yield GaugeMetricFamily('myapi_up', 'API reachable', value=1)
        yield workers


REGISTRY.register(CustomCollector())
start_http_server(8000)
while True:  # keep the process alive; the metrics server runs in a background thread
    time.sleep(60)
Common findings this catches
- High cardinality from path label → strip ID portions.
- Pull collector slow → cache or use push model.
- Counter not monotonic → confused with gauge.
- No `_up` metric → no way to know if exporter healthy.
- Histogram buckets wrong range → percentiles inaccurate.
- Same name different labels registered → panic.
- Secret in a label value or help text → audit.
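The first finding, a path label whose cardinality grows with every ID, can be handled by normalizing the path before it is used as a label value. The following is a minimal sketch in plain Python. It assumes that ID-like segments are pure digits, UUIDs, or hex strings of 16 or more characters. The function name `normalize_path` and the `:id` placeholder are hypothetical choices for illustration.

```python
import re

# Path segments that look like identifiers: digits, UUIDs, or long hex strings.
_ID_SEGMENT = re.compile(
    r"^(\d+"
    r"|[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}"
    r"|[0-9a-fA-F]{16,})$"
)


def normalize_path(path, placeholder=":id"):
    """Replace ID-like segments so the label takes a bounded set of values."""
    path = path.split("?", 1)[0]  # drop the query string
    segments = path.split("/")
    cleaned = [placeholder if _ID_SEGMENT.match(s) else s for s in segments]
    return "/".join(cleaned)
```

For example, `normalize_path("/users/12345/orders/987?page=2")` returns `/users/:id/orders/:id`. This only bounds cardinality when the remaining segments come from a fixed set of routes. Where the web framework exposes the matched route template, using that as the label value is the safer choice.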
When to escalate
- Performance issues at scale — async/cache.
- Sensitive data — security review.
- Vendor exporter contribution — community.
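Before escalating a slow pull collector, a small time-based cache is often worth trying. The sketch below is plain Python and independent of any client library. It wraps an expensive fetch so that scrapes arriving within the TTL reuse the last result. The class name `TTLCache` and the injectable `clock` parameter (there to make the behavior testable) are assumptions for illustration, not part of `prometheus_client`.

```python
import threading
import time


class TTLCache:
    """Cache the result of one expensive fetch for ttl_seconds."""

    def __init__(self, fetch, ttl_seconds, clock=time.monotonic):
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._lock = threading.Lock()
        self._value = None
        self._expires_at = None  # None means nothing has been cached yet

    def get(self):
        # The lock also serializes concurrent scrapes, so only one of them
        # hits the source while the others wait for the fresh value.
        with self._lock:
            now = self._clock()
            if self._expires_at is None or now >= self._expires_at:
                # If fetch raises, nothing is cached and the error propagates.
                self._value = self._fetch()
                self._expires_at = now + self._ttl
            return self._value
```

In a custom collector, `collect()` would call `cache.get()` instead of querying the source directly. Setting the TTL slightly below the scrape interval keeps each scrape reasonably fresh, while extra scrapers or manual requests to `/metrics` are served from the cached value.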
Related prompts
- Blackbox Exporter Probe Configuration Prompt: Configure blackbox_exporter for HTTP, TCP, ICMP, DNS probes (uptime monitoring, certificate expiry, response validation).
- Prometheus Scrape Config & Service Discovery Prompt: Configure Prometheus scrape targets (kubernetes_sd, ec2_sd, file_sd, consul_sd, relabeling, scrape interval tuning).
- PromQL Histogram & Quantile Calculation Prompt: Use Prometheus histograms correctly (`histogram_quantile`, bucket bounds, p99 latency calculation, histogram vs summary, native histograms).