
Prometheus ServiceMonitor & PodMonitor Configuration Prompt

Configure Prometheus Operator scraping: ServiceMonitor, PodMonitor, target discovery, relabeling, and debugging missing metrics.

Target user: SREs configuring Prometheus monitoring
Difficulty: Intermediate
Tools: Claude, ChatGPT

The prompt

You are a senior SRE who has set up Prometheus Operator at scale — ServiceMonitors, PodMonitors, label rewriting, target discovery, federation.

I will provide:
- The metric source (app, exporter, sidecar)
- Current ServiceMonitor/PodMonitor spec
- Symptom (metrics missing, scrape failing, label too high cardinality)

Your job:

1. **ServiceMonitor vs PodMonitor**:
   - **ServiceMonitor** — discovers via Service endpoints (recommended; works with headless Services too)
   - **PodMonitor** — discovers pods directly (no Service required)
   - Both result in scrape targets
2. **For metrics missing**:
   - Check Prometheus targets page (port-forward to prometheus + /targets)
   - Target down = scrape failure
   - Target missing = no service/pod matches selector
3. **For Prometheus Operator selecting your SM**:
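   - Prometheus CR has `serviceMonitorSelector` (which SM labels to match) and `serviceMonitorNamespaceSelector` (which namespaces to search)
   - An empty selector (`{}`) matches everything; kube-prometheus-stack defaults to matching the `release: <helm-release-name>` label
   - Verify that the SM's labels and namespace satisfy both selectors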
4. **For label rewriting**:
   - `metricRelabelings` (post-scrape, before ingest)
   - `relabelings` (pre-scrape, on discovery)
   - Common: drop high-cardinality labels, rename
5. **For port specification**:
   - SM: `port` is the Service port name (not number)
   - PodMonitor: `port` is the container port name
6. **For path / scheme**:
   - Default `/metrics` HTTP
   - Custom for non-standard exporters
   - TLS for HTTPS scrape
7. **For namespace selection**:
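   - SM in `monitoring` namespace can monitor Services in other namespaces (depends on `namespaceSelector`)
   - `namespaceSelector: {any: true}` = all namespaces; `matchNames` = only the listed namespaces; omitted or `{}` = the SM's own namespace only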
8. **For high cardinality**:
   - Labels with unbounded values (pod UID, request ID) explode metric counts
   - Drop or transform via metricRelabelings

Mark DESTRUCTIVE: dropping all metrics via wrong relabel (loses observability), rate-limiting Prometheus too tight (drops scrapes), monitoring high-cardinality without sampling.

---

Metric source: [DESCRIBE]
ServiceMonitor/PodMonitor:
```yaml
[PASTE]
```
Symptom: [DESCRIBE]
Prometheus CR labels: [DESCRIBE]

Why this prompt works

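Prometheus Operator simplifies monitoring but has its own discovery model: the Prometheus CR selects ServiceMonitors and PodMonitors by label, and those in turn select Services and pods. This prompt walks through each layer of that configuration.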

How to use it

  1. Verify CR selectors match SMs.
  2. For missing metrics, check targets page.
  3. Audit cardinality before adding labels.
  4. Test relabel rules on a single target first; a wrong keep or drop rule silently removes metrics.
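
Step 1 hinges on a few fields of the Prometheus CR. As a minimal sketch, assuming a Prometheus object named `k8s` in the `monitoring` namespace and a Helm release called `kube-prometheus-stack` (both names are illustrative, not taken from any particular cluster), the selectors might look like this:

```yaml
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: k8s                                # hypothetical name
  namespace: monitoring
spec:
  serviceMonitorSelector:
    matchLabels:
      release: kube-prometheus-stack       # SMs must carry this label to be picked up
  serviceMonitorNamespaceSelector: {}      # empty = look for SMs in all namespaces
  podMonitorSelector:
    matchLabels:
      release: kube-prometheus-stack       # same idea for PodMonitors
  podMonitorNamespaceSelector: {}
```

If a namespace selector is left out entirely (null), only monitors in the Prometheus object's own namespace are considered, which is a frequent cause of a monitor being ignored.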

Useful commands

# Prometheus CR
kubectl get prometheus -A
kubectl get prometheus <name> -o yaml | yq '.spec.serviceMonitorSelector'

# ServiceMonitors
kubectl get servicemonitor -A
kubectl get servicemonitor <name> -o yaml

# Verify labels match
kubectl get servicemonitor <name> -o jsonpath='{.metadata.labels}'

# Access Prometheus UI (Service name varies by install; the operator also creates prometheus-operated)
kubectl port-forward -n monitoring svc/prometheus 9090:9090
# Open http://localhost:9090/targets

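# Check that the metric exists (-G with --data-urlencode keeps the shell and curl from globbing ? and {})
curl -s -G http://localhost:9090/api/v1/query --data-urlencode 'query=up{job="myapp"}' | jq .

# Check scrape duration / errors
curl -s -G http://localhost:9090/api/v1/query --data-urlencode 'query=scrape_duration_seconds{job="myapp"}' | jq .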

Patterns

Standard ServiceMonitor

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: web-metrics
  namespace: monitoring
  labels:
    release: kube-prometheus-stack         # must match Prometheus CR selector
spec:
  selector:
    matchLabels:
      app: web
  namespaceSelector:
    matchNames: [default, production]
  endpoints:
  - port: metrics                          # Service port NAME
    path: /metrics
    interval: 30s
    scrapeTimeout: 10s
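
For the ServiceMonitor above to produce targets, a Service with matching labels and a named port has to exist in one of the selected namespaces. The following is a sketch of such a Service; the port number `9102` is an assumption made for illustration.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: web
  namespace: production
  labels:
    app: web                               # matched by the ServiceMonitor's spec.selector
spec:
  selector:
    app: web                               # selects the backing pods
  ports:
  - name: metrics                          # the name the ServiceMonitor's `port` refers to
    port: 9102                             # hypothetical port number
    targetPort: 9102
```

Note that the ServiceMonitor selector matches labels on the Service object itself (`metadata.labels`), not the Service's pod selector, so a Service without labels is never discovered.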

PodMonitor (no Service needed)

apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: worker-metrics
  labels:
    release: kube-prometheus-stack         # must match Prometheus CR podMonitorSelector
spec:
  selector:
    matchLabels:
      app: worker
  podMetricsEndpoints:
  - port: metrics                          # container port NAME
    interval: 30s
    path: /metrics
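
The PodMonitor above relies on the pod declaring a container port with the same name. A minimal sketch of a matching workload, assuming a hypothetical image and port number:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: worker
spec:
  replicas: 1
  selector:
    matchLabels:
      app: worker
  template:
    metadata:
      labels:
        app: worker                        # matched by the PodMonitor's spec.selector
    spec:
      containers:
      - name: worker
        image: example.com/worker:latest   # hypothetical image
        ports:
        - name: metrics                    # the name the PodMonitor's `port` refers to
          containerPort: 8080              # hypothetical port number
```

Because this PodMonitor sets no `namespaceSelector`, it only finds pods in its own namespace, so the Deployment and the PodMonitor would need to live side by side.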

With relabeling (drop high-cardinality)

endpoints:
- port: metrics
  metricRelabelings:
  - sourceLabels: [__name__]
    regex: 'http_requests_total'
    action: keep                           # keeps ONLY this metric; all others from this endpoint are dropped
  - sourceLabels: [path]
    regex: '/api/v1/items/[0-9]+'
    replacement: '/api/v1/items/:id'       # collapse high-card paths
    targetLabel: path
    action: replace
  - regex: 'request_id'
    action: labeldrop                      # drop request_id label entirely
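
The rules above all run after the scrape. For comparison, here is a sketch of pre-scrape `relabelings`, which operate on discovery metadata rather than on scraped series. The pod labels `team` and `scrape` are assumptions made for the example, not labels the operator defines.

```yaml
endpoints:
- port: metrics
  relabelings:
  - sourceLabels: [__meta_kubernetes_pod_label_team]
    targetLabel: team                      # copy the pod's `team` label onto every series
    action: replace
  - sourceLabels: [__meta_kubernetes_pod_label_scrape]
    regex: 'false'
    action: drop                           # skip targets whose pod is labeled scrape=false
```

A `drop` here removes the whole target from the targets page, whereas a `drop` under `metricRelabelings` removes individual series from a target that is still scraped.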

TLS scrape

endpoints:
- port: https
  scheme: https
  tlsConfig:
    caFile: /etc/prometheus/secrets/ca-bundle/ca.crt
    serverName: web.example.com
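
The `caFile` path above only exists if the CA Secret is mounted into the Prometheus pods. One way to arrange that, sketched under the assumption that the Prometheus object is named `k8s` and that a Secret called `ca-bundle` with a `ca.crt` key lives in the same namespace:

```yaml
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: k8s                                # hypothetical name
  namespace: monitoring
spec:
  secrets:
  - ca-bundle                              # mounted at /etc/prometheus/secrets/ca-bundle/
```

Each key of the Secret becomes a file under that directory, so a `ca.crt` key yields the path used in `tlsConfig.caFile`.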

Common findings this catches

  • Targets page shows 0 active → selector mismatch; check labels.
  • All targets DOWN → wrong port name, network issue, or app not exposing /metrics.
  • Metrics scraped but not visible → often a job label mismatch or collision; query up{job=...} to confirm the job name Prometheus actually assigned.
  • Cardinality explosion → drop pod UID, request ID, etc.
  • SM not picked up by Prometheus → CR’s serviceMonitorSelector doesn’t match SM labels.
  • Wrong namespace → namespaceSelector excludes the target's namespace; when omitted, it covers only the monitor's own namespace.
  • Scrape timeout → app too slow; or scraping too many metrics.
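
Since namespace scoping accounts for several of these findings, here is a short sketch of the `namespaceSelector` forms a ServiceMonitor or PodMonitor accepts. The namespace names are placeholders.

```yaml
spec:
  namespaceSelector:
    any: true                              # all namespaces
    # matchNames: [default, production]    # or: only the listed namespaces
  # omitting namespaceSelector limits discovery to the monitor's own namespace
```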

When to escalate

  • Prometheus OOM at scale → cardinality audit + sharding.
  • Federation across clusters → engage observability team.
  • High-cardinality metrics needed for product → investigate sampling.
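
Before escalating a cardinality-driven OOM, a per-target sample cap is one guard that may be worth considering. This sketch sets `sampleLimit` on a ServiceMonitor; the value `50000` is an arbitrary illustration and would need tuning against real series counts.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: web-metrics
  namespace: monitoring
  labels:
    release: kube-prometheus-stack
spec:
  sampleLimit: 50000                       # hypothetical cap; tune per target
  selector:
    matchLabels:
      app: web
  endpoints:
  - port: metrics
```

When a scrape exceeds the limit, Prometheus treats the whole scrape as failed and the target shows as down, so the cap protects Prometheus at the cost of losing that target's data until the cardinality is fixed.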
