Prometheus ServiceMonitor & PodMonitor Configuration Prompt
Configure Prometheus Operator scraping: ServiceMonitor, PodMonitor, target discovery, label rewriting, and debugging missing metrics.
- Target user: SREs configuring Prometheus monitoring
- Difficulty: Intermediate
- Tools: Claude, ChatGPT
The prompt
You are a senior SRE who has set up Prometheus Operator at scale — ServiceMonitors, PodMonitors, label rewriting, target discovery, federation.
I will provide:
- The metric source (app, exporter, sidecar)
- Current ServiceMonitor/PodMonitor spec
- Symptom (metrics missing, scrape failing, label cardinality too high)
Your job:
1. **ServiceMonitor vs PodMonitor**:
- **ServiceMonitor** — discovers via Service endpoints (recommended; works with Headless too)
- **PodMonitor** — discovers pods directly (no Service required)
- Both result in scrape targets
2. **For metrics missing**:
- Check Prometheus targets page (port-forward to prometheus + /targets)
- Target down = scrape failure
- Target missing = no service/pod matches selector
3. **For Prometheus Operator selecting your SM**:
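- Prometheus CR has `serviceMonitorSelector` (labels) and `serviceMonitorNamespaceSelector` (namespaces)
- An empty selector (`{}`) selects all SMs; a null selector selects none. The kube-prometheus-stack chart by default selects only SMs carrying its `release` label
- Verify that the SM's labels match the selector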
4. **For label rewriting**:
- `metricRelabelings` (post-scrape, before ingest)
- `relabelings` (pre-scrape, on discovery)
- Common: drop high-cardinality labels, rename
5. **For port specification**:
- SM: `port` is the Service port name (not number)
- PodMonitor: `port` is the container port name
6. **For path / scheme**:
- Default `/metrics` HTTP
- Custom for non-standard exporters
- TLS for HTTPS scrape
7. **For namespace selection**:
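- An SM in the `monitoring` namespace can monitor Services in other namespaces (depends on `namespaceSelector`)
- An empty or omitted `namespaceSelector` = the SM's own namespace only; `namespaceSelector: {any: true}` = all namespaces; `matchNames` = an explicit list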
8. **For high cardinality**:
- Labels with unbounded values (pod UID, request ID) explode metric counts
- Drop or transform via metricRelabelings
Mark DESTRUCTIVE: dropping all metrics via wrong relabel (loses observability), rate-limiting Prometheus too tight (drops scrapes), monitoring high-cardinality without sampling.
---
Metric source: [DESCRIBE]
ServiceMonitor/PodMonitor:
```yaml
[PASTE]
```
Symptom: [DESCRIBE]
Prometheus CR labels: [DESCRIBE]
Why this prompt works
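Prometheus Operator simplifies monitoring but adds its own layer of CRDs and selectors, so a scrape can break at several points: the Prometheus CR selector, the namespace selector, the Service or pod label selector, or the port name. This prompt walks through each of those checks in turn.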
How to use it
- Verify CR selectors match SMs.
- For missing metrics, check targets page.
- Audit cardinality before adding labels.
- Test relabel rules on a non-production Prometheus first; a wrong `keep` or `drop` rule silently discards metrics.
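The first check, whether the Prometheus CR selects the ServiceMonitor at all, comes down to a pair of selector fields on the CR. A minimal sketch of the relevant fragment, assuming a kube-prometheus-stack style install; the CR name `k8s` and the `release` label value are illustrative and not taken from any particular cluster:

```yaml
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: k8s                      # hypothetical name
  namespace: monitoring
spec:
  # Only ServiceMonitors carrying this label are picked up.
  serviceMonitorSelector:
    matchLabels:
      release: kube-prometheus-stack   # assumed label value
  # Empty selector: look for ServiceMonitors in every namespace.
  # Leaving this field unset (null) limits discovery to the CR's own namespace.
  serviceMonitorNamespaceSelector: {}
  # PodMonitors are selected through their own pair of fields.
  podMonitorSelector:
    matchLabels:
      release: kube-prometheus-stack
  podMonitorNamespaceSelector: {}
```

If a ServiceMonitor's `metadata.labels` do not satisfy `serviceMonitorSelector`, the operator never generates a scrape config for it, and nothing appears on the targets page.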
Useful commands
# Prometheus CR
kubectl get prometheus -A
kubectl get prometheus <name> -o yaml | yq '.spec.serviceMonitorSelector'
# ServiceMonitors
kubectl get servicemonitor -A
kubectl get servicemonitor <name> -o yaml
# Verify labels match
kubectl get servicemonitor <name> -o jsonpath='{.metadata.labels}'
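# Access Prometheus UI (the Service name varies by install; prometheus-operated is the Service the operator itself creates)
kubectl port-forward -n monitoring svc/prometheus-operated 9090:9090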
# Open localhost:9090/targets
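# Query metric exists (-G with --data-urlencode stops curl from treating the braces as a URL glob)
curl -s -G http://localhost:9090/api/v1/query --data-urlencode 'query=up{job="myapp"}' | jq
# Check scrape duration / errors
curl -s -G http://localhost:9090/api/v1/query --data-urlencode 'query=scrape_duration_seconds{job="myapp"}' | jq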
Patterns
Standard ServiceMonitor
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: web-metrics
  namespace: monitoring
  labels:
    release: kube-prometheus-stack # must match Prometheus CR selector
spec:
  selector:
    matchLabels:
      app: web
  namespaceSelector:
    matchNames: [default, production]
  endpoints:
    - port: metrics # Service port NAME
      path: /metrics
      interval: 30s
      scrapeTimeout: 10s
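For reference, a Service that this ServiceMonitor would match might look like the sketch below. The Service name and port number are assumptions for illustration; what matters is that the label, the namespace, and the port name line up with the ServiceMonitor.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: web                  # hypothetical name
  namespace: production      # must be listed in the ServiceMonitor's namespaceSelector
  labels:
    app: web                 # matched by the ServiceMonitor's spec.selector.matchLabels
spec:
  selector:
    app: web                 # selects the backing pods
  ports:
    - name: metrics          # the ServiceMonitor's `port` refers to this name
      port: 9090             # assumed port number
      targetPort: 9090
```

Note that the ServiceMonitor selector matches the Service's own labels, not the pod labels; a Service without `app: web` in `metadata.labels` is not discovered even if its pods carry that label.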
PodMonitor (no Service needed)
apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: worker-metrics
  labels:
    release: kube-prometheus-stack # must match the Prometheus CR's podMonitorSelector
spec:
  selector:
    matchLabels:
      app: worker
  podMetricsEndpoints:
    - port: metrics # container port NAME
      interval: 30s
      path: /metrics
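The PodMonitor's `port` only resolves if the container declares a port with that name. A minimal sketch of a matching workload, where the Deployment name, image, and port number are all hypothetical:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: worker               # hypothetical name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: worker
  template:
    metadata:
      labels:
        app: worker          # matched by the PodMonitor's spec.selector.matchLabels
    spec:
      containers:
        - name: worker
          image: example.com/worker:1.0   # hypothetical image
          ports:
            - name: metrics               # the PodMonitor's `port` refers to this name
              containerPort: 8080         # assumed port number
```

A container port that is declared by number only, with no `name`, cannot be referenced by the `port` field.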
With relabeling (drop high-cardinality)
endpoints:
  - port: metrics
    metricRelabelings:
      - sourceLabels: [__name__]
        regex: 'http_requests_total'
        action: keep # only keep this metric; every other metric from this endpoint is dropped
      - sourceLabels: [path]
        regex: '/api/v1/items/[0-9]+'
        replacement: '/api/v1/items/:id' # collapse high-card paths
        targetLabel: path
        action: replace
      - regex: 'request_id'
        action: labeldrop # drop request_id label entirely
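The example above uses `metricRelabelings`, which run on scraped samples. For contrast, `relabelings` run on the target before the scrape and can read the `__meta_kubernetes_*` discovery labels. A hedged sketch; the `scrape` pod label is a made-up convention for illustration:

```yaml
endpoints:
  - port: metrics
    relabelings:
      # Copy the node name from discovery metadata onto every series from this target.
      - sourceLabels: [__meta_kubernetes_pod_node_name]
        targetLabel: node
        action: replace
      # Skip targets whose pod carries the label scrape=false (hypothetical label).
      - sourceLabels: [__meta_kubernetes_pod_label_scrape]
        regex: 'false'
        action: drop
```

A `drop` under `relabelings` removes the whole target, so it disappears from the targets page, whereas a `drop` under `metricRelabelings` removes individual series after a successful scrape.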
TLS scrape
endpoints:
  - port: https
    scheme: https
    tlsConfig:
      caFile: /etc/prometheus/secrets/ca-bundle/ca.crt
      serverName: web.example.com
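The `caFile` form relies on the CA bundle being mounted into the Prometheus pod. An alternative sketch references a Secret directly from the ServiceMonitor, which avoids the mount; the Secret name `web-metrics-ca` is hypothetical, and the Secret is assumed to live in the same namespace as the ServiceMonitor:

```yaml
endpoints:
  - port: https
    scheme: https
    tlsConfig:
      ca:
        secret:
          name: web-metrics-ca   # hypothetical Secret name
          key: ca.crt            # key inside the Secret holding the CA certificate
      serverName: web.example.com
```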
Common findings this catches
- Targets page shows 0 active → selector mismatch; check labels.
- All targets DOWN → wrong port name, network issue, or app not exposing /metrics.
- Metrics scraped but not visible → name collision; check `up{job=...}` for job name.
- Cardinality explosion → drop pod UID, request ID, etc.
- SM not picked up by Prometheus → CR’s serviceMonitorSelector doesn’t match SM labels.
- Wrong namespace → namespaceSelector excludes.
- Scrape timeout → app too slow; or scraping too many metrics.
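For the "wrong namespace" finding, the usual cause is an omitted `namespaceSelector`, which limits discovery to the ServiceMonitor's own namespace. A minimal sketch of a spec that instead matches Services in every namespace, reusing the `app: web` label from the pattern above:

```yaml
spec:
  selector:
    matchLabels:
      app: web
  namespaceSelector:
    any: true          # match Services in all namespaces
  endpoints:
    - port: metrics
```

Use `matchNames` instead of `any: true` when only a known set of namespaces should be scraped.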
When to escalate
- Prometheus OOM at scale → cardinality audit + sharding.
- Federation across clusters — engage observability team.
- High-cardinality metrics needed for product — investigate sampling.