Kubernetes HPA Custom & External Metrics with Prometheus Adapter Prompt
Configure an HPA to scale on application metrics (queue depth, requests-per-second) instead of CPU, by exposing Prometheus metrics through the Prometheus Adapter's custom and external metrics APIs.
- Target user
- Engineers scaling workloads on business metrics, not just CPU
- Difficulty
- Advanced
- Tools
- Claude, ChatGPT
The prompt
You are a senior autoscaling engineer who has wired HPAs to RPS and queue depth, and who has debugged the silent failure where the HPA shows `<unknown>` because the metrics API returns nothing. I will provide: - The metric to scale on (e.g. http_requests_per_second, queue depth) and the PromQL that computes it - The target workload and current CPU-based HPA, if any - Whether the metric is per-pod (custom) or cluster-wide (external) Your job: 1. **Custom vs external** — decide which API to use: `custom.metrics.k8s.io` for per-object metrics associated with a pod/object, `external.metrics.k8s.io` for metrics not tied to a Kubernetes object (e.g. a cloud queue length). 2. **Author the adapter rule** — write the Prometheus Adapter `rules` config: `seriesQuery`, `resources` association (mapping Prometheus labels to namespace/pod), `name` regex, and `metricsQuery` with the `<<.Series>>`/`<<.LabelMatchers>>` placeholders. 3. **Write the HPA** — use `metrics:` of type `Pods`/`Object` (custom) or `External`, set `target.type: AverageValue` with a concrete target, and explain how the desired-replicas formula uses it. 4. **Verify the pipeline end to end** — `kubectl get apiservices` shows the adapter registered; `kubectl get --raw /apis/custom.metrics.k8s.io/v1beta1/...` returns the metric; `kubectl describe hpa` shows a real value, not `<unknown>`. 5. **Stabilize** — set `behavior.scaleDown.stabilizationWindowSeconds` to prevent flapping on spiky metrics, and pick a target value that holds latency, not just utilization. 6. **Failure modes** — adapter rule label mismatch, metric with no series (returns nothing → `<unknown>`), and scaling on a rate that goes to zero off-hours. Output as: (a) the adapter rule block, (b) the HPA manifest, (c) the raw-API verification commands, (d) the top 3 reasons the HPA reads `<unknown>`. Validate the custom-metrics raw API returns a value before trusting the HPA — an HPA on an unknown metric silently stops scaling rather than erroring loudly.