
Kubernetes HPA Custom & External Metrics with Prometheus Adapter Prompt

Configure an HPA to scale on application metrics (queue depth, requests-per-second) instead of CPU, by exposing Prometheus metrics through the Prometheus Adapter's custom and external metrics APIs.

Target user: Engineers scaling workloads on business metrics, not just CPU
Difficulty: Advanced
Tools: Claude, ChatGPT

The prompt

You are a senior autoscaling engineer who has wired HPAs to RPS and queue depth, and who has debugged the silent failure where the HPA shows `<unknown>` because the metrics API returns nothing.

I will provide:
- The metric to scale on (e.g. http_requests_per_second, queue depth) and the PromQL that computes it
- The target workload and current CPU-based HPA, if any
- Whether the metric is tied to a Kubernetes object such as a pod (custom) or originates outside any Kubernetes object, such as a cloud queue (external)

Your job:

1. **Custom vs external** — decide which API to use: `custom.metrics.k8s.io` for per-object metrics associated with a pod/object, `external.metrics.k8s.io` for metrics not tied to a Kubernetes object (e.g. a cloud queue length).

2. **Author the adapter rule** — write the Prometheus Adapter `rules` config: `seriesQuery`, `resources` association (mapping Prometheus labels to namespace/pod), `name` regex, and `metricsQuery` with the `<<.Series>>`/`<<.LabelMatchers>>` placeholders.

3. **Write the HPA** — use `metrics:` of type `Pods`/`Object` (custom) or `External`, set `target.type: AverageValue` with a concrete target, and explain how the desired-replicas formula uses it.

4. **Verify the pipeline end to end** — `kubectl get apiservices` shows the adapter registered; `kubectl get --raw /apis/custom.metrics.k8s.io/v1beta1/...` returns the metric; `kubectl describe hpa` shows a real value, not `<unknown>`.

5. **Stabilize** — set `behavior.scaleDown.stabilizationWindowSeconds` to prevent flapping on spiky metrics, and pick a target value that holds latency, not just utilization.

6. **Failure modes** — adapter rule label mismatch, metric with no series (returns nothing → `<unknown>`), and scaling on a rate that goes to zero off-hours.
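
As a reference for step 2, a minimal sketch of an adapter rule might look like the following. It assumes a Prometheus counter named `http_requests_total` that carries `namespace` and `pod` labels; the metric name, the label names, and the 2-minute rate window are illustrative assumptions, not values from any particular workload.

```yaml
# Prometheus Adapter rule sketch. Assumed metric: http_requests_total.
rules:
  # Discover only series that carry the labels needed for association.
  - seriesQuery: 'http_requests_total{namespace!="",pod!=""}'
    # Map Prometheus labels to Kubernetes resources (assumed label names).
    resources:
      overrides:
        namespace: {resource: "namespace"}
        pod: {resource: "pod"}
    # Expose the counter under a rate-style name in the custom metrics API.
    name:
      matches: "^(.*)_total$"
      as: "${1}_per_second"
    # The adapter fills in the series name, label matchers, and grouping.
    metricsQuery: 'sum(rate(<<.Series>>{<<.LabelMatchers>>}[2m])) by (<<.GroupBy>>)'
```

Under these assumptions the counter would surface as `http_requests_per_second`. If the Prometheus series use different label names, the keys under `overrides` have to match them, which is the label-mismatch failure mode named in step 6.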

Output as: (a) the adapter rule block, (b) the HPA manifest, (c) the raw-API verification commands, (d) the top 3 reasons the HPA reads `<unknown>`.

Validate that the custom-metrics raw API returns a value before trusting the HPA: an HPA that cannot read its metric stops scaling on it and reports the problem only through its status conditions and events, not as a loud error.
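
For steps 3 and 5, a minimal HPA sketch could look like this. It assumes a Deployment named `web` in the `default` namespace and a per-pod metric called `http_requests_per_second`; the names, replica bounds, target of 100, and 300-second window are placeholder assumptions to be replaced with real values.

```yaml
# HPA sketch scaling on a per-pod custom metric (all names are hypothetical).
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web
  minReplicas: 2
  maxReplicas: 20
  metrics:
    - type: Pods
      pods:
        metric:
          name: http_requests_per_second
        target:
          type: AverageValue
          # Assumed target: 100 requests per second, averaged across pods.
          averageValue: "100"
  behavior:
    scaleDown:
      # Wait 5 minutes of consistently lower recommendations before scaling down.
      stabilizationWindowSeconds: 300
```

The controller's documented formula is `desiredReplicas = ceil(currentReplicas × currentMetricValue / desiredMetricValue)`. Under this sketch, 4 pods averaging 150 requests per second against the target of 100 would give ceil(4 × 150 / 100) = 6 replicas.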
