Resource Requests, Limits and HPA Right-Sizing Prompt

Right-size cpu/memory requests and limits from observed usage and pair them with a sane HPA so a workload scales on the correct signal without thrashing or OOMing.

Target user: SREs and capacity engineers
Difficulty: Advanced
Tools: Claude, ChatGPT

The prompt

You are a senior capacity engineer right-sizing a workload's resource requests/limits and tuning its HorizontalPodAutoscaler. Base every number on the usage data, not round defaults.

I will provide:
- Current requests/limits and replica count from the Deployment
- Observed usage: `kubectl top pods` over time, or Prometheus p50/p95/p99 for cpu and memory, and any OOMKill/throttle events
- The current HPA spec (metric, target, min/max) and observed scaling behavior
- The workload type (latency-sensitive request server, batch, JVM/runtime with GC, etc.)

Your job:

1. **Size memory** — set the request near the steady-state working set and the limit near the observed peak; explain why a memory limit far above the request invites node-level OOM and noisy-neighbor risk, and why limit == request (for both cpu and memory, on every container) gives Guaranteed QoS.
2. **Size cpu** — set the request to typical load (drives scheduling and HPA math); decide whether to set a cpu limit at all, given CFS throttling risks for latency-sensitive apps.
3. **Pick the HPA metric** — use a cpu-utilization HPA only if cpu actually tracks load; otherwise recommend a custom/external metric (RPS, queue depth). Explain the request-relative math: utilization is a percentage of the request, so changing the request shifts the point at which the HPA scales.
4. **Tune HPA bounds** — set minReplicas for baseline HA, maxReplicas for the ceiling, target value, and stabilization windows / scale-down policies to stop flapping.
5. **Check interactions** — make sure requests are set (utilization-based HPA metrics cannot be computed without them), and that the HPA and any VPA don't fight over the same resource.
6. **State the trade-offs** — cost vs headroom vs latency.

Output: (a) recommended requests/limits with the data point behind each, (b) the HPA spec with metric, target, bounds, and behavior, (c) what to watch after rollout.
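
To sanity-check the numbers the model returns, the request-relative math in step 3 can be sketched in a few lines of Python. This is a simplified illustration, not the controller's actual code: it uses the documented HPA formula `desiredReplicas = ceil(currentReplicas * currentValue / targetValue)` and the default 10% tolerance, and it ignores stabilization windows, scaling policies, unready pods, and missing metrics. The function names `utilization_pct` and `desired_replicas` are hypothetical helpers introduced here.

```python
import math


def utilization_pct(usage_millicores: float, request_millicores: float) -> float:
    """CPU utilization as a utilization-based HPA sees it: usage as a % of the request."""
    if request_millicores <= 0:
        raise ValueError("a cpu request must be set for a utilization-based HPA")
    return 100.0 * usage_millicores / request_millicores


def desired_replicas(
    current_replicas: int,
    current_value: float,
    target_value: float,
    min_replicas: int,
    max_replicas: int,
    tolerance: float = 0.1,  # assumed default; the real value is a controller setting
) -> int:
    """Simplified HPA replica calculation, clamped to the min/max bounds."""
    ratio = current_value / target_value
    if abs(ratio - 1.0) <= tolerance:
        # Within tolerance of the target: leave the replica count alone.
        desired = current_replicas
    else:
        desired = math.ceil(current_replicas * ratio)
    return max(min_replicas, min(max_replicas, desired))


# Same observed usage (450m per pod), two different requests, target 60%.
tight = utilization_pct(450, 500)    # 90.0 -> above target
loose = utilization_pct(450, 1000)   # 45.0 -> below target

print(desired_replicas(4, tight, 60, min_replicas=2, max_replicas=10))  # 6
print(desired_replicas(4, loose, 60, min_replicas=2, max_replicas=10))  # 3
```

The two calls use identical usage and differ only in the cpu request, which is why a request change should be reviewed together with the HPA target rather than on its own.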

Related prompts

Kubernetes HPA Debugging Prompt

Kubernetes Resource Limits & OOMKilled Tuning Prompt
