
Kubernetes In-Place Pod Resize Design Prompt

Adopt in-place Pod vertical resize (resizePolicy, resize subresource) so containers get more CPU/memory without a restart — and know when it silently falls back to a recreate.

Target user: Platform engineers tuning right-sizing on Kubernetes 1.33+ clusters
Difficulty: Advanced
Tools: Claude, ChatGPT

The prompt

You are a Kubernetes resource-management specialist who has piloted in-place Pod resize (the `resize` subresource, beta and enabled by default since 1.33) in production. You know exactly which resizes are restart-free and which quietly require a recreate.

I will provide:
- Cluster version and whether `InPlacePodVerticalScaling` is enabled
- Target workloads (latency-sensitive services, batch, JVM/Go heap behavior)
- Current resource requests/limits and the VPA/right-sizing setup
- Goals (reduce OOMKills, cut waste, avoid restart-driven cold starts)

Your job:

1. **resizePolicy mechanics** — explain per-resource `resizePolicy` with `restartPolicy: NotRequired` vs `RestartContainer`. Clarify that CPU changes are normally applied without a restart, whereas memory changes (limit *decreases* in particular) may need `RestartContainer` depending on the Kubernetes version and the application runtime; show how to express this per container.

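2. **The resize subresource** — show `kubectl patch ... --subresource resize` and how to read resize state. On 1.33+ this is reported through the Pod conditions `PodResizePending` (reason `Deferred` or `Infeasible`) and `PodResizeInProgress`, which replace the deprecated alpha-era `status.resize` field (Proposed/InProgress/Deferred/Infeasible). Also read `status.containerStatuses[].resources` to confirm the values actually applied, not just the spec.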

3. **Interaction with VPA** — describe how VPA's `InPlaceOrRecreate` update mode uses this; warn about fighting controllers (HPA on CPU + VPA on CPU = thrash) and how to scope them to different resources.

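4. **QoS class constraints** — a Pod's QoS class (Guaranteed/Burstable/BestEffort) is fixed at creation and cannot be changed by a resize; explain which patches are rejected as a result (for example, a Guaranteed Pod must keep requests equal to limits) and how to plan resizes that stay inside the original class.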

5. **Failure modes** — `Infeasible` (the node cannot fit the requested size), `Deferred` (it does not fit right now; the kubelet retries when capacity frees up), and node-pressure interactions. Give the kubelet events and Pod conditions to look for.

6. **JVM/Go caveats** — heap and GC sizing are usually derived from the container limit once, at process start; explain why a memory increase may not help unless the app re-reads the limit or is restarted, and cover the relevant patterns (cgroup-aware runtimes, `GOMEMLIMIT`, `-XX:+UseContainerSupport`).

7. **Rollout strategy** — start with stateless Burstable services, observe, then expand; add guardrails (for example, upper bounds on what an autoscaler may request) so a resize never asks for more than the node can honor.

Output as: (a) a Deployment/Pod spec with per-container `resizePolicy`, (b) the exact resize patch + verification commands, (c) a decision table of which changes are restart-free, (d) a VPA `InPlaceOrRecreate` example, (e) the top failure signatures and fixes.

Bias toward: verifying actual allocated resources over spec, runtime-aware memory changes, and not stacking conflicting autoscalers.
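
Example of what a good answer can look like

As a reference point for judging the model's answer, outputs (a), (b), and (d) might look roughly like the sketch below. It is a minimal illustration, not a vetted production config: the names `resize-demo`, `app`, and `app-vpa`, the image, and every resource value are hypothetical, and the VPA part assumes a Vertical Pod Autoscaler release that ships the `InPlaceOrRecreate` mode (which may sit behind a feature gate in your version). The kubectl commands are carried as comments and assume kubectl v1.32 or newer for `--subresource resize`.

```yaml
# (a) Pod with per-container, per-resource resizePolicy.
# All names and values here are illustrative assumptions.
apiVersion: v1
kind: Pod
metadata:
  name: resize-demo
spec:
  containers:
    - name: app
      image: nginx:1.27
      resizePolicy:
        - resourceName: cpu
          restartPolicy: NotRequired       # apply CPU changes to the running container
        - resourceName: memory
          restartPolicy: RestartContainer  # restart this container when memory changes
      resources:
        requests:
          cpu: 250m
          memory: 256Mi
        limits:                            # requests != limits, so the Pod is Burstable
          cpu: 500m
          memory: 512Mi
# (b) Resize CPU in place; the patch keeps the Pod in the Burstable class:
#   kubectl patch pod resize-demo --subresource resize --patch \
#     '{"spec":{"containers":[{"name":"app","resources":{"requests":{"cpu":"400m"},"limits":{"cpu":"800m"}}}]}}'
# Verify the values the kubelet actually applied, then check for a pending resize:
#   kubectl get pod resize-demo -o jsonpath='{.status.containerStatuses[0].resources}'
#   kubectl get pod resize-demo -o jsonpath='{.status.conditions[?(@.type=="PodResizePending")]}'
---
# (d) VPA scoped to memory only, leaving CPU to an HPA so the two do not fight.
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: app-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: app                              # hypothetical Deployment
  updatePolicy:
    updateMode: InPlaceOrRecreate          # try in-place first, recreate if that fails
  resourcePolicy:
    containerPolicies:
      - containerName: app
        controlledResources: ["memory"]
        minAllowed:
          memory: 256Mi
        maxAllowed:
          memory: 1Gi                      # guardrail: cap what VPA may request
```

An answer that only restates the spec, without the `status.containerStatuses[].resources` check or an equivalent, has skipped the verification step the prompt asks for.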
