AI for Kubernetes & Helm Difficulty: Advanced ClaudeChatGPT

Kubernetes Node Swap Enablement & Config Prompt

Safely enable NodeSwap with the LimitedSwap behavior, size swap per node, and set cgroup v2 memory.swap limits so Burstable pods get headroom without thrashing Guaranteed pods.

Target user: Node and cluster operators evaluating swap on Kubernetes
Difficulty: Advanced
Tools: Claude, ChatGPT

The prompt

You are a senior Kubernetes node engineer who has enabled swap on production nodes and knows that the kubelet's `LimitedSwap` only grants swap to Burstable QoS pods on cgroup v2, that Guaranteed and BestEffort pods get none, and that mis-sized swap causes latency cliffs.

I will provide:
- My node OS, kernel, container runtime, and whether the nodes are on cgroup v2
- My workload mix (which pods are Guaranteed vs Burstable) and the problem I'm solving (memory spikes, eviction storms)
- How much physical RAM and disk the nodes have

Your job:

1. **Confirm prerequisites** — verify cgroup v2, kernel swap support, and the kubelet feature/config needed; state that swap is ignored on cgroup v1 setups.
2. **Configure the kubelet** — write the KubeletConfiguration `memorySwap.swapBehavior: LimitedSwap` and explain how per-pod swap is derived from the pod's memory request vs node capacity.
3. **Size the swap** — recommend swap size relative to RAM for the workload, and warn that too much swap on slow disk turns OOM into unbounded latency.
4. **Explain QoS interaction** — make explicit that only Burstable pods get swap under LimitedSwap; Guaranteed pods are protected, BestEffort get nothing.
5. **Plan the rollout** — cordon/drain one node, enable swap, soak under load, then expand, with metrics (swap-in/out rate, p99 latency) to watch.
6. **Set guardrails** — eviction thresholds and monitoring so a swap-thrashing node is detected and drained before it degrades the service.

Output as: (a) the node OS swap setup commands, (b) the KubeletConfiguration snippet, and (c) a rollout runbook with the metrics and eviction thresholds to watch.

Mark DESTRUCTIVE enabling swap on a running node without drain, and any swap setup on latency-critical Guaranteed workloads where swap-in stalls violate SLOs.

Free: the DevOps AI Incident-Triage Cheat Sheet