Kubernetes Ephemeral Storage Limit Sizing Prompt
Size ephemeral-storage requests and limits so pods are not evicted for local disk pressure and noisy workloads cannot fill the node — accounting for logs, emptyDir, image layers, and writable container layers.
- Target user: Platform engineers managing node disk pressure and pod eviction on Kubernetes
- Difficulty: Intermediate
- Tools: Claude, ChatGPT
The prompt
You are a senior Kubernetes platform engineer who has stopped node-wide eviction storms caused by a single pod filling local disk with logs or scratch files.

I will provide:
- The workload's local-disk usage pattern (log volume, emptyDir scratch, temp files, writable layer churn)
- Current `requests`/`limits` for `ephemeral-storage` (if any) and the symptom (evictions, `DiskPressure`, pods `Evicted` with "ephemeral local storage usage exceeds")
- The node's disk size and what shares it (container images, kubelet, logs)

Your job:
1. **Define what counts**: enumerate what ephemeral-storage accounting includes (the container writable layer, non-memory `emptyDir` volumes, and pod logs) and what it excludes (persistent volumes, memory-backed `emptyDir`). Misunderstanding this is the usual root cause.
2. **Diagnose the eviction**: distinguish a per-pod limit breach (`limits.ephemeral-storage` exceeded, so that pod alone is evicted) from node-level `DiskPressure` (the kubelet reclaims disk by ranking pods on whether usage exceeds requests, then pod priority, then usage relative to requests). The two have different fixes.
3. **Size the request and limit**: recommend a request that reserves realistic scratch and log space, and a limit that caps a runaway pod before it threatens the node; show the headroom math against node allocatable disk.
4. **Tackle log growth**: address unbounded container logs (rotation via the kubelet's `containerLogMaxSize`/`containerLogMaxFiles` or the runtime's equivalent, plus shipping logs off-node), since logs are the most common silent filler.
5. **Use the right volume**: advise when to move scratch to a PVC, an `emptyDir` with `sizeLimit` set, or a memory-backed `emptyDir` (which counts against the container's memory), instead of relying on the node root disk.
6. **Validate**: give the command to observe actual ephemeral usage per pod and confirm that evictions stop after the change.

Output as: (a) recommended `requests`/`limits.ephemeral-storage` with the sizing math, (b) which eviction type the symptom matches, (c) the log-rotation/volume change, (d) the verification command.

Default to caution: do not set ephemeral-storage limits so tight that normal log bursts evict healthy pods; size against observed peak, not average, and prefer log shipping over ever-larger limits.
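
Worked example of the sizing math: the sketch below illustrates the kind of headroom calculation step 3 asks for. It is an illustration under stated assumptions, not a recommended policy. The peak figures, the 120% and 150% multipliers, the pod count, and the node allocatable value are all hypothetical; the names `PeakUsageMiB`, `size_pod`, and `node_headroom` are invented for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PeakUsageMiB:
    """Observed PEAK local-disk usage for one pod, in MiB (hypothetical inputs)."""
    logs: int
    empty_dir: int        # disk-backed emptyDir only; memory-backed is excluded
    writable_layer: int

    @property
    def total(self) -> int:
        return self.logs + self.empty_dir + self.writable_layer


def ceil_percent(value: int, percent: int) -> int:
    """Return value * percent / 100 rounded up, using integer math only."""
    return -(-value * percent // 100)


def size_pod(peak: PeakUsageMiB, request_pct: int = 120, limit_pct: int = 150) -> dict:
    """Derive request and limit from observed peak.

    The 120% / 150% multipliers are assumptions for illustration; tune them
    to how bursty the workload's logs and scratch files actually are.
    """
    request = ceil_percent(peak.total, request_pct)
    limit = ceil_percent(peak.total, limit_pct)
    return {
        "request_mib": request,
        "limit_mib": limit,
        "request": f"{request}Mi",   # value for requests.ephemeral-storage
        "limit": f"{limit}Mi",       # value for limits.ephemeral-storage
    }


def node_headroom(request_mib: int, limit_mib: int,
                  pods_per_node: int, allocatable_mib: int) -> dict:
    """Compare summed requests and worst-case summed limits to node allocatable."""
    requests_total = request_mib * pods_per_node
    limits_total = limit_mib * pods_per_node
    return {
        "requests_total_mib": requests_total,
        "limits_total_mib": limits_total,
        "requests_fit": requests_total <= allocatable_mib,
        "limits_fit": limits_total <= allocatable_mib,
    }


# Hypothetical workload: 200 MiB logs, 1 GiB scratch, 100 MiB writable layer.
peak = PeakUsageMiB(logs=200, empty_dir=1024, writable_layer=100)
sizing = size_pod(peak)

# Hypothetical node: 50 GiB allocatable ephemeral storage, 20 such pods.
node = node_headroom(sizing["request_mib"], sizing["limit_mib"],
                     pods_per_node=20, allocatable_mib=50 * 1024)

print(sizing["request"], sizing["limit"])
print(node)
```

The scheduler places pods by summing requests against node allocatable, so `requests_fit` is the hard constraint. `limits_fit` is a stricter sanity check: if it is false, every pod hitting its limit at the same time could still push the node into `DiskPressure`.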