Kubernetes Error Guide: 'OOMKilled' Exit Code 137 Out-of-Memory Kills
Fix OOMKilled (exit code 137) in Kubernetes: diagnose low memory limits, leaks, JVM/Node heaps ignoring cgroups, batch spikes, greedy sidecars, and node pressure.
- #kubernetes
- #troubleshooting
- #errors
- #memory
Overview
OOMKilled is the termination reason Kubernetes reports when the Linux kernel’s out-of-memory killer terminates a container for exceeding its memory limit. The container’s cgroup hits its memory limit (memory.limit_in_bytes on cgroup v1, memory.max on cgroup v2), the kernel sends SIGKILL to a process in that cgroup, and the container exits with code 137 (128 + signal 9, SIGKILL). Kubernetes records the reason as OOMKilled and, with the default Always restart policy, restarts the container, often straight into CrashLoopBackOff if it keeps exceeding the limit.
You will see this in the pod’s last state:
Last State: Terminated
Reason: OOMKilled
Exit Code: 137
It occurs whenever a container’s working set exceeds resources.limits.memory. Typical triggers are a limit set too low, a slow memory leak, a runtime that ignores the cgroup limit, a sudden large request or batch, or a sidecar that outgrows its own per-container limit. This is distinct from a node-level eviction, where the kubelet evicts pods under node memory pressure rather than the kernel OOM-killing a single container.
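The 128 + signal arithmetic behind exit code 137 can be checked in any shell. This is a minimal sketch; signal_from_exit_code is a hypothetical helper name, not a kubectl feature.

```shell
# Hypothetical helper: decode an exit code into the terminating signal number.
# By shell convention, exit codes above 128 mean "killed by signal (code - 128)".
signal_from_exit_code() {
  code="$1"
  if [ "$code" -gt 128 ]; then
    echo $((code - 128))
  else
    echo 0   # 0 here means the process was not terminated by a signal
  fi
}

signal_from_exit_code 137   # prints 9  (SIGKILL, the signal the OOM killer sends)
signal_from_exit_code 143   # prints 15 (SIGTERM, a graceful stop request)
```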
Symptoms
- kubectl describe pod shows Last State: Terminated, Reason: OOMKilled, Exit Code: 137.
- The pod restarts repeatedly and may settle into CrashLoopBackOff.
- kubectl top pod shows memory sitting at or near the container’s limit just before the kill.
- No application stack trace in logs: the process is killed abruptly by SIGKILL, so it cannot log a graceful shutdown.
kubectl get pods -l app=worker
NAME READY STATUS RESTARTS AGE
worker-5f7d8c9b6d-h4kpz 0/1 OOMKilled 3 (30s ago) 6m18s
kubectl describe pod worker-5f7d8c9b6d-h4kpz | grep -A5 'Last State'
Last State: Terminated
Reason: OOMKilled
Exit Code: 137
Started: Mon, 23 Jun 2026 14:08:02 +0000
Finished: Mon, 23 Jun 2026 14:11:47 +0000
Common Root Causes
1. Memory limit set too low for the workload
The most common cause: the container’s normal working set is simply larger than resources.limits.memory, so it is killed as soon as it warms up.
kubectl get pod worker-5f7d8c9b6d-h4kpz -o jsonpath='{.spec.containers[0].resources}'
kubectl top pod worker-5f7d8c9b6d-h4kpz --containers
{"limits":{"memory":"256Mi"},"requests":{"memory":"128Mi"}}
POD NAME CPU(cores) MEMORY(bytes)
worker-5f7d8c9b6d-h4kpz worker 210m 255Mi
Memory pins at 255Mi against a 256Mi limit — the limit is too small for steady state. Raise limits.memory to the observed peak plus headroom.
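One way to turn "observed peak plus headroom" into a number is sketched below. The helper name limit_with_headroom_mi and the 25% headroom are illustrative assumptions, not a Kubernetes rule. A reading pinned at the limit (255Mi of 256Mi) is only a lower bound on the real peak, so the input should come from a run under a temporarily higher limit.

```shell
# Hypothetical helper: suggest a memory limit (Mi) from an observed peak (Mi)
# plus a percentage of headroom, rounded up to a whole Mi.
limit_with_headroom_mi() {
  peak_mi="$1"
  headroom_pct="$2"
  # Integer ceiling of peak * (100 + headroom) / 100
  echo $(( (peak_mi * (100 + headroom_pct) + 99) / 100 ))
}

limit_with_headroom_mi 400 25   # prints 500
```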
2. Memory leak in the application
If memory climbs steadily over time and never plateaus, the app is leaking; the limit only determines when it gets killed, not whether.
kubectl top pod worker-5f7d8c9b6d-h4kpz --containers
# observe over time, e.g. with watch
kubectl describe pod worker-5f7d8c9b6d-h4kpz | grep 'Restart Count'
worker-5f7d8c9b6d-h4kpz worker 180m 240Mi # rising every poll
Restart Count: 3
A rising trend with periodic OOM kills (memory resets after each restart, then climbs back to the limit) signals a leak. Fix the leak; raising the limit only delays the kill.
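A crude way to check for a steady climb is to feed a series of memory samples through a small filter. The sketch below assumes one sample in Mi per line, collected for example by polling kubectl top; memory_trend is a hypothetical helper and the sample values are illustrative. Garbage-collected runtimes fluctuate, so a strict "always rising" test is a heuristic, not proof of a leak.

```shell
# Hypothetical helper: read one memory sample (Mi) per line from stdin and
# report whether every sample is higher than the one before it.
memory_trend() {
  awk '
    NR > 1 && ($1 + 0) <= prev { dipped = 1 }
    { prev = $1 + 0 }
    END {
      if (NR < 2)      print "insufficient-data"
      else if (dipped) print "not-strictly-rising"
      else             print "rising"
    }
  '
}

printf '%s\n' 180 195 210 240 | memory_trend   # prints rising
printf '%s\n' 180 250 190 200 | memory_trend   # prints not-strictly-rising
```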
3. JVM / Node heap not aware of the cgroup limit
Older or misconfigured runtimes size their heap from the node’s total memory, not the container’s cgroup limit, so the heap alone can exceed limits.memory.
kubectl get pod worker-5f7d8c9b6d-h4kpz -o jsonpath='{.spec.containers[0].env}'
kubectl exec worker-5f7d8c9b6d-h4kpz -- java -XX:+PrintFlagsFinal -version 2>/dev/null | grep -i MaxHeapSize
[{"name":"JAVA_OPTS","value":"-server"}]
size_t MaxHeapSize = 8589934592 {product} # 8Gi heap in a 1Gi pod
The JVM picked an 8Gi heap on a 32Gi node despite a 1Gi limit. Set -XX:MaxRAMPercentage (JVM) or --max-old-space-size (Node) to fit the container limit.
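The 8Gi figure follows from simple arithmetic: the JVM takes a percentage of whatever memory it believes is available. The sketch below assumes the JVM default MaxRAMPercentage of 25; heap_mi is a hypothetical helper name.

```shell
# Hypothetical helper: heap size (Mi) derived from the memory the JVM sees (Mi)
# and a MaxRAMPercentage value.
heap_mi() {
  visible_memory_mi="$1"
  max_ram_percentage="$2"
  echo $(( visible_memory_mi * max_ram_percentage / 100 ))
}

heap_mi 32768 25   # prints 8192: 25% of a 32Gi node, the 8Gi heap seen above
heap_mi 1024 75    # prints 768: 75% of a 1Gi container limit
```

For Node, --max-old-space-size takes a value in megabytes, so the same 75% rule for a 1Gi limit would give --max-old-space-size=768.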
4. Large request or batch spike
A single oversized request, a big query result, or a batch job that loads more data than usual pushes the working set past the limit momentarily and triggers the OOM kill.
# OOM kill events are recorded against the Node object, not the Pod
kubectl get events -A --field-selector involvedObject.kind=Node | grep -i oom
kubectl logs worker-5f7d8c9b6d-h4kpz --previous --tail=5
Warning OOMKilling Memory cgroup out of memory: Killed process 1 (worker)
2026-06-23T14:11:46Z processing batch id=8821 rows=4200000
The kill coincides with an unusually large batch. Stream/paginate the work, cap batch size, or raise the limit to cover the worst-case batch.
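Capping batch size amounts to splitting the work into fixed-size chunks so that only one chunk is in memory at a time. The sketch below shows the chunking arithmetic only; batch_ranges is a hypothetical helper and the 500000-row cap is an illustrative assumption.

```shell
# Hypothetical helper: print "offset count" pairs that cover total_rows
# in chunks of at most max_batch rows.
batch_ranges() {
  total_rows="$1"
  max_batch="$2"
  if [ "$max_batch" -le 0 ]; then
    echo "max_batch must be positive" >&2
    return 1
  fi
  offset=0
  while [ "$offset" -lt "$total_rows" ]; do
    remaining=$((total_rows - offset))
    if [ "$remaining" -lt "$max_batch" ]; then
      count="$remaining"
    else
      count="$max_batch"
    fi
    echo "$offset $count"
    offset=$((offset + count))
  done
}

# The 4.2M-row batch from the log becomes 9 chunks of at most 500000 rows.
batch_ranges 4200000 500000
```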
5. Sidecar hitting its own memory limit
In a multi-container pod each container has its own limit, so a greedy sidecar (logging agent, proxy) can be the one that is OOM-killed while the app stays healthy. Check which container died, not just the pod.
kubectl describe pod worker-5f7d8c9b6d-h4kpz | grep -B1 -A4 'OOMKilled'
kubectl top pod worker-5f7d8c9b6d-h4kpz --containers
istio-proxy:
Last State: Terminated
Reason: OOMKilled
Exit Code: 137
POD NAME MEMORY(bytes)
worker-5f7d8c9b6d-h4kpz worker 180Mi
worker-5f7d8c9b6d-h4kpz istio-proxy 128Mi # at its 128Mi limit
Here the sidecar, not the app, hit its limit. Raise the sidecar’s limits.memory independently.
6. Node memory pressure: eviction vs. container OOM
Distinguish a kernel OOM kill of one container (Reason: OOMKilled) from a kubelet eviction under node memory pressure (Reason: Evicted) — the symptoms look similar but the fix differs.
kubectl get pods -A --field-selector status.phase=Failed | grep -i evict
kubectl describe node <NODE> | grep -A5 'MemoryPressure\|Conditions'
prod worker-5f7d8c9b6d-22abc 0/1 Evicted 0 3m
MemoryPressure True Mon, 23 Jun 2026 14:12:00 +0000 KubeletHasInsufficientMemory
Evicted + node MemoryPressure True means the node is out of memory, not the container’s cgroup. Fix overcommit: set proper requests, add capacity, or move the pod — raising one container’s limit will not help.
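The distinction can be written down as a small lookup from the reported reason to the layer that needs fixing. This is a sketch only; where_to_fix is a hypothetical helper, and the reason string is assumed to come from the pod status (Evicted) or the container's last state (OOMKilled).

```shell
# Hypothetical helper: map a termination reason to the layer to investigate.
where_to_fix() {
  case "$1" in
    OOMKilled) echo "container: review limits.memory, leaks, and heap sizing" ;;
    Evicted)   echo "node: review requests, overcommit, and capacity" ;;
    *)         echo "unknown: not a memory kill covered here" ;;
  esac
}

where_to_fix OOMKilled
where_to_fix Evicted
```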
Diagnostic Workflow
Step 1: Confirm OOMKilled and the exit code
kubectl describe pod <POD> | grep -A5 'Last State'
kubectl get pod <POD> -o jsonpath='{.status.containerStatuses[*].lastState.terminated.reason}'
Reason: OOMKilled with Exit Code: 137 confirms a kernel OOM kill (not a generic crash or SIGTERM 143).
Step 2: Identify which container was killed
kubectl get pod <POD> -o jsonpath='{range .status.containerStatuses[*]}{.name}={.lastState.terminated.reason}{"\n"}{end}'
In multi-container pods this tells you whether the app or a sidecar hit its limit, which determines which limit to change.
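The name=reason lines produced by the jsonpath in Step 2 are easy to filter. The sketch below uses sample input in that format; oomkilled_containers is a hypothetical helper, and in practice the kubectl output would be piped into it.

```shell
# Hypothetical helper: read "name=reason" lines from stdin and print only the
# names of containers whose last termination reason was OOMKilled.
oomkilled_containers() {
  awk -F= '$2 == "OOMKilled" { print $1 }'
}

# A container that has never been terminated has an empty reason ("worker=").
printf '%s\n' 'worker=' 'istio-proxy=OOMKilled' | oomkilled_containers
# prints istio-proxy
```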
Step 3: Compare usage against the limit
kubectl top pod <POD> --containers
kubectl get pod <POD> -o jsonpath='{range .spec.containers[*]}{.name} {.resources.limits.memory}{"\n"}{end}'
Usage sitting at the limit means the limit is too low or the workload spiked; a steady climb means a leak.
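Once usage and limit are side by side, flagging containers close to their limit is a one-line calculation. The sketch below assumes input lines of the form "name used_mi limit_mi" assembled from the two commands above; near_limit is a hypothetical helper, and the 512Mi limit for worker and the 90% threshold are illustrative.

```shell
# Hypothetical helper: read "name used_mi limit_mi" lines from stdin and print
# containers whose usage is at or above threshold_pct of their limit.
near_limit() {
  threshold_pct="$1"
  awk -v t="$threshold_pct" '
    $3 > 0 && ($2 * 100 / $3) >= t { printf "%s %d%%\n", $1, $2 * 100 / $3 }
  '
}

printf '%s\n' 'worker 180 512' 'istio-proxy 128 128' | near_limit 90
# prints istio-proxy 100%
```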
Step 4: Check for node-level pressure
kubectl get pods -A --field-selector status.phase=Failed | grep -i evict
kubectl describe node <NODE> | grep -A5 'Conditions'
kubectl top node
If the node shows MemoryPressure True and pods are Evicted, treat it as a node capacity/overcommit problem, not a per-container limit.
Step 5: Inspect runtime heap settings and events
# OOM kill events are recorded against the Node object, not the Pod
kubectl get events -A --field-selector involvedObject.kind=Node | grep -i oom
kubectl get pod <POD> -o jsonpath='{.spec.containers[0].env}'
For JVM/Node apps, confirm the heap is bounded relative to the cgroup limit (-XX:MaxRAMPercentage, --max-old-space-size).
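To compare the heap against the limit the kernel actually enforces, the limit can be read from the cgroup filesystem inside the container. The sketch below is an assumption-laden helper: cgroup_memory_limit is a hypothetical name, and the two paths are the usual cgroup v2 and v1 locations as seen from inside a container.

```shell
# Hypothetical helper: print the memory limit enforced for the current cgroup.
# cgroup_root defaults to /sys/fs/cgroup; it is a parameter only so the
# function can be exercised against a scratch directory.
cgroup_memory_limit() {
  cgroup_root="${1:-/sys/fs/cgroup}"
  if [ -r "$cgroup_root/memory.max" ]; then
    cat "$cgroup_root/memory.max"                     # cgroup v2 ("max" = no limit)
  elif [ -r "$cgroup_root/memory/memory.limit_in_bytes" ]; then
    cat "$cgroup_root/memory/memory.limit_in_bytes"   # cgroup v1
  else
    echo "unknown"
    return 1
  fi
}

# Inside a container with a shell available, call it with no argument:
#   cgroup_memory_limit
```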
Example Root Cause Analysis
A reporting service report-gen keeps restarting; the pod alternates between Running and OOMKilled.
kubectl describe pod report-gen-6c4f7d9b8-tn3vw | grep -A5 'Last State'
Last State: Terminated
Reason: OOMKilled
Exit Code: 137
It is a Java service with a 1Gi limit. We compare usage to the limit:
kubectl top pod report-gen-6c4f7d9b8-tn3vw --containers
kubectl get pod report-gen-6c4f7d9b8-tn3vw -o jsonpath='{.spec.containers[0].resources.limits.memory}'
POD NAME MEMORY(bytes)
report-gen-6c4f7d9b8-tn3vw report-gen 1010Mi
1Gi
Usage is right at the 1Gi limit. We then check the JVM heap sizing:
kubectl exec report-gen-6c4f7d9b8-tn3vw -- java -XX:+PrintFlagsFinal -version 2>/dev/null | grep -i MaxHeapSize
size_t MaxHeapSize = 8589934592 {product}
The JVM sized an 8Gi heap from the 32Gi node total, ignoring the 1Gi container limit. The heap reservation plus off-heap usage blows past 1Gi the moment real work starts, and the kernel OOM-kills it.
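The mismatch is easier to see with both numbers in the same unit. A minimal sketch, with heap_vs_limit as a hypothetical helper that takes MaxHeapSize in bytes and the container limit in Mi:

```shell
# Hypothetical helper: compare a JVM MaxHeapSize (bytes) with a container limit (Mi).
heap_vs_limit() {
  heap_size_mi=$(( $1 / 1024 / 1024 ))
  limit_mi="$2"
  if [ "$heap_size_mi" -gt "$limit_mi" ]; then
    echo "heap ${heap_size_mi}Mi exceeds limit ${limit_mi}Mi"
  else
    echo "heap ${heap_size_mi}Mi fits within limit ${limit_mi}Mi"
  fi
}

heap_vs_limit 8589934592 1024   # prints: heap 8192Mi exceeds limit 1024Mi
```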
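Fix: bound the heap to a fraction of the cgroup limit so off-heap memory has headroom, then roll out the change. This assumes the image’s entrypoint passes JAVA_OPTS to the java command (JAVA_TOOL_OPTIONS is read by the JVM directly if it does not) and a JDK that supports -XX:MaxRAMPercentage (JDK 10+ or 8u191+):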
kubectl set env deployment/report-gen JAVA_OPTS="-XX:MaxRAMPercentage=75.0"
kubectl set resources deployment/report-gen --limits=memory=1Gi --requests=memory=768Mi
kubectl rollout restart deployment report-gen
The heap now caps at ~768Mi inside the 1Gi limit, the OOM kills stop, and the pod stays 1/1 Running.
Prevention Best Practices
- Set limits.memory from observed peak usage plus headroom, and set requests.memory so the scheduler reserves real memory rather than overcommitting the node.
- Make runtimes cgroup-aware: use -XX:MaxRAMPercentage for the JVM and --max-old-space-size for Node so the heap never exceeds the container limit.
- Watch for steady upward memory trends in dashboards. A leak shows as a saw-tooth of climb-then-OOM, and raising the limit only changes the period, not the outcome.
- Give sidecars their own realistic limits; a logging agent or proxy at its cap will be OOMKilled independently of the app.
- Cap batch/request sizes and stream large payloads so a single oversized job cannot spike the working set past the limit.
- Protect nodes from overcommit with requests, ResourceQuotas, and headroom so kernel OOM kills do not escalate into node-wide MemoryPressure evictions.
Quick Command Reference
# Confirm OOMKilled and exit code 137
kubectl describe pod <POD> | grep -A5 'Last State'
kubectl get pod <POD> -o jsonpath='{.status.containerStatuses[*].lastState.terminated.reason}'
# Which container was killed (multi-container pods)
kubectl get pod <POD> -o jsonpath='{range .status.containerStatuses[*]}{.name}={.lastState.terminated.reason}{"\n"}{end}'
# Usage vs. limit
kubectl top pod <POD> --containers
kubectl get pod <POD> -o jsonpath='{range .spec.containers[*]}{.name} {.resources.limits.memory}{"\n"}{end}'
# Node-level pressure / evictions
kubectl get pods -A --field-selector status.phase=Failed | grep -i evict
kubectl describe node <NODE> | grep -A5 'Conditions'
kubectl top node
# OOM events and runtime heap env
kubectl get events -A --field-selector involvedObject.kind=Node | grep -i oom   # OOM kill events are recorded on the Node
kubectl get pod <POD> -o jsonpath='{.spec.containers[0].env}'
# Apply a fix
kubectl set resources deployment <DEPLOY> --limits=memory=1Gi --requests=memory=768Mi
kubectl rollout restart deployment <DEPLOY>
Conclusion
OOMKilled with exit code 137 means the kernel killed a container for exceeding its cgroup memory limit. The usual root causes:
- The limits.memory is set too low for the container’s normal working set.
- A memory leak in the application that climbs until it hits the limit.
- A JVM/Node heap sized from node memory instead of the container’s cgroup limit.
- A large request or batch spike that pushes the working set over the limit.
- A sidecar (proxy/logging agent) hitting its own limit rather than the app.
- Node memory pressure causing kubelet eviction (Reason: Evicted) rather than a per-container OOM.
Confirm Reason: OOMKilled and which container died first, then compare kubectl top against the limit — that distinguishes a too-small limit from a leak, a greedy sidecar, or node-wide pressure before you change anything.