Docker Error Guide: 'OOMKilled' Exit Code 137 Out-of-Memory Container Kills
Fix Docker OOMKilled and exit code 137: raise memory limits, make the JVM and Node cgroup-aware, find leaks, and read dmesg to confirm the kernel OOM kill.
Exact Error Message
A container that the kernel killed for exceeding its memory budget shows OOMKilled: true in its state and exits with code 137:
$ docker inspect myapp --format '{{.State.OOMKilled}} {{.State.ExitCode}}'
true 137
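The kill itself is recorded by the kernel and is visible in dmesg. The first line below is the form a cgroup-limit kill takes and the second is the host-level form; the exact wording varies by kernel version: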
[1234567.890] Memory cgroup out of memory: Killed process 4821 (java) total-vm:5242880kB, anon-rss:2097152kB, file-rss:0kB
[1234567.891] Out of memory: Killed process 4821 (java) score 932 or sacrifice child
You may also see the same flag as "OOMKilled": true under the State object in the full docker inspect JSON, and the container status Exited (137) in docker ps -a.
What It Means
OOMKilled: true means the Linux kernel’s out-of-memory killer terminated the container’s main process because it tried to use more memory than was available to it. Exit code 137 is 128 + 9 — the process received SIGKILL (signal 9), which is the OOM killer’s blunt instrument; the process cannot catch or clean up after it.
There are two distinct flavours. If you set --memory on the container, the kill comes from the memory cgroup: the container hit its own limit and only that container was killed (dmesg says “Memory cgroup out of memory”). If you set no limit, the kill comes from the host running out of physical memory + swap, and the kernel picks a victim by OOM score — which may or may not be your container (“Out of memory: Killed process”). Either way the container exits 137 and OOMKilled is true.
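The two flavours can be told apart mechanically from the kernel log text. The following is a minimal sketch, assuming the two phrases quoted above are what your kernel prints; the function name oom_source is illustrative and not part of Docker or the kernel.

```shell
#!/bin/sh
# oom_source LINE
# Classify one kernel log line by the two flavours described above.
# Prints "cgroup" (container hit its own limit), "host" (host ran out of
# memory), or "none" (not an OOM line). The name is hypothetical.
oom_source() {
  case "$1" in
    *"Memory cgroup out of memory"*) echo "cgroup" ;;
    *"Out of memory"*)               echo "host" ;;
    *)                               echo "none" ;;
  esac
}

# Example calls with literal log text:
oom_source "Memory cgroup out of memory: Killed process 4821 (java)"
oom_source "Out of memory: Killed process 4821 (java)"
```

To scan a real log you could feed each line to it, for example: dmesg | while IFS= read -r line; do oom_source "$line"; done | sort | uniq -c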
One subtlety trips people up constantly: exit code 137 on its own does not prove an OOM kill. Any SIGKILL — a docker kill, a stuck shutdown that Docker force-kills after the stop timeout, or an external supervisor sending signal 9 — produces the same 128 + 9 = 137. The authoritative signal is OOMKilled: true in the container state combined with a matching line in dmesg. If OOMKilled is false but the exit code is 137, something else sent the kill and you should look at orchestration or stop-timeout behaviour rather than memory limits.
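The decision rule in the paragraph above can be written down as a small helper. This is only a sketch: kill_verdict and its output labels are made-up names, and it takes the two values that docker inspect prints for .State.OOMKilled and .State.ExitCode.

```shell
#!/bin/sh
# kill_verdict OOMKILLED EXITCODE
# OOMKILLED is "true" or "false"; EXITCODE is the numeric exit code.
# Exit codes above 128 follow the 128 + signal-number convention.
kill_verdict() {
  oom=$1
  code=$2
  if [ "$oom" = "true" ]; then
    echo "oom-kill"                    # kernel OOM killer, check dmesg
  elif [ "$code" -eq 137 ]; then
    echo "sigkill-not-oom"             # 128 + 9, but sent by something else
  elif [ "$code" -gt 128 ]; then
    echo "signal-$((code - 128))"      # some other signal
  else
    echo "exit-$code"                  # ordinary application exit
  fi
}

# Example calls with literal values:
kill_verdict true 137
kill_verdict false 137
kill_verdict false 143
```

Against a real container it could be called as: kill_verdict $(docker inspect myapp --format '{{.State.OOMKilled}} {{.State.ExitCode}}')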
Common Causes
- Container exceeded its --memory limit. The workload’s real footprint is larger than the cgroup limit you set.
- Host OOM with no limit set. Several unbounded containers compete for host RAM and the kernel kills one.
- Runtime not cgroup-aware. The JVM or Node sizes its heap from the host’s total RAM, not the container’s cgroup limit, so it allocates a heap larger than the limit allows and gets killed under load.
- A genuine memory leak. Usage climbs steadily until it crosses the limit; restarts only reset the clock.
- Limit set too low for the workload’s legitimate peak (large request, batch job, cache warm-up).
- Page cache / kernel memory pressure counted against the cgroup, pushing a borderline container over the edge.
How to Reproduce the Error
Run a container with a small limit and a process that allocates more than that:
# no --rm here: a removed container cannot be inspected afterwards
docker run --name oomtest --memory=64m --memory-swap=64m polinux/stress \
stress --vm 1 --vm-bytes 200M --vm-hang 0
stress: FAIL: [1] (...) process exited
docker inspect oomtest --format '{{.State.OOMKilled}} {{.State.ExitCode}}'
true 137
docker rm oomtest
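The --vm-bytes 200M allocation blows past the 64m cgroup limit, and the kernel OOM-kills the stress worker almost immediately. Because stress does its allocation in a forked worker process, the exit code you see may be the parent’s own failure code rather than 137; OOMKilled: true is the field to rely on.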
Diagnostic Commands
Confirm it really was an OOM kill, not an application crash that also returned 137:
docker inspect myapp --format 'OOMKilled={{.State.OOMKilled}} Exit={{.State.ExitCode}} Reason={{.State.Error}}'
docker ps -a --filter name=myapp --format '{{.Names}} {{.Status}}'
Watch live memory against the limit while reproducing load:
docker stats --no-stream
docker stats myapp --format '{{.Name}}: {{.MemUsage}} ({{.MemPerc}})'
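The MemPerc column lends itself to a simple threshold check while you reproduce load. A minimal sketch follows, assuming the value arrives as a string such as 83.42% and that awk is available; mem_perc_exceeds is a hypothetical helper name, and the 80 threshold is only an example.

```shell
#!/bin/sh
# mem_perc_exceeds PERCENT_STRING THRESHOLD
# Succeeds (exit status 0) when the percentage is at or above the threshold.
mem_perc_exceeds() {
  perc=${1%"%"}    # strip the trailing percent sign
  # awk handles the decimal comparison that plain sh arithmetic cannot
  awk -v p="$perc" -v t="$2" 'BEGIN { exit !(p + 0 >= t + 0) }'
}

# Example with a literal value:
if mem_perc_exceeds "83.42%" 80; then
  echo "ALERT: memory above threshold"
else
  echo "ok"
fi
```

With a live container the first argument could come from: docker stats myapp --no-stream --format '{{.MemPerc}}'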
Read the kernel’s record of the kill — this is the authoritative source:
dmesg -T | grep -i -E 'oom|killed process|out of memory' | tail -20
journalctl -k --no-pager | grep -i 'out of memory' | tail -10
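Once a kill line has been found, its fields can be pulled apart for a report. The sketch below assumes the line layout shown in the dmesg example earlier in this guide (Killed process PID (name) ... anon-rss:NkB); other kernel versions may format the line differently, and parse_oom_line is an illustrative name.

```shell
#!/bin/sh
# parse_oom_line LINE
# Extract the PID, process name and anon-rss (converted to MiB) from a
# "Killed process" kernel line. Prints nothing if the line does not match.
parse_oom_line() {
  printf '%s\n' "$1" | sed -n -E \
    's/.*Killed process ([0-9]+) \(([^)]*)\).*anon-rss:([0-9]+)kB.*/\1 \2 \3/p' |
  while read -r pid name rss_kb; do
    echo "pid=$pid name=$name anon_rss_mib=$((rss_kb / 1024))"
  done
}

# Example with the sample line from this guide:
parse_oom_line "[1234567.890] Memory cgroup out of memory: Killed process 4821 (java) total-vm:5242880kB, anon-rss:2097152kB, file-rss:0kB"
```

Comparing the reported anon-rss with the container limit is a quick sanity check that the killed process was indeed the one consuming the memory.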
Check the limit the container was actually given, and the daemon’s overall view:
docker inspect myapp --format 'limit={{.HostConfig.Memory}} swap={{.HostConfig.MemorySwap}}'
docker info --format 'Total mem: {{.MemTotal}} | Swap limit support: {{.SwapLimit}}'
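The two HostConfig values are raw byte counts, which are easy to misread. The following sketch turns them into something readable, assuming the usual Docker conventions that Memory of 0 means no limit, MemorySwap is the total of memory plus swap, and MemorySwap of -1 means unlimited swap. describe_limits is a hypothetical helper.

```shell
#!/bin/sh
# describe_limits MEMORY_BYTES MEMORYSWAP_BYTES
# Both arguments are the byte values printed by docker inspect.
describe_limits() {
  mem=$1
  total=$2
  if [ "$mem" -eq 0 ]; then
    echo "memory=unlimited"
    return
  fi
  mem_mib=$((mem / 1048576))
  if [ "$total" -eq -1 ]; then
    echo "memory=${mem_mib}MiB swap=unlimited"
  elif [ "$total" -eq 0 ]; then
    echo "memory=${mem_mib}MiB swap=default"
  else
    # swap allowance is the total minus the memory part
    echo "memory=${mem_mib}MiB swap=$(( (total - mem) / 1048576 ))MiB"
  fi
}

# Example calls with literal byte values:
describe_limits 536870912 536870912
describe_limits 536870912 1073741824
describe_limits 0 0
```

For a real container: describe_limits $(docker inspect myapp --format '{{.HostConfig.Memory}} {{.HostConfig.MemorySwap}}')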
Step-by-Step Resolution
1. Confirm the kill source. If dmesg says “Memory cgroup out of memory”, the container hit its own --memory limit. If it says “Out of memory” without “cgroup”, the host ran out — reduce total committed memory or add capacity.
2. Right-size the limit (or find the leak). If peak usage is legitimately higher than the limit, raise it:
docker update --memory=512m --memory-swap=512m myapp # for a running container
# or in run:
docker run --memory=512m --memory-swap=512m myapp:latest
If usage climbs without bound under steady load, you have a leak — profile the app; raising the limit only delays the kill.
3. Make the JVM cgroup-aware. JVMs from JDK 10 (and 8u191) onward honour cgroup limits, but you should still size the heap as a fraction of the container limit rather than with a fixed -Xmx derived from host RAM:
docker run --memory=1g openjdk:21 java -XX:MaxRAMPercentage=75.0 -jar app.jar
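4. Cap the Node.js heap to fit inside the cgroup limit. V8’s default heap ceiling is not guaranteed to track your --memory value on every Node version, so set it explicitly (the value is in megabytes):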
docker run --memory=1g node:22 node --max-old-space-size=768 server.js
5. Account for swap correctly. --memory-swap sets the total of memory plus swap, not swap alone. Setting --memory=512m on its own leaves --memory-swap at its default, which lets the container use up to another 512m of swap when the host has swap enabled, and that can mask the real RSS. Setting --memory-swap equal to --memory disables swap for the container, so the limit is hard and the kill is deterministic. For predictable behaviour in production, keep them equal.
6. Add headroom and monitoring. Leave ~20-25% above observed peak, and alert on MemPerc so you catch creep before the kill. Always set explicit limits on every container so one workload cannot OOM the host. When the host itself is the victim (no cgroup line in dmesg), the fix is not on one container — reduce total committed memory across containers, add RAM, or set limits on the previously unbounded workloads so the kernel has a bounded total to schedule against.
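The sizing arithmetic behind steps 3, 4 and 6 is simple enough to script. A rough sketch, with hypothetical helper names and all values in MiB: heap_mib gives the heap that a percentage of the container limit yields (75% of a 1g limit is the 768 used in the examples above), and limit_with_headroom adds a headroom percentage to an observed peak, rounding up.

```shell
#!/bin/sh
# heap_mib LIMIT_MIB PERCENT
# Heap size obtained by taking PERCENT of the container limit.
heap_mib() {
  echo $(( $1 * $2 / 100 ))
}

# limit_with_headroom PEAK_MIB HEADROOM_PERCENT
# Observed peak plus headroom, rounded up to the next whole MiB.
limit_with_headroom() {
  echo $(( ($1 * (100 + $2) + 99) / 100 ))
}

# Example calls with literal values:
heap_mib 1024 75             # heap for a 1g limit at 75%
limit_with_headroom 400 25   # limit for a 400 MiB observed peak
```

The 400 MiB peak is an invented figure for illustration; in practice the peak would come from docker stats observed under realistic load.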
How to Prevent the Issue
- Set an explicit --memory limit on every container so the blast radius of a leak is one container, not the whole host.
- Make every managed runtime cgroup-aware: -XX:MaxRAMPercentage for the JVM, --max-old-space-size for Node, equivalent flags elsewhere.
- Size limits from observed peak docker stats, not guesses, and leave 20-25% headroom for spikes and page cache.
- Alert on memory percentage trending toward the limit so a slow leak is caught before exit code 137.
- Load-test at the configured limit so the OOM kill happens in CI, not production. See more in Docker guides.
Related Docker Errors
- Kubernetes Error: OOMKilled (the same kernel mechanism inside Kubernetes pods, where the container’s memory limits play the role of --memory). The diagnosis (dmesg, exit 137) is identical; the fix is set on the pod spec.
- Docker Error: OCI runtime create failed, for container startup failures that are not memory related but also surface as non-zero exits.
- Browse the full Docker troubleshooting category for more runtime error guides.