Linux Error: Out of memory: Killed process

Summary

Out of memory: Killed process is the Linux kernel’s OOM killer terminating a running process because an allocation could not be satisfied and there was nothing left to reclaim. It fires at the moment of a failing allocation (a growing heap, a fork, a page fault that cannot be backed), and the victim is chosen by oom_score, so it is not necessarily the process that triggered the kill. Cannot allocate memory/ENOMEM is the contrasting failure: it rejects an allocation up front, before any process dies. The kill is either host-wide (CONSTRAINT_NONE) or scoped to a single cgroup hitting its memory.max/systemd MemoryMax (CONSTRAINT_MEMCG); containers killed this way exit with code 137 (128 + signal 9, SIGKILL).
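The 128 + signal arithmetic can be checked with a small helper. This is a minimal sketch: decode_exit is a hypothetical name, not a standard tool, and it only decodes the status. Any SIGKILL produces 137, so the kernel log is still needed to confirm that the OOM killer sent it.

```shell
# Decode a shell exit status: values above 128 mean the process was
# terminated by signal (status - 128).
decode_exit() {
  status=$1
  if [ "$status" -gt 128 ]; then
    echo "killed by signal $((status - 128))"
  else
    echo "exited with code $status"
  fi
}

decode_exit 137   # killed by signal 9
decode_exit 1     # exited with code 1
```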

Common Symptoms

A process disappears with no application-level error; the service restarts or simply vanishes.
dmesg/journal shows Out of memory: Killed process or Memory cgroup out of memory.
A container shows OOMKilled status and exit code 137.
free -h shows near-zero available and little or no swap right before the kill.
A service on a host with plenty of free RAM keeps dying — a tell for a cgroup-scoped kill.

Most Likely Causes of the ‘Out of memory: Killed process’ Error

Ordered by how often they cause Out of memory: Killed process in production:

A cgroup hitting its memory.max / systemd MemoryMax while the host still has free RAM — reported as CONSTRAINT_MEMCG. The single most common self-inflicted case.
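A container whose memory limit is set below its working set, OOM-killed with exit code 137 (OOMKilled); a container with no limit at all can instead push the whole host into a host-wide kill.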
A leaking process accumulating the largest RSS and the highest oom_score, selected as the victim.
Host RAM exhausted with no swap to absorb the spike — CONSTRAINT_NONE, host-wide.
Page-cache and dirty-page pressure against large anonymous memory that reclaim cannot free fast enough.
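Too-strict overcommit (vm.overcommit_memory=2) denying allocations at the CommitLimit. This normally surfaces as Cannot allocate memory/ENOMEM rather than an OOM kill, but the two are easily confused, so rule it out.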

Quick Triage

# Did an OOM kill occur, and was it host-wide or cgroup-scoped?
dmesg -T | grep -i -E 'oom|killed process'
journalctl -k | grep -i oom

# Memory state at the time of the kill.
free -h
swapon --show

In the kernel line, the constraint= field is the pivot: CONSTRAINT_NONE is host-wide, CONSTRAINT_MEMCG is a cgroup/container limit. That single word drives the whole investigation.

Diagnostic Commands

dmesg -T | grep -i -E 'oom|killed process' | tail -10
journalctl -k | grep -i oom

Note the victim pid/comm and the constraint= field. Example: oom-kill:constraint=CONSTRAINT_MEMCG,oom_memcg=/system.slice/app.service,task=python3,pid=8123.
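Pulling the fields out of that line is easy to script. The sketch below is illustrative only: oom_field and classify_oom are hypothetical helper names, and the sample is the example line quoted above rather than output captured from a real host.

```shell
# Sample line copied from the example above. On a real host, feed it
# lines from dmesg or journalctl -k instead.
sample='oom-kill:constraint=CONSTRAINT_MEMCG,oom_memcg=/system.slice/app.service,task=python3,pid=8123'

# Print the value of one key=value field from an oom-kill line.
# Usage: oom_field KEY LINE
oom_field() {
  printf '%s\n' "$2" | sed -n "s/.*[:,]$1=\([^,]*\).*/\1/p"
}

# Map the constraint= field to the investigation path.
classify_oom() {
  case "$(oom_field constraint "$1")" in
    CONSTRAINT_MEMCG) echo "cgroup-scoped: check the limit on $(oom_field oom_memcg "$1")" ;;
    CONSTRAINT_NONE)  echo "host-wide: check host RAM and swap" ;;
    *)                echo "unrecognised or missing constraint field" ;;
  esac
}

classify_oom "$sample"
oom_field pid "$sample"
```

Anchoring each key on a preceding colon or comma keeps task= from matching inside longer keys such as task_memcg=.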

free -h                                  # available memory + swap
swapon --show                            # is there any swap to spill to?
cat /proc/meminfo                        # AnonPages, Cached, Dirty, Committed_AS

Swap: 0B with near-zero available means the next sizeable allocation is likely to trigger a host-wide kill.
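The same check can be automated against /proc/meminfo. This is a rough sketch: meminfo_risk is a hypothetical helper, the 5% threshold is an arbitrary assumption rather than a kernel constant, and the sample figures are invented.

```shell
# Read meminfo-style text on stdin and flag the "low available, no swap" state.
meminfo_risk() {
  awk '
    /^MemTotal:/     { total = $2 }
    /^MemAvailable:/ { avail = $2 }
    /^SwapTotal:/    { swap  = $2 }
    END {
      if (total <= 0) { print "MemTotal not found"; exit 1 }
      pct  = int(avail * 100 / total)
      # Assumed threshold: under 5% available with no swap counts as high risk.
      risk = (pct < 5 && swap == 0) ? "high" : "normal"
      printf "available=%d%% swap_kb=%d risk=%s\n", pct, swap, risk
    }'
}

# Invented sample values, so the sketch runs on any machine.
meminfo_risk <<'EOF'
MemTotal:        8000000 kB
MemAvailable:     160000 kB
SwapTotal:             0 kB
EOF
```

On a live host the equivalent call would be meminfo_risk < /proc/meminfo.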

# Highest oom_score = the kernel's next victim.
for p in /proc/[0-9]*; do s=$(cat "$p/oom_score" 2>/dev/null) || continue; \
  printf '%5s %s %s\n' "$s" "${p#/proc/}" "$(cat "$p/comm" 2>/dev/null)"; done | sort -rn | head -5
cat /proc/<pid>/oom_score /proc/<pid>/oom_score_adj

A single process far above the rest is your leaking or largest workload.

systemctl show <unit>.service -p MemoryMax -p MemoryCurrent
cat /sys/fs/cgroup/system.slice/<unit>.service/memory.max
cat /sys/fs/cgroup/system.slice/<unit>.service/memory.current
cat /sys/fs/cgroup/system.slice/<unit>.service/memory.events   # oom / oom_kill counters

MemoryCurrent near MemoryMax with a non-zero oom_kill in memory.events pins the kill to that unit’s own limit (cgroup v2 layout, default on modern Ubuntu/Debian and RHEL/Rocky).
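The three files can be folded into a one-line report. This is a sketch under assumptions: cgroup_oom_report is a hypothetical helper, it expects the cgroup v2 file layout, and the demo values are invented and written to a temporary directory so it runs anywhere.

```shell
# Report usage against the limit and the oom_kill counter for one
# cgroup v2 directory. Usage: cgroup_oom_report DIR
cgroup_oom_report() {
  dir=$1
  max=$(cat "$dir/memory.max")
  cur=$(cat "$dir/memory.current")
  kills=$(awk '$1 == "oom_kill" { print $2 }' "$dir/memory.events")
  if [ "$max" = "max" ]; then
    # The literal string "max" means no limit is set on this cgroup.
    echo "limit=unlimited current=$cur oom_kill=${kills:-0}"
  else
    echo "limit=$max current=$cur used=$((cur * 100 / max))% oom_kill=${kills:-0}"
  fi
}

# Invented sample values: 1 GiB limit, 1014 MiB in use, three kills.
demo=$(mktemp -d)
echo 1073741824 > "$demo/memory.max"
echo 1063256064 > "$demo/memory.current"
printf 'low 0\nhigh 0\nmax 412\noom 3\noom_kill 3\n' > "$demo/memory.events"
cgroup_oom_report "$demo"
rm -rf "$demo"
```

On a real host the argument would be the unit's directory, for example /sys/fs/cgroup/system.slice/<unit>.service.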

cat /proc/sys/vm/overcommit_memory       # 2 = strict, denies at CommitLimit
cat /proc/sys/vm/overcommit_ratio
sysctl vm.swappiness
grep -E 'CommitLimit|Committed_AS' /proc/meminfo

Rule out a too-strict overcommit policy or a missing swap device before blaming the workload.
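For reference, in strict mode the kernel derives CommitLimit from RAM, swap, and overcommit_ratio. The sketch below works through that arithmetic. It assumes vm.overcommit_kbytes is unset (a non-zero value replaces the ratio term), commit_limit_kb is a hypothetical helper name, and the inputs are invented.

```shell
# CommitLimit (kB) = (MemTotal - hugepage memory) * overcommit_ratio / 100 + SwapTotal
# Usage: commit_limit_kb MEM_KB SWAP_KB RATIO [HUGEPAGE_KB]
commit_limit_kb() {
  mem_kb=$1; swap_kb=$2; ratio=$3; huge_kb=${4:-0}
  echo $(( (mem_kb - huge_kb) * ratio / 100 + swap_kb ))
}

# 16 GiB RAM, 2 GiB swap, ratio 50 (the kernel default), no hugepages.
commit_limit_kb 16777216 2097152 50   # 10485760 kB, i.e. 10 GiB
```

When Committed_AS approaches this figure under vm.overcommit_memory=2, new allocations start failing even though RAM may still be free.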

systemd-cgtop --order=memory -n1         # live per-cgroup memory ranking
vmstat 1 5                               # si/so swap-in/out, r/b queues
smem -tk                                 # PSS-accurate per-process memory (if installed)

# Containers: confirm the OOM kill and the limit.
docker inspect -f '{{.State.OOMKilled}} {{.State.ExitCode}} mem={{.HostConfig.Memory}}' <container>
docker stats --no-stream <container>

State.OOMKilled=true, ExitCode=137, and usage pinned at the limit confirm a container hit its own ceiling.
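The output of that inspect format string can be turned into a readable verdict. This is a minimal sketch: interpret_inspect is a hypothetical helper and the input line is an invented sample in the same three-field shape. It relies on Docker reporting HostConfig.Memory as 0 when no limit is set.

```shell
# Interpret one line of the form "OOMKilled ExitCode mem=BYTES".
interpret_inspect() {
  read -r oomkilled code mem <<EOF
$1
EOF
  bytes=${mem#mem=}
  if [ "$bytes" -eq 0 ]; then
    limit="none"                          # 0 means no memory limit configured
  else
    limit="$((bytes / 1048576))MiB"
  fi
  if [ "$oomkilled" = "true" ]; then
    echo "OOM-killed (exit $code), limit=$limit"
  else
    echo "not OOM-killed (exit $code), limit=$limit"
  fi
}

interpret_inspect "true 137 mem=536870912"   # OOM-killed (exit 137), limit=512MiB
```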

Fix / Remediation

Match the fix to the constraint you found — safe, non-destructive changes first.

Raise a cgroup / unit ceiling to the real working set (cgroup-scoped kill):

sudo systemctl edit <unit>.service
# [Service]
# MemoryMax=1G
sudo systemctl daemon-reload
sudo systemctl restart <unit>.service

Raise a container’s limit to measured usage plus headroom: docker run --memory=1g ..., or set resources.limits.memory in Kubernetes.

Protect critical daemons so the kernel kills the right workload first:

sudo systemctl edit sshd.service
# [Service]
# OOMScoreAdjust=-800
sudo systemctl daemon-reload
sudo systemctl restart sshd.service      # OOMScoreAdjust applies when the service is (re)started

Add swap so transient spikes spill instead of triggering an immediate kill:

sudo fallocate -l 2G /swapfile && sudo chmod 600 /swapfile
sudo mkswap /swapfile && sudo swapon /swapfile
echo '/swapfile none swap sw 0 0' | sudo tee -a /etc/fstab

Relax strict overcommit if vm.overcommit_memory=2 denies valid allocations at the CommitLimit:

echo 'vm.overcommit_memory = 0' | sudo tee /etc/sysctl.d/99-overcommit.conf
sudo sysctl --system

Fix the underlying leak — raising limits only buys time if RSS climbs without bound.

Warning: Manually killing the top-RSS process to recover a wedged host is destructive and can lose in-flight work. Prefer raising limits or adding swap. As a last resort, after identifying the offender: sudo kill <pid> (escalate to kill -9 only if it ignores SIGTERM).

Validation

# oom_kill should stop incrementing after the fix.
cat /sys/fs/cgroup/system.slice/<unit>.service/memory.events
systemctl show <unit>.service -p MemoryMax -p MemoryCurrent

# No new OOM lines since the change.
journalctl -k --since "10 min ago" | grep -i oom

# Container should no longer report OOMKilled.
docker inspect -f '{{.State.OOMKilled}} {{.State.ExitCode}}' <container>

After a correct fix, memory.current peaks below memory.max, the oom_kill counter holds steady, and the restart loop ends. A counter that keeps climbing means the working set still exceeds the limit or the process is leaking.
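Checking that the counter holds steady means comparing two readings taken some time apart. The sketch below simulates that with a temporary file. oom_kill_of and oom_kill_delta are hypothetical helpers, and the counter values are invented.

```shell
# Extract the oom_kill counter from a memory.events file.
oom_kill_of() { awk '$1 == "oom_kill" { print $2 }' "$1"; }

# Compare two readings of the counter.
oom_kill_delta() {
  before=$1; after=$2
  if [ "$after" -gt "$before" ]; then
    echo "still killing: +$((after - before)) since the first reading"
  else
    echo "steady: no new kills"
  fi
}

# Simulated readings. On a real host, read the unit's memory.events,
# wait through a representative load period, then read it again.
events=$(mktemp)
printf 'oom 3\noom_kill 3\n' > "$events"; before=$(oom_kill_of "$events")
printf 'oom 5\noom_kill 5\n' > "$events"; after=$(oom_kill_of "$events")
oom_kill_delta "$before" "$after"
rm -f "$events"
```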

Prevention

Right-size cgroup limits (MemoryMax, container --memory) to the measured working set plus headroom — under-sized limits cause self-inflicted OOM while the host sits idle.
Use systemd MemoryHigh= as a soft throttle below the hard MemoryMax= so a workload is slowed before it is killed.
Tune vm.overcommit_memory/overcommit_ratio deliberately and always provision some swap or zram, even a modest swapfile, on hosts that lack it.
Protect critical daemons with OOMScoreAdjust (e.g. -800 for sshd/monitoring) so the kernel picks the right victim.
Alert on cgroup v2 memory.events oom_kill counters and on container exit code 137; a rising count is an early warning before a full outage.
Baseline leak-prone services with systemd-cgtop and free -h trends so a slow climb is caught before it reaches the ceiling.
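The right-sizing rule in the first item reduces to simple arithmetic. This is a sketch only: suggest_limit_mib is a hypothetical helper, the 25% default headroom is an illustrative assumption rather than a recommendation from this guide, and the peak value is invented.

```shell
# Suggest a limit in MiB from a measured peak (bytes) plus percentage headroom,
# rounded up to the next whole MiB.
# Usage: suggest_limit_mib PEAK_BYTES [HEADROOM_PCT]
suggest_limit_mib() {
  peak_bytes=$1; headroom_pct=${2:-25}
  echo $(( (peak_bytes * (100 + headroom_pct) / 100 + 1048575) / 1048576 ))
}

suggest_limit_mib 805306368      # 768 MiB peak + 25% -> 960
suggest_limit_mib 805306368 50   # 768 MiB peak + 50% -> 1152
```

The result maps directly onto a setting such as MemoryMax=960M or docker run --memory=960m.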

Related Errors

Cannot allocate memory - the ENOMEM counterpart that rejects allocations before any process dies.
fork: Resource temporarily unavailable - process/thread creation failing under memory or PID pressure.
TCP: out of memory - kernel socket-buffer memory exhaustion.
Too many open files - a related resource-limit failure.
Segmentation fault (core dumped) - a different class of process crash.
Taming the Linux OOM killer - deep dive on oom_score and tuning the killer.
Linux Error Guides hub - the full index of Linux error walkthroughs.

Final Notes

Out of memory: Killed process means an allocation could not be satisfied and the kernel killed a process to recover — either host-wide or inside a single cgroup. Read the kernel message first and let the constraint= field decide the path: CONSTRAINT_MEMCG sends you to a unit/container limit, CONSTRAINT_NONE to host memory and swap. Then size limits and swap to the real working set — most OOM kills are simply a limit set smaller than the workload actually needs.
