Linux Error: Cannot allocate memory

Summary

Cannot allocate memory (errno ENOMEM) is the kernel refusing an allocation or task-creation request up front, most often on fork()/clone() or mmap(). Nothing is killed: the syscall simply fails. When the refusal comes from a process or thread ceiling rather than memory accounting, fork() usually returns EAGAIN instead, and the shell retries while printing fork: retry: Resource temporarily unavailable. Either way this is the opposite of the OOM killer, which reaps a running process after memory is committed (see Out of memory: Killed process). Just as often as a raw shortage, the failure means a limit was hit: strict overcommit accounting, a PID/thread ceiling, a cgroup pids.max/TasksMax, a per-user nproc cap, or vm.max_map_count.

Common Symptoms

The shell prints fork: retry: Resource temporarily unavailable, sometimes looping before it gives up.
New SSH sessions for one user are refused while root logs in fine.
A service fails to spawn worker threads and logs Cannot allocate memory or pthread_create failed.
systemctl restart of a busy unit fails with Resource temporarily unavailable.
The same host forks fine as root but fails as a service account — a strong hint it is a per-user/per-cgroup limit, not the box running dry.

Most Likely Causes of the ‘Cannot allocate memory’ Error

Ordered by how often they cause Cannot allocate memory in production:

A cgroup pids.max / systemd TasksMax hit by one unit. The classic fork: retry inside a single service while the rest of the host is idle.
A per-user nproc (RLIMIT_NPROC) cap exhausted. One UID owns all its allowed processes; other users and root are unaffected.
The system-wide PID/thread ceiling reached (kernel.pid_max / kernel.threads-max) — no process anywhere can fork.
Strict overcommit (vm.overcommit_memory=2) with a low overcommit_ratio rejecting commits while free -h still shows headroom.
vm.max_map_count or ulimit -v (RLIMIT_AS) exhausted. Common with the JVM and Elasticsearch; mmap() returns ENOMEM.
Genuine memory + swap exhaustion (MemAvailable/SwapFree near zero) — the true shortage case.

Quick Triage

# Real shortage, or a limit? Free memory vs commit accounting.
free -h
grep -E 'MemAvailable|SwapFree|Committed_AS|CommitLimit' /proc/meminfo

# Live thread/process count — compare against the ceilings below.
ps -eLf | wc -l

If MemAvailable is healthy and SwapFree is well above zero, this is a limit, not a shortage: skip to the ceilings. Thousands of threads against a low pid_max points straight at a count limit.
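The triage rule above can be sketched as a small classifier. This is a minimal sketch, not a definitive check: the function name classify_meminfo, its output labels, and the 5% / 95% thresholds are assumptions chosen for illustration, not kernel values.

```shell
#!/usr/bin/env bash
# classify_meminfo (hypothetical helper): label a meminfo snapshot.
#   $1 = meminfo-style file (default /proc/meminfo)
#   $2 = overcommit mode    (default: read from /proc/sys/vm/overcommit_memory)
# The 5% and 95% thresholds are illustrative assumptions.
classify_meminfo() {
  local file="${1:-/proc/meminfo}"
  local mode="${2:-$(cat /proc/sys/vm/overcommit_memory 2>/dev/null || echo 0)}"
  awk -v mode="$mode" '
    /^MemTotal:/     { total  = $2 }
    /^MemAvailable:/ { avail  = $2 }
    /^SwapFree:/     { swap   = $2 }
    /^Committed_AS:/ { commit = $2 }
    /^CommitLimit:/  { limit  = $2 }
    END {
      if (total > 0 && (avail + swap) * 100 < total * 5) {
        print "shortage"        # RAM and swap are both nearly gone
      } else if (mode == 2 && limit > 0 && commit * 100 >= limit * 95) {
        print "commit-limit"    # strict overcommit is the likely culprit
      } else {
        print "limit"           # memory looks fine: check the count ceilings
      }
    }' "$file"
}

# Run against the live host when /proc is available.
if [ -r /proc/meminfo ]; then classify_meminfo; fi
```

A result of limit means memory is not the problem and the PID, cgroup, and per-user ceilings are the next place to look.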

Diagnostic Commands

free -h                                  # available memory + swap at a glance
grep -E 'Committed_AS|CommitLimit' /proc/meminfo   # commit accounting vs cap

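In strict mode (vm.overcommit_memory=2), Committed_AS at or near CommitLimit means new allocations are being rejected by commit accounting, even if RAM looks free. In modes 0 and 1 CommitLimit is reported but not enforced.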

cat /proc/sys/vm/overcommit_memory       # 0 heuristic, 1 always, 2 strict
cat /proc/sys/vm/overcommit_ratio        # cap = swap + RAM*ratio/100 in mode 2

Mode 2 with a low ratio caps commitments well below physical RAM; malloc/fork returns ENOMEM even though memory looks free.
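To see how far below physical RAM the cap can land, the mode 2 formula can be worked through by hand. The helper name commit_limit_kb is hypothetical, and the sketch is a simplification: the kernel also excludes huge pages from the RAM term and uses vm.overcommit_kbytes instead of the ratio when that is set.

```shell
#!/usr/bin/env bash
# commit_limit_kb (hypothetical helper): cap = swap + RAM * ratio / 100.
# Simplified: ignores huge pages and vm.overcommit_kbytes.
commit_limit_kb() {
  local ram_kb=$1 swap_kb=$2 ratio=$3
  echo $(( swap_kb + ram_kb * ratio / 100 ))
}

# 16 GiB of RAM, no swap, ratio 50: only half of RAM can be committed.
commit_limit_kb 16777216 0 50    # prints 8388608
```

With these numbers a process asking for a 9 GiB mapping would get ENOMEM on a host that shows 16 GiB installed.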

cat /proc/sys/kernel/pid_max             # system-wide PID ceiling
cat /proc/sys/kernel/threads-max         # system-wide thread ceiling
ps -eLf | wc -l                          # live thread count

A live count near either ceiling means the whole box is out of PIDs.
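The comparison can be scripted so "near" becomes a number. A minimal sketch, assuming a procps-style ps; pct_used is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# pct_used (hypothetical helper): integer percentage of a ceiling in use.
pct_used() {
  local current=$1 ceiling=$2
  [ "$ceiling" -gt 0 ] || { echo 0; return; }
  echo $(( current * 100 / ceiling ))
}

# Live check; skipped quietly when ps or /proc is unavailable.
if command -v ps >/dev/null 2>&1 && [ -r /proc/sys/kernel/pid_max ]; then
  live=$(ps -eLf 2>/dev/null | tail -n +2 | wc -l)   # drop the header row
  echo "pid_max:     $(pct_used "$live" "$(cat /proc/sys/kernel/pid_max)")% used"
  echo "threads-max: $(pct_used "$live" "$(cat /proc/sys/kernel/threads-max)")% used"
fi
```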

systemctl show -p TasksMax -p TasksCurrent <unit>.service
cat /sys/fs/cgroup/system.slice/<unit>.service/pids.current
cat /sys/fs/cgroup/system.slice/<unit>.service/pids.max
cat /sys/fs/cgroup/system.slice/<unit>.service/pids.events   # 'max' counter increments on each rejection

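TasksCurrent == TasksMax (or pids.current == pids.max) means the ceiling is confined to that one unit, and a rising max counter in pids.events confirms that forks are being rejected there. The paths above are for cgroup v2, under /sys/fs/cgroup/system.slice/...; on cgroup v1 the pids controller is mounted under /sys/fs/cgroup/pids/ instead.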
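The same cgroup files can be read by a short helper that reports remaining headroom and the rejection count. This is a sketch under assumptions: pids_headroom and pids_rejections are hypothetical names, and the directory argument is whatever cgroup path applies on the host.

```shell
#!/usr/bin/env bash
# pids_headroom (hypothetical helper): tasks left before pids.max is hit.
# Prints "unlimited" when pids.max holds the literal string "max".
pids_headroom() {
  local dir=$1 current max
  current=$(cat "$dir/pids.current") || return 1
  max=$(cat "$dir/pids.max") || return 1
  if [ "$max" = "max" ]; then
    echo "unlimited"
  else
    echo $(( max - current ))
  fi
}

# pids_rejections (hypothetical helper): value of the "max" counter in pids.events.
pids_rejections() {
  awk '$1 == "max" { print $2 }' "$1/pids.events"
}

# Usage (<unit> is a placeholder, as in the commands above):
#   pids_headroom   /sys/fs/cgroup/system.slice/<unit>.service
#   pids_rejections /sys/fs/cgroup/system.slice/<unit>.service
```

Headroom of 0 together with a growing rejection count is the signature of a unit pinned at its TasksMax.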

ulimit -u                                # nproc / RLIMIT_NPROC for this user
ulimit -v                                # address space / RLIMIT_AS
cat /proc/sys/vm/max_map_count           # mmap region cap (JVM/Elasticsearch)
cat /proc/<pid>/limits                   # effective limits of a running service
ps -u <user> -L --no-headers | wc -l     # live threads owned by one UID

Compare the live count to the limit. On Ubuntu/Debian, per-user caps live in /etc/security/limits.d/; on RHEL/Rocky, check the same path plus /etc/security/limits.conf and any systemd drop-ins (LimitNPROC=).
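Pulling one value out of /proc/<pid>/limits makes that comparison scriptable. A minimal sketch: limit_soft is a hypothetical helper, and it assumes the usual column layout of the limits file (row name, soft limit, hard limit, units).

```shell
#!/usr/bin/env bash
# limit_soft (hypothetical helper): print the soft value of one row from a
# /proc/<pid>/limits-style file. Row names contain spaces, so match the
# name as a line prefix and split only what follows it.
limit_soft() {
  local file=$1 name=$2
  awk -v name="$name" '
    index($0, name) == 1 {
      rest = substr($0, length(name) + 1)
      split(rest, f, " ")
      print f[1]
    }' "$file"
}

# Live check against the current process, when /proc is available.
if [ -r /proc/self/limits ]; then
  limit_soft /proc/self/limits "Max processes"
fi
```

Pointing it at /proc/<pid>/limits for a running service shows the limit the service actually has, which can differ from what ulimit -u reports in an interactive shell.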

vmstat 1 5                               # r/b queues, si/so swap activity
smem -tk -c "pid user command rss pss"   # per-process PSS if smem is installed

Fix / Remediation

Apply the fix that matches the ceiling you identified — safe changes first.

Raise a unit’s task ceiling. Preferred when one service hits TasksMax:

sudo mkdir -p /etc/systemd/system/<unit>.service.d
printf '[Service]\nTasksMax=4096\n' | sudo tee /etc/systemd/system/<unit>.service.d/tasks.conf
sudo systemctl daemon-reload
sudo systemctl restart <unit>.service

Raise a per-user process limit in /etc/security/limits.d/90-nproc.conf (both Ubuntu/Debian and RHEL/Rocky):
```
appuser  soft  nproc  8192
appuser  hard  nproc  16384
```
For systemd services set LimitNPROC= in a drop-in instead. Re-login for a shell session to pick up the new limit.
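For the systemd case, the drop-in follows the same pattern as the TasksMax one. The sketch below only prints the file body; render_dropin, the file name nproc.conf, and the value 16384 are illustrative assumptions, and the install commands are left as comments because they need root and a real unit name.

```shell
#!/usr/bin/env bash
# render_dropin (hypothetical helper): print a one-setting [Service] drop-in.
render_dropin() {
  printf '[Service]\n%s=%s\n' "$1" "$2"
}

render_dropin LimitNPROC 16384

# To install (<unit> is a placeholder; nproc.conf is an arbitrary name):
#   render_dropin LimitNPROC 16384 | sudo tee /etc/systemd/system/<unit>.service.d/nproc.conf
#   sudo systemctl daemon-reload
#   sudo systemctl restart <unit>.service
```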

Raise vm.max_map_count for map-heavy apps (JVM, Elasticsearch):

echo 'vm.max_map_count = 262144' | sudo tee /etc/sysctl.d/99-maps.conf
sudo sysctl --system

Relax strict overcommit if vm.overcommit_memory=2 is rejecting valid allocations — either raise the ratio or move to heuristic mode 0:
```
echo 'vm.overcommit_memory = 0' | sudo tee /etc/sysctl.d/99-overcommit.conf
sudo sysctl --system
```

Raise the system-wide PID ceiling if the whole box is out of PIDs:

echo 'kernel.pid_max = 4194304' | sudo tee /etc/sysctl.d/99-pidmax.conf
sudo sysctl --system

Add swap or free real memory only when it is a genuine shortage.

Warning: Killing processes to reclaim PIDs/memory is destructive and can lose in-flight work. Prefer raising the specific ceiling. Only as a last resort, and after identifying the offender, run sudo kill <pid> (escalate to kill -9 if it ignores SIGTERM).

Validation

# The relevant counter should now sit below its ceiling.
systemctl show -p TasksMax -p TasksCurrent <unit>.service
cat /proc/sys/vm/max_map_count
ps -eLf | wc -l

# Confirm the syscall now succeeds where it failed.
sudo -u appuser bash -c 'true & wait'    # a fork that previously errored

Fork failures should stop and TasksCurrent/thread counts should settle well below their limits. A recurring climb back to the ceiling signals a thread or process leak, not a number to keep bumping.
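A recurring climb can be spotted by sampling the counter a few times and checking whether it only ever rises. A minimal sketch: is_climbing is a hypothetical helper, and the sampling interval and count in the usage comment are arbitrary.

```shell
#!/usr/bin/env bash
# is_climbing (hypothetical helper): succeeds only when given at least two
# samples and each one is strictly greater than the one before it.
is_climbing() {
  [ "$#" -ge 2 ] || return 1
  local prev=$1 cur
  shift
  for cur in "$@"; do
    [ "$cur" -gt "$prev" ] || return 1
    prev=$cur
  done
  return 0
}

# Usage (<unit> is a placeholder; five one-minute samples is an arbitrary choice):
#   samples=()
#   for _ in 1 2 3 4 5; do
#     samples+=("$(systemctl show -p TasksCurrent --value <unit>.service)")
#     sleep 60
#   done
#   is_climbing "${samples[@]}" && echo "TasksCurrent only ever rises: possible leak"
```

A steady workload should plateau or fluctuate; a count that rises on every sample after a restart is worth investigating before raising the ceiling again.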

Prevention

Right-size TasksMax=/LimitNPROC= on services that spawn workers to their real concurrency, and treat a unit hitting its ceiling as a paging event — it usually means a leak.
Standardize nproc, nofile, and vm.max_map_count across hosts via config management; drift is the classic “works on one box, fails on another.”
Prefer systemd MemoryHigh= (soft throttle) over a hard MemoryMax= for spiky workloads, and keep some swap or zram so a transient spike is absorbed rather than turning into failed allocations.
Monitor ps -eLf | wc -l against kernel.pid_max and alert on Committed_AS / CommitLimit and pids.events/memory.events under cgroup v2.
For map-heavy workloads (JVM, Elasticsearch), bake vm.max_map_count into provisioning rather than discovering it at an outage.

Related Errors

Out of memory: Killed process — the OOM-killer counterpart that reaps a running process.
fork: Resource temporarily unavailable — the sibling message from the same fork failure.
Too many open files — a related RLIMIT_NOFILE/resource ceiling.
TCP: out of memory — kernel socket-memory exhaustion.
Segmentation fault (core dumped) — when an allocation succeeds but the access is invalid.
Taming the Linux OOM killer — deep dive on scoring and tuning.
Linux Error Guides hub — the full index of Linux error walkthroughs.

Final Notes

Cannot allocate memory / fork: retry: Resource temporarily unavailable is the kernel rejecting a request before anything new exists — distinct from the OOM killer, which kills after the fact. Check free -h first to separate a true shortage from a limit, then work the limits in order (unit TasksMax, per-user nproc, system pid_max, overcommit, max_map_count). Raise the specific ceiling being hit, and treat a recurring failure as a leak to fix, not just a number to increase.
