Linux Error: fork: Resource temporarily unavailable — Cause, Fix, and Troubleshooting Guide
How to fix fork: Resource temporarily unavailable (EAGAIN) on Linux by diagnosing ulimit nproc, cgroup pids.max, kernel.pid_max and kernel.threads-max limits.
- #linux
- #troubleshooting
- #errors
- #limits
Summary
fork: Resource temporarily unavailable (errno EAGAIN) is what the kernel returns when it refuses to create a new process or thread. It is almost never about RAM — that would be ENOMEM. It means you hit a count limit on tasks. Bash retries a few times, so you also see fork: retry: Resource temporarily unavailable, before commands stop working entirely. On Linux, threads are tasks too, so a single JVM or Go service spawning thousands of threads counts exactly like thousands of processes against the same ceiling.
Common Symptoms
- Interactive commands fail: bash: fork: retry: Resource temporarily unavailable, then bash: fork: Resource temporarily unavailable.
- A JVM logs java.lang.OutOfMemoryError: unable to create new native thread (despite plenty of free RAM).
- sshd refuses logins: error: fork: Resource temporarily unavailable in /var/log/auth.log.
- systemd journal shows Failed to fork: Resource temporarily unavailable and status=219/CGROUP.
- The failures hit all users at once (global limit) or just one service (per-user/cgroup limit).
Most Likely Causes of the ‘fork: Resource temporarily unavailable’ Error
There are four independent ceilings, and any one triggers fork: Resource temporarily unavailable. Most common in production first:
- RLIMIT_NPROC (ulimit -u) too low: a per-user cap on tasks (processes + threads). Shared hosts often set nproc to a few hundred in /etc/security/limits.conf.
- A thread-pool or process leak: unbounded executors, leaked connections, or a thread-per-request server under load spawning tasks faster than they exit.
- cgroup pids.max / systemd TasksMax: a restrictive per-cgroup cap on a container or unit.
- kernel.pid_max: the largest PID the kernel will allocate, a global cap on concurrent tasks.
- kernel.threads-max: a global cap on total threads system-wide.
- Zombie or stuck processes never reaped, slowly filling the task table.
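The last cause, unreaped zombies, is easy to check directly. The sketch below counts processes in state Z by reading the State: line of each status file under procfs. The function name count_zombies and its optional procfs-root argument are illustrative, not part of any standard tool.

```shell
#!/bin/sh
# Count processes in state Z (zombie) by reading the "State:" line of each
# <procroot>/<pid>/status file. procroot defaults to /proc; it is a parameter
# only so the function can be pointed at a fixture directory.
count_zombies() {
    root=${1:-/proc}
    n=0
    for f in "$root"/[0-9]*/status; do
        # Processes can exit while we walk the tree; skip what we cannot read.
        [ -r "$f" ] || continue
        state=$(awk '/^State:/ { print $2; exit }' "$f" 2>/dev/null)
        [ "$state" = "Z" ] && n=$((n + 1))
    done
    echo "$n"
}

# Usage:
#   count_zombies
# To see which parents are failing to reap:
#   ps -eo pid,ppid,stat,comm | awk '$3 ~ /^Z/'
```

A handful of short-lived zombies is normal. A count that only ever rises suggests a parent that never calls wait().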
Quick Triage
# Per-user soft/hard task limit (ulimit reports the current shell, so run this as the affected user)
ulimit -u; ulimit -Hu
# Live, authoritative limit the kernel enforces for a running service
cat /proc/$(pgrep -n java)/limits | grep -E 'processes|Limit'
# Per-user thread count, the number that matters for RLIMIT_NPROC (user= suppresses the header line)
ps -eLo user= | sort | uniq -c | sort -rn | head
If a user’s live thread count is at its ulimit -u, that is RLIMIT_NPROC. If it hits all users at once, suspect a global ceiling.
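The comparison above can be turned into a single percentage. This is a minimal sketch, assuming the usual /proc/<pid>/limits layout where the Max processes line carries the soft limit in the third column. The helper names soft_nproc and usage_pct are made up for illustration.

```shell
#!/bin/sh
# Soft "Max processes" (RLIMIT_NPROC) value from a /proc/<pid>/limits-style file.
# Columns on that line: "Max processes  <soft>  <hard>  processes".
soft_nproc() {
    awk '/^Max processes/ { print $3 }' "$1"
}

# Integer percentage of a limit in use.
# Prints "n/a" when the limit is unlimited, empty, or zero.
usage_pct() {
    current=$1
    limit=$2
    case $limit in
        unlimited|max|''|0) echo "n/a" ;;
        *) echo $(( current * 100 / limit )) ;;
    esac
}

# Usage (java is just the example process from the triage commands):
#   pid=$(pgrep -n java)
#   user=$(ps -o ruser= -p "$pid")
#   threads=$(ps -L -U "$user" -o lwp= | wc -l)
#   usage_pct "$threads" "$(soft_nproc "/proc/$pid/limits")"
```

A result near 100 points at RLIMIT_NPROC for that user. A low number means the limit being hit is somewhere else.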
Diagnostic Commands
# Total tasks (threads included) on the host
ps -eLf | wc -l
# Threads per process — find the greedy one
ps -eo pid,nlwp,user,comm --sort=-nlwp | head
nlwp is the thread count per PID; a single java PID holding thousands of threads is the usual culprit.
# Global ceilings
cat /proc/sys/kernel/pid_max
cat /proc/sys/kernel/threads-max
Compare the ps -eLf | wc -l total against threads-max and pid_max to check for global exhaustion.
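On a host that is already failing to fork, spawning ps may itself fail. As a rough alternative, the task count can be taken from procfs using only shell built-ins. The function names below are illustrative, and the count covers only the tasks visible in the current PID namespace, so inside a container it will understate host-wide usage.

```shell
#!/bin/sh
# Count tasks (every thread of every process) by walking a procfs tree.
# $1 is the procfs root; it defaults to /proc and is a parameter only so the
# function can be pointed at a fixture directory.
count_tasks() {
    root=${1:-/proc}
    n=0
    for t in "$root"/[0-9]*/task/[0-9]*; do
        # An unmatched glob stays literal and fails this test, leaving n at 0.
        [ -d "$t" ] && n=$((n + 1))
    done
    echo "$n"
}

# Print the live count next to both global ceilings.
global_headroom() {
    echo "tasks=$(count_tasks /proc)" \
         "threads-max=$(cat /proc/sys/kernel/threads-max)" \
         "pid_max=$(cat /proc/sys/kernel/pid_max)"
}

# Usage:
#   global_headroom
```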
# cgroup / systemd task budget (the TasksMax = pids.max mapping)
systemctl show <SERVICE> -p TasksMax -p TasksCurrent
# cgroup v2 (modern Ubuntu/Debian and RHEL/Rocky):
cat /sys/fs/cgroup/system.slice/<SERVICE>/pids.max
cat /sys/fs/cgroup/system.slice/<SERVICE>/pids.current
When pids.current equals pids.max, the cgroup — not the user limit — is the bottleneck. Match the symptom to the ceiling: status=219/CGROUP in the journal points at TasksMax; EAGAIN with ulimit -u exceeded points at RLIMIT_NPROC; a threads-max/pid_max hit shows across all users.
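The pids.current versus pids.max comparison can be sketched as a small helper that takes a cgroup v2 directory. The name pids_usage is hypothetical. The only format assumption is that pids.max holds either an integer or the literal string max.

```shell
#!/bin/sh
# Print "current/max (pct%)" for a cgroup v2 directory that contains
# pids.current and pids.max.
pids_usage() {
    dir=$1
    cur=$(cat "$dir/pids.current")
    max=$(cat "$dir/pids.max")
    if [ "$max" = "max" ]; then
        # No cap on this cgroup; an ancestor cgroup may still impose one.
        echo "$cur/max (unlimited)"
    elif [ "$max" -eq 0 ]; then
        echo "$cur/0 (no new tasks allowed)"
    else
        echo "$cur/$max ($(( cur * 100 / max ))%)"
    fi
}

# Usage: let systemd report the unit's cgroup path instead of guessing it.
#   pids_usage "/sys/fs/cgroup$(systemctl show -p ControlGroup --value <SERVICE>)"
```

Resolving the path through the ControlGroup property avoids guessing which slice the unit lives in.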
Fix / Remediation
- Identify which ceiling you hit by comparing live counts to each limit (steps above). Apply only the fix that matches.
- Stop the bleeding. If one process is leaking threads, restart it through its manager so the cgroup task count resets:
    watch -n 2 'ps -o pid,nlwp,comm -p <PID>'   # confirm growth first (read-only)
    sudo systemctl restart <SERVICE>
- Raise the per-user limit (RLIMIT_NPROC) via /etc/security/limits.d/:
    appuser soft nproc 16384
    appuser hard nproc 32768
- For a systemd service, set the task cap and per-process limit in a drop-in (limits.conf does not apply; services skip PAM login):
    # /etc/systemd/system/<SERVICE>.service.d/limits.conf
    [Service]
    TasksMax=8192
    LimitNPROC=16384
  Then sudo systemctl daemon-reload && sudo systemctl restart <SERVICE>.
- Only when truly global, raise the kernel table and persist it in /etc/sysctl.d/99-limits.conf:
  Warning: Raising kernel.threads-max/kernel.pid_max or setting TasksMax=infinity removes a guardrail. A leaking service can then exhaust the global table and take down the whole host instead of just itself. Prefer a sized limit plus fixing the leak.
    sudo sysctl -w kernel.threads-max=1000000
    sudo sysctl -w kernel.pid_max=4194304
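Values set with sysctl -w are lost at reboot, so the persistence half of the last step still needs a file under /etc/sysctl.d/. A minimal sketch of generating that file follows. The function name render_sysctl_limits is made up for illustration, and the values in the usage lines are the same ones shown above.

```shell
#!/bin/sh
# Emit sysctl.d-style settings for the two global ceilings.
# Arguments: <threads-max> <pid_max>
render_sysctl_limits() {
    printf 'kernel.threads-max = %s\n' "$1"
    printf 'kernel.pid_max = %s\n' "$2"
}

# Usage (needs root):
#   render_sysctl_limits 1000000 4194304 | sudo tee /etc/sysctl.d/99-limits.conf
#   sudo sysctl --system    # reload all sysctl configuration files
```

Printing to stdout first makes it easy to review the content before anything is written.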
Validation
# New per-process limit is live
cat /proc/$(pgrep -n <SERVICE>)/limits | grep -i 'processes'
# systemd task budget updated
systemctl show <SERVICE> -p TasksMax -p TasksCurrent
# Task count stays bounded under load (leak contained)
watch -n 5 'ps -o pid,nlwp,comm -p $(pgrep -n <SERVICE>)'
A limit reflecting the new value and a task count that plateaus well below it confirm the fix.
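Watching for a plateau by eye can be replaced with a crude check over a few samples. The sketch below is a heuristic, not a definitive leak detector. Both function names are illustrative, and the samples should be taken after the service has warmed up, because a pool that is still filling will also look like growth.

```shell
#!/bin/sh
# Thread count of one process, read from /proc/<pid>/status.
thread_count() {
    awk '/^Threads:/ { print $2 }' "/proc/$1/status"
}

# Classify a series of samples: "growing" if the series never decreases
# and ends higher than it started, otherwise "bounded".
classify_trend() {
    first=$1
    prev=$1
    verdict=growing
    for s in "$@"; do
        [ "$s" -lt "$prev" ] && verdict=bounded
        prev=$s
    done
    [ "$prev" -le "$first" ] && verdict=bounded
    echo "$verdict"
}

# Usage: five samples, five seconds apart.
#   pid=$(pgrep -n <SERVICE>)
#   classify_trend $(for i in 1 2 3 4 5; do thread_count "$pid"; sleep 5; done)
```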
Prevention
- Bound your thread pools. Most “unable to create new native thread” incidents are application bugs. Use fixed-size executors and connection pools; raising kernel limits only delays the crash.
- Set TasksMax deliberately per systemd unit, sized for the real workload, to contain a runaway service to its own cgroup.
- Monitor the pids.current / pids.max ratio and per-user thread counts, and alert before saturation.
- Watch DefaultTasksMax (often 15% of kernel.pid_max), which governs units without an explicit value: systemctl show -p DefaultTasksMax.
- Reap zombies: ensure parents wait() on children; in containers run a proper init (PID 1) that reaps orphans.
- Manage limits via config management (Ansible/Puppet) so per-service LimitNPROC/TasksMax and sysctl values are consistent and survive reboots.
Related Errors
- Linux Error: Too many open files
- Linux Error: Cannot allocate memory
- Linux Error: Out of memory: Killed process
- Linux Error: start request repeated too quickly
- Linux Error: bind: Address already in use
Final Notes
fork: Resource temporarily unavailable is EAGAIN — a count limit, not a memory limit. Match the symptom to one of the four ceilings (RLIMIT_NPROC, cgroup pids.max/TasksMax, kernel.pid_max, kernel.threads-max) before changing anything, then raise only that limit. If a single process keeps climbing under steady load, the real fix is bounding its threads in the code, not lifting the guardrail.