Kubernetes Error Guide: '0/3 nodes are available: Insufficient cpu' Pod Pending / FailedScheduling
Fix the Kubernetes Insufficient cpu/memory scheduling error: diagnose pod requests, node allocatable vs allocated, daemonset overhead, and missing cluster-autoscaler headroom.
- #kubernetes
- #troubleshooting
- #errors
- #scheduling
Overview
A FailedScheduling event with Insufficient cpu (or Insufficient memory) means the kube-scheduler could not find a node whose remaining allocatable resources satisfy the pod’s resource requests. The scheduler filters every node against the pod’s requests; when no node passes, the pod stays in Pending and the scheduler records why.
You will see this in the pod’s events:
0/3 nodes are available: 3 Insufficient cpu. preemption: 0/3 nodes are available: 3 No preemption victims found for incoming pod.
Or, for memory:
0/3 nodes are available: 1 Insufficient memory, 2 Insufficient cpu.
It occurs the moment a pod (from a Deployment, StatefulSet, Job, or a bare pod) is created and the scheduler runs its filter phase. The check is against requests, not limits and not live usage — a node can be 5% busy yet still report Insufficient cpu because previously scheduled pods already reserved its allocatable capacity.
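The resource part of that filter can be pictured as simple arithmetic. The sketch below is a simplified model, not the scheduler's actual code: node_fits is a hypothetical helper, and all values are CPU millicores.

```shell
# Simplified model of the scheduler's resource fit check (values in millicores).
# node_fits is a hypothetical helper for illustration, not a kubectl feature.
node_fits() (
  allocatable=$1   # node allocatable CPU
  reserved=$2      # sum of requests of pods already on the node
  request=$3       # incoming pod's CPU request
  free=$((allocatable - reserved))
  if [ "$free" -ge "$request" ]; then
    echo "fits (free=${free}m)"
  else
    echo "Insufficient cpu (free=${free}m, need=${request}m)"
  fi
)

# A 4000m node with 3800m already requested cannot take a 500m pod,
# no matter how little CPU the running pods actually use.
node_fits 4000 3800 500   # prints: Insufficient cpu (free=200m, need=500m)
node_fits 4000 3000 500   # prints: fits (free=1000m)
```

Live usage never appears in the calculation, which is why an idle node can still reject a pod.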
Symptoms
- Pod stuck in Pending, never progressing to ContainerCreating.
- kubectl describe pod shows a FailedScheduling event with Insufficient cpu / Insufficient memory.
- A Deployment rollout stalls with replicas Pending; HPA scale-ups never land.
- New pods schedule only after an old one is deleted.
kubectl get pods -n shop
NAME READY STATUS RESTARTS AGE
checkout-7d9c4b8f6c-2xk9p 0/1 Pending 0 4m12s
checkout-7d9c4b8f6c-lm4wt 1/1 Running 0 22h
kubectl describe pod checkout-7d9c4b8f6c-2xk9p -n shop | sed -n '/Events/,$p'
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 3m (x5 over 4m12s) default-scheduler 0/3 nodes are available: 3 Insufficient cpu. preemption: 0/3 nodes are available: 3 No preemption victims found for incoming pod.
Common Root Causes
1. Pod requests exceed any single node’s allocatable
The scheduler places a pod on one node; if the pod requests more CPU/memory than any node’s allocatable, it can never be scheduled regardless of total cluster capacity.
kubectl get pod checkout-7d9c4b8f6c-2xk9p -n shop -o jsonpath='{.spec.containers[*].resources.requests}'; echo
{"cpu":"6","memory":"12Gi"}
If your nodes are 4-vCPU machines, a cpu: 6 request will never fit — every node fails the filter.
2. The cluster is genuinely full (no headroom)
Every node’s allocatable is already reserved by running pods’ requests, leaving no room even for a small new pod.
kubectl describe node ip-10-0-1-23 | sed -n '/Allocated resources/,/Events/p'
Allocated resources:
(Total limits may be over 100 percent, i.e., overcommitted.)
Resource Requests Limits
-------- -------- ------
cpu 3800m (95%) 6 (150%)
memory 14820Mi (92%) 20Gi (124%)
CPU requests at 95% of allocatable leaves ~200m — not enough for a 500m request.
3. Oversized requests vs. actual usage
Pods request far more than they use, so the scheduler reserves capacity that sits idle. The cluster looks busy on requests while real utilization is low.
kubectl top nodes
NAME CPU(cores) CPU% MEMORY(bytes) MEMORY%
ip-10-0-1-23 620m 15% 4100Mi 25%
ip-10-0-1-44 540m 13% 3900Mi 24%
ip-10-0-1-77 710m 17% 4500Mi 28%
Live CPU is ~15% but requests are saturated — right-size the requests to reclaim schedulable capacity.
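To put a number on the gap, compare what a node has reserved with what it uses. This is a rough sketch: stranded_cpu is a hypothetical helper, and the 3800m requested figure is an assumption borrowed from the "genuinely full" example, since kubectl top does not report requests.

```shell
# Hypothetical helper: how much reserved CPU is sitting idle (millicores).
stranded_cpu() (
  requested=$1   # sum of pod CPU requests on the node (from `kubectl describe node`)
  used=$2        # live CPU usage (from `kubectl top nodes`)
  idle=$((requested - used))
  echo "${idle}m of ${requested}m requested is idle ($((idle * 100 / requested))%)"
)

# Assumed: 3800m requested on the node, 620m actually in use.
stranded_cpu 3800 620   # prints: 3180m of 3800m requested is idle (83%)
```

A high idle percentage across many nodes suggests right-sizing will free more capacity than adding nodes.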
4. Requests fit only the largest node type
In a heterogeneous cluster, a pod that requires the biggest instance type fails when only smaller nodes have room.
kubectl get nodes -o custom-columns=NODE:.metadata.name,CPU:.status.allocatable.cpu,MEM:.status.allocatable.memory,TYPE:.metadata.labels.'node\.kubernetes\.io/instance-type'
NODE CPU MEM TYPE
ip-10-0-1-23 1930m 3800Mi t3.medium
ip-10-0-1-44 1930m 3800Mi t3.medium
ip-10-0-1-77 3920m 15800Mi m5.xlarge
A pod requesting memory: 8Gi only fits the single m5.xlarge; if that node is full, scheduling fails.
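The same table can be filtered mechanically to list the nodes that are large enough. The sketch below assumes the four-column NODE/CPU/MEM/TYPE layout shown above; nodes_fitting_memory is a hypothetical helper that handles only Ki, Mi, and Gi suffixes and skips rows in any other unit.

```shell
# Hypothetical filter: read the NODE/CPU/MEM/TYPE table on stdin and print the
# nodes whose allocatable memory is at least the requested amount (in Mi).
nodes_fitting_memory() {
  awk -v need_mi="$1" '
    NR == 1 { next }                        # skip the header row
    {
      mem = $3
      if (mem ~ /Gi$/)      mi = mem * 1024 # awk uses the leading number of "15Gi"
      else if (mem ~ /Mi$/) mi = mem + 0
      else if (mem ~ /Ki$/) mi = mem / 1024
      else                  next            # other units are skipped in this sketch
      if (mi >= need_mi) print $1, $4
    }'
}

# 8Gi = 8192Mi; only the m5.xlarge row is large enough.
nodes_fitting_memory 8192 <<'EOF'
NODE CPU MEM TYPE
ip-10-0-1-23 1930m 3800Mi t3.medium
ip-10-0-1-44 1930m 3800Mi t3.medium
ip-10-0-1-77 3920m 15800Mi m5.xlarge
EOF
```

This only answers whether a node is big enough. Whether it still has unreserved room is a separate check against its Allocated resources.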
5. DaemonSets and system pods consume allocatable
Allocatable is already reduced by kube-reserved/system-reserved, and DaemonSet pods (CNI, kube-proxy, log/metrics agents) take a fixed slice on every node before workloads land.
kubectl get pods --all-namespaces --field-selector spec.nodeName=ip-10-0-1-23 -o wide | grep -E 'daemonset|aws-node|kube-proxy|fluent|node-exporter'
kube-system aws-node-h7x2q 2/2 Running 0 22h 10.0.1.23 ip-10-0-1-23
kube-system kube-proxy-9c4dp 1/1 Running 0 22h 10.0.1.23 ip-10-0-1-23
monitoring node-exporter-k2lvr 1/1 Running 0 22h 10.0.1.23 ip-10-0-1-23
logging fluent-bit-rr8qd 1/1 Running 0 22h 10.0.1.23 ip-10-0-1-23
These pods reserve CPU/memory on every node, shrinking what is left for application pods.
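A quick calculation shows how much of a node is left once both layers are subtracted. Every number here is an assumption for illustration (2000m capacity, 70m reserved, four DaemonSet pods requesting 25m, 100m, 50m, and 100m), and usable_cpu is a hypothetical helper.

```shell
# Hypothetical helper: CPU left for application pods on one node (millicores).
# usable = capacity - (kube-reserved + system-reserved) - DaemonSet requests
usable_cpu() (
  capacity=$1; reserved=$2; shift 2
  ds_total=0
  for r in "$@"; do ds_total=$((ds_total + r)); done   # one argument per DaemonSet pod
  echo $((capacity - reserved - ds_total))
)

# Assumed values: allocatable is 2000m - 70m = 1930m, and the four DaemonSet
# pods take another 275m before any workload lands.
usable_cpu 2000 70 25 100 50 100   # prints: 1655
```

Read the real DaemonSet requests from each pod's spec rather than relying on these placeholder values.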
6. No cluster-autoscaler, or autoscaler at max
If there is no autoscaler, the cluster never grows; if there is one but the node group is at maxSize (or a scale-up failed), pending pods are never given a new node.
kubectl -n kube-system logs deploy/cluster-autoscaler --tail=20 | grep -iE 'max node|scale.?up|no node group'
I0623 14:05:11 scale_up.go:300] Pod shop/checkout-7d9c4b8f6c-2xk9p can't be scheduled on ng-default, predicate failed: max size reached
I0623 14:05:11 static_autoscaler.go:520] Failed to scale up: node group ng-default is at maximum size (6)
The node group is capped, so the pending pod waits indefinitely.
Diagnostic Workflow
Step 1: Read the exact FailedScheduling message
kubectl describe pod <POD> -n <NS> | sed -n '/Events/,$p'
kubectl get events -n <NS> --field-selector reason=FailedScheduling --sort-by=.lastTimestamp | tail -5
Note whether it says Insufficient cpu, Insufficient memory, or both, and how many nodes failed.
Step 2: Read the pod’s requests
kubectl get pod <POD> -n <NS> -o jsonpath='{range .spec.containers[*]}{.name}{": "}{.resources.requests}{"\n"}{end}'
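Work out the pod’s effective request: take the sum across all regular containers or the largest single initContainer request, whichever is higher, then add any pod overhead. The command above lists only regular containers, so also check .spec.initContainers if the pod has them.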
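For scheduling, regular containers run together, so their requests add up, while init containers run one at a time, so only the largest one counts. The pod's effective request is the higher of those two figures plus any pod overhead. The sketch below shows that rule for CPU in millicores. effective_request is a hypothetical helper, and it deliberately ignores restartable init containers (sidecars), which follow a more involved rule.

```shell
# Hypothetical helper: effective CPU request of a pod in millicores.
# Usage: effective_request "<app container requests>" "<init container requests>" [overhead]
effective_request() (
  app_sum=0
  for r in $1; do app_sum=$((app_sum + r)); done    # app containers run concurrently
  init_max=0
  for r in $2; do                                   # init containers run sequentially
    if [ "$r" -gt "$init_max" ]; then init_max=$r; fi
  done
  if [ "$init_max" -gt "$app_sum" ]; then app_sum=$init_max; fi
  echo $((app_sum + ${3:-0}))
)

effective_request "500 100" "800" 0    # prints: 800 (the init container dominates)
effective_request "500 100" "200" 50   # prints: 650 (600 from app containers + 50 overhead)
```

A heavy init container can therefore keep a pod Pending even when its regular containers are small.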
Step 3: Compare against each node’s allocatable
kubectl get nodes -o custom-columns=NODE:.metadata.name,CPU_ALLOC:.status.allocatable.cpu,MEM_ALLOC:.status.allocatable.memory
If the pod’s request exceeds the largest allocatable, this is a sizing problem, not a capacity problem.
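Requests and allocatable values come back in mixed units (6, 500m, 12Gi, 3955652Ki), so normalize them before comparing. The helpers below are a minimal sketch with hypothetical names. mem_to_mi understands only Ki, Mi, and Gi suffixes and treats anything else as plain bytes.

```shell
# Hypothetical helpers to normalize Kubernetes quantities for comparison.

# CPU: "500m" -> 500, "6" -> 6000, "0.5" -> 500 (millicores)
cpu_to_millicores() {
  case "$1" in
    *m) echo "${1%m}" ;;
    *)  awk -v c="$1" 'BEGIN { printf "%d\n", c * 1000 + 0.5 }' ;;
  esac
}

# Memory: "12Gi" -> 12288, "512Mi" -> 512, "3955652Ki" -> 3862 (Mi, rounded down)
mem_to_mi() {
  awk -v q="$1" 'BEGIN {
    n = q + 0                              # leading numeric part of the quantity
    if (q ~ /Gi$/)       n = n * 1024
    else if (q ~ /Ki$/)  n = n / 1024
    else if (q !~ /Mi$/) n = n / 1048576   # no known suffix: assume plain bytes
    printf "%d\n", n
  }'
}

# Compare a cpu: 6 request with a node whose allocatable CPU is 3920m.
if [ "$(cpu_to_millicores 6)" -gt "$(cpu_to_millicores 3920m)" ]; then
  echo "sizing problem: request exceeds this node's allocatable"
fi
```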
Step 4: Check how much is already reserved per node
for n in $(kubectl get nodes -o name); do
  echo "== $n =="
  kubectl describe "$n" | sed -n '/Allocated resources/,/Events/p'
done
Look at the Requests percentages — anything near 100% has no room.
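On a larger cluster, scanning that output by eye gets tedious. The sketch below filters it down to the nodes that are nearly full. It assumes the "== node ==" header echoed by the loop and the cpu/memory row layout shown in the earlier outputs. near_full is a hypothetical helper with a default threshold of 90%.

```shell
# Hypothetical filter: reads the loop output on stdin and prints every
# node/resource pair whose Requests percentage is at or above a threshold.
near_full() {
  awk -v limit="${1:-90}" '
    $1 == "==" { node = $2; next }          # "== node/<name> ==" header from the loop
    $1 == "cpu" || $1 == "memory" {
      pct = $3                              # e.g. "(95%)"
      gsub(/[()%]/, "", pct)                # keep only the number
      if (pct + 0 >= limit) print node, $1, pct "%"
    }'
}

# Sample input copied from the earlier output; in practice, pipe the loop into
# the function:  for n in ...; do ...; done | near_full 90
near_full 90 <<'EOF'
== node/ip-10-0-1-23 ==
Allocated resources:
  Resource  Requests       Limits
  cpu       3800m (95%)    6 (150%)
  memory    14820Mi (92%)  20Gi (124%)
EOF
```

For the sample input this prints one line each for cpu (95%) and memory (92%) on node/ip-10-0-1-23.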
Step 5: Check the autoscaler (if any)
kubectl -n kube-system logs deploy/cluster-autoscaler --tail=40 | grep -iE 'scale.?up|max size|predicate'
kubectl get nodes -L node.kubernetes.io/instance-type
Confirm whether a new node could be (or was supposed to be) added.
Example Root Cause Analysis
A checkout Deployment in namespace shop is scaled from 2 to 4 replicas. Two new pods stay Pending.
kubectl describe pod shows:
Warning FailedScheduling default-scheduler 0/3 nodes are available: 3 Insufficient cpu.
The pod requests are reasonable:
kubectl get pod checkout-7d9c4b8f6c-2xk9p -n shop -o jsonpath='{.spec.containers[*].resources.requests}'; echo
{"cpu":"500m","memory":"512Mi"}
500m fits on any node in principle, so this is a capacity problem. Checking allocation on each node shows all three near saturation:
kubectl describe node ip-10-0-1-23 | sed -n '/Allocated resources/,/Events/p'
Allocated resources:
Resource Requests Limits
cpu 1850m (95%) 3 (155%)
memory 2900Mi (76%) 4Gi (107%)
Allocatable CPU is 1930m, of which 1850m is reserved, leaving only 80m — far short of the 500m request. All three t3.medium nodes look the same. The cluster-autoscaler log confirms it cannot grow:
static_autoscaler.go:520] Failed to scale up: node group ng-default is at maximum size (3)
Fix: raise the node group maxSize so the autoscaler can add a node, and right-size over-requested neighbors so existing nodes regain headroom. After the autoscaler adds a fourth node, both pending pods schedule and reach Running.
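The arithmetic behind "one more node is enough" can be sketched as follows. nodes_needed is a hypothetical helper, and the 1655m usable figure is an assumption (1930m allocatable minus an assumed 275m of DaemonSet requests), not a value taken from this cluster.

```shell
# Hypothetical helper: how many new nodes are needed for the pending pods.
# All values are CPU millicores; "usable" is what one fresh node offers workloads.
nodes_needed() (
  pending=$1; request=$2; usable=$3
  per_node=$((usable / request))            # pods that fit on one new node
  if [ "$per_node" -eq 0 ]; then
    echo "request does not fit on this node type" >&2
    exit 1
  fi
  echo $(( (pending + per_node - 1) / per_node ))   # round up
)

# Two pending pods at 500m each, assumed 1655m usable per new t3.medium.
nodes_needed 2 500 1655   # prints: 1
```

With these numbers, raising maxSize from 3 to 4 is the minimum that unblocks the rollout. A higher ceiling leaves headroom for the next scale-up.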
Prevention Best Practices
- Set realistic requests based on observed usage (kubectl top / VPA recommendations), not guesses. Oversized requests strand capacity even when nodes are idle.
- Keep requests well under the smallest node’s allocatable, and account for kube-reserved/system-reserved plus DaemonSet overhead when sizing nodes.
- Run a cluster-autoscaler (or Karpenter) with a maxSize that has real headroom, and alert when a node group hits its ceiling so scale-ups never silently stall.
- Use PriorityClasses so critical workloads can preempt low-priority filler when the cluster fills up.
- Track the per-node Allocated resources requests percentage and alert before it crosses ~85%, so you add capacity before pods go Pending. For deeper guidance, see the Kubernetes & Helm guides.
- Treat large single-pod requests carefully: a pod must fit on one node, so a giant request needs a matching instance type in the pool.
Quick Command Reference
# See pending pods and the scheduling failure reason
kubectl get pods -n <NS> --field-selector status.phase=Pending
kubectl describe pod <POD> -n <NS> | sed -n '/Events/,$p'
kubectl get events -n <NS> --field-selector reason=FailedScheduling --sort-by=.lastTimestamp | tail -5
# Pod requests
kubectl get pod <POD> -n <NS> -o jsonpath='{.spec.containers[*].resources.requests}'; echo
# Node allocatable
kubectl get nodes -o custom-columns=NODE:.metadata.name,CPU:.status.allocatable.cpu,MEM:.status.allocatable.memory
# What is already reserved per node
kubectl describe node <NODE> | sed -n '/Allocated resources/,/Events/p'
# Live usage (requires metrics-server)
kubectl top nodes
kubectl top pods -n <NS>
# Cluster-autoscaler status
kubectl -n kube-system logs deploy/cluster-autoscaler --tail=40 | grep -iE 'scale.?up|max size'
Conclusion
A pod stuck Pending with Insufficient cpu/Insufficient memory means no node had enough unreserved allocatable to satisfy the pod’s requests. The usual root causes:
- The pod’s requests exceed any single node’s allocatable (sizing, not capacity).
- The cluster is genuinely full — every node’s requests are near 100%.
- Requests are oversized relative to real usage, stranding schedulable capacity.
- The request only fits the largest node type, and that node is full.
- DaemonSets and reserved system resources have already eaten the allocatable.
- There is no cluster-autoscaler, or it is pinned at maxSize / a scale-up failed.
Start by comparing the pod’s requests to each node’s allocatable and reserved totals; the fix is almost always right-sizing requests or adding (or unblocking) capacity.