Kubernetes Error Guide: 'node(s) had untolerated taint' Pod Won't Schedule
Fix the Kubernetes FailedScheduling error 'node(s) had untolerated taint' and 'didn't match node affinity/selector': taints, tolerations, nodeSelector, and affinity.
Overview
A pod stays Pending when the kube-scheduler cannot find a node that satisfies all of the pod’s placement constraints. Two of the most common reasons are node taints the pod does not tolerate, and node affinity / nodeSelector expressions that match no node. The scheduler evaluates every node, rejects each one with a reason, and reports the aggregate as a FailedScheduling event.
You will see this on the pod’s events:
Warning FailedScheduling 18s default-scheduler 0/4 nodes are available: 1 node(s) had untolerated taint {node-role.kubernetes.io/control-plane: }, 3 node(s) didn't match Pod's node affinity/selector. preemption: 0/4 nodes are available: 4 Preemption is not helpful for scheduling.
It occurs whenever a pod is being placed: at first deploy, when a Deployment scales up, when a node is added or cordoned, or when a node becomes NotReady (which adds taints automatically). The message lists each rejection reason and the count of nodes that hit it, so the text itself tells you which constraint is blocking the pod.
Symptoms
- Pod is stuck in Pending with 0/N nodes are available.
- Events show node(s) had untolerated taint and/or didn't match Pod's node affinity/selector.
- New replicas from a scaled Deployment never start.
- The pod has no nodeName assigned.
kubectl get pod web-7c9d8f5b6-2xk4q -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
web-7c9d8f5b6-2xk4q 0/1 Pending 0 3m <none> <none> <none> <none>
kubectl describe pod web-7c9d8f5b6-2xk4q | grep -A5 Events
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 2m default-scheduler 0/4 nodes are available: 1 node(s) had untolerated taint {node-role.kubernetes.io/control-plane: }, 3 node(s) didn't match Pod's node affinity/selector.
Common Root Causes
1. Node is tainted NoSchedule and the pod has no matching toleration
A node carrying a NoSchedule taint rejects any pod that does not explicitly tolerate it. This is the single most common cause.
kubectl describe node gpu-node-1 | grep Taints
Taints: dedicated=gpu:NoSchedule
The pod needs a toleration for dedicated=gpu:NoSchedule; without it, the scheduler skips this node and reports untolerated taint {dedicated: gpu}.
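A minimal sketch of a pod that tolerates the taint shown above. The pod name and image are placeholders, not part of the original setup; only the toleration fields have to line up with the node's taint.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: gpu-job                                    # hypothetical name
spec:
  containers:
    - name: main
      image: registry.example.com/gpu-job:latest   # placeholder image
  tolerations:
    - key: "dedicated"
      operator: "Equal"      # key and value must both match the taint
      value: "gpu"
      effect: "NoSchedule"   # must match the taint's effect
```

A toleration only permits the pod to land on the tainted node; it does not steer the pod there. Pair it with a nodeSelector or node affinity if the pod must run on that node.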
2. Control-plane / dedicated taint blocks general workloads
Control-plane nodes are tainted by default so user workloads do not land on them. If a pod is forced toward them (or you have only control-plane nodes in a single-node cluster), it stays Pending.
kubectl describe node cp-1 | grep Taints
Taints: node-role.kubernetes.io/control-plane:NoSchedule
To run a workload there intentionally, add a matching toleration; otherwise schedule it onto worker nodes.
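As a sketch, a toleration for the control-plane taint could look like the fragment below. Because the taint has a key and an effect but no value, the Exists operator is used and value is omitted.

```yaml
# Pod spec fragment: goes under spec: for a Pod,
# or under spec.template.spec: for a Deployment
tolerations:
  - key: "node-role.kubernetes.io/control-plane"
    operator: "Exists"       # match on the key alone; the taint carries no value
    effect: "NoSchedule"
```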
3. nodeSelector label is not present on any node
A nodeSelector is a hard requirement. If no node carries the label/value, every node is rejected with didn't match Pod's node affinity/selector.
kubectl get nodes --show-labels | grep disktype
(no output)
The pod requests nodeSelector: {disktype: ssd} but no node has disktype=ssd. Either label a node or fix the selector.
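For reference, a nodeSelector like the one described would sit in the pod spec as sketched below. The second entry is added here only to illustrate that all entries are combined with AND: one node has to carry every listed label.

```yaml
# Pod spec fragment
nodeSelector:
  disktype: ssd              # no node has this label in the example, so nothing matches
  kubernetes.io/os: linux    # well-known label set by the kubelet; shown for illustration
```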
4. Required nodeAffinity expression matches no node
A requiredDuringSchedulingIgnoredDuringExecution affinity behaves like a hard filter. A typo in a key, a value the cluster never sets, or an operator like In against a missing label rejects all nodes.
kubectl get nodes -L topology.kubernetes.io/zone
NAME STATUS ROLES AGE VERSION ZONE
worker-1 Ready <none> 40d v1.29.4 us-east-1a
worker-2 Ready <none> 40d v1.29.4 us-east-1a
worker-3 Ready <none> 40d v1.29.4 us-east-1b
If the affinity requires topology.kubernetes.io/zone In [us-east-1c], no node qualifies and the pod stays Pending.
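A sketch of a required affinity that does match the cluster above, using only the zones that appear in the node listing:

```yaml
# Pod spec fragment
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:               # terms are ORed with each other
        - matchExpressions:            # expressions inside one term are ANDed
            - key: topology.kubernetes.io/zone
              operator: In
              values:
                - us-east-1a
                - us-east-1b
```

Swapping the values for us-east-1c reproduces the Pending state, because no node carries that zone label.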
5. Taint added by an autoscaler or GPU node pool
Managed node pools (GPU, spot, dedicated) are frequently created with a NoSchedule taint so only opted-in workloads use them. After the cluster autoscaler grows such a pool, generic pods still cannot land there.
kubectl get nodes -o custom-columns='NODE:.metadata.name,TAINTS:.spec.taints[*].key'
NODE TAINTS
worker-1 <none>
worker-2 <none>
gpu-pool-1 nvidia.com/gpu
spot-pool-1 cloud.google.com/gke-spot
A pod requesting GPUs must tolerate nvidia.com/gpu:NoSchedule; a batch job targeting spot must tolerate the spot taint.
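A hedged sketch of a pod template fragment for a batch workload aimed at the spot pool. The listing above shows taint keys only, so the effect and the node label used here are assumptions; confirm both with kubectl describe node and kubectl get nodes --show-labels before relying on them.

```yaml
# Pod spec fragment
tolerations:
  - key: "cloud.google.com/gke-spot"
    operator: "Exists"       # matches any taint value, since the value is not shown above
    effect: "NoSchedule"     # assumed effect
nodeSelector:
  cloud.google.com/gke-spot: "true"   # assumed node label; steers the pod onto spot nodes
```

The toleration opts the pod in; the nodeSelector is what actually directs it to the pool.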
6. NotReady / unreachable taint on a degraded node
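When a node’s kubelet stops reporting, the node’s Ready condition becomes Unknown and the node controller applies node.kubernetes.io/unreachable with both the NoSchedule and NoExecute effects. If the kubelet instead reports Ready as False, the taint key is node.kubernetes.io/not-ready. In either case, new pods cannot schedule there until the node recovers.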
kubectl get nodes
NAME STATUS ROLES AGE VERSION
worker-1 Ready <none> 40d v1.29.4
worker-2 NotReady <none> 40d v1.29.4
kubectl describe node worker-2 | grep -A1 Taints
Taints: node.kubernetes.io/unreachable:NoExecute
node.kubernetes.io/unreachable:NoSchedule
This reduces the schedulable node count; if the remaining nodes fail other constraints, the pod has nowhere to go.
Diagnostic Workflow
Step 1: Read the FailedScheduling event verbatim
kubectl describe pod <POD> | sed -n '/Events:/,$p'
The message breaks down the rejection counts per reason. untolerated taint points at taints/tolerations; didn't match Pod's node affinity/selector points at nodeSelector/affinity.
Step 2: List taints on every node
kubectl get nodes -o custom-columns='NODE:.metadata.name,READY:.status.conditions[?(@.type=="Ready")].status,TAINTS:.spec.taints[*].key'
kubectl describe node <NODE> | grep -A3 Taints
Note which nodes carry NoSchedule/NoExecute taints and the exact key, value, and effect.
Step 3: Inspect the pod’s tolerations, nodeSelector, and affinity
kubectl get pod <POD> -o jsonpath='{.spec.tolerations}{"\n"}'
kubectl get pod <POD> -o jsonpath='{.spec.nodeSelector}{"\n"}'
kubectl get pod <POD> -o jsonpath='{.spec.affinity}{"\n"}' | jq .
Compare the pod’s tolerations against the node taints, and its selector/affinity keys against real node labels.
Step 4: Confirm node labels actually exist
kubectl get nodes --show-labels
kubectl get nodes -L disktype -L topology.kubernetes.io/zone
Every nodeSelector key/value pair and every required nodeAffinity expression must be satisfied together by at least one schedulable node; labels spread across different nodes do not count.
Step 5: Fix the mismatch, then verify scheduling
# Add a label a selector expects
kubectl label node worker-1 disktype=ssd
# Or add a toleration to the workload (Deployment spec.template)
kubectl edit deployment web
# Watch the pod move out of Pending
kubectl get pod <POD> -o wide -w
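When editing the Deployment, the toleration belongs in the pod template, as a sibling of containers. The fragment below is a sketch of the location only; the placeholders follow the same convention as the commands above and must be replaced with the key, value, and effect read from the node.

```yaml
# Deployment fragment (other fields unchanged)
spec:
  template:
    spec:
      tolerations:               # same level as containers:, not inside a container
        - key: "<TAINT_KEY>"
          operator: "Equal"
          value: "<TAINT_VALUE>"
          effect: "NoSchedule"   # use the effect shown on the node
```

Saving the change triggers a rollout, so the replacement pods get new names; watch the Deployment's pods rather than the original pod name.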
Example Root Cause Analysis
A GPU inference Deployment infer is scaled to 2 replicas, but both pods sit Pending. The event reads:
Warning FailedScheduling default-scheduler 0/5 nodes are available: 3 node(s) didn't match Pod's node affinity/selector, 2 node(s) had untolerated taint {nvidia.com/gpu: present}.
So 3 CPU workers fail the affinity (the pod requires a GPU node), and the 2 GPU nodes reject it on a taint. Checking the GPU node:
kubectl describe node gpu-pool-1 | grep Taints
Taints: nvidia.com/gpu=present:NoSchedule
Then the pod’s tolerations:
kubectl get pod infer-6b9f7c4d8-jh2lk -o jsonpath='{.spec.tolerations}{"\n"}'
[{"effect":"NoExecute","key":"node.kubernetes.io/not-ready","operator":"Exists","tolerationSeconds":300},{"effect":"NoExecute","key":"node.kubernetes.io/unreachable","operator":"Exists","tolerationSeconds":300}]
Only the default not-ready/unreachable tolerations are present, which the API server adds to every pod at admission; nothing matches nvidia.com/gpu. The node pool was created with a NoSchedule taint so non-GPU pods stay off it, but the Deployment manifest was never given the matching toleration.
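Fix: add the toleration to the pod template. The default tolerations above exist only on the pod, so the Deployment template has no tolerations list yet. The patch therefore creates the list; appending with /tolerations/- would fail on a missing path. If the template already has tolerations, include them in the value, because this patch replaces the list:
kubectl patch deployment infer --type=json -p='[{"op":"add","path":"/spec/template/spec/tolerations","value":[{"key":"nvidia.com/gpu","operator":"Equal","value":"present","effect":"NoSchedule"}]}]'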
kubectl get pods -l app=infer -o wide -w
The Deployment rolls out replacement pods that tolerate the GPU taint, satisfy the GPU affinity, and schedule onto the GPU nodes.
Prevention Best Practices
- Pair every dedicated/GPU/spot taint with a documented toleration that the matching workloads carry, so adding a node pool does not silently strand pods.
- Keep nodeSelector and required nodeAffinity keys to well-known, cluster-set labels (topology.kubernetes.io/zone, kubernetes.io/arch) and verify the value exists on a node before shipping the manifest.
- Prefer preferredDuringSchedulingIgnoredDuringExecution over required when placement is a preference, so a missing label degrades gracefully instead of leaving pods Pending.
- Alert on pods Pending longer than a few minutes. A persistent FailedScheduling event that cites untolerated taint or node affinity/selector points to a configuration mismatch; one that cites Insufficient cpu or Insufficient memory points to capacity.
- For deeper how-tos, see the Kubernetes and Helm guides for taint, toleration, and affinity patterns.
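To illustrate the soft-preference practice from the list above, here is a sketch of a preferred node affinity. The disktype label is reused from the earlier example and the weight is an arbitrary choice.

```yaml
# Pod spec fragment
affinity:
  nodeAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100                  # 1-100; higher weights count more when nodes are scored
        preference:
          matchExpressions:
            - key: disktype
              operator: In
              values:
                - ssd
```

If no node has disktype=ssd, the pod still schedules onto any node that passes the hard filters.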
Quick Command Reference
# See why the pod can't schedule
kubectl describe pod <POD> | sed -n '/Events:/,$p'
# Taints across all nodes
kubectl get nodes -o custom-columns='NODE:.metadata.name,TAINTS:.spec.taints[*].key'
kubectl describe node <NODE> | grep -A3 Taints
# The pod's tolerations / selector / affinity
kubectl get pod <POD> -o jsonpath='{.spec.tolerations}{"\n"}'
kubectl get pod <POD> -o jsonpath='{.spec.nodeSelector}{"\n"}'
kubectl get pod <POD> -o jsonpath='{.spec.affinity}{"\n"}' | jq .
# Node labels (for selector/affinity matching)
kubectl get nodes --show-labels
kubectl get nodes -L topology.kubernetes.io/zone -L disktype
# Fixes
kubectl label node <NODE> disktype=ssd
kubectl edit deployment <NAME> # add tolerations / fix nodeSelector
# Watch it schedule
kubectl get pod <POD> -o wide -w
Conclusion
A node(s) had untolerated taint or didn't match Pod's node affinity/selector event means the scheduler found no node that satisfies all of the pod’s hard placement rules. The usual root causes:
- A node carries a NoSchedule taint the pod does not tolerate.
- The pod is aimed at a control-plane / dedicated node without the matching toleration.
- A nodeSelector label is absent on every node.
- A required nodeAffinity expression matches no node (typo or value the cluster never sets).
- An autoscaler or GPU/spot node pool added a NoSchedule taint the workload must opt into.
- A NotReady/unreachable taint on a degraded node removed it from the schedulable set.
Read the FailedScheduling message first: it counts the nodes per rejection reason and tells you whether to align tolerations or fix selectors/affinity.