Kubernetes `FailedScheduling` Debug Prompt
Diagnose `FailedScheduling` events — taints/tolerations mismatch, node affinity, topology spread skew, resource fit failures, and PV zone constraints.
- Target user: Kubernetes platform engineers
- Difficulty: Advanced
- Tools: Claude, ChatGPT
The prompt
You are a senior Kubernetes engineer who reads `kube-scheduler` decisions like a debugger. You can decode "0/8 nodes are available: 5 node(s) had untolerated taint, 3 Insufficient memory" into a precise diagnosis.
I will provide:
- The pod that's stuck `Pending`
- `kubectl describe pod <pod>` (especially the Events section — the scheduler's "0/N nodes are available" message is gold)
- The pod spec — focus on `resources`, `nodeSelector`, `affinity`, `tolerations`, `topologySpreadConstraints`
- Node inventory: `kubectl get nodes --show-labels`, taints per node
- Cluster autoscaler activity if enabled
Your job:
1. **Parse the scheduler verdict** from Events:
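- **"X node(s) had untolerated taint {key: value}"** → the pod has no toleration for that taint
- **"X node(s) didn't match node selector"** (older wording) → `nodeSelector` / required `nodeAffinity` not satisfied
- **"X node(s) didn't match Pod's node affinity/selector"** → `nodeSelector` or required affinity rules; preferred rules only influence scoring and never filter a node out
- **"X Insufficient cpu/memory"** → the pod's requests exceed the unrequested allocatable on those nodes
- **"X node(s) had volume node affinity conflict"** → the bound PV is pinned to a zone or node that those nodes are not in
- **"X node(s) didn't match pod topology spread constraints"** → placing the pod there would exceed `maxSkew`
- **"X node(s) had taint {key: value}, that the pod didn't tolerate"** → older wording of the untolerated-taint message; names the specific taint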
2. **Reconcile the counts**: for "0/8" with "5 untolerated taint, 3 Insufficient memory", the counts sum to 8, but the reasons can still overlap (a node tainted AND short on memory). The scheduler rejects a node as soon as ANY filter fails, so fixing one reason can expose the next.
3. **For taint issues**:
- List all node taints: `kubectl get nodes -o json | jq '.items[] | {name:.metadata.name, taints:.spec.taints}'`
- Common: `node-role.kubernetes.io/control-plane:NoSchedule`, `nvidia.com/gpu:NoSchedule`, custom team taints
- Pod needs explicit tolerations matching
4. **For resource issues**:
- `kubectl describe nodes | grep -A8 "Allocated resources"` shows what is already requested on each node; compare it with the node's `Allocatable`
- Allocatable minus already-requested < pod's `requests` → won't fit
- The summed requests of all containers in the pod (multi-container Pods) must fit on one node
5. **For affinity issues**:
- **`requiredDuringSchedulingIgnoredDuringExecution`** — hard; pod won't schedule if not met
- **`preferredDuringSchedulingIgnoredDuringExecution`** — soft; scheduler prefers but allows
- **Pod affinity / anti-affinity** — relative to other pods; can self-block ("must not schedule on node with other pod with same label" — first pod schedules; second can't if cluster has 1 node)
6. **For topology spread constraints**:
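- `maxSkew: 1` caps how far one topology domain (zone/node) may exceed the domain with the fewest matching pods; a domain with no free capacity holds that minimum down and blocks the others
- `whenUnsatisfiable: DoNotSchedule` (hard) vs `ScheduleAnyway` (soft)
- Common: pod uses topology spread on `topology.kubernetes.io/zone`, but one zone has no schedulable room, or `minDomains` is higher than the number of zones (for example in a single-zone cluster) → subsequent pods stay Pending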
7. **For PV zone conflicts**:
- PVC bound to a PV in zone us-east-1a; node in us-east-1b can't mount it
- Symptom: "node(s) had volume node affinity conflict"
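- Fix: `volumeBindingMode: WaitForFirstConsumer` in the StorageClass, so a new PV is provisioned only once the pod is being scheduled and lands in the pod's zone; a PV that is already bound stays in its zone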
8. **For "scheduled but pod stays Pending"** (rare):
- Scheduler chose a node but the kubelet rejected the pod; check `kubectl get events --field-selector involvedObject.name=<pod>`
Mark DESTRUCTIVE: removing taints from production nodes (every pod, including those that never tolerated the taint, may now schedule there), changing `requiredDuringScheduling` rules live, scaling up manually to cover capacity without accounting for the autoscaler.
---
Pod + namespace: [DESCRIBE]
`kubectl describe pod <pod>` Events section:
```
[PASTE — include the "0/N nodes are available" line and reasons]
```
Pod spec — `resources`, `nodeSelector`, `affinity`, `tolerations`, `topologySpreadConstraints`:
```yaml
[PASTE]
```
Node count + key labels + taints:
```
[PASTE — kubectl get nodes --show-labels with taints from describe]
```
Cluster autoscaler enabled? [yes / no / Karpenter]
Why this prompt works
The scheduler’s “0/N nodes are available” message is usually the most informative line when a pod is stuck Pending: it groups the nodes by the reason they were filtered out. Most “Pending pod” debugging is reading this line, then mapping it back to pod spec or node state.
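To make that mapping concrete, here is a minimal Python sketch that splits such a message into per-reason counts. The function name `parse_verdict` and the regular expressions are illustrative assumptions, and the message layout is taken from the examples in this prompt; real scheduler output varies between Kubernetes versions, so treat it as a starting point rather than a complete parser.

```python
import re

# Header such as "0/8 nodes are available:" (schedulable nodes / total nodes).
HEADER = re.compile(r"(\d+)/(\d+) nodes are available:\s*")
# One reason: a leading node count followed by the reason text.
REASON = re.compile(r"^(\d+)\s+(.*)$", re.S)


def parse_verdict(message):
    """Split a FailedScheduling message into totals and per-reason node counts."""
    header = HEADER.search(message)
    if header is None:
        raise ValueError("no 'N/M nodes are available' header found")
    feasible, total = int(header.group(1)), int(header.group(2))

    # Assumption: anything after "preemption:" repeats the verdict for the
    # preemption attempt, so only the first part is parsed.
    body = message[header.end():].split("preemption:")[0]

    reasons = {}
    # Split on a comma or newline only when a node count follows, so commas
    # inside a reason ("..., that the pod didn't tolerate") stay intact.
    for part in re.split(r"[,\n]\s*(?=\d+\s)", body):
        match = REASON.match(part.strip().rstrip("."))
        if match:
            reasons[match.group(2).strip()] = int(match.group(1))
    return {"feasible": feasible, "total": total, "reasons": reasons}


if __name__ == "__main__":
    sample = (
        "0/8 nodes are available: 5 node(s) had untolerated taint "
        "{node-role.kubernetes.io/control-plane: }, 3 Insufficient memory."
    )
    for reason, count in parse_verdict(sample)["reasons"].items():
        print(count, reason)
```

Sorting the resulting counts shows which filter removes the most nodes, which is usually the first thing to map back to the pod spec.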
How to use it
- Read the Events section carefully — it usually states exactly which filter failed.
- Compare counts: “5 untolerated taint, 3 insufficient memory” on 8 nodes accounts for every node, but check whether the tainted nodes would also fail on memory once the taint is tolerated.
- For autoscaled clusters, give the autoscaler a minute or two before declaring failure.
- For multi-zone clusters, always check zone constraints (PV zones, topology spread).
Useful commands
```bash
# Pod events
kubectl describe pod <pod>
kubectl get events --field-selector involvedObject.name=<pod>,type=Warning

# Node inventory
kubectl get nodes --show-labels
kubectl get nodes -o json | jq '.items[] | {name:.metadata.name, taints:(.spec.taints//[]), allocatable:.status.allocatable}'
kubectl describe nodes | grep -E "Name:|Taints:|Resource|Allocated" | head -100

# What's on each node (resource view)
kubectl describe node <node> | grep -A20 "Allocated resources"
kubectl top node   # actual usage; metrics-server needed

# Pod's scheduling-relevant spec (resources live under each container)
kubectl get pod <pod> -o json | jq '.spec | {nodeSelector, affinity, tolerations, topologySpreadConstraints, resources: [.containers[] | {name, resources}]}'

# Cluster autoscaler (if installed; the label selector depends on how it was deployed)
kubectl logs -n kube-system -l app=cluster-autoscaler --tail=100
kubectl describe configmap cluster-autoscaler-status -n kube-system   # last decision

# Karpenter (alternative autoscaler)
kubectl get nodepool
kubectl get nodeclaim

# Re-trigger scheduling
kubectl delete pod <pod>   # if part of a Deployment/RS, will recreate
```
Decoding the verdict
```
0/8 nodes are available:
  5 node(s) had untolerated taint {node-role.kubernetes.io/control-plane: }
  2 node(s) didn't match Pod's node affinity/selector
  1 Insufficient memory
```
Reading:
- 5 nodes carry the control-plane taint, and the pod has no toleration for it
- 2 nodes don’t have the label your `nodeSelector` requires
- 1 node has insufficient unrequested memory
Fixes:
- If you wanted only worker nodes: 5 isn’t the issue — focus on the other 3
- For affinity: add the missing label, or relax the selector
- For memory: lower the request, scale up the cluster, or free requested memory by moving another pod off the node
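The taint line in a verdict like the one above comes down to matching each node taint against the pod's tolerations. The following Python sketch illustrates that matching rule as it is commonly described for Kubernetes; the function names `tolerates` and `untolerated_taints` are hypothetical, and the dictionaries mirror the `taints` and `tolerations` fields as they appear in `kubectl get ... -o json`.

```python
def tolerates(toleration, taint):
    """Return True if one toleration matches one taint."""
    # An empty effect on the toleration matches every taint effect.
    if toleration.get("effect") and toleration["effect"] != taint.get("effect"):
        return False
    operator = toleration.get("operator") or "Equal"  # Equal is the default
    key = toleration.get("key")
    if not key:
        # An empty key is only meaningful with Exists and matches every key.
        return operator == "Exists"
    if key != taint.get("key"):
        return False
    if operator == "Exists":
        return True
    # Equal: values must match; a missing value is treated as "".
    return (toleration.get("value") or "") == (taint.get("value") or "")


def untolerated_taints(taints, tolerations):
    """List the hard taints (NoSchedule / NoExecute) that no toleration matches."""
    hard = [t for t in taints if t.get("effect") in ("NoSchedule", "NoExecute")]
    return [t for t in hard if not any(tolerates(tol, t) for tol in tolerations)]


if __name__ == "__main__":
    control_plane = {"key": "node-role.kubernetes.io/control-plane", "effect": "NoSchedule"}
    print(untolerated_taints([control_plane], []))
    print(untolerated_taints(
        [control_plane],
        [{"key": "node-role.kubernetes.io/control-plane", "operator": "Exists"}],
    ))
```

`PreferNoSchedule` taints are deliberately ignored here because they only affect scoring; a node with such a taint does not show up in the "untolerated taint" count.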
Common findings this catches
- `Insufficient cpu` but `kubectl top` says nodes are at 30% → requests, not usage, dictate scheduling. The pod’s request exceeds the unrequested allocatable.
- `untolerated taint` with control-plane nodes → add a toleration if intended, or accept the exclusion.
- `volume node affinity conflict` → PVC in zone A, no node in zone A available. Use a `WaitForFirstConsumer` StorageClass.
- Topology spread `DoNotSchedule` blocks replicas → a topology domain with no schedulable room holds the minimum down, or `minDomains` exceeds the number of domains.
- Pod anti-affinity on `hostname` + small cluster → only 1 replica fits per node; rest Pending.
- Limits without requests → the requests default to the limits, so the pod may ask for more than intended. No requests at all → the scheduler counts 0; the pod fits anywhere but might OOM later. Set explicit requests.
- Stuck Pending with autoscaler enabled → autoscaler can’t add a node large enough for the pod; check node templates / max sizes.
Affinity patterns
Run on specific node pool
```yaml
nodeSelector:
  workload: batch
# or
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
        - matchExpressions:
            - key: workload
              operator: In
              values: [batch]
```
Prefer GPU nodes, fallback to CPU
```yaml
affinity:
  nodeAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        preference:
          matchExpressions:
            - { key: nvidia.com/gpu.present, operator: In, values: ["true"] }
```
Anti-affinity (spread across nodes)
```yaml
affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchLabels: { app: critical-app }
        topologyKey: kubernetes.io/hostname
```
Topology spread across zones
```yaml
topologySpreadConstraints:
  - maxSkew: 1
    topologyKey: topology.kubernetes.io/zone
    whenUnsatisfiable: ScheduleAnyway # soft, allows imbalance
    labelSelector:
      matchLabels: { app: web }
```
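When the same constraint is made hard (`DoNotSchedule`), the skew check decides whether a node is filtered out. The Python sketch below shows that arithmetic under simplifying assumptions: the pod matches its own `labelSelector`, every domain listed is eligible, and `min_domains` mirrors the optional `minDomains` field. The function names are hypothetical.

```python
def skew_after_placement(pods_per_domain, candidate, min_domains=None):
    """Skew that would result from placing one more matching pod in `candidate`."""
    global_min = min(pods_per_domain.values())
    # With minDomains set and too few domains present, the minimum counts as 0.
    if min_domains is not None and len(pods_per_domain) < min_domains:
        global_min = 0
    return pods_per_domain[candidate] + 1 - global_min


def placement_allowed(pods_per_domain, candidate, max_skew=1, min_domains=None):
    """True if placing the pod in `candidate` keeps the skew within max_skew."""
    return skew_after_placement(pods_per_domain, candidate, min_domains) <= max_skew


if __name__ == "__main__":
    zones = {"zone-a": 1, "zone-b": 0, "zone-c": 0}
    for zone in zones:
        print(zone, placement_allowed(zones, zone))
```

With one pod already in `zone-a` and none elsewhere, another pod is only allowed in `zone-b` or `zone-c`. If those zones have no free capacity, the pod stays Pending even though `zone-a` has room, which is the situation the soft `ScheduleAnyway` setting avoids.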
When to escalate
- Scheduler decisions don’t match its log output: pull `kube-scheduler` logs; possibly a controller race.
- Cluster autoscaler scaling but new nodes don’t fit the pod: check Node Group / NodePool templates; resource asks exceed instance type.
- Cluster of “right” capacity but distribution wrong (zonal imbalance) — review topology spread config and PV zone strategy.
Related prompts
- Kubernetes Cluster Autoscaler / Karpenter Debug Prompt
  Diagnose cluster autoscaling: scale-up delay, scale-down protection, node group selection, pod doesn't fit any template, Karpenter NodePool/NodeClaim issues.
- Kubernetes Pod Troubleshooting Prompt
  Diagnose any misbehaving pod (pending, evicted, networking-broken, storage-stuck, or just plain slow) with a structured AI walkthrough.
- Kubernetes Topology Spread Constraints Debug Prompt
  Diagnose and design topology spread constraints: zone/node distribution, skew tolerance, hard vs soft, single-zone cluster traps.