Pending Pod Scheduling Diagnosis Prompt
Diagnose why a pod is stuck Pending by reading scheduler events, node capacity, and the pod's scheduling constraints, then propose the minimal change to get it scheduled.
- Target user: Kubernetes operators and platform engineers
- Difficulty: Intermediate
- Tools: Claude, ChatGPT
The prompt
You are a senior Kubernetes platform engineer diagnosing a pod stuck in Pending. The scheduler has already told us why; your job is to read it correctly and fix the smallest thing.
I will provide:
- `kubectl describe pod <name>` (especially the FailedScheduling event message)
- The pod spec's scheduling constraints: requests, nodeSelector, affinity/anti-affinity, tolerations, topologySpreadConstraints, priorityClassName
- `kubectl get nodes -o wide` and `kubectl describe node` allocatable/taints for candidate nodes
- Optionally, any PVC the pod mounts and its StorageClass
Your job:
1. **Parse the scheduler verdict**: translate the FailedScheduling message ("Insufficient cpu", "didn't match node selector", "had taint", "didn't match pod topology spread", "had volume node affinity conflict") into the exact constraint that failed and the number of nodes it rejected.
2. **Check capacity math**: compare the pod's requests to each node's allocatable (not capacity; allocatable already excludes system and kubelet reservations), minus the requests of pods already scheduled on that node.
3. **Check pin constraints**: confirm whether the pod's nodeSelector and affinity match any node's labels, and whether its tolerations cover that node's taints.
4. **Check topology and volumes**: flag a topologySpreadConstraint that no zone can satisfy, or a PVC bound to a volume in a zone with no schedulable node.
5. **Recommend the minimal fix**: relax one constraint, lower a request, add a toleration, or add capacity by scaling a node pool; avoid blunt fixes that defeat the original intent.
6. **Verify**: give the command to confirm the pod schedules and lands where expected.
Output: (a) the single binding constraint, (b) why it cannot be satisfied today, (c) the minimal fix and its trade-off, (d) verification command.
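Steps 1 and 2 of the prompt can be sanity-checked by hand before handing the output to a model. The following is a minimal Python sketch, not a definitive parser. It assumes the event message follows the `N/M nodes are available: <count> <reason>, ...` layout that recent kube-scheduler versions emit (the exact wording varies by version) and that CPU quantities are either plain cores or millicores. The helper names `parse_failed_scheduling`, `cpu_millicores`, and `cpu_fits` are hypothetical, and the sample message and quantities are invented for illustration.

```python
import re


def parse_failed_scheduling(message: str) -> tuple[int, dict[str, int]]:
    """Split a FailedScheduling message into (total_nodes, {reason: node_count}).

    Assumed layout (varies by Kubernetes version):
    "0/5 nodes are available: 3 Insufficient cpu, 2 node(s) had ... . preemption: ..."
    """
    match = re.match(r"(\d+)/(\d+) nodes are available: (.*)", message, re.S)
    if match is None:
        raise ValueError("unrecognized FailedScheduling message")
    total_nodes = int(match.group(2))
    # Drop the trailing preemption sentence; it is not a per-node filter reason.
    body = match.group(3).split(". preemption:")[0].rstrip(".")
    reasons: dict[str, int] = {}
    for part in body.split(", "):
        entry = re.match(r"(\d+) (.+)", part.strip())
        if entry:
            reasons[entry.group(2)] = int(entry.group(1))
    return total_nodes, reasons


def cpu_millicores(quantity: str) -> int:
    """Convert a CPU quantity such as "500m" or "1.5" to millicores."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return round(float(quantity) * 1000)


def cpu_fits(pod_request: str, allocatable: str, scheduled_requests: list[str]) -> bool:
    """True if the pod's CPU request fits in allocatable minus already-requested CPU."""
    used = sum(cpu_millicores(r) for r in scheduled_requests)
    free = cpu_millicores(allocatable) - used
    return cpu_millicores(pod_request) <= free


if __name__ == "__main__":
    # Invented sample message, shaped like a typical scheduler event.
    sample = (
        "0/5 nodes are available: 3 Insufficient cpu, "
        "2 node(s) had untolerated taint {dedicated: gpu}. "
        "preemption: 0/5 nodes are available: "
        "5 No preemption victims found for incoming pod."
    )
    print(parse_failed_scheduling(sample))
    # Node with 3920m allocatable and 3500m already requested leaves 420m free.
    print(cpu_fits("500m", "3920m", ["2", "1500m"]))
```

The per-reason counts show which constraint rejects the most nodes, which is usually the one worth fixing first. The fit check deliberately subtracts requests from allocatable rather than capacity, matching step 2.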
Related prompts
- Kubernetes `FailedScheduling` Debug Prompt
  Diagnose `FailedScheduling` events caused by taints/tolerations mismatch, node affinity, topology spread skew, resource fit failures, and PV zone constraints.
- Kubernetes Taints, Tolerations & Node Bin-Packing Prompt
  Design a node-pool strategy with taints, tolerations, and affinity that isolates workloads (GPU, spot, system) and bin-packs efficiently without stranding capacity or causing unschedulable pods.