Kubernetes Taints, Tolerations & Node Bin-Packing Prompt

Design a node-pool strategy with taints, tolerations, and affinity that isolates workloads (GPU, spot, system) and bin-packs efficiently without stranding capacity or causing unschedulable pods.

Target user: Platform engineers designing node-pool and scheduling strategy
Difficulty: Advanced
Tools: Claude, ChatGPT

The prompt

You are a platform engineer who designs node-pool topologies that keep expensive hardware busy, isolate noisy or risky workloads, and never leave pods Pending for the wrong reasons.

I will provide:
- The node pools (instance types, on-demand vs spot, GPU, ARM, memory-optimized) and their cost
- The workload classes (system, latency-sensitive, batch, GPU, untrusted/multi-tenant)
- Current taints/tolerations/affinity and any Pending-pod or stranded-capacity symptoms
- The autoscaler in use (cluster-autoscaler, Karpenter)

Your job:

1. **Taints repel, tolerations permit, affinity attracts** — drill the distinction. A toleration does NOT force a pod onto a tainted node; you also need `nodeAffinity`/`nodeSelector` to attract it. Most "my pod won't land on the GPU node" issues are a missing affinity, not a missing toleration.

2. **Reserve special hardware** — taint GPU/ARM/spot pools so only tolerating workloads land there, and pair with affinity so those workloads land ONLY there. Show the exact taint + toleration + affinity triple for one pool.

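3. **Spot strategy** — taint spot pools, tolerate that taint only on interruption-tolerant workloads, and plan for eviction: state how pods should react to `NoExecute` taints (including `tolerationSeconds`) and add PDBs so drains triggered by spot reclamation don't take down a service. Note that PDBs only limit voluntary evictions; they cannot stop a reclaimed node from disappearing. Keep system/critical pods on on-demand.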

4. **Bin-packing vs spread** — explain the tension: bin-packing (consolidate to fewer nodes, cheaper) vs topology spread (resilience). Recommend per-workload: batch packs tight, web spreads across AZs. Show how Karpenter consolidation or the autoscaler's bin-packing achieves this and where it strands capacity.

5. **System workload protection** — keep DaemonSets and critical add-ons schedulable everywhere with broad tolerations (e.g. `operator: Exists`), and protect control-plane-adjacent pods from preemption with a high `PriorityClass` such as `system-cluster-critical`.

6. **Diagnose Pending** — give the decision tree for an unschedulable pod: insufficient resources vs taint-without-toleration vs affinity-with-no-matching-node vs topology constraint, read straight from `kubectl describe pod` events.

7. **Cost check** — estimate utilization per pool and flag stranded capacity (a node 80% idle because of over-tight affinity).

Output as: (a) the node-pool → taint → toleration → affinity matrix, (b) example pod specs per workload class, (c) the Pending-pod decision tree, (d) a consolidation/bin-packing recommendation with cost notes.

Bias toward: taint+toleration+affinity together, spot only for tolerant workloads, and packing batch while spreading web.
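
To sanity-check what the AI returns for items 1, 2 and 6, it can help to have a toy model of the filtering logic in hand. The sketch below is a deliberately simplified approximation, not the real kube-scheduler: it handles only `nodeSelector`-style label matching, `Equal` tolerations, and CPU. The `pool` label, the `dedicated=gpu:NoSchedule` taint, the node names, and the reason strings are all hypothetical and chosen only for illustration.

```python
from collections import Counter
from dataclasses import dataclass, field

# Only these effects block scheduling; PreferNoSchedule merely affects scoring.
HARD_EFFECTS = {"NoSchedule", "NoExecute"}


@dataclass
class Node:
    name: str
    labels: dict
    free_cpu_millis: int
    taints: list = field(default_factory=list)  # (key, value, effect) tuples


@dataclass
class Pod:
    name: str
    cpu_millis: int
    node_selector: dict = field(default_factory=dict)
    tolerations: list = field(default_factory=list)  # (key, value), operator Equal


def rejection_reason(pod, node):
    """Return None if the pod fits the node, else the first failing check."""
    # Affinity attracts: every selector label must match the node's labels.
    if any(node.labels.get(k) != v for k, v in pod.node_selector.items()):
        return "node selector/affinity mismatch"
    # Taints repel: every hard taint needs a matching toleration.
    for key, value, effect in node.taints:
        if effect in HARD_EFFECTS and (key, value) not in pod.tolerations:
            return "untolerated taint"
    # Resources: the request must fit in what is still free.
    if pod.cpu_millis > node.free_cpu_millis:
        return "insufficient cpu"
    return None


def feasible_nodes(pod, nodes):
    """Return (names of nodes the pod could land on, rejection reason counts)."""
    names, reasons = [], Counter()
    for node in nodes:
        reason = rejection_reason(pod, node)
        if reason is None:
            names.append(node.name)
        else:
            reasons[reason] += 1
    return names, dict(reasons)


# Hypothetical two-pool cluster: one untainted general node, one tainted GPU node.
nodes = [
    Node("general-1", {"pool": "general"}, free_cpu_millis=2000),
    Node("gpu-1", {"pool": "gpu"}, free_cpu_millis=4000,
         taints=[("dedicated", "gpu", "NoSchedule")]),
]
gpu_toleration = [("dedicated", "gpu")]

tolerate_only = Pod("tolerate-only", 500, tolerations=gpu_toleration)
full_triple = Pod("full-triple", 500, {"pool": "gpu"}, gpu_toleration)
selector_only = Pod("selector-only", 500, {"pool": "gpu"})
too_big = Pod("too-big", 8000, {"pool": "gpu"}, gpu_toleration)

for pod in (tolerate_only, full_triple, selector_only, too_big):
    print(pod.name, feasible_nodes(pod, nodes))
```

In this model the toleration-only pod is feasible on both nodes, so nothing steers it to the GPU pool; adding the selector narrows it to `gpu-1`. A pod with the selector but no toleration, or with a request larger than the free CPU, has no feasible node and would stay Pending, with the reason counts playing the role of the per-node summary you would look for in `kubectl describe pod` events.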

Related prompts

Kubernetes DaemonSet Debug Prompt

Kubernetes Job Pod Failure Policy & Success Policy Design Prompt

Kubernetes Extended Resources & Opaque Integer Design Prompt

Kubernetes Pod Overhead & RuntimeClass Accounting Prompt