Designing Node Affinity, Taints, and Tolerations With AI

The pods that should have landed on our GPU nodes kept scheduling onto plain CPU workers, and the GPU nodes sat empty and expensive. The cause was a tangle of node affinity, taints, and tolerations that nobody on the team fully held in their head at once. Scheduling is the corner of Kubernetes where the rules are simple individually but combine into behavior that’s genuinely hard to reason about. This is where I lean on an AI model — not to apply changes, but to draft the right combination of constraints and, just as usefully, to explain why the current ones aren’t doing what I expected.

The three mechanisms and how they interact

Before any prompting, it helps to be precise about what each piece does, because the model’s explanations only help if you can check them:

Node affinity is a property of the pod. It expresses “I want to run on nodes that match these labels.” It attracts.
Taints are a property of the node. They say “don’t schedule here unless you explicitly tolerate this.” They repel.
Tolerations are a property of the pod. They say “I can put up with this taint.” They permit, but don’t attract.

The classic confusion: a toleration lets a pod onto a tainted node but does not pull it there. To dedicate GPU nodes to GPU workloads, you need both a taint (to keep everyone else off) and node affinity (to draw the GPU pods on). Tolerations alone leave you with empty GPU nodes — exactly my bug.
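To make the interaction concrete, here is a toy Python model of the three mechanisms. It is a sketch, not how the real scheduler is implemented: the `Node`, `Pod`, and `feasible_nodes` names are invented for illustration, only the NoSchedule effect and exact-match tolerations are modeled, and required affinity is reduced to a plain label match.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    labels: dict = field(default_factory=dict)
    taints: list = field(default_factory=list)  # (key, value, effect) tuples


@dataclass
class Pod:
    name: str
    tolerations: list = field(default_factory=list)  # (key, value, effect) tuples
    required_labels: dict = field(default_factory=dict)  # stand-in for required node affinity


def tolerates(pod, taint):
    # Simplification: only exact key/value/effect matches (operator "Equal").
    return taint in pod.tolerations


def feasible_nodes(pod, nodes):
    """Names of the nodes this pod may land on under the toy model."""
    result = []
    for node in nodes:
        # Taints repel: every NoSchedule taint must be tolerated.
        blocked = any(
            t[2] == "NoSchedule" and not tolerates(pod, t) for t in node.taints
        )
        # Affinity attracts: every required label must be present on the node.
        matches = all(
            node.labels.get(k) == v for k, v in pod.required_labels.items()
        )
        if not blocked and matches:
            result.append(node.name)
    return result


gpu_taint = ("dedicated", "gpu", "NoSchedule")
nodes = [
    Node("cpu-1", labels={"hardware": "cpu"}),
    Node("gpu-1", labels={"hardware": "gpu"}, taints=[gpu_taint]),
]

web = Pod("web")
toleration_only = Pod("train-a", tolerations=[gpu_taint])
both = Pod("train-b", tolerations=[gpu_taint], required_labels={"hardware": "gpu"})

print(feasible_nodes(web, nodes))              # ['cpu-1']
print(feasible_nodes(toleration_only, nodes))  # ['cpu-1', 'gpu-1']
print(feasible_nodes(both, nodes))             # ['gpu-1']
```

The middle case is the bug described above: with a toleration but no affinity, the GPU pod is allowed on the GPU node but is just as free to land on a CPU worker.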

What the model needs is the shape of my nodes and what I’m trying to achieve. I get the labels and taints with read-only commands and paste the output:

kubectl get nodes --show-labels
kubectl get nodes -o custom-columns=NAME:.metadata.name,TAINTS:.spec.taints

That output is just metadata (node names, labels, and taints), which is generally safe to share. Running kubectl get is read-only and I run it myself; the model never holds the kubeconfig that produced it. It transforms the description into YAML; it doesn’t reach into the cluster.

Drafting the dedicated-node pattern

I describe the goal plainly:

I have nodes labeled hardware=gpu that I want reserved exclusively for GPU workloads. Give me the taint command for the nodes and the pod spec (node affinity plus toleration) so only GPU pods schedule there and they actually prefer those nodes.

The model returns the matched pair. First the taint, which I review and apply myself:

kubectl taint nodes -l hardware=gpu dedicated=gpu:NoSchedule

Then the pod-side spec that both tolerates the taint and is attracted by affinity:

spec:
  tolerations:
    - key: "dedicated"
      operator: "Equal"
      value: "gpu"
      effect: "NoSchedule"
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: hardware
                operator: In
                values: ["gpu"]

Both halves are present, and that’s the fix. The model got the pairing right, but I confirmed that the taint’s key, value, and effect in the command match the toleration exactly, because a one-character mismatch silently breaks the whole thing.
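That comparison is easy to script. The sketch below is a hypothetical helper (`parse_taint` and `mismatches` are names made up here, not part of kubectl or any client library). It assumes the taint is written in the `key=value:effect` form used in the command above and that the toleration uses operator: Equal.

```python
def parse_taint(spec):
    """Split a kubectl-style taint argument 'key=value:effect' into its parts."""
    key_value, _, effect = spec.rpartition(":")
    key, _, value = key_value.partition("=")
    return {"key": key, "value": value, "effect": effect}


def mismatches(taint_spec, toleration):
    """List the fields where an operator: Equal toleration fails to match the taint."""
    taint = parse_taint(taint_spec)
    problems = []
    for name in ("key", "value", "effect"):
        wanted = toleration.get(name, "")
        # An empty toleration effect matches every effect, so it is not a mismatch.
        if name == "effect" and wanted == "":
            continue
        if taint[name] != wanted:
            problems.append(
                f"{name}: taint has {taint[name]!r}, toleration has {wanted!r}"
            )
    return problems


toleration = {
    "key": "dedicated",
    "operator": "Equal",
    "value": "gpu",
    "effect": "NoSchedule",
}

print(mismatches("dedicated=gpu:NoSchedule", toleration))  # []
print(mismatches("dedicate=gpu:NoSchedule", toleration))
# ["key: taint has 'dedicate', toleration has 'dedicated'"]
```

An empty list means the pair lines up; anything else is worth fixing before the taint is applied.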

Required versus preferred — the choice that bites

Node affinity comes in two strengths, and picking the wrong one causes either stuck pods or sloppy placement. I ask the model to spell out the trade-off for my case:

requiredDuringSchedulingIgnoredDuringExecution: hard rule. If no node matches, the pod stays Pending until one does.
preferredDuringSchedulingIgnoredDuringExecution: soft rule with a weight. The scheduler favors matching nodes, but falls back to any other feasible node if none match.

For a zone preference I usually want preferred; for hardware requirements I want required. The model is good at recommending which, but I make the final call: a “required” rule that can’t be satisfied is how you get a 2 a.m. page about pods that won’t start.

      preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 100
          preference:
            matchExpressions:
              - key: topology.kubernetes.io/zone
                operator: In
                values: ["us-east-1a"]
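The difference between the two strengths can be sketched as a filter step followed by a scoring step. This is a deliberately simplified model: the `schedule` function is hypothetical, and the real scheduler combines affinity weights with many other scores. It does show why required can leave a pod Pending while preferred cannot.

```python
def schedule(node_labels, required=None, preferred=None):
    """Pick a node name, or None (the pod stays Pending).

    node_labels: {node_name: {label: value}}
    required:    {label: value} that a node must carry
    preferred:   list of (weight, {label: value}) soft preferences
    """
    required = required or {}
    preferred = preferred or []

    def has(labels, want):
        return all(labels.get(k) == v for k, v in want.items())

    # Filter: required terms remove nodes outright.
    candidates = [n for n, labels in node_labels.items() if has(labels, required)]
    if not candidates:
        return None

    # Score: preferred terms only add weight to the nodes that match.
    def score(name):
        return sum(w for w, want in preferred if has(node_labels[name], want))

    return max(candidates, key=score)


ZONE = "topology.kubernetes.io/zone"
two_zones = {"a": {ZONE: "us-east-1a"}, "b": {ZONE: "us-east-1b"}}
only_b = {"b": {ZONE: "us-east-1b"}}
want_1a = {ZONE: "us-east-1a"}

print(schedule(two_zones, preferred=[(100, want_1a)]))  # a
print(schedule(only_b, preferred=[(100, want_1a)]))     # b
print(schedule(only_b, required=want_1a))               # None
```

With only the us-east-1b node available, the preferred rule still places the pod, while the required rule returns nothing, which is the Pending case.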

Pro Tip: A Pending pod with no obvious reason is often an unsatisfiable required affinity or a missing toleration (insufficient CPU, memory, or GPU capacity is the other usual suspect). Run kubectl describe pod <name> and read the Events: the FailedScheduling message says how many nodes were rejected and why, and pasting that event line into the model gets you a diagnosis fast.

Verify scheduling decisions before you trust them

Generated scheduling config is easy to get subtly wrong, so I verify in stages, all without mutating prod. First a client-side dry run for validity, then I watch where a test pod actually lands in a non-prod namespace:

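kubectl apply --dry-run=client -f pod.yaml                 # validate the manifest; nothing is created
kubectl apply -n <non-prod-namespace> -f pod.yaml          # create the test pod outside prod
kubectl get pod gpu-test -n <non-prod-namespace> -o wide   # check the NODE column
kubectl describe pod gpu-test -n <non-prod-namespace> | grep -A5 Events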

The -o wide NODE column is the moment of truth: it shows whether the pod went where the rules intended. If the pod is stuck Pending, the Events from describe say why nodes were rejected; if it landed on the wrong node, the pod’s tolerations and affinity plus that node’s labels and taints go back to the model for another pass.
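If the same check needs to run repeatedly, the NODE column can be read from JSON instead of by eye. The sketch below assumes the input is the output of kubectl get pod gpu-test -o json, where spec.nodeName holds the assigned node. The `placement` helper and the trimmed sample documents are illustrative only.

```python
import json


def placement(pod_json, expected_nodes):
    """Report whether a pod landed on one of the nodes we intended."""
    pod = json.loads(pod_json)
    node = pod.get("spec", {}).get("nodeName")  # unset while the pod is unscheduled
    phase = pod.get("status", {}).get("phase")
    if node is None:
        return f"not scheduled yet (phase: {phase})"
    if node in expected_nodes:
        return f"ok: scheduled on {node}"
    return f"wrong node: {node}"


# Trimmed, made-up stand-ins for real `kubectl get pod -o json` output.
on_gpu = '{"spec": {"nodeName": "gpu-1"}, "status": {"phase": "Running"}}'
on_cpu = '{"spec": {"nodeName": "cpu-1"}, "status": {"phase": "Running"}}'
pending = '{"spec": {}, "status": {"phase": "Pending"}}'

print(placement(on_gpu, {"gpu-1"}))   # ok: scheduled on gpu-1
print(placement(on_cpu, {"gpu-1"}))   # wrong node: cpu-1
print(placement(pending, {"gpu-1"}))  # not scheduled yet (phase: Pending)
```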

The mistakes AI makes here

Reviewing generated scheduling config, the recurring errors are specific enough to watch for:

Effect mismatch. Taint uses NoExecute but the toleration only covers NoSchedule, so running pods get evicted unexpectedly.
operator: Exists overreach. A toleration with operator: Exists and no key tolerates every taint, which quietly defeats dedicated nodes.
Affinity without the taint. The model adds affinity to attract pods but forgets the taint, so other workloads still crowd onto the nodes.

Each of these is invisible until production traffic shows up on the wrong hardware. That’s why a human reads every generated constraint.
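The three mistakes above are mechanical enough to check with a small script before a human reads the rest. This is a rough sketch under stated assumptions: `lint` is a hypothetical helper, taints and tolerations are passed in as plain dicts with the same field names as the YAML, and it covers only these three patterns.

```python
def lint(taints, tolerations, has_affinity):
    """Flag the three recurring mistakes in a generated taint/toleration/affinity set.

    taints:       taints on the dedicated nodes, as dicts with key/value/effect
    tolerations:  tolerations from the pod spec, as dicts
    has_affinity: True if the pod spec carries node affinity for those nodes
    """
    warnings = []

    # operator: Exists with no key tolerates every taint.
    for tol in tolerations:
        if tol.get("operator") == "Exists" and not tol.get("key"):
            warnings.append("toleration with operator Exists and no key tolerates every taint")

    # Effect mismatch: the key is tolerated, but not for the taint's effect.
    for taint in taints:
        same_key = [t for t in tolerations if t.get("key") == taint["key"]]
        covered = any(t.get("effect") in (None, "", taint["effect"]) for t in same_key)
        if same_key and not covered:
            warnings.append(
                f"effect mismatch on {taint['key']}: taint is {taint['effect']}"
            )

    # Affinity without the taint: nothing keeps other workloads off the nodes.
    if has_affinity and not taints:
        warnings.append("affinity without a taint: other workloads can still land on these nodes")

    return warnings


gpu_taint = {"key": "dedicated", "value": "gpu", "effect": "NoSchedule"}
gpu_tol = {"key": "dedicated", "operator": "Equal", "value": "gpu", "effect": "NoSchedule"}
evicting_taint = {"key": "dedicated", "value": "gpu", "effect": "NoExecute"}

print(lint([gpu_taint], [gpu_tol], True))                 # []
print(lint([evicting_taint], [gpu_tol], True))            # effect mismatch warning
print(lint([gpu_taint], [{"operator": "Exists"}], True))  # Exists overreach warning
print(lint([], [], True))                                 # affinity-without-taint warning
```

A clean result here does not replace the review; it only clears the three patterns it knows about.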

Keep mutation human

Tainting nodes changes scheduling for the whole cluster — it’s a mutating action, so it stays with a human running the command. The model drafts the YAML and the kubectl taint line and explains the behavior; I review, dry-run, and apply. The model never holds prod credentials and never executes against the cluster.

Scheduling is fiddly enough that a fast, well-informed drafting partner genuinely speeds you up — as long as you stay the one who decides what lands where.
