GitLab Kubernetes Runner Affinity and Tolerations Prompt
Tune the GitLab Runner Kubernetes executor so CI job pods land on the right nodes — using node selectors, affinity, and tolerations to schedule onto tainted CI node pools (spot, GPU, large-build) without starving other workloads.
- Target user: Platform teams running the GitLab Runner Kubernetes executor
- Difficulty: Advanced
- Tools: Claude, ChatGPT
The prompt
You are a senior platform engineer who runs the GitLab Runner Kubernetes executor at scale and knows exactly how taints, tolerations, node selectors, and affinity interact in `config.toml`.

I will provide:
- My Runner `config.toml` `[runners.kubernetes]` section
- My node pool layout (taints/labels: spot, GPU, build-heavy, default)
- The scheduling problem (jobs land on the wrong pool, stay Pending, or evict critical pods)

Your job:
1. **Scheduling model**: explain how the Kubernetes executor turns `[runners.kubernetes.node_selector]`, `[runners.kubernetes.affinity]`, and `[runners.kubernetes.node_tolerations]` into the job pod spec, and how taints gate scheduling.
2. **Diagnose Pending**: for a job stuck Pending, walk the elimination: missing toleration for a node taint vs. unsatisfiable affinity vs. insufficient resources.
3. **config.toml tuning**: produce the corrected `[runners.kubernetes]` block that targets the intended pool with `node_selector` plus the matching `node_tolerations`, and adds `node_affinity` (under `[runners.kubernetes.affinity]`) as a soft preference.
4. **Isolation**: keep CI pods off control-plane and stateful pools, and use priority classes and resource requests so CI pods cannot evict other workloads.
5. **Per-job overrides**: show how to use pod-spec or overwrite variables so a heavy job can request the GPU pool without changing the global config.
6. **Cost angle**: bias jobs that tolerate interruption toward spot/preemptible pools while keeping release-critical jobs on stable nodes.

Output as: (a) the corrected `config.toml` kubernetes block, (b) a Pending-pod diagnosis tree, (c) a per-job override example, (d) an isolation/cost checklist.

A toleration alone does not force placement. Pair it with a node selector or affinity, or pods may still land anywhere.
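
For orientation, here is a minimal sketch of the kind of `[runners.kubernetes]` block the prompt asks for in output (a). It is illustrative only, not a drop-in config. The node label `pool = "ci-spot"`, the taint `dedicated=ci-spot:NoSchedule`, the `disktype` label, and the namespace are all hypothetical and should be replaced with the labels and taints actually set on your node pools.

```toml
[[runners]]
  executor = "kubernetes"

  [runners.kubernetes]
    namespace = "gitlab-runner"  # assumption: hypothetical namespace

    # Hard requirement: only nodes carrying this label are eligible.
    # Assumption: the CI pool is labeled pool=ci-spot.
    [runners.kubernetes.node_selector]
      "pool" = "ci-spot"

    # Lets job pods onto nodes tainted dedicated=ci-spot:NoSchedule.
    # Format is "key=value" = "effect". On its own this does not steer
    # pods to the pool; the node_selector above does that.
    [runners.kubernetes.node_tolerations]
      "dedicated=ci-spot" = "NoSchedule"

    # Soft preference: within the eligible nodes, favor SSD-backed ones.
    # Assumption: some nodes carry a disktype=ssd label.
    [runners.kubernetes.affinity]
      [runners.kubernetes.affinity.node_affinity]
        [[runners.kubernetes.affinity.node_affinity.preferred_during_scheduling_ignored_during_execution]]
          weight = 100
          [runners.kubernetes.affinity.node_affinity.preferred_during_scheduling_ignored_during_execution.preference]
            [[runners.kubernetes.affinity.node_affinity.preferred_during_scheduling_ignored_during_execution.preference.match_expressions]]
              key = "disktype"
              operator = "In"
              values = ["ssd"]
```

The selector and the toleration do different jobs here: the toleration only permits scheduling onto the tainted pool, while the selector restricts placement to it. The preferred affinity term influences ranking among eligible nodes and never leaves a pod Pending by itself.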