
Kubernetes Scheduler Extender Webhook Design Prompt

Design a scheduler extender webhook for filter/prioritize/preempt/bind hooks when in-tree plugins aren't enough, and decide when the scheduler-framework is the better path instead.

Target user: Engineers extending pod placement with external logic
Difficulty: Advanced
Tools: Claude, ChatGPT

The prompt

You are a senior Kubernetes scheduling engineer who has built scheduler extenders and knows they are HTTP webhooks called per scheduling cycle, that they run after in-tree plugins, and that a slow or failing extender stalls scheduling for every pod it touches.

I will provide:
- The placement decision I need external logic for (custom topology, external capacity system, license check)
- My current KubeSchedulerConfiguration and whether I've considered a custom scheduler-framework plugin instead
- My latency and availability budget for the scheduling path

Your job:

1. **Justify extender vs plugin**: recommend a custom scheduler-framework plugin (compiled into the scheduler binary) when the logic can live in-process, and reserve the extender webhook for logic that must call an external system.
2. **Choose the verbs**: map the need to `filterVerb`, `prioritizeVerb`, `preemptVerb`, and/or `bindVerb`, and explain that extenders run after in-tree filtering on the already-narrowed node set.
3. **Write the extender config**: produce the `extenders[]` block (`urlPrefix`, the verb fields, `weight`, `nodeCacheCapable`, `ignorable`, `httpTimeout`) in KubeSchedulerConfiguration.
4. **Define the request/response contract**: specify ExtenderArgs in and ExtenderFilterResult/HostPriorityList out for filter and prioritize, plus ExtenderPreemptionArgs/ExtenderPreemptionResult and ExtenderBindingArgs/ExtenderBindingResult if preempt or bind is used, including how failed nodes and scores are returned.
5. **Protect the scheduling path**: set `ignorable: true` and a tight `httpTimeout` so an extender outage degrades to default scheduling rather than blocking all pods, and explain the tradeoff.
6. **Verify**: give a way to confirm the extender is being called and to detect Pending pods caused by extender errors or timeouts.

Output as: (a) the KubeSchedulerConfiguration `extenders[]` snippet, (b) the request/response JSON contract per verb, and (c) verification and failure-mode diagnosis steps.

Mark as DESTRUCTIVE any step that deploys a non-ignorable extender, since an outage in it blocks scheduling for every pod the extender handles.
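
Example: filter and prioritize contract sketch

To give a sense of the contract the prompt asks for, here is a minimal Go sketch of the filter and prioritize sides of an extender. It is an illustration under stated assumptions, not the upstream implementation. The struct types are simplified local mirrors of the wire types, and their field names follow my recollection of `k8s.io/kube-scheduler/extender/v1`, so check them against that package before relying on them. The sketch also assumes `nodeCacheCapable: true` (the scheduler sends node names rather than full Node objects) and a 0 to 10 extender score range. The `/filter` and `/prioritize` paths, `newExtenderMux`, and the `allowed` and `score` callbacks are hypothetical names standing in for your verbs and external system.

```go
package main

import (
	"encoding/json"
	"net/http"
)

// ExtenderArgs is a trimmed, assumed mirror of the filter/prioritize request.
// Only NodeNames is modelled, which presumes nodeCacheCapable: true.
type ExtenderArgs struct {
	Pod       json.RawMessage // pod being scheduled, left undecoded in this sketch
	NodeNames *[]string       // candidate nodes that survived in-tree filtering
}

// ExtenderFilterResult is the assumed filter response shape.
type ExtenderFilterResult struct {
	NodeNames   *[]string         // nodes the pod may still be placed on
	FailedNodes map[string]string // rejected node -> human-readable reason
	Error       string            // non-empty signals an extender-side failure
}

// HostPriority is one entry of the prioritize response list.
type HostPriority struct {
	Host  string
	Score int64
}

// filterNodes splits the candidates into passing and failed nodes.
// allowed is a hypothetical stand-in for the external check.
func filterNodes(args ExtenderArgs, allowed func(node string) (bool, string)) ExtenderFilterResult {
	passed := []string{}
	failed := map[string]string{}
	if args.NodeNames != nil {
		for _, n := range *args.NodeNames {
			if ok, reason := allowed(n); ok {
				passed = append(passed, n)
			} else {
				failed[n] = reason
			}
		}
	}
	return ExtenderFilterResult{NodeNames: &passed, FailedNodes: failed}
}

// prioritizeNodes scores every candidate and clamps to the assumed 0..10 range.
func prioritizeNodes(args ExtenderArgs, score func(node string) int64) []HostPriority {
	out := []HostPriority{}
	if args.NodeNames == nil {
		return out
	}
	for _, n := range *args.NodeNames {
		s := score(n)
		if s < 0 {
			s = 0
		}
		if s > 10 {
			s = 10
		}
		out = append(out, HostPriority{Host: n, Score: s})
	}
	return out
}

// newExtenderMux wires both verbs under hypothetical paths.
func newExtenderMux(allowed func(string) (bool, string), score func(string) int64) *http.ServeMux {
	mux := http.NewServeMux()
	mux.HandleFunc("/filter", func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "application/json")
		var args ExtenderArgs
		if err := json.NewDecoder(r.Body).Decode(&args); err != nil {
			// Report the decode failure in-band through the Error field.
			json.NewEncoder(w).Encode(ExtenderFilterResult{Error: err.Error()})
			return
		}
		json.NewEncoder(w).Encode(filterNodes(args, allowed))
	})
	mux.HandleFunc("/prioritize", func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "application/json")
		var args ExtenderArgs
		if err := json.NewDecoder(r.Body).Decode(&args); err != nil {
			http.Error(w, err.Error(), http.StatusBadRequest)
			return
		}
		json.NewEncoder(w).Encode(prioritizeNodes(args, score))
	})
	return mux
}
```

A `main` function would serve the mux with `http.ListenAndServe` on whatever address `urlPrefix` points at. Keeping `filterNodes` and `prioritizeNodes` free of HTTP concerns makes it easy to unit-test the placement logic without a running scheduler, and to enforce the latency budget in one place around the external call.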
