Kubernetes Mutating Webhook & Sidecar Injection Design Prompt
Design a production-grade MutatingAdmissionWebhook for sidecar/init injection — namespace opt-in, idempotency, failurePolicy, cert rotation, and ordering so you never double-inject or deadlock the control plane.
- Target user: Platform engineers building injection webhooks (agents, proxies, config)
- Difficulty: Advanced
- Tools: Claude, ChatGPT
The prompt
You are a Kubernetes platform engineer who has built and operated MutatingAdmissionWebhooks for sidecar and init-container injection (think proxy, logging agent, secret-fetcher). You know the failure modes that brick a cluster and design to avoid every one.

I will provide:
- What to inject (sidecar container, init container, volumes, env, annotations)
- Opt-in model desired (namespace label, pod annotation, both)
- The injector's runtime (Go controller-runtime, kubebuilder, or other)

Your job:
1. **Webhook config that won't brick the cluster**: set `failurePolicy: Ignore` (not `Fail`) while the injector is unproven; scope `namespaceSelector`/`objectSelector` so that kube-system and the injector's OWN namespace are excluded, otherwise the webhook can block creation of the very pods that serve it; choose `reinvocationPolicy` (`Never` or `IfNeeded`) deliberately; and pick `matchPolicy: Equivalent`.
2. **Idempotency**: guarantee you never double-inject. Gate on an injected-status annotation (`sidecar.example.com/status: injected`), check for the container by name before adding it, and make the patch a no-op on re-admission. Show the JSONPatch the webhook returns.
3. **Ordering**: native sidecars (init containers with `restartPolicy: Always`, enabled by default since Kubernetes 1.29) vs classic sidecars; where the injected init container or sidecar must sit relative to the user's containers; and interaction with other injectors (Istio) via reinvocation.
4. **TLS + cert rotation**: wire the `caBundle` (cert-manager's cainjector via the `cert-manager.io/inject-ca-from` annotation, or a bootstrap job), and plan rotation so that an expired cert cannot block all pod creation once `failurePolicy: Fail` is in effect.
5. **Opt-in/opt-out**: implement namespace-label opt-in plus a per-pod `inject: "false"` override, and document precedence.
6. **Resilience**: set a tight webhook `timeoutSeconds` (the default is 10 seconds, the maximum is 30), run the injector highly available (multiple replicas plus a PodDisruptionBudget), and add a kill-switch runbook (delete the MutatingWebhookConfiguration) for when it misbehaves.
7. **Observe + test**: metrics for inject/skip/error counts, and a fixture suite (already-injected, opted-out, kube-system, multi-injector) proving idempotency.

Output: the MutatingWebhookConfiguration YAML, the injection JSONPatch logic (pseudocode/Go), the cert-rotation plan, the kill-switch runbook, and the idempotency test matrix.

Bias toward: `failurePolicy: Ignore` until proven, never deadlocking yourself, idempotent patches, and an obvious kill switch.
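
To illustrate the idempotency requirement in step 2, here is a minimal Go sketch of the kind of patch logic a good response might contain. It is not a definitive implementation: the `Pod` and `Container` types are trimmed stand-ins for the real `corev1` types, and `sidecarName`, `BuildPatch`, and `MarshalPatch` are hypothetical names chosen for this example. Only the status annotation key comes from the prompt itself.

```go
package main

import (
	"encoding/json"
	"strings"
)

// statusAnnotation is the key named in the prompt; sidecarName is a
// hypothetical container name used only for this sketch.
const (
	statusAnnotation = "sidecar.example.com/status"
	sidecarName      = "injected-sidecar"
)

// Container and Pod are trimmed stand-ins for the corev1 types (assumption).
type Container struct {
	Name  string `json:"name"`
	Image string `json:"image"`
}

type Pod struct {
	Annotations map[string]string
	Containers  []Container
}

// PatchOp is a single RFC 6902 JSON Patch operation.
type PatchOp struct {
	Op    string      `json:"op"`
	Path  string      `json:"path"`
	Value interface{} `json:"value,omitempty"`
}

// escapePointer escapes a key for use in a JSON Pointer (RFC 6901):
// "~" becomes "~0" first, then "/" becomes "~1".
func escapePointer(key string) string {
	return strings.ReplaceAll(strings.ReplaceAll(key, "~", "~0"), "/", "~1")
}

// alreadyInjected reports true if either guard matches: the status
// annotation is set, or a container with the sidecar's name exists.
func alreadyInjected(p Pod) bool {
	if p.Annotations[statusAnnotation] == "injected" {
		return true
	}
	for _, c := range p.Containers {
		if c.Name == sidecarName {
			return true
		}
	}
	return false
}

// BuildPatch returns the operations to inject the sidecar, or an empty
// patch when the pod was already injected (re-admission is a no-op).
func BuildPatch(p Pod, sidecar Container) []PatchOp {
	if alreadyInjected(p) {
		return []PatchOp{}
	}
	ops := []PatchOp{{Op: "add", Path: "/spec/containers/-", Value: sidecar}}
	if p.Annotations == nil {
		// The annotations map does not exist yet, so create it whole.
		ops = append(ops, PatchOp{
			Op:    "add",
			Path:  "/metadata/annotations",
			Value: map[string]string{statusAnnotation: "injected"},
		})
	} else {
		ops = append(ops, PatchOp{
			Op:    "add",
			Path:  "/metadata/annotations/" + escapePointer(statusAnnotation),
			Value: "injected",
		})
	}
	return ops
}

// MarshalPatch serializes the operations as a JSON Patch document.
func MarshalPatch(ops []PatchOp) ([]byte, error) {
	return json.Marshal(ops)
}
```

Calling `BuildPatch` on a pod that has no status annotation and no container named `injected-sidecar` yields two `add` operations; calling it again on the patched pod yields an empty slice. In a real webhook an empty result would normally be returned as an allowed response with no patch attached.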
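
The precedence asked for in step 5 can be sketched in the same spirit. The following Go function is one possible ordering (excluded namespace, then pod opt-out, then namespace opt-in, then default skip), not the only valid one. The label and annotation keys, the `Decision` type, and the `Decide` function are assumptions made for illustration; the prompt only specifies a namespace label and a per-pod `inject: "false"` override.

```go
package main

// Hypothetical keys for this sketch; the prompt does not fix their names.
const (
	nsOptInLabel     = "sidecar.example.com/injection" // namespace label, value "enabled" opts in
	podOverrideAnnot = "sidecar.example.com/inject"    // pod annotation, value "false" opts out
)

// Decision records the outcome plus a reason that could be used as a
// label on an inject/skip counter metric.
type Decision struct {
	Inject bool
	Reason string
}

// Decide applies the precedence from highest to lowest:
//  1. excluded namespaces are never injected (defense in depth on top of
//     the namespaceSelector in the webhook configuration),
//  2. a per-pod "false" override beats namespace opt-in,
//  3. the namespace label opts the pod in,
//  4. everything else is skipped.
func Decide(namespace string, excluded map[string]bool, nsLabels, podAnnotations map[string]string) Decision {
	if excluded[namespace] {
		return Decision{Inject: false, Reason: "namespace-excluded"}
	}
	if podAnnotations[podOverrideAnnot] == "false" {
		return Decision{Inject: false, Reason: "pod-opted-out"}
	}
	if nsLabels[nsOptInLabel] == "enabled" {
		return Decision{Inject: true, Reason: "namespace-opted-in"}
	}
	return Decision{Inject: false, Reason: "not-opted-in"}
}
```

Returning a reason alongside the boolean keeps the decision testable and gives the skip counters in step 7 a ready-made label value.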