AI for Kubernetes & Helm

Kubernetes Client QPS and Burst Throttling Tuning Prompt

Diagnose and fix client-side rate limiting in controllers, operators, and kubectl — the 'client-side throttling, waiting' / 'Waited for Ns due to client-side throttling' slowdowns — by tuning QPS/Burst against apiserver capacity.

Target user: operator developers and platform engineers running custom controllers at scale
Difficulty: Intermediate
Tools: Claude, ChatGPT

The prompt

You are a senior Kubernetes engineer who understands client-go's token-bucket rate limiter, the relationship between client-side QPS/Burst and server-side API Priority and Fairness, and how to tell which side is throttling a slow controller.

I will provide:
- The component being throttled (kubectl, a controller-runtime manager, or a custom client-go controller)
- Log lines like `Waited for ... due to client-side throttling` or slow reconcile durations
- Current `rest.Config` QPS/Burst settings and reconcile concurrency

Your job:

1. **Confirm it is client-side** — distinguish the client-go token-bucket message (`client-side throttling`) from server-side HTTP 429 responses issued by API Priority and Fairness (APF); the two have opposite fixes.
2. **Explain the defaults** — note client-go's long-standing defaults (QPS 5 / Burst 10, applied when `rest.Config` leaves both fields at zero) and that controller-runtime may set its own; show how low defaults cap a busy controller's request rate.
3. **Size QPS and Burst** — recommend values based on objects watched, reconcile concurrency (`MaxConcurrentReconciles`), and resync interval, leaving Burst at roughly 2x QPS for spikes.
4. **Prefer caches over raw calls** — point out that an informer-backed cache (controller-runtime client reads from cache) avoids most apiserver round-trips, often making QPS tuning unnecessary.
5. **Respect the server** — warn that raising client QPS just shifts pressure to the apiserver and API Priority and Fairness, which may then respond with 429s; coordinate any increase with the cluster's APF configuration (FlowSchema and PriorityLevelConfiguration objects).
6. **Modernize** — mention that client-go can disable the legacy client-side limiter (for example, by setting a negative `QPS` in `rest.Config`) in favor of server-side flow control, and explain when that is appropriate.

Output as: a determination of which throttle is firing, recommended QPS/Burst values with justification, and a note on whether caching removes the need entirely.

Never crank client QPS arbitrarily high to 'fix' slowness — you can overwhelm a shared apiserver and degrade every tenant on the cluster.
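
One way to keep that rule honest is to derive QPS from a target and clamp it to an agreed ceiling. The sketch below is an assumption-laden illustration, not a recommended formula: `suggestQPS`, the target duration, and the `maxQPS` ceiling are hypothetical, and the model again assumes a full bucket and back-to-back requests with Burst held at 2x QPS.

```go
package main

import (
	"fmt"
	"math"
)

// suggestQPS returns a QPS/Burst pair that lets n back-to-back requests
// finish within targetSeconds while keeping Burst at 2x QPS. The result is
// clamped to maxQPS, a hypothetical ceiling agreed with the apiserver owners.
func suggestQPS(n int, targetSeconds, maxQPS float64) (qps float64, burst int, capped bool) {
	if targetSeconds < 0 {
		targetSeconds = 0
	}
	// With burst = 2*qps, n requests take (n - 2*qps)/qps seconds,
	// so solving for qps gives n / (targetSeconds + 2).
	qps = math.Ceil(float64(n) / (targetSeconds + 2))
	if qps > maxQPS {
		qps, capped = maxQPS, true
	}
	return qps, int(2 * qps), capped
}

func main() {
	fmt.Println(suggestQPS(500, 30, 50))  // prints: 16 32 false
	fmt.Println(suggestQPS(5000, 10, 50)) // prints: 50 100 true
}
```

When `capped` comes back true, the target cannot be met within the ceiling, which is a signal to cut request volume (for example, by reading from the informer cache) instead of raising the limit further.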
