AI for Bash & Python Automation

Python Token Bucket Rate Limiter Prompt

Implement a correct, thread-safe (and asyncio-friendly) token bucket rate limiter in Python to throttle outbound API calls, respect provider quotas, and smooth bursts — with tests and a clean decorator/context-manager API.

Target user: Python engineers calling rate-limited APIs from automation, workers, or async clients
Difficulty: Advanced
Tools: Claude, ChatGPT

The prompt

You are a senior Python engineer who has built rate limiters that survive 50-worker fan-out against APIs with strict quotas. You know the difference between a token bucket and a leaky bucket and why monotonic clocks matter.

I will provide:
- The API limits I must respect (e.g. 10 req/s, 600 req/min, burst of 20)
- Whether my callers are threads, processes, asyncio tasks, or a mix
- Whether the limit is per-process or must be shared across processes/hosts

Your job:

1. **Choose the algorithm** — recommend token bucket vs. sliding-window and justify it for my limits. Explain how `rate` and `capacity` (burst) map to my numbers.

2. **Core implementation** — refill lazily using `time.monotonic()` (never `time.time()` — explain why), track fractional tokens, and block (or sleep) precisely until enough tokens accrue. No busy-waiting.

3. **Thread safety** — guard state with a `threading.Lock`; show the exact critical section. For asyncio, provide an `asyncio.Lock` variant with `await asyncio.sleep()` instead of blocking.

4. **Ergonomic API** — expose three usages: `limiter.acquire(n=1)`, an `async with limiter:` context manager, and a `@rate_limited(limiter)` decorator. Keep them sharing one core.

5. **Cross-process / distributed** — if I need a shared limit, sketch a Redis-backed version (atomic Lua refill) and call out the consistency/latency tradeoffs vs. the in-process version.

6. **Backpressure + timeouts** — support `acquire(timeout=...)` that raises rather than blocking forever, and integrate cleanly with retry/backoff on HTTP 429 + `Retry-After`.

7. **Tests** — pytest with a monkeypatched/fake monotonic clock so tests are deterministic and fast: verify burst capacity, steady-state rate, and that N concurrent acquirers don't exceed the limit.

Output: (a) typed `RateLimiter` class (sync + async), (b) decorator and context-manager wrappers, (c) the Redis sketch if applicable, (d) pytest suite with fake clock, (e) a usage snippet throttling an `httpx` client across an `asyncio.gather` fan-out.

Bias toward: monotonic time, no busy loops, deterministic tests, and honesty about distributed correctness.
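
For judging a model's answer to steps 2, 3, 6, and 7, it can help to have a small baseline in mind. The following is a minimal sketch, not a definitive implementation: the names `TokenBucket`, `_reserve`, and `FakeClock`, and the injectable `clock` and `sleep` parameters, are assumptions made for illustration. It refills lazily from `time.monotonic()`, keeps fractional tokens, and sleeps once outside the lock instead of polling.

```python
import threading
import time
from typing import Callable, Optional


class TokenBucket:
    """Thread-safe token bucket (illustrative sketch; names are assumptions)."""

    def __init__(
        self,
        rate: float,      # tokens added per second
        capacity: float,  # maximum burst size
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        if rate <= 0 or capacity <= 0:
            raise ValueError("rate and capacity must be positive")
        self.rate = rate
        self.capacity = capacity
        self._clock = clock
        self._sleep = sleep
        self._tokens = capacity  # start full so an initial burst is allowed
        self._last = clock()
        self._lock = threading.Lock()

    def _reserve(self, n: float, timeout: Optional[float]) -> Optional[float]:
        """Return seconds the caller must wait, or None if that exceeds timeout."""
        with self._lock:  # critical section: refill, decide, deduct
            now = self._clock()
            elapsed = now - self._last
            self._tokens = min(self.capacity, self._tokens + elapsed * self.rate)
            self._last = now
            wait = max(0.0, (n - self._tokens) / self.rate)
            if timeout is not None and wait > timeout:
                return None  # give up without deducting any tokens
            # Deduct now; a negative balance is a reservation that queues
            # later callers behind this one.
            self._tokens -= n
            return wait

    def acquire(self, n: float = 1, timeout: Optional[float] = None) -> None:
        """Block until n tokens are available; raise TimeoutError if too slow."""
        if n > self.capacity:
            raise ValueError("n exceeds bucket capacity")
        wait = self._reserve(n, timeout)
        if wait is None:
            raise TimeoutError(f"could not acquire {n} token(s) within {timeout}s")
        if wait > 0:
            self._sleep(wait)  # one sleep outside the lock, no polling


class FakeClock:
    """Deterministic clock for tests: sleeping just advances `now`."""

    def __init__(self) -> None:
        self.now = 0.0

    def __call__(self) -> float:
        return self.now

    def sleep(self, seconds: float) -> None:
        self.now += seconds
```

Passing `clock=fake` and `sleep=fake.sleep` (with `fake = FakeClock()`) lets a test assert on simulated time rather than wall-clock time. Using `time.monotonic()` as the default matters because `time.time()` can jump when the system clock is adjusted, which would either grant a spurious burst or stall refills.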

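For the asyncio side of steps 3 and 4, a comparable baseline might look like the sketch below. It is one possible shape under stated assumptions: `AsyncTokenBucket`, the injectable `clock` and `sleep` parameters, and this particular `rate_limited` body are illustrative names, not an established API. The lock is held while sleeping, so waiters are served in arrival order.

```python
import asyncio
import functools
import time
from typing import Any, Awaitable, Callable


class AsyncTokenBucket:
    """asyncio token bucket (illustrative sketch; names are assumptions).

    On Python < 3.10, construct it inside a running event loop, because
    asyncio.Lock binds to the current loop at creation time there.
    """

    def __init__(
        self,
        rate: float,      # tokens added per second
        capacity: float,  # maximum burst size
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], Awaitable[Any]] = asyncio.sleep,
    ) -> None:
        if rate <= 0 or capacity <= 0:
            raise ValueError("rate and capacity must be positive")
        self.rate = rate
        self.capacity = capacity
        self._clock = clock
        self._sleep = sleep
        self._tokens = capacity
        self._last = clock()
        self._lock = asyncio.Lock()

    def _refill(self) -> None:
        # Lazy refill: credit tokens for the time elapsed since the last call.
        now = self._clock()
        self._tokens = min(self.capacity, self._tokens + (now - self._last) * self.rate)
        self._last = now

    async def acquire(self, n: float = 1) -> None:
        if n > self.capacity:
            raise ValueError("n exceeds bucket capacity")
        async with self._lock:  # held across the sleep so waiters queue in order
            self._refill()
            deficit = n - self._tokens
            if deficit > 0:
                await self._sleep(deficit / self.rate)  # yields to the event loop
                self._refill()
            self._tokens -= n

    async def __aenter__(self) -> "AsyncTokenBucket":
        await self.acquire()
        return self

    async def __aexit__(self, exc_type: Any, exc: Any, tb: Any) -> None:
        return None  # nothing to release: the token is spent on entry


def rate_limited(limiter: AsyncTokenBucket) -> Callable[..., Any]:
    """Decorator that takes one token before each call of an async function."""

    def decorator(func: Callable[..., Awaitable[Any]]) -> Callable[..., Awaitable[Any]]:
        @functools.wraps(func)
        async def wrapper(*args: Any, **kwargs: Any) -> Any:
            async with limiter:
                return await func(*args, **kwargs)

        return wrapper

    return decorator
```

A coroutine decorated with `@rate_limited(limiter)` can then be fanned out with `asyncio.gather`; one that calls an `httpx.AsyncClient` would follow the same pattern. The decorator and the `async with` form both go through `acquire`, so all three entry points share one core.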