
Redis Rate Limiter Design Prompt

Design fixed window, sliding window, and token bucket rate limiters in Redis using INCR/EXPIRE or atomic Lua, avoiding race conditions and TTL bugs.

Target user
API and platform engineers building throttling
Difficulty
Intermediate
Tools
Claude, ChatGPT

The prompt

You are a senior platform engineer who has built production rate limiters on Redis and knows the concurrency traps.

I will provide:
- The limit I want to enforce (requests per window, per key)
- The identity being limited (user, IP, API key, tenant)
- My current approach if any

Your job:

1. **Pick an algorithm**: fixed window (INCR + EXPIRE, cheap but bursty at boundaries), sliding window log (ZSET of timestamps, accurate but memory-heavy), sliding window counter (weighted two windows), or token bucket (smooth, allows bursts).
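2. **Fix the classic INCR/EXPIRE race**: if you INCR then EXPIRE as two commands, a crash between them leaves a key with no TTL that never resets. Either INCR and set EXPIRE only when the counter is first created (INCR returns 1) inside one Lua script, or seed the key with `SET key 0 EX <win> NX` and then INCR. Seed with 0, not 1, so the first request is not counted twice. The two-command form still leaves a small gap if the key expires between SET and INCR, so prefer Lua when that matters.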
3. **Make it atomic**: for anything beyond fixed window, use a single Lua script so read-decide-write cannot interleave across clients.
4. **Token bucket in Lua**: store tokens + last-refill timestamp in a hash; on each request compute refill = elapsed * rate, cap at capacity, allow if tokens >= 1, decrement.
5. **Sliding window log**: ZADD now (score = timestamp, member = a unique ID so same-millisecond requests are not merged), ZREMRANGEBYSCORE to drop old entries, ZCARD to count, set a TTL on the ZSET.
6. **Key design**: `rl:<identity>:<window>`; keep TTLs short so keys self-clean.
7. **Return useful signals**: remaining quota and reset time for X-RateLimit headers.
8. **Plan for Cluster**: keep all keys for one limiter on one slot with hash tags; consider per-node vs global limits.

Mark DESTRUCTIVE: FLUSHDB to reset counters on a shared instance, mass DEL of live rl:* keys, CONFIG changes to maxmemory-policy while testing.

---

Limit to enforce: [DESCRIBE]
Identity limited: [DESCRIBE]
Current approach: [DESCRIBE]

Why this prompt works

Rate limiters look trivial and are full of races. The INCR/EXPIRE gap, boundary bursts, and non-atomic read-decide-write bugs all pass casual review and fail under load. This prompt makes the model pick an algorithm deliberately and prove atomicity before you deploy a limiter that leaks or over-blocks.

How to use it

  1. State the exact limit (N per window) and the identity dimension.
  2. Say whether boundary bursts are acceptable; the answer decides fixed vs sliding window.
  3. Ask for a Lua implementation for anything beyond fixed window.
  4. Confirm the eviction policy on the instance holding limiter keys.
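
To judge whether boundary bursts matter for a given limit, it can help to see how fixed windows are cut. The sketch below is a minimal illustration, assuming keys that embed a window index; `window_index` and `window_key` are hypothetical helper names, not part of Redis. Two requests one second apart can land in different windows, so a client can spend a full quota on each side of the boundary.

```shell
# Hypothetical helpers (assumption: the key embeds the fixed-window index).

# window_index NOW_S WINDOW_S -> index of the fixed window containing NOW_S
window_index() {
  echo $(( $1 / $2 ))
}

# window_key IDENTITY NOW_S WINDOW_S -> counter key for that window
window_key() {
  echo "rl:$1:$(window_index "$2" "$3")"
}

# Example with a 60-second window:
#   window_key user:42 119 60   prints rl:user:42:1
#   window_key user:42 120 60   prints rl:user:42:2
# A client that sends N requests at t=119 and N more at t=120 gets 2N
# through in about one second, because each burst hits a fresh counter.
```

If that outcome is unacceptable for the limit in question, a sliding window or token bucket is likely the better fit.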

Useful commands

# Fixed window: seed the counter with a TTL once, then count every hit
redis-cli SET rl:user:42:1m 0 EX 60 NX     # creates the counter at 0 with a TTL; no-op if it already exists
redis-cli INCR rl:user:42:1m               # run on every hit, including the first
redis-cli TTL rl:user:42:1m                # seconds until the window resets

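# Sliding window log with a sorted set (separate commands shown for clarity; wrap them in one Lua script for atomicity)
NOW=$(date +%s%3N)                                      # epoch milliseconds; %3N needs GNU date
redis-cli ZADD rl:ip:1.2.3.4 "$NOW" "$NOW-$$-$RANDOM"   # unique member so same-millisecond hits are not merged
redis-cli ZREMRANGEBYSCORE rl:ip:1.2.3.4 0 $((NOW-60000))
redis-cli ZCARD rl:ip:1.2.3.4
redis-cli PEXPIRE rl:ip:1.2.3.4 60000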
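
The sliding window counter named in the prompt (two weighted fixed windows) has no command listing here, so this is a rough sketch of its arithmetic only. It assumes the two counters have already been read (for example with GET) and that times are integer milliseconds; `sliding_estimate` is a hypothetical helper name.

```shell
# Hypothetical helper: estimate the request count over a sliding window
# from the previous and current fixed-window counters.
# Args: PREV_COUNT CURR_COUNT WINDOW_MS ELAPSED_MS
# Assumption: 0 <= ELAPSED_MS < WINDOW_MS (time spent in the current window).
sliding_estimate() {
  local prev=$1 curr=$2 window=$3 elapsed=$4
  # The previous window is weighted by the share of it that still
  # overlaps the sliding window; integer division rounds down.
  echo $(( curr + prev * (window - elapsed) / window ))
}

# Example: 100 hits in the previous minute, 20 so far in this one,
# 15 seconds into the current window:
#   sliding_estimate 100 20 60000 15000   prints 95
```

A request would be allowed when the estimate is below the limit. The estimate assumes traffic in the previous window was evenly spread, which is the accuracy trade-off against the sorted-set log.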

Example config

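# Atomic token bucket in Lua
# KEYS[1] = bucket hash, ARGV: rate/sec, capacity, now(ms), requested tokens
# The client supplies now(ms), so app-node clocks should be in sync; %3N needs GNU date.
redis-cli EVAL '
local b = redis.call("HMGET", KEYS[1], "tokens", "ts")
local rate = tonumber(ARGV[1])
local cap = tonumber(ARGV[2])
local now = tonumber(ARGV[3])
local want = tonumber(ARGV[4])
-- A missing bucket starts full, with "now" as its last refill time.
local tokens = tonumber(b[1]) or cap
local ts = tonumber(b[2]) or now
-- If this client clock is behind the stored timestamp, treat elapsed time as zero.
if now < ts then now = ts end
tokens = math.min(cap, tokens + (now - ts) / 1000 * rate)
local allowed = 0
if tokens >= want then tokens = tokens - want; allowed = 1 end
-- HSET accepts multiple fields since Redis 4.0; HMSET is deprecated.
redis.call("HSET", KEYS[1], "tokens", tokens, "ts", now)
-- Expire idle buckets after the time a full refill takes.
redis.call("PEXPIRE", KEYS[1], math.ceil(cap / rate * 1000))
return allowed' 1 rl:tb:user:42 5 10 "$(date +%s%3N)" 1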
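
Clients back off more gracefully when a denial comes with a wait time. As a sketch of that calculation, assuming the bucket balance is tracked in thousandths of a token so the math stays in integers (`retry_after_ms` is a hypothetical helper, and its result would typically feed a Retry-After or X-RateLimit-Reset header):

```shell
# Hypothetical helper: milliseconds until WANT tokens are available.
# Args: TOKENS_MILLI WANT RATE_PER_S
#   TOKENS_MILLI  current balance in thousandths of a token (assumption)
#   WANT          whole tokens the request needs
#   RATE_PER_S    refill rate in tokens per second (must be > 0)
retry_after_ms() {
  local tokens_milli=$1 want=$2 rate=$3
  local deficit=$(( want * 1000 - tokens_milli ))
  if [ "$deficit" -le 0 ]; then
    echo 0            # enough tokens already, no wait needed
    return
  fi
  # RATE_PER_S tokens per second equals RATE_PER_S milli-tokens per
  # millisecond, so the wait is deficit / rate, rounded up.
  echo $(( (deficit + rate - 1) / rate ))
}

# Example: 0.25 tokens left, 1 token wanted, refill at 5 tokens/sec:
#   retry_after_ms 250 1 5   prints 150
```

The same calculation could live inside the Lua script so the allow decision and the wait time come back in one round trip.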

Common findings this catches

  • INCR/EXPIRE race → immortal counter blocks a key.
  • Boundary burst → 2x limit at window edges.
  • Non-atomic path → concurrent overshoot.
  • Unbounded ZSET → memory growth on log limiter.
  • Evictable keys → limit resets early under memory pressure.
  • Cluster scatter → global limit split across slots.
  • No reset headers → clients cannot back off gracefully.
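
For the Cluster scatter finding, Redis Cluster hashes only the hash tag, the text between the first `{` and the next `}`, when that text is non-empty. The sketch below mimics that rule in shell so key names can be checked before deployment; `hash_tag` is a hypothetical helper and the key names are examples, not a required scheme.

```shell
# Hypothetical helper: print the part of KEY that Redis Cluster hashes.
# Rule: text between the first "{" and the first "}" after it, if that
# text is non-empty; otherwise the whole key.
hash_tag() {
  local key=$1 rest tag
  case $key in
    *"{"*) ;;                      # has an opening brace, keep going
    *) echo "$key"; return ;;      # no tag: whole key is hashed
  esac
  rest=${key#*\{}                  # text after the first "{"
  case $rest in
    *"}"*) ;;                      # has a closing brace after it
    *) echo "$key"; return ;;
  esac
  tag=${rest%%\}*}                 # text before the first "}"
  if [ -n "$tag" ]; then echo "$tag"; else echo "$key"; fi
}

# Keys for one limiter share a tag, so they map to the same slot:
#   hash_tag "rl:{user:42}:curr"   prints user:42
#   hash_tag "rl:{user:42}:prev"   prints user:42
# Without a tag the full key is hashed and related keys can scatter:
#   hash_tag "rl:user:42:curr"     prints rl:user:42:curr
```

Keys that share a tag can be passed together to one Lua script in Cluster mode, which is what the multi-key limiters above rely on.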

When to escalate

  • Global limits across many app nodes — consider a dedicated limiter instance.
  • Very high request rates — evaluate cell-local limits plus async aggregation.
  • Abuse/DDoS scenarios — combine with edge/WAF layers, not Redis alone.
