NGINX Rate & Connection Limiting Prompt
Design limit_req and limit_conn zones that throttle abusive traffic and protect your backend — with burst, nodelay, and per-route tuning — without rate-limiting your own legitimate users into 503s.
- Target user: Engineers protecting an API or login endpoint from floods and brute-force traffic
- Difficulty: Advanced
- Tools: Claude, ChatGPT, Cursor
The prompt
You are a senior platform engineer who tunes NGINX rate limiting for real traffic. You know `limit_req` (leaky bucket) from `limit_conn` (concurrent connections), and you size zones and bursts from actual request patterns, not vibes.
I will provide:
- The endpoints to protect and their normal traffic shape (RPS, burstiness): [DESCRIBE TRAFFIC]
- The abuse I'm seeing or fear (login brute-force, scraping, flood): [DESCRIBE THREAT]
- Whether I sit behind a CDN/LB (so `$binary_remote_addr` may be the proxy): [DESCRIBE FRONTING]
- Current limiting config, if any: [PASTE CONFIG]
Design it:
1. **Choose the key.** `$binary_remote_addr` for direct clients; if behind a CDN/LB, recover the real client IP from `X-Forwarded-For` via a `map`/`realip` setup that trusts only the known proxy addresses (the header is client-spoofable otherwise), and explain why limiting on the proxy IP would throttle everyone.
2. **limit_req zones** — define `limit_req_zone` in `http {}` with a sane `rate` and memory size, then apply `limit_req` per `location` with `burst=` and a reasoned choice of `nodelay` vs delayed. Explain the leaky-bucket behavior so the user understands what burst actually permits.
3. **limit_conn zones** — `limit_conn_zone` + `limit_conn` to cap concurrent connections per key for slow-drip/Slowloris-style abuse; pair with timeouts.
4. **Per-route policy** — stricter limits on `/login` and auth endpoints, looser on static; show the layered config.
5. **Response + observability** — set `limit_req_status` / `limit_conn_status` (e.g. 429), and explain how to read the `limiting requests` lines in error.log so I can tune without flying blind.
6. **Avoid self-DoS** — explain how to test that the limits won't trip on legitimate bursts before enforcing.
Output: (a) the `http {}` zone definitions, (b) the per-`location` directives with comments, (c) a tuning rationale tying rate/burst to my stated traffic, (d) a test plan (e.g. a controlled load test). Validate with `nginx -t` and reload; roll out in log-only/observe mode first (e.g. `limit_req_dry_run on`) rather than hot-enforcing on live prod.
Why this prompt works
The number-one rate-limiting bug is keying on the wrong IP: behind a CDN or load balancer, `$binary_remote_addr` is the proxy's address, so every user arriving through that proxy shares one counter and a single noisy client trips the limit for all of them. The prompt makes this the first decision and forces the real-client-IP setup when fronted, eliminating the most common self-inflicted outage.
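The real-client-IP setup described above might look like the following fragment. It is a minimal sketch, not a drop-in config: the `10.0.0.0/8` range and the zone name `per_client` are placeholders, and it assumes NGINX was built with `ngx_http_realip_module` (common in distribution packages, but not part of the default source build).

```nginx
http {
    # Assumption: the CDN/LB connects to NGINX from 10.0.0.0/8.
    # Replace this with your provider's actual ranges. Only addresses
    # listed here are allowed to override the client IP via the header.
    set_real_ip_from  10.0.0.0/8;
    real_ip_header    X-Forwarded-For;
    # Walk the header right to left and stop at the first untrusted hop.
    real_ip_recursive on;

    # With realip in place, $binary_remote_addr is the end client,
    # so each client gets its own counter instead of sharing the proxy's.
    limit_req_zone $binary_remote_addr zone=per_client:10m rate=10r/s;
}
```

Restricting `set_real_ip_from` to the proxy's addresses is the design choice that matters: without it, any client could send its own `X-Forwarded-For` value and pick the key it is limited on.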
`limit_req` and `limit_conn` solve different attacks: one shapes request rate with a leaky bucket, the other caps concurrent connections against slow-drip floods. The semantics of `burst` and `nodelay` are routinely misunderstood. By requiring the model to explain the bucket behavior and justify burst sizing against your stated traffic, you get limits you can defend rather than limits you copied.
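To make the two mechanisms concrete, here is a hedged sketch of both applied to a login route. The zone names, the rates, and the `backend` upstream address are illustrative assumptions, not recommendations; real values should come from your measured traffic.

```nginx
http {
    # Hypothetical upstream; point this at your real application.
    upstream backend {
        server 127.0.0.1:8080;
    }

    # Request-rate zone: 5r/s is tracked as one request per 200 ms per key.
    limit_req_zone  $binary_remote_addr zone=login_rps:10m rate=5r/s;
    # Concurrency zone: counts simultaneous connections per key.
    limit_conn_zone $binary_remote_addr zone=per_ip_conn:10m;

    server {
        listen 80;

        location /login {
            # burst=10 tolerates up to 10 requests above the rate.
            # With nodelay they are forwarded immediately; without it
            # they are queued and released at 5r/s. Requests beyond the
            # burst are rejected. Burst capacity frees up again at the
            # zone rate, so nodelay does not raise the sustained limit.
            limit_req  zone=login_rps burst=10 nodelay;

            # At most 10 concurrent connections per client IP.
            limit_conn per_ip_conn 10;

            proxy_pass http://backend;
        }
    }
}
```

A useful way to read `burst` is as queue depth rather than extra rate: it decides how many over-limit requests are tolerated at once, while `rate` still governs the long-run average.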
The observe-first rollout and the `error.log` tuning step keep a human in control. You watch real `limiting requests` lines and load-test legitimate bursts before enforcing, gated by `nginx -t`, so the protection never becomes the outage.
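One way to implement the observe-first rollout is NGINX's dry-run mode, sketched below. This assumes NGINX 1.17.1 or later for `limit_req_dry_run`; the zone name, rates, and upstream address are placeholders.

```nginx
http {
    # Hypothetical upstream; point this at your real application.
    upstream backend {
        server 127.0.0.1:8080;
    }

    limit_req_zone $binary_remote_addr zone=api_rps:10m rate=20r/s;

    server {
        listen 80;

        location /api/ {
            limit_req zone=api_rps burst=40;

            # Dry run: over-limit requests are still counted and logged,
            # but nothing is delayed or rejected.
            limit_req_dry_run on;

            # Status returned once enforcement is switched on
            # (the default is 503).
            limit_req_status 429;

            # Severity of the "limiting requests" entries in error.log.
            limit_req_log_level warn;

            proxy_pass http://backend;
        }
    }
}
```

After `nginx -t` passes and the config is reloaded, the `limiting requests` entries in `error.log` show which clients would have been throttled. Once legitimate bursts no longer appear there, removing `limit_req_dry_run on` switches to enforcement.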