NGINX Upstream Health Checks & Load Balancing Prompt
Design an upstream block with the right load-balancing algorithm, passive health checks, and failover tuning, so a single sick backend gets quietly removed from rotation instead of poisoning your error rate.
- Target user: Engineers load-balancing multiple backend instances behind NGINX
- Difficulty: Advanced
- Tools: Claude, ChatGPT, Cursor
The prompt
You are a senior infrastructure engineer who tunes NGINX upstreams for resilience. You know the difference between the load-balancing algorithms, you know passive health checks (open source) from active ones (NGINX Plus), and you never ship `max_fails=0` by accident.
I will provide:
- The backend instances (count, addresses, capacity differences): [DESCRIBE BACKENDS]
- Whether sessions are sticky / stateful: [STATELESS / STICKY + DETAILS]
- The NGINX flavor: [open source / NGINX Plus]
- Traffic shape (steady, bursty, long requests): [DESCRIBE]
- How a backend typically fails (crash, slow, 5xx): [DESCRIBE]
Build the config:
1. **Algorithm choice** — recommend round-robin, `least_conn`, or `ip_hash`/hash for stickiness, and justify it from the traffic shape and session needs. Explain the trade-off of each in one line.
2. **Weights** — if instances differ in capacity, show `weight=` and explain how it skews distribution.
3. **Passive health checks** — set `max_fails` and `fail_timeout` so a backend that returns errors or times out is taken out of rotation, then probed again. Explain exactly what counts as a "fail" and how `proxy_next_upstream` interacts with this.
4. **Failover behavior** — configure `proxy_next_upstream` (and its `_tries`/`_timeout` caps) so a failed request retries another backend without retrying forever or duplicating non-idempotent writes.
5. **Active checks (if NGINX Plus)** — show the `health_check` directive and a `match {}` block validating status and body; otherwise state clearly that open-source NGINX only has passive checks and suggest an external prober.
Output: (a) the complete commented `upstream {}` block plus the relevant `location` directives, (b) a table of each resilience directive and the failure it guards against, (c) a note on how to observe ejections in the error log, plus the `nginx -t` line. Validate with `nginx -t` and reload — do not edit a live prod upstream in place.
Why this prompt works
Load balancing looks trivial (list a few servers in an upstream block and you’re done), which is exactly why it fails badly. The defaults give you plain round-robin with only minimal passive checking: max_fails=1 and fail_timeout=10s, with just connection errors and timeouts counted as failures. A backend that’s up but returning 500s therefore keeps getting its full share of traffic, one-third of it if you run three equal servers. This prompt forces the two decisions that actually matter: the algorithm (which depends on whether your sessions are sticky and your requests are uniform) and the passive health-check thresholds (max_fails/fail_timeout) that pull a sick instance out of rotation.
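As a rough illustration of those two decisions, the sketch below pairs an algorithm choice with explicit passive thresholds. The upstream name, addresses, weights, and numbers are placeholder assumptions, not recommendations; the block belongs inside the `http {}` context.

```nginx
# Sketch only: app_backend, the addresses, and every threshold are assumed values.
upstream app_backend {
    # Prefer the server with the fewest active connections; useful when
    # request durations vary. Remove this line to fall back to round-robin.
    least_conn;

    # weight=2 sends this server roughly twice the share of the others.
    # max_fails=3 fail_timeout=30s: after 3 failed attempts within 30s,
    # the server is skipped for 30s, then tried again with live traffic.
    server 10.0.0.11:8080 weight=2 max_fails=3 fail_timeout=30s;
    server 10.0.0.12:8080 max_fails=3 fail_timeout=30s;
    server 10.0.0.13:8080 max_fails=3 fail_timeout=30s;
}
```

Note that `max_fails=0` would switch failure accounting off for that server entirely, which is why the prompt calls it out.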
The retry behavior is the subtle trap. proxy_next_upstream is what lets NGINX retry a failed request on another backend. By default NGINX will not retry a non-idempotent request (POST, LOCK, PATCH) once it has been sent to a backend, but if you enable that with the non_idempotent parameter you can duplicate a payment or a write when a backend is merely slow rather than dead. Making the model reason about idempotency and cap the retries turns a foot-gun into a deliberate, bounded policy.
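A bounded retry policy might look like the following sketch. It assumes an upstream named `app_backend` is defined elsewhere, and the timeout and retry values are illustrative guesses to be tuned against real traffic.

```nginx
# Sketch only: app_backend and all numeric values are assumptions.
location / {
    proxy_pass http://app_backend;

    # Retry on connection errors, timeouts, and selected gateway errors.
    # The http_5xx conditions listed here also start counting toward
    # max_fails; they do not count unless they are listed.
    # non_idempotent is deliberately omitted, so a POST that has already
    # been sent to a backend is not replayed on another one.
    proxy_next_upstream error timeout http_502 http_503 http_504;

    # Cap the number of tries and the total time spent moving a request
    # between servers, so retries cannot run indefinitely.
    proxy_next_upstream_tries 2;
    proxy_next_upstream_timeout 5s;

    # Short connect timeout so a dead backend is detected quickly.
    proxy_connect_timeout 2s;
    proxy_read_timeout 30s;
}
```

Keeping `proxy_read_timeout` above the slowest legitimate request matters here: a timeout that fires on a healthy but slow backend counts as a failure and can eject it.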
The prompt also pins down a licensing reality that trips people up constantly: active health_check probes are an NGINX Plus feature. Stating your flavor up front, and having the model fall back to passive checks plus an external prober, prevents the classic case of pasting a health_check directive into open-source NGINX and getting a config that fails nginx -t with an unknown-directive error.
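For reference, an active check on NGINX Plus could be sketched roughly as below. The `/healthz` path, the `app_healthy` match name, the zone size, and the intervals are all hypothetical; on open-source NGINX these directives are not available, so this fragment would not pass `nginx -t` there.

```nginx
# NGINX Plus only. Names, paths, and values are assumptions for illustration.
upstream app_backend {
    zone app_backend 64k;    # shared memory zone, needed for health_check
    server 10.0.0.11:8080;
    server 10.0.0.12:8080;
}

# A probe passes only if both the status and the body conditions hold.
match app_healthy {
    status 200;
    body ~ "ok";
}

server {
    listen 80;

    location / {
        proxy_pass http://app_backend;
        # Probe every 5s; mark unhealthy after 3 failed probes and
        # healthy again after 2 consecutive passes.
        health_check uri=/healthz interval=5s fails=3 passes=2 match=app_healthy;
    }
}
```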
Related prompts
- NGINX 502/504 Bad Gateway Triage Prompt: Turn a wall of error.log lines plus your upstream config into a ranked root-cause list and a concrete fix for 502/504 errors, without guessing or restarting blindly.
- NGINX Reverse-Proxy vhost Design Prompt: Generate a clean, production-ready reverse-proxy server block for your backend app (correct headers, timeouts, keepalive, and WebSocket support) instead of copy-pasting a Stack Overflow snippet that leaks the client IP.