Emergency Load-Shedding and Rate-Limit Config Prompt
Design an emergency load-shedding or rate-limit change during an overload incident that protects the core service by dropping the least-valuable traffic first — with a clear rollback.
- Target user: On-call engineers fighting an overload that auto-scaling can't outrun
- Difficulty: Advanced
- Tools: Claude, ChatGPT
The prompt
You are a seasoned SRE who knows that when a service is overloaded and scaling can't keep up, the disciplined move is to shed load deliberately (drop the least-valuable traffic to keep the core path alive) rather than let the whole system collapse under unbounded retries.

I will paste the situation: the overloaded component, current request mix (endpoints, clients, priorities), the symptom (queue depth, latency, error rate), why scaling isn't helping (cold start, downstream bottleneck, quota), and the controls available (rate limiter, priority queue, feature flags, concurrency limits).

Your job:

1. **Confirm load-shedding is the right tool.** Verify this is genuine overload, not a downstream fault that shedding would only mask. If shedding is wrong here, say so and point to the real fix.
2. **Rank traffic by value.** Classify the request mix into what must be protected (revenue path, auth, health checks), what can be degraded, and what can be dropped first (retries, non-critical reads, low-priority batch). Justify the ranking.
3. **Design the shed.** Propose the specific control change: which clients/endpoints/priorities to limit, the threshold, and whether to use rate limits, concurrency caps, priority queuing, or a feature-flag disable. Give the concrete config values.
4. **Protect the protectors.** Make sure health checks, retries-of-retries, and the shedding mechanism itself can't be starved or cause a retry storm. Recommend jitter/backoff if clients will hammer on rejection.
5. **Communicate the degradation.** One line on what users on the shed paths will experience, so comms can set expectations accurately.
6. **Rollback and recovery.** The exact condition (metric and threshold) under which to relax the limits, the order to relax them in, and how to avoid re-overloading on the way back up.

Output as: (a) the confirm/deny on load-shedding, (b) the value ranking, (c) the concrete config change, (d) the rollback condition and order.
Propose; the human applies and owns the tradeoff of which traffic to drop. Shedding the wrong traffic can drop revenue or auth — never present a shed plan as harmless. If the value ranking depends on business priorities you don't have, ask rather than assume.
Why this prompt works
Load-shedding is the mitigation engineers reach for last and regret not reaching for sooner. When a service is overloaded and auto-scaling can’t catch up, because the bottleneck is downstream, new instances are slow to cold-start, or a quota has been hit, the choice is between a controlled partial outage and an uncontrolled total one. This prompt makes that choice deliberate by forcing a value ranking of traffic so the least-important requests are dropped first and the revenue and auth paths stay alive.
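A value ranking like this can be turned into an admission check. The following is a minimal sketch, not a production limiter: the tier names, the `should_admit` helper, and the utilization thresholds (0.80 and 0.95) are hypothetical placeholders for values the on-call engineer would choose for their own service.

```python
from enum import IntEnum


class Tier(IntEnum):
    """Hypothetical value tiers, mirroring the ranking the prompt asks for."""
    PROTECT = 0      # revenue path, auth, health checks
    DEGRADE = 1      # traffic that can be served in reduced form
    DROP_FIRST = 2   # retries, non-critical reads, low-priority batch


# Assumed thresholds: the utilization (0.0 to 1.0) at or above which a tier
# is rejected. PROTECT has no entry, so this check never sheds it.
SHED_AT = {
    Tier.DROP_FIRST: 0.80,
    Tier.DEGRADE: 0.95,
}


def should_admit(tier: Tier, utilization: float) -> bool:
    """Return True if a request of this tier should be admitted right now."""
    threshold = SHED_AT.get(tier)
    if threshold is None:
        return True  # protected tier: not shed by this mechanism
    return utilization < threshold
```

At 85% utilization this sketch rejects `DROP_FIRST` traffic while still admitting `DEGRADE` and `PROTECT`, so the least-valuable requests go first as load climbs. Which endpoints land in which tier remains the business decision the prompt leaves to the human.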
The “protect the protectors” step is the detail that separates a clean shed from a self-inflicted retry storm. Rejecting requests without backoff just invites clients to retry harder, and a shedding mechanism that starves its own health checks can take the service down anyway. By demanding jitter, backoff, and protection of the shedding path itself, the prompt encodes the failure modes that turn a load-shed into a second incident.
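On the client side, one common way to keep rejections from turning into a retry storm is capped exponential backoff with full jitter. The sketch below assumes a hypothetical `backoff_delay` helper; the `base` and `cap` defaults are illustrative, not recommendations.

```python
import random
from typing import Optional


def backoff_delay(
    attempt: int,
    base: float = 0.1,
    cap: float = 10.0,
    rng: Optional[random.Random] = None,
) -> float:
    """Seconds to wait before retry number `attempt` (0-based).

    The ceiling doubles with each attempt up to `cap`, and the actual delay
    is drawn uniformly from [0, ceiling] so that rejected clients do not all
    retry at the same instant.
    """
    source = rng or random
    ceiling = min(cap, base * (2 ** attempt))
    return source.uniform(0.0, ceiling)
```

A client would sleep for `backoff_delay(attempt)` after each rejection and give up after a bounded number of attempts. The randomization spreads retries out in time, which is what keeps a shed from being answered by a synchronized wave of retries.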
The guardrails are explicit that this mitigation has a real cost: it drops genuine user traffic. So the AI ranks and designs, but the human owns the sacrifice — which traffic to drop is a business call the model can’t make from telemetry alone. And because relaxing limits too fast re-overloads everything, recovery is gated on a stable metric and lifted in priority order. Fast analysis, human-owned tradeoff, controlled recovery.
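The recovery gate can be sketched as a small function that lifts one limit at a time, and only after the metric has stayed healthy for several consecutive samples. Everything named here is an assumption for illustration: the `RELAX_ORDER` entries, the `next_limit_to_lift` helper, the 300 ms p99 threshold, and the five-sample stability window.

```python
from typing import Optional, Sequence

# Hypothetical limits, listed in the order they would be lifted
# (most valuable shed traffic restored first).
RELAX_ORDER = ["degraded_reads", "low_priority_batch", "client_retries"]


def next_limit_to_lift(
    p99_latency_ms: Sequence[float],
    stages_done: int,
    threshold_ms: float = 300.0,
    stable_samples: int = 5,
) -> Optional[str]:
    """Return the next limit to relax, or None if it is not yet safe.

    `p99_latency_ms` holds the samples observed since the last change.
    A limit is lifted only when the most recent `stable_samples` readings
    are all below `threshold_ms`.
    """
    if stages_done >= len(RELAX_ORDER):
        return None  # everything has already been relaxed
    recent = list(p99_latency_ms)[-stable_samples:]
    if len(recent) < stable_samples:
        return None  # not enough data since the last change
    if all(sample < threshold_ms for sample in recent):
        return RELAX_ORDER[stages_done]
    return None
```

After each lift, the caller would start a fresh sample window before asking again, so every stage has to prove itself stable before the next one is relaxed. That pacing is what prevents re-overloading on the way back up.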
Related prompts
- Feature-Flag Kill-Switch and Fast-Mitigation Design Prompt
  Design the feature flags and kill switches that let you mitigate an incident in seconds without a deploy, and audit your existing flags for the ones that will fail you the moment you need them.
- Graceful Degradation and Degraded-Mode Playbook Prompt
  Design degraded-mode playbooks that keep core functionality alive when a dependency fails: feature flags to shed, fallbacks to serve, and explicit triggers for entering and exiting reduced service.