
RabbitMQ Queue Backpressure & Flow-Control Triage Prompt

Diagnose why a RabbitMQ queue is backing up and producers are being throttled, and decide whether the bottleneck is slow consumers, flow control, or a resource alarm.

Target user
Platform and SRE engineers triaging RabbitMQ throughput incidents
Difficulty
Advanced
Tools
Claude, ChatGPT, Cursor

The prompt

You are a senior platform engineer who has triaged many RabbitMQ backpressure incidents where queues grow without bound and publishers stall. Walk me through diagnosing mine.

I will provide:
- `rabbitmqctl list_queues name messages messages_ready messages_unacknowledged consumers consumer_utilisation` [PASTE OUTPUT]
- Connection and channel state, to show flow control: `rabbitmqctl list_connections name state` and `rabbitmqctl list_channels` [PASTE OUTPUT]
- Any resource alarms: the memory/disk alarm section of `rabbitmqctl status`, plus per-queue memory from `rabbitmqctl list_queues name memory` [PASTE OUTPUT]
- Symptoms: publishers slow/blocked, growing queue depth, rising latency [DESCRIBE]

Your job:

1. **Locate the bottleneck**: separate "queue growing because consumers are slow/absent" (high `messages_ready`, low `consumer_utilisation`) from "broker is throttling producers" (connections in `flow` state) from "a memory or disk alarm has blocked all publishers" (connections in `blocking` or `blocked` state).

2. **Read the signals correctly**: explain `messages_ready` (waiting in the queue) vs `messages_unacknowledged` (delivered to consumers but not yet acked, bounded by prefetch); `consumer_utilisation` well below 1.0 meaning the queue often cannot deliver immediately because consumers are limited by prefetch, slow acks, or the network, while a value near 1.0 means consumers accept deliveries as fast as the queue offers them; and connection `flow` state meaning internal credit-based flow control is engaged.

3. **Trace causes**: slow downstream dependency, too few consumers, prefetch too low (consumers idle waiting) or too high (one consumer hoards), large unacked backlog from a stuck consumer, or memory/disk watermark crossed.

4. **Recommend fixes**: scale or speed up consumers, tune prefetch/QoS, use lazy queue mode (relevant to classic queues before RabbitMQ 3.12; later versions behave this way by default) or set a max-length with an overflow policy, fix the resource alarm, or apply backpressure deliberately at the producer with publisher confirms.

5. **Prevent recurrence**: what to alert on (queue depth trend, `messages_unacknowledged`, connections in flow, alarm state) so this is caught before publishers block.

Output as: (a) the diagnosed bottleneck with the specific metric that proves it, (b) immediate mitigation, (c) root-cause fix, (d) the alerts to add.

Validate any queue-policy or prefetch change on a staging broker before prod. Do not purge a backed-up queue to "relieve pressure" without review — purging discards real messages and hides the actual cause.

Why this prompt works

Backpressure incidents are confusing because three different mechanisms produce similar symptoms: slow consumers, RabbitMQ’s internal credit-based flow control, and resource alarms that block publishers outright. The prompt forces you to distinguish them using the exact metrics that tell them apart (`messages_ready` versus `messages_unacknowledged`, `consumer_utilisation`, and connection `flow` state) rather than guessing. That distinction changes the fix entirely: scaling consumers does nothing if the real problem is a disk-free alarm that has blocked every publisher.
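The three-way split can be sketched as a small decision function. This is a minimal illustration, not part of the prompt: the `QueueStats` type, the `classify_bottleneck` name, the 0.5 utilisation threshold, and the order in which the checks run are all assumptions chosen for the example. The connection state strings (`flow`, `blocking`, `blocked`) are the ones `rabbitmqctl list_connections name state` reports.

```python
from dataclasses import dataclass


@dataclass
class QueueStats:
    """One row of `rabbitmqctl list_queues` output (hypothetical container)."""
    name: str
    messages_ready: int
    messages_unacknowledged: int
    consumers: int
    consumer_utilisation: float  # 0.0 to 1.0 as reported by the broker


def classify_bottleneck(queue, connection_states, alarm_active,
                        low_utilisation=0.5):
    """Return 'resource_alarm', 'flow_control', 'slow_or_absent_consumers',
    or 'no_clear_bottleneck'. The 0.5 threshold is an assumed default."""
    # An alarm blocks publishers outright, so it is checked first.
    if alarm_active or any(s in ("blocking", "blocked")
                           for s in connection_states):
        return "resource_alarm"
    # Connections in 'flow' mean credit-based flow control is throttling them.
    if any(s == "flow" for s in connection_states):
        return "flow_control"
    # A backlog with no consumers, or consumers that rarely accept deliveries.
    if queue.messages_ready > 0 and (
        queue.consumers == 0 or queue.consumer_utilisation < low_utilisation
    ):
        return "slow_or_absent_consumers"
    return "no_clear_bottleneck"
```

In a real incident more than one condition can hold at once (slow consumers can eventually trigger a memory alarm), so a single label like this is a starting point for the diagnosis, not its conclusion.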

It encodes the right mental model of unacked messages. A large `messages_unacknowledged` count usually means consumers have pulled work via prefetch but aren’t acking it, because they are slow, because they are stuck, or because prefetch is set too high and one consumer is hoarding. Reading that signal correctly is what separates a five-minute fix from an hour of restarting random services.
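One way to make that reading concrete is to compare the unacked count with the total prefetch window, and to check whether a single channel holds most of it. The sketch below is illustrative only: both function names and the 0.8 share threshold are assumptions, it assumes every consumer uses the same per-consumer prefetch, and the per-channel counts are expected to come from the `messages_unacknowledged` column of `rabbitmqctl list_channels`.

```python
def prefetch_window_fill(messages_unacknowledged, consumers, prefetch):
    """Fraction of the total prefetch window occupied by unacked messages.

    Returns None when there is no bounded window: no consumers attached,
    or prefetch 0 (which means unlimited).
    """
    if consumers <= 0 or prefetch <= 0:
        return None
    return messages_unacknowledged / (consumers * prefetch)


def hoarding_channels(unacked_by_channel, share=0.8):
    """Names of channels holding at least `share` of all unacked messages.

    Only meaningful with more than one channel; `share` is an assumed
    threshold, not a RabbitMQ setting.
    """
    total = sum(unacked_by_channel.values())
    if total == 0 or len(unacked_by_channel) < 2:
        return []
    return [name for name, count in unacked_by_channel.items()
            if count / total >= share]
```

A fill near 1.0 alongside a growing `messages_ready` suggests consumers are saturated or stuck; a non-empty result from `hoarding_channels` points at prefetch being too high for one consumer.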

The guardrails address the most common harmful reflex during a backpressure incident: purging the queue to make the number go down. That destroys real messages and erases the evidence of what caused the backup. By steering toward staging validation, deliberate producer-side backpressure with publisher confirms, and the right alerts, the prompt turns a panic into a diagnosis.
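For the alerting side, a queue depth trend is more useful than a fixed depth threshold because it fires while there is still time to act. The following is a rough sketch under stated assumptions: the function names are hypothetical, the samples are `(unix_seconds, queue_depth)` pairs you collect yourself, and the rate is a simple first-to-last difference, not a fitted trend.

```python
def depth_growth_per_second(samples):
    """Average growth in queue depth per second across the sample window.

    `samples` is a list of (unix_seconds, queue_depth) pairs, oldest first.
    Returns 0.0 when there are too few samples to estimate a rate.
    """
    if len(samples) < 2:
        return 0.0
    (t0, d0), (t1, d1) = samples[0], samples[-1]
    elapsed = t1 - t0
    if elapsed <= 0:
        return 0.0
    return (d1 - d0) / elapsed


def seconds_until_limit(samples, limit):
    """Estimated seconds until depth reaches `limit` at the current rate.

    Returns 0.0 if the limit is already reached, and None if the queue is
    not growing or there are no samples.
    """
    if not samples:
        return None
    current = samples[-1][1]
    if current >= limit:
        return 0.0
    rate = depth_growth_per_second(samples)
    if rate <= 0:
        return None
    return (limit - current) / rate
```

An alert could then fire when `seconds_until_limit` drops below the time the team needs to respond, with `limit` set to a depth known to precede alarms on that broker.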
