RabbitMQ Connection & Channel Leak Debugging Prompt

Track down why RabbitMQ connection or channel counts keep climbing until the broker hits limits, and find the client code that opens but never closes them.

Target user

Backend and platform engineers debugging RabbitMQ client resource leaks

Difficulty

Intermediate

Tools

Claude, ChatGPT, Cursor

You are a senior platform engineer who has chased down RabbitMQ connection and channel leaks that slowly exhausted broker limits. Help me find mine.

I will provide:
- `rabbitmqctl list_connections name user peer_host channels state` [PASTE OUTPUT]
- `rabbitmqctl list_channels connection number consumer_count messages_unacknowledged` [PASTE OUTPUT]
- The growth pattern over time (steady climb, climbs under load, never drops after deploys) [DESCRIBE]
- The client library/framework and how the app creates connections and channels [DESCRIBE]

Your job:
1. **Confirm a leak vs legitimate growth**: distinguish a steadily climbing count that never falls (leak) from expected per-load scaling. Identify which `peer_host`/`user`/app is accumulating connections or channels.
2. **Find the anti-pattern**: the usual culprits are opening a new connection per message or per request instead of long-lived connections, opening a channel per publish without closing it, not closing channels/connections on error paths, or recreating connections on every retry.
3. **Recommend the correct model**: one (or a small pool of) long-lived connections per process, channels scoped to a unit of work and closed in a finally/using block, a publisher channel separate from consumer channels, and reconnection logic that reuses rather than multiplies.
4. **Check broker-side limits**: `channel_max`, connection limits, and file-descriptor headroom; explain what happens when they're hit (new connections refused, broker instability).
5. **Add guardrails**: alert on connection/channel count trend and per-connection channel count, and add client-side metrics so a leak is caught early.

Output as: (a) the leaking source identified from the listings, (b) the specific anti-pattern, (c) the corrected connection/channel lifecycle, (d) the limits and alerts to add.

Reproduce and confirm the fix on a staging broker before deploying.
Do not force-close connections on a prod broker to clear the count without identifying the source — you will drop in-flight messages and the leak will simply refill.

Why this prompt works

Connection and channel leaks are slow-motion outages: the count creeps up over hours or days until the broker refuses new connections or destabilizes, often right after a deploy makes it worse. The prompt starts by separating a genuine leak — a count that climbs and never falls — from legitimate per-load scaling, and uses list_connections and list_channels to pin the leak to a specific host, user, or app rather than guessing.
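The attribution step described above can be sketched in a few lines of Python. This is a minimal illustration, not part of the prompt: it assumes the `rabbitmqctl` listing is tab-separated with fields in the order they were requested, and the helper names `count_by_column` and `looks_like_leak`, along with the windowed-floor heuristic, are hypothetical choices made for this example.

```python
from collections import Counter


def count_by_column(listing: str, columns: list[str], key: str) -> Counter:
    """Count rows of tab-separated rabbitmqctl output, grouped by one column.

    `columns` is the field order passed to rabbitmqctl (an assumption the
    caller must get right); `key` is the column to group by.
    """
    idx = columns.index(key)
    counts = Counter()
    for line in listing.splitlines():
        fields = line.split("\t")
        # Skip banner lines ("Listing connections ...") and any header row.
        if len(fields) != len(columns) or fields == columns:
            continue
        counts[fields[idx]] += 1
    return counts


def looks_like_leak(samples: list[int], window: int = 3) -> bool:
    """Heuristic: treat growth as a leak when the low point of each
    consecutive window keeps rising, i.e. the count never falls back."""
    floors = [
        min(samples[i:i + window])
        for i in range(0, len(samples) - window + 1, window)
    ]
    # Require at least three windows before drawing any conclusion.
    return len(floors) >= 3 and all(b > a for a, b in zip(floors, floors[1:]))
```

Calling `count_by_column(output, ["name", "user", "peer_host", "channels", "state"], "peer_host").most_common(5)` would list the hosts holding the most connections. Load-driven spikes that return to the same baseline fail the `looks_like_leak` check, while a rising floor passes it.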

It targets the handful of client anti-patterns behind most leaks: opening a connection per request, opening a channel per publish, or failing to close resources on error paths. Channels are cheap but not free, and connections are genuinely expensive, so tying either to a single message both leaks resources and destroys throughput. The prompt prescribes the correct lifecycle (long-lived pooled connections, work-scoped channels closed in a finally block, and reconnection that reuses rather than multiplies), which is the actual fix.
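One way to express "work-scoped channels closed in a finally block" is a small context manager. The sketch below is illustrative only: `scoped_channel` is a hypothetical helper, and it assumes a connection object that exposes `channel()` and channels that expose `close()` and `is_open`, as pika's `BlockingConnection` does. Other client libraries will need their own equivalent.

```python
from contextlib import contextmanager


@contextmanager
def scoped_channel(connection):
    """Open a channel on an existing long-lived connection and always close it.

    The connection itself is NOT opened or closed here; it is expected to
    live for the lifetime of the process and be reused across calls.
    """
    channel = connection.channel()
    try:
        yield channel
    finally:
        # Runs on success and on exceptions, so error paths cannot leak
        # the channel. Skip close() if the channel is already closed.
        if getattr(channel, "is_open", True):
            channel.close()


# Hypothetical usage with pika (not imported in this sketch):
#   connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
#   with scoped_channel(connection) as ch:
#       ch.basic_publish(exchange="", routing_key="jobs", body=b"hello")
```

The design point is that the connection is created once outside the helper, so each unit of work costs one channel open and one channel close rather than a new TCP connection.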

The guardrail discourages the tempting but harmful reaction of force-closing connections on the broker to make the number drop. That discards in-flight messages, and the count simply refills because the buggy client is still running. By insisting on finding the source, checking channel_max and file-descriptor headroom, and adding trend alerts, the prompt turns a recurring mystery into a one-time fix with early warning.
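A per-connection channel alert of the kind mentioned above could start from a check like the following. It is a rough sketch under stated assumptions: `connections_near_channel_max` is a hypothetical helper, the 0.8 warning ratio is an arbitrary starting point, and the caller is expected to supply the broker's actual `channel_max` and the per-connection counts (for example, from the `channels` column of the connection listing).

```python
def connections_near_channel_max(
    channels_by_connection: dict[str, int],
    channel_max: int,
    warn_ratio: float = 0.8,
) -> list[str]:
    """Return the names of connections whose channel count is at or above
    warn_ratio * channel_max, sorted for stable alert output."""
    if channel_max <= 0:
        # This sketch expects a positive limit; handle "unlimited" upstream.
        raise ValueError("channel_max must be a positive integer")
    threshold = channel_max * warn_ratio
    return sorted(
        name
        for name, count in channels_by_connection.items()
        if count >= threshold
    )
```

Feeding the result into an existing alerting pipeline gives early warning well before a connection reaches the negotiated limit and new channel opens start failing.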

Related prompts

RabbitMQ Memory & Disk Alarm Resource-Limit Triage Prompt
