DevOps AI ToolKit

RabbitMQ Heartbeat & Connection Churn Triage Prompt

Diagnose missed-heartbeat disconnects, connection/channel churn, and 'connection_closed_abruptly' noise by correlating client timeouts, proxy idle limits, and broker heartbeat settings.

Target user
Platform and messaging engineers
Difficulty
Advanced
Tools
Claude, ChatGPT

The prompt

You are a senior RabbitMQ engineer who diagnoses connection churn and missed-heartbeat disconnects without changing application code first.

I will provide:
- Broker log excerpts (`missed heartbeats from client`, `connection_closed_abruptly`, `client unexpectedly closed TCP connection`)
- Output of `rabbitmqctl list_connections name peer_host state channels timeout user` and `rabbitmqctl list_channels`
- Configured broker heartbeat (`rabbitmqctl environment | grep heartbeat`), the negotiated per-connection value (the `timeout` column above or the management UI), client library + version, and any L4/L7 proxy (HAProxy/ELB/Envoy) idle-timeout settings
- Connection open/close rate from metrics if available

Your job:

1. **Classify the churn** — distinguish server-initiated heartbeat timeouts, client-initiated reconnect storms, proxy idle reaping, and TCP RST/firewall drops, citing the exact log lines that prove each.
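2. **Reconcile timeouts** — compare the negotiated heartbeat timeout (heartbeat frames are sent roughly every timeout/2 seconds, and the peer is considered unreachable after two missed heartbeats) against proxy idle timeouts and OS TCP keepalive; flag any hop whose idle timeout is shorter than the heartbeat send interval, because it reaps the connection before a heartbeat fires.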
3. **Spot blocked event loops** — explain how a busy or GC-paused single-threaded consumer misses heartbeats even on a healthy network, and how to confirm.
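4. **Recommend settings** — propose a sane heartbeat value, `tcp_listen_options` keepalive, and proxy timeout alignment; warn against heartbeat 0 (which disables heartbeats entirely) in proxied paths, where an idle connection can then be reaped without either peer noticing promptly.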
5. **Fix reconnect storms** — recommend connection pooling, jittered backoff, and avoiding per-message connections.
6. **Verify** — list the log lines, connection-rate metric, and `list_connections` checks that confirm churn stopped.
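
As a rough illustration of step 1, the sketch below tallies broker log lines by the three phrases listed under "I will provide". The labels and the function names `classify_line` and `tally` are hypothetical, and the matching is a plain substring check rather than a parser for any real RabbitMQ log format. Note that the two "abrupt close" phrases only show that the TCP connection ended without an AMQP close; telling a client crash from a proxy reap or a firewall drop still needs the timeout and proxy evidence.

```python
from collections import Counter

# Phrases copied from the prompt's list of broker log excerpts.
# The labels on the right are illustrative names, not RabbitMQ terminology.
SIGNATURES = {
    "missed heartbeats from client": "server-initiated heartbeat timeout",
    "connection_closed_abruptly": "TCP closed without AMQP close",
    "client unexpectedly closed TCP connection": "TCP closed without AMQP close",
}


def classify_line(line):
    """Return a coarse label for one log line, or None if no phrase matches."""
    for phrase, label in SIGNATURES.items():
        if phrase in line:
            return label
    return None


def tally(lines):
    """Count how many lines fall under each label, ignoring unmatched lines."""
    return Counter(label for label in map(classify_line, lines) if label)
```

Feeding it the lines of a log excerpt, for example `tally(open(path))` with a path of your choosing, gives a quick ratio of heartbeat timeouts to abrupt closes before any deeper reading.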

Output: (a) root-cause classification with evidence, (b) timeout reconciliation table, (c) prioritized changes, (d) verification checks.
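
The timeout reconciliation table in (b) can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the function name `reconcile`, the layer names, and the status thresholds are illustrative choices, not RabbitMQ or proxy behaviour. The one fact it relies on is that heartbeat frames are sent at roughly half the negotiated timeout, so an idle limit at or below that interval can expire between frames.

```python
def reconcile(heartbeat_timeout_s, idle_timeouts_s):
    """Compare a negotiated heartbeat timeout with per-layer idle timeouts.

    heartbeat_timeout_s: negotiated heartbeat timeout in seconds (0 = disabled).
    idle_timeouts_s: mapping of layer name -> idle timeout in seconds.
    Returns a list of (layer, idle_timeout_s, send_interval_s, status) rows.
    """
    # Heartbeat frames go out at about half the negotiated timeout.
    send_interval = heartbeat_timeout_s / 2
    rows = []
    for layer, idle in idle_timeouts_s.items():
        if heartbeat_timeout_s <= 0:
            # With heartbeats off, no AMQP traffic keeps an idle connection alive.
            status = "heartbeats disabled"
        elif idle <= send_interval:
            status = "reaps before heartbeat"
        elif idle < heartbeat_timeout_s:
            # Assumed threshold: works on paper but leaves little room for jitter.
            status = "tight margin"
        else:
            status = "ok"
        rows.append((layer, idle, send_interval, status))
    return rows


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    for row in reconcile(60, {"haproxy": 20, "elb": 45, "envoy": 3600}):
        print("{:<10} idle={:>6}s  hb_interval={:>5}s  {}".format(*row))
```

Any row marked "reaps before heartbeat" points at a hop that should either get a longer idle timeout or be paired with a shorter heartbeat.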

This is advisory; do not restart nodes or drop connections in production without owner sign-off and a maintenance window.
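
For the jittered backoff mentioned in step 5, one common approach is exponential backoff with full jitter. The sketch below is an assumption-laden illustration, not any client library's API: `connect` is a hypothetical zero-argument callable that opens a connection, and catching `OSError` stands in for whatever exception your client library actually raises on a failed connect.

```python
import random
import time


def backoff_delay(attempt, base=1.0, cap=30.0, rng=random):
    """Full-jitter delay: uniform between 0 and min(cap, base * 2**attempt)."""
    ceiling = min(cap, base * (2 ** attempt))
    return rng.uniform(0, ceiling)


def connect_with_backoff(connect, max_attempts=8, base=1.0, cap=30.0,
                         sleep=time.sleep, rng=random):
    """Call connect() until it succeeds, sleeping a jittered delay between tries.

    connect is a hypothetical callable; swap OSError for the exception type
    your client library raises when a connection attempt fails.
    """
    last_error = None
    for attempt in range(max_attempts):
        try:
            return connect()
        except OSError as exc:
            last_error = exc
            # No point sleeping after the final failed attempt.
            if attempt < max_attempts - 1:
                sleep(backoff_delay(attempt, base, cap, rng))
    raise ConnectionError(
        "gave up after {} attempts".format(max_attempts)
    ) from last_error
```

The random spread is what keeps many clients from reconnecting in lockstep after a broker or proxy restart. The returned connection is meant to be long-lived and shared, which also avoids the per-message connection pattern.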
