RabbitMQ Performance Tuning for OpenStack Prompt

Tune RabbitMQ for an OpenStack control plane — queue/HA policies, connection and channel limits, heartbeats, prefetch, memory/flow-control watermarks, and durable vs transient reply queues — so RPC stays fast and the broker never wedges under load.

Target user: Operators tuning the OpenStack message bus
Difficulty: Advanced
Tools: Claude, ChatGPT

The prompt

You are a senior OpenStack operator who has tuned RabbitMQ for control planes serving thousands of agents, eliminating RPC timeouts and broker memory-alarm stalls without resorting to mirroring everything.

I will provide:
- RabbitMQ version and current policies (`rabbitmqctl list_policies`)
- `rabbitmqctl list_queues name messages consumers memory` snapshot
- oslo.messaging settings (`[oslo_messaging_rabbit]`: `heartbeat_timeout_threshold`, `heartbeat_rate`, `amqp_durable_queues`; `[DEFAULT]`: `rpc_conn_pool_size`, `executor_thread_pool_size`)
- Cluster size, node memory, agent/connection counts
- Symptoms (RPC timeouts, rising memory, flow control, connection churn)

Your job:

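1. **Right-size HA/quorum** — explain that mirroring *every* queue (especially transient reply and fanout queues) is a common anti-pattern that multiplies broker load. Recommend replicating only durable RPC queues and leaving reply/fanout queues transient and unreplicated. For classic mirrored queues (RabbitMQ 3.x; removed in 4.0), scope the `ha-mode` policy pattern accordingly. For quorum queues, note that the queue type is fixed at declaration (for example via oslo.messaging's `rabbit_quorum_queue`) and cannot be applied by policy.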

2. **oslo.messaging tuning** — set `heartbeat_timeout_threshold` and `heartbeat_rate`, set `rpc_conn_pool_size` and `executor_thread_pool_size` (both live in `[DEFAULT]`), and decide between `amqp_durable_queues` and transient queues, stating the durability/performance tradeoff.

3. **Flow control & memory** — `vm_memory_high_watermark`, disk free limit, and what a memory alarm does (blocks publishers → RPC stalls cloud-wide). Set watermarks so the broker degrades gracefully.

4. **Connection hygiene** — channel/connection limits, prefetch (`basic.qos`), and eliminating connection churn from agents that reconnect in tight loops.

5. **Queue hygiene** — find unbounded/abandoned queues, set TTL/expiry on reply queues, and stop fanout-queue buildup from dead consumers.

6. **Validate** — load-test RPC round-trip latency before/after; watch memory, `messages_ready`, and flow-control state; confirm no queue grows unbounded.

Output as: (a) the exact `rabbitmqctl` policy commands (HA only where warranted), (b) tuned oslo.messaging keys (`[oslo_messaging_rabbit]` and `[DEFAULT]`) with values, (c) memory/flow-control watermark settings, (d) connection/prefetch limits, (e) a before/after validation plan with the specific metrics to capture.

Be opinionated: less mirroring, durable only where it matters, watermarks that prevent the cloud-wide publisher block.
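
Before pasting the queue snapshot into the prompt, it can help to pre-sort it so the abandoned and backlogged queues stand out. The sketch below is one possible way to do that for tab-separated `rabbitmqctl list_queues name messages consumers memory` output. It makes several assumptions: the `reply_` prefix and `_fanout_` infix follow the usual oslo.messaging queue naming and should be checked against your deployment; the `backlog_threshold` of 1000 is an arbitrary illustrative value; and the queue names and numbers in `SAMPLE` are invented.

```python
from dataclasses import dataclass


@dataclass
class QueueStat:
    name: str
    messages: int
    consumers: int
    memory: int


def parse_snapshot(text):
    """Parse tab-separated `name messages consumers memory` rows.

    Banner and header lines (anything without four fields, or whose last
    three fields are not integers) are skipped rather than raising.
    """
    stats = []
    for line in text.splitlines():
        fields = line.split("\t")
        if len(fields) != 4:
            continue
        name, *numbers = fields
        try:
            messages, consumers, memory = (int(n) for n in numbers)
        except ValueError:
            continue  # e.g. the "name messages consumers memory" header row
        stats.append(QueueStat(name, messages, consumers, memory))
    return stats


def classify(name):
    """Bucket a queue by name. Naming convention is an assumption."""
    if name.startswith("reply_"):
        return "reply"
    if "_fanout_" in name:
        return "fanout"
    return "rpc"


def suspects(stats, backlog_threshold=1000):
    """Return (abandoned, backlogged) lists of QueueStat.

    abandoned:  messages are waiting but no consumer is attached.
    backlogged: consumers exist but depth exceeds backlog_threshold
                (a hypothetical cutoff; tune it for your cloud).
    """
    abandoned = [q for q in stats if q.consumers == 0 and q.messages > 0]
    backlogged = [
        q for q in stats if q.consumers > 0 and q.messages > backlog_threshold
    ]
    return abandoned, backlogged


# Invented sample data in the shape rabbitmqctl prints.
SAMPLE = "\n".join([
    "Listing queues for vhost / ...",
    "name\tmessages\tconsumers\tmemory",
    "conductor\t0\t8\t55000",
    "reply_3f2a9c\t120\t0\t410000",
    "q-agent-notifier-port-update_fanout_9b1d\t4500\t0\t9800000",
    "compute.node-17\t2300\t1\t3100000",
])

if __name__ == "__main__":
    abandoned, backlogged = suspects(parse_snapshot(SAMPLE))
    for label, queues in (("abandoned", abandoned), ("backlogged", backlogged)):
        for q in queues:
            print(f"{label}\t{classify(q.name)}\t{q.name}\t{q.messages}")
```

Queues flagged as abandoned with a `reply` or `fanout` class are the likely candidates for the TTL/expiry discussion in step 5 of the prompt, while backlogged `rpc` queues point at slow or undersized consumers instead.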