
RabbitMQ Cluster Capacity & Sizing Review Prompt

Right-size a RabbitMQ cluster's node count, memory/disk headroom, file descriptors, and Erlang scheduler settings against measured publish/consume rates and queue depth before scaling or a traffic event.

Target user: Infrastructure and capacity planning engineers
Difficulty: Advanced
Tools: Claude, ChatGPT

The prompt

You are a senior RabbitMQ capacity engineer producing a sizing review, not a live change.

I will provide:
- Per-node specs (vCPU, RAM, disk type/size) and node count
- `rabbitmqctl status` / `rabbitmq-diagnostics memory_breakdown` and current `vm_memory_high_watermark` + `disk_free_limit`
- Steady-state and peak publish/consume rates (msg/s and bytes/s), average message size, and typical/peak queue depth
- Queue types in use (classic, quorum, streams), connection/channel counts, and `ulimit -n` / `rabbitmqctl status` file-descriptor usage
- Any planned growth or traffic-spike multiplier

Your job:

1. **Compute memory budget** — estimate RAM needed for queued messages, connections/channels, and binary/metadata overhead; compare against the high-watermark headroom and flag where an alarm would trip at peak.
2. **Size disk** — project disk growth for persistent messages and quorum/stream segments, and validate `disk_free_limit` leaves room before the disk alarm pauses publishers.
3. **Check FD/socket limits** — verify the file-descriptor limit covers peak connection sockets plus open queue files, and the Erlang process limit covers peak connections + channels + queues, each with margin.
4. **Assess node count & placement** — advise on adding nodes vs. scaling up, quorum-queue replica factor cost, and spreading across AZs without cross-AZ replication surprises.
5. **Tune Erlang** — note the scheduler binding type (`+sbt`) and scheduler count (`+S`) settings appropriate to each node's vCPU count.
6. **Define guardrails** — give target alarm thresholds and the metrics to watch (memory, disk, FDs, queue depth, flow-control state).

Output: (a) per-resource sizing table with current vs. recommended, (b) risk list at peak, (c) scale-up vs scale-out recommendation, (d) monitoring guardrails.

Treat all numbers as estimates to validate with a load test; do not change watermarks in production without a tested rollback.
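
To sanity-check the arithmetic the model returns for steps 1 and 2, a rough headroom calculation can be run by hand. The sketch below is illustrative only: the per-message, per-connection, and per-channel overheads are placeholder assumptions rather than RabbitMQ constants, the names `NodeLoad`, `memory_headroom`, and `disk_headroom` are hypothetical, and every queued message is treated as resident in memory, which is a deliberately conservative simplification. Replace the placeholders with figures derived from your own `memory_breakdown` output.

```python
from dataclasses import dataclass

# Placeholder overheads in bytes (assumptions, NOT RabbitMQ constants).
PER_MSG_OVERHEAD = 1_000
PER_CONN_BYTES = 100_000
PER_CHANNEL_BYTES = 50_000


@dataclass
class NodeLoad:
    ram_bytes: int          # physical RAM on the node
    watermark: float        # relative vm_memory_high_watermark, e.g. 0.4
    peak_queue_depth: int   # messages held on this node at peak
    avg_msg_bytes: int      # average message payload size
    connections: int
    channels: int


def memory_headroom(load: NodeLoad) -> int:
    """Bytes left below the memory alarm threshold; negative means the alarm trips."""
    alarm_at = int(load.ram_bytes * load.watermark)
    needed = (
        load.peak_queue_depth * (load.avg_msg_bytes + PER_MSG_OVERHEAD)
        + load.connections * PER_CONN_BYTES
        + load.channels * PER_CHANNEL_BYTES
    )
    return alarm_at - needed


def disk_headroom(free_bytes: int, disk_free_limit_bytes: int,
                  persistent_bytes_per_s: int, backlog_seconds: int) -> int:
    """Bytes left above disk_free_limit after consumers stall for backlog_seconds."""
    growth = persistent_bytes_per_s * backlog_seconds
    return free_bytes - growth - disk_free_limit_bytes


if __name__ == "__main__":
    node = NodeLoad(ram_bytes=16 * 1024**3, watermark=0.5,
                    peak_queue_depth=1_000_000, avg_msg_bytes=2_000,
                    connections=2_000, channels=4_000)
    print("memory headroom (bytes):", memory_headroom(node))
    print("disk headroom (bytes):",
          disk_headroom(200 * 10**9, 50 * 10**9, 20 * 10**6, 3_600))
```

A negative result from either function marks a resource where the corresponding alarm would be expected to trip at peak under these assumptions, so it is a candidate for the risk list rather than a verdict; a load test remains the real check.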

Related prompts

RabbitMQ Memory & Disk Alarm Resource-Limit Triage Prompt

Triage a RabbitMQ memory or disk-free alarm that has blocked publishers cluster-wide, find what is consuming the resource, and recover safely without dropping messages.
