RabbitMQ Streams Design & Tuning Prompt
Decide when a RabbitMQ Stream (not a queue) is the right model, then size retention, segment, and consumer-offset settings for a high-throughput, replayable log.
- Target user: Platform and backend engineers adopting RabbitMQ Streams
- Difficulty: Advanced
- Tools: Claude, ChatGPT, Cursor
The prompt
You are a senior platform engineer who has run RabbitMQ Streams in production alongside classic and quorum queues. Help me decide whether a Stream fits my workload and tune it correctly.

I will provide:
- The workload: messages/sec, average message size, fan-out factor, and whether consumers need to re-read history [DESCRIBE]
- Why I think a Stream beats a queue here (replay, multiple independent readers, high throughput?) [DESCRIBE]
- Retention requirements: how long data must stay readable, and the total disk budget [DESCRIBE]
- Current cluster: node count, RAM, disk type, RabbitMQ version (Streams need 3.9+; the stream plugin must be enabled to use the native stream protocol) [DESCRIBE]

Your job:
1. **Confirm Stream vs queue**: a Stream is an append-only, replayable log read by offset; a queue removes a message once it is acknowledged. If consumers only ever read once and never replay, a quorum queue is usually simpler. Tell me plainly if I'm reaching for a Stream where a queue would do.
2. **Size retention**: translate my retention need into `max-age` and/or `max-length-bytes` plus `stream-max-segment-size-bytes`. Explain that retention removes whole segment files rather than individual messages, so the data actually retained can exceed the nominal limit until a full segment rolls off.
3. **Replication and the stream coordinator**: explain `initial-cluster-size` / replication factor, that Streams use their own log replication (not classic mirroring), and the disk and network cost of N replicas at my throughput.
4. **Consumer offset strategy**: single active consumer, offset tracking with server-side offset storage, and how a new consumer chooses `first` / `last` / `next` / an absolute offset / a timestamp. Flag the failure mode where consumers never commit offsets and re-read from the start on restart.
5. **Validate**: what to watch: committed offset lag per consumer, disk used vs `max-length-bytes`, segment file count, and publish/confirm latency.
Output as: (a) Stream-vs-queue verdict with the deciding factor, (b) a concrete `x-queue-type: stream` / policy config with retention values and the reasoning, (c) the offset strategy, (d) the metrics that prove it's healthy. Validate retention and replication settings on a staging cluster with representative message sizes before production. Under-sized `max-length-bytes` silently discards history readers may expect; over-sized retention can fill disk and trip the disk alarm — model the disk math first.
Why this prompt works
RabbitMQ Streams look like a queue and quack like a queue, which is exactly why engineers misuse them. A Stream is an append-only log read by offset and retained by policy, not a destructive-read queue. The single most valuable thing this prompt does is force the Stream-vs-queue decision up front, because many “we need Streams” requests are really “we need durable fan-out” or “we need replay once”, and some of those are better served by a quorum queue with less operational surface. The prompt makes the model state the deciding factor rather than rubber-stamping the choice.
Once a Stream is the right call, the hard part is retention and offsets, and both have non-obvious failure modes. Retention removes whole segments rather than individual messages, so the disk footprint is lumpy and the data actually kept overshoots the nominal limit until a segment rolls off, a detail that wrecks naive disk math. The prompt ties `max-age`, `max-length-bytes`, and `stream-max-segment-size-bytes` together so the disk budget is modeled against segments, not a single number.
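The segment-aware disk math described above can be sketched in a few lines of Python. This is a rough model, not RabbitMQ's own accounting: it assumes steady ingest, ignores index files and per-message framing overhead, and assumes the overshoot is about one extra segment (the one currently being written). The function names `stream_disk_estimate` and `stream_arguments` are hypothetical helpers; the `x-` argument names are the queue-argument forms of the settings named in the prompt.

```python
import math


def stream_disk_estimate(msgs_per_sec, avg_msg_bytes, retention_seconds,
                         segment_bytes, replicas):
    """Rough disk estimate for one stream (assumptions listed in the lead-in)."""
    ingest_bytes_per_sec = msgs_per_sec * avg_msg_bytes
    nominal_bytes = ingest_bytes_per_sec * retention_seconds
    # Retention drops whole segments, so round up to whole segments and
    # add one for the segment that is still being written (assumption).
    segments = math.ceil(nominal_bytes / segment_bytes) + 1
    per_replica_bytes = segments * segment_bytes
    return {
        "nominal_bytes": nominal_bytes,
        "segments": segments,
        "per_replica_bytes": per_replica_bytes,
        # Every replica keeps a full copy of the log.
        "cluster_bytes": per_replica_bytes * replicas,
    }


def stream_arguments(max_length_bytes, max_age_seconds, segment_bytes):
    """Build queue-declare arguments for a stream from the sized values."""
    return {
        "x-queue-type": "stream",
        "x-max-length-bytes": max_length_bytes,
        "x-max-age": f"{max_age_seconds}s",
        "x-stream-max-segment-size-bytes": segment_bytes,
    }


# Example: 20k msgs/sec of 1 kB messages, 24 h retention,
# 500 MB segments, 3 replicas.
est = stream_disk_estimate(20_000, 1_000, 86_400, 500_000_000, 3)
args = stream_arguments(est["nominal_bytes"], 86_400, 500_000_000)
```

With these inputs the nominal retention is 1,728,000,000,000 bytes, but the segment-rounded footprint is 3,457 segments, or 1,728,500,000,000 bytes per replica and 5,185,500,000,000 bytes across three replicas. The resulting `args` dictionary is the kind of value a client would pass as the arguments of a queue declaration; treat the numbers as a starting point to check on staging.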
The offset section closes the loop on the most common production surprise: a consumer that never commits an offset and attaches at `first` re-reads the entire retained log on every restart. By making offset strategy explicit and demanding metrics for committed-offset lag, the prompt turns Streams from a footgun into a tool with a verifiable healthy state. The staging guardrail keeps the disk and replication math honest before it meets real traffic.
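The restart decision behind that failure mode can be illustrated with a small, library-free sketch. It does not call any RabbitMQ client; `choose_start` and `should_commit` are hypothetical helpers, and the sketch assumes the stored offset is that of the last fully processed message, so the consumer resumes one past it. How the offset is actually stored and queried depends on the client library in use.

```python
NAMED_SPECS = {"first", "last", "next"}


def choose_start(stored_offset, fallback="first"):
    """Pick where a starting or restarting consumer should attach.

    stored_offset: last processed offset read back from offset storage,
                   or None if this consumer has never committed one.
    fallback:      named spec to use when nothing was ever committed.
    """
    if stored_offset is not None:
        if stored_offset < 0:
            raise ValueError("stored offset must be non-negative")
        # Resume just after the last processed message (assumption:
        # the stored value marks a message that was fully handled).
        return stored_offset + 1
    if fallback not in NAMED_SPECS:
        raise ValueError(f"unknown offset spec: {fallback!r}")
    return fallback


def should_commit(processed_since_commit, every=1000):
    """Commit periodically instead of per message to limit write load."""
    return processed_since_commit >= every


# A consumer that has never committed falls back to the named spec;
# with the default "first" that means re-reading the whole retained log.
fresh_start = choose_start(None)
# A consumer that committed offset 41 resumes at 42.
resumed_start = choose_start(41)
```

The design point is that the fallback only applies when no offset exists, so a consumer that commits regularly pays the full replay cost at most once. The commit interval of 1000 is an arbitrary illustrative value; a larger interval means fewer offset writes but more reprocessing after a crash.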