RabbitMQ Error Guide: 'resource alarm' Memory/Disk Free Limit Reached

Overview

A resource alarm is RabbitMQ’s self-protection mechanism. When the broker’s memory use crosses the high watermark, or free disk space drops below the disk free limit, RabbitMQ raises an alarm and blocks all publishing connections to stop itself from running out of resources and crashing. Consumers keep draining; publishers are paused (TCP back-pressure) until the alarm clears.

You will see it in the broker log:

vm_memory_high_watermark set. Memory used:4203741184 allowed:4127195136
memory resource limit alarm set on node rabbit@mq-01.
**********************************************************
*** Publishers will be blocked until this alarm clears ***
**********************************************************

Or for disk:

disk resource limit alarm set on node rabbit@mq-01.
Free disk space is insufficient. Free bytes: 41943040. Limit: 50000000.
Publishers will be blocked until this alarm clears.

Clients do not get an AMQP error. Instead, their publish calls stall under TCP back-pressure, and clients that advertised the connection.blocked capability receive a connection.blocked notification. From the app’s perspective, publishes simply stop completing.

Symptoms

Publishers hang or report the connection as blocked; consumers continue normally.
connection.blocked notifications appear in client logs.
The broker log shows memory resource limit alarm set or disk resource limit alarm set.

rabbitmqctl status | grep -iA4 'alarms'

Alarms

 * {resource_limit,memory,rabbit@mq-01}

rabbitmqctl list_connections name state | grep -i block | head

10.0.5.31:51544 -> 10.0.4.21:5672  blocked
10.0.5.31:51602 -> 10.0.4.21:5672  blocking

blocked/blocking connection states confirm the publisher-side back-pressure.
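To put a number on the back-pressure, the connection list can be reduced to a count. The sketch below is a minimal example, assuming tab-separated name/state rows as rabbitmqctl prints them; count_blocked is a hypothetical helper name, and the third sample row (a running connection) is made up for illustration.

```shell
# count_blocked: reads "name<TAB>state" rows on stdin and prints how many
# connections are in the blocked or blocking state.
count_blocked() {
  awk -F'\t' '$NF == "blocked" || $NF == "blocking" { n++ } END { print n + 0 }'
}

# Sample rows: the first two come from the output above, the third is invented.
# On a live node, pipe in: rabbitmqctl -q list_connections name state
printf '%s\t%s\n' \
  '10.0.5.31:51544 -> 10.0.4.21:5672' 'blocked' \
  '10.0.5.31:51602 -> 10.0.4.21:5672' 'blocking' \
  '10.0.5.40:40112 -> 10.0.4.21:5672' 'running' | count_blocked
```

With the sample rows this prints 2. A count that keeps rising while the alarm is set means more publishers are running into the block.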

Common Root Causes

1. Memory high watermark crossed by a backed-up queue

A queue whose consumers stalled (or never connected) grows until total memory crosses the watermark.

rabbitmqctl list_queues -q --no-table-headers name messages messages_ready memory | sort -k4,4nr | head -5

events    1842301  1842301  3122884096
audit     412      0        184320

The events queue holds 1.8M messages using ~3.1 GB — its stuck consumers are the root cause of the memory alarm.

2. The memory high watermark is set too low for the host

A small vm_memory_high_watermark relative to actual RAM trips the alarm early.

rabbitmqctl environment | grep -A3 'vm_memory_high_watermark'

{vm_memory_high_watermark,{absolute,4000000000}},
{vm_memory_high_watermark_paging_ratio,0.5},

A 4 GB absolute limit on a 16 GB host wastes capacity and trips early. If the host genuinely has headroom, raise the watermark.
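The arithmetic behind that judgment is a simple ratio. The sketch below assumes decimal gigabytes for the host size; watermark_fraction is a hypothetical helper, not a RabbitMQ command.

```shell
# watermark_fraction LIMIT_BYTES TOTAL_RAM_BYTES
# Prints the share of RAM that an absolute watermark actually allows.
watermark_fraction() {
  awk -v limit="$1" -v total="$2" 'BEGIN { printf "%.2f\n", limit / total }'
}

# The case above: a 4 GB absolute limit on a 16 GB host (decimal GB assumed).
watermark_fraction 4000000000 16000000000

# If the host has headroom, the limit can be raised at runtime. The change
# does not survive a node restart unless it is also written to the config:
#   rabbitmqctl set_vm_memory_high_watermark 0.6
```

The ratio comes out at 0.25, so the broker alarms while three quarters of the host's RAM is still outside its budget.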

3. Disk free limit reached

Logs, message persistence, or another process filled the disk holding the RabbitMQ data directory below the free limit.

df -h $(rabbitmqctl eval 'rabbit_mnesia:dir().' | tr -d '"')

Filesystem  Size  Used Avail Use% Mounted on
/dev/sdb1    50G   49G  1.2G  98% /var/lib/rabbitmq

Only 1.2 GB is free. Compare the Avail column against the node's configured disk_free_limit: here it is below that limit, so the disk alarm is set. Free space or grow the volume.

4. Many connections/channels consuming memory

Tens of thousands of connections or channels, or large unacked message backlogs, push memory up independent of any single queue.

rabbitmqctl status | grep -A20 'Memory' | grep -E 'connection|channel|queue_procs|binary'

connection_readers: 0.18 gb
connection_channels: 0.21 gb
queue_procs: 2.9 gb
binary: 3.0 gb (1.4 gb available)

If binary/queue_procs dominate, message bodies in queues are the issue; if connection_* dominates, connection/channel sprawl is.

5. Unacked messages held in memory by slow consumers

Messages delivered but never acked stay in RAM. A high prefetch with slow processing inflates memory.

rabbitmqctl list_queues -q --no-table-headers name messages_unacknowledged | sort -k2,2nr | head -3

images   78422

78k unacked messages on images are pinned in memory until the consumer acks or the connection drops.

6. Disk free limit configured as a large absolute value

A misconfigured disk_free_limit (e.g., set to many GB) can keep the disk alarm permanently set on a modest volume.

rabbitmqctl environment | grep -A2 'disk_free_limit'

{disk_free_limit,{absolute,"10GB"}},

On a 50 GB disk that normally sits at 45 GB used, a 10 GB free requirement keeps the alarm latched. Tune to the volume.
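The same comparison can be written out as a check. This is a sketch under stated assumptions: disk_alarm_expected is a hypothetical helper, sizes use decimal gigabytes, and the 2 GB limit in the second call is an invented value for contrast.

```shell
# disk_alarm_expected FREE_BYTES LIMIT_BYTES
# Prints "alarm" when free space is below the limit, otherwise "ok".
disk_alarm_expected() {
  if [ "$1" -lt "$2" ]; then echo alarm; else echo ok; fi
}

GB=1000000000

# 50 GB disk with 45 GB used, checked against the 10 GB limit from above.
disk_alarm_expected $(( (50 - 45) * GB )) $(( 10 * GB ))

# Same disk against a 2 GB limit (hypothetical value).
disk_alarm_expected $(( (50 - 45) * GB )) $(( 2 * GB ))

# A limit can be changed at runtime, for example:
#   rabbitmqctl set_disk_free_limit 2GB
```

The first call prints alarm and the second prints ok: with 5 GB normally free, a 10 GB requirement can never be met on this volume.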

Diagnostic Workflow

Step 1: Confirm which alarm is set

rabbitmqctl status | grep -iA4 'Alarms'

{resource_limit,memory,...} vs. {resource_limit,disk,...} tells you whether to chase RAM or disk. Both can be set at once.

Step 2: For a memory alarm, find the heaviest queue

rabbitmqctl list_queues -q --no-table-headers name messages messages_ready messages_unacknowledged memory \
  | sort -k5,5nr | head -10

The top queue by memory is your target. Check whether it is backed up because consumers are stopped (messages_ready high) or because of unacked messages (messages_unacknowledged high).
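The ready-versus-unacked distinction can be applied mechanically to each row. The following is a minimal sketch, assuming whitespace-separated columns in the order name, messages, messages_ready, messages_unacknowledged, memory; classify_backlog is a hypothetical helper, and the images row's ready count and memory value are invented for illustration.

```shell
# classify_backlog: reads rows of
#   name messages messages_ready messages_unacknowledged memory
# and labels each queue by which counter dominates.
classify_backlog() {
  awk '{
    if ($3 + $4 == 0)   kind = "empty"
    else if ($3 >= $4)  kind = "ready-backlog"
    else                kind = "unacked-backlog"
    print $1, kind
  }'
}

# Illustrative rows: message counts for events and unacked count for images
# reuse figures from earlier examples; the remaining values are made up.
printf '%s\n' \
  'events 1842301 1842301 0 3122884096' \
  'images 78500 78 78422 96000000' | classify_backlog
```

A ready-backlog label points at stopped or missing consumers; an unacked-backlog label points at consumers that are connected but holding deliveries, often because of a high prefetch.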

Step 3: For a disk alarm, find what filled the volume

df -h $(rabbitmqctl eval 'rabbit_mnesia:dir().' | tr -d '"')
du -sh /var/lib/rabbitmq/mnesia/* 2>/dev/null | sort -h | tail -5

Distinguish message persistence growth from runaway logs or an unrelated process on the same filesystem.

Step 4: Take the corrective action that clears the alarm

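For memory: restore the stalled consumers, or purge a non-critical backlog. Note that purge_queue removes only ready messages; unacknowledged ones remain until they are acked or requeued.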

# restart/scale consumers, OR purge if data is disposable:
rabbitmqctl purge_queue events

For disk: free space or grow the volume.

sudo journalctl --vacuum-size=200M
# or extend the underlying volume / clean old logs

Step 5: Confirm the alarm cleared and publishers resumed

rabbitmqctl status | grep -iA4 'Alarms'
rabbitmqctl list_connections name state | grep -ci block

Empty alarms and zero blocked connections mean publishers are flowing again.
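Alarms do not always clear the instant pressure is relieved, so it can help to poll rather than check once. The sketch below is one way to do that in POSIX shell; wait_until_clear, blocked_count, and stub_count are hypothetical names, and the stub stands in for a live broker so the example runs anywhere.

```shell
# wait_until_clear MAX_CHECKS DELAY_SECONDS COMMAND...
# Re-runs COMMAND until it prints 0, or gives up after MAX_CHECKS tries.
# Variables are global because POSIX sh has no "local".
wait_until_clear() {
  max=$1; delay=$2; shift 2
  i=1
  while [ "$i" -le "$max" ]; do
    if [ "$("$@")" = "0" ]; then
      echo "clear after $i check(s)"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "still blocked after $max check(s)"
  return 1
}

# On a live node the counter could wrap the pipeline from this step.
# Defined here but not called in the dry run below.
blocked_count() { rabbitmqctl -q list_connections name state | grep -ci block; }

# Dry run with a stub counter so the sketch runs without a broker.
stub_count() { echo 0; }
wait_until_clear 3 0 stub_count
```

On a real node the call might look like wait_until_clear 30 10 blocked_count, which checks every 10 seconds for up to 5 minutes.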

Example Root Cause Analysis

Producers across several services report blocked connections; orders stop being published. rabbitmqctl status shows a memory alarm on rabbit@mq-01.

Finding the heaviest queue:

rabbitmqctl list_queues -q --no-table-headers name messages_ready messages_unacknowledged memory | sort -k4,4nr | head -3

notifications   2104339  0  3344302080
orders          88       12 401408

The notifications queue holds 2.1M ready messages using ~3.3 GB. Its consumer service had been scaled to zero during a cost-cutting change, so nothing was draining it while producers kept publishing. The queue grew until total broker memory crossed the watermark and blocked every publisher — including the unrelated orders producers.

Fix: bring the notifications consumers back online so the queue drains, and (because most of that backlog was stale marketing pushes) purge the bulk of it to clear the alarm immediately:

rabbitmqctl purge_queue notifications
rabbitmqctl status | grep -iA2 'Alarms'

Alarms

(none)

With the alarm cleared, blocked connections return to running and all publishers resume. The lasting fix was an alert on the notifications consumer count so a scale-to-zero cannot silently back up a shared broker again.

Prevention Best Practices

Set vm_memory_high_watermark.relative to a sensible fraction (e.g., 0.6) of real RAM rather than a stale absolute value, and size the host for peak queue depth.
Alert on queue depth and on messages_unacknowledged per queue, so a stalled consumer is caught long before it trips a broker-wide alarm.
Monitor free disk on the RabbitMQ data volume and keep disk_free_limit proportional to that volume.
Use quorum queues with x-max-length / x-overflow=reject-publish on queues that must never grow unbounded, so a single queue rejects publishes instead of alarming the whole node.
Treat consumer count as a first-class SLO on shared brokers — a scale-to-zero on one queue should never be able to block every publisher.
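The bounded-queue practice above can be applied with a policy instead of per-queue arguments. The sketch below only builds the policy definition; bounded_policy_json and the policy name bounded-events are hypothetical, and the length of 1000000 is an arbitrary illustration rather than a recommended value.

```shell
# bounded_policy_json MAX_LENGTH
# Prints a policy definition that caps queue length and rejects further
# publishes once the cap is reached.
bounded_policy_json() {
  printf '{"max-length":%d,"overflow":"reject-publish"}\n' "$1"
}

bounded_policy_json 1000000

# Applying it to a queue named events could look like this:
#   rabbitmqctl set_policy bounded-events '^events$' \
#     "$(bounded_policy_json 1000000)" --apply-to queues
```

Publishers only learn that a message was rejected if they use publisher confirms, so this practice pairs naturally with confirms on the producing side.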

Quick Command Reference

# Which alarm is set?
rabbitmqctl status | grep -iA4 'Alarms'

# Blocked publishers
rabbitmqctl list_connections name state | grep -i block

# Heaviest queues by memory
rabbitmqctl list_queues -q --no-table-headers name messages_ready messages_unacknowledged memory | sort -k4,4nr | head -10

# Watermark and disk limit config
rabbitmqctl environment | grep -A3 'vm_memory_high_watermark\|disk_free_limit'

# Disk usage of the data dir
df -h $(rabbitmqctl eval 'rabbit_mnesia:dir().' | tr -d '"')

# Clear a memory alarm by draining/purging
rabbitmqctl purge_queue <QUEUE>

Conclusion

A resource alarm blocks publishers to keep the broker alive. The fix is to relieve the resource pressure so the alarm clears. The usual root causes:

A backed-up queue (stalled consumers) pushing memory past the high watermark.
A watermark or disk free limit set too low/high for the actual host.
The RabbitMQ data volume running out of free disk.
Connection/channel sprawl or large unacked backlogs inflating memory.
Slow consumers with high prefetch pinning unacked messages in RAM.

Identify whether it is a memory or disk alarm, find the queue or filesystem responsible, relieve it (restore consumers, purge, or free disk), then confirm the alarm cleared and publishers resumed.
