RabbitMQ Error Guide: '{error, queue_process_is_stopped}' Queue Process Down
Fix RabbitMQ queue_process_is_stopped errors: classic, mirrored, or quorum queue process down on a failed node, failed delete_queue, and stuck operations.
Exact Error Message
{error, queue_process_is_stopped} is an internal Erlang error raised when an operation targets a queue whose backing Erlang process has stopped, typically because the node hosting it crashed, is shutting down, or has been partitioned away. It appears in broker logs, in rabbitmqctl output, and as a failed delete:
Operation failed: {error, queue_process_is_stopped}
2026-06-29 14:22:51.118 [error] <0.7421.0> Error on AMQP connection <0.7390.0>:
operation queue.delete caused an internal error: {error, queue_process_is_stopped}
failed to delete queue 'orders' in vhost '/'
[warning] <0.812.0> Queue 'metrics' in vhost '/': process is stopped, leader on down node 'rabbit@node3'
What the Error Means
In RabbitMQ every queue is backed by one or more Erlang processes. A classic queue has a single master process on its home node; a classic mirrored queue has a master plus mirror processes; a quorum queue runs a Raft member process on each replica node. When the node that owns the active process goes down (crash, OOM, kill, partition, or shutdown), that process stops.
queue_process_is_stopped means a request reached the broker for that queue, but the process expected to handle it is no longer alive. The broker cannot service declares, deletes, publishes, or consumes against a stopped process, so the operation errors out. It is a liveness problem — the metadata still exists, but the worker behind it is gone.
Common Causes
1. The queue’s home node crashed or was stopped
A classic (non-replicated) queue lives entirely on one node. If that node is down, the queue process is stopped and every operation against it fails until the node returns.
2. Network partition isolated the owning node
In a partition, the side without the queue’s master sees its process as stopped. Operations routed there fail even though the node hosting the master may still be running on the other side.
3. Quorum/mirrored queue lost too many replicas
A quorum queue needs a majority of replicas online to elect a leader. If enough replica nodes are down, no leader exists and member processes on surviving nodes report stopped/unavailable.
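The majority rule is easy to make concrete with integer arithmetic. The following is a minimal sketch; `quorum_size` and `tolerated_failures` are hypothetical helper names, not part of any RabbitMQ tool.

```shell
# Hypothetical helpers, not RabbitMQ commands: plain integer arithmetic.

# Smallest number of online replicas that forms a majority of N replicas.
quorum_size() {
  echo $(( $1 / 2 + 1 ))
}

# How many replicas can be down while a majority still remains.
tolerated_failures() {
  echo $(( $1 - ($1 / 2 + 1) ))
}

quorum_size 3          # prints 2
tolerated_failures 3   # prints 1
tolerated_failures 5   # prints 2
```

Note that a 4-replica queue tolerates only one failure, the same as a 3-replica queue, which is why even replica counts buy nothing extra.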
4. Node shutdown mid-operation
Deleting or declaring a queue while its node is draining/restarting races the process teardown, producing queue_process_is_stopped on delete_queue.
5. Crashed queue process due to resource exhaustion
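A queue process can crash on an Erlang exception, or die with its node when the operating system kills the broker for running out of memory. Memory and disk alarms on their own block publishers rather than kill queue processes, but they are an early warning of this condition. In either case the metadata remains while the process stays stopped until the supervisor restarts it or the node recovers.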
How to Reproduce the Error
This requires a multi-node cluster. Conceptually: place a classic queue on a node, stop that node, then operate on the queue from a surviving node.
# On node1: confirm the queue lives on node3
rabbitmqctl list_queues name node state | grep orders
orders rabbit@node3 running
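# On node3: stop the broker to simulate the node going down (assumes a systemd-managed install)
sudo systemctl stop rabbitmq-server
# On node1: confirm node3 is no longer running, then try to delete the queue
rabbitmqctl -n rabbit@node1 cluster_status
rabbitmqctl -n rabbit@node1 delete_queue orders
Error: {error, queue_process_is_stopped}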
With node3 down, the orders process is stopped and the delete cannot complete.
Diagnostic Commands
# Where does the queue live and what is its state?
rabbitmqctl list_queues name node state type leader members online | grep <QUEUE>
# Cluster membership, running nodes, and any partitions
rabbitmqctl cluster_status
rabbitmq-diagnostics cluster_status
# Are the expected nodes actually up?
rabbitmq-diagnostics ping -n rabbit@node3
rabbitmq-diagnostics check_running -n rabbit@node1
# Quorum queue health: leader, members, and which replicas are online
rabbitmq-queues quorum_status <QUEUE>
# Resource alarms that may have killed the process
rabbitmq-diagnostics alarms
rabbitmq-diagnostics memory_breakdown
# Broker log for the stopped-process and node-down events
journalctl -u rabbitmq-server --since "30 min ago" | grep -iE "queue_process_is_stopped|nodedown|partition|down node"
list_queues ... node state tells you instantly whether the queue’s home node matches a node missing from cluster_status.
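To see at a glance which kind of event dominates, the log search above can be reduced to a tally. This is a rough sketch: `tally_queue_events` is a hypothetical helper, and the patterns simply mirror the grep expression used above, so they may need adjusting to your broker version's log wording.

```shell
# Hypothetical helper: count relevant event types in log text read from stdin.
# A single line may be counted in more than one bucket.
tally_queue_events() {
  awk '
    /queue_process_is_stopped/ { stopped++ }
    /nodedown|down node/       { nodedown++ }
    /[Pp]artition/             { partition++ }
    END {
      printf "stopped=%d nodedown=%d partition=%d\n", stopped, nodedown, partition
    }
  '
}

# Usage (assumes a systemd unit named rabbitmq-server, as in the command above):
# journalctl -u rabbitmq-server --since "30 min ago" | tally_queue_events
```

A high nodedown count alongside the stopped-process errors points at a lost node; a non-zero partition count suggests checking partition handling first.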
Step-by-Step Resolution
Step 1: Identify the queue’s home node and whether it is up
rabbitmqctl list_queues name node state | grep <QUEUE>
rabbitmqctl cluster_status
If the queue’s node is not in the running-nodes list, the process is stopped because that node is down. That is your root cause.
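The comparison in this step can be scripted once you have both values in hand. Below is a minimal sketch with a hypothetical helper, `home_node_is_running`; it deliberately takes the node names as arguments instead of parsing cluster_status output, because that output format varies between versions.

```shell
# Hypothetical helper: succeed if the first argument (the queue's home node)
# appears among the remaining arguments (the running nodes).
home_node_is_running() {
  home="$1"
  shift
  for node in "$@"; do
    [ "$node" = "$home" ] && return 0
  done
  return 1
}

# Example with hard-coded values; in practice, copy them from the
# list_queues and cluster_status output.
if home_node_is_running rabbit@node3 rabbit@node1 rabbit@node2; then
  echo "home node is up: look for a partition or a crashed process"
else
  echo "home node is down: that is the root cause"
fi
```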
Step 2: Check for a network partition
rabbitmq-diagnostics cluster_status | grep -A5 -i partition
If a partition is reported, healing it, whether by hand or through your configured cluster_partition_handling strategy, may restore the process without a manual restart of the queue's node.
Step 3: Bring the owning node back
Restart rabbitmq-server on the down node. Once it rejoins the cluster, classic queue processes restart on their home node and the error clears. Verify:
rabbitmq-diagnostics check_running -n rabbit@node3
rabbitmqctl list_queues name node state | grep <QUEUE>
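A restarted node can take a while to finish booting and rejoin, so a single check may fail spuriously. One option is a small retry wrapper; `wait_until` below is a hypothetical helper, and the attempt count and delay in the usage comment are arbitrary assumptions.

```shell
# Hypothetical helper: retry a command until it succeeds or attempts run out.
# Usage: wait_until <attempts> <delay_seconds> <command> [args...]
wait_until() {
  attempts="$1"
  delay="$2"
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
    # Sleep only if another attempt will follow.
    [ "$i" -le "$attempts" ] && sleep "$delay"
  done
  return 1
}

# Example (node name taken from this guide; adjust to your cluster):
# wait_until 30 2 rabbitmq-diagnostics check_running -n rabbit@node3 \
#   && echo "node3 is running again"
```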
Step 4: For quorum queues, restore a majority
rabbitmq-queues quorum_status <QUEUE>
Bring enough replica nodes online to form a majority so a leader can be elected. Until a majority exists, the queue stays unavailable by design.
Step 5: Handle a permanently lost node
If the home node is gone for good, a non-replicated classic queue and its messages are lost; recreate the queue (ideally as a quorum queue) once the cluster is stable. For quorum queues, you may need to shrink membership to drop the dead replica after the cluster is healthy.
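For the quorum queue case, membership is managed with the rabbitmq-queues tool. The sketch below only prints the command so it can be reviewed before anything is changed; `dead_replica_plan` is a hypothetical helper, and the vhost, queue, and node names are the examples used in this guide. Removing the node itself from the cluster (usually with rabbitmqctl forget_cluster_node) is a separate step, and whether both steps are needed in your version is an assumption to verify against the official documentation.

```shell
# Hypothetical helper: PRINT (do not run) the command for removing a
# permanently lost node's replica of one quorum queue.
# Usage: dead_replica_plan <vhost> <queue> <dead_node>
dead_replica_plan() {
  printf 'rabbitmq-queues delete_member --vhost %s %s %s\n' "$1" "$2" "$3"
}

dead_replica_plan / orders rabbit@node3
# prints: rabbitmq-queues delete_member --vhost / orders rabbit@node3
```

Printing first is a deliberate choice: membership changes are hard to undo, so they are best run by hand once the cluster is confirmed healthy.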
Step 6: Verify operations succeed
rabbitmqctl list_queues name node state messages | grep <QUEUE>
A running state on a live node means the process is back and declares/deletes/consumes will work.
Prevention and Best Practices
- Use quorum queues (or streams) for anything that must survive a node failure — classic non-replicated queues die with their home node.
- Run an odd number of nodes (3 or 5) so that quorum queues keep a majority after losing one node (of 3) or two nodes (of 5).
- Configure a sane cluster_partition_handling strategy (pause_minority or autoheal) to avoid split-brain stopped processes.
- Monitor node liveness and resource alarms so you catch a node going down before operations start failing.
- Spread queue leaders across nodes so a single node failure does not stop a disproportionate share of queues.
- Avoid declaring/deleting queues during planned node drains; do topology changes when the cluster is fully healthy.
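As an illustration of the partition-handling item above, the strategy is set with a single key in rabbitmq.conf. The sketch below just generates that line for review; `partition_handling_conf` is a hypothetical helper, and it accepts only the two strategies named in this guide.

```shell
# Hypothetical helper: emit the rabbitmq.conf line for a partition strategy.
partition_handling_conf() {
  case "$1" in
    pause_minority|autoheal)
      printf 'cluster_partition_handling = %s\n' "$1" ;;
    *)
      echo "unsupported strategy in this sketch: $1" >&2
      return 1 ;;
  esac
}

partition_handling_conf pause_minority
# prints: cluster_partition_handling = pause_minority
```

The same setting should be applied on every node, since a cluster with mixed strategies behaves unpredictably during a partition.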
Related Errors
- home node ... is down: the user-facing form of the same underlying node-down condition for classic queues.
- quorum queue ... no leader elected: a quorum queue without a majority; the leaderless cousin of a stopped process.
- Mnesia network partition / split-brain: the partition that often causes processes to appear stopped.
- NOT_FOUND - no queue: if the queue was deleted while its node was down, it disappears entirely.
- CHANNEL_ERROR: clients reusing channels after operations fail with the internal error.
Frequently Asked Questions
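Is my data lost when I see this error? Not necessarily. For a classic non-replicated queue on a down node, the messages are unavailable until that node returns; persistent messages in a durable queue are recovered when it restarts, while transient messages are lost. If the node never returns, everything in the queue is gone. Quorum queues retain data as long as a majority of replicas survives.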
Will the process restart on its own? If the home node restarts and rejoins, classic queue processes restart automatically. Quorum queue members restart with their nodes, and the queue becomes available again once a majority is reachable and a leader is elected.
Can I delete a queue whose process is stopped? Usually not while the node is down, because delete_queue itself returns the error. Bring the node back, or for quorum queues remove the dead member after the cluster is healthy.
Does this mean the whole cluster is down? No. It means the specific node hosting that queue’s active process is unavailable. Other queues on healthy nodes keep working.
How do I prevent it for critical queues? Make them quorum queues on a 3+ node cluster with partition handling configured. See the RabbitMQ guides for HA topology patterns.