RabbitMQ Error Guide: '{error, queue_process_is_stopped}' Queue Process Down

Exact Error Message

{error, queue_process_is_stopped} is an internal Erlang error raised when an operation targets a queue whose backing Erlang process has stopped — typically because the node hosting it crashed, is shutting down, or partitioned away. It appears in broker logs, rabbitmqctl output, and as a delete failure:

Operation failed: {error, queue_process_is_stopped}

2026-06-29 14:22:51.118 [error] <0.7421.0> Error on AMQP connection <0.7390.0>:
operation queue.delete caused an internal error: {error, queue_process_is_stopped}
failed to delete queue 'orders' in vhost '/'

[warning] <0.812.0> Queue 'metrics' in vhost '/': process is stopped, leader on down node 'rabbit@node3'

What the Error Means

In RabbitMQ every queue is backed by one or more Erlang processes. A classic queue has a single master process on its home node; a classic mirrored queue has a master plus mirror processes; a quorum queue runs a Raft member process on each replica node. When the node that owns the active process goes down (crash, OOM, kill, partition, or shutdown), that process stops.

queue_process_is_stopped means a request reached the broker for that queue, but the process expected to handle it is no longer alive. The broker cannot service declares, deletes, publishes, or consumes against a stopped process, so the operation errors out. It is a liveness problem — the metadata still exists, but the worker behind it is gone.

Common Causes

1. The queue’s home node crashed or was stopped

A classic (non-replicated) queue lives entirely on one node. If that node is down, the queue process is stopped and every operation against it fails until the node returns.

2. Network partition isolated the owning node

In a partition, the side without the queue’s master sees its process as stopped. Operations routed there fail even though the node hosting the master may still be running on the other side.

3. Quorum/mirrored queue lost too many replicas

A quorum queue needs a majority of replicas online to elect a leader. If enough replica nodes are down, no leader exists and member processes on surviving nodes report stopped/unavailable.
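The majority rule can be checked with a few lines of shell arithmetic. This is a minimal sketch, not part of RabbitMQ: the function names majority and has_quorum are hypothetical helpers, and the member counts are assumed to be read by hand from the queue's status output.

```shell
# majority TOTAL: smallest number of members that forms a majority of TOTAL.
majority() {
  local total=$1
  echo $(( total / 2 + 1 ))
}

# has_quorum ONLINE TOTAL: succeeds when ONLINE members are a majority of TOTAL.
# (Hypothetical helper; the counts are supplied by the operator.)
has_quorum() {
  local online=$1 total=$2
  [ "$online" -ge "$(majority "$total")" ]
}
```

For example, `has_quorum 2 3` succeeds (two of three replicas can elect a leader), while `has_quorum 2 4` fails, because four members need three online.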

4. Node shutdown mid-operation

Deleting or declaring a queue while its node is draining/restarting races the process teardown, producing queue_process_is_stopped on delete_queue.

5. Crashed queue process due to resource exhaustion

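A queue process can die from an Erlang crash or from memory exhaustion on its node, leaving the metadata present but the process stopped until the supervisor restarts it (or the node recovers). Memory and disk alarms do not kill queue processes themselves (they block publishers), but they are an early sign of this kind of pressure.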

How to Reproduce the Error

This requires a multi-node cluster. Conceptually: place a classic queue on a node, stop that node, then operate on the queue from a surviving node.

# On node1: confirm the queue lives on node3
rabbitmqctl list_queues name node state | grep orders

orders  rabbit@node3  running

# Stop node3 to simulate the failure, then operate on the queue from a surviving node
rabbitmqctl -n rabbit@node3 stop_app
rabbitmqctl -n rabbit@node1 cluster_status
rabbitmqctl -n rabbit@node1 delete_queue orders

Error: {error, queue_process_is_stopped}

With node3 down, the orders process is stopped and the delete cannot complete.

Diagnostic Commands

# Where does the queue live and what is its state?
rabbitmqctl list_queues name node state type leader members online | grep <QUEUE>

# Cluster membership, running nodes, and any partitions
rabbitmqctl cluster_status
rabbitmq-diagnostics cluster_status

# Are the expected nodes actually up?
rabbitmq-diagnostics ping -n rabbit@node3
rabbitmq-diagnostics check_running -n rabbit@node1

# Quorum queue health: leader, members, and which replicas are online
rabbitmq-queues quorum_status <QUEUE>

# Resource alarms and memory use that may point to resource exhaustion
rabbitmq-diagnostics alarms
rabbitmq-diagnostics memory_breakdown

# Broker log for the stopped-process and node-down events
journalctl -u rabbitmq-server --since "30 min ago" | grep -iE "queue_process_is_stopped|nodedown|partition|down node"

list_queues ... node state tells you instantly whether the queue’s home node matches a node missing from cluster_status.
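When many queues are affected, it can help to summarise which nodes the log blames. The sketch below assumes log lines shaped like the warning shown under Exact Error Message (ending in down node '<name>'); the function name down_nodes is a hypothetical helper, and other RabbitMQ versions may word the message differently.

```shell
# down_nodes: read broker log lines on stdin and print each distinct node name
# that follows the words "down node", with the number of lines mentioning it.
# Assumes the log format quoted earlier in this guide.
down_nodes() {
  sed -n "s/.*down node '\([^']*\)'.*/\1/p" | sort | uniq -c | awk '{print $2, $1}'
}
```

A possible use is to pipe the journalctl command above into it: journalctl -u rabbitmq-server --since "30 min ago" | down_nodes. A node that dominates the counts is the first one to check.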

Step-by-Step Resolution

Step 1: Identify the queue’s home node and whether it is up

rabbitmqctl list_queues name node state | grep <QUEUE>
rabbitmqctl cluster_status

If the queue’s node is not in the running-nodes list, the process is stopped because that node is down. That is your root cause.
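The comparison in this step can be scripted. This is a sketch under stated assumptions: the input rows are whitespace-separated name, node, state columns like the sample output in How to Reproduce the Error, the running node names are passed in by hand from cluster_status, and queues_on_down_nodes is a hypothetical helper name.

```shell
# queues_on_down_nodes RUNNING...: read "name node state" rows on stdin and
# print "name node" for every queue whose node is not among the RUNNING names.
queues_on_down_nodes() {
  local running=" $* "
  local name node state
  while read -r name node state; do
    [ -n "$name" ] || continue            # skip blank lines
    case "$running" in
      *" $node "*) ;;                     # home node is running, nothing to report
      *) echo "$name $node" ;;            # home node missing from the running list
    esac
  done
}
```

Feeding it the rows from the listing in this step, with the running nodes as arguments (for example queues_on_down_nodes rabbit@node1 rabbit@node2), prints only the queues stranded on a node that is not running.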

Step 2: Check for a network partition

rabbitmq-diagnostics cluster_status | grep -A5 -i partition

If a partition is reported, healing it (manually or through your configured cluster_partition_handling strategy) may restore the process without any further manual restart.

Step 3: Bring the owning node back

Restart rabbitmq-server on the down node. Once it rejoins the cluster, classic queue processes restart on their home node and the error clears. Verify:

rabbitmq-diagnostics check_running -n rabbit@node3
rabbitmqctl list_queues name node state | grep <QUEUE>
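A restarted node can take a while to rejoin, so the verification is easier to script as a poll with a timeout. The following is a minimal sketch: wait_until is a hypothetical helper, not a RabbitMQ command, and the timeout and interval values in the usage note are arbitrary.

```shell
# wait_until TIMEOUT INTERVAL CMD...: rerun CMD every INTERVAL seconds until it
# succeeds; return 1 if it is still failing once TIMEOUT seconds have been waited.
wait_until() {
  local timeout=$1 interval=$2 waited=0
  shift 2
  until "$@" >/dev/null 2>&1; do
    [ "$waited" -ge "$timeout" ] && return 1
    sleep "$interval"
    waited=$(( waited + interval ))
  done
  return 0
}
```

For instance, wait_until 120 5 rabbitmq-diagnostics check_running -n rabbit@node3 polls every 5 seconds for up to about two minutes and fails if the node has not come up by then.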

Step 4: For quorum queues, restore a majority

rabbitmq-queues quorum_status <QUEUE>

Bring enough replica nodes online to form a majority so a leader can be elected. Until a majority exists, the queue stays unavailable by design.

Step 5: Handle a permanently lost node

If the home node is gone for good, a non-replicated classic queue and its messages are lost; recreate the queue (ideally as a quorum queue) once the cluster is stable. For quorum queues, drop the dead replica once the remaining members hold a majority, using rabbitmq-queues delete_member for a single queue or rabbitmq-queues shrink for every quorum queue with a member on that node.

Step 6: Verify operations succeed

rabbitmqctl list_queues name node state messages | grep <QUEUE>

A running state on a live node means the process is back and declares/deletes/consumes will work.

Prevention and Best Practices

Use quorum queues (or streams) for anything that must survive a node failure — classic non-replicated queues die with their home node.
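Run an odd number of nodes (3 or 5) so quorum queues keep a majority after losing a minority of nodes: 1 of 3, or 2 of 5. An even-sized cluster tolerates no more failures than the next smaller odd size.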
Configure a sane cluster_partition_handling strategy (pause_minority or autoheal) to avoid split-brain stopped processes.
Monitor node liveness and resource alarms so you catch a node going down before operations start failing.
Spread queue leaders across nodes so a single node failure does not stop a disproportionate share of queues.
Avoid declaring/deleting queues during planned node drains; do topology changes when the cluster is fully healthy.

Related Errors

home node ... is down: the user-facing form of the same underlying node-down condition for classic queues.
quorum queue ... no leader elected: a quorum queue without a majority; the leaderless cousin of a stopped process.
Mnesia network partition / split-brain: the partition that often causes processes to appear stopped.
NOT_FOUND - no queue: if the queue was deleted while its node was down, it disappears entirely.
CHANNEL_ERROR: clients reusing channels after operations fail with the internal error.

Frequently Asked Questions

Is my data lost when I see this error? For a classic non-replicated queue on a dead node, the messages are unavailable until that node returns, and lost permanently if it never does. Transient (non-persistent) messages do not survive the node restart either way. Quorum queues retain data as long as a majority of replicas survives.

Will the process restart on its own? If the home node restarts and rejoins, classic queue processes restart automatically. Quorum members restart once a majority is reachable.

Can I delete a queue whose process is stopped? Usually not while the node is down — delete_queue itself returns the error. Bring the node back, or for quorum queues remove the dead member after the cluster is healthy.

Does this mean the whole cluster is down? No. It means the specific node hosting that queue’s active process is unavailable. Other queues on healthy nodes keep working.

How do I prevent it for critical queues? Make them quorum queues on a 3+ node cluster with partition handling configured. See the RabbitMQ guides for HA topology patterns.
