RabbitMQ Error Guide: 'inconsistent_cluster' Node Disagrees on Membership
Fix RabbitMQ inconsistent_cluster errors where a node thinks it's clustered with a peer that disagrees. Caused by stale state after a reset or forget. Recover safely.
- #rabbitmq
- #troubleshooting
- #errors
- #clustering
Exact Error Message
A node fails to start and the boot log shows that its view of cluster membership conflicts with a peer’s:
BOOT FAILED
===========
Error during startup: {error,
{inconsistent_cluster,
"Node rabbit@mq-02 thinks it's clustered with node rabbit@mq-01, but rabbit@mq-01 disagrees"}}
You may also see it surfaced in the crash report and journal:
=ERROR REPORT====
** Cannot start application rabbit:
{bad_return,
{{rabbit,start,[normal,[]]},
{error,{inconsistent_cluster,"... but rabbit@mq-01 disagrees"}}}}
What the Error Means
Every RabbitMQ node keeps a local copy of cluster membership in its Mnesia database. When a node boots and it is configured to be part of a cluster, it contacts the peers it believes it belongs to and asks them to confirm the relationship. The inconsistent_cluster error means the booting node (mq-02) still has mq-01 recorded as a clustermate, but mq-01 no longer has mq-02 in its own membership — the two databases disagree about who is in the cluster.
This is a deliberate refusal to start. RabbitMQ will not let a node join on a one-sided belief, because doing so could corrupt metadata or resurrect state the rest of the cluster has already discarded. The disagreement is almost always the result of an action taken on the other node while this one was offline: the peer ran forget_cluster_node, or was reset and rebuilt, removing the link from its side only.
The key mental model: cluster membership must be mutually consistent. The booting node holds stale state; the running cluster holds the current truth. Recovery means making the stale node forget its old view and rejoin cleanly — not forcing the cluster to accept it.
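Because the message names both parties, the two node names can be extracted mechanically when triaging logs. The following is a minimal sketch, assuming the message uses exactly the wording shown in the boot log above; `parse_inconsistent_cluster` is a hypothetical helper name, not part of RabbitMQ.

```shell
# Hypothetical helper: reads log text on stdin and prints
# "<booting node> <peer that disagrees>" for each full inconsistent_cluster message.
# Assumes the message wording shown in the boot log above; truncated
# variants (such as the "..." form in the crash report) are ignored.
parse_inconsistent_cluster() {
  sed -nE "s/.*Node ([^ ]+) thinks it's clustered with node ([^ ,]+), but [^ ]+ disagrees.*/\1 \2/p"
}
```

One possible use is `journalctl -u rabbitmq-server | parse_inconsistent_cluster`, which would print the stale node first and the authoritative peer second.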
Common Causes
- A peer ran forget_cluster_node for this node while it was down, so the cluster removed it but its own Mnesia still references the cluster.
- The peer was reset (reset/force_reset) or reprovisioned, getting a fresh Mnesia, while this node kept the old membership.
- A node was restored from a stale backup/snapshot of its data directory that predates a membership change.
- A failed or interrupted join/leave left one side updated and the other not.
- A node was renamed (hostname or RABBITMQ_NODENAME change), so peers no longer recognize the identity recorded locally.
- A disk/Mnesia directory was swapped between hosts during a migration, carrying the wrong membership.
How to Reproduce the Error
Stop one node, forget it from a surviving node, then start it again:
# On mq-02: take the node offline
rabbitmqctl stop_app
# On mq-01: remove mq-02 from the cluster while it is offline
rabbitmqctl forget_cluster_node rabbit@mq-02
# On mq-02: starting now -> inconsistent_cluster, because mq-02 still
# believes it is clustered with mq-01, but mq-01 has forgotten it
rabbitmqctl start_app
The booting mq-02 reports it thinks it is clustered with mq-01, but mq-01 disagrees.
Diagnostic Commands
From a healthy node, confirm the authoritative membership and whether the troubled node is still listed:
# Current, agreed membership from a running node
rabbitmqctl cluster_status
Disk Nodes
rabbit@mq-01
rabbit@mq-03
Running Nodes
rabbit@mq-01
rabbit@mq-03
Note that mq-02 is absent: the cluster has already forgotten it. On the troubled node, pull the boot failure from the journal:
sudo journalctl -u rabbitmq-server --since '-20min' | grep -iE 'inconsistent_cluster|BOOT FAILED|disagrees'
Check that distribution and the cookie are otherwise fine, so you know the problem is membership, not connectivity:
epmd -names
rabbitmq-diagnostics -n rabbit@mq-01 ping
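Inspect the troubled node's recorded view. This works while the Erlang VM is still up with the rabbit application stopped (for example after stop_app); if the node process has exited entirely, the command cannot connect and will fail: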
rabbitmq-diagnostics cluster_status 2>/dev/null || rabbitmqctl cluster_status
Step-by-Step Resolution
Step 1: Establish which side is authoritative
The running cluster is the source of truth. The booting node holds stale membership. Do not try to make the cluster accept the stale view — reconcile the stale node to the cluster.
Step 2: Confirm the cluster already forgot the node
From a healthy node, cluster_status should show the troubled node absent from Disk Nodes. If it is still listed, the mismatch lies elsewhere: confirm that the peer named in the error message really is a member of the current cluster and was not itself reset or rebuilt.
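The membership check in this step can be scripted. The sketch below assumes the plain-text cluster_status layout shown under Diagnostic Commands (a heading line followed by one node name per line). `node_in_section` is a hypothetical helper, and other RabbitMQ versions may format the output differently.

```shell
# Hypothetical helper: succeeds if NODE is listed under the heading SECTION
# in cluster_status text read from stdin.
# Assumes a heading line such as "Disk Nodes" followed by one node per line.
node_in_section() {
  awk -v section="$1" -v node="$2" '
    { gsub(/^[ \t]+|[ \t]+$/, "") }     # trim surrounding whitespace
    $0 == ""            { next }        # skip blank lines
    # a non-empty line without an at-sign is treated as a section heading
    index($0, "@") == 0 { in_section = ($0 == section); next }
    in_section && $0 == node { found = 1 }
    END { exit(found ? 0 : 1) }
  '
}
```

For example, `rabbitmqctl cluster_status | node_in_section "Disk Nodes" rabbit@mq-02` run on a healthy node would exit non-zero when the cluster has already forgotten mq-02.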
Step 3: Reset the stale node’s local state
On the troubled node, stop the app and reset its Mnesia so it drops the obsolete membership. (reset discards local data on that node — acceptable because the cluster holds the authoritative metadata.)
rabbitmqctl stop_app
rabbitmqctl reset
If a plain reset itself fails because the node cannot reach its old peers, use force_reset to clear local state unconditionally.
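The reset-then-force_reset decision in Step 3 can be wrapped so the unconditional variant only runs when explicitly allowed. This is a sketch under stated assumptions: `reset_stale_node`, the `--allow-force` flag, and the overridable `RABBITMQCTL` variable are all hypothetical names introduced here for illustration.

```shell
# Assumption: RABBITMQCTL may be overridden, e.g. with a stub for a dry run.
RABBITMQCTL="${RABBITMQCTL:-rabbitmqctl}"

# Hypothetical helper: stop the app, try a normal reset, and fall back to
# force_reset only when the caller passes --allow-force.
reset_stale_node() {
  "$RABBITMQCTL" stop_app || return 1
  if "$RABBITMQCTL" reset; then
    echo "reset succeeded"
  elif [ "${1:-}" = "--allow-force" ]; then
    echo "reset failed; falling back to force_reset" >&2
    "$RABBITMQCTL" force_reset && echo "force_reset succeeded"
  else
    echo "reset failed; confirm the cluster is authoritative, then re-run with --allow-force" >&2
    return 1
  fi
}
```

Requiring an explicit flag keeps force_reset a deliberate choice rather than an automatic retry.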
Step 4: Rejoin the node to the live cluster
Point the freshly-reset node at a running member and rejoin:
rabbitmqctl join_cluster rabbit@mq-01
rabbitmqctl start_app
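For review before touching a production node, it may help to print the full command sequence from Steps 3 and 4 rather than run it. The sketch below only echoes the commands; `rejoin_plan` is a hypothetical helper, and the name@host check is a simple sanity test, not a validation of real node names.

```shell
# Hypothetical helper: print the reset-and-rejoin commands for a given peer
# instead of executing them, so the plan can be reviewed first.
rejoin_plan() {
  peer="${1:-}"
  case "$peer" in
    ?*@?*) ;;   # expect name@host, e.g. rabbit@mq-01
    *) echo "usage: rejoin_plan <name@host>" >&2; return 2 ;;
  esac
  printf '%s\n' \
    "rabbitmqctl stop_app" \
    "rabbitmqctl reset" \
    "rabbitmqctl join_cluster $peer" \
    "rabbitmqctl start_app"
}
```

Calling `rejoin_plan rabbit@mq-01` prints the four commands in the order they would be run on the stale node.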
Step 5: Verify mutual agreement
rabbitmqctl cluster_status
The rejoined node should now appear in Disk Nodes and Running Nodes on every node, with both sides agreeing.
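Mutual agreement can be checked by comparing the Disk Nodes list as seen from two nodes. The following sketch assumes each node's cluster_status output was saved to a file and uses the plain-text layout shown under Diagnostic Commands; `disk_nodes` and `views_agree` are hypothetical helpers.

```shell
# Hypothetical helper: print the sorted "Disk Nodes" entries from
# cluster_status text on stdin (assumes one node name per line under the heading).
disk_nodes() {
  awk '
    { gsub(/^[ \t]+|[ \t]+$/, "") }     # trim surrounding whitespace
    $0 == ""            { next }        # skip blank lines
    # a non-empty line without an at-sign is treated as a section heading
    index($0, "@") == 0 { in_section = ($0 == "Disk Nodes"); next }
    in_section          { print }
  ' | sort
}

# Succeeds when two saved cluster_status outputs list the same disk nodes,
# regardless of the order in which each node printed them.
views_agree() {
  [ "$(disk_nodes < "$1")" = "$(disk_nodes < "$2")" ]
}
```

The input files could be produced with `rabbitmqctl -n rabbit@mq-01 cluster_status > mq-01.txt` and the same command for rabbit@mq-02, followed by `views_agree mq-01.txt mq-02.txt`.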
Step 6: If you restored from a stale backup, prefer a clean rejoin
When the cause was a stale data directory, do not keep trying to boot it. Reset and rejoin so the node re-syncs definitions from the authoritative cluster rather than reintroducing outdated metadata.
Prevention and Best Practices
- Always pair membership changes. If you forget_cluster_node a member, do not later boot the old data directory expecting it to rejoin; reset it first.
- Reset before rejoining, as a standard runbook step, whenever a node has been out of the cluster across a membership change.
- Avoid restoring single-node snapshots into a live cluster; let nodes re-sync from peers instead.
- Keep node names stable. Renames change identity and invite membership mismatches; pin RABBITMQ_NODENAME.
- Document forget/rejoin procedures so on-call operators do not boot stale nodes by reflex.
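- Use quorum queues, which track replica membership per queue through Raft rather than relying only on the node-level view. After a reset node rejoins, check whether its replicas need to be added back (for example with rabbitmq-queues grow).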
Related Errors
- Node rabbit@host is down: the precursor state where a member is unreachable and might be forgotten while offline.
- Failed to join cluster: the related join-time failures (cookie, version, already-a-member) you hit during the rejoin step.
- Mnesia network partition: a different membership problem where nodes stay clustered but diverge.
Frequently Asked Questions
Does reset lose my data on that node? It clears that node’s local Mnesia, including its queues’ local state. In a healthy cluster the authoritative metadata lives on the surviving nodes, and the node re-syncs after rejoining — but mirrored/classic queue contents that existed only on the reset node are discarded.
Why won’t RabbitMQ just auto-correct the membership? A one-sided belief could reintroduce stale or conflicting metadata. Refusing to boot forces an operator to decide which side is authoritative, which protects cluster integrity.
The cluster still lists my node — is it the same error? If the peer the node names still claims the relationship, you may instead have a different mismatch (for example a renamed or backed-up node). Verify with cluster_status on the peer named in the message before resetting.
Can I fix this from the running cluster instead of the broken node? Generally no — the stale state lives on the booting node. The clean path is reset-and-rejoin on that node, not changing the healthy cluster.
When do I need force_reset instead of reset? Use force_reset only when a normal reset fails because the node cannot contact its recorded peers. It clears local state unconditionally, so confirm the cluster is authoritative first.