Kafka Error Guide: 'Controller not available' No Active Controller

Exact Error Message

An admin operation that needs the controller fails because no broker is currently acting as one. From a client/admin tool:

org.apache.kafka.common.errors.ControllerMovedException: Controller not available

or, more commonly, as entries in server.log from a broker that cannot find the controller:

[2026-06-28 11:03:52,140] WARN [Broker id=2] Controller not available, retrying metadata request
 (org.apache.kafka.clients.NetworkClient)
[2026-06-28 11:03:52,455] ERROR [Broker id=2] Connection to controller (id 1) failed.
 Controller connection failed: java.net.ConnectException: Connection refused (kafka.server.BrokerToControllerChannelManager)
[2026-06-28 11:03:53,001] WARN [Broker id=2] No controller is currently active in the cluster
 (kafka.controller.KafkaController)

In a KRaft cluster the quorum tool shows no leader:

LeaderId:               -1
LeaderEpoch:            34
CurrentVoters:          [1,2,3]

What the Error Means

“Controller not available” means there is currently no broker holding the controller role, or the controller exists but no broker can reach it. Because the controller is the single component that performs metadata operations — topic create/delete, leader election, partition reassignment, ISR updates — its absence freezes all administrative changes. Existing produce/consume traffic on already-elected partition leaders may continue for a while, but any operation requiring metadata coordination fails or hangs.

There are two distinct flavors, and distinguishing them is the core of diagnosis.

No elected controller (the worst case): election cannot complete because the underlying coordination layer has lost quorum, meaning a majority of ZooKeeper nodes are down or a majority of KRaft controller voters are unreachable. A LeaderId of -1 in the quorum tool is the signature. This is a cluster-wide control-plane outage.
Controller unreachable: a controller is elected but a particular broker or client cannot connect to it (network/firewall/listener issue), so it reports “controller connection failed”. This is a connectivity problem from one node’s perspective.

Common Causes

ZooKeeper quorum loss (ZK mode): fewer than a majority of ensemble nodes are up, so no controller can be elected or maintained.
KRaft voter majority loss: a majority of controller-quorum voters are down or partitioned; LeaderId becomes -1.
Controller broker down with no election possible because the metadata layer cannot record the new controller.
Network partition / firewall blocking the controller’s listener port so peers report “controller connection failed”.
Wrong controller.quorum.voters / controller.listener.names configuration in KRaft, so voters cannot form a quorum.
All controller-eligible nodes restarted simultaneously, leaving no quorum during the gap.

How to Reproduce the Error

On a disposable cluster:

ZK mode: stop a majority of the ZooKeeper ensemble (e.g. 2 of 3). Brokers log “No controller is currently active” and admin ops return “Controller not available”.
KRaft mode: stop a majority of the controller voters (e.g. 2 of 3 dedicated controllers). kafka-metadata-quorum.sh ... describe --status reports LeaderId: -1.
Run kafka-topics.sh --bootstrap-server localhost:9092 --create --topic repro ...; it fails because there is no controller.
Restore the majority of ZK nodes / KRaft voters; a controller is elected and the error clears.

Do this only in a throwaway environment — it intentionally takes down the control plane.

Diagnostic Commands

All read-only.

First determine whether a controller exists at all (KRaft):

kafka-metadata-quorum.sh --bootstrap-server localhost:9092 describe --status
kafka-metadata-quorum.sh --bootstrap-server localhost:9092 describe --replication

LeaderId: -1 means no active controller — quorum problem. A valid LeaderId with brokers still erroring means a connectivity problem to that node.
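
The LeaderId rule above can be sketched as a small filter over the describe --status output. This is only an illustration: the function name classify_quorum_status is hypothetical, and it assumes the output contains a line starting with "LeaderId:" as in the sample shown earlier.

```shell
# Reads `kafka-metadata-quorum.sh ... describe --status` output on stdin
# and prints a one-line verdict. classify_quorum_status is a hypothetical name.
classify_quorum_status() {
  # Take the value after "LeaderId:" and strip the padding spaces.
  leader=$(awk -F: '/^LeaderId:/ { gsub(/[ \t]/, "", $2); print $2; exit }')
  case "$leader" in
    "") echo "unknown: no LeaderId line found" ;;
    -1) echo "no-controller: quorum problem" ;;
    *)  echo "controller-elected: id $leader" ;;
  esac
}
```

A possible invocation is kafka-metadata-quorum.sh --bootstrap-server localhost:9092 describe --status | classify_quorum_status. A "controller-elected" verdict while brokers still log errors points toward the connectivity branch rather than quorum loss.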

For ZooKeeper mode, check for the controller znode and ZK health (read-only):

zookeeper-shell.sh localhost:2181 get /controller
echo srvr | nc localhost 2181 | grep -i mode

If get /controller returns “Node does not exist”, no controller is elected.
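
The znode check can be wrapped the same way. The sketch below assumes the znode payload is JSON containing a "brokerid" field and that a missing znode produces the “Node does not exist” text quoted above; the function name controller_from_znode is hypothetical.

```shell
# Reads the output of `zookeeper-shell.sh <zk> get /controller` on stdin.
# controller_from_znode is a hypothetical helper name.
controller_from_znode() {
  out=$(cat)
  case "$out" in
    *"Node does not exist"*) echo "no-controller"; return 0 ;;
  esac
  # Assumption: the payload is JSON with a numeric "brokerid" field.
  id=$(printf '%s\n' "$out" \
    | sed -n 's/.*"brokerid":[[:space:]]*\([0-9][0-9]*\).*/\1/p' \
    | head -n 1)
  if [ -n "$id" ]; then
    echo "controller-broker: $id"
  else
    echo "unknown"
  fi
}
```

For example: zookeeper-shell.sh localhost:2181 get /controller | controller_from_znode. An "unknown" result means the output matched neither case and should be read by hand.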

Verify reachability and which brokers respond:

kafka-broker-api-versions.sh --bootstrap-server localhost:9092 | head -5

Pull the relevant log lines:

grep -E "Controller not available|No controller is currently active|Controller connection failed|controller \(id [0-9]+\) failed" \
  /var/log/kafka/server.log | tail -30
journalctl -u kafka --since "1 hour ago" | grep -iE "controller|quorum|zookeeper|connection refused"
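
To see at a glance which signature dominates, the matching lines can be counted per message. This is a rough sketch: tally_controller_errors is a hypothetical name, the four strings are taken from the sample log lines in this guide, and the input is held in memory, so it is meant for a tail of the log rather than the whole file.

```shell
# Reads broker log text on stdin and prints "count<TAB>signature" for each
# message. tally_controller_errors is a hypothetical helper name.
tally_controller_errors() {
  log=$(cat)
  for sig in \
    "Controller not available" \
    "No controller is currently active" \
    "Controller connection failed" \
    "Connection refused"
  do
    # grep -c exits non-zero when the count is 0, hence the `|| true`.
    n=$(printf '%s\n' "$log" | grep -c -F -- "$sig" || true)
    printf '%s\t%s\n' "$n" "$sig"
  done
}
```

For example: tail -n 5000 /var/log/kafka/server.log | tally_controller_errors. Many “Connection refused” lines with few “No controller is currently active” lines would lean toward a connectivity problem, though the quorum check remains the deciding test.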

Step-by-Step Resolution

Classify the failure. Run the quorum/znode check. LeaderId: -1 or a missing /controller znode = no controller (quorum loss). A valid leader but per-broker “connection failed” = connectivity problem.
If quorum is lost (ZK): restore a majority of ZooKeeper nodes. Verify with echo srvr | nc <zk> 2181 that one reports Mode: leader and the rest follower. Once quorum returns, brokers elect a controller automatically.
If quorum is lost (KRaft): bring back enough controller voters to form a majority of CurrentVoters. Confirm controller.quorum.voters is identical on every voter and that their listener ports are reachable.
If the controller is reachable by some brokers but not others: treat it as a network problem — check firewall rules and the controller listener port, and confirm the affected broker can open a TCP connection to the controller’s advertised address.
If all controller-eligible nodes were restarted together, simply wait for a majority to come back; election needs a quorum present simultaneously.
Validate: in KRaft mode, describe --status shows a non-negative LeaderId; in ZooKeeper mode, get /controller returns a broker ID. In both modes, admin operations succeed again.
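
For the ZooKeeper step, the “one leader, the rest followers” check can be sketched as a filter over the Mode lines collected from each node. The function name zk_quorum_ok and the host names in the usage note are hypothetical, and the sketch does not handle a single-node ensemble reporting Mode: standalone.

```shell
# Usage: zk_quorum_ok <ensemble_size>
# Reads one "Mode: ..." line per reachable ZooKeeper node on stdin.
# zk_quorum_ok is a hypothetical helper name.
zk_quorum_ok() {
  size=$1
  modes=$(cat)
  leaders=$(printf '%s\n' "$modes" | grep -c -i 'Mode: leader' || true)
  followers=$(printf '%s\n' "$modes" | grep -c -i 'Mode: follower' || true)
  majority=$(( size / 2 + 1 ))   # strict majority of the ensemble
  if [ "$leaders" -eq 1 ] && [ $(( leaders + followers )) -ge "$majority" ]; then
    echo "quorum-ok: 1 leader, $followers follower(s)"
  else
    echo "quorum-lost: $leaders leader(s), $followers follower(s), need $majority of $size"
  fi
}
```

Assuming hosts named zk1, zk2 and zk3, it could be fed like this: for h in zk1 zk2 zk3; do echo srvr | nc "$h" 2181 | grep -i mode; done | zk_quorum_ok 3. Nodes that are down or not serving contribute no Mode line, so they simply lower the count.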

Prevention and Best Practices

Always run an odd-sized quorum (3 or 5 ZK nodes / KRaft voters) and never let a majority go offline at once during maintenance — drain and replace one node at a time.
Keep controller.quorum.voters and controller.listener.names consistent across all KRaft voters; a mismatch silently prevents quorum.
Spread controller-eligible nodes across failure domains (racks/AZs) so a single fault cannot take out a majority.
Open and monitor the controller listener port between all brokers and voters so “controller connection failed” can’t come from a firewall change.
Alert on ActiveControllerCount (should be exactly 1 cluster-wide) and on LeaderId == -1; either is a control-plane outage.
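
The consistency rule for controller.quorum.voters lends itself to an automated check. The sketch below compares the value across copies of each voter's properties file gathered in one place. It assumes the key=value form with "=" as the separator, and the function name voters_consistent is hypothetical.

```shell
# Usage: voters_consistent voter1.properties voter2.properties ...
# Compares controller.quorum.voters across the given files.
# voters_consistent is a hypothetical helper name.
voters_consistent() {
  distinct=$(
    for f in "$@"; do
      # Last definition in the file wins; trailing whitespace is ignored.
      v=$(sed -n 's/^[[:space:]]*controller\.quorum\.voters[[:space:]]*=[[:space:]]*//p' "$f" \
        | sed 's/[[:space:]]*$//' | tail -n 1)
      printf '%s\n' "${v:-<missing>}"
    done | sort -u
  )
  count=$(printf '%s\n' "$distinct" | grep -c . || true)
  if [ "$count" -eq 1 ] && [ "$distinct" != "<missing>" ]; then
    echo "consistent: $distinct"
  else
    echo "mismatch:"
    printf '%s\n' "$distinct"
  fi
}
```

On a mismatch the distinct values are listed, with <missing> standing in for any file that does not set the property. The same pattern would work for controller.listener.names by changing the key in the sed expression.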

Related Errors

“This is not the correct controller for this cluster”: a controller does exist; the client just hit a stale one.
“Error while electing or becoming controller on broker N”: election is actively failing (often the precursor to no controller being available).
“Controller epoch X is older than Y”: a stale controller after split brain, not an absent one.
“java.net.ConnectException: Connection refused”: the transport-level cause behind “controller connection failed”.

Frequently Asked Questions

Why can producers still work while the controller is “not available”? Already-elected partition leaders keep serving produce/consume for a time. Only operations needing the controller — topic management, reassignment, new leader election — fail. A prolonged outage will eventually impact availability as leaders fail without replacement.

What does LeaderId: -1 mean? There is no elected metadata-quorum leader in KRaft — i.e., no active controller. It almost always means a majority of controller voters is unreachable.

Is this a network problem or a quorum problem? Check the quorum tool / /controller znode. If no controller exists anywhere, it’s quorum. If a controller exists but one broker can’t reach it, it’s connectivity. The fixes are completely different.

How many ZooKeeper or KRaft nodes can I lose? You need a strict majority online. For 3 nodes you can lose 1; for 5 you can lose 2. Lose the majority and you lose the controller.
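
The majority arithmetic is easy to sketch. The function name tolerable_failures is hypothetical; it only restates the strict-majority rule above for an ensemble or voter set of size N.

```shell
# Prints how many of N nodes may be offline while a strict majority remains.
# tolerable_failures is a hypothetical helper name.
tolerable_failures() {
  n=$1
  majority=$(( n / 2 + 1 ))   # smallest count that is more than half of n
  echo $(( n - majority ))
}

tolerable_failures 3   # prints 1
tolerable_failures 5   # prints 2
tolerable_failures 4   # prints 1
```

The 4-node case shows why odd sizes are preferred: a fourth node adds cost without letting the quorum survive any more failures than three nodes do.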

Can I manually elect a controller? No. Election is automatic once a quorum is present. The fix is to restore the quorum or the connectivity, not to force election.
