AI for Kafka · By James Joyner IV · 9 min read

Kafka Error Guide: 'Leader election failed' (Offline Partitions and No Leader)

Why Kafka controller and preferred-leader elections fail, how unclean.leader.election.enable=false leaves partitions leaderless, and read-only commands to diagnose it.

  • #kafka
  • #troubleshooting
  • #errors
  • #partitions

When a Kafka partition has no leader, producers and consumers for that partition stop working immediately. The controller reports “Leader election failed” and the partition shows Leader: none. This guide covers both controller-driven elections that fail after a broker loss and preferred-leader elections that fail because the preferred replica is not in sync, and explains how to diagnose them safely without triggering an election yourself.

Exact Error Message

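The controller and the state change logger record the failure in controller.log and state-change.log under the default logging configuration; some deployments route these messages to server.log instead: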

[2026-06-29 09:41:55,210] ERROR [Controller id=2] Failed to elect leader for partition orders-0: no eligible replica in ISR and unclean leader election is disabled (kafka.controller.KafkaController)
[2026-06-29 09:41:55,212] WARN [Controller id=2] Leader election failed for partition orders-0; partition is now offline (kafka.controller.KafkaController)
[2026-06-29 09:41:55,260] WARN Preferred replica 1 for partition payments-2 is not in ISR; preferred leader election failed (kafka.controller.KafkaController)
[2026-06-29 09:41:55,318] ERROR No leader for partition orders-0 (state.change.logger)

The phrases to watch for are “Failed to elect leader for partition”, “Leader election failed”, “Preferred leader election failed”, and “No leader for partition”.
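To get a quick sense of which of these phrases dominates a log, a small filter can count them. The following is a minimal sketch, not an official tool: count_election_failures is a hypothetical helper name, and it assumes the log wording matches the excerpt above, which can differ between Kafka versions.

```shell
# Hypothetical helper: reads log text on stdin and counts the four phrases
# listed in this guide. Matching is case-insensitive via tolower().
count_election_failures() {
  awk '
    {
      line = tolower($0)
      if (index(line, "failed to elect leader for partition")) failed_elect++
      # "preferred leader election failed" also contains "leader election failed",
      # so test the longer phrase first and count each line only once.
      if (index(line, "preferred leader election failed")) preferred++
      else if (index(line, "leader election failed")) election++
      if (index(line, "no leader for partition")) no_leader++
    }
    END {
      printf "failed_to_elect=%d\n", failed_elect
      printf "leader_election_failed=%d\n", election
      printf "preferred_election_failed=%d\n", preferred
      printf "no_leader=%d\n", no_leader
    }
  '
}
```

For example, count_election_failures < /var/lib/kafka/logs/controller.log prints four counters. A high preferred_election_failed count with zero no_leader entries points at the benign preferred-election case rather than an outage.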

What the Error Means

Every partition has one leader replica that handles all reads and writes; the followers replicate from it. The set of replicas that are fully caught up is the in-sync replica set (ISR). When a leader fails, the controller picks a new leader from the ISR. If no replica in the ISR is alive, the controller cannot elect a leader, and unless unclean leader election is enabled it leaves the partition offline rather than promoting an out-of-sync replica. That is the central trade-off: availability versus durability.
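The decision described above can be sketched as a small shell function. This is an illustration of the rule only, under the assumption that the controller walks the replica assignment in order; elect_leader and its arguments are hypothetical names, and it never talks to a cluster.

```shell
# Hypothetical model of the election rule, for reasoning only.
# Usage: elect_leader REPLICAS ISR LIVE UNCLEAN
#   REPLICAS, ISR, LIVE are comma-separated broker ids; UNCLEAN is true or false.
elect_leader() {
  replicas=$1 isr=$2 live=$3 unclean=$4

  # Clean election: first assigned replica that is both in the ISR and alive.
  for r in $(echo "$replicas" | tr ',' ' '); do
    case ",$isr," in *",$r,"*) ;; *) continue ;; esac
    case ",$live," in *",$r,"*) echo "leader=$r clean"; return 0 ;; esac
  done

  # Unclean election: any live replica, accepting possible data loss.
  if [ "$unclean" = "true" ]; then
    for r in $(echo "$replicas" | tr ',' ' '); do
      case ",$live," in *",$r,"*) echo "leader=$r unclean"; return 0 ;; esac
    done
  fi

  # No eligible replica: the partition stays offline.
  echo "leader=none"
  return 1
}
```

With replicas 1,2, an ISR of only broker 1, and only broker 2 alive, elect_leader 1,2 1 2 false reports leader=none, while the same call with true promotes broker 2 as an unclean leader. That difference is the availability-versus-durability trade-off in miniature.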

Preferred-leader election is different. To balance load, Kafka tries to make the first replica in the assignment (the “preferred” replica) the leader. If that preferred replica is not currently in the ISR, the preferred election is skipped and logged as failed — but this is usually benign and does not take the partition offline; another in-sync replica remains leader.

Distinguish failure from success. A successful election logs something like:

[2026-06-29 09:45:02,004] INFO [Controller id=2] Elected leader 3 for partition orders-0 (kafka.controller.KafkaController)

If you see a new leader ID and the partition has a non-empty ISR, the election worked.

Common Causes

  1. No eligible (in-sync) replica alive. The leader and every other ISR member are down. With unclean.leader.election.enable=false, the controller refuses to promote a lagging replica, so the partition stays leaderless.
  2. Preferred replica not in ISR. A preferred-leader election is requested but the preferred replica has fallen behind and left the ISR, so the election is declined.
  3. All replicas down. Every broker hosting the partition is offline; there is nothing to elect.
  4. Controller failover or metadata issue. A ZooKeeper session loss or a KRaft metadata quorum problem disrupts the controller’s view of which replicas are eligible.
  5. Partition offline due to a storage failure. A log directory holding the partition went offline, removing that replica from contention.

How to Reproduce the Error

In a test cluster:

  1. Create a topic with replication factor 2 on brokers 1 and 2, and confirm unclean.leader.election.enable=false.
  2. Stop broker 2 (a follower). The partition keeps broker 1 as leader.
  3. Stop broker 1 (the leader) while broker 2 is still down, or stop broker 1 before broker 2 has rejoined the ISR.
  4. The controller has no in-sync replica to promote. kafka-topics.sh --describe shows Leader: none, and the controller logs “Leader election failed”.

To reproduce a failed preferred election, shrink the ISR (throttle or pause a follower) and request a preferred-leader election while the preferred replica is out of sync.

Diagnostic Commands

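Compare current leaders with preferred replicas. kafka-leader-election.sh cannot be used for this: it has no --describe mode, and every invocation requires --election-type and triggers an election. Use kafka-topics.sh --describe, which is read-only; the first broker listed under Replicas is the preferred replica:

kafka-topics.sh --bootstrap-server localhost:9092 --describe --topic payments

In this guide's example, payments-2 shows Leader: 3 while its Replicas list begins with broker 1, so the preferred replica is not leading.

Do not run kafka-leader-election.sh while diagnosing. Its --election-type option (preferred or unclean) actively triggers an election, and an unclean one can cause data loss.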
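Because the preferred replica is the first broker in the Replicas list, leader and preferred-replica mismatches can be derived from read-only kafka-topics.sh --describe output. The sketch below assumes the "Label: value" layout used in this guide's sample output; preferred_mismatches is a hypothetical helper name.

```shell
# Hypothetical helper: reads `kafka-topics.sh --describe` output on stdin and
# prints partitions whose current leader is not the first (preferred) replica.
preferred_mismatches() {
  awk '
    /Partition:/ {
      topic = part = leader = replicas = ""
      for (i = 1; i < NF; i++) {
        if ($i == "Topic:")     topic    = $(i + 1)
        if ($i == "Partition:") part     = $(i + 1)
        if ($i == "Leader:")    leader   = $(i + 1)
        if ($i == "Replicas:")  replicas = $(i + 1)
      }
      split(replicas, r, ",")
      # Leaderless partitions are a different problem and are skipped here.
      if (leader != "" && leader != "none" && r[1] != "" && leader != r[1])
        print topic "-" part " leader=" leader " preferred=" r[1]
    }
  '
}
```

A typical invocation would be kafka-topics.sh --bootstrap-server localhost:9092 --describe | preferred_mismatches. Nothing in this pipeline changes cluster state.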

Find unavailable partitions directly:

kafka-topics.sh --bootstrap-server localhost:9092 \
  --describe --unavailable-partitions
Topic: orders  Partition: 0  Leader: none  Replicas: 1,2  Isr:

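An empty Isr with Leader: none confirms there is no eligible replica; an Isr that lists only brokers that are down means the same thing. Describe the topic in full to see the replica assignment, then check which of those brokers are alive: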

kafka-topics.sh --bootstrap-server localhost:9092 --describe --topic orders
Topic: orders  Partition: 0  Leader: none  Replicas: 1,2  Isr:
Topic: orders  Partition: 1  Leader: 3     Replicas: 3,1  Isr: 3
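To turn that output into a list of brokers worth checking, the describe lines can be filtered for leaderless partitions. This is a rough sketch that assumes the output layout shown above; leaderless_replicas is a hypothetical helper name, and the -1 check is an assumption covering older tool versions that print a numeric id instead of none.

```shell
# Hypothetical helper: reads `kafka-topics.sh --describe` output on stdin and,
# for each partition without a leader, prints the brokers that host a replica.
leaderless_replicas() {
  awk '
    /Partition:/ {
      topic = part = leader = replicas = ""
      for (i = 1; i < NF; i++) {
        if ($i == "Topic:")     topic    = $(i + 1)
        if ($i == "Partition:") part     = $(i + 1)
        if ($i == "Leader:")    leader   = $(i + 1)
        if ($i == "Replicas:")  replicas = $(i + 1)
      }
      if (leader == "none" || leader == "-1")
        print topic "-" part " brokers_to_check=" replicas
    }
  '
}
```

For the sample output above this prints orders-0 brokers_to_check=1,2, which is the set of brokers whose health decides whether the partition can recover cleanly.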

Inspect controller health and the state change logger:

journalctl -u kafka --since "30 min ago" | grep -iE "leader election|no leader|controller"
grep -iE "Leader election failed|No leader|not in ISR" \
  /var/lib/kafka/logs/state-change.log /var/lib/kafka/logs/controller.log

Confirm on each broker host that the Kafka listener is actually up:

ss -ltnp | grep 9092

Step-by-Step Resolution

Take the case where orders-0 shows Leader: none with an empty ISR after both replicas went down.

  1. Confirm the scope with --describe --unavailable-partitions. An empty ISR means no in-sync replica can be elected.
  2. Identify the replicas from the topic description: replicas are brokers 1 and 2, and both are down.
  3. Bring an in-sync replica back online. Restart whichever broker was the most recent ISR member — ideally the last known leader, broker 1. As soon as it starts and loads its log, the controller elects it leader automatically and the partition recovers with no data loss. Re-run --describe to confirm a real leader ID appears and the ISR is non-empty.
  4. If the last leader is unrecoverable, you face the unclean leader election trade-off. Enabling unclean election promotes a lagging replica and restores availability but discards any records that replica never received. This is a durability-versus-availability decision and must be made deliberately; it is configured in server.properties rather than shown as a command here:
# server.properties (durability default; weigh before changing)
unclean.leader.election.enable=false
  5. For a failed preferred-leader election, no outage exists: another in-sync replica is still leading. Wait for the preferred replica to rejoin the ISR (check with --describe), after which a future preferred election will succeed and rebalance leadership.
  6. Check controller health if elections fail cluster-wide rather than for one partition. A ZooKeeper session loss or KRaft quorum issue requires restoring the metadata quorum before any election can proceed; the controller logs will name the failing dependency.
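Before weighing the unclean election trade-off, it can help to confirm what a broker's properties file actually sets. The sketch below is a read-only check; unclean_setting is a hypothetical helper name, and it assumes simple key=value lines. It does not see topic-level overrides or dynamic broker configs, which can take precedence over the file.

```shell
# Hypothetical helper: prints the value of unclean.leader.election.enable from
# a properties file, or notes that the broker default (false) applies.
unclean_setting() {
  props_file=$1
  # Take the last uncommented assignment; lines starting with "#" do not match.
  value=$(sed -n 's/^[[:space:]]*unclean\.leader\.election\.enable[[:space:]]*=[[:space:]]*//p' "$props_file" \
    | tail -n 1 | tr -d '[:space:]')
  if [ -z "$value" ]; then
    echo "false (broker default, not set in file)"
  else
    echo "$value"
  fi
}
```

Call it with the path to your broker's server.properties. For topic-level overrides, kafka-configs.sh --bootstrap-server localhost:9092 --entity-type topics --entity-name orders --describe is the read-only counterpart.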

When these incidents recur, the incident response dashboard helps correlate the controller log, ISR state, and broker availability in one place.

Prevention and Best Practices

Use a replication factor of at least 3 so a single broker loss still leaves two replicas that can stay in sync, and set min.insync.replicas=2 so acks=all writes are acknowledged only once at least two replicas have them. Spread replicas across racks or availability zones. Keep unclean.leader.election.enable=false for durability-critical topics and accept that recovery requires bringing an in-sync replica back. Monitor under-replicated and offline partition counts so a shrinking ISR is caught before the last in-sync replica disappears. Run preferred-leader elections during quiet periods and only after confirming preferred replicas are in the ISR. Keep the controller and its metadata quorum (ZooKeeper or KRaft) healthy and monitored.
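One way to watch for a shrinking ISR without extra tooling is to compare ISR size against a threshold in kafka-topics.sh --describe output. This is a minimal sketch under the same output-layout assumption as the samples in this guide; isr_below is a hypothetical helper name.

```shell
# Hypothetical helper: reads `kafka-topics.sh --describe` output on stdin and
# prints partitions whose ISR has fewer members than the given threshold.
# Usage: isr_below MIN
isr_below() {
  awk -v min="$1" '
    /Partition:/ {
      topic = part = isr = ""
      for (i = 1; i <= NF; i++) {
        if ($i == "Topic:")     topic = $(i + 1)
        if ($i == "Partition:") part  = $(i + 1)
        if ($i == "Isr:")       isr   = $(i + 1)
      }
      # An empty ISR may be followed directly by another label; treat as empty.
      if (isr ~ /:$/) isr = ""
      n = (isr == "") ? 0 : split(isr, members, ",")
      if (n < min) print topic "-" part " isr_size=" n
    }
  '
}
```

Running kafka-topics.sh --bootstrap-server localhost:9092 --describe | isr_below 2 from a scheduled job would flag partitions that are one broker loss away from going offline.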

Related Errors

  • Partition reassignment for topic-0 failed: stuck or failed replica moves that can shrink the ISR.
  • NotEnoughReplicasException: acks=all producers fail when the ISR drops below min.insync.replicas.
  • NotLeaderForPartitionException: clients hold a stale leader and must refresh metadata after an election.
  • KafkaStorageException: an offline log directory removes a replica from leader eligibility.


Frequently Asked Questions

Q: Why does my partition show “Leader: none”? Every in-sync replica for that partition is down and unclean.leader.election.enable=false, so the controller refuses to promote a lagging replica. Confirm with --describe --unavailable-partitions; an empty ISR with no leader is the signature.

Q: Is “Preferred leader election failed” an outage? Usually not. It only means the preferred replica is not in the ISR, so leadership was not rebalanced to it. Another in-sync replica is still serving the partition. Wait for the preferred replica to rejoin the ISR.

Q: Should I enable unclean leader election to recover? Only as a last resort. It restores availability by promoting an out-of-sync replica, but any records that replica never received are lost permanently. Prefer bringing the last in-sync replica back online whenever possible.

Q: Can kafka-leader-election.sh inspect leaders without causing an election? No. The tool has no read-only --describe mode; every invocation requires --election-type (preferred or unclean) and triggers an election, and an unclean outcome can cause data loss. Use kafka-topics.sh --describe for read-only inspection.
