DevOps AI ToolKit
AI for Kafka · By James Joyner IV · 9 min read

Kafka Error Guide: 'Error while fetching metadata' LEADER_NOT_AVAILABLE

Fix Kafka's 'Error while fetching metadata ... LEADER_NOT_AVAILABLE' and UNKNOWN_TOPIC_OR_PARTITION client warnings: causes, diagnostics, and resolution.

  • #kafka
  • #troubleshooting
  • #errors
  • #partitions

If you have ever started a Kafka producer or consumer and watched it spew a wall of metadata warnings before settling down, you have met LEADER_NOT_AVAILABLE. Most of the time it is harmless noise. Sometimes it is the first symptom of a topic that does not exist, a broken listener configuration, or a partition stuck offline. This guide shows how to tell the difference quickly.

Exact Error Message

The warning is emitted by the client networking layer, not the broker, so you will see it in your application logs:

2026-06-29 09:14:22,481 WARN  [kafka-producer-network-thread | producer-1] org.apache.kafka.clients.NetworkClient - [Producer clientId=producer-1] Error while fetching metadata with correlation id 7 : {my-topic=LEADER_NOT_AVAILABLE}
2026-06-29 09:14:22,602 WARN  [kafka-producer-network-thread | producer-1] org.apache.kafka.clients.NetworkClient - [Producer clientId=producer-1] Error while fetching metadata with correlation id 9 : {my-topic=LEADER_NOT_AVAILABLE}
2026-06-29 09:14:23,118 WARN  [kafka-producer-network-thread | producer-1] org.apache.kafka.clients.NetworkClient - [Producer clientId=producer-1] Error while fetching metadata with correlation id 12 : {my-topic=UNKNOWN_TOPIC_OR_PARTITION}

You may see LEADER_NOT_AVAILABLE and UNKNOWN_TOPIC_OR_PARTITION interleaved for the same topic, which is normal during the first few seconds of a topic’s life.

What the Error Means

When a client wants to produce to or consume from a topic, it first asks any reachable broker (a bootstrap server) for that topic’s metadata: how many partitions exist and which broker is the leader of each. The leader handles all writes and, by default, all reads, so without a known leader the client cannot proceed.

LEADER_NOT_AVAILABLE means the topic and partition are known to the cluster, but no leader is currently assigned or reachable for at least one partition. UNKNOWN_TOPIC_OR_PARTITION means the broker that answered has no record of the topic at all. Both are retriable: the client keeps re-requesting metadata (note the incrementing correlation id) until it gets a clean answer. The key question is whether the warnings clear within a second or two (transient) or persist (a real problem).
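To judge whether the warnings are transient or persistent, it can help to condense a client log into one line per topic and error code. The following is a minimal sketch, not an official tool: `summarize_metadata_warnings` is a hypothetical helper name, and it assumes the log layout shown in the sample above (date in the first field, time in the second, the error map as the last `{...}` on the line).

```shell
# Summarize "Error while fetching metadata" warnings from a client log on stdin.
# Prints: <topic> <error> <count> <first-seen-time> <last-seen-time>
# Assumption: field 2 of each log line is the time, as in the sample log above.
summarize_metadata_warnings() {
  awk '
    /Error while fetching metadata/ {
      # The error map follows ": {" and looks like {topic=ERROR, topic2=ERROR}
      start = index($0, ": {")
      if (start == 0) next
      payload = substr($0, start + 3)
      sub(/[}].*$/, "", payload)
      n = split(payload, pairs, /, */)
      for (i = 1; i <= n; i++) {
        eq = index(pairs[i], "=")
        if (eq == 0) continue
        key = substr(pairs[i], 1, eq - 1) " " substr(pairs[i], eq + 1)
        if (!(key in count)) { first[key] = $2; order[++seen] = key }
        count[key]++
        last[key] = $2
      }
    }
    END {
      for (j = 1; j <= seen; j++) {
        k = order[j]
        print k, count[k], first[k], last[k]
      }
    }
  '
}

# Usage (app.log is a placeholder path):
#   summarize_metadata_warnings < app.log
```

A short gap between the first and last timestamps points to the transient case; a gap that keeps growing while the application runs suggests a real problem.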

Common Causes

  1. Topic just created. Leader election for new partitions takes a moment. The client races ahead and gets LEADER_NOT_AVAILABLE until the controller assigns leaders. This clears on its own.
  2. Auto-creation on first produce. With auto.create.topics.enable=true, producing to a non-existent topic triggers creation. The first few metadata fetches return UNKNOWN_TOPIC_OR_PARTITION, then LEADER_NOT_AVAILABLE, then succeed.
  3. Topic does not exist / typo. If auto-create is disabled and the name is wrong, you get a permanent UNKNOWN_TOPIC_OR_PARTITION loop that never resolves.
  4. Partition has no leader (offline). All replicas of a partition are down or out of the ISR, so no leader can be elected. This produces a persistent LEADER_NOT_AVAILABLE.
  5. Controller unavailable. With no active controller, leader elections cannot happen and metadata stays stale.
  6. Bootstrap broker mid-restart. The client connected to a broker that is still loading log segments and has not yet rejoined as a leader.
  7. advertised.listeners misconfiguration. The client reaches the bootstrap server fine, but the leader’s advertised address (a Docker-internal hostname, a private IP, the wrong port) is unreachable from the client. The metadata says “leader is broker-2 at kafka-2:9092” and the client cannot resolve or connect to that.

How to Reproduce the Error

The cleanest reproduction is the misnamed-topic case. Point a console producer at a topic that does not exist with auto-create off:

kafka-console-producer.sh \
  --bootstrap-server localhost:9092 \
  --topic this-topic-does-not-exist

You will see a steady stream of UNKNOWN_TOPIC_OR_PARTITION metadata warnings. For the transient flavor, create a topic and immediately start producing to it in a tight loop; the first metadata fetch usually returns LEADER_NOT_AVAILABLE before the leaders settle.
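The transient flavor can be scripted by creating a topic and producing to it with no pause in between. This is a rough sketch under assumptions: `repro_transient` is a hypothetical helper, the `KAFKA_TOPICS` and `KAFKA_PRODUCER` variables exist only so the tool paths can be overridden, and a replication factor of 1 is used so it runs on a single-broker development cluster. Whether the warning actually appears depends on timing.

```shell
# Paths to the Kafka CLI tools; override these if they are not on PATH.
KAFKA_TOPICS="${KAFKA_TOPICS:-kafka-topics.sh}"
KAFKA_PRODUCER="${KAFKA_PRODUCER:-kafka-console-producer.sh}"

# Create a topic and immediately produce one record to it, so the producer's
# first metadata fetch races the leader election for the new partitions.
# Arguments: <topic> [bootstrap-server]
repro_transient() {
  topic=$1
  bootstrap=${2:-localhost:9092}
  "$KAFKA_TOPICS" --bootstrap-server "$bootstrap" \
    --create --topic "$topic" --partitions 3 --replication-factor 1 &&
    echo "probe" | "$KAFKA_PRODUCER" --bootstrap-server "$bootstrap" --topic "$topic"
}

# Usage (use a topic name that does not exist yet):
#   repro_transient leader-race-demo localhost:9092
```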

Diagnostic Commands

Start by confirming the topic exists and has leaders. This is read-only and tells you almost everything:

kafka-topics.sh --bootstrap-server localhost:9092 \
  --describe --topic my-topic

Healthy output looks like this, with a real broker id in the Leader column:

Topic: my-topic  TopicId: kKp9...  PartitionCount: 3  ReplicationFactor: 3
  Topic: my-topic  Partition: 0  Leader: 1  Replicas: 1,2,3  Isr: 1,2,3
  Topic: my-topic  Partition: 1  Leader: 2  Replicas: 2,3,1  Isr: 2,3,1
  Topic: my-topic  Partition: 2  Leader: 3  Replicas: 3,1,2  Isr: 3,1,2
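For a topic with many partitions, a small filter can pick out the rows whose Leader value is not a broker id. This is a sketch that assumes the column labels shown in the output above; `leaderless_partitions` is a hypothetical helper name, not part of the Kafka distribution.

```shell
# Read `kafka-topics.sh --describe` output on stdin and print "<topic> <partition>"
# for every partition row whose Leader value is not a non-negative broker id
# (for example "none" or "-1").
leaderless_partitions() {
  awk '
    {
      topic = ""; part = ""; leader = ""
      for (i = 1; i < NF; i++) {
        if ($i == "Topic:")     topic  = $(i + 1)
        if ($i == "Partition:") part   = $(i + 1)
        if ($i == "Leader:")    leader = $(i + 1)
      }
      # The per-topic summary row has no Partition field and is skipped.
      if (part != "" && leader !~ /^[0-9]+$/) print topic, part
    }
  '
}

# Usage:
#   kafka-topics.sh --bootstrap-server localhost:9092 --describe --topic my-topic \
#     | leaderless_partitions
```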

A Leader: none (or Leader: -1) is your smoking gun. Find every partition with no leader:

kafka-topics.sh --bootstrap-server localhost:9092 \
  --describe --unavailable-partitions

If that command returns rows, you have offline partitions and the warning is real, not transient. The Replicas column shows which brokers should hold each affected partition. Check which brokers are currently answering, then scan the broker logs for shutdowns and controller changes:

kafka-broker-api-versions.sh --bootstrap-server localhost:9092 | grep "id:"
journalctl -u kafka --since "10 min ago" | grep -iE "shutdown|controller|listener"

When the topic describes cleanly but clients still fail, suspect the listeners. Check what each broker advertises:

grep -E "^(advertised\.)?listeners" /opt/kafka/config/server.properties
listeners=PLAINTEXT://0.0.0.0:9092
advertised.listeners=PLAINTEXT://kafka-2.internal:9092

If kafka-2.internal does not resolve from your client host, that is the bug. Confirm name resolution and TCP reachability from the client host, not from the broker:

getent hosts kafka-2.internal
nc -vz kafka-2.internal 9092
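When a broker advertises several listeners, the same check can be run against every advertised endpoint. The sketch below is illustrative only: both function names are hypothetical, it expects a copy of the broker's server.properties (or just its advertised.listeners line) on the client host, and it assumes an `nc` build that supports `-z` and `-w`.

```shell
# Read a server.properties file on stdin and print one "<host> <port>" pair
# per endpoint in its advertised.listeners line.
advertised_endpoints() {
  sed -n 's/^advertised\.listeners=//p' |
    tr ',' '\n' |
    sed -e 's|^[^:]*://||' -e 's/^\(.*\):\([0-9][0-9]*\)$/\1 \2/'
}

# Attempt a TCP connection to each advertised endpoint.
# Run this on the client host. Argument: path to a copy of server.properties.
check_advertised_endpoints() {
  advertised_endpoints < "$1" | while read -r host port; do
    if nc -z -w 3 "$host" "$port" < /dev/null 2> /dev/null; then
      echo "reachable: $host:$port"
    else
      echo "UNREACHABLE: $host:$port"
    fi
  done
}

# Usage:
#   check_advertised_endpoints ./server.properties
```

An endpoint that is unreachable from the client but reachable from the broker host is the typical signature of a listener that advertises an internal-only address.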

Step-by-Step Resolution

A worked example. A team reports producers logging LEADER_NOT_AVAILABLE for orders that never clears after 30 seconds.

  1. Describe the topic. kafka-topics.sh --describe --topic orders shows Partition: 4 Leader: none. So this is not transient; one partition genuinely has no leader.
  2. List unavailable partitions. --unavailable-partitions confirms only partition 4 is affected. Its replicas are 5,6 and its ISR is empty.
  3. Check the brokers. journalctl -u kafka on brokers 5 and 6 shows both went down during a rack power event. With no in-sync replica, no leader can be elected.
  4. Restore a broker. Bring broker 5 back. Once it finishes log recovery it rejoins the ISR, the controller elects it leader for partition 4, and the metadata warnings stop on the next client refresh.
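While a broker is recovering, it can be useful to watch how far each partition's ISR is from its full replica set. The filter below is a sketch over the describe output format shown earlier; `isr_shortfall` is a hypothetical helper name. The CLI's own `--under-replicated-partitions` flag lists the same rows without the counts.

```shell
# Read `kafka-topics.sh --describe` output on stdin and print
# "<topic> <partition> isr=<in-sync>/<assigned>" for every partition whose
# ISR is smaller than its replica list.
isr_shortfall() {
  awk '
    {
      topic = ""; part = ""; replicas = ""; isr = ""
      for (i = 1; i <= NF; i++) {
        if ($i == "Topic:")     topic    = $(i + 1)
        if ($i == "Partition:") part     = $(i + 1)
        if ($i == "Replicas:")  replicas = $(i + 1)
        # An empty ISR leaves nothing after the label, so guard against
        # picking up a following label (a token ending in ":").
        if ($i == "Isr:" && i < NF && $(i + 1) !~ /:$/) isr = $(i + 1)
      }
      if (part == "") next
      m = split(replicas, r, ",")
      n = (isr == "") ? 0 : split(isr, s, ",")
      if (n < m) print topic, part, "isr=" n "/" m
    }
  '
}

# Usage:
#   kafka-topics.sh --bootstrap-server localhost:9092 --describe --topic orders \
#     | isr_shortfall
```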

If instead the describe output is perfectly healthy but clients still warn, the problem is the listener path:

  1. grep advertised.listeners server.properties reveals a hostname only resolvable inside the Docker network.
  2. Fix advertised.listeners to an address the client can reach (a routable hostname or external IP) and restart the broker. The client now connects to the advertised leader and proceeds.

For the most common case (a brand-new topic or auto-create), no action is needed: a short retry window absorbs the warnings. If the warnings come from a typo, correct the topic name in your client config.
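In scripts that create a topic and then use it right away (a CI job, for instance), the retry window can be made explicit by polling until every partition reports a leader. This is a minimal sketch under assumptions: `all_leaders_assigned` and `wait_for_leaders` are hypothetical helpers, `KAFKA_TOPICS` is only an override for the tool path, and the one-second interval is an arbitrary choice.

```shell
# Path to kafka-topics.sh; override this if it is not on PATH.
KAFKA_TOPICS="${KAFKA_TOPICS:-kafka-topics.sh}"

# Succeeds only if the describe output on stdin contains at least one
# partition row and every row names a numeric broker id as leader.
all_leaders_assigned() {
  awk '
    {
      for (i = 1; i < NF; i++)
        if ($i == "Leader:") { rows++; if ($(i + 1) !~ /^[0-9]+$/) bad++ }
    }
    END { exit !(rows > 0 && bad == 0) }
  '
}

# Poll once per second until all partitions of a topic have a leader.
# Arguments: <topic> <bootstrap-server> [attempts, default 10]
# Returns 0 when leaders are assigned, 1 if the attempts run out.
wait_for_leaders() {
  topic=$1
  bootstrap=$2
  attempts=${3:-10}
  while [ "$attempts" -gt 0 ]; do
    if "$KAFKA_TOPICS" --bootstrap-server "$bootstrap" --describe --topic "$topic" 2> /dev/null |
      all_leaders_assigned; then
      return 0
    fi
    attempts=$((attempts - 1))
    sleep 1
  done
  return 1
}

# Usage:
#   wait_for_leaders my-topic localhost:9092 15 || echo "leaders still missing"
```

If the loop runs out of attempts, the warning is no longer in transient territory and the diagnostics above apply.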

Prevention and Best Practices

  • Pre-create topics with an explicit partition count and replication factor instead of relying on auto.create.topics.enable; consider disabling auto-create in production to surface typos immediately.
  • Set advertised.listeners to addresses your clients actually use, and test resolution from the client network, not just the broker host.
  • Run with replication factor 3 so a single broker loss never strands a partition without a leader, and pair it with min.insync.replicas=2 and acks=all so acknowledged writes survive that loss.
  • Tune client retry.backoff.ms and metadata.max.age.ms so transient warnings stay brief rather than flooding logs.
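  • Alarm on the controller's OfflinePartitionsCount metric (kafka.controller:type=KafkaController,name=OfflinePartitionsCount); anything above 0 means at least one partition currently has no leader, so its clients will see a persistent LEADER_NOT_AVAILABLE. You can wire this into an automated runbook from the incident response dashboard.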

Related Errors

  • UNKNOWN_TOPIC_OR_PARTITION - the sibling warning; permanent loop usually means a missing topic or disabled auto-create.
  • NOT_LEADER_OR_FOLLOWER - the client cached a stale leader and contacted the wrong broker; resolves after a metadata refresh.
  • TimeoutException: Topic not present in metadata after 60000 ms - the producer-side timeout that fires when these metadata fetches never succeed.
  • NETWORK_EXCEPTION / connection refused - a lower-level failure to reach the bootstrap server at all.

Frequently Asked Questions

Q: Is LEADER_NOT_AVAILABLE an error I need to act on? Usually not. It is logged at WARN and is retriable. If it disappears within a couple of seconds (common right after topic creation), ignore it. Act only when it persists or coincides with --unavailable-partitions returning rows.

Q: Why do I see UNKNOWN_TOPIC_OR_PARTITION even though my topic exists? The broker that answered your metadata request may not have caught up to the latest cluster state, or you connected during a rolling restart. If it persists, verify the exact topic name and that auto-create is configured the way you expect.

Q: I can reach the bootstrap server but still get LEADER_NOT_AVAILABLE forever. Why? This is the classic advertised.listeners trap. Your client reads metadata fine but cannot connect to the address the leader advertises. Fix the advertised listener to a routable address and restart the broker.

Q: Where do I look first for cluster-wide health? Run kafka-topics.sh --describe --unavailable-partitions and check the OfflinePartitionsCount and active controller metrics. For more Kafka troubleshooting walkthroughs see the Kafka category.
