AI for Kafka · By James Joyner IV · 9 min read

Kafka Error Guide: 'Failed to update metadata after 60000 ms' Client Timeout

Fix Kafka 'TimeoutException: Failed to update metadata after 60000 ms': resolve bad bootstrap.servers, broken advertised.listeners, ACL denials, and unreachable brokers.

  • #kafka
  • #troubleshooting
  • #errors
  • #client

Exact Error Message

A producer or consumer that cannot obtain cluster metadata fails like this:

org.apache.kafka.common.errors.TimeoutException: Failed to update metadata after 60000 ms.
        at org.apache.kafka.clients.producer.internals.ProducerMetadata.awaitUpdate(ProducerMetadata.java:120)
        at org.apache.kafka.clients.producer.KafkaProducer.waitOnMetadata(KafkaProducer.java:1066)
        at org.apache.kafka.clients.producer.KafkaProducer.doSend(KafkaProducer.java:946)
        at org.apache.kafka.clients.producer.KafkaProducer.send(KafkaProducer.java:911)

You will also see warnings leading up to it in the client log:

[Producer clientId=producer-1] Connection to node -1 (kafka-broker:9092) could not be established. Broker may not be available.
[Producer clientId=producer-1] Bootstrap broker kafka-broker:9092 (id: -1 rack: null) disconnected

Common phrasings include Failed to update metadata, Metadata update failed, and Topic <name> not present in metadata after 60000 ms.

What the Error Means

Before a client can produce to or consume from a topic, it must fetch metadata: the list of brokers, which broker leads each partition, and the advertised addresses to connect to. The client first contacts a bootstrap.servers entry, asks for metadata, and then connects to the broker that leads each partition it needs. Failed to update metadata after 60000 ms means the client could not complete that exchange within max.block.ms (a producer setting, default 60000 ms).

Critically, the bootstrap connection and the post-metadata connections use different addresses. The client reaches bootstrap via bootstrap.servers, but it then connects to brokers using the addresses the brokers advertise (advertised.listeners). So this timeout has two distinct failure shapes: the client never reached any bootstrap broker, or it reached bootstrap fine but the advertised addresses it got back are unreachable.
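The two failure shapes can often be told apart from the client log alone. The sketch below assumes the warning format shown earlier and relies on the Kafka client convention that bootstrap connections carry negative node IDs (-1, -2, ...) while brokers learned from metadata carry their real, non-negative IDs. The function name classify_connection_failures is hypothetical.

```shell
# Hypothetical helper: reads client log lines on stdin and counts failed
# connections by kind. Negative node IDs are bootstrap connections;
# non-negative IDs are brokers reached through their advertised addresses.
classify_connection_failures() {
  awk '
    /Connection to node -[0-9]+ .*could not be established/ { bootstrap++; next }
    /Connection to node [0-9]+ .*could not be established/  { advertised++ }
    END { printf "bootstrap=%d advertised=%d\n", bootstrap, advertised }
  '
}
```

Feeding it an application log, for example classify_connection_failures < app.log (app.log being a placeholder path), gives a quick hint: only bootstrap failures suggest the client never reached the cluster, while failures against non-negative node IDs suggest the advertised addresses are the problem.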

Common Causes

  • Wrong or unreachable bootstrap.servers: Bad hostname, wrong port, stale DNS, or no broker listening — the client never gets metadata at all.
  • Broken advertised.listeners: Brokers advertise an internal hostname, container name, or localhost that the client cannot resolve or route to, so bootstrap succeeds but the follow-on connection fails.
  • Listener/protocol mismatch: The client speaks PLAINTEXT to an SSL/SASL_SSL listener (or vice versa), so connections drop during the handshake.
  • ACL denial / authorization: With an authorizer enabled, the principal lacks DESCRIBE on the topic or DESCRIBE/IDEMPOTENT_WRITE cluster permissions, so metadata for the topic never resolves and the client times out.
  • Topic does not exist: With auto-create disabled, the requested topic is absent, producing the Topic ... not present in metadata variant.
  • Network/firewall block: A security group or firewall drops traffic to the broker port, turning every metadata attempt into a hang.

How to Reproduce the Error

Configure a broker’s advertised listener with an address only the broker can resolve, then connect from a remote client:

# server.properties — broker advertises a name the client cannot resolve
listeners=PLAINTEXT://0.0.0.0:9092
advertised.listeners=PLAINTEXT://kafka-internal:9092

A client configured with bootstrap.servers=kafka-public:9092 reaches the bootstrap broker, receives kafka-internal:9092 as the broker address, cannot resolve it, and times out after 60 s with Failed to update metadata. Disabling auto.create.topics.enable and producing to a nonexistent topic reproduces the Topic ... not present variant.

Diagnostic Commands

Confirm on the broker host that a broker is actually listening on the bootstrap port (ss lists only local sockets, so it says nothing when run on the client host):

ss -ltnp | grep -E ':9092|:9093'
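Because ss reports only sockets on the machine where it runs, a separate check from the client host helps confirm the network path. The following is a minimal sketch that assumes bash and coreutils timeout are available on the client; can_connect is a hypothetical helper name.

```shell
# Hypothetical helper: succeeds if a TCP connection to host:port opens
# within the given number of seconds (default 5).
# Uses bash's /dev/tcp pseudo-device, so bash must be installed.
can_connect() {
  local host="$1" port="$2" secs="${3:-5}"
  timeout "$secs" bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
}
```

From the client host, can_connect kafka-broker 9092 && echo reachable would show whether the port is open. A successful TCP connect only proves the port is reachable; it does not prove the listener speaks the security protocol the client expects.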

Check what addresses the brokers advertise — this is the most common culprit:

grep -E "^(listeners|advertised.listeners|listener.security.protocol.map)=" /etc/kafka/server.properties
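To turn the advertised.listeners value into something that can be checked endpoint by endpoint, the line can be split into host and port pairs. This is a sketch under the assumption that the property sits on a single line in NAME://host:port[,NAME://host:port] form; advertised_endpoints is a hypothetical helper name.

```shell
# Hypothetical helper: prints "host port" for every entry in the
# advertised.listeners line of the given server.properties file.
advertised_endpoints() {
  grep -E '^advertised\.listeners=' "$1" | tail -n 1 | cut -d= -f2- |
    tr ',' '\n' |
    sed -nE 's#^[^:]+://(.*):([0-9]+)$#\1 \2#p'
}
```

Running advertised_endpoints /etc/kafka/server.properties on the broker lists exactly the addresses clients will be told to use, which can then be tested from the client side.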

Verify the client can complete an API handshake against the bootstrap endpoint (this itself performs a metadata-style exchange):

kafka-broker-api-versions.sh --bootstrap-server kafka-broker:9092 | head -20

Confirm the topic exists and has leaders assigned (a topic with no leader yields incomplete metadata):

kafka-topics.sh --bootstrap-server kafka-broker:9092 --describe --topic orders
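The describe output can be filtered for partitions without a leader. The sketch below assumes the usual tab-separated layout with Topic:, Partition:, and Leader: fields, and that a missing leader is printed as none (or -1 on some older versions); both are assumptions to verify against your Kafka version. leaderless_partitions is a hypothetical helper name.

```shell
# Hypothetical helper: reads `kafka-topics.sh --describe` output on stdin
# and prints "topic-partition" for every partition with no leader.
leaderless_partitions() {
  awk '{
    topic = ""; part = ""; leader = ""
    for (i = 1; i < NF; i++) {
      if ($i == "Topic:")     topic  = $(i + 1)
      if ($i == "Partition:") part   = $(i + 1)
      if ($i == "Leader:")    leader = $(i + 1)
    }
    # Summary lines carry PartitionCount: but no Partition: field, so they are skipped.
    if (part != "" && (leader == "none" || leader == "-1")) print topic "-" part
  }'
}
```

Piping the describe command into leaderless_partitions prints nothing when every partition has a leader. kafka-topics.sh also offers an --unavailable-partitions option on describe that serves a similar purpose.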

Resolve the advertised hostname from the client host to rule out DNS issues:

getent hosts kafka-internal
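When several brokers advertise different names, the same lookup can be run over all of them at once. A minimal sketch, assuming getent is available on the client host; check_resolution is a hypothetical helper name.

```shell
# Hypothetical helper: reports, for each hostname argument, whether this
# host can resolve it. Returns non-zero if any name fails to resolve.
check_resolution() {
  local host rc=0
  for host in "$@"; do
    if getent hosts "$host" >/dev/null 2>&1; then
      echo "ok   $host"
    else
      echo "FAIL $host"
      rc=1
    fi
  done
  return "$rc"
}
```

For example, check_resolution kafka-internal kafka-public run on the client host shows at a glance which advertised names are unusable from there.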

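Look for authorization denials on the broker if an authorizer is enabled (depending on the broker's log4j configuration, these may be written to a separate authorizer log such as kafka-authorizer.log rather than server.log, so adjust the path if needed):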

grep -nE "Denied|authorization|Principal .* DESCRIBE" /var/log/kafka/server.log | tail -20
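Raw denial lines are easier to act on when grouped by principal, operation, and resource. The sketch below assumes the authorizer writes lines of the form "Principal = User:alice is Denied Operation = Describe from host = ... on resource = Topic:LITERAL:orders ..."; that format is an assumption based on typical authorizer output and may differ between Kafka versions. denied_summary and the principal User:alice are hypothetical.

```shell
# Hypothetical helper: reads broker/authorizer log lines on stdin and prints
# a count per (principal, operation, resource) for denied requests.
# Assumed line shape:
#   Principal = <principal> is Denied Operation = <op> ... on resource = <resource> ...
denied_summary() {
  sed -nE 's/.*Principal = ([^ ]+) is Denied Operation = ([^ ]+) .* on resource = ([^ ]+).*/\1 \2 \3/p' |
    sort | uniq -c | sort -rn
}
```

A line such as "2 User:alice Describe Topic:LITERAL:orders" in the summary would point at a missing DESCRIBE grant for that principal on that topic.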

Step-by-Step Resolution

  1. Determine which half failed. If kafka-broker-api-versions.sh against your bootstrap endpoint also hangs, the problem is reaching bootstrap (DNS/port/firewall). If it succeeds but your app still times out, the problem is the advertised addresses or ACLs.
  2. Fix bootstrap reachability. Correct bootstrap.servers host/port, fix DNS, and open the broker port from the client subnet so the initial connection succeeds.
  3. Fix advertised listeners. Ensure advertised.listeners publishes addresses the clients can actually resolve and route to — not container names or localhost. For dual internal/external access, define multiple listeners with a correct listener.security.protocol.map.
  4. Align the security protocol. Make the client’s security.protocol match the listener it targets (PLAINTEXT, SSL, SASL_SSL, etc.).
  5. Resolve ACL denials. If the broker log shows Denied, grant the principal DESCRIBE on the topic (and WRITE/READ as needed) so metadata resolves.
  6. Confirm the topic exists with kafka-topics.sh --describe; create it or enable auto-create if appropriate.
  7. Retry and confirm the client obtains metadata and connects to partition leaders.
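For step 3, a dual-listener broker configuration might look like the fragment below. It is a sketch, not a drop-in config: the listener names INTERNAL and EXTERNAL, port 9093, and the reuse of the kafka-internal and kafka-public hostnames are assumptions for illustration, and the keystore settings an SSL listener needs are omitted.

```properties
# server.properties: one listener per audience (names and ports are illustrative)
listeners=INTERNAL://0.0.0.0:9092,EXTERNAL://0.0.0.0:9093
# Each audience is handed an address it can actually resolve and route to
advertised.listeners=INTERNAL://kafka-internal:9092,EXTERNAL://kafka-public:9093
# Map each listener name to its security protocol
listener.security.protocol.map=INTERNAL:PLAINTEXT,EXTERNAL:SSL
# Brokers talk to each other over the internal listener
inter.broker.listener.name=INTERNAL
```

With this layout, external clients would bootstrap against kafka-public:9093 with security.protocol=SSL and receive only the EXTERNAL addresses in metadata.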

Prevention and Best Practices

  • Set advertised.listeners to client-routable addresses explicitly; never rely on the default that advertises the broker’s hostname, which often is not resolvable by clients.
  • For mixed internal/external clients, configure separate named listeners (e.g., INTERNAL and EXTERNAL) so each audience gets a reachable address.
  • Keep bootstrap.servers pointing at a stable load-balanced name or several brokers, not a single hardcoded IP.
  • Validate security-protocol alignment in client config validation so a PLAINTEXT client can’t silently hang against an SSL listener.
  • Audit ACLs when introducing a new client principal; a missing DESCRIBE looks exactly like a network timeout from the client side.
  • For fast triage of a metadata timeout, the free incident assistant can separate a bootstrap problem from an advertised-listener one.

Related Errors

  • Connection to node -1 could not be established — the bootstrap-unreachable warning that precedes this timeout.
  • Topic <name> not present in metadata after 60000 ms — the missing-topic variant.
  • TopicAuthorizationException / Not authorized to access topics — the ACL-denial form.
  • CoordinatorNotAvailableException — a related metadata-time failure for consumer-group coordination.

Frequently Asked Questions

Why does the client time out instead of failing fast? Metadata fetching is retried until max.block.ms (default 60000 ms) elapses, because transient unavailability during a restart or election is normal. The timeout is the client giving up after exhausting that window.

My broker is up and bootstrap works — why still a timeout? Almost always advertised.listeners. The client reaches bootstrap, gets back broker addresses it cannot resolve or route to, and fails on the follow-on connection. Check what the brokers advertise versus what the client can reach.

Can an ACL problem cause this exact message? Yes. If the principal lacks DESCRIBE on the topic, the topic’s metadata never resolves for that client and it times out with the same Failed to update metadata message. Check the broker log for Denied to distinguish it from a network issue.

Does increasing max.block.ms fix it? No. A longer timeout only delays the failure; metadata still cannot be obtained. Raising it is appropriate only to ride out genuinely transient elections, not to mask a misconfiguration.

How do I prove it’s a network block and not config? Run kafka-broker-api-versions.sh against the bootstrap endpoint from the client host. A hang there points at network/DNS/firewall to bootstrap; a success there with the app still timing out points at advertised listeners or ACLs.
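That check can be wrapped in a small function so the result reads as a verdict. This is a sketch under a few assumptions: kafka-broker-api-versions.sh is on the PATH of the client host, coreutils timeout is available, and a hang is treated the same as a failure. The function name triage_bootstrap and the KAFKA_PROBE override variable are hypothetical.

```shell
# Hypothetical helper: probes the bootstrap endpoint and prints which half
# of the metadata exchange to investigate.
# KAFKA_PROBE (hypothetical) overrides the probe command, e.g. to give a full path.
triage_bootstrap() {
  local bootstrap="$1" secs="${2:-15}"
  if timeout "$secs" "${KAFKA_PROBE:-kafka-broker-api-versions.sh}" \
       --bootstrap-server "$bootstrap" >/dev/null 2>&1; then
    echo "bootstrap reachable: suspect advertised.listeners, ACLs, or a missing topic"
  else
    echo "bootstrap unreachable: suspect DNS, port, firewall, or security.protocol"
  fi
}
```

Run it from the client host as triage_bootstrap kafka-broker:9092. For SSL or SASL listeners the probe also needs client security settings (the tool accepts them through --command-config), which this sketch leaves out.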
