DevOps AI ToolKit
AI for Kafka · By James Joyner IV · 9 min read

Kafka Error Guide: 'NoBrokersAvailable' Client Cannot Reach Cluster

Fix kafka-python NoBrokersAvailable: diagnose wrong bootstrap_servers, DNS failures, firewall blocks, security protocol mismatches, and down brokers.

  • #kafka
  • #troubleshooting
  • #errors
  • #client

Exact Error Message

This is a client-side exception, most familiar from the kafka-python library, raised when none of the configured bootstrap brokers can be reached:

Traceback (most recent call last):
  File "producer.py", line 8, in <module>
    producer = KafkaProducer(bootstrap_servers="kafka-1.internal:9092")
  File ".../kafka/producer/kafka.py", line 381, in __init__
    client = KafkaClient(metrics=self._metrics, **self.config)
  File ".../kafka/client_async.py", line 244, in __init__
    self._bootstrap_connect()
kafka.errors.NoBrokersAvailable: NoBrokersAvailable

The accompanying debug logs usually show the failed bootstrap attempts:

DEBUG:kafka.conn:<BrokerConnection client_id=kafka-python-producer-1 node_id=bootstrap-0 host=kafka-1.internal:9092> connecting to ('10.0.4.21', 9092)
DEBUG:kafka.conn:Connection attempt to ... failed: [Errno 111] Connection refused
WARNING:kafka.conn:Node bootstrap-0 connection failed -- refreshing metadata

What the Error Means

NoBrokersAvailable means the Kafka client tried every entry in bootstrap_servers and could not establish a working connection to a single one. Because the client never reached a broker, it could not fetch cluster metadata, so it gives up before producing or consuming anything.

This is purely a client-to-cluster reachability problem at the bootstrap stage. It is the kafka-python equivalent of the Java client’s repeated “Broker may not be available” warnings followed by a metadata timeout. It says nothing about topics, partitions, consumer groups, or offsets — the client never got that far.

The cause is almost always one of: the cluster is genuinely unreachable (down or firewalled), the client is misconfigured (wrong host/port, wrong security protocol), or DNS/network resolution fails. A subtle and very common cause is a security protocol mismatch — connecting with PLAINTEXT to an SSL/SASL_SSL listener — which makes the handshake fail and the broker appear unavailable.
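The "tried every entry and reached none" behaviour can be modelled with the standard library alone. The sketch below is not kafka-python's implementation: it only checks raw TCP reachability, the function names are hypothetical, and it assumes every entry carries an explicit port.

```python
import socket


def parse_bootstrap(servers):
    """Split 'host1:9092,host2:9093' into [(host, port), ...]."""
    pairs = []
    for entry in servers.split(","):
        # rpartition splits on the last colon, so the port is always the tail.
        host, _, port = entry.strip().rpartition(":")
        pairs.append((host.strip("[]"), int(port)))
    return pairs


def reachable_brokers(servers, timeout=3.0):
    """Return the bootstrap entries that accept a TCP connection."""
    reachable = []
    for host, port in parse_bootstrap(servers):
        try:
            with socket.create_connection((host, port), timeout=timeout):
                reachable.append((host, port))
        except OSError:
            # DNS failure, refusal, and timeout are all OSError subclasses.
            continue
    return reachable
```

An empty list from reachable_brokers() corresponds to the situation in which the client would give up. A non-empty list only proves TCP reachability; the protocol handshake can still fail.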

Common Causes

  • Wrong bootstrap_servers. A typo, wrong port, or pointing at a host that does not run Kafka.
  • Broker(s) down. Every listed bootstrap broker is stopped or crash-looping.
  • DNS resolution failure for the bootstrap hostname.
  • Firewall / security group blocks 9092 (or 9093) between client and cluster.
  • Security protocol mismatch. Client uses PLAINTEXT against an SSL/SASL_SSL listener (or vice versa), so the connection never completes.
  • Missing SASL/SSL config (no security_protocol, sasl_mechanism, credentials, or CA cert) when the listener requires it.
  • advertised.listeners unreachable from the client even though bootstrap host resolves.
  • Wrong cluster / environment — pointing at a cluster that is unreachable from this network.

How to Reproduce the Error

Point a kafka-python client at an unreachable or wrong endpoint:

# A host/port with no broker, or an SSL listener addressed as plaintext
kafka-broker-api-versions.sh --bootstrap-server kafka-1.internal:9092
# If this fails, a kafka-python KafkaProducer(bootstrap_servers="kafka-1.internal:9092")
# raises kafka.errors.NoBrokersAvailable on construction.

The classic reproduction of the silent variant: connect with default security_protocol="PLAINTEXT" to a broker whose listener is SASL_SSL. The TCP connection may open but the protocol negotiation fails, and the client reports NoBrokersAvailable.
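The same reproduction can be driven from Python. This is a minimal sketch that assumes kafka-python is installed and that nothing listens on 127.0.0.1:1; the function name is hypothetical, and the exact delay before the exception depends on the client's timeout settings.

```python
def reproduces_no_brokers(bootstrap_servers="127.0.0.1:1"):
    """Return True if constructing a producer raises NoBrokersAvailable."""
    # Imported inside the function so a missing kafka-python install
    # only fails when the function is actually called.
    from kafka import KafkaProducer
    from kafka.errors import NoBrokersAvailable

    try:
        # security_protocol is left at its default, PLAINTEXT.
        producer = KafkaProducer(bootstrap_servers=bootstrap_servers)
    except NoBrokersAvailable:
        return True
    producer.close()
    return False
```

Calling reproduces_no_brokers() with the address of a SASL_SSL listener instead of the unused port should exercise the silent variant described above.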

Diagnostic Commands

Confirm name resolution and raw TCP reachability from the client host:

getent hosts kafka-1.internal
nc -z -v kafka-1.internal 9092
# Run this one on the broker host: it lists local listening sockets, not remote ones
ss -ltnp | grep -E ':9092|:9093'
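Where getent and nc are not available on the client host, a rough Python equivalent of the resolution and TCP checks is sketched below. The function name and the result labels are hypothetical, chosen only to separate the DNS, refusal, and timeout cases.

```python
import socket


def diagnose_endpoint(host, port, timeout=3.0):
    """Classify the outcome of resolving and connecting to host:port."""
    try:
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        return "dns_failure"      # the name does not resolve
    address = infos[0][4][:2]     # first resolved (ip, port) pair
    try:
        with socket.create_connection(address, timeout=timeout):
            return "tcp_open"     # something accepted the connection
    except ConnectionRefusedError:
        return "refused"          # host is up, nothing listening, or a REJECT rule
    except socket.timeout:
        return "timeout"          # packets dropped, or host down or unroutable
    except OSError:
        return "unreachable"      # any other network-level error
```

A "tcp_open" result says nothing about the security protocol; it only rules out DNS and firewall causes for that one endpoint.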

Verify a broker actually answers at the protocol level (matching the security protocol):

# PLAINTEXT listener
kafka-broker-api-versions.sh --bootstrap-server kafka-1.internal:9092
# SSL/SASL_SSL listener — supply matching client config
kafka-broker-api-versions.sh --command-config /etc/kafka/client-ssl.properties --bootstrap-server kafka-1.internal:9093
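To tell quickly whether a listener expects TLS, a client-side handshake probe can help. The sketch below uses only the standard library and deliberately skips certificate verification, so it is a diagnostic aid and not a client configuration. The function name is hypothetical, and a listener that demands client certificates may give a misleading answer.

```python
import socket
import ssl


def listener_speaks_tls(host, port, timeout=3.0):
    """Return True if a TLS handshake completes on host:port."""
    context = ssl.create_default_context()
    # Diagnostic only: skip verification so a private CA does not hide the answer.
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    try:
        with socket.create_connection((host, port), timeout=timeout) as raw:
            with context.wrap_socket(raw, server_hostname=host):
                return True
    except OSError:
        # ssl.SSLError is an OSError subclass, so handshake failures,
        # refusals, and timeouts all end up here.
        return False
```

True suggests an SSL or SASL_SSL listener, so a PLAINTEXT client will fail against it. False means either a plaintext listener or an unreachable port, so pair it with a plain TCP check.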

Check the broker’s listener/advertised config and service state from the broker side:

grep -iE 'listeners=|advertised.listeners|security.protocol' /opt/kafka/config/server.properties
sudo systemctl status kafka --no-pager | head -8
sudo journalctl -u kafka --since "15 min ago" | grep -iE 'started|bind|error|ssl|sasl'

Step-by-Step Resolution

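  1. Test reachability from the client host. nc -z -v <host> 9092. A timeout points to a firewall DROP rule or a host that is down or unroutable; an instant refusal means the host is up but nothing is listening on that port (for example, the broker process is stopped) or a REJECT rule is in place.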
  2. Verify DNS. getent hosts <bootstrap_host> — confirm it resolves to the IP you expect. Fix stale or wrong DNS.
  3. Confirm a broker answers. Run kafka-broker-api-versions.sh --bootstrap-server <host>:<port>. If this fails too, the problem is the cluster/network, not your Python code.
  4. Match the security protocol. This is the most-missed cause. If the listener is SSL/SASL_SSL, the client must set security_protocol (and sasl_mechanism, credentials, CA cert) accordingly. A bare PLAINTEXT client against a secured listener yields NoBrokersAvailable.
  5. Check advertised.listeners. If bootstrap resolves but the advertised address is unroutable from the client, fix the broker’s advertised.listeners to a reachable name.
  6. Open the firewall. Allow the client CIDR to 9092/9093 in iptables and any cloud security group.
  7. Validate from Python. Reconstruct the KafkaProducer/KafkaConsumer with corrected bootstrap_servers and security settings; successful construction means metadata was fetched.
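Steps 4 and 7 can be combined into a small helper that builds the client keyword arguments and refuses to continue when a secured protocol lacks its settings. This is a sketch under assumptions: the helper name is hypothetical, it covers only username/password SASL mechanisms such as PLAIN or SCRAM, and insisting on an explicit CA file is a policy choice rather than a kafka-python requirement. The returned keys are kafka-python constructor arguments.

```python
VALID_PROTOCOLS = {"PLAINTEXT", "SSL", "SASL_PLAINTEXT", "SASL_SSL"}


def build_client_config(bootstrap_servers, security_protocol, *,
                        ssl_cafile=None, sasl_mechanism=None,
                        sasl_username=None, sasl_password=None):
    """Return kwargs for KafkaProducer/KafkaConsumer, or raise ValueError."""
    if security_protocol not in VALID_PROTOCOLS:
        raise ValueError(f"unknown security_protocol: {security_protocol!r}")
    config = {
        "bootstrap_servers": bootstrap_servers,
        "security_protocol": security_protocol,
    }
    if security_protocol in ("SSL", "SASL_SSL"):
        if not ssl_cafile:
            raise ValueError("ssl_cafile is required for TLS listeners")
        config["ssl_cafile"] = ssl_cafile
    if security_protocol.startswith("SASL_"):
        if not (sasl_mechanism and sasl_username and sasl_password):
            raise ValueError("SASL needs a mechanism, username, and password")
        config["sasl_mechanism"] = sasl_mechanism
        config["sasl_plain_username"] = sasl_username
        config["sasl_plain_password"] = sasl_password
    return config
```

The result is meant to be passed as KafkaProducer(**config). Because security_protocol is a required argument, a caller cannot fall back to plaintext by omission.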

Prevention and Best Practices

  • List multiple brokers in bootstrap_servers so one down broker does not cause NoBrokersAvailable.
  • Centralize client connection config (host, port, security protocol, credentials) so environments cannot drift into protocol mismatches.
  • Use a stable DNS name or VIP for bootstrap rather than hardcoded IPs, so broker moves do not break clients.
  • Fail loudly in client config validation when a required security_protocol or credential is missing, rather than letting it default to plaintext.
  • Add a startup readiness check that runs kafka-broker-api-versions.sh (with the right client config) from the client network and alerts on failure.
  • Manage firewall rules through configuration management so a baseline reapply cannot silently drop the Kafka allow rule.

The free incident assistant can turn the traceback plus nc/systemctl output into a likely cause.

Related Errors

  • Connection ... could not be established. Broker may not be available: the Java client equivalent of the same bootstrap-reachability failure.
  • BrokerEndPointNotAvailableException: the broker has no endpoint for the requested listener/security protocol, a frequent root cause of a mismatch-driven NoBrokersAvailable.
  • SSLHandshakeException / SaslAuthenticationException: what surfaces when the protocol matches but TLS/auth itself fails.
  • TimeoutException: Topic ... not present in metadata: the Java-side timeout when metadata can never be fetched.

Frequently Asked Questions

Why do I get NoBrokersAvailable when the broker is clearly running? Most often a security protocol mismatch (plaintext client vs SSL/SASL listener), an unreachable advertised.listeners address, or a firewall between you and the cluster. Test with kafka-broker-api-versions.sh using the matching client config.

Is this a kafka-python bug? No. It is a generic client-side signal that no bootstrap broker could be reached. It happens with other clients too; kafka-python just names it NoBrokersAvailable.

My nc test connects but Python still fails. Why? A successful TCP connect does not mean the protocol handshake succeeds. If the listener requires SSL/SASL and your client uses plaintext, the connection opens then fails negotiation, yielding NoBrokersAvailable. Set security_protocol correctly.

Could DNS be the cause? Yes. If the bootstrap hostname does not resolve (or resolves to the wrong IP), no broker can be reached. Confirm with getent hosts <host>.

Does listing more brokers help? Yes. With several entries in bootstrap_servers, the client tolerates individual broker outages. It only raises NoBrokersAvailable when none of them can be reached.
