Kafka Error Guide: 'java.net.SocketTimeoutException: Connection timed out' TCP Connect
Fix Kafka 'Connection timed out' at TCP connect — diagnose firewall DROP rules, security groups, and routing black holes, distinct from request.timeout.ms.
Exact Error Message
This appears when a Kafka client tries to open a TCP connection to a broker and the connection attempt hangs until it times out:
[2026-06-29 08:14:55,003] WARN [Producer clientId=producer-9] Connection to node 2 (kafka-2.internal/10.0.4.32:9092) could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient)
java.net.SocketTimeoutException: Connection timed out
at java.base/sun.nio.ch.NioSocketImpl.timedFinishConnect(NioSocketImpl.java:546)
at java.base/sun.nio.ch.NioSocketImpl.connect(NioSocketImpl.java:602)
at org.apache.kafka.common.network.Selector.connect(Selector.java:285)
at org.apache.kafka.clients.NetworkClient.initiateConnect(NetworkClient.java:1037)
The OS-level variant from the kernel:
java.net.ConnectException: Connection timed out (Connection timed out)
at java.base/sun.nio.ch.Net.pollConnect(Native Method)
What the Error Means
“Connection timed out” at the TCP connect stage means the client sent SYN packets to the broker’s address and port and never received a SYN-ACK back. After the connect timeout elapses, the attempt is abandoned. The crucial detail is silence: the packets vanished. Nothing rejected them (that would be an instant “Connection refused”), and no session was ever established (so it is not a reset or broken pipe).
That silent black-holing is the signature of a packet DROP: a firewall, security group, or NetworkPolicy that discards traffic without replying, or a routing problem that leaves packets with no path to the broker. Either the host does not exist on that path, it is unreachable, or a device is silently swallowing the SYN.
This is fundamentally different from a request.timeout.ms timeout, which occurs after a connection is established, when a request is sent but no response returns in time. A SocketTimeoutException at connect means the socket never opened. It is a lower, earlier failure that no amount of tuning request.timeout.ms will fix.
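As a rough illustration of telling the two stages apart, the sketch below scans a stack trace for the connect-stage frames that appear in the traces quoted earlier. The function name is_connect_stage and the marker list are assumptions made for this example, not part of any Kafka tooling, and the list may need extending for other JDK or client versions.

```python
# Frames taken from the example traces in this guide. The list is an
# assumption for illustration and is not exhaustive.
CONNECT_STAGE_FRAMES = (
    "Selector.connect(",
    "NetworkClient.initiateConnect(",
    "Net.pollConnect(",
    "NioSocketImpl.timedFinishConnect(",
    "NioSocketImpl.connect(",
)


def is_connect_stage(trace: str) -> bool:
    """Return True if any stack frame in the trace belongs to TCP connect."""
    frames = [
        line.strip()
        for line in trace.splitlines()
        if line.strip().startswith("at ")
    ]
    return any(
        marker in frame
        for frame in frames
        for marker in CONNECT_STAGE_FRAMES
    )
```

If this returns True for a logged trace, the failure happened before any socket was open, so the network path is the place to look rather than request.timeout.ms.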
Common Causes
- Firewall / security group DROP rule. The Kafka port (9092/9093) is not allowed from the client’s subnet, and the rule drops rather than rejects, so the SYN is silently discarded.
- Cloud network ACL or Kubernetes NetworkPolicy blocking the port between client and broker.
- Routing black hole. No route to the broker’s subnet, a missing peering connection, or a misconfigured VPC/VPN, so packets go nowhere.
- Wrong / unreachable advertised address. The broker advertises an address that exists but is not routable from this client, so connects hang.
- Broker host down at the network level (powered off / NIC down) such that nothing answers the SYN.
- Asymmetric routing or NAT misconfiguration dropping the return path.
How to Reproduce the Error
Block the Kafka port with a DROP rule and connect a client:
# on a test broker host, a DROP rule (illustrative — do not run in prod):
# iptables -A INPUT -p tcp --dport 9092 -j DROP
kafka-broker-api-versions.sh --bootstrap-server kafka-2.internal:9092
The client hangs and eventually fails with SocketTimeoutException: Connection timed out. Contrast this with a REJECT rule, which produces an instant “Connection refused” — that difference is the whole diagnosis.
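The same contrast can be observed from code. The following is a minimal sketch, assuming Python's standard socket module is available on the client host; classify_connect and its return labels are hypothetical names chosen for this example. It makes one TCP connect attempt and reports whether the outcome was a completed handshake, an active refusal, or silence.

```python
import socket


def classify_connect(host: str, port: int, wait_s: float = 8.0) -> str:
    """Attempt one TCP connect and name the outcome."""
    try:
        with socket.create_connection((host, port), timeout=wait_s):
            return "open"      # handshake completed, a listener answered
    except (socket.timeout, TimeoutError):
        return "timeout"       # no reply before the deadline: DROP or black hole
    except ConnectionRefusedError:
        return "refused"       # active rejection: REJECT rule or no listener
    except OSError as exc:
        return f"error: {exc}" # DNS failure, no route to host, and so on
```

Calling classify_connect("kafka-2.internal", 9092) from the affected client host would return "timeout" in the DROP scenario above and "refused" in the REJECT scenario.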
Diagnostic Commands
All read-only.
Distinguish timeout (DROP) from refused (REJECT/no listener) — the single most useful test:
nc -z -v -w 8 kafka-2.internal 9092
A hang to the 8-second timeout = DROP/black hole. An instant “refused” = a different problem.
Confirm DNS resolves to the IP you expect:
getent hosts kafka-2.internal
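Where getent is not available, a comparable lookup can be sketched with Python's standard resolver call. The helper name resolved_addresses is an assumption for this example; it returns every address the system resolver gives for the name, which can then be compared by eye or in a script against the broker IP you expect.

```python
import socket
from typing import Set


def resolved_addresses(host: str) -> Set[str]:
    """Return the set of IP addresses the system resolver gives for host."""
    infos = socket.getaddrinfo(host, None, proto=socket.IPPROTO_TCP)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the address string for both IPv4 and IPv6.
    return {info[4][0] for info in infos}
```

For the broker in the log above, the expectation would be that "10.0.4.32" is in resolved_addresses("kafka-2.internal").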
Trace the path to find where packets die (TCP probes to the Kafka port):
sudo traceroute -T -p 9092 kafka-2.internal
From the broker host, confirm it is actually listening (rules out a broker-side bind issue):
sudo ss -ltnp | grep ':9092'
End-to-end probe via the Kafka CLI:
kafka-broker-api-versions.sh --bootstrap-server kafka-2.internal:9092
Step-by-Step Resolution
1. Confirm it is a connect timeout, not a request timeout. The stack trace frame is the tell: Selector.connect / initiateConnect / pollConnect means TCP connect. If instead you see request/await frames after a connection was made, you have a request.timeout.ms problem — a different guide.
2. Test refused vs dropped. nc -z -v -w 8 <host> 9092. A hang-then-timeout confirms a DROP/black hole (network policy). An instant refusal points to a missing listener or REJECT rule instead.
3. Verify DNS and target. getent hosts <host> — make sure the name resolves to the broker’s real, routable IP and that you are using the correct advertised address for this network.
4. Find where packets die. traceroute -T -p 9092 <host> shows the last hop that responds. If it stops before the broker, a firewall/security group/route in between is dropping traffic. Open 9092/9093 from the client CIDR in that device (security group, NACL, NetworkPolicy, or host firewall).
5. Confirm the broker side. On the broker, ss -ltnp | grep :9092 proves it is listening. If it is listening and the path is open, re-test with kafka-broker-api-versions.sh.
6. Check advertised listeners. If bootstrap succeeds but connecting to a specific node times out, the broker may advertise an address routable internally but not from this client. Make advertised listeners reachable from every client network.
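For step 6, it can help to list exactly which host and port pairs a broker advertises, so that each one can be probed from the client network. The sketch below parses an advertised.listeners value in its usual comma-separated NAME://host:port form. The function name and the listener names and hosts in the usage note are hypothetical.

```python
from typing import List, Tuple


def parse_advertised_listeners(value: str) -> List[Tuple[str, str, int]]:
    """Split an advertised.listeners value into (listener, host, port) tuples."""
    endpoints = []
    for entry in value.split(","):
        entry = entry.strip()
        if not entry:
            continue
        # "EXTERNAL://host:9093" -> name "EXTERNAL", address "host:9093"
        name, _, address = entry.partition("://")
        # rpartition keeps IPv6 literals such as "[::1]:9092" intact
        host, _, port = address.rpartition(":")
        endpoints.append((name, host.strip("[]"), int(port)))
    return endpoints
```

Given a hypothetical value such as "INTERNAL://kafka-2.internal:9092,EXTERNAL://kafka-2.example.com:9093", this yields two endpoints. An external client should be able to connect to the EXTERNAL one; if only the INTERNAL address is reachable from inside the cluster network, that matches the symptom described in step 6.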
Prevention and Best Practices
- Manage firewall and security-group rules for 9092/9093 through configuration management so an allow rule for client subnets cannot silently disappear and start dropping SYNs.
- Prefer REJECT over DROP on intentional blocks where safe, so clients fail fast instead of hanging to a timeout (faster diagnosis).
- Make advertised.listeners an explicit, routable name reachable from every client network, and document internal vs external listeners separately.
- Add a synthetic TCP-connect check (nc -z) from each client network to every broker and alert on timeout, catching black holes before applications do.
- Tune socket.connection.setup.timeout.ms only to control how fast the client gives up. It does not fix a blocked path; it just shortens the wait.
- More network-failure patterns are collected in the Kafka guides.
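The synthetic TCP-connect check can be sketched roughly as follows, as an alternative to scripting nc -z. This is only an illustration under stated assumptions: find_black_holes and connect_times_out are hypothetical names, the broker list is supplied by the caller, and alert delivery is left out. Brokers are probed in parallel so that one hanging connect does not delay the others.

```python
import socket
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List, Tuple

Broker = Tuple[str, int]  # (host, port)


def connect_times_out(broker: Broker, wait_s: float = 8.0) -> bool:
    """Return True only when the TCP connect gets no answer in time."""
    try:
        with socket.create_connection(broker, timeout=wait_s):
            return False
    except (socket.timeout, TimeoutError):
        return True
    except OSError:
        return False  # refused, DNS failure, etc. are different problems


def find_black_holes(
    brokers: Iterable[Broker],
    probe: Callable[[Broker], bool] = connect_times_out,
) -> List[Broker]:
    """Probe all brokers concurrently and return those that timed out."""
    targets = list(brokers)
    workers = max(1, min(32, len(targets)))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(probe, targets))
    return [broker for broker, dropped in zip(targets, results) if dropped]
```

Run from each client network on a schedule, any non-empty result from find_black_holes is a candidate for an alert. The probe parameter exists so the check can be exercised without real network access.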
Related Errors
- Connection refused: instant rejection (REJECT rule or no listener), the opposite signature to this silent timeout.
- Connection to node -1 could not be established: the bootstrap-time symptom; a connect timeout to the bootstrap address is one of its causes.
- request.timeout.ms exceeded: a post-connection request timeout, not a TCP-connect timeout; do not confuse the two.
- Connection reset by peer: an established connection forcibly closed, not a connect that never completed.
Frequently Asked Questions
How is this different from request.timeout.ms?
This is a connect timeout — the TCP socket never opened. request.timeout.ms fires after a connection exists, when a request gets no response. The stack-trace frames (connect/pollConnect vs request/await) tell them apart.
Timeout vs connection refused — what is the difference?
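Refused is an instant, active rejection: a TCP RST from a host with nothing listening on the port, or the reply generated by a REJECT rule (an ICMP error or an RST, depending on how the rule is configured). Timeout is silence (a DROP rule, network ACL, or routing black hole). nc -z -v -w 8 distinguishes them.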
The broker is up and listening — why does it time out?
A device between client and broker is dropping packets, or the broker advertises an address this client cannot route to. Use traceroute -T -p 9092 to find the dead hop.
Will increasing the timeout fix it?
No. A longer timeout only makes the client wait longer before failing. You must open the blocked path or fix routing.
Why does my Kubernetes external client time out but internal works?
The external client likely targets an internal service name or pod IP that is not routable from outside. Use the external listener (LoadBalancer/NodePort) advertised address.