Kafka Error Guide: 'java.io.IOException: Broken pipe' Write to Closed Socket

Exact Error Message

This surfaces in a Kafka producer (or broker) log when the local side tries to write to a socket whose other end is already gone:

[2026-06-29 11:18:44,201] WARN [Producer clientId=producer-2] Error in I/O with kafka-2.internal/10.0.4.32:9092
java.io.IOException: Broken pipe
        at java.base/sun.nio.ch.FileDispatcherImpl.write0(Native Method)
        at java.base/sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:62)
        at java.base/sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:132)
        at org.apache.kafka.common.network.PlaintextTransportLayer.write(PlaintextTransportLayer.java:152)
        at org.apache.kafka.common.network.KafkaChannel.send(KafkaChannel.java:435)
        at org.apache.kafka.common.network.Selector.write(Selector.java:735)
        at org.apache.kafka.common.network.Selector.poll(Selector.java:483)

On the broker side, the same condition appears while responding to a slow or disconnected client:

[2026-06-29 11:18:44,260] WARN [SocketServer listenerType=ZK_BROKER, nodeId=2] Failed to send response to client 10.0.6.14:51544
java.io.IOException: Broken pipe

What the Error Means

“Broken pipe” is the OS reporting that your process wrote bytes to a TCP socket whose remote end has already closed. After the peer’s FIN arrives, the first write is usually still accepted by the local kernel; the closed peer answers it with an RST, and the next write fails with EPIPE, which Java surfaces as java.io.IOException: Broken pipe.

The crucial distinction: “broken pipe” is a write-side failure, while “connection reset by peer” is typically a read-side failure. Both mean the connection is dead, but broken pipe specifically means you were trying to send when you discovered it. In Kafka, this commonly means a producer was flushing a batch, or a broker was sending a response, to a peer that had already disappeared.

This is not a bootstrap problem (the connection existed) and not authentication (the handshake had completed). The session was healthy, then the far side closed, and the local side noticed only when it next wrote.
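
The sequence described above can be seen outside Kafka with two plain TCP sockets. The following is a minimal sketch, not Kafka code: the function name demo_broken_pipe and the loopback listener standing in for the broker are illustrative assumptions. On Linux the failing write typically reports EPIPE; some platforms or timings may report ECONNRESET instead, so the sketch accepts both.

```python
import errno
import socket
import time


def demo_broken_pipe(max_writes=100):
    """Write to a socket whose peer has closed; return (successful_writes, errno)."""
    # A listener on an ephemeral loopback port stands in for the broker.
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))
    server.listen(1)

    client = socket.create_connection(server.getsockname())
    peer, _ = server.accept()
    peer.close()    # the "broker" goes away; a FIN is sent to the client
    server.close()

    successful = 0
    failure = None
    try:
        for _ in range(max_writes):
            try:
                client.send(b"batch")
                successful += 1
            except OSError as exc:
                # Python ignores SIGPIPE, so the failure surfaces as an exception.
                failure = exc.errno
                break
            time.sleep(0.05)  # give the peer's RST time to arrive
    finally:
        client.close()
    return successful, failure


if __name__ == "__main__":
    writes, err = demo_broken_pipe()
    name = errno.errorcode.get(err, "none")
    print(f"{writes} write(s) accepted before failure: {name}")
```

The point of the sketch is the ordering: the close itself raises nothing on the writing side, and the error only appears on a later write.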

Common Causes

The peer closed the connection while you were sending. A broker restart, crash, or controlled shutdown closes the socket; the producer’s next write hits a broken pipe.
Idle-connection timeout. The broker’s connections.max.idle.ms (default 10 min) or a load balancer’s idle timeout closed an idle connection; the client then writes to it. (Depending on how the socket was closed, the client may see a connection reset instead of a broken pipe.)
Oversized request rejected mid-write. A request exceeding socket.request.max.bytes causes the broker to close the connection; the client sees the write fail. A batch that exceeds only message.max.bytes is normally rejected with a RecordTooLargeException response instead, and the connection stays open.
Broker overload / request-queue full leads the broker to close connections it cannot service, breaking the client’s in-flight write.
Network device (NAT/LB/firewall) dropping the flow while the client is mid-send.

How to Reproduce the Error

Start a long-lived producer and take the broker it is sending to offline:

kafka-console-producer.sh --bootstrap-server kafka-2.internal:9092 --topic events
# in another shell, stop/restart kafka-2 while the producer is actively sending

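The producer’s next write logs Broken pipe. To reproduce the oversized variant on a test broker, send a request larger than the broker’s socket.request.max.bytes while the connection is open; the broker closes the connection and the client’s write fails. A record that exceeds only message.max.bytes is normally rejected with RecordTooLargeException rather than a closed connection.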

Diagnostic Commands

All read-only.

Check whether the target broker recently restarted or shut down:

sudo journalctl -u kafka --no-pager --since "15 min ago" | grep -iE 'shutdown|controlled.shutdown|started|OutOfMemory'

Look for oversized-request or response-send failures on the broker:

sudo journalctl -u kafka --no-pager | grep -iE 'Broken pipe|too large|request.max.bytes|message.max.bytes'
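
Once the grep above returns matches, it can help to see whether they cluster in one minute (a restart) or spread over time (overload or a misconfigured limit). The sketch below is one way to bucket them, assuming the log layout shown in the samples at the top of this guide: a bracketed timestamp line followed by the exception line. The function name broken_pipes_per_minute is hypothetical.

```python
import re
from collections import Counter

# Matches the bracketed timestamp prefix used in this guide's log samples
# and captures it down to the minute.
TIMESTAMP = re.compile(r"^\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}):\d{2},\d{3}\]")


def broken_pipes_per_minute(lines):
    """Count 'Broken pipe' exceptions per minute, keyed by the latest timestamp seen."""
    counts = Counter()
    current_minute = None
    for line in lines:
        match = TIMESTAMP.match(line)
        if match:
            current_minute = match.group(1)
        if "java.io.IOException: Broken pipe" in line and current_minute:
            counts[current_minute] += 1
    return counts


sample = [
    "[2026-06-29 11:18:44,201] WARN [Producer clientId=producer-2] Error in I/O with kafka-2.internal/10.0.4.32:9092",
    "java.io.IOException: Broken pipe",
    "[2026-06-29 11:18:44,260] WARN [SocketServer listenerType=ZK_BROKER, nodeId=2] Failed to send response to client 10.0.6.14:51544",
    "java.io.IOException: Broken pipe",
]
per_minute = broken_pipes_per_minute(sample)
```

To use it on real logs, pass an open file or the lines captured from journalctl in place of the sample list.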

Confirm the broker’s configured size and idle limits (read-only view; settings missing from the file fall back to their defaults):

grep -E 'socket.request.max.bytes|message.max.bytes|connections.max.idle.ms' /opt/kafka/config/server.properties

Verify the broker is currently reachable and serving:

kafka-broker-api-versions.sh --bootstrap-server kafka-2.internal:9092

Inspect live socket state and how long connections persist:

ss -tni dst 10.0.4.32:9092

Step-by-Step Resolution

1. Check for a broker restart. Use journalctl to see if the target broker shut down or crashed around the error time. If so, the broken pipe is expected churn; ensure the client has sane retries, delivery.timeout.ms, and retry.backoff.ms so it reconnects and resends. Fix the restart/OOM root cause.

2. Rule out oversized requests. If the error correlates with large messages, compare your record/batch size against the broker’s message.max.bytes and socket.request.max.bytes. Either reduce client max.request.size/batch.size or raise the broker limits to match. The broker log’s “too large” lines confirm this path.

3. Align idle timeouts. If broken pipes appear on connections that sat idle, the broker or a middlebox closed the idle socket. Set client connections.max.idle.ms below the broker’s and below any LB/firewall idle timeout so the client recycles connections proactively.

4. Check broker load. A saturated request queue or network thread pool can cause the broker to drop connections. Review broker metrics; scale partitions/brokers or tune num.network.threads if overloaded.

5. Confirm recovery. kafka-broker-api-versions.sh against the broker should return cleanly once it is healthy. Healthy clients will reconnect and resume sends.
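
The comparisons in steps 2 and 3 can be scripted so they are repeatable. This is a rough sketch under stated assumptions: the helper names parse_properties and check_limits are hypothetical, the middlebox timeout is something you supply yourself, and the fallback values are the defaults documented for recent Apache Kafka releases, which you should verify against your own version.

```python
# Documented defaults for recent Apache Kafka releases (assumption: verify
# against the version you run).
BROKER_DEFAULTS = {
    "socket.request.max.bytes": 104857600,
    "message.max.bytes": 1048588,
    "connections.max.idle.ms": 600000,
}
CLIENT_DEFAULTS = {
    "max.request.size": 1048576,
    "connections.max.idle.ms": 540000,
}


def parse_properties(text):
    """Parse key=value lines, skipping blanks and # comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def check_limits(client, broker, middlebox_idle_ms=None):
    """Return warnings where client settings can lead to writes on a closed socket."""
    def value(props, defaults, key):
        return int(props.get(key, defaults[key]))

    warnings = []
    request_size = value(client, CLIENT_DEFAULTS, "max.request.size")
    for key in ("socket.request.max.bytes", "message.max.bytes"):
        limit = value(broker, BROKER_DEFAULTS, key)
        if request_size > limit:
            warnings.append(
                f"client max.request.size {request_size} exceeds broker {key} {limit}"
            )

    client_idle = value(client, CLIENT_DEFAULTS, "connections.max.idle.ms")
    broker_idle = value(broker, BROKER_DEFAULTS, "connections.max.idle.ms")
    if client_idle >= broker_idle:
        warnings.append(
            f"client idle timeout {client_idle} ms is not below broker's {broker_idle} ms"
        )
    if middlebox_idle_ms is not None and client_idle >= middlebox_idle_ms:
        warnings.append(
            f"client idle timeout {client_idle} ms is not below middlebox's {middlebox_idle_ms} ms"
        )
    return warnings
```

For example, check_limits(parse_properties(client_text), parse_properties(broker_text), middlebox_idle_ms=350000) returns an empty list when the client limits sit below every broker and middlebox limit.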

Prevention and Best Practices

Configure producer reliability: acks=all, sensible retries, and delivery.timeout.ms so a broken pipe results in a retried send, not a dropped record.
Keep client max.request.size/batch.size within the broker’s message.max.bytes and socket.request.max.bytes — and document the matched values.
Set client connections.max.idle.ms below the broker’s and any middlebox idle timeout to avoid writing to a half-closed socket.
Use controlled.shutdown.enable=true and rolling restarts so closes are graceful and clients fail over cleanly.
Monitor sustained broken-pipe rates as a signal of overload or a chronically misconfigured size/idle limit, while ignoring brief spikes during deploys.
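
The first practice above can be checked mechanically before a producer is deployed. The sketch below is illustrative only: check_producer_reliability is a hypothetical helper, and the fallback values used when a key is absent are assumptions based on documented producer defaults. The one hard rule it encodes is that delivery.timeout.ms should be at least linger.ms plus request.timeout.ms.

```python
def check_producer_reliability(cfg):
    """Return a list of problems in a producer config given as a dict of strings."""
    problems = []

    # acks=all (also written -1) waits for all in-sync replicas.
    if str(cfg.get("acks", "all")) not in ("all", "-1"):
        problems.append("acks is not 'all'")

    # With retries=0 a broken pipe fails the send instead of resending it.
    if int(cfg.get("retries", 2147483647)) == 0:
        problems.append("retries is 0, so a failed write is not resent")

    # Fallbacks below are assumed defaults; confirm them for your client version.
    delivery = int(cfg.get("delivery.timeout.ms", 120000))
    floor = int(cfg.get("linger.ms", 0)) + int(cfg.get("request.timeout.ms", 30000))
    if delivery < floor:
        problems.append(
            f"delivery.timeout.ms {delivery} is below linger.ms + request.timeout.ms ({floor})"
        )
    return problems


risky = {"acks": "1", "retries": "0", "delivery.timeout.ms": "10000"}
for problem in check_producer_reliability(risky):
    print(problem)
```

An empty list means none of these three checks found an issue; it does not prove the configuration is right for your durability needs.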

Related Errors

Connection reset by peer: the read-side counterpart; the peer RST’d a connection you were reading from.
Node N disconnected: Kafka’s higher-level report of a known broker connection dropping, often alongside a broken pipe.
RecordTooLargeException: the application-level error when a record exceeds limits, related to the oversized-request cause here.
SocketTimeoutException: a request/connect timeout rather than a write to a dead socket.

Frequently Asked Questions

What is the difference between broken pipe and connection reset? Broken pipe is a write failure (you sent to a closed socket); connection reset is usually a read failure (the peer RST’d you). Both mean the connection is dead; the stack trace’s write vs read frame tells them apart.

Is a broken pipe data loss? Not if the producer is configured to retry (acks=all, retries, delivery.timeout.ms). The in-flight batch is resent to a healthy broker. Without retries, the send can fail permanently.
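
To make the retry behaviour concrete, here is a minimal sketch of the idea: reconnect, back off, and resend the same payload until a retry budget runs out. It is not how the Kafka producer is implemented (the Java client handles retries internally through retries, retry.backoff.ms, and delivery.timeout.ms); the names send_with_retry, send, and reconnect are hypothetical.

```python
import time


def send_with_retry(send, reconnect, payload, retries=5, backoff_s=0.1, sleep=time.sleep):
    """Call send(payload); on a dead connection, back off, reconnect, and resend."""
    attempt = 0
    while True:
        try:
            return send(payload)
        except (BrokenPipeError, ConnectionResetError):
            if attempt >= retries:
                raise  # retry budget exhausted: the send fails permanently
            attempt += 1
            sleep(backoff_s * 2 ** (attempt - 1))  # exponential backoff
            reconnect()
```

With retries=0 the first broken pipe propagates to the caller, which mirrors the "without retries" case in the answer above.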

Why does my broker log broken pipe? The broker was sending a response to a client that already went away (slow consumer, client crash, or idle close). It is usually benign on the broker side.

Could oversized messages cause this? Yes. A request larger than socket.request.max.bytes makes the broker close the connection, and the client’s write fails as a broken pipe. A batch that exceeds only message.max.bytes usually comes back as a RecordTooLargeException instead.

How do I stop idle-timeout broken pipes? Set the client connections.max.idle.ms lower than the broker’s and any load balancer/firewall idle timeout so the client closes and reopens before the peer does.
