RabbitMQ Error Guide: '{socket_error, epipe}' Broken Pipe on Write
Fix RabbitMQ epipe / broken pipe errors: trace writes to a closed socket from slow consumers, vanished clients, and network drops, and stop one-sided connection loss.
- #rabbitmq
- #troubleshooting
- #errors
- #connectivity
Exact Error Message
A broken-pipe error means RabbitMQ (or a client) tried to write to a TCP socket whose other end is already gone. The write fails with epipe — the POSIX EPIPE errno. It is the write-side counterpart to a connection reset: instead of receiving an RST while reading, the broker discovers the dead peer when it attempts to send a frame.
2026-06-29 11:18:04.661 [warning] <0.30551.6> closing AMQP connection <0.30551.6> (10.0.6.88:55012 -> 10.0.4.21:5672, vhost: 'prod', user: 'feed-consumer'):
{writer,send_failed,{error,epipe}}
2026-06-29 11:18:04.662 [error] <0.30551.6> error on AMQP connection <0.30551.6> (10.0.6.88:55012 -> 10.0.4.21:5672, state: running):
{inet_error,epipe}
A client publisher pushing a large batch may see:
pika.exceptions.StreamLostError: Stream connection lost: BrokenPipeError(32, 'Broken pipe')
What the Error Means
epipe is raised when a process writes to a socket (or pipe) that has been closed by the peer. In RabbitMQ the broker’s connection writer process holds frames destined for the client; when it flushes them and the client’s socket is already gone, the kernel returns EPIPE and the writer fails with {writer, send_failed, {error, epipe}}.
The defining characteristic is direction: the failure is detected on a write, not a read. That usually means the broker had data queued to deliver — most commonly messages for a consumer — and the consumer disappeared. So broken-pipe errors skew toward the delivery path (broker writing to consumers) and toward large or backed-up writes, whereas resets skew toward idle reads.
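The write-side nature of the failure can be seen without RabbitMQ at all. The following is a minimal sketch, assuming a POSIX system and Python 3; the function name `write_to_closed_peer` is made up for illustration, and a local socket pair stands in for the broker and client ends of a connection.

```python
import errno
import socket


def write_to_closed_peer() -> int:
    """Return the errno raised when sending to a socket whose peer has closed."""
    # socketpair() gives two connected local stream sockets. Here they stand
    # in for the broker side (writer) and the client side (peer).
    writer, peer = socket.socketpair()
    try:
        peer.close()  # the "client" disappears
        try:
            writer.send(b"frame")  # the "broker" tries to deliver a frame
        except OSError as exc:
            return exc.errno
        return 0  # no error: the write was accepted
    finally:
        writer.close()


if __name__ == "__main__":
    code = write_to_closed_peer()
    print(errno.errorcode.get(code, code))
```

Nothing fails until the writer actually sends, which is the same pattern the broker shows in its log. Over real TCP the first write after the peer vanishes is often accepted by the kernel, and the error surfaces on a later write, so the failing frame is not necessarily the first one sent to the dead peer.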
Common Causes
1. Slow or stalled consumer with a full TCP buffer
A consumer that stops reading from the socket (blocked on disk, a lock, or a long handler) lets its receive buffer fill, which in turn backs up the broker’s send buffer. When the consumer finally dies or its connection is torn down, the broker’s next write hits a closed socket.
2. Consumer process vanished mid-delivery
The broker is streaming deliveries when the consumer is killed (OOM, deploy, crash). The half-sent frame’s write fails with epipe.
3. Network drop during a large write
A network blip, VPN flap, or path failure during transmission of a large message or a burst of deliveries closes the connection while the broker is mid-write.
4. Middlebox closed the connection one-directionally
A load balancer or firewall that half-closes or RSTs the flow can leave the broker writing into a dead pipe until the kernel reports EPIPE.
5. No heartbeats to detect the dead peer earlier
Without heartbeats, the broker only learns the peer is gone when it tries to write, so failures surface as epipe rather than a cleaner heartbeat timeout.
How to Reproduce the Error
Queue a backlog large enough that delivery takes a while, then start a consumer and kill it while the broker is still streaming deliveries to it:
# Publish a backlog the consumer cannot drain, then kill the consumer mid-stream
python3 - <<'PY'
import pika

conn = pika.BlockingConnection(pika.URLParameters("amqp://guest:guest@10.0.4.21:5672/%2F"))
ch = conn.channel()
ch.queue_declare(queue="epipe-test")
for _ in range(50000):
    # 50,000 x 4 KiB messages, roughly a 200 MB backlog
    ch.basic_publish("", "epipe-test", b"x" * 4096)
conn.close()  # flush and close cleanly so the whole backlog reaches the queue
print("backlog queued; start a consumer, then SIGKILL it while it reads")
PY
# In another shell, start a consumer and immediately: kill -9 <consumer-pid>
If the consumer is killed with kill -9 while the broker is still streaming the backlog to it, the broker’s writer fails with {writer, send_failed, {error, epipe}}.
Diagnostic Commands
# Find broken-pipe events and the connections that hit them
sudo journalctl -u rabbitmq-server --no-pager | grep -E 'epipe|send_failed' | tail -20
# Identify consumers and their unacked backlog (slow consumers)
rabbitmqctl list_consumers queue_name channel_pid ack_required prefetch_count
# Show per-connection send pending and state
rabbitmqctl list_connections name peer_host state send_pend recv_cnt send_cnt
# Spot queues with large ready/unacked counts feeding slow consumers
rabbitmqctl list_queues name messages_ready messages_unacknowledged consumers
# Watch OS-level send-queue backup on the broker's AMQP sockets
sudo ss -tnm state established '( sport = :5672 )' | grep -A1 ':5672' | head
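To see which clients are hit most often, the log lines can be grouped by peer and user. This is a rough sketch that assumes the log layout shown in the sample under "Exact Error Message" (a "closing AMQP connection" header followed by the reason on the next line) and IPv4 peers; `count_epipe_by_client` is a hypothetical helper name, not part of any RabbitMQ tooling.

```python
import re
from collections import Counter

# Matches the "closing AMQP connection" header from the sample log.
# Assumption: IPv4 addresses, as in the sample.
HEADER = re.compile(
    r"closing AMQP connection <[\d.]+> "
    r"\((?P<peer>[\d.]+):\d+ -> [\d.]+:\d+, "
    r"vhost: '(?P<vhost>[^']*)', user: '(?P<user>[^']*)'\)"
)


def count_epipe_by_client(lines):
    """Count epipe closures per (peer_host, user) from broker log lines."""
    counts = Counter()
    pending = None  # client named by the most recent header line
    for line in lines:
        match = HEADER.search(line)
        if match:
            pending = (match["peer"], match["user"])
        elif pending and "epipe" in line:
            # The reason line directly after a header mentions epipe.
            counts[pending] += 1
            pending = None
        else:
            pending = None
    return counts
```

Feeding it the output of the journalctl command above (one list element per line) gives a per-client tally; a client that dominates the count is a natural first candidate for the slow-consumer checks below.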
Step-by-Step Resolution
Step 1: Confirm it is a write-side failure
{writer, send_failed, {error, epipe}} confirms the broker failed while writing to a client, which in practice is most often a consumer receiving deliveries. This points you at the delivery path first, not the publish path.
Step 2: Find the affected consumers
Match the connection pid/peer in the log to a consumer. Check that consumer’s queue for a large messages_unacknowledged count — a sign it was too slow and built up in-flight deliveries.
Step 3: Set a sensible prefetch (QoS)
Unbounded prefetch lets the broker push thousands of deliveries into a consumer that cannot keep up, filling buffers. Set basic.qos(prefetch_count=N) to a modest value so the broker only sends what the consumer can ack.
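A minimal sketch of Step 3 with pika, assuming pika 1.x and a channel that is already open. The helper name `start_bounded_consumer` and the default of 20 are illustrative only; the right value depends on handler latency and message size.

```python
def start_bounded_consumer(channel, queue, handler, prefetch=20):
    """Apply a prefetch limit, then start consuming with manual acks.

    channel is expected to behave like a pika channel; prefetch=20 is an
    arbitrary starting point, not a recommendation from the broker.
    """
    # Set the limit before basic_consume so it covers the first deliveries.
    channel.basic_qos(prefetch_count=prefetch)
    return channel.basic_consume(
        queue=queue,
        on_message_callback=handler,
        auto_ack=False,  # the prefetch limit only counts unacknowledged deliveries
    )
```

With manual acks and a bounded prefetch, the broker stops sending once `prefetch` deliveries are outstanding, so a stalled consumer holds a small, fixed amount of in-flight data rather than a growing backlog.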
Step 4: Make consumers acknowledge and read promptly
Ensure handlers do not block the I/O loop; offload slow work and ack as you go. A consumer that never reads the socket guarantees eventual epipe.
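One way to keep the I/O loop free, sketched here under the assumption of pika's BlockingConnection, is to hand each delivery to a worker thread and schedule the ack back on the connection's own thread. `make_offloading_handler` and `work` are hypothetical names; error handling is deliberately left out.

```python
import functools
from concurrent.futures import ThreadPoolExecutor


def make_offloading_handler(connection, executor, work):
    """Build an on_message callback that runs work(body) off the I/O loop."""

    def process(channel, delivery_tag, body):
        work(body)  # slow part runs in a worker thread
        # pika channels are not thread-safe, so the ack is scheduled to run
        # on the connection's thread instead of being called directly here.
        connection.add_callback_threadsafe(
            functools.partial(channel.basic_ack, delivery_tag=delivery_tag)
        )

    def on_message(channel, method, properties, body):
        # Return immediately so the connection keeps reading the socket
        # and answering heartbeats while the work is in progress.
        executor.submit(process, channel, method.delivery_tag, body)

    return on_message


if __name__ == "__main__":
    # Wiring sketch only; `connection` would be a pika.BlockingConnection.
    pool = ThreadPoolExecutor(max_workers=4)
    print("pass make_offloading_handler(connection, pool, work) to basic_consume")
```

If `work` raises, no ack is sent and the delivery stays unacknowledged until the channel closes, at which point the broker requeues it. A production handler would likely catch the exception and reject or nack explicitly.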
Step 5: Enable heartbeats and reconnect logic
Heartbeats (30-60s) let the broker detect a dead peer before a large write fails, and client auto-reconnect re-establishes delivery cleanly after a drop.
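The reconnect half of Step 5 can be sketched as a small retry loop with exponential backoff. Everything here is an assumption for illustration: `consume_with_reconnect` and `run_once` are hypothetical names, and the attempt count and delays are placeholders rather than recommended values.

```python
import time


def consume_with_reconnect(run_once, retry_on=(ConnectionError,),
                           max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call run_once() until it returns, retrying after connection loss.

    run_once is a caller-supplied function that opens a connection,
    consumes, and returns on clean shutdown. retry_on lists the exception
    types treated as a dropped connection (BrokenPipeError is a
    ConnectionError subclass).
    """
    for attempt in range(max_attempts):
        try:
            return run_once()
        except retry_on:
            if attempt == max_attempts - 1:
                raise  # give up and surface the last failure
            # Back off 1s, 2s, 4s, ... capped at 30s between attempts.
            sleep(min(base_delay * 2 ** attempt, 30.0))


# Example wiring with pika (assumed installed; `consume` is hypothetical):
#   params = pika.ConnectionParameters(host="10.0.4.21", heartbeat=30)
#   consume_with_reconnect(
#       lambda: consume(params),
#       retry_on=(pika.exceptions.AMQPConnectionError,),
#   )
```

pika raises its own exception types, such as the StreamLostError shown earlier, rather than the built-in ConnectionError, which is why the wiring comment passes pika's AMQPConnectionError through `retry_on`.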
Prevention and Best Practices
- Always set a bounded prefetch_count so the broker never streams more in-flight messages than a consumer can drain.
- Keep consumer message handlers fast and non-blocking; move heavy work off the connection’s I/O loop.
- Enable heartbeats so dead peers are detected proactively instead of on the next write.
- Size messages sensibly and avoid pushing huge payloads to consumers that may stall mid-receive.
- Monitor messages_unacknowledged per queue and alert when it grows unbounded; that is the leading indicator of a slow consumer headed for a broken pipe.
- For fast triage, the free incident assistant can connect an epipe spike to a stalled consumer or backlog.
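The messages_unacknowledged check from the list above could be automated along these lines. This is a sketch that assumes the tab-separated output of the list_queues command from the Diagnostic Commands section; `queues_over_unacked_limit` and the limit of 1000 are illustrative, not a recommended threshold.

```python
def queues_over_unacked_limit(listing, limit=1000):
    """Return (queue, unacked) pairs whose unacked count exceeds limit.

    listing is the text printed by:
      rabbitmqctl list_queues name messages_ready messages_unacknowledged consumers
    Assumption: rows are tab-separated, in the column order requested.
    """
    flagged = []
    for line in listing.splitlines():
        fields = line.split("\t")
        # Skip banners and header rows: data rows have four fields with a
        # numeric unacked count in the third column.
        if len(fields) != 4 or not fields[2].isdigit():
            continue
        unacked = int(fields[2])
        if unacked > limit:
            flagged.append((fields[0], unacked))
    return flagged
```

Run on a schedule, a non-empty result is a reasonable trigger for looking at that queue's consumers before their connections start failing with epipe.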
Related Errors
- {socket_error, econnreset}: the read-side equivalent; the broker learns the peer is gone while reading rather than writing.
- missed heartbeats, timeout: proactive detection of a dead peer before a write fails.
- connection.blocked / resource alarm: publishers (not consumers) being throttled, a different flow-control path.
- Error on AMQP connection ... state: running: the generic lifecycle line that wraps the epipe reason.
Frequently Asked Questions
Is broken pipe the same as connection reset?
They are siblings. Both mean the peer is gone, but epipe is detected on a write (the broker had data to send) while econnreset is detected on a read (an RST arrived). Broken pipe therefore concentrates on the delivery path to consumers.
Why does this mostly hit consumers, not publishers?
Because the broker spends its write effort delivering messages to consumers. When a consumer stalls or dies while the broker is sending, the write fails with epipe. Publishers more often see resets on their own send path.
Will setting prefetch really help?
Yes, significantly. Without QoS the broker can push a large in-flight backlog into a slow consumer, filling kernel buffers until a write fails. A bounded prefetch_count keeps in-flight volume matched to consumer speed.
Does a broken pipe lose messages?
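Not when consumers use manual acknowledgements (ack_required is true). Messages that were delivered but not yet acknowledged are requeued when the connection closes and are redelivered, possibly to another consumer. With automatic acknowledgement the broker treats a message as handled once it has been sent, so deliveries in flight when the pipe breaks can be lost.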
How do heartbeats reduce broken-pipe errors?
Heartbeats let the broker notice a dead peer during idle periods and close cleanly with a heartbeat timeout, rather than discovering the dead socket only when it attempts a large write that fails with epipe.