AI for Kafka · By James Joyner IV · 9 min read

Kafka Error Guide: 'Session expired for /controller' ZooKeeper Session Expiry

Fix Kafka ZooKeeper SessionExpiredException for /controller: diagnose long GC pauses, low session timeouts, lost ephemeral nodes, controller re-election, and clock skew.

  • #kafka
  • #troubleshooting
  • #errors
  • #zookeeper

Exact Error Message

[2026-06-29 03:11:47,902] INFO [ZooKeeperClient Kafka server] Waiting until connected.
(kafka.zookeeper.ZooKeeperClient)
[2026-06-29 03:11:53,118] WARN Client session timed out, have not heard from server in 11004ms
for sessionid 0x20000c5d9a10002 (org.apache.zookeeper.ClientCnxn)

[2026-06-29 03:11:53,640] INFO Unable to reconnect to ZooKeeper service, session 0x20000c5d9a10002
has expired (org.apache.zookeeper.ClientCnxn)

[2026-06-29 03:11:53,641] INFO [ZooKeeperClient Kafka server] Session expired.
(kafka.zookeeper.ZooKeeperClient)
[2026-06-29 03:11:53,902] ERROR Error while electing or becoming controller on broker 2
(kafka.controller.KafkaController)
org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /controller
        at org.apache.zookeeper.KeeperException.create(KeeperException.java:130)
        at org.apache.zookeeper.KeeperException.create(KeeperException.java:54)
        at kafka.zookeeper.ZooKeeperClient.handleRequests(ZooKeeperClient.scala:158)
        at kafka.zk.KafkaZkClient.retryRequestsUntilConnected(KafkaZkClient.scala:1947)
        at kafka.zk.KafkaZkClient.registerControllerAndIncrementControllerEpoch(KafkaZkClient.scala:171)
        at kafka.controller.KafkaController.elect(KafkaController.scala:1407)

What the Error Means

This error applies only to legacy, ZooKeeper-based Kafka clusters. KRaft-mode clusters do not use ZooKeeper, so there is no ZooKeeper session to expire and SessionExpiredException for /controller cannot occur there. If you have already migrated to KRaft, this guide does not apply to your cluster.

On a ZooKeeper-based cluster, each broker holds a single client session with the ensemble, kept alive by heartbeats. The broker requests a session timeout through zookeeper.session.timeout.ms, and ZooKeeper uses the negotiated value to track liveness: if the server does not hear from a client within that window, it declares the session expired and discards it. Unlike a ConnectionLossException (a transient disconnect that the client retries on the same session), an expiry is terminal for that session, and the client must establish a brand-new one.

The consequence is significant. A broker’s registration under /brokers/ids and the controller lock under /controller are ephemeral znodes: they exist only for the lifetime of the session that created them. When the session expires, ZooKeeper deletes those ephemeral nodes. The broker disappears from the cluster’s view and must re-register. If the expired session held the controller lock, the brokers watching /controller see it vanish and race to elect a new controller. Around the expiry you will see related log lines such as "Session expired" and "Unable to reconnect to ZooKeeper service, session 0x... has expired", followed by reconnection attempts as the broker rebuilds its session.
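One way to see the ephemeral ownership described above is the ZooKeeper shell's stat command, whose output includes an ephemeralOwner field (the owning session id for an ephemeral znode, 0x0 for a persistent one). The following is a minimal sketch, assuming zookeeper-shell.sh is on your PATH and zk-01:2181 is your ensemble address; the helper name ephemeral_owner is hypothetical.

```shell
# Hypothetical helper: print the ephemeralOwner value from `stat` output on stdin.
ephemeral_owner() {
  awk -F' = ' '$1 == "ephemeralOwner" { print $2 }'
}

# Usage (read-only), assuming zookeeper-shell.sh and zk-01:2181 are valid for your setup:
#   zookeeper-shell.sh zk-01:2181 stat /controller | ephemeral_owner
```

If the printed value matches the session id in a broker's "has expired" log line, that broker's session owned the znode when it was deleted.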

Common Causes

  • A long JVM garbage-collection pause on the broker that exceeds zookeeper.session.timeout.ms — the broker is frozen and cannot heartbeat, so ZooKeeper expires the session.
  • A network blip lasting longer than the session timeout, even if connectivity later recovers.
  • ZooKeeper leader election or overload, during which the ensemble briefly stops servicing heartbeats.
  • zookeeper.session.timeout.ms configured too low for the cluster’s real GC and latency profile, making normal hiccups fatal.
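  • Clock jumps or heavy host load: an abrupt time step can distort timeout accounting (mainly on older ZooKeeper versions that relied on wall-clock time), and a saturated host starves the heartbeat thread of CPU.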

How to Reproduce the Error

To observe a controlled session expiry in a lab:

  1. Start a broker with a deliberately low timeout, e.g. zookeeper.session.timeout.ms=6000.
  2. Induce a stop-the-world pause longer than 6 seconds. A blunt way is to send the broker JVM SIGSTOP for ~10 seconds and then SIGCONT:

    kill -STOP <kafka_pid>; sleep 10; kill -CONT <kafka_pid>

  3. When the JVM resumes, the heartbeat has long since lapsed. ZooKeeper has already expired the session, and the broker logs session 0x... has expired followed by SessionExpiredException as it tries to re-register.

The pause stands in for a real long GC. Watching /brokers/ids before and after shows the broker’s ephemeral node vanish and then reappear once it reconnects, confirming the ephemeral-node behavior.

Diagnostic Commands

All commands below are read-only and safe to run against a live cluster.

# Find the expiry and surrounding ZooKeeper events in the broker log.
journalctl -u kafka -n 500 --no-pager | grep -iE 'session expired|zookeeper'

# Hunt for long GC pauses that line up with the expiry timestamp.
grep -iE 'Total time for which application threads were stopped|Pause' \
  /var/log/kafka/gc.log | tail -50

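# ZooKeeper monitoring stats: latency, outstanding requests, watch count.
# On ZooKeeper 3.5.3 and later, four-letter commands must be allowed via
# 4lw.commands.whitelist in zoo.cfg (only srvr is allowed by default).
echo mntr | nc zk-01 2181

# Server mode and connection/latency overview.
echo stat | nc zk-01 2181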

# Confirm which brokers are currently registered (read-only ls).
zookeeper-shell.sh zk-01:2181 ls /brokers/ids

# Verify the broker can serve the Kafka API after recovery.
kafka-broker-api-versions.sh --bootstrap-server localhost:9092

# Follow ZooKeeper-side events around the expiry window.
journalctl -u zookeeper -n 200 --no-pager

If GC log timestamps show a stop-the-world pause longer than your session timeout immediately before the expiry, GC is the root cause. If echo mntr | nc shows high zk_avg_latency or many zk_outstanding_requests, the ensemble itself is overloaded.
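The mntr check described above can be turned into a small pass/fail helper. This is a sketch only: the function name mntr_check is hypothetical, and the two thresholds are assumptions you would tune to your own ensemble. It reads mntr output on stdin, so it does not touch the cluster by itself.

```shell
# Hypothetical helper: flag an overloaded ensemble member from `mntr` output on stdin.
# $1 = max acceptable zk_avg_latency (ms), $2 = max acceptable zk_outstanding_requests.
# Both thresholds are assumptions to tune for your cluster.
mntr_check() {
  awk -v max_lat="$1" -v max_out="$2" '
    $1 == "zk_avg_latency"          { lat = $2 + 0 }
    $1 == "zk_outstanding_requests" { out = $2 + 0 }
    END {
      if (lat > max_lat || out > max_out) {
        print "OVERLOADED avg_latency=" lat " outstanding=" out
        exit 1
      }
      print "OK avg_latency=" lat " outstanding=" out
    }
  '
}

# Usage (read-only), with example thresholds of 50 ms and 10 requests:
#   echo mntr | nc zk-01 2181 | mntr_check 50 10
```

The non-zero exit status on overload makes the helper usable from a cron job or a monitoring wrapper.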

Step-by-Step Resolution

The steps below change configuration or service state — apply them as the corrective fix once you have confirmed the cause.

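  1. Increase zookeeper.session.timeout.ms so legitimate pauses do not expire the session. Kafka 2.5 and later default to 18000 ms (earlier releases defaulted to 6000 ms), so check whether an older or overridden value is in effect: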

    zookeeper.session.timeout.ms=18000

    This must stay within the bounds ZooKeeper enforces. The ensemble only grants timeouts between minSessionTimeout (default 2 * tickTime) and maxSessionTimeout (default 20 * tickTime). With tickTime=2000, the allowed range is 4000 to 40000 ms. If the client requests more than maxSessionTimeout, ZooKeeper silently caps it, so raise maxSessionTimeout on the ZK side if you need a larger window:

    # zoo.cfg on each ZooKeeper node
    tickTime=2000
    maxSessionTimeout=40000
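  2. Fix the garbage collection behavior that caused the pause. Right-size the broker heap, switch to or tune G1GC, and avoid oversized heaps that lengthen pauses:

    # Set these where the Kafka service reads its environment (for a systemd
    # unit, its Environment= or EnvironmentFile= settings); an export in your
    # login shell does not reach a systemd-managed broker.
    export KAFKA_HEAP_OPTS="-Xmx6g -Xms6g"
    export KAFKA_JVM_PERFORMANCE_OPTS="-XX:+UseG1GC -XX:MaxGCPauseMillis=20"
    sudo systemctl restart kafka
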
  3. Reduce broker load if the host is saturated — rebalance partitions off a hot broker, throttle replication, or add capacity so the heartbeat thread always gets CPU.

  4. Stabilize the network between brokers and the ensemble. Eliminate intermittent packet loss on port 2181 and keep ZooKeeper peers on a low-latency network.

  5. Synchronize clocks with NTP on all broker and ZooKeeper hosts to remove skew that distorts timing:

    sudo systemctl enable --now chronyd
    chronyc tracking
  6. Restart the broker and confirm re-registration after applying changes:

    sudo systemctl restart kafka
    zookeeper-shell.sh zk-01:2181 ls /brokers/ids
    zookeeper-shell.sh zk-01:2181 get /controller

    The broker should reappear under /brokers/ids, and exactly one controller should be elected.
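The re-registration check in the last step can be scripted. The sketch below assumes the ZooKeeper shell prints the children of /brokers/ids as a bracketed list such as [1, 2, 3]; the helper name broker_registered is hypothetical.

```shell
# Hypothetical helper: succeed if broker id $1 appears in `ls /brokers/ids` output on stdin.
broker_registered() {
  # Keep only the "[1, 2, 3]" line, strip brackets and spaces,
  # put one id per line, then require an exact whole-line match.
  grep '^\[.*\]$' | sed 's/[][ ]//g' | tr ',' '\n' | grep -qx "$1"
}

# Usage (read-only), checking for broker 2:
#   zookeeper-shell.sh zk-01:2181 ls /brokers/ids | broker_registered 2 && echo "broker 2 is registered"
```

Matching whole lines avoids a false positive where broker 1 would otherwise match an id such as 12.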

Prevention and Best Practices

  • Set zookeeper.session.timeout.ms from measured GC pause times and network latency — and keep it aligned with the ZooKeeper-side tickTime/maxSessionTimeout bounds so it is not silently capped.
  • Treat broker GC pauses as a first-class SLO: log GC, alert on pauses approaching the session timeout, and tune heap before expiries start.
  • Keep ZooKeeper on dedicated hosts with fast disks for the transaction log so leader elections and fsyncs stay fast.
  • Run NTP everywhere and monitor for clock drift across brokers and ensemble nodes.
  • Watch echo mntr | nc metrics (latency, outstanding requests, watch count) to catch ensemble overload before it expires sessions.
  • Avoid stacking other heavy workloads on broker hosts that could starve the heartbeat thread.
  • Plan a move to KRaft mode, which eliminates ZooKeeper sessions and this entire failure class.

Related Errors

  • ConnectionLossException: KeeperErrorCode = ConnectionLoss for /brokers/ids. The transient predecessor to expiry: the client lost the connection, but the session is still alive and retrying.
  • Client session timed out, have not heard from server in Nms. The warning that precedes an expiry when heartbeats lapse.
  • Error while electing or becoming controller. The controller re-election fallout when the controller’s session expires.
  • NodeExistsException for /controller. A transient race during controller re-election as a new broker grabs the lock.

Frequently Asked Questions

What is the difference between ConnectionLoss and SessionExpired? ConnectionLoss is transient — the TCP connection dropped mid-request and the client retries on the same session, so ephemeral nodes survive. SessionExpired means the session itself is dead; ZooKeeper deleted its ephemeral nodes, and the broker must establish a new session and re-register.

Does session expiry happen on KRaft clusters? No. KRaft removes ZooKeeper, so there is no session to expire and no /controller ephemeral lock in ZooKeeper. This error is exclusive to ZooKeeper-based clusters.

Why does a session expiry sometimes cause a controller change? The controller lock under /controller is an ephemeral znode owned by the controller broker’s session. If that session expires, ZooKeeper deletes the lock and the remaining brokers race to elect a new controller, briefly disrupting metadata operations.
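To tell whether an expiry actually moved the controller, you can compare the broker id stored in /controller before and after the incident. This sketch assumes the znode holds compact JSON with a numeric "brokerid" field, which is how ZooKeeper-based Kafka releases have written it; the helper name controller_id is hypothetical.

```shell
# Hypothetical helper: extract the controller broker id from `get /controller` output on stdin.
# Assumes the znode data contains a field of the form "brokerid":<number>.
controller_id() {
  sed -n 's/.*"brokerid":\([0-9][0-9]*\).*/\1/p'
}

# Usage (read-only):
#   zookeeper-shell.sh zk-01:2181 get /controller | controller_id
```

A different id after the expiry means the lock changed hands; the same id means the original controller won the race again under its new session.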

Can I just raise zookeeper.session.timeout.ms to a very large value? Not arbitrarily. ZooKeeper caps the effective timeout at maxSessionTimeout (default 20 * tickTime), so a huge client value is silently reduced. A very long timeout also delays detection of genuinely dead brokers. Tune it to cover real pauses, and raise the ZK-side limit deliberately if needed.
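The capping described above is simple clamping arithmetic. The sketch below illustrates it under the defaults of 2 * tickTime and 20 * tickTime for the lower and upper bounds; the helper name effective_timeout is hypothetical, and the function only does arithmetic, it does not query ZooKeeper.

```shell
# Hypothetical helper: the session timeout ZooKeeper would grant for a requested value.
# $1 = requested ms, $2 = tickTime ms,
# $3 = minSessionTimeout ms (optional), $4 = maxSessionTimeout ms (optional).
effective_timeout() {
  req="$1"; tick="$2"
  min="${3:-$((tick * 2))}"    # default lower bound: 2 * tickTime
  max="${4:-$((tick * 20))}"   # default upper bound: 20 * tickTime
  if [ "$req" -lt "$min" ]; then echo "$min"
  elif [ "$req" -gt "$max" ]; then echo "$max"
  else echo "$req"
  fi
}

# Usage:
#   effective_timeout 18000 2000             # within bounds, granted as requested
#   effective_timeout 60000 2000             # above 20 * tickTime, capped
#   effective_timeout 60000 2000 4000 90000  # fits once maxSessionTimeout is raised
```

Running the numbers before a config change shows whether the broker-side value will take effect or whether zoo.cfg needs a matching change first.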

How do I confirm GC was the cause? Compare the broker’s GC log timestamps with the expiry timestamp in the Kafka log. A stop-the-world pause longer than your session timeout immediately before the session ... has expired line is the smoking gun.
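The comparison above can be narrowed by listing only the GC pauses that exceed the session timeout. This is a sketch under two assumptions about log format: JDK 8 safepoint lines of the form "Total time for which application threads were stopped: N seconds", and JDK 9+ unified-logging lines that contain "Pause" and end in a duration such as 11004.321ms. The helper name gc_pauses_over is hypothetical.

```shell
# Hypothetical helper: print GC log lines (stdin) whose pause exceeds $1 milliseconds.
gc_pauses_over() {
  awk -v limit="$1" '
    # JDK 8 style: "... application threads were stopped: 11.0040000 seconds, ..."
    /application threads were stopped:/ {
      for (i = 1; i < NF; i++)
        if ($i == "stopped:" && $(i + 1) * 1000 > limit) print
      next
    }
    # Unified logging (JDK 9+): "... Pause Young (Normal) ... 11004.321ms"
    /Pause/ && $NF ~ /^[0-9.]+ms$/ {
      ms = $NF
      sub(/ms$/, "", ms)
      if (ms + 0 > limit) print
    }
  '
}

# Usage (read-only), with an 18000 ms session timeout and the GC log path used earlier:
#   gc_pauses_over 18000 < /var/log/kafka/gc.log
```

Any line printed shortly before the expiry timestamp is a strong candidate for the root cause; no output suggests looking at the network or the ensemble instead.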
