AI for Kafka · By James Joyner IV · 9 min read

Kafka Error Guide: 'WakeupException' Thrown During Consumer Poll

Understand Kafka WakeupException: why consumer.wakeup() interrupts poll(), how to handle it for clean shutdown, and how to tell intended wakeups from real failures.

  • #kafka
  • #troubleshooting
  • #errors
  • #consumer

Exact Error Message

org.apache.kafka.common.errors.WakeupException: null
	at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.maybeTriggerWakeup(ConsumerNetworkClient.java:514)
	at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:282)
	at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.poll(ConsumerCoordinator.java:481)
	at org.apache.kafka.clients.consumer.KafkaConsumer.pollForFetches(KafkaConsumer.java:1262)
	at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1230)
	at com.example.events.EventConsumer.run(EventConsumer.java:64)

When the exception is not handled, the application log usually shows it as an uncaught exception that killed the consumer thread:

ERROR c.e.events.EventConsumer - Consumer thread terminated unexpectedly
org.apache.kafka.common.errors.WakeupException
	at ...
Exception in thread "event-consumer-1" org.apache.kafka.common.errors.WakeupException

What the Error Means

WakeupException is not a fault condition — it is a control-flow signal. It is thrown from a blocking consumer call (almost always poll(), but also commitSync(), position(), or committed()) when another thread calls consumer.wakeup(). The Kafka consumer is not thread-safe for concurrent use, so wakeup() is the one method designed to be called from a different thread. It exists to break a consumer out of a long poll(Duration) block so the owning thread can shut down cleanly.

The canonical use is a JVM shutdown hook: when the process receives SIGTERM, the hook calls consumer.wakeup(), the poll loop’s current poll() immediately throws WakeupException, your loop catches it, commits final offsets, and closes the consumer. If you do not catch it, the exception propagates, the consumer thread dies abruptly, offsets may not be committed, and consumer.close() may be skipped, delaying the group rebalance until the session times out.

Common Causes

  • Intended shutdown: a shutdown hook or supervisor calls wakeup() to stop the loop. This is correct and expected; the only bug is failing to handle it.
  • Unhandled wakeup on the poll thread: wakeup() is called but the loop has no try/catch for WakeupException, so the thread crashes.
  • wakeup() called more than once or at the wrong time: a pending wakeup flag is set, and the very next blocking call throws immediately, even though you expected it to block and fetch records.
  • wakeup() from inside the poll thread: calling it from a rebalance listener or message handler on the same thread sets the pending flag, so the exception surfaces later on that thread, at a point you do not expect.
  • Library/framework wrappers that call wakeup() during reconfiguration or partition revocation without your code expecting it.
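
The pending-flag behaviour behind the wrong-time cause can be illustrated without a broker. The sketch below is a toy model, not Kafka's implementation: WakeupFlagModel, blockingCall(), and WakeupSignal are hypothetical names standing in for the consumer, a blocking call such as poll(), and WakeupException. It assumes only the documented behaviour that a wakeup() issued while no blocking call is in progress is remembered and raised once by the next blocking call.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical stand-in for WakeupException; not part of the Kafka client. */
class WakeupSignal extends RuntimeException {}

/** Toy model of the pending-wakeup flag; not Kafka's actual implementation. */
class WakeupFlagModel {
    private final AtomicBoolean pending = new AtomicBoolean(false);

    /** Safe to call from any thread, any number of times; it only sets the flag. */
    void wakeup() {
        pending.set(true);
    }

    /** Models the check made at the start of a blocking call such as poll(). */
    String blockingCall() {
        // getAndSet clears the flag, so repeated wakeups still produce a single throw.
        if (pending.getAndSet(false)) {
            throw new WakeupSignal();
        }
        return "records";
    }
}

class WakeupFlagDemo {
    public static void main(String[] args) {
        WakeupFlagModel consumer = new WakeupFlagModel();
        consumer.wakeup();
        consumer.wakeup(); // a second call before the throw changes nothing
        try {
            consumer.blockingCall();
        } catch (WakeupSignal e) {
            System.out.println("first call after wakeup threw");
        }
        System.out.println("second call returned " + consumer.blockingCall());
    }
}
```

In this model the first blocking call after wakeup() throws and the second returns normally, which is why a stray wakeup() shows up as a single, seemingly random exception on the next poll().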

How to Reproduce the Error

Run a poll loop on one thread and call wakeup() from another:

KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
consumer.subscribe(List.of("events"));

Runnable loop = () -> {
    while (true) {
        ConsumerRecords<String, String> r = consumer.poll(Duration.ofMillis(Long.MAX_VALUE));
        // process...
    }
};
Thread t = new Thread(loop, "event-consumer-1");
t.start();

Thread.sleep(2000);
consumer.wakeup(); // poll() on the other thread throws WakeupException

Because the loop has no try/catch, the consumer thread terminates with the stack trace above.

Diagnostic Commands

WakeupException lives in the application, not the broker, so diagnostics focus on logs and group membership. All read-only.

# Did the consumer leave cleanly (close called) or time out? Check member list right after shutdown.
kafka-consumer-groups.sh --bootstrap-server localhost:9092 \
  --group event-processing --describe --members
# Group state: after a clean close the member leaves at once (the group rebalances, or goes Empty if it was the last member);
# after a crash the dead member stays listed until session.timeout.ms expires
kafka-consumer-groups.sh --bootstrap-server localhost:9092 \
  --group event-processing --describe --state
# Correlate the wakeup with a SIGTERM/restart
journalctl -u event-consumer --since "30 min ago" | grep -iE "SIGTERM|Stopping|WakeupException|terminated"
# Find unhandled wakeups (thread death) vs handled ones (graceful shutdown log line)
grep -iE "WakeupException|Exception in thread|Shutting down consumer" /var/log/event-consumer/app.log
# Confirm offsets were committed before exit (lag should not jump on restart)
kafka-consumer-groups.sh --bootstrap-server localhost:9092 \
  --group event-processing --describe
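
The log check can also be scripted. The following is a minimal sketch, assuming the same markers as the grep above: "Exception in thread" is the prefix the JVM prints when an exception escapes a thread, and "Shutting down consumer" is assumed to be a line your own service logs on graceful shutdown (adjust it to whatever your application prints). WakeupLogCheck and its Verdict values are hypothetical names.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;

/** Classifies a log excerpt by how a WakeupException appears to have been handled. */
class WakeupLogCheck {
    enum Verdict { GRACEFUL, UNHANDLED, INCONCLUSIVE, NO_WAKEUP }

    static Verdict classify(List<String> lines) {
        boolean mentioned = false;
        boolean threadDied = false;
        boolean graceful = false;
        for (String line : lines) {
            boolean hasWakeup = line.contains("WakeupException");
            mentioned |= hasWakeup;
            // Printed by the JVM when an exception escapes a thread's run().
            threadDied |= hasWakeup && line.contains("Exception in thread");
            // Assumed application message; not something Kafka logs itself.
            graceful |= line.contains("Shutting down consumer");
        }
        if (threadDied) return Verdict.UNHANDLED;
        if (graceful) return Verdict.GRACEFUL;
        return mentioned ? Verdict.INCONCLUSIVE : Verdict.NO_WAKEUP;
    }

    public static void main(String[] args) throws IOException {
        // Default path matches the grep above; pass another file as the first argument.
        String path = args.length > 0 ? args[0] : "/var/log/event-consumer/app.log";
        System.out.println(classify(Files.readAllLines(Paths.get(path))));
    }
}
```

An INCONCLUSIVE verdict means the exception was logged but neither marker was found, which usually warrants reading the surrounding lines by hand.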

Step-by-Step Resolution

  1. Decide whether the wakeup was intended. If it coincides with a SIGTERM, deploy, or scale-down, it is the shutdown path working as designed — you only need to handle it. If no shutdown occurred, find who called wakeup().

  2. Catch the exception in the poll loop. Wrap the loop and treat WakeupException as the stop signal:

    try {
        while (running.get()) {
            ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
            process(records);
            consumer.commitSync();
        }
    } catch (WakeupException e) {
        // ignore — shutdown requested
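    } finally {
        try {
            consumer.commitSync();   // final offset flush; can fail if the member was already evicted
        } catch (RuntimeException e) {
            // best effort: note the failure and fall through to close()
        } finally {
            consumer.close();        // always runs; triggers prompt rebalance
        }
    }

  3. Register the shutdown hook correctly. Call only consumer.wakeup() from the hook (it is the sole thread-safe consumer method) and do all cleanup in the poll thread’s finally block. The JVM halts as soon as its shutdown hooks return, so in a real service the hook should also wait for the poll thread (for example with Thread.join()) after calling wakeup(); otherwise the final commit and close can be cut short.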

    Runtime.getRuntime().addShutdownHook(new Thread(consumer::wakeup));

  4. Distinguish intended from accidental wakeups. If you also use wakeup() for non-shutdown control, guard with a flag (e.g. running) so the loop knows whether to exit or continue.

  5. Avoid catching it as a generic error. Do not let WakeupException fall into a broad catch (Exception e) { retry; } block — that turns a clean stop into a restart loop.

  6. Verify a clean stop. After resolution, restart the service and confirm via --describe that lag does not jump (final commit happened) and the group rebalances immediately (close was called).
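
The control flow of steps 2 to 4 can be put together as a runnable sketch. To keep it broker-free, PollSource and WakeupSignal are hypothetical stand-ins for the consumer calls used here and for WakeupException, and GracefulLoop is an illustrative name; in a real service the same shape would wrap a KafkaConsumer. Two details are assumptions layered on the steps above rather than something they spell out: the loop rethrows a wakeup that arrives while the running flag is still set, and the shutdown path joins the poll thread so the JVM does not halt before the finally block has finished.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.LockSupport;

/** Hypothetical stand-in for org.apache.kafka.common.errors.WakeupException. */
class WakeupSignal extends RuntimeException {}

/** Hypothetical subset of the consumer calls the loop needs. */
interface PollSource {
    void poll();        // blocks; throws WakeupSignal after wakeup()
    void commitSync();
    void close();
    void wakeup();      // the only call made from another thread
}

class GracefulLoop implements Runnable {
    private final PollSource source;
    private final AtomicBoolean running = new AtomicBoolean(true);

    GracefulLoop(PollSource source) {
        this.source = source;
    }

    @Override
    public void run() {
        try {
            while (running.get()) {
                source.poll();
                source.commitSync();
            }
        } catch (WakeupSignal e) {
            // Intended only if shutdown() cleared the flag first; otherwise surface it.
            if (running.get()) throw e;
        } finally {
            try {
                source.commitSync();   // best-effort final commit
            } catch (RuntimeException e) {
                System.err.println("final commit failed: " + e);
            } finally {
                source.close();        // runs even if the commit threw
            }
        }
    }

    /** Meant to be called from the shutdown hook: flag first, then wake, then wait. */
    void shutdown(Thread pollThread) {
        running.set(false);
        source.wakeup();
        try {
            pollThread.join();         // wait until the finally block has run
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}

class GracefulLoopDemo {
    public static void main(String[] args) {
        AtomicBoolean woken = new AtomicBoolean(false);
        PollSource fake = new PollSource() {
            public void poll() {
                // Pretend to block until another thread calls wakeup().
                while (!woken.get()) LockSupport.parkNanos(1_000_000L);
                throw new WakeupSignal();
            }
            public void commitSync() { System.out.println("commit"); }
            public void close() { System.out.println("close"); }
            public void wakeup() { woken.set(true); }
        };
        GracefulLoop loop = new GracefulLoop(fake);
        Thread pollThread = new Thread(loop, "event-consumer-1");
        pollThread.start();
        // In a service this call would sit inside Runtime.getRuntime().addShutdownHook(...).
        loop.shutdown(pollThread);
    }
}
```

Rethrowing an unexpected wakeup is one reasonable choice; a loop that deliberately uses wakeup() for other control signals could instead catch the exception inside the while loop and continue when the flag is still set.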

Prevention and Best Practices

  • Always pair a shutdown hook that calls wakeup() with a try/catch (WakeupException) around the poll loop and a finally that commits and closes.
  • Treat WakeupException as expected control flow, never as an error to log at ERROR or retry.
  • Keep a single thread owning each consumer; only wakeup() may cross threads.
  • Use a bounded poll timeout (for example Duration.ofSeconds(1), not Long.MAX_VALUE) so the loop rechecks the running flag regularly and exits promptly even if a wakeup is missed.
  • Call consumer.close(Duration) in finally so the broker rebalances immediately instead of waiting for session.timeout.ms.

Related Errors

  • InterruptException: Kafka’s wrapper for InterruptedException, thrown when the consumer thread is interrupted rather than woken; handle it similarly, but note that it indicates Thread.interrupt(), not wakeup().
  • CommitFailedException: can occur in the finally commit if the member was already evicted; commit defensively.
  • RebalanceInProgressException: a final commit during shutdown may race a rebalance.
  • IllegalStateException: This consumer has already been closed. Raised when consumer methods are called after close() in a sloppy shutdown sequence.

Frequently Asked Questions

Is WakeupException a real error? No. It is a deliberate signal that another thread called wakeup(). The only mistake is failing to catch it. Treat it as “stop requested,” not as a failure.

Why did my consumer thread die silently? Because the WakeupException was uncaught. It propagated out of poll(), terminated the thread, and likely skipped your final commitSync() and close(). Wrap the loop in try/catch/finally.

Can I call wakeup() from the poll thread itself? You can, but it is rarely useful — it sets a pending flag so the next blocking call throws immediately. For shutdown, call it from a separate thread such as the JVM shutdown hook.

Does wakeup() lose messages? Not by itself. Records already returned by the last poll() are yours to process; whether they are reprocessed after restart depends on whether you committed offsets in the finally block before closing.
