Kafka Error Guide: 'WakeupException' Thrown During Consumer Poll
Understand Kafka WakeupException: why consumer.wakeup() interrupts poll(), how to handle it for clean shutdown, and how to tell intended wakeups from real failures.
Exact Error Message
org.apache.kafka.common.errors.WakeupException: null
at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.maybeTriggerWakeup(ConsumerNetworkClient.java:514)
at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:282)
at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.poll(ConsumerCoordinator.java:481)
at org.apache.kafka.clients.consumer.KafkaConsumer.pollForFetches(KafkaConsumer.java:1262)
at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1230)
at com.example.events.EventConsumer.run(EventConsumer.java:64)
In application logs, the more telling entry is often the uncaught-exception report, which shows that nothing handled the exception:
ERROR c.e.events.EventConsumer - Consumer thread terminated unexpectedly
org.apache.kafka.common.errors.WakeupException
at ...
Exception in thread "event-consumer-1" org.apache.kafka.common.errors.WakeupException
What the Error Means
WakeupException is not a fault condition; it is a control-flow signal. It is thrown from a blocking consumer call (almost always poll(), but also commitSync(), position(), or committed()) when another thread calls consumer.wakeup(). KafkaConsumer is not safe for concurrent use by multiple threads, and wakeup() is the one method designed to be called from a different thread. It exists to break a consumer out of a long poll(Duration) block so the owning thread can shut down cleanly.
The canonical use is a JVM shutdown hook: when the process receives SIGTERM, the hook calls consumer.wakeup(), the poll loop’s current poll() immediately throws WakeupException, your loop catches it, commits final offsets, and closes the consumer. If you do not catch it, the exception propagates, the consumer thread dies abruptly, offsets may not be committed, and consumer.close() may be skipped, delaying the group rebalance until the session times out.
Common Causes
- Intended shutdown: a shutdown hook or supervisor calls wakeup() to stop the loop. This is correct and expected; the bug is only failing to handle it.
- Unhandled wakeup on the poll thread: wakeup() is called but the loop has no try/catch for WakeupException, so the thread crashes.
- wakeup() called more than once or at the wrong time: a pending wakeup flag is set, and the very next blocking call throws immediately, even though you expected it to poll.
- wakeup() from inside the poll thread: calling it from a rebalance listener or message handler on the same thread leads to confusing re-entrant throws.
- Library/framework wrappers that call wakeup() during reconfiguration or partition revocation without your code expecting it.
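The "pending wakeup flag" cause is easier to see in a toy model. The sketch below is not Kafka's implementation: ToyConsumer and ToyWakeupException are hypothetical stand-ins, written only to show why a wakeup() issued while nobody is polling makes the next blocking call throw at once, and why the call after that behaves normally.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical stand-in for org.apache.kafka.common.errors.WakeupException. */
class ToyWakeupException extends RuntimeException {}

/** Toy model of a wakeup flag. This is NOT how KafkaConsumer is implemented. */
class ToyConsumer {
    private final AtomicBoolean wakeupRequested = new AtomicBoolean(false);

    /** Safe to call from any thread: it only sets the flag. */
    void wakeup() {
        wakeupRequested.set(true);
    }

    /** Blocks for up to timeoutMs, but throws as soon as it sees a pending wakeup. */
    String poll(long timeoutMs) {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        do {
            // getAndSet(false) consumes the flag, so a pending wakeup affects one call only.
            if (wakeupRequested.getAndSet(false)) {
                throw new ToyWakeupException();
            }
            try {
                Thread.sleep(1);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return "interrupted";
            }
        } while (System.nanoTime() < deadline);
        return "no records";
    }
}

class PendingWakeupDemo {
    public static void main(String[] args) {
        ToyConsumer consumer = new ToyConsumer();
        consumer.wakeup(); // nobody is polling yet, so the flag stays pending
        try {
            consumer.poll(50);
            System.out.println("first poll returned normally");
        } catch (ToyWakeupException e) {
            System.out.println("first poll threw immediately");
        }
        // The flag was consumed by the throw, so this call waits out its timeout.
        System.out.println("second poll: " + consumer.poll(5));
    }
}
```

In this model a stray wakeup() costs exactly one blocking call, which is why a loop that catches the exception and checks its own stop condition can recover from it.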
How to Reproduce the Error
Run a poll loop on one thread and call wakeup() from another:
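import java.time.Duration;
import java.util.List;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

// props: consumer configuration, assumed to be defined elsewhere
KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
consumer.subscribe(List.of("events"));
Runnable loop = () -> {
    while (true) {
        // Blocks until records arrive or another thread calls wakeup()
        ConsumerRecords<String, String> r = consumer.poll(Duration.ofMillis(Long.MAX_VALUE));
        // process...
    }
};
Thread t = new Thread(loop, "event-consumer-1");
t.start();
Thread.sleep(2000); // give the loop time to enter poll()
consumer.wakeup(); // poll() on the other thread throws WakeupException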
Because the loop has no try/catch, the consumer thread terminates with the stack trace above.
Diagnostic Commands
WakeupException is raised inside the application, not on the broker, so diagnostics focus on application logs and group membership. All of the commands below are read-only.
# Did the consumer leave cleanly (close called) or time out? Check member list right after shutdown.
kafka-consumer-groups.sh --bootstrap-server localhost:9092 \
--group event-processing --describe --members
# Group state: after a clean close the member leaves at once (the group goes Empty if it was the only member);
# after a crash the dead member stays listed until session.timeout.ms expires
kafka-consumer-groups.sh --bootstrap-server localhost:9092 \
--group event-processing --describe --state
# Correlate the wakeup with a SIGTERM/restart
journalctl -u event-consumer --since "30 min ago" | grep -iE "SIGTERM|Stopping|WakeupException|terminated"
# Find unhandled wakeups (thread death) vs handled ones (graceful shutdown log line)
grep -iE "WakeupException|Exception in thread|Shutting down consumer" /var/log/event-consumer/app.log
# Confirm offsets were committed before exit (lag should not jump on restart)
kafka-consumer-groups.sh --bootstrap-server localhost:9092 \
--group event-processing --describe
Step-by-Step Resolution
- Decide whether the wakeup was intended. If it coincides with a SIGTERM, deploy, or scale-down, it is the shutdown path working as designed, and you only need to handle it. If no shutdown occurred, find who called wakeup().
- Catch the exception in the poll loop. Wrap the loop and treat WakeupException as the stop signal:
try {
    while (running.get()) {
        ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
        process(records);
        consumer.commitSync();
    }
} catch (WakeupException e) {
    // ignore: shutdown requested
} finally {
    try {
        consumer.commitSync(); // final offset flush
    } finally {
        consumer.close(); // triggers prompt rebalance, even if the commit throws
    }
}
- Register the shutdown hook correctly. Call only consumer.wakeup() from the hook (it is the sole thread-safe method); do all cleanup in the poll thread’s finally block. The hook should then wait for the poll thread to finish (for example with Thread.join()), because the JVM halts as soon as all shutdown hooks return.
Runtime.getRuntime().addShutdownHook(new Thread(consumer::wakeup));
- Distinguish intended from accidental wakeups. If you also use wakeup() for non-shutdown control, guard with a flag (e.g. running) so the loop knows whether to exit or continue.
- Avoid catching it as a generic error. Do not let WakeupException fall into a broad catch (Exception e) { retry; } block, because that turns a clean stop into a restart loop.
- Verify a clean stop. After resolution, restart the service and confirm via --describe that lag does not jump (final commit happened) and the group rebalances immediately (close was called).
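The catch, flag-guard, and no-generic-retry steps above can be combined into one loop shape. The sketch below uses stand-in types so that it runs without a broker or the kafka-clients jar: WakeupSignal stands in for WakeupException, and RecordSource is a hypothetical subset of the consumer API (poll, commit, close). ScriptedSource is a fake that wakes the loop once by accident and once for shutdown.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

/** Stand-in for Kafka's WakeupException so the sketch compiles without kafka-clients. */
class WakeupSignal extends RuntimeException {}

/** Hypothetical subset of the consumer API used by the loop. */
interface RecordSource {
    List<String> poll(); // may throw WakeupSignal
    void commit();
    void close();
}

class PollLoop {
    private final RecordSource source;
    private final AtomicBoolean running = new AtomicBoolean(true);
    int processed = 0;

    PollLoop(RecordSource source) { this.source = source; }

    /** The shutdown path calls this before it wakes the consumer. */
    void requestStop() { running.set(false); }

    void run() {
        try {
            while (running.get()) {
                try {
                    List<String> records = source.poll();
                    processed += records.size();
                    source.commit();
                } catch (WakeupSignal e) {
                    // Only the wakeup type is caught here, never a broad Exception.
                    // If running is still true this was not a shutdown, so keep polling;
                    // otherwise the while condition ends the loop.
                }
            }
        } finally {
            try {
                source.commit(); // final offset flush
            } finally {
                source.close();  // runs even if the final commit throws
            }
        }
    }
}

/** Fake source: poll 2 is an accidental wakeup, poll 4 is the shutdown wakeup. */
class ScriptedSource implements RecordSource {
    int polls = 0;
    int commits = 0;
    boolean closed = false;
    Runnable onShutdownPoll = () -> {};

    public List<String> poll() {
        polls++;
        if (polls == 2) throw new WakeupSignal();
        if (polls == 4) { onShutdownPoll.run(); throw new WakeupSignal(); }
        return List.of("record-" + polls);
    }
    public void commit() { commits++; }
    public void close() { closed = true; }
}

class PollLoopDemo {
    public static void main(String[] args) {
        ScriptedSource source = new ScriptedSource();
        PollLoop loop = new PollLoop(source);
        source.onShutdownPoll = loop::requestStop;
        loop.run();
        // prints: processed=2 commits=3 closed=true
        System.out.println("processed=" + loop.processed
                + " commits=" + source.commits + " closed=" + source.closed);
    }
}
```

The accidental wakeup on the second poll costs one iteration and nothing else, while the shutdown wakeup exits the loop and still reaches the final commit and close.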
Prevention and Best Practices
- Always pair a shutdown hook that calls wakeup() with a try/catch (WakeupException) around the poll loop and a finally that commits and closes.
- Treat WakeupException as expected control flow, never as an error to log at ERROR or retry.
- Keep a single thread owning each consumer; only wakeup() may cross threads.
- Use a short, finite poll timeout (not Long.MAX_VALUE) so the loop also exits promptly via the running flag even if a wakeup is missed.
- Call consumer.close(Duration) in finally so the member leaves the group and the rebalance starts immediately instead of waiting for session.timeout.ms.
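One detail the hook-plus-finally pairing depends on is that the JVM halts once its shutdown hooks return, so a hook that only calls wakeup() can exit before the poll thread has committed and closed. Below is a minimal sketch of a hook that signals and then waits, using only the JDK. GracefulShutdown is a hypothetical helper, and in real code the wakeup argument would presumably be consumer::wakeup.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical helper: wakes the poll thread, then waits for its cleanup to finish. */
class GracefulShutdown {
    private final CountDownLatch cleanedUp = new CountDownLatch(1);
    private final Runnable wakeup;
    private final long waitMillis;

    /** wakeup: in real code this would be consumer::wakeup (an assumption of this sketch). */
    GracefulShutdown(Runnable wakeup, long waitMillis) {
        this.wakeup = wakeup;
        this.waitMillis = waitMillis;
    }

    /** The poll thread calls this as the last statement of its finally block. */
    void markCleanedUp() {
        cleanedUp.countDown();
    }

    /** Body of the hook: returns true if cleanup finished within the wait. */
    boolean signalAndWait() {
        wakeup.run();
        try {
            return cleanedUp.await(waitMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    /** Installs the hook so it runs on SIGTERM or normal JVM exit. */
    void register() {
        Runtime.getRuntime().addShutdownHook(new Thread(this::signalAndWait, "consumer-shutdown"));
    }
}

class GracefulShutdownDemo {
    public static void main(String[] args) {
        AtomicBoolean woken = new AtomicBoolean(false);
        GracefulShutdown shutdown = new GracefulShutdown(() -> woken.set(true), 2000);

        Thread pollThread = new Thread(() -> {
            try {
                while (!woken.get()) {
                    Thread.yield(); // stands in for a blocked poll()
                }
            } finally {
                // the final commit and close would go here
                shutdown.markCleanedUp();
            }
        }, "event-consumer-1");
        pollThread.start();

        // Called directly here for a deterministic demo; a service would call register().
        System.out.println("clean shutdown: " + shutdown.signalAndWait());
    }
}
```

The bounded wait is a design choice: it keeps a stuck poll thread from holding the process open indefinitely, at the cost of possibly skipping the final commit when the wait expires.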
Related Errors
- InterruptException: Kafka’s wrapper for InterruptedException, thrown when the consumer thread is interrupted rather than woken. Handle it similarly, but it indicates Thread.interrupt(), not wakeup().
- CommitFailedException: can occur in the finally commit if the member was already evicted; commit defensively.
- RebalanceInProgressException: a final commit during shutdown may race a rebalance.
- IllegalStateException: This consumer has already been closed: raised when consumer methods are called after close() in a sloppy shutdown sequence.
Frequently Asked Questions
Is WakeupException a real error?
No. It is a deliberate signal that another thread called wakeup(). The only mistake is failing to catch it. Treat it as “stop requested,” not as a failure.
Why did my consumer thread die silently?
Because the WakeupException was uncaught. It propagated out of poll(), terminated the thread, and likely skipped your final commitSync() and close(). Wrap the loop in try/catch/finally.
Can I call wakeup() from the poll thread itself?
You can, but it is rarely useful: it sets a pending flag so the next blocking call throws immediately. For shutdown, call it from a separate thread such as the JVM shutdown hook.
Does wakeup() lose messages?
Not by itself. Records already returned by the last poll() are yours to process; whether they are reprocessed after restart depends on whether you committed offsets in the finally block before closing.