AI for Kafka · By James Joyner IV · 9 min read

Kafka Error Guide: 'IllegalGenerationException' Generation Is Not the Current Generation

Fix Kafka IllegalGenerationException: why a stale group generation rejects commits and heartbeats after a rebalance, and how to rejoin with the current generation.

  • #kafka
  • #troubleshooting
  • #errors
  • #consumer

Exact Error Message

org.apache.kafka.common.errors.IllegalGenerationException: Generation 42 is not the current generation.
	at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator$OffsetCommitResponseHandler.handle(ConsumerCoordinator.java:1280)
	at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator$OffsetCommitResponseHandler.handle(ConsumerCoordinator.java:1233)
	at org.apache.kafka.clients.consumer.internals.AbstractCoordinator$CoordinatorResponseHandler.onSuccess(AbstractCoordinator.java:1264)
	at org.apache.kafka.clients.consumer.KafkaConsumer.commitSync(KafkaConsumer.java:1517)
	at com.example.audit.AuditConsumer.run(AuditConsumer.java:90)

It usually appears alongside a rejoin log from the coordinator:

INFO  o.a.k.c.c.internals.ConsumerCoordinator - [Consumer clientId=audit-2, groupId=audit-stream]
  Offset commit failed on partition audit-5 at offset 77310: The generation of the group has changed,
  so the offset commit cannot be completed.
INFO  o.a.k.c.c.internals.AbstractCoordinator - [Consumer clientId=audit-2, groupId=audit-stream]
  Generation 42 is no longer valid; rejoining the group with the latest generation.

What the Error Means

Every consumer group carries a monotonically increasing generation id (sometimes called the group epoch). Each time the group rebalances, the coordinator bumps the generation. When a member sends a heartbeat, offset commit, or sync request, it stamps the request with the generation it last knew about. The coordinator answers with the ILLEGAL_GENERATION error, which the client maps to IllegalGenerationException, when that stamped generation does not match its current generation. In practice the stamped value is older: a rebalance completed and advanced the generation while this member was still operating under the previous one.

In short: the member’s view of the group is stale. A rebalance happened (a member joined or left, or the member itself missed a heartbeat window), the generation moved from, say, 42 to 43, and this member’s commit for generation 42 is no longer valid. The coordinator rejects it to avoid committing offsets that belong to an obsolete assignment. The fix is built into the client protocol — the consumer must rejoin the group and pick up the current generation, which happens automatically on the next poll().
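The generation check described above can be pictured with a toy model. This is a minimal sketch, not Kafka's coordinator code: the class GenerationModel and its methods are hypothetical names invented for illustration, and the starting value 42 simply mirrors the log sample.

```java
/** Toy model of a coordinator's generation check; not Kafka's actual implementation. */
final class GenerationModel {
    // Starting value chosen to match the sample log ("Generation 42").
    private int currentGeneration = 42;

    /** A completed rebalance advances the generation by one and returns the new value. */
    int completeRebalance() {
        return ++currentGeneration;
    }

    /** A commit is accepted only if it carries the coordinator's current generation. */
    boolean tryCommit(int memberGeneration) {
        return memberGeneration == currentGeneration;
    }

    int currentGeneration() {
        return currentGeneration;
    }

    public static void main(String[] args) {
        GenerationModel coordinator = new GenerationModel();
        int memberGeneration = coordinator.currentGeneration();      // member joined at 42

        System.out.println(coordinator.tryCommit(memberGeneration)); // true
        coordinator.completeRebalance();                             // 42 -> 43
        System.out.println(coordinator.tryCommit(memberGeneration)); // false: stale generation

        memberGeneration = coordinator.currentGeneration();          // member rejoins, learns 43
        System.out.println(coordinator.tryCommit(memberGeneration)); // true
    }
}
```

The second commit fails for the same reason the real one does: the member is still holding a generation the group has already moved past, and only rejoining refreshes it.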

Common Causes

  • A rebalance completed while the member was mid-cycle — the generation advanced between the member’s last sync and its commit.
  • Missed heartbeats beyond session.timeout.ms — the member was briefly considered gone, the group rebalanced and advanced the generation, then the member tried to act on the old generation.
  • Long processing or GC pauses that delayed the member’s participation in the rebalance, so it lagged a generation behind.
  • Eager rebalancing churn where rapid joins/leaves bump the generation repeatedly, leaving slow members stale.
  • Committing offsets from a thread or callback that captured the generation before a rebalance and used it afterward.
  • Network hiccups that delay heartbeats long enough for the coordinator to expire the member and rebalance without it.

How to Reproduce the Error

Force a slow member to fall a generation behind by pausing it during a rebalance:

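// Short session timeout so a stalled member is dropped quickly.
// Caveat: Java clients since 0.10.1 heartbeat from a background thread, so a sleeping
// poll thread alone may not miss heartbeats; max.poll.interval.ms governs that case.
props.put(ConsumerConfig.SESSION_TIMEOUT_MS_CONFIG, 6000);
props.put(ConsumerConfig.HEARTBEAT_INTERVAL_MS_CONFIG, 2000);

ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
Thread.sleep(8000);          // stall this member while the group rebalances and the generation bumps
consumer.commitSync();       // commit stamped with the old generation is rejected as stale
                             // (the Java client may report this as CommitFailedException)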

While this member sleeps, add or remove another member in the same group to guarantee the generation advances.

Diagnostic Commands

All read-only. Confirm the group rebalanced and find what advanced the generation.

# Group state — recent rebalance shows here; Stable means it has resettled
kafka-consumer-groups.sh --bootstrap-server localhost:9092 \
  --describe --group audit-stream --state
# Members and assignments — compare across two runs to spot churn
kafka-consumer-groups.sh --bootstrap-server localhost:9092 \
  --group audit-stream --describe --members --verbose
# Lag — should be transient around the generation change
kafka-consumer-groups.sh --bootstrap-server localhost:9092 \
  --group audit-stream --describe
# Effective session/heartbeat config that governs how easily a member goes stale
grep -iE "session.timeout.ms|heartbeat.interval.ms|max.poll.interval.ms" /var/log/audit-consumer/app.log
# Correlate generation changes with rejoins/pauses
journalctl -u audit-consumer --since "1 hour ago" | grep -iE "generation|rejoin|heartbeat"
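
The same keyword filter as the last command can be turned into a per-keyword count, which makes it easier to compare before and after a tuning change. This is a minimal sketch under assumptions: RebalanceLogTally is a hypothetical helper, the keywords are copied from the grep pattern above, and the sample lines are shortened from the log excerpt earlier in this guide.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Hypothetical helper: counts log lines that mention each rebalance-related keyword. */
final class RebalanceLogTally {
    // Same keywords as the grep pattern: generation|rejoin|heartbeat
    private static final List<String> KEYWORDS = List.of("generation", "rejoin", "heartbeat");

    /** Returns, per keyword, the number of lines containing it (case-insensitive). */
    static Map<String, Integer> tally(List<String> lines) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String keyword : KEYWORDS) {
            counts.put(keyword, 0);
        }
        for (String line : lines) {
            String lower = line.toLowerCase(Locale.ROOT);
            for (String keyword : KEYWORDS) {
                if (lower.contains(keyword)) {
                    counts.merge(keyword, 1, Integer::sum);
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> sample = List.of(
            "Generation 42 is no longer valid; rejoining the group with the latest generation.",
            "Offset commit failed on partition audit-5 at offset 77310",
            "Processed batch of 500 records");
        System.out.println(tally(sample)); // {generation=1, rejoin=1, heartbeat=0}
    }
}
```

In practice the lines would come from the application log or journalctl output rather than a hard-coded list.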

Step-by-Step Resolution

  1. Treat it as recoverable. Like RebalanceInProgressException, this is the protocol self-correcting. The client logs that generation 42 is invalid and rejoins automatically. Do not crash the thread or mark the batch committed.

  2. Catch and continue the poll loop. Let the next poll() rejoin and pick up the current generation, then re-commit after you own the partitions again:

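    try {
        consumer.commitSync();
    } catch (CommitFailedException | IllegalGenerationException e) {
        // Stale generation. Depending on the client version, commitSync() may report
        // the rejected commit as CommitFailedException rather than IllegalGenerationException.
        // Do not treat the batch as committed; the next poll() rejoins with the current generation.
    }

  3. Stop members from going stale. The root cause is usually missed heartbeats or slow rebalance participation. Ensure heartbeat.interval.ms is roughly one-third of session.timeout.ms, and keep processing under max.poll.interval.ms so the member stays an active participant.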

  4. Reduce rebalance frequency. Adopt CooperativeStickyAssignor, use static membership (group.instance.id) to survive brief restarts, and stagger deploys so the generation does not churn repeatedly.

  5. Tame GC and pauses. If long stop-the-world pauses are delaying heartbeats, tune the JVM (e.g., G1/ZGC, heap sizing) so the consumer thread is not starved past the session timeout.

  6. Verify. After changes, confirm --state is Stable and the count of generation-change log lines drops.
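
Steps 3 and 4 come down to a handful of consumer settings. The following is a minimal sketch of how they might be assembled, assuming plain string keys in a java.util.Properties object. StableMemberConfig is a hypothetical helper, and the timeout values and the instance id "audit-2" are illustrative assumptions, not recommendations from this guide.

```java
import java.util.Properties;

/** Hypothetical helper that derives the heartbeat interval from the session timeout. */
final class StableMemberConfig {
    static Properties build(int sessionTimeoutMs, int maxPollIntervalMs, String instanceId) {
        Properties props = new Properties();
        props.put("session.timeout.ms", Integer.toString(sessionTimeoutMs));
        // Step 3: heartbeat at roughly one-third of the session timeout.
        props.put("heartbeat.interval.ms", Integer.toString(sessionTimeoutMs / 3));
        // Step 3: per-batch processing must finish inside this window.
        props.put("max.poll.interval.ms", Integer.toString(maxPollIntervalMs));
        // Step 4: cooperative rebalancing and static membership.
        props.put("partition.assignment.strategy",
                  "org.apache.kafka.clients.consumer.CooperativeStickyAssignor");
        props.put("group.instance.id", instanceId);
        return props;
    }

    public static void main(String[] args) {
        // Illustrative values only; size them for your own workload.
        Properties props = build(45_000, 300_000, "audit-2");
        System.out.println(props.getProperty("heartbeat.interval.ms")); // 15000
        System.out.println(props.getProperty("group.instance.id"));     // audit-2
    }
}
```

Each consumer instance needs its own stable group.instance.id, so in a real deployment that value would typically come from the host or pod identity rather than a literal.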

Prevention and Best Practices

  • Set heartbeat.interval.ms to about a third of session.timeout.ms so transient delays do not trigger eviction-driven generation bumps.
  • Keep per-batch processing comfortably under max.poll.interval.ms to remain an active group member.
  • Use cooperative rebalancing and static membership to cut the number of generation changes.
  • Make processing idempotent, and never assume a commit succeeded until the coordinator acknowledges it.
  • Watch for GC pauses that exceed the session timeout and tune the JVM accordingly.
  • Avoid caching the generation in application code; let the consumer client manage it.

Related Errors

  • RebalanceInProgressException: the rebalance is still in progress, whereas here it has already completed and advanced the generation. Both resolve by re-polling.
  • CommitFailedException: the commit was rejected because the member is no longer part of the active group. The Java client commonly raises this from commitSync() when the generation is stale or the member was evicted.
  • UnknownMemberIdException: the coordinator dropped the member id entirely, a stronger form of staleness.
  • FencedInstanceIdException: a static-membership conflict that also invalidates the member’s session.

Frequently Asked Questions

Is IllegalGenerationException fatal? No. It is a recoverable, expected outcome of a rebalance completing while the member held an older generation. The client rejoins automatically on the next poll(). Just avoid double-counting the uncommitted batch.

How is it different from UnknownMemberIdException? IllegalGenerationException means your member is still known but a generation behind. UnknownMemberIdException means the coordinator no longer recognizes your member id at all. Both lead to rejoining, but the latter implies a longer absence.

Why does it keep recurring? Recurring generation changes point at repeated rebalances — usually missed heartbeats (session timeout too tight, GC pauses), max.poll.interval.ms violations, or frequent membership churn. Fix the underlying rebalance trigger.

Will I lose or duplicate messages? You will not lose messages, but the uncommitted batch may be redelivered after the rejoin. Idempotent processing makes that safe.
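
One way to make redelivery safe is to skip records at or below the highest offset already processed for a partition. The sketch below assumes an in-memory tracker, and OffsetDeduplicator is a hypothetical class. In a real system this state would need to live next to the side effect (for example in the same database transaction), because after a rebalance the partition may be assigned to a different instance.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical in-memory tracker of the highest processed offset per partition. */
final class OffsetDeduplicator {
    private final Map<String, Long> highestProcessed = new HashMap<>();

    /** Returns true if the record is new and should be processed, false if it is a redelivery. */
    boolean shouldProcess(String topicPartition, long offset) {
        Long last = highestProcessed.get(topicPartition);
        if (last != null && offset <= last) {
            return false; // already handled before the failed commit
        }
        highestProcessed.put(topicPartition, offset);
        return true;
    }

    public static void main(String[] args) {
        OffsetDeduplicator dedup = new OffsetDeduplicator();

        // First delivery: both records are processed, then the commit is rejected.
        System.out.println(dedup.shouldProcess("audit-5", 77308L)); // true
        System.out.println(dedup.shouldProcess("audit-5", 77309L)); // true

        // After the rejoin the uncommitted batch is redelivered, plus one new record.
        System.out.println(dedup.shouldProcess("audit-5", 77308L)); // false
        System.out.println(dedup.shouldProcess("audit-5", 77309L)); // false
        System.out.println(dedup.shouldProcess("audit-5", 77310L)); // true
    }
}
```

The check relies on offsets within a partition being delivered in increasing order, which is how a single consumer reads a partition.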
