AI for Kafka · By James Joyner IV · 9 min read

Kafka Error Guide: 'Truncating partition topic-0 to local high watermark 10042' Replica Divergence

Understand Kafka follower log truncation and high watermark mismatch after a leader change, when it is safe, and when unclean leader election causes data loss.

  • #kafka
  • #troubleshooting
  • #errors
  • #replication

Seeing a broker log Truncating partition topic-0 to local high watermark 10042 can be unsettling — it looks like Kafka is throwing away data. Most of the time it is expected, safe behavior after a leader change. But under certain configurations it signals genuine data loss. This guide explains the high watermark mechanism, how to tell safe truncation from dangerous truncation, and how to configure your cluster so divergence never costs you records.

Exact Error Message

These lines surface in the follower broker’s server.log, emitted by the log layer and the replica fetcher thread:

[2026-06-29 09:14:03,221] INFO [UnifiedLog partition=topic-0, dir=/var/lib/kafka/data] Truncating to offset 10042 (kafka.log.UnifiedLog)
[2026-06-29 09:14:03,221] INFO [ReplicaFetcher replicaId=2, leaderId=1, fetcherId=0] Truncating partition topic-0 to local high watermark 10042 (kafka.server.ReplicaFetcherThread)
[2026-06-29 09:14:03,244] WARN  [ReplicaFetcher replicaId=2, leaderId=1, fetcherId=0] Received OFFSET_OUT_OF_RANGE from leader for partition topic-0 at fetch offset 10987 (kafka.server.ReplicaFetcherThread)
[2026-06-29 09:14:03,260] WARN  [Partition topic-0 broker=2] High watermark for partition topic-0 could not be updated, leader epoch changed from 7 to 8 (kafka.cluster.Partition)

The combination of a Truncating ... to ... high watermark, an OFFSET_OUT_OF_RANGE, and a leader-epoch change is the classic signature of a follower reconciling its log with a newly elected leader.

What the Error Means

Every partition has a high watermark (HW): the offset below which every record has been replicated to all in-sync replicas and is therefore visible to consumers. A replica’s log-end offset (LEO) is the offset its next appended record will receive, one past its last record, so the LEO can be ahead of the HW.
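
As a rough illustration of how the two offsets relate, the sketch below models the high watermark as the smallest log-end offset among the in-sync replicas. The function name and the offsets are hypothetical, and a real leader tracks this internally from follower fetch requests rather than from a list passed in by hand.

```sh
# high_watermark: print the smallest log-end offset among the in-sync replicas.
# Arguments are the LEOs of the ISR members (illustrative values, not read from a broker).
high_watermark() {
  hw=$1
  for leo in "$@"; do
    if [ "$leo" -lt "$hw" ]; then hw=$leo; fi
  done
  echo "$hw"
}

# Leader at 10987, two in-sync followers still catching up.
high_watermark 10987 10500 10042
```

In this model the records between 10042 and 10987 exist only on some replicas, which is exactly the tail that can be truncated after a failover.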

When a new leader is elected, followers must converge to the new leader’s log. If a follower’s LEO is ahead of where the new leader’s log diverges, the follower truncates its log back to a consistent point — historically the high watermark, and in modern Kafka the precise divergence point computed from leader epochs (KIP-101). The OFFSET_OUT_OF_RANGE warning means the follower asked for an offset the new leader does not have, prompting truncation.
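
The leader-epoch exchange can be sketched as follows. This is a simplified model of the KIP-101 idea, not broker code: the function name, the "epoch start_offset" input format, and the offsets are assumptions made for illustration, and it ignores the case where the leader has no entry at or below the follower's epoch (refined by KIP-279).

```sh
# truncation_offset FOLLOWER_EPOCH FOLLOWER_LEO LEADER_LEO
# Reads the leader epoch cache on stdin as "epoch start_offset" lines, ascending.
# Prints the offset the follower should truncate to.
truncation_offset() {
  awk -v fe="$1" -v fleo="$2" -v lleo="$3" '
    # The follower epoch ends on the leader where the first later epoch starts.
    $1 > fe && epoch_end == "" { epoch_end = $2 }
    END {
      # No later epoch: the follower epoch is still current, so it ends at the leader LEO.
      if (epoch_end == "") epoch_end = lleo
      t = (epoch_end + 0 < fleo + 0) ? epoch_end : fleo
      print t
    }'
}

# Follower last wrote in epoch 7 and has LEO 10987; the leader started epoch 8 at 10042.
printf '7 9000\n8 10042\n' | truncation_offset 7 10987 10100
```

With these inputs the follower truncates to 10042, the point where its epoch-7 records stop matching the leader's log. A follower already in the current epoch with a shorter log keeps its own LEO and truncates nothing.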

The critical question is whether the truncated records were ever acknowledged to producers. With a clean failover to an in-sync replica, only un-acknowledged records (above the old HW) are dropped — no acknowledged data is lost. With unclean leader election, acknowledged records can vanish.

Common Causes

  1. Normal leader failover: a leader broker restarts or loses its ZooKeeper/KRaft session; a follower in the ISR is promoted and the others truncate their uncommitted tails. Expected and safe.
  2. Follower ahead of the new leader’s HW: the old leader had appended records past the HW that were not yet replicated everywhere; those are truncated on failover.
  3. Unclean leader election: unclean.leader.election.enable=true lets an out-of-sync replica become leader when no in-sync replica is available. Other replicas then truncate to its shorter log, discarding committed records (data loss).
  4. Log divergence: two replicas accepted different records at the same offset across a flapping leadership change; leader-epoch reconciliation resolves it by truncating one side.
  5. OFFSET_OUT_OF_RANGE handling: a follower’s fetch offset exceeds the leader’s LEO, forcing a reset and truncation.
  6. Pre-KIP-101 behavior: very old brokers truncated to the HW only, which could silently lose data or leave replicas divergent during rapid double failovers; leader epochs (KIP-101/KIP-279) fixed this.

How to Reproduce the Error

In a test cluster with a topic of replication factor 3:

  1. Produce a steady stream so the leader’s LEO stays ahead of the HW.
  2. Kill the current leader broker (systemctl stop kafka in the lab) so a follower is promoted.
  3. The remaining followers log Truncating partition topic-0 to local high watermark ... as they reconcile to the new leader.

To deliberately observe data loss, set unclean.leader.election.enable=true, stop all in-sync replicas, then start a lagging replica so it becomes leader at a lower offset. The previously in-sync replicas will truncate down to it on rejoin.

Diagnostic Commands

Inspect leadership, ISR, and partition layout:

kafka-topics.sh --bootstrap-server localhost:9092 \
  --describe --topic topic
Topic: topic  PartitionCount: 1  ReplicationFactor: 3
  Topic: topic  Partition: 0  Leader: 1  Replicas: 1,2,3  Isr: 1,2
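
To scan many partitions at once, the describe output can be filtered for partitions whose ISR is smaller than the replica list. The following is a minimal sketch that assumes the field labels shown above (Topic:, Partition:, Replicas:, Isr:); the function name shrunken_isr is hypothetical.

```sh
# shrunken_isr: read `kafka-topics.sh --describe` output on stdin and print
# "topic-partition replicas=N isr=M" for every partition whose ISR is
# smaller than its replica list.
shrunken_isr() {
  awk '
    /Partition:/ {
      topic = part = reps = isr = ""
      for (i = 1; i < NF; i++) {
        if ($i == "Topic:")     topic = $(i + 1)
        if ($i == "Partition:") part  = $(i + 1)
        if ($i == "Replicas:")  reps  = $(i + 1)
        if ($i == "Isr:")       isr   = $(i + 1)
      }
      n = split(reps, r, ",")
      m = split(isr, s, ",")
      if (m < n) printf "%s-%s replicas=%d isr=%d\n", topic, part, n, m
    }'
}

# Sample input copied from the output above; in practice, pipe the command into it.
shrunken_isr <<'EOF'
Topic: topic  PartitionCount: 1  ReplicationFactor: 3
  Topic: topic  Partition: 0  Leader: 1  Replicas: 1,2,3  Isr: 1,2
EOF
```

Against a live cluster this would be used as kafka-topics.sh --bootstrap-server localhost:9092 --describe | shrunken_isr.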

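A shrunken Isr (here 2 of 3) shows replica 3 fell behind before the failover. kafka-leader-election.sh has no read-only mode (running it triggers an election), so inspect the effective unclean.leader.election.enable setting for the topic instead:

kafka-configs.sh --bootstrap-server localhost:9092 \
  --describe --entity-type topics --entity-name topic --all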

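Compare replica state across brokers to see who was ahead. Note that kafka-log-dirs.sh reports size in bytes and offsetLag relative to the high watermark rather than log-end offsets directly, so treat the figures below as illustrative: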

kafka-log-dirs.sh --bootstrap-server localhost:9092 \
  --describe --topic-list topic
{"broker":1,"logDirs":[{"logDir":"/var/lib/kafka/data",
  "partitions":[{"partition":"topic-0","size":10042,"offsetLag":0}]}]}
{"broker":2,"logDirs":[{"logDir":"/var/lib/kafka/data",
  "partitions":[{"partition":"topic-0","size":10987,"offsetLag":945}]}]}

Pull the truncation evidence from the logs:

grep -iE "Truncating|high watermark|OFFSET_OUT_OF_RANGE|leader epoch" \
  /var/lib/kafka/logs/server.log
journalctl -u kafka --since "1 hour ago" | grep -i "Truncating"
[ReplicaFetcher replicaId=2, leaderId=1] Truncating partition topic-0 to local high watermark 10042
[Partition topic-0 broker=2] Cached leader epoch 8, fetching from offset 10042
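
When the log contains many such lines, a per-partition summary makes repeated truncation easier to spot. The sketch below assumes the message wording shown in this guide ("Truncating partition ... to local high watermark ..."); the function name truncation_summary is hypothetical, and the pattern would need adjusting if your broker version words the message differently.

```sh
# truncation_summary: read broker log lines on stdin and print, per partition,
# how many truncation events were seen and the last truncation offset.
truncation_summary() {
  sed -n 's/.*Truncating partition \([^ ]*\) to local high watermark \([0-9][0-9]*\).*/\1 \2/p' |
    awk '{ count[$1]++; last[$1] = $2 }
         END { for (p in count) printf "%s events=%d last_offset=%s\n", p, count[p], last[p] }' |
    sort
}

# Sample input copied from the log lines above.
truncation_summary <<'EOF'
[ReplicaFetcher replicaId=2, leaderId=1] Truncating partition topic-0 to local high watermark 10042
[Partition topic-0 broker=2] Cached leader epoch 8, fetching from offset 10042
EOF
```

On a broker this would be run as truncation_summary < /var/lib/kafka/logs/server.log, using the same log path as the grep above.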

Step-by-Step Resolution

A worked example after observing truncation on broker 2:

  1. Determine if it was clean. Check whether the new leader came from the ISR. In kafka-topics.sh --describe, if the elected leader’s ID was present in the prior Isr, the failover was clean and only un-acknowledged records (above offset 10042) were dropped — no action needed.

  2. Confirm leader epochs are in play. Modern brokers log Cached leader epoch N, fetching from offset .... Leader-epoch-based truncation (KIP-101) is the default and is precise; you do not want HW-only truncation. No config change is required on supported versions.

  3. If it was unclean, the truncated records are gone — they cannot be recovered from a truncated follower. Identify affected offsets from the log gap (here 10042 vs 10987 = 945 records lost), and replay from the source system if you have one.

  4. Prevent recurrence by disabling unclean election. In server.properties:

    unclean.leader.election.enable=false
    min.insync.replicas=2

    With false, Kafka will keep a partition offline rather than promote an out-of-sync replica, trading availability for durability.

  5. Require durable acks on the producer side so a record is only acknowledged after it reaches all in-sync replicas:

    # producer config
    acks=all
    enable.idempotence=true

  6. Verify the partition is healthy afterward with kafka-topics.sh --describe --under-replicated-partitions (empty when followers have caught back up) and confirm the ISR returns to full size.
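
Steps 1 and 3 reduce to a membership check and a subtraction, sketched below. Both function names are hypothetical, and the sketch assumes you recorded the ISR before the election (for example from monitoring), since kafka-topics.sh only shows the current one.

```sh
# failover_verdict NEW_LEADER PRIOR_ISR
# PRIOR_ISR is the comma-separated ISR recorded before the election.
failover_verdict() {
  case ",$2," in
    *",$1,"*) echo clean ;;
    *)        echo unclean ;;
  esac
}

# truncated_records TRUNCATION_OFFSET FOLLOWER_LEO
# Size of the truncated tail, counted in offsets.
truncated_records() {
  echo $(( $2 - $1 ))
}

failover_verdict 1 "1,2"        # broker 1 was in the prior ISR
failover_verdict 3 "1,2"        # broker 3 was out of sync
truncated_records 10042 10987   # offsets from the worked example
```

The three calls print clean, unclean, and 945. A clean verdict means the truncated tail was never acknowledged under acks=all; an unclean verdict means the 945 offsets may include committed records.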

Prevention and Best Practices

  • Keep unclean.leader.election.enable=false (the broker default) — this is the single most important setting for avoiding truncation-induced data loss.
  • Use acks=all plus min.insync.replicas=2 with replication factor 3 so acknowledged records survive any single failover.
  • Run a recent Kafka version so leader-epoch truncation (KIP-101/KIP-279) replaces the older, lossy HW-only behavior.
  • Alert on ISR shrink and under-replicated partitions — a chronically lagging follower is the replica most likely to cause a problematic election.
  • Avoid flapping brokers — tune zookeeper.session.timeout.ms / KRaft heartbeat settings so transient pauses do not trigger needless leadership changes.
  • Treat repeated unexplained truncation as an incident and route it through your incident response workflow.

Related Errors

  • OFFSET_OUT_OF_RANGE: the fetch error that triggers a follower reset and truncation.
  • org.apache.kafka.common.errors.NotLeaderOrFollowerException: seen during the leadership transition window.
  • Halting because log truncation is not allowed for topic: a fatal error on older brokers, raised when unclean leader election is disabled and a follower finds the leader’s log-end offset behind its own.

Frequently Asked Questions

Q: Is log truncation always a sign of data loss? No. After a clean failover to an in-sync replica, only records above the high watermark — which were never acknowledged to producers — are truncated. Acknowledged data is preserved.

Q: How do I know if my truncation caused real data loss? Check whether the newly elected leader was in the ISR at the time of election. If it was, the truncation was clean. If unclean.leader.election.enable=true allowed an out-of-sync replica to win, compare each replica’s log-end offset to quantify the gap, for example via the kafka.log:type=Log,name=LogEndOffset JMX metric; kafka-log-dirs.sh --describe reports only size in bytes and offsetLag.

Q: What is the difference between the high watermark and the log-end offset? The log-end offset marks the end of a replica’s own log: the offset its next record will receive. The high watermark is the offset up to which records have been replicated to all in-sync replicas, and it is the boundary up to which consumers can read.

Q: Will acks=all prevent truncation entirely? It will not stop truncation of un-acknowledged tails after a failover, but combined with min.insync.replicas=2 and unclean.leader.election.enable=false, it guarantees that any record acknowledged to the producer is never truncated away.
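
The settings named in this answer can be checked together before a deployment. The sketch below is a hypothetical helper, not a Kafka tool: it takes the four values as arguments (you would read them from your own producer and broker configuration) and reports any that weaken the guarantee described above.

```sh
# durability_check ACKS MIN_ISR REPLICATION_FACTOR UNCLEAN_ENABLED
# Prints "ok" when the values match this guide's recommendations,
# otherwise "weak:" followed by the offending settings.
durability_check() {
  problems=""
  case "$1" in
    all|-1) ;;
    *) problems="$problems acks=$1" ;;
  esac
  if [ "$2" -lt 2 ]; then problems="$problems min.insync.replicas=$2"; fi
  if [ "$2" -gt "$3" ]; then problems="$problems min.insync.replicas>replication.factor"; fi
  if [ "$4" != "false" ]; then problems="$problems unclean.leader.election.enable=$4"; fi
  if [ -z "$problems" ]; then echo "ok"; else echo "weak:$problems"; fi
}

durability_check all 2 3 false
durability_check 1 1 3 true
```

The first call prints ok; the second lists acks=1, min.insync.replicas=1, and unclean.leader.election.enable=true as the weak points.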
