Kafka In-Sync Replicas Deep Dive: When ISR Shrinks and Writes Quietly Stop
May 6, 2026
Most Kafka write-path explanations stop at "the leader replicates to the ISR." That sentence hides the entire interesting mechanism. The ISR is not a configured list. It is a live set that each partition's leader keeps updating as followers fetch or fall silent, and its size determines whether the partition accepts acks=all writes at all.
Start with how a replica joins. When a partition is created or reassigned, a follower opens a fetch connection to the leader and requests records from the end of its own log forward. New replicas join the ISR only once they have fully caught up to the leader's current log-end offset.
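The catch-up phase can be sketched as a toy loop. This is a simplified model, not broker code: the function name rounds_to_catch_up, the batch size, and the append rate are hypothetical, and "caught up" is taken to mean reaching the leader's log-end offset, as described above.

```python
def rounds_to_catch_up(follower_leo, leader_leo, fetch_batch,
                       leader_appends_per_round=0, max_rounds=10_000):
    """Count fetch rounds until the follower reaches the leader's log end.

    Returns None if the follower never catches up within max_rounds,
    which in this toy model means it would never be added to the ISR.
    """
    rounds = 0
    while follower_leo < leader_leo:
        if rounds >= max_rounds:
            return None
        # The leader keeps accepting writes while the follower is fetching.
        leader_leo += leader_appends_per_round
        # The follower fetches from the end of its own log forward.
        follower_leo = min(leader_leo, follower_leo + fetch_batch)
        rounds += 1
    return rounds


idle_leader = rounds_to_catch_up(0, 1_000, fetch_batch=250)
busy_leader = rounds_to_catch_up(0, 1_000, fetch_batch=250,
                                 leader_appends_per_round=50)
too_slow = rounds_to_catch_up(0, 1_000, fetch_batch=50,
                              leader_appends_per_round=50, max_rounds=100)
print(idle_leader, busy_leader, too_slow)
```

The third call is the interesting one: a follower that fetches no faster than the leader appends never closes the gap, so it never qualifies for the ISR.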
Now the crucial setting: replica.lag.time.max.ms. This is not "how many records behind" the follower can be. It is "how long since the follower was last caught up to the leader's log-end offset." The default is 30 seconds. If a follower has not sent a fetch request, or has not caught up to the leader's log-end offset, within that window, the leader evicts it from the ISR.
The time-based rule matters because it tolerates bursts. A follower can fall behind by ten thousand records during a traffic spike and stay in the ISR as long as it keeps fetching and catches back up to the leader's log end within the window. The rule punishes silence and sustained lag, not momentary lag depth. A follower that GC-pauses for 35 seconds is evicted during the pause, because for more than 30 seconds it neither fetched nor caught up. It rejoins only after it has caught up again.
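A minimal sketch of the time-based check, assuming the leader tracks when each follower was last seen at the leader's log end. The names FollowerState, record_fetch, and is_out_of_sync are hypothetical and the bookkeeping is simplified compared with the real broker. In this model, a follower that keeps fetching but never reaches the leader's log end for the whole window is flagged as well.

```python
from dataclasses import dataclass

REPLICA_LAG_TIME_MAX_MS = 30_000  # the default cited in the text


@dataclass
class FollowerState:
    log_end_offset: int
    last_caught_up_ms: int  # last time this follower was at the leader's log end

    def record_fetch(self, now_ms, fetch_offset, leader_log_end_offset):
        """Update the leader's view of this follower after a fetch request."""
        self.log_end_offset = fetch_offset
        if fetch_offset >= leader_log_end_offset:
            self.last_caught_up_ms = now_ms


def is_out_of_sync(follower, leader_log_end_offset, now_ms,
                   max_lag_ms=REPLICA_LAG_TIME_MAX_MS):
    """True when the follower is behind AND has been for longer than max_lag_ms."""
    behind = follower.log_end_offset < leader_log_end_offset
    return behind and (now_ms - follower.last_caught_up_ms) > max_lag_ms


# Burst: 10,000 records behind, but only 20 s since it was last caught up.
burst = FollowerState(log_end_offset=1_000, last_caught_up_ms=0)
burst.record_fetch(now_ms=20_000, fetch_offset=1_000, leader_log_end_offset=11_000)
burst_evicted = is_out_of_sync(burst, leader_log_end_offset=11_000, now_ms=20_000)

# GC pause: only one record behind, but silent for 35 s.
paused = FollowerState(log_end_offset=1_000, last_caught_up_ms=0)
paused_evicted = is_out_of_sync(paused, leader_log_end_offset=1_001, now_ms=35_000)

print(burst_evicted, paused_evicted)
```

The burst follower is ten thousand records behind and stays in; the paused follower is one record behind and is out. Depth of lag never appears in the decision, only elapsed time.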
When a follower is evicted, the ISR shrinks. The leader continues advancing the High Watermark based only on the remaining members. The cluster keeps moving. The signal you watch is UnderReplicatedPartitions, the JMX metric that counts partitions whose ISR size is less than their replication factor. A nonzero value means you are running on fewer replicas than you provisioned for.
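To make the High Watermark behavior concrete, here is a toy model that treats the High Watermark as the smallest log-end offset among current ISR members, and counts under-replicated partitions the way the metric is described above. The function names, replica names, and offsets are made up for illustration.

```python
def high_watermark(log_end_offsets, isr):
    """Smallest log-end offset among the current ISR members."""
    return min(log_end_offsets[replica] for replica in isr)


def under_replicated_count(partitions):
    """Number of partitions whose ISR is smaller than their replication factor."""
    return sum(1 for p in partitions if len(p["isr"]) < p["replication_factor"])


log_end_offsets = {"leader": 500, "follower_a": 500, "follower_b": 120}

# While the slow follower is still in the ISR, it holds the High Watermark back.
hw_before = high_watermark(log_end_offsets, isr={"leader", "follower_a", "follower_b"})

# After eviction, only the remaining members count, so the High Watermark jumps.
hw_after = high_watermark(log_end_offsets, isr={"leader", "follower_a"})

partitions = [
    {"name": "orders-0", "replication_factor": 3, "isr": {"leader", "follower_a"}},
    {"name": "orders-1", "replication_factor": 3,
     "isr": {"leader", "follower_a", "follower_b"}},
]

print(hw_before, hw_after, under_replicated_count(partitions))
```

Evicting the slow follower is what lets the partition keep moving, and the under-replicated count is the price tag on that progress.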
Now the failure mode that bites. min.insync.replicas is enforced at write time. With acks=all and min.insync.replicas=2, the broker requires at least two members of the ISR to be present before it accepts a write. If the ISR has shrunk to one (the leader itself), the broker rejects the write with NotEnoughReplicasException. The producer sees errors, latency climbs, dashboards turn red. This is the intended behavior. The alternative is to let acks=all quietly degrade into "one replica wrote it," and that is the silent path to data loss.
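The write-time check can be modeled in a few lines. This is a sketch of the rule as described above, not the broker's implementation: NotEnoughReplicasError, check_append, and try_append are stand-in names, and acks is passed as a plain string.

```python
class NotEnoughReplicasError(Exception):
    """Stand-in for the broker's NotEnoughReplicasException."""


def check_append(acks, isr_size, min_insync_replicas):
    """Reject an acks=all write when the ISR is smaller than min.insync.replicas."""
    if acks == "all" and isr_size < min_insync_replicas:
        raise NotEnoughReplicasError(
            f"ISR size {isr_size} is below min.insync.replicas={min_insync_replicas}"
        )


def try_append(acks, isr_size, min_insync_replicas):
    """Return 'accepted' or 'rejected' instead of raising, for easy comparison."""
    try:
        check_append(acks, isr_size, min_insync_replicas)
    except NotEnoughReplicasError:
        return "rejected"
    return "accepted"


results = {
    "acks=all, ISR=2": try_append("all", isr_size=2, min_insync_replicas=2),
    "acks=all, ISR=1": try_append("all", isr_size=1, min_insync_replicas=2),
    "acks=1,   ISR=1": try_append("1", isr_size=1, min_insync_replicas=2),
}
for case, outcome in results.items():
    print(case, "->", outcome)
```

The last case is worth noticing: the threshold applies only to acks=all producers. A producer using acks=1 is still accepted against a one-member ISR, which is exactly the weaker guarantee it asked for.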
The production scenario worth memorizing. A disk on broker B slows down. Its follower stops keeping up. After 30 seconds it is evicted from the ISR for several partitions. Those partitions are now under-replicated, with an ISR of two. If a second follower is then evicted as well, the ISR is down to the leader alone and min.insync.replicas=2 starts rejecting acks=all writes. Replication factor is still 3. Effective durability is 1. That is the moment to fix the disk, not to lower the threshold.
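The scenario can be walked through as a small timeline, assuming replication factor 3 and min.insync.replicas=2 as in the text. The status labels and the headroom calculation are illustrative choices, not Kafka terminology.

```python
REPLICATION_FACTOR = 3
MIN_INSYNC_REPLICAS = 2


def partition_status(isr_size, replication_factor=REPLICATION_FACTOR,
                     min_insync_replicas=MIN_INSYNC_REPLICAS):
    """Classify a partition by comparing its ISR size with the two thresholds."""
    if isr_size < min_insync_replicas:
        return "rejecting acks=all writes"
    if isr_size < replication_factor:
        return "under-replicated"
    return "healthy"


def headroom(isr_size, min_insync_replicas=MIN_INSYNC_REPLICAS):
    """How many more ISR members can be lost before acks=all writes are rejected."""
    return isr_size - min_insync_replicas


timeline = [
    ("all replicas in sync", 3),
    ("broker B's follower evicted", 2),
    ("second follower evicted", 1),
]

statuses = [partition_status(isr_size) for _, isr_size in timeline]
for (event, isr_size), status in zip(timeline, statuses):
    print(f"{event}: ISR={isr_size}, headroom={headroom(isr_size)}, {status}")
```

The middle row is the one to alert on. Writes still succeed there, but the headroom is already zero, so the next eviction is an outage.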
The ISR is the contract. Replication factor is the budget. The two are not the same number.
ISR membership is dynamic and time-based. A follower that stops fetching, or stays behind the leader, for replica.lag.time.max.ms is removed, and once the ISR drops below min.insync.replicas, the broker rejects acks=all writes. The metric you watch is UnderReplicatedPartitions, not replication factor.
Originally posted on LinkedIn.