Kafka Consumer Offsets Deep Dive: Position vs Committed

January 2, 2026

A Kafka consumer does not read from "the last committed offset" on every poll. That is the most common misunderstanding I see, and it leads to a class of duplicate-processing bugs that are very expensive when they hit production payments.

Each consumer keeps two offsets per partition. The position is an in-memory pointer to the next record to fetch, and it advances every time poll() returns a batch. The committed offset is a durable checkpoint stored in the __consumer_offsets topic, written periodically either by the auto-commit loop or by explicit commitSync() and commitAsync() calls. Position moves fast. Commits move deliberately.

On a normal poll, the consumer fetches starting from its current position, advances the position past the returned records, and your handler runs. The position keeps marching forward in memory. The committed offset only catches up when a commit fires. If the process crashes, the position is gone. On restart, the consumer reads the committed offset from __consumer_offsets and resumes from there. Everything between the last commit and the crash gets redelivered.

This is what at-least-once means in practice. The redelivery window is exactly the gap between position and committed offset.
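To make the two offsets concrete, here is a minimal sketch in Python. It is a toy model of a single partition, not the Kafka client: ToyConsumer, commit_sync, redelivery_window and the dict standing in for __consumer_offsets are all hypothetical names invented for illustration. It assumes a consumer with no committed offset starts at offset 0.

```python
class ToyConsumer:
    """Toy model of one partition. NOT the Kafka client, just the two offsets."""

    def __init__(self, log, durable_store):
        self.log = log              # the partition's records; list index == offset
        self.store = durable_store  # stands in for __consumer_offsets
        # On startup, resume from the committed offset (assumed 0 if none exists).
        self.position = self.store.get("committed", 0)

    def poll(self, max_records):
        # Fetch from the current position and advance it, in memory only.
        batch = self.log[self.position:self.position + max_records]
        self.position += len(batch)
        return batch

    def commit_sync(self):
        # Write the current position to the durable checkpoint.
        self.store["committed"] = self.position

    def redelivery_window(self):
        # Records that would be delivered again if the process died right now.
        return self.position - self.store.get("committed", 0)


log = [f"record-{i}" for i in range(10)]
store = {}  # outlives the consumer object, like the offsets topic outlives a pod

consumer = ToyConsumer(log, store)
consumer.poll(4)       # position moves to 4
consumer.commit_sync() # committed offset moves to 4
consumer.poll(3)       # position moves to 7, committed offset stays at 4
window_before_crash = consumer.redelivery_window()

del consumer           # simulated crash: the in-memory position is lost

restarted = ToyConsumer(log, store)
redelivered = restarted.poll(3)  # resumes from the committed offset
```

In this run the window before the crash is three records, and those same three records (offsets 4 through 6) come back on the first poll after restart.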

Auto-commit (enable.auto.commit=true) is the danger zone. The default auto.commit.interval.ms is 5000 ms. In the Java client the auto-commit check runs inside poll() once the interval has elapsed, and it commits the current position, not your handler's progress. If records are handed off to worker threads, or the client commits from a background timer, the commit can land before those records have been processed. You can have up to a five-second window of work that is "done" from Kafka's perspective but not from your application's, or vice versa.
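The two failure directions can be sketched with a few lines of arithmetic. This is a toy calculation, not client code: outcome_after_crash is a hypothetical helper, and both arguments are "next offset" values (the committed offset, and the offset of the first record the handler has not yet finished). It assumes the commit ran independently of handler progress, as described above.

```python
def outcome_after_crash(committed, processed_upto):
    """Classify what a restarted consumer sees, given where the commit
    landed and how far the handler actually got before the crash."""
    if committed > processed_upto:
        # Commit ran ahead of processing: these records are never handled.
        return ("skipped", committed - processed_upto)
    if committed < processed_upto:
        # Processing ran ahead of the commit: these records are handled twice.
        return ("redelivered", processed_upto - committed)
    return ("clean", 0)


# A batch of 100 was delivered and auto-committed, but only 70 were processed.
commit_ran_ahead = outcome_after_crash(committed=100, processed_upto=70)

# 50 records were processed, but no commit had fired since offset 0.
processing_ran_ahead = outcome_after_crash(committed=0, processed_upto=50)
```

The first case loses 30 records silently; the second replays 50. The numbers here are illustrative and are not the ones from the incident below.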

A payments consumer I worked with used auto-commit. The handler called Stripe to charge a card, then logged the result. Throughput was fine until a deploy crashed a pod mid-batch. The pod had polled 50,000 messages, auto-committed at the four-second mark, then crashed at second six while still processing the back half of that batch. On restart, the new consumer read the committed offset, which was ahead of where processing had stopped, and skipped 30,000 charges. Worse, an earlier crash had gone the other way, redelivering 50,000 messages that Stripe had already charged. The service was not idempotent. Customers got double-billed.

The fix has two parts. First, turn off auto-commit and call commitSync() only after your side effects are durable. Second, because at-least-once means duplicates can still happen on the unhappy path, attach an idempotency key to every side effect downstream. Stripe accepts an Idempotency-Key header. Use it.
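Here is a rough sketch of how the two parts fit together. Nothing in it talks to Kafka or Stripe: FakePaymentApi is a hypothetical in-memory stand-in for a provider that honours idempotency keys, handle_batch is an invented helper, and payment_id is an assumed business identifier carried in each record. The commit callback stands in for a synchronous commit call.

```python
class FakePaymentApi:
    """Hypothetical stand-in for a payment provider with idempotency keys."""

    def __init__(self):
        self.charges = {}  # idempotency key -> amount actually charged
        self.calls = 0     # every request, including retries

    def charge(self, amount, idempotency_key):
        self.calls += 1
        # A repeated key is acknowledged without charging a second time.
        if idempotency_key not in self.charges:
            self.charges[idempotency_key] = amount
        return idempotency_key


def handle_batch(records, api, commit):
    """Run every side effect first, then commit. If anything fails before
    commit() is reached, the batch will be redelivered after restart."""
    for record in records:
        api.charge(record["amount"], idempotency_key=record["payment_id"])
    commit()


def crash_before_commit():
    raise RuntimeError("simulated crash before the commit")


records = [{"payment_id": f"pay-{i}", "amount": 100} for i in range(3)]
api = FakePaymentApi()
commits = []

# First attempt: all charges go out, then the process dies before committing.
try:
    handle_batch(records, api, crash_before_commit)
except RuntimeError:
    pass

# After restart the same batch is redelivered and processed again.
handle_batch(records, api, lambda: commits.append(len(records)))
```

The provider sees six requests but records only three charges, and the commit fires once, after the second attempt completes. Committing after the side effects keeps the failure mode on the redelivery side, and the idempotency key makes redelivery harmless.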

None of this is about rebalancing or consumer lag. It is the offset semantics itself.

Key takeaway

Position is the in-memory pointer the consumer advances on every poll. Committed offset is the durable checkpoint that survives a crash. The gap between them is your duplicate-processing window, and with default settings auto-commit makes that gap up to five seconds wide.

Originally posted on LinkedIn.