How can I get the offset value in KStream
Introduction
Kafka Streams makes day-to-day stream processing pleasant because the DSL focuses on keys and values, not broker metadata. That convenience also means offsets are not passed into map, filter, or foreach callbacks. If you need the offset for logging, auditing, or debugging, you have to step down to a context-aware API.
Why the DSL Does Not Hand You the Offset
A KStream record comes from a Kafka topic partition, and every record in that partition has a monotonically increasing offset. The plain DSL intentionally hides most of that metadata. In a simple operation like mapValues, Kafka Streams wants you to think in terms of transformed data, not transport details.
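That is why a regular DSL lambda, such as the one passed to mapValues, cannot read the offset: it receives only the value, or the key and the value, and nothing that describes where the record came from.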
If you need topic, partition, timestamp, headers, or offset, use a transformer or processor that receives a ProcessorContext.
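To illustrate the limitation, here is a minimal sketch of a DSL-only topology. The class name and the topic names `orders` and `orders-upper` are assumptions made for the example. The point is the shape of the lambda: it has a key and a value, and no context object from which an offset could be read.

```java
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.Topology;
import org.apache.kafka.streams.kstream.KStream;

class DslWithoutOffset {

    // Builds a topology that uses only high-level DSL steps.
    // "orders" and "orders-upper" are hypothetical topic names.
    static Topology buildTopology() {
        StreamsBuilder builder = new StreamsBuilder();
        KStream<String, String> orders = builder.stream("orders");

        // The lambda receives only the key and the value. There is no
        // context parameter here, so the offset cannot be read.
        orders.mapValues((key, value) -> value.toUpperCase())
              .to("orders-upper");

        return builder.build();
    }

    public static void main(String[] args) {
        // Describing a topology does not require a running broker.
        System.out.println(buildTopology().describe());
    }
}
```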
Reading the Offset with a Transformer
For many applications, the cleanest option is a ValueTransformerWithKey. It still plugs into the DSL, but Kafka Streams injects a ProcessorContext during initialization. That context exposes offset().
This pattern is useful when you want to keep using the DSL but need record metadata for a specific step. The offset you read is the offset of the current input record, not the offset of an output record written later in the topology.
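A minimal sketch of that pattern might look like the following. The class names, the topic names, and the tag format are assumptions for illustration. It also assumes a Kafka Streams version that still ships `transformValues`; that method has been deprecated since 3.3 in favor of `processValues`, so check the version you run.

```java
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.ValueTransformerWithKey;
import org.apache.kafka.streams.kstream.ValueTransformerWithKeySupplier;
import org.apache.kafka.streams.processor.ProcessorContext;

class OffsetTaggingExample {

    // Prefixes each value with the topic, partition, and offset of the
    // input record it came from. The tag format is an arbitrary choice.
    static class OffsetTagger
            implements ValueTransformerWithKey<String, String, String> {

        private ProcessorContext context;

        @Override
        public void init(ProcessorContext context) {
            // Kafka Streams hands over the context once per task.
            this.context = context;
        }

        @Override
        public String transform(String readOnlyKey, String value) {
            // Metadata of the record currently being processed.
            return context.topic() + "-" + context.partition()
                    + "@" + context.offset() + ": " + value;
        }

        @Override
        public void close() {
            // Nothing to release in this sketch.
        }
    }

    // Wires the transformer into a DSL topology. Topic names are
    // hypothetical; serdes are assumed to come from the app config.
    static void wire(StreamsBuilder builder) {
        KStream<String, String> input = builder.stream("orders");
        ValueTransformerWithKeySupplier<String, String, String> supplier =
                OffsetTagger::new;
        input.transformValues(supplier).to("orders-tagged");
    }
}
```

The supplier creates a new transformer per task, which matters because each instance holds its own context reference.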
Using a Processor for Lower-Level Control
If your logic is more operational than transformational, a processor can be a better fit. A processor is explicit about handling each input record and is a natural place for logging or custom side effects.
You would attach that processor to a KStream with process. This is heavier than a normal DSL step, but it gives you direct access to metadata and forwarding behavior.
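As a rough sketch, a logging processor could look like this. It assumes Kafka Streams 3.3 or later and uses the newer `org.apache.kafka.streams.processor.api` package, where record metadata is exposed through `recordMetadata()` as an `Optional` rather than through `context.offset()`. The class names and topic names are hypothetical.

```java
import java.util.Optional;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.processor.api.Processor;
import org.apache.kafka.streams.processor.api.ProcessorContext;
import org.apache.kafka.streams.processor.api.ProcessorSupplier;
import org.apache.kafka.streams.processor.api.Record;
import org.apache.kafka.streams.processor.api.RecordMetadata;

class OffsetLoggingExample {

    // Logs where each record came from, then forwards it unchanged.
    static class OffsetLogger
            implements Processor<String, String, String, String> {

        private ProcessorContext<String, String> context;

        @Override
        public void init(ProcessorContext<String, String> context) {
            this.context = context;
        }

        @Override
        public void process(Record<String, String> record) {
            // Empty when there is no source record, e.g. in punctuation.
            Optional<RecordMetadata> metadata = context.recordMetadata();
            if (metadata.isPresent()) {
                RecordMetadata m = metadata.get();
                System.out.printf(
                        "key=%s topic=%s partition=%d offset=%d%n",
                        record.key(), m.topic(), m.partition(), m.offset());
            } else {
                System.out.printf(
                        "key=%s has no source record metadata%n",
                        record.key());
            }
            // Pass the record on to downstream nodes.
            context.forward(record);
        }
    }

    // Attaches the processor with process(). Topic names are hypothetical.
    static void wire(StreamsBuilder builder) {
        KStream<String, String> input = builder.stream("orders");
        ProcessorSupplier<String, String, String, String> supplier =
                OffsetLogger::new;
        input.process(supplier).to("orders-logged");
    }
}
```

On older versions the legacy `Processor` interface reads the same information from `context.offset()`, `context.partition()`, and `context.topic()`.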
When the Offset May Be Missing
Offset access is tied to an input record. If code runs outside normal record processing, the metadata may not exist. In those cases Kafka Streams can return -1 for the offset.
That matters in two common cases:
- Punctuation callbacks are time-driven, not tied to a specific source record.
- Some optimized paths, especially in parts of the KTable API, do not always have stable record metadata.
So treat offsets as contextual metadata, not as a permanent business identifier.
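One way to make the -1 case explicit is to convert the raw value into an optional before using it. The helper below is a small sketch in plain Java, with no Kafka dependency. The class and method names are hypothetical, and the `rawOffset` argument stands in for whatever `context.offset()` returned.

```java
import java.util.OptionalLong;

class OffsetGuard {

    // Treats any negative value as "no offset available". Real Kafka
    // offsets are never negative, so this covers the -1 sentinel.
    static OptionalLong toOptional(long rawOffset) {
        return rawOffset < 0
                ? OptionalLong.empty()
                : OptionalLong.of(rawOffset);
    }

    // Formats the offset for a log line without printing a bare -1.
    static String describe(long rawOffset) {
        OptionalLong offset = toOptional(rawOffset);
        return offset.isPresent()
                ? "offset=" + offset.getAsLong()
                : "offset=unavailable";
    }

    public static void main(String[] args) {
        System.out.println(describe(42L)); // prints "offset=42"
        System.out.println(describe(-1L)); // prints "offset=unavailable"
    }
}
```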
Common Pitfalls
One common mistake is expecting offsets inside every DSL lambda. Most high-level callbacks do not expose them, so reaching for mapValues or peek alone will not solve the problem.
Another mistake is storing offsets as if they were globally unique. Offsets are only unique within a single topic partition. If you persist them, persist the topic and partition alongside them.
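The point about scope can be sketched with a small value class that keeps all three coordinates together. This is plain Java with no Kafka dependency; the class name `RecordPosition` and its string format are assumptions for the example.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Identifies one record by topic, partition, and offset together.
final class RecordPosition {
    private final String topic;
    private final int partition;
    private final long offset;

    RecordPosition(String topic, int partition, long offset) {
        this.topic = Objects.requireNonNull(topic);
        this.partition = partition;
        this.offset = offset;
    }

    @Override
    public boolean equals(Object other) {
        if (!(other instanceof RecordPosition)) return false;
        RecordPosition that = (RecordPosition) other;
        return topic.equals(that.topic)
                && partition == that.partition
                && offset == that.offset;
    }

    @Override
    public int hashCode() {
        return Objects.hash(topic, partition, offset);
    }

    @Override
    public String toString() {
        return topic + "-" + partition + "@" + offset;
    }

    public static void main(String[] args) {
        Set<RecordPosition> seen = new HashSet<>();
        seen.add(new RecordPosition("orders", 0, 42L));
        seen.add(new RecordPosition("orders", 1, 42L)); // same offset, other partition
        seen.add(new RecordPosition("orders", 0, 42L)); // true duplicate
        System.out.println(seen.size()); // prints 2
    }
}
```

Offset 42 appears in two partitions here and is counted as two distinct records, which a bare offset could not express.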
A third mistake is assuming context.offset() is always valid. During punctuation or some internal optimizations, Kafka Streams may not have a current source record. Guard for -1 and decide how your application should behave in that case.
Finally, avoid using the consumed offset as proof that downstream work succeeded. Processing, forwarding, and committing are related but not identical steps. If you need end-to-end guarantees, rely on Kafka Streams processing semantics and your sink behavior, not the offset alone.
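As one hedged illustration of relying on processing semantics, the sketch below builds a configuration that requests exactly-once processing. It uses plain string keys so it needs only the JDK. The application id and broker address are hypothetical, and `exactly_once_v2` assumes Kafka Streams 3.0 or later with brokers that support it. This guarantee covers Kafka-to-Kafka processing; writes to external systems still depend on how the sink behaves.

```java
import java.util.Properties;

class ExactlyOnceConfig {

    // Builds the properties you would pass to a KafkaStreams instance.
    static Properties streamsProperties() {
        Properties props = new Properties();
        props.put("application.id", "offset-demo");       // hypothetical id
        props.put("bootstrap.servers", "localhost:9092"); // hypothetical broker
        // Ask Kafka Streams to commit offsets, state, and output atomically.
        props.put("processing.guarantee", "exactly_once_v2");
        return props;
    }

    public static void main(String[] args) {
        Properties props = streamsProperties();
        System.out.println(props.getProperty("processing.guarantee"));
    }
}
```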
Summary
- A plain KStream DSL callback does not usually expose the Kafka offset.
- Use ValueTransformerWithKey, Transformer, or the Processor API when you need ProcessorContext.
- Read the current record offset with context.offset().
- Expect -1 when code is not running against a real input record.
- Treat offsets as partition-scoped metadata, not as business IDs.

