Does Kafka timestamp order correspond to offset order?
Apache Kafka is a distributed event streaming platform capable of handling trillions of events a day. Initially conceived as a messaging queue, Kafka is built on the abstraction of a distributed commit log. Because it deals with streams of records, time and ordering are central concepts, and it is worth understanding how a record's timestamp relates to its offset.
Understanding Offsets and Timestamps in Kafka
Within a Kafka cluster, each topic is divided into one or more partitions. Each partition is an ordered, immutable sequence of records that is continually appended to, structured as a commit log. Each record in a partition is assigned a sequential id called an offset. An offset uniquely identifies a record within its partition, but offsets are not comparable across partitions.
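A timestamp, by contrast, is a metadata field in a Kafka record that denotes either the time at which the event occurred or the time at which it was appended to the Kafka log. Kafka supports two timestamp types, selected per topic with the `message.timestamp.type` setting:
- Creation time (`CreateTime`, the default): Set on the producer side when the record is created, either by the producer client or explicitly by the application.
- Log append time (`LogAppendTime`): Set by the broker when it appends the record to the log, overwriting any producer-supplied timestamp.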
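The difference between the two timestamp types can be sketched with a toy in-memory partition. This is an illustrative model only, not Kafka's implementation: `PartitionLog`, `Record`, and the `append` signature are hypothetical names, and times are plain integers in milliseconds. Only the two type names, `CreateTime` and `LogAppendTime`, come from Kafka itself.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Record:
    offset: int
    timestamp_ms: int
    value: str


@dataclass
class PartitionLog:
    """Toy stand-in for a single partition (hypothetical, for illustration)."""
    timestamp_type: str = "CreateTime"  # or "LogAppendTime"
    records: List[Record] = field(default_factory=list)

    def append(self, value: str, producer_ts_ms: int, broker_now_ms: int) -> Record:
        # Offsets are sequential within the partition, whatever the timestamps say.
        offset = len(self.records)
        if self.timestamp_type == "LogAppendTime":
            ts = broker_now_ms   # broker clock wins, producer timestamp is discarded
        else:
            ts = producer_ts_ms  # producer (or application) timestamp is kept
        record = Record(offset, ts, value)
        self.records.append(record)
        return record


create_log = PartitionLog("CreateTime")
append_log = PartitionLog("LogAppendTime")
# Same record, same broker arrival time, but the producer clock is 4 s ahead.
kept = create_log.append("a", producer_ts_ms=5_000, broker_now_ms=1_000)
stamped = append_log.append("a", producer_ts_ms=5_000, broker_now_ms=1_000)
print(kept.offset, kept.timestamp_ms)        # 0 5000
print(stamped.offset, stamped.timestamp_ms)  # 0 1000
```

In both cases the offset is decided purely by arrival order; only the source of the timestamp changes.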
Does Timestamp Order Correspond to Offset Order?
Within a Kafka partition, offset order is guaranteed: if record A has a lower offset than record B, then A was appended before B. Timestamps carry no such guarantee. How closely they track offsets depends on the timestamp type in use and on the usual hazards of distributed systems:
- Creation Time (Producer Timestamps): The timestamp is assigned when the producer creates the record, so it is subject to clock skew across producers. If two records `A` and `B` come from different producers, and `A` is produced first by a producer whose clock runs ahead, `A` can carry a higher timestamp than `B` while having a lower offset.
- Log Append Time: The timestamp is assigned by the broker when the record is appended to the log, so timestamp order generally follows offset order. Appends to a partition are serialized by its leader, so the remaining risk is the broker's wall clock rather than concurrency: a clock adjustment, or a leader failover to a broker whose clock is behind, can produce a timestamp lower than that of an earlier offset. Records appended in the same batch can also share an identical timestamp.
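One way to see whether timestamps follow offsets in a given partition is to scan the records in offset order and flag every place where the timestamp goes down. The sketch below assumes the records have already been read into a list of `(offset, timestamp_ms)` tuples; the function name and the sample data are made up for illustration.

```python
from typing import List, Tuple


def timestamp_inversions(records: List[Tuple[int, int]]) -> List[Tuple[int, int]]:
    """Return (previous_offset, current_offset) pairs where the timestamp decreases.

    `records` must be (offset, timestamp_ms) tuples sorted by offset.
    Equal timestamps are not counted as inversions.
    """
    inversions = []
    for (prev_off, prev_ts), (cur_off, cur_ts) in zip(records, records[1:]):
        if cur_ts < prev_ts:
            inversions.append((prev_off, cur_off))
    return inversions


# Hypothetical partition contents: offset 1 came from a producer whose clock is ahead.
skewed = [(0, 1_000), (1, 6_000), (2, 1_200), (3, 1_300)]
print(timestamp_inversions(skewed))  # [(1, 2)]
```

An empty result means timestamps are non-decreasing in offset order for the records inspected; it says nothing about records not yet scanned.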
Detailed Example
Consider a scenario with two Kafka producers, Producer 1 and Producer 2, with their system clocks out of sync:
- Producer 1 sends record `A` at system time `12:00:00`, synchronously followed by record `B` at `12:00:02`.
- Producer 2, whose clock is 5 seconds ahead, sends `C` at its local `12:00:03` (which is `11:59:58` global time).
If timestamps are producer-based (creation time), records will have timestamps that may not align with their offsets. The offsets will accurately reflect the sequence in which records are appended to the log:
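- Record `C` is stored before `A` and `B` if it reaches the broker first (it was sent at `11:59:58` global time), yet it carries the latest timestamp, `12:00:03`, because Producer 2's clock is ahead. Offset order is then `C`, `A`, `B`, while creation-time order is `A`, `B`, `C`.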
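The scenario can be replayed in a few lines. The sketch below makes two simplifying assumptions that are not stated in the example: Producer 1's clock matches global time, and records reach the broker in the order they were actually sent (no network reordering). The calendar date is arbitrary.

```python
from datetime import datetime, timedelta

BASE = datetime(2024, 1, 1, 12, 0, 0)  # arbitrary date; only the time of day matters

# (record, true global send time, clock skew of the sending producer)
sends = [
    ("A", BASE, timedelta(0)),                                   # Producer 1
    ("B", BASE + timedelta(seconds=2), timedelta(0)),            # Producer 1
    ("C", BASE - timedelta(seconds=2), timedelta(seconds=5)),    # Producer 2, 5 s ahead
]

# Assumption: the broker appends records in true send order.
arrivals = sorted(sends, key=lambda s: s[1])

# Offset comes from arrival order; CreateTime comes from the producer's local clock.
log = [
    (offset, name, (sent + skew).strftime("%H:%M:%S"))
    for offset, (name, sent, skew) in enumerate(arrivals)
]
for entry in log:
    print(entry)
# (0, 'C', '12:00:03')
# (1, 'A', '12:00:00')
# (2, 'B', '12:00:02')

by_offset = [name for _, name, _ in log]
by_timestamp = [name for _, name, _ in sorted(log, key=lambda e: e[2])]
print(by_offset, by_timestamp)  # ['C', 'A', 'B'] ['A', 'B', 'C']
```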
Summary Table
| Criteria | Offset Order | Timestamp Order (Creation Time) | Timestamp Order (Log Append Time) |
| --- | --- | --- | --- |
| Ordering guarantee | Strict order guaranteed by Kafka | None; affected by producer clock skew | Closely follows offsets; can regress if the broker clock moves backward or the leader changes |
| Uniqueness per partition | Unique | Non-unique (potential duplicates) | Non-unique (records in one batch can share a timestamp) |
| Dependency | Kafka only | Producer clocks and application code | Broker clock |
Conclusion
Offsets are the reliable source of record ordering within a Kafka partition. Timestamps may or may not follow that order, and under creation time they can diverge from it substantially. The timestamp type should therefore be chosen with the system's requirements in mind, and producer clocks should be kept synchronized if creation time is used.
When timestamp order needs to track offset order as closely as possible, log append time is the better choice: temporal queries yield more predictable results, at the cost of recording when the broker received the record rather than when the event occurred.
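To illustrate why this matters for temporal queries, the sketch below mimics the documented semantics of the Java consumer's `offsetsForTimes` lookup, which returns the earliest offset whose timestamp is greater than or equal to the target. It is a linear scan over an in-memory list, not the broker's index-based implementation, and the function name and data are hypothetical.

```python
from typing import List, Optional, Tuple


def offset_for_time(records: List[Tuple[int, int]], target_ts_ms: int) -> Optional[int]:
    """Earliest offset whose timestamp is >= target, or None if there is none.

    `records` must be (offset, timestamp_ms) tuples sorted by offset.
    """
    for offset, ts in records:
        if ts >= target_ts_ms:
            return offset
    return None


# Creation-time timestamps with one skewed producer (offset 1).
records = [(0, 1_000), (1, 6_000), (2, 1_200), (3, 1_300)]

start = offset_for_time(records, 5_000)
print(start)  # 1

# Reading forward from `start` still yields records older than the target.
older_than_target = [off for off, ts in records if off >= start and ts < 5_000]
print(older_than_target)  # [2, 3]
```

When timestamps are not monotonic, a consumer that seeks by time may still need to filter on each record's timestamp; with non-decreasing log append timestamps the filter would never drop anything.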

