Setting delete.retention.ms when creating a topic in Kafka 0.8.1

Apache Kafka is a distributed streaming platform used primarily for building real-time data pipelines and streaming applications, and it is capable of handling trillions of events a day. One of the topic-level settings available in version 0.8.1 is delete.retention.ms, which controls how long delete markers (tombstones) are kept in log-compacted topics.

Understanding delete.retention.ms

In Kafka, data within a topic is organized into partitions, and each partition is an ordered, immutable sequence of records that is continually appended to. Retention settings determine how long data is kept before being deleted, which is essential for managing disk space and keeping the amount of data stored in Kafka bounded.

delete.retention.ms is a configuration that determines how long Kafka retains a delete marker (tombstone) in a log-compacted topic before the marker itself is removed. This setting interacts with the log cleaner, which is responsible for compacting logs and for removing tombstones from disk once the specified retention period has passed.

How delete.retention.ms Works in Kafka 0.8.1

In version 0.8.1, when cleanup.policy is set to delete, old log segments are discarded according to the retention.ms setting. Conversely, if cleanup.policy is set to compact, Kafka retains at least the most recent record for each key indefinitely, unless that key is later written with a null value. A record with a valid key and a null value is a tombstone (delete marker): it indicates that earlier records for that key should be removed during compaction.

The delete.retention.ms setting then defines the period for which these tombstone messages are retained before finally being purged from the log during the subsequent compaction phase. This delay in deletion helps ensure that all consumers have had an adequate time to receive the tombstone event, which marks the message's removal for consumers acting on real-time updates and state changes.
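The behavior described above can be illustrated with a small simulation. This is a minimal sketch, not Kafka's log cleaner: the `Record` type, the `compact` function, and the `timestamp_ms` field are hypothetical names introduced for illustration (messages in 0.8.1 carry no timestamp, and the real cleaner works segment by segment). It only shows the two rules: keep the latest record per key, and drop a tombstone once it is older than delete.retention.ms.

```python
from typing import NamedTuple, Optional


class Record(NamedTuple):
    offset: int
    key: str
    value: Optional[str]   # None models a tombstone (null value)
    timestamp_ms: int      # hypothetical write time, used only by this sketch


def compact(log, now_ms, delete_retention_ms):
    """Return the records that survive one simplified compaction pass."""
    # Find the highest offset for every key; older records are superseded.
    latest_offset = {}
    for rec in log:
        latest_offset[rec.key] = rec.offset

    survivors = []
    for rec in log:
        if rec.offset != latest_offset[rec.key]:
            continue  # superseded by a newer record with the same key
        is_tombstone = rec.value is None
        if is_tombstone and now_ms - rec.timestamp_ms > delete_retention_ms:
            continue  # tombstone has outlived delete.retention.ms, purge it
        survivors.append(rec)
    return survivors


DAY_MS = 24 * 60 * 60 * 1000

log = [
    Record(0, "user-1", "alice", 0),
    Record(1, "user-2", "bob", 1_000),
    Record(2, "user-1", None, 2_000),      # tombstone for user-1
    Record(3, "user-2", "bobby", 3_000),
]

# Shortly after the tombstone is written, it is still visible to consumers.
early = compact(log, now_ms=10_000, delete_retention_ms=DAY_MS)
# Once more than delete.retention.ms has passed, the tombstone is gone too.
late = compact(log, now_ms=2_000 + DAY_MS + 1, delete_retention_ms=DAY_MS)

print([(r.key, r.value) for r in early])  # [('user-1', None), ('user-2', 'bobby')]
print([(r.key, r.value) for r in late])   # [('user-2', 'bobby')]
```

In the first pass the tombstone for user-1 survives so that a lagging consumer can still observe the deletion; in the second pass the key disappears from the log entirely.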

Example and Implications

To clarify, let's assume you set delete.retention.ms to 24 hours (86400000 milliseconds). If a tombstone (a record with a null value) is written for a key at noon today, the tombstone remains in the log, marking the deletion, until at least noon tomorrow; it is then removed by a compaction pass that runs after that point. This ensures that consumers or systems lagging by less than a day still learn about the deletion of this key.
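The arithmetic behind this example can be checked with a few lines of Python. This is only a sketch: the date is arbitrary, and the variable names are chosen for illustration. The result is the earliest point at which the tombstone becomes eligible for removal, not the moment it is actually purged, since that depends on when the log cleaner next runs.

```python
from datetime import datetime, timedelta

DELETE_RETENTION_MS = 86_400_000  # 24 hours, the value used in the example

# Hypothetical time at which the tombstone is written (noon, arbitrary date).
tombstone_written = datetime(2024, 1, 1, 12, 0, 0)

retention = timedelta(milliseconds=DELETE_RETENTION_MS)
earliest_purge = tombstone_written + retention

print(retention)        # 1 day, 0:00:00
print(earliest_purge)   # 2024-01-02 12:00:00
```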

Key Points Table

Here is a table summarizing the key details about delete.retention.ms:

Key Point               | Description
Default Value           | 86400000 (24 hours)
Impact of Setting       | Controls how long tombstone records are kept before actual deletion during log compaction.
Relevant Configurations | cleanup.policy (delete, compact), retention.ms
Version Introduced      | The concept exists in Kafka 0.8.1 and has been carried forward with minor modifications.

Conclusion

Understanding and accurately configuring delete.retention.ms alongside other retention parameters in Kafka is crucial for effective log management, especially in systems that manage state or are sensitive to the timeliness of message deletion. Choosing this value carefully helps balance efficient storage management against data consistency across all consumer applications.

Additional Notes

In Kafka configuration, delete.retention.ms is a part of a more extensive configuration framework that involves determining how long data is stored in a topic (retention.ms), how logs are cleaned (cleanup.policy), and how permanent deletions are handled. It's a foundational concept for those looking to manage storage and data lifecycle within Kafka effectively.
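To tie these settings back to topic creation, the sketch below assembles a kafka-topics.sh command line that passes them as per-topic overrides. It assumes the `--zookeeper`, `--create`, and `--config name=value` options of the 0.8.1 command-line tool; the helper `create_topic_command`, the topic name, and the ZooKeeper address are hypothetical. The script only builds and prints the command, so check it against your own installation before running it. Compaction also depends on the broker's log cleaner being enabled (log.cleaner.enable), which should be verified in the broker configuration.

```python
import shlex


def create_topic_command(topic, zookeeper="localhost:2181", partitions=1,
                         replication_factor=1, configs=None):
    """Build the argument list for a kafka-topics.sh --create call."""
    args = [
        "bin/kafka-topics.sh",
        "--zookeeper", zookeeper,
        "--create",
        "--topic", topic,
        "--partitions", str(partitions),
        "--replication-factor", str(replication_factor),
    ]
    # Each topic-level override is passed as its own --config name=value pair.
    for name, value in (configs or {}).items():
        args += ["--config", f"{name}={value}"]
    return args


command = create_topic_command(
    "user-profiles",  # hypothetical topic name
    configs={
        "cleanup.policy": "compact",
        "delete.retention.ms": 86_400_000,  # 24 hours
    },
)

# Print a shell-safe version of the command for review.
print(" ".join(shlex.quote(arg) for arg in command))
```

Building the command as a list rather than a single string keeps each option and value separate, which avoids quoting mistakes if the list is later handed to a process launcher.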

