Kafka logs + how to limit the log size
Apache Kafka, a popular distributed streaming platform, uses logs as a core component of its architecture to store and disseminate data. Kafka logs should not be confused with application logs; in Kafka, logs specifically refer to the records (messages) that are stored within Kafka's topics.
Understanding Kafka Logs
Each topic in Kafka is divided into partitions, and each partition is essentially a log: an ordered, append-only sequence of messages. Each message within a partition is assigned a unique sequential ID called an offset. Kafka replicates these partitions (logs) across multiple brokers to provide fault tolerance and high availability.
The messages stored in a Kafka log are immutable, which simplifies the architecture and makes data handling consistent. Once the data is written to a partition, it can only be read or deleted but not updated.
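To make the offset and immutability ideas concrete, here is a minimal in-memory sketch of a single partition. It is an illustration only, not how Kafka stores data on disk; the `Record` and `PartitionLog` names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)  # frozen: a record cannot be modified after creation
class Record:
    offset: int
    key: str
    value: str


@dataclass
class PartitionLog:
    """Hypothetical model of one partition: append-only, offset-addressed."""
    records: list = field(default_factory=list)
    next_offset: int = 0

    def append(self, key, value):
        # New records always go to the end and get the next sequential offset.
        record = Record(self.next_offset, key, value)
        self.records.append(record)
        self.next_offset += 1
        return record.offset

    def read(self, from_offset):
        # Consumers read forward from an offset; nothing is updated in place.
        return [r for r in self.records if r.offset >= from_offset]


log = PartitionLog()
log.append("user-1", "created")
log.append("user-2", "created")
log.append("user-1", "updated")
print([r.offset for r in log.read(1)])  # [1, 2]
```

Note that the "update" to user-1 is a new record at offset 2; the record at offset 0 is left untouched.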
Log Retention and Management
Given the immutable nature of the logs and potentially high volumes of data being ingested, managing log size is crucial. Kafka offers several configurations to help manage and limit the size of logs:
1. Time-Based Retention
Logs can be configured to keep messages for a specific amount of time. Once the set period is over, older messages are purged. The relevant configuration parameters include:
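- log.retention.hours
- log.retention.minutes
- log.retention.ms
By default, Kafka retains messages for 7 days (log.retention.hours=168), but this can be adjusted as needed. If more than one of these is set, log.retention.ms takes precedence over log.retention.minutes, which in turn takes precedence over log.retention.hours.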
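Time-based retention can be pictured with a small sketch. This is a simplified model that assumes whole segment files are removed once their newest record is older than the retention period and that the segment currently being written is kept; the `Segment` class and `expired_segments` function are hypothetical names, not Kafka APIs.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    base_offset: int
    max_timestamp_ms: int  # timestamp of the newest record in the segment


def expired_segments(segments, now_ms, retention_ms):
    """Return closed segments whose newest record is older than retention_ms."""
    closed = segments[:-1]  # assumption: the last (active) segment is kept
    return [s for s in closed if now_ms - s.max_timestamp_ms > retention_ms]


HOUR_MS = 60 * 60 * 1000
now = 200 * HOUR_MS
segments = [
    Segment(0, now - 200 * HOUR_MS),
    Segment(1000, now - 170 * HOUR_MS),
    Segment(2000, now - 100 * HOUR_MS),
    Segment(3000, now - 1 * HOUR_MS),  # active segment
]

# 168 hours corresponds to the 7-day retention period.
expired = expired_segments(segments, now, retention_ms=168 * HOUR_MS)
print([s.base_offset for s in expired])  # [0, 1000]
```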
2. Size-Based Retention
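Apart from time, logs can also be limited by size. The limit applies to each partition, not to the topic as a whole: once a partition's log grows past the configured size, Kafka deletes its oldest segments to make room for new messages. Configurations include:
- log.retention.bytes (broker-wide default)
- retention.bytes (per-topic override)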
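The following sketch models size-based cleanup for one partition. It is a simplification that assumes whole segments are deleted oldest-first, and only while the remaining log would still be at or above the limit, so a partition can sit somewhat above the configured size. The function name `segments_to_delete` is hypothetical.

```python
def segments_to_delete(segment_sizes, retention_bytes):
    """Count how many of the oldest segments would be deleted.

    segment_sizes is ordered oldest first; the last entry is the active segment.
    """
    if retention_bytes < 0:  # -1 means no size limit
        return 0
    excess = sum(segment_sizes) - retention_bytes
    deleted = 0
    for size in segment_sizes[:-1]:  # assumption: the active segment is kept
        if excess - size < 0:
            break  # deleting this segment would drop the log below the limit
        excess -= size
        deleted += 1
    return deleted


GIB = 1024 ** 3
sizes = [GIB, GIB, GIB, GIB // 2]  # 3.5 GiB in total
print(segments_to_delete(sizes, retention_bytes=2 * GIB))  # 1
```

In this example only one segment is removed, leaving 2.5 GiB on disk against a 2 GiB limit, which is why the segment size matters when planning disk capacity.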
3. Log Compaction
Log compaction retains at least the most recent message for each key within a partition log, independent of time- or size-based retention. It is enabled per topic with cleanup.policy=compact (or with log.cleanup.policy at the broker level). This is particularly useful for topics that reflect state changes where only the latest state is relevant.
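The effect of compaction can be shown with a short sketch. It only models the end result (the latest record per key survives and offsets are preserved), not the background cleaner that does the work in Kafka; `compact` is a hypothetical helper.

```python
def compact(records):
    """Keep only the latest record for each key.

    records is a list of (offset, key, value) tuples in offset order.
    """
    latest_offset = {}
    for offset, key, _value in records:
        latest_offset[key] = offset  # later records overwrite earlier ones
    # Surviving records keep their original offsets and relative order.
    return [r for r in records if latest_offset[r[1]] == r[0]]


records = [
    (0, "user-1", "a"),
    (1, "user-2", "b"),
    (2, "user-1", "c"),
    (3, "user-3", "d"),
    (4, "user-2", "e"),
]
print(compact(records))
# [(2, 'user-1', 'c'), (3, 'user-3', 'd'), (4, 'user-2', 'e')]
```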
4. Segment Files
Kafka stores logs across multiple files called segments. Managing these segments effectively also plays a role in controlling the size of logs. Configuration options include:
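- log.segment.bytes (maximum size of each log segment file)
- log.roll.ms or log.roll.hours (time after which Kafka closes the current segment file and starts a new one, even if it is not full; the topic-level equivalent is segment.ms)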
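As a rough illustration of size-based segment rolling, the sketch below groups record sizes into segments and starts a new segment when the next record would push the current one past the limit. It ignores time-based rolling and index files, and `roll_segments` is a hypothetical name.

```python
def roll_segments(record_sizes, segment_bytes):
    """Group record sizes into segments of at most segment_bytes each."""
    segments = [[]]
    current = 0
    for size in record_sizes:
        # Roll to a new segment if this record would overflow the current one.
        if segments[-1] and current + size > segment_bytes:
            segments.append([])
            current = 0
        segments[-1].append(size)
        current += size
    return segments


print(roll_segments([400, 400, 400, 700, 200], segment_bytes=1000))
# [[400, 400], [400], [700, 200]]
```

Smaller segments let retention reclaim space in finer steps, at the cost of more files per partition.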
Configuring Log Retention Policies
To limit the log size, adjust Kafka's broker or topic-specific settings. For example, to cap each partition's log at 1 GB and retain data for only 3 days, set log.retention.bytes to 1073741824 and log.retention.hours to 72 in Kafka's server properties file (server.properties).
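As a fragment of server.properties, the broker-level example would look roughly like this (the values are the 1 GB and 3 days from the example; adjust them to your own requirements):

```properties
# server.properties (broker-wide defaults)

# Cap each partition's log at about 1 GB (1073741824 bytes)
log.retention.bytes=1073741824

# Keep data for 3 days
log.retention.hours=72
```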
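Alternatively, the same limits can be applied to a single topic through the topic-level settings retention.bytes and retention.ms, which override the broker defaults for that topic.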
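A sketch of the topic-level equivalent is shown below. Topic-level keys drop the "log." prefix, and time is given in milliseconds. The topic name my-topic and the broker address localhost:9092 in the comment are placeholders, not values from this article.

```properties
# Topic-level overrides; one way to apply them is the kafka-configs.sh tool:
#   kafka-configs.sh --bootstrap-server localhost:9092 \
#     --entity-type topics --entity-name my-topic --alter \
#     --add-config retention.bytes=1073741824,retention.ms=259200000

# Cap each partition of this topic at about 1 GB
retention.bytes=1073741824

# 3 days = 3 * 24 * 60 * 60 * 1000 ms
retention.ms=259200000
```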
Log Management Best Practices
Practices that contribute to effective log management include:
- Estimate Data Growth: Understand the potential growth in data volume to set adequate log retention and segment sizes.
- Monitor Disk Usage: Regularly check disk usage and adjust log retention and compaction policies if required.
- Use Log Compaction: For keyed topics where only the latest state per key matters, use log compaction instead of relying solely on size or time-based retention.
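For the first two practices, a back-of-the-envelope estimate can help. The sketch below is a rough upper bound under stated assumptions: size-based retention is the binding limit, each partition can exceed retention.bytes by up to one segment, and index files and other overhead are ignored. The function name and the example numbers are hypothetical.

```python
def worst_case_disk_bytes(partitions, replication_factor,
                          retention_bytes, segment_bytes):
    """Rough upper bound on disk used by one topic across the cluster."""
    # Assumption: a partition can hold up to one extra segment beyond the limit.
    per_partition = retention_bytes + segment_bytes
    return partitions * replication_factor * per_partition


GIB = 1024 ** 3
total = worst_case_disk_bytes(
    partitions=12,
    replication_factor=3,
    retention_bytes=1 * GIB,
    segment_bytes=1 * GIB,
)
print(total / GIB)  # 72.0
```

Comparing an estimate like this with observed disk usage is one way to decide whether retention or segment sizes need adjusting.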
Summary Table
| Configuration Key | Description | Default Value | Use Case |
| --- | --- | --- | --- |
| log.retention.hours | Maximum time to retain log data in hours | 168 (7 days) | Time-based log retention |
| log.retention.bytes | Maximum size of log before deletion | -1 (unlimited) | Size-based log retention |
| log.segment.bytes | Maximum log segment file size | 1,073,741,824 bytes | Segmentation of log files |
| log.cleaner.enable | Enable log compaction | true | Keeping only the latest records |
Retention policies, segment settings, and log compaction together determine how much disk space a Kafka deployment uses. Tuning them keeps data volume predictable and manageable, and helps balance performance against storage requirements in production environments.

