Kafka - Broker Message size too large
Apache Kafka is a distributed streaming platform capable of handling trillions of events a day. Originally designed by LinkedIn and subsequently open-sourced, Kafka is widely used to collect and deliver high volumes of data with low latency. As part of its operation, Kafka can sometimes encounter an issue known as "Message size too large", which can affect the performance and reliability of data transmission.
Understanding Kafka's Message Size Issue
Kafka works by allowing producers to send messages to topics, from where consumers can read these messages in real-time or as needed. Each message consists of a key, a value, and optionally some headers. Kafka stores and transmits these messages in batches to optimize network and disk usage.
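However, each Kafka broker has a limit on the maximum size of the messages it will accept, and producers and consumers have size limits of their own. If a message exceeds the broker's limit, the broker rejects the produce request and the producing application receives an error. With the Java producer this usually surfaces as "RecordTooLargeException"; older clients reported "MessageSizeTooLargeException". On the consuming side, a fetch limit set lower than the largest stored message can leave a consumer unable to read it.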
Causes of Large Message Sizes
- Bulk Data Transmission: Sometimes, applications try to send large amounts of data in a single message rather than breaking it down into smaller, manageable chunks.
- Configuration Settings: Broker, producer, and consumer size limits that are left at their defaults, or raised on one side but not the others, can cause messages to be rejected unexpectedly.
- Serialization Formats: The choice of serialization (e.g., JSON, Avro, Protobuf) might result in unexpectedly large messages if not handled properly.
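To see how much the encoding alone can change a message's size, the following is a minimal sketch that serializes one record a few different ways and compares the results against a limit. The record contents, the `serialized_sizes` helper, and the `LIMIT_BYTES` value are assumptions made for illustration; they are not taken from Kafka or from any particular client library.

```python
import gzip
import json

# Assumed limit for illustration only; check the value configured on your brokers.
LIMIT_BYTES = 1_000_000


def serialized_sizes(record: dict) -> dict:
    """Return the size in bytes of the same record under a few encodings."""
    pretty = json.dumps(record, indent=2).encode("utf-8")
    compact = json.dumps(record, separators=(",", ":")).encode("utf-8")
    return {
        "json_pretty": len(pretty),
        "json_compact": len(compact),
        "json_compact_gzip": len(gzip.compress(compact)),
    }


# Hypothetical record: a batch of sensor readings packed into one message.
record = {
    "sensor_id": "s-1",
    "readings": [{"t": i, "value": 20.5} for i in range(20_000)],
}

sizes = serialized_sizes(record)
for name, size in sizes.items():
    status = "over" if size > LIMIT_BYTES else "under"
    print(f"{name}: {size} bytes ({status} the assumed limit)")
```

Running a check like this on representative payloads before they reach the producer is one way to find out whether the format, rather than the data itself, is pushing messages over the limit.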
Configuring Kafka to Handle Larger Messages
Kafka provides several configuration options that can be adjusted to accommodate larger messages:
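- message.max.bytes: The largest record batch size allowed by the broker, measured after compression if compression is enabled.
- replica.fetch.max.bytes: The maximum amount of data a follower broker will try to fetch per partition when replicating.
- fetch.message.max.bytes: Controls the maximum number of bytes a consumer will fetch per partition in a single request. This is the legacy consumer's name for the setting; the current Java consumer uses max.partition.fetch.bytes instead.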
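To handle large messages, raise these settings together, along with the producer's max.request.size, which caps the size of a request the producer will send. Increase them carefully: larger limits mean larger buffers on brokers and clients, which raises memory use and the risk of out-of-memory errors.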
Example: Configuring Kafka Broker
Suppose you need to adjust your Kafka broker to handle messages up to 10MB. You would modify the server configuration (server.properties) as follows:
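A minimal sketch of the relevant broker entries is shown here. The values are assumptions taken from the suggested settings in the summary table (10,000,000 bytes, roughly 10MB); adjust them to your own payloads and available memory.

```properties
# server.properties (broker side)

# Largest record batch the broker will accept, in bytes (about 10MB).
message.max.bytes=10000000

# Usually kept at least as large as message.max.bytes so that
# followers can replicate the largest accepted batches.
replica.fetch.max.bytes=10000000
```

Brokers read server.properties at startup, so a change made in this file takes effect after the broker is restarted. Producer and consumer limits are set separately in the client configuration.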
Best Practices for Managing Large Kafka Messages
Managing large messages efficiently involves more than just tweaking configurations. Here are some recommendations:
- Chunking: Break large datasets into smaller messages if possible.
- Compression: Use compression algorithms like GZIP or Snappy to reduce the size of the messages being transmitted.
- Monitoring: Regularly monitor the sizes of the messages being produced and consumed to proactively manage potential issues.
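The chunking recommendation can be sketched as follows. This is an illustrative outline, not a Kafka API: the `chunk_payload` and `reassemble` helpers and the chunk fields (`message_id`, `index`, `total`, `data`) are hypothetical names chosen for this example, and sending each chunk to Kafka is left to whichever client library you use.

```python
import math
import uuid
from typing import Iterable, Iterator


def chunk_payload(payload: bytes, max_chunk_bytes: int) -> Iterator[dict]:
    """Split a payload into ordered chunks that share one message id."""
    if max_chunk_bytes <= 0:
        raise ValueError("max_chunk_bytes must be positive")
    message_id = uuid.uuid4().hex
    # An empty payload still produces a single (empty) chunk.
    total = max(1, math.ceil(len(payload) / max_chunk_bytes))
    for index in range(total):
        start = index * max_chunk_bytes
        yield {
            "message_id": message_id,
            "index": index,
            "total": total,
            "data": payload[start:start + max_chunk_bytes],
        }


def reassemble(chunks: Iterable[dict]) -> bytes:
    """Rebuild the original payload from all chunks of one message."""
    ordered = sorted(chunks, key=lambda c: c["index"])
    if not ordered or len(ordered) != ordered[0]["total"]:
        raise ValueError("missing chunks")
    return b"".join(c["data"] for c in ordered)


# Example: a 2.5 MB payload split into chunks of at most 1,000,000 bytes.
payload = bytes(2_500_000)
chunks = list(chunk_payload(payload, 1_000_000))
print(len(chunks), [len(c["data"]) for c in chunks])
```

In a real pipeline, one common choice is to use the shared message id as the record key so that all chunks of a message land on the same partition and arrive in order. The chunk size should also leave headroom for the key, headers, and batch overhead rather than matching the broker limit exactly.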
Summary Table
| Configuration Key | Purpose | Default Value (Bytes) | Suggested Large Setting (Bytes) |
| --- | --- | --- | --- |
| message.max.bytes | Maximum size of a record batch that the broker will accept | 1,048,588 (1,000,012 in older releases) | 10,000,000 |
| replica.fetch.max.bytes | Maximum data per partition that a follower will fetch when replicating | 1,048,576 | 10,000,000 |
| fetch.message.max.bytes | Maximum data per partition that a legacy consumer will fetch per request | 1,048,576 | 10,000,000 |
Conclusion
Handling large messages in Kafka requires careful configuration and a good understanding of both the system's capabilities and the nature of the data being processed. By adjusting Kafka's configuration and adhering to best practices for data management, you can ensure that your Kafka setup continues to operate efficiently and effectively, even with larger messages.

