Kafka Partitioner Class: Assigning Messages to Partitions Within a Topic Using a Key
Apache Kafka is a distributed streaming platform capable of handling trillions of events a day. Initially conceived as a messaging queue, Kafka is based on an abstraction of a distributed commit log. Since Kafka is a distributed system, topics are partitioned and replicated across multiple nodes.
What is a Kafka Partitioner?
A Kafka partitioner determines which partition of a topic each message is sent to. When producing a message to a Kafka topic, you can specify a key. The partitioner uses this key to assign the message to a specific partition, so all messages with the same key end up in the same partition as long as the number of partitions does not change. If no key is provided, the default partitioner spreads messages across partitions to balance the load: older clients rotate through partitions round-robin, while clients since Kafka 2.4 use a "sticky" strategy that fills a batch for one partition before moving to the next.
Why Partition Messages?
Partitioning messages in Kafka has several benefits:
- Scalability: The partitions of a topic can be spread over multiple brokers, so a topic is not limited to the capacity of a single machine.
- Ordering: Within a partition, Kafka preserves the order in which messages were appended. There is no ordering guarantee across partitions, which is why related messages (for example, the transactions of one account) should share a key.
- Parallel Processing: Different partitions can be written and read in parallel, so adding partitions lets more producers and consumers work at the same time.
How Does the Partitioner Work?
When a message is produced, the producer specifies a topic and optionally a key. The default partitioner provided by Kafka works as follows:
- If a key is specified: The partitioner hashes the serialized key bytes (Kafka uses the murmur2 hash) and takes the result modulo the number of partitions in the topic.
- If no key is specified: The messages are spread across the available partitions to balance the load, round-robin in older clients and batch by batch ("sticky") since Kafka 2.4.
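The two cases above can be modeled in a few lines of plain Java. This is a minimal sketch, not Kafka's actual code: the class name DefaultPartitioningSketch is made up for illustration, Arrays.hashCode stands in for Kafka's murmur2 hash, and the keyless branch models simple round-robin rotation.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative model of default-style partition selection; not Kafka's code.
final class DefaultPartitioningSketch {
    // Counter used only for keyless messages (round-robin rotation).
    private final AtomicInteger counter = new AtomicInteger(0);

    /** Returns a partition index in the range [0, numPartitions). */
    int partitionFor(byte[] keyBytes, int numPartitions) {
        if (numPartitions <= 0) {
            throw new IllegalArgumentException("numPartitions must be positive");
        }
        if (keyBytes == null) {
            // No key: rotate through the partitions to balance the load.
            return (counter.getAndIncrement() & 0x7fffffff) % numPartitions;
        }
        // Stand-in hash (assumption): Kafka's default partitioner uses murmur2.
        int hash = Arrays.hashCode(keyBytes);
        // Clear the sign bit so the modulo result is never negative.
        return (hash & 0x7fffffff) % numPartitions;
    }

    public static void main(String[] args) {
        DefaultPartitioningSketch p = new DefaultPartitioningSketch();
        byte[] key = "order-42".getBytes(StandardCharsets.UTF_8);
        // The same key maps to the same partition on every call.
        System.out.println("keyed:   " + p.partitionFor(key, 6));
        System.out.println("keyed:   " + p.partitionFor(key, 6));
        // Keyless messages move to the next partition on each call.
        System.out.println("keyless: " + p.partitionFor(null, 6));
        System.out.println("keyless: " + p.partitionFor(null, 6));
    }
}
```

Because the partition is computed modulo the partition count, adding partitions to a topic can change which partition an existing key maps to.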
Custom Partitioner
To implement custom logic for partition assignment, you must implement the org.apache.kafka.clients.producer.Partitioner interface, which includes three methods:
- configure(Map<String, ?> configs): This method allows you to retrieve any necessary configuration.
- partition(String topic, Object key, byte[] keyBytes, Object value, byte[] valueBytes, Cluster cluster): This method contains the core logic for assigning a partition based on the key.
- close(): This is used to clean up resources.
Example of a Custom Partitioner:
Here's an example of how one might implement a custom partitioner that assigns messages to partitions based on the hash code of the key modulo the number of partitions.
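A minimal sketch of that selection logic is shown here as a plain static method, so it runs without the Kafka client library. In a real partitioner it would form the body of partition(...), with the partition count taken from cluster.partitionsForTopic(topic).size(). The class name SensorPartitioningSketch and the key value "speed-sensor" are assumptions made for illustration.

```java
// Partition-selection logic for a custom partitioner, kept free of Kafka
// dependencies so it can run on its own. In a real implementation this would
// be called from Partitioner.partition(...).
final class SensorPartitioningSketch {
    // Hypothetical key that identifies the speed sensor (assumption).
    static final String SPEED_SENSOR_KEY = "speed-sensor";

    /** Returns a partition index in the range [0, numPartitions). */
    static int choosePartition(String key, int numPartitions) {
        if (numPartitions <= 0) {
            throw new IllegalArgumentException("numPartitions must be positive");
        }
        if (key == null) {
            throw new IllegalArgumentException("this sketch requires a key");
        }
        if (SPEED_SENSOR_KEY.equals(key)) {
            // Special case: speed-sensor messages always go to the last partition.
            return numPartitions - 1;
        }
        // All other keys: hash code of the key modulo the number of partitions.
        // Masking the sign bit avoids negative results (Math.abs would fail
        // for Integer.MIN_VALUE). Other keys may also land on the last
        // partition; it is not exclusive to the speed sensor.
        return (key.hashCode() & 0x7fffffff) % numPartitions;
    }

    public static void main(String[] args) {
        System.out.println(choosePartition("speed-sensor", 5)); // last partition: 4
        System.out.println(choosePartition("temp-sensor", 5));  // chosen by key hash
    }
}
```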
In this example, we provide special handling for messages from a specific "speed sensor," which always go to the last partition, while other messages are distributed based on their key hash.
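A producer picks up a custom partitioner through its partitioner.class setting. The sketch below builds such a configuration with plain java.util.Properties. The broker address and the class name com.example.SensorPartitioner are placeholders, not values from a real deployment.

```java
import java.util.Properties;

// Builds producer settings that register a custom partitioner.
final class ProducerConfigSketch {
    static Properties producerProperties() {
        Properties props = new Properties();
        // Placeholder broker address (assumption).
        props.setProperty("bootstrap.servers", "localhost:9092");
        props.setProperty("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.setProperty("value.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        // Hypothetical class name: point this at your own Partitioner implementation.
        props.setProperty("partitioner.class", "com.example.SensorPartitioner");
        return props;
    }

    public static void main(String[] args) {
        System.out.println(producerProperties().getProperty("partitioner.class"));
    }
}
```

These properties would then be passed to the KafkaProducer constructor. The partitioner class must be on the producer's classpath.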
Summary Table
| Feature | Description |
| --- | --- |
| Scalability | Kafka partitions enhance scalability by spreading data across brokers. |
| Ordering | Messages in the same partition are guaranteed to be in order. |
| Parallel Processing | Different partitions can be read in parallel, increasing throughput. |
| Default Partitioning | Keyed messages are hashed to a partition; keyless messages are spread across partitions (round-robin in older clients, sticky batching since Kafka 2.4). |
| Custom Partitioning Logic | Allows implementing complex logic based on scenario needs. |
Conclusion
Kafka partitions are a powerful feature for managing data distribution and achieving high throughput and scalability. By using or customizing partitioners, developers can control how messages are distributed across partitions to meet specific requirements of their application’s architecture and performance needs.

