How to change a topic's partition leader or remove a partition after a broker goes down
When operating a Kafka cluster, managing brokers and partitions efficiently is crucial for maintaining system reliability and performance, especially in cases where a broker goes down. This article explains how to change the topic leader or remove a partition after some brokers become unavailable.
Understanding Kafka Leadership and Partition Management
Apache Kafka is a distributed streaming platform that uses a cluster of brokers to store and manage records in a fault-tolerant manner. Each topic is split into partitions for scalability, and each partition has one leader and zero or more followers. By default, the leader handles all produce and fetch requests for the partition, while the followers replicate the leader's log to provide redundancy.
When a broker goes down, it's essential to ensure that the partitions for which it was a leader have their leadership transferred to another broker and to rebalance the cluster accordingly.
Changing the Topic Leader
When a broker that leads a partition fails, Kafka's controller automatically elects a new leader from the partition's remaining in-sync replicas (ISR). Manual intervention is needed only in some cases, for example to move leadership back after the broker recovers, to spread leaders more evenly, or to recover a partition that has no in-sync replica left.
Manual Leader Election
Kafka provides tools under its bin directory to manually control the leadership of partitions. Here’s how you can change the leader:
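- Identify partitions on the down broker: First, identify which partitions had a replica or their leader on the downed broker. Run bin/kafka-topics.sh --describe --bootstrap-server [your-broker-list] to list all topics and partitions along with their current leader, replica list, and ISR. Adding --under-replicated-partitions or --unavailable-partitions narrows the output to partitions that are missing replicas or have no leader.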
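- Elect a new leader: Use the kafka-leader-election.sh tool to trigger a preferred replica election, for example bin/kafka-leader-election.sh --bootstrap-server [your-broker-list] --election-type preferred --topic [topic-name] --partition [partition-number]. The preferred replica is the first broker in the partition's replica list, and the election moves leadership to it only if that broker is alive and in sync.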
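If the preferred replica sits on the failed broker, this election changes nothing and leadership stays with the replica the controller already chose. If no in-sync replica survives at all, an unclean election (--election-type unclean) can bring the partition back online, at the risk of losing messages that the new leader never received.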
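The two steps above could be sketched as a pair of shell helpers. This is a minimal sketch, not a definitive procedure: the bootstrap address `localhost:9092` and the function names are hypothetical, and the `awk` filter assumes the `Topic: ... Partition: ... Leader: ...` field layout that `kafka-topics.sh --describe` prints, which may differ between Kafka versions.

```shell
# Assumed bootstrap address; replace with your own broker list.
BOOTSTRAP="localhost:9092"

# Read `kafka-topics.sh --describe` output on stdin and print
# "topic partition leader" for every partition line.
# Topic summary lines (no "Partition:" field) are skipped.
partition_leaders() {
  awk '
    {
      topic = ""; part = ""; leader = ""
      for (i = 1; i < NF; i++) {
        if ($i == "Topic:")     topic  = $(i + 1)
        if ($i == "Partition:") part   = $(i + 1)
        if ($i == "Leader:")    leader = $(i + 1)
      }
      if (part != "") print topic, part, leader
    }'
}

# Step 1: list the current leader of every partition in the cluster.
show_leaders() {
  bin/kafka-topics.sh --describe --bootstrap-server "$BOOTSTRAP" | partition_leaders
}

# Step 2: request a preferred replica election for one partition.
# Usage: elect_preferred TOPIC PARTITION
elect_preferred() {
  bin/kafka-leader-election.sh --bootstrap-server "$BOOTSTRAP" \
    --election-type preferred --topic "$1" --partition "$2"
}
```

Calling `show_leaders` before and after `elect_preferred my-topic 0` (topic name assumed) shows whether leadership actually moved.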
Removing a Partition
There might be scenarios where a partition needs to be removed, perhaps because it is no longer needed or because its broker is being decommissioned. Kafka does not support deleting individual partitions or reducing a topic's partition count; the partition count can only grow, and only a whole topic can be deleted. In practice, "removing a partition" after a broker failure means one of two things:
- Removing replicas from the failed broker: If the goal is to take the failed broker out of a partition's replica list, use kafka-reassign-partitions.sh to assign the partition to live brokers. The tool moves partition replicas between brokers; it does not move messages from one partition to another.
- Replacing the topic: To end up with fewer partitions, create a new topic with the desired partition count, copy or replay the data into it, switch producers and consumers over, and then delete the old topic.
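`kafka-reassign-partitions.sh` works from a JSON plan that lists, per partition, the brokers that should hold its replicas. The following is a minimal sketch of building and applying such a plan for a single partition. The helper names, the bootstrap address, and the example broker ids are assumptions for illustration.

```shell
# Assumed bootstrap address; replace with your own broker list.
BOOTSTRAP="localhost:9092"

# Print a reassignment plan for one partition.
# Usage: reassignment_json TOPIC PARTITION REPLICA_IDS
# REPLICA_IDS is a comma-separated broker id list; the first id
# becomes the partition's preferred leader.
reassignment_json() {
  printf '{"version":1,"partitions":[{"topic":"%s","partition":%s,"replicas":[%s]}]}\n' \
    "$1" "$2" "$3"
}

# Write the plan to a temporary file and submit it to the cluster.
# Usage: apply_reassignment TOPIC PARTITION REPLICA_IDS
apply_reassignment() {
  plan="$(mktemp)"
  reassignment_json "$1" "$2" "$3" > "$plan"
  bin/kafka-reassign-partitions.sh --bootstrap-server "$BOOTSTRAP" \
    --reassignment-json-file "$plan" --execute
}
```

For example, if partition 0 of a topic had replicas on brokers 1, 2, and 3 and broker 1 failed, `apply_reassignment my-topic 0 2,3,4` would place the replicas on brokers 2, 3, and 4. Because the first id in the list is the preferred replica, a preferred leader election run after the reassignment completes should move leadership to broker 2. Rerunning the tool with `--verify` in place of `--execute` reports whether the move has finished.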
Summary Table
Here is a summary of key commands and actions to consider:
| Action | Command | Description |
| Describe Topics | bin/kafka-topics.sh --describe --bootstrap-server [your-broker-list] | Lists all topics, partitions, and their current leaders. |
| Leader Election | bin/kafka-leader-election.sh --bootstrap-server [your-broker-list] --election-type preferred --topic [topic-name] --partition [partition-number] | Triggers a preferred replica election for a specific partition; leadership moves only if the preferred replica is in sync. |
| Delete Topic | bin/kafka-topics.sh --delete --topic [topic-name] --bootstrap-server [your-broker-list] | Deletes a topic from the Kafka cluster. |
Additional Points to Consider
- Monitoring and Alerts: Monitor the cluster through Kafka's JMX metrics, Prometheus, or other monitoring software. Alerts on broker availability and on metrics such as UnderReplicatedPartitions and OfflinePartitionsCount make it possible to react before clients are affected.
- Replication Factors: Always maintain an appropriate replication factor for topics to ensure there are enough follower replicas to take leadership in case a broker goes down.
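- Regular Maintenance: Periodically check that leaders are spread evenly across brokers, and run preferred leader elections or partition reassignments when they are not. The broker setting auto.leader.rebalance.enable, which is on by default, lets the controller run preferred elections automatically.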
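As a starting point for the monitoring advice above, a small health check can be built on the `--unavailable-partitions` and `--under-replicated-partitions` filters of `kafka-topics.sh`. This is a sketch under assumptions: the function names and bootstrap address are hypothetical, and a production setup would more likely alert on the JMX metrics instead.

```shell
# Assumed bootstrap address; replace with your own broker list.
BOOTSTRAP="localhost:9092"

# Count partition lines in `kafka-topics.sh --describe` output on stdin.
# `|| true` keeps the exit status at zero when grep finds no match.
count_partitions() {
  grep -c 'Partition: ' || true
}

# Print the number of leaderless and under-replicated partitions and
# return non-zero if either count is above zero.
check_cluster() {
  offline="$(bin/kafka-topics.sh --describe --bootstrap-server "$BOOTSTRAP" \
    --unavailable-partitions | count_partitions)"
  under="$(bin/kafka-topics.sh --describe --bootstrap-server "$BOOTSTRAP" \
    --under-replicated-partitions | count_partitions)"
  echo "offline=$offline under_replicated=$under"
  [ "$offline" -eq 0 ] && [ "$under" -eq 0 ]
}
```

Run from cron or a monitoring agent, a non-zero exit status from `check_cluster` can serve as the alert condition.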
Handling Kafka broker downtimes effectively requires understanding of Kafka internals as well as proactive and reactive management strategies. Following the outlined procedures helps in maintaining Kafka's performance and availability even during broker failures.

