
Adding a new ZooKeeper node to a Kafka cluster


Apache Kafka, a distributed streaming platform, relies on Apache ZooKeeper to store cluster metadata and coordinate its brokers. As a Kafka cluster grows, you may need to add a ZooKeeper node to serve more client connections or to improve fault tolerance. Because ZooKeeper needs a majority of its servers to be available, ensembles are normally sized with an odd number of nodes; a fourth node, for example, adds read capacity but tolerates no more failures than three. This article explains the steps and considerations for adding a new ZooKeeper node to the ensemble behind an existing Kafka cluster.

Understanding ZooKeeper in Kafka

ZooKeeper is a centralized service for maintaining configuration information, naming, distributed synchronization, and group services. Kafka brokers depend on the ZooKeeper ensemble for tasks such as broker registration and controller election, so its availability and consistency affect the whole cluster. Current producer and consumer clients connect to the brokers rather than to ZooKeeper directly.

Prerequisites

Before adding a new node to a ZooKeeper ensemble, ensure the following:

  • All existing ZooKeeper nodes are operational and healthy.
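  • The new node and every existing node can reach each other on the client port and on the two ports listed in each server.X entry (2181, 2888, and 3888 in the examples in this article).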
  • Sufficient disk space and memory are available on the new node.
  • The new node's hardware and software environment matches the existing nodes.

Steps to Add a New ZooKeeper Node

1. Install ZooKeeper

Set up ZooKeeper on the new node, using the same version as the existing cluster. Configure the node by copying the configuration file (zoo.cfg) from an existing node and adjusting the following:

  • dataDir: set to a local, empty directory.
  • clientPort: the port on which clients connect to this node.
  • server.X: list every existing ensemble member plus this node, one entry per server, in the form server.X=host:peerPort:electionPort. X must be a unique id (between 1 and 255) that no other server uses.

Example configuration snippet:

 
server.1=zk1.example.com:2888:3888
server.2=zk2.example.com:2888:3888
server.3=zk3.example.com:2888:3888
server.4=newNode.example.com:2888:3888
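
The snippet above grows the ensemble from three servers to four. As a rough check of what that means for fault tolerance, the sketch below applies the usual majority rule (a quorum is more than half of the voting servers). The function names are made up for this illustration and are not part of ZooKeeper.

```shell
#!/bin/sh
# Majority quorum for an ensemble of n voting servers: floor(n/2) + 1.
quorum_size() {
    echo $(( $1 / 2 + 1 ))
}

# Number of servers that can be down while a majority is still available.
tolerated_failures() {
    echo $(( $1 - ($1 / 2 + 1) ))
}

# Compare the ensemble sizes around the example configuration.
for n in 3 4 5; do
    echo "servers=$n quorum=$(quorum_size "$n") tolerated_failures=$(tolerated_failures "$n")"
done
```

Under this rule a four-server ensemble tolerates one failure, the same as three servers, which is why ensembles are usually grown to the next odd size.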

2. Update Existing Nodes

Add the configuration of the new node (server.4) to every existing ZooKeeper server’s configuration file.
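
One way to make this edit repeatable is a small helper that appends the entry only when it is missing. The following is a minimal sketch, not an official tool: the function name add_server_entry is hypothetical, the ports are hard-coded to the 2888:3888 values used in this article's example, and the demo runs against a throwaway file rather than a live zoo.cfg.

```shell
#!/bin/sh
# Append a server.N entry to a zoo.cfg-style file unless that id is already listed.
# Assumes the file ends with a newline, so the new entry starts on its own line.
# Usage: add_server_entry <cfg-file> <id> <host>
add_server_entry() {
    cfg=$1 id=$2 host=$3
    if grep -q "^server\.$id=" "$cfg"; then
        echo "server.$id already present in $cfg"
    else
        # Ports follow the article's example (2888 and 3888).
        printf 'server.%s=%s:2888:3888\n' "$id" "$host" >> "$cfg"
        echo "added server.$id to $cfg"
    fi
}

# Demo against a temporary copy, not a real configuration file.
demo_cfg=$(mktemp)
printf 'server.1=zk1.example.com:2888:3888\n' > "$demo_cfg"
add_server_entry "$demo_cfg" 4 newNode.example.com
add_server_entry "$demo_cfg" 4 newNode.example.com   # second call changes nothing
cat "$demo_cfg"
rm -f "$demo_cfg"
```

Running the helper twice leaves a single server.4 line, so it is safe to re-run while working through the servers.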

3. Restart Existing Nodes

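Restart each existing ZooKeeper node one at a time so that it loads the updated server list. Wait for each node to rejoin the ensemble before restarting the next one, and restart the current leader last to keep the number of leader elections low.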

4. Initialize the New Node

On the new ZooKeeper node, create a myid file in the dataDir with the node's identifier. For example, if the node is server.4 in zoo.cfg, then myid should simply contain:

 
4
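
Creating the file can be scripted as below. This is a minimal sketch: write_myid is a hypothetical helper name, and the demo writes into a scratch directory; on a real node you would pass the dataDir value from zoo.cfg instead.

```shell
#!/bin/sh
# Write the server id into <dataDir>/myid. The file contains only the id.
# Usage: write_myid <dataDir> <id>
write_myid() {
    data_dir=$1 id=$2
    mkdir -p "$data_dir"
    printf '%s\n' "$id" > "$data_dir/myid"
}

# Demo in a scratch directory standing in for dataDir.
scratch_dir=$(mktemp -d)
write_myid "$scratch_dir" 4
cat "$scratch_dir/myid"
rm -rf "$scratch_dir"
```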

5. Start the New Node

Start the ZooKeeper service on the new node, then check its logs for warnings or errors to confirm that it connects to the other nodes.

Verifying the ZooKeeper Node

Once the new node has started, check its logs for unexpected errors and confirm that it has joined the quorum. Use the following command to check the node's status:

 
echo stat | nc newNode.example.com 2181

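Look for the Mode: field in the output. For a node that has joined the ensemble it should read leader or follower; standalone means the node is running on its own and has not picked up the ensemble configuration. On ZooKeeper 3.5.3 and later, four-letter commands other than srvr are disabled unless they are listed in 4lw.commands.whitelist, so stat may have to be enabled first; srvr also reports the Mode: field.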
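
When checking several nodes, it can help to pull out just the Mode: value. The sketch below does that with awk. The names extract_mode and zk_mode are invented for this example, and it assumes nc is installed and the stat command is enabled on the server. The demo line at the end feeds in one hand-written line instead of real server output.

```shell
#!/bin/sh
# Read four-letter-command output on stdin and print the value of the Mode: line.
extract_mode() {
    awk -F': ' '/^Mode:/ { print $2 }'
}

# Query a server and print its mode.
# Usage: zk_mode <host> [port]   (port defaults to 2181)
zk_mode() {
    echo stat | nc "$1" "${2:-2181}" | extract_mode
}

# Offline demo: one hand-written line stands in for real server output.
printf 'Mode: follower\n' | extract_mode
```

For example, zk_mode newNode.example.com would print the mode reported by the new node.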

Maintenance and Monitoring

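After integration, monitor the ZooKeeper ensemble and the Kafka cluster closely for performance issues or errors. For brokers to use the new node, add it to the zookeeper.connect list in each broker's configuration and restart the brokers one at a time. Regularly check disk usage, CPU, and memory consumption, and keep the ensemble's software up to date.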

Summary Table

Aspect                          Detail
ZooKeeper Node Addition Steps   1. Install and configure 2. Update existing nodes 3. Restart existing nodes 4. Initialize new node 5. Start new node
Configuration Changes           server.X=hostname:port:port in zoo.cfg and myid file
Verification                    Use echo stat | nc host 2181 to confirm proper status and mode
Monitoring Needs                Check logs, disk usage, CPU, and memory consumption

By following these steps, you can add a node to the ZooKeeper ensemble behind your Kafka cluster using rolling restarts rather than a full shutdown.

