How does Raft deal with delayed replies to AppendEntries RPCs?
In the Raft distributed consensus algorithm, managing communication and ensuring consistent state across all participating nodes in the face of network delays and partitions is crucial. One of the common scenarios where Raft demonstrates robustness is in handling delayed replies to the AppendEntries Remote Procedure Call (RPC), a critical part of its log replication process.
Understanding AppendEntries RPC
In Raft, the AppendEntries RPC is used primarily for two purposes:
- Log replication: The leader sends AppendEntries messages to replicate log entries across all follower nodes.
- Heartbeat: It serves as a heartbeat to maintain authority over the cluster and prevent new leader elections.
Each AppendEntries RPC includes:
- The term number: to assert the leadership term,
- Log entries: entries to be stored on the followers,
- The index and term of the predecessor log entry: to ensure consistency and log matching,
- The leader’s commit index: to inform followers about committed entries.
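As a rough illustration, the fields above could be modeled as follows. This is a minimal sketch, not a reference implementation: the field names are Python renderings of the argument names used in the Raft paper, `leader_id` and the reply type come from the paper rather than from the list above, and `LogEntry.command` and `is_heartbeat` are hypothetical helpers added for readability.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class LogEntry:
    term: int     # term in which the leader created this entry
    command: str  # opaque state-machine command (hypothetical payload type)


@dataclass
class AppendEntriesArgs:
    term: int            # leader's current term, asserts leadership
    leader_id: str       # identifies the sender (from the Raft paper)
    prev_log_index: int  # index of the entry immediately before `entries`
    prev_log_term: int   # term of that preceding entry, for log matching
    entries: List[LogEntry] = field(default_factory=list)  # empty for heartbeats
    leader_commit: int = 0  # leader's commit index

    def is_heartbeat(self) -> bool:
        # A heartbeat is simply an AppendEntries RPC that carries no entries.
        return not self.entries


@dataclass
class AppendEntriesReply:
    term: int      # follower's current term, so a stale leader can step down
    success: bool  # True if the follower's log matched prev_log_index/prev_log_term
```

Because a reply carries only a term and a success flag, the leader has to remember which request each reply belongs to (for example, the `prev_log_index` and entry count it sent) in order to interpret a reply that arrives late.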
How Raft Handles Delayed Replies
Immediate Impact
When an AppendEntries reply is delayed, the immediate effects are on progress rather than on safety:
- Lagging followers: Log entries may not be replicated promptly to every follower, or the leader may not yet know that they have been.
- Delayed commitment: Until the leader has heard from a majority, the affected log entries cannot be committed.
Raft's Strategy for Managing Delays
Retrying RPC Calls
Raft makes it the leader's responsibility to retry AppendEntries RPCs, indefinitely if necessary, until every follower stores all log entries. Retries continue through network delays and failures. Retrying is safe because AppendEntries is idempotent: a follower that already holds the entries simply acknowledges them again.
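The retry behavior can be sketched as a small loop. This is an illustrative sketch under stated assumptions: `send_rpc` is a hypothetical transport callable that either returns a reply or raises `TimeoutError`, and `max_attempts` exists only so the example can terminate (a Raft leader keeps retrying for as long as it remains leader).

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def send_with_retry(
    send_rpc: Callable[[], T],
    retry_interval: float = 0.05,
    max_attempts: Optional[int] = None,
) -> T:
    """Call send_rpc until it returns a reply.

    send_rpc is a hypothetical callable that performs one AppendEntries
    round trip and raises TimeoutError when no reply arrives in time.
    max_attempts=None retries forever, as a Raft leader would.
    """
    attempts = 0
    while max_attempts is None or attempts < max_attempts:
        attempts += 1
        try:
            return send_rpc()
        except TimeoutError:
            # The request or its reply was lost or delayed. Resending is
            # safe because AppendEntries is idempotent on the follower.
            time.sleep(retry_interval)
    raise TimeoutError(f"no reply after {attempts} attempts")
```

A real leader would run one such loop per follower concurrently and stop as soon as it learns of a higher term.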
Handling Out-of-Order Responses
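Replies to AppendEntries can arrive out of order, and the leader does not depend on their arrival order. For each follower it tracks the highest log index known to be replicated there (matchIndex in the Raft paper) and the next index to send (nextIndex), and it interprets each reply by the log indexes of the request it answers. For example, if a delayed reply for an earlier log entry arrives after the reply for a later one, it tells the leader nothing new, and a careful implementation makes sure it does not move matchIndex backward. A reply carrying a higher term causes the leader to step down, and implementations typically ignore replies to requests sent in an earlier term.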
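One way a leader might process replies so that late or reordered ones are harmless is sketched below. The Raft paper does not prescribe this exact code; `LeaderState` and `handle_reply` are hypothetical names, and the sketch assumes the leader remembers, for every outstanding request, the term it was sent in, its `prev_log_index`, and how many entries it carried.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class LeaderState:
    current_term: int
    next_index: Dict[str, int] = field(default_factory=dict)   # next entry to send
    match_index: Dict[str, int] = field(default_factory=dict)  # highest entry known replicated
    is_leader: bool = True


def handle_reply(state: LeaderState, follower: str, sent_term: int,
                 prev_log_index: int, num_entries: int,
                 reply_term: int, success: bool) -> None:
    if reply_term > state.current_term:
        # Someone has a newer term: adopt it and step down.
        state.current_term = reply_term
        state.is_leader = False
        return
    if not state.is_leader or sent_term != state.current_term:
        return  # reply to a request from an earlier term: ignore it
    known = state.match_index.get(follower, 0)
    if success:
        # max() keeps a delayed reply for an older request from
        # moving match_index backward.
        acked = prev_log_index + num_entries
        state.match_index[follower] = max(known, acked)
        state.next_index[follower] = state.match_index[follower] + 1
    else:
        # Back up to retry with an earlier entry, but never below what the
        # follower is already known to hold, and never raise next_index.
        current_next = state.next_index.get(follower, 1)
        state.next_index[follower] = max(known + 1, min(current_next, prev_log_index))
```

With this shape, a success reply for entries up to index 5 that arrives after one for entries up to index 7 leaves both `match_index` and `next_index` unchanged.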
Using Timeouts and Heartbeat Intervals
Raft uses timeouts and heartbeat intervals strategically:
- Heartbeat interval: The leader cannot tell a slow follower from a failed one, so it treats both the same way. If a follower doesn't respond in time, the leader continues to send subsequent AppendEntries RPCs, which double as heartbeats.
- RPC timeout: If the timeout expires before a reply is received, the RPC is considered lost or delayed, and the leader retries the AppendEntries RPC.
Incrementing the Leader’s Commit Index
The commit index is the highest log position whose entries are safe to apply to the state machines. While AppendEntries responses are delayed, commit index advancement may stall. Once enough successful responses arrive, the leader advances its commit index to the highest index that is stored on a majority of nodes, provided the entry at that index was created in the leader's current term. Entries from earlier terms are committed indirectly, when a later entry from the current term is committed.
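The commit rule can be illustrated with a short function. This is a sketch under assumptions: log indexes are 1-based, `log_terms[i - 1]` holds the term of the entry at index `i` in the leader's log, `match_index` maps each follower to the highest index it is known to store, and the function name is hypothetical.

```python
from typing import Dict, List


def advance_commit_index(commit_index: int, current_term: int,
                         log_terms: List[int],
                         match_index: Dict[str, int]) -> int:
    """Return the new commit index for the leader.

    The leader counts itself plus every follower whose match index has
    reached a candidate index, and commits only entries from its own term.
    """
    cluster_size = len(match_index) + 1  # followers plus the leader
    majority = cluster_size // 2 + 1
    # Scan from the newest entry downward and stop at the first majority.
    for n in range(len(log_terms), commit_index, -1):
        if log_terms[n - 1] != current_term:
            # Terms never decrease along the log, so every earlier entry is
            # also from an older term and cannot be committed by counting.
            break
        replicas = 1 + sum(1 for m in match_index.values() if m >= n)
        if replicas >= majority:
            return n
    return commit_index
```

In a three-node cluster, one follower reaching index 4 is enough for the leader to commit index 4 if that entry is from the current term, even while the other follower's reply is still in flight.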
Example Scenario
Consider a three-node cluster in which leader A sends an AppendEntries RPC to followers B and C. Assume the RPC to C is delayed, but B responds immediately. A's own copy plus B's successful response form a majority (two of three nodes), so A can commit the entries and proceed without waiting for C. When C's response finally arrives, assuming it is successful, A records that C now holds those entries and continues replication to C from the next index.
Summary Table
| Issue | Strategy | Purpose |
| --- | --- | --- |
| Delayed RPC Responses | Retry AppendEntries; Handle out-of-order replies | Ensure eventual consistency and update followers |
| Network Partitions | Use Heartbeats | Detect and recover from lost nodes |
| Ensuring Commitment | Increment commit index | Commit entries known to be stored on a majority |
Because retries are safe to repeat, replies are interpreted by log index rather than by arrival order, and commitment needs only a majority, delayed AppendEntries replies can slow Raft down but do not compromise the consistency of the replicated log. The cluster remains available as long as a majority of nodes can communicate with one another.

