Kubernetes Rolling Update not obeying 'maxUnavailable' replicas when redeployed in autoscaled conditions

Kubernetes is a widely used orchestration platform for containerized applications, offering scaling, resource management, and controlled application deployments. Among these features, the rolling update strategy is central to avoiding downtime during application upgrades. In environments that also use autoscaling, however, a rolling update can appear to violate its maxUnavailable setting. This article explains how that happens and why rolling updates do not always stay within the expected maxUnavailable limit under autoscaled conditions.

Understanding Rolling Updates and maxUnavailable

The Basic Rolling Update Process

Kubernetes Deployments support a rolling update strategy for moving from one version of an application to another without a hard cutover. During a rolling update, the Deployment creates a new ReplicaSet and scales it up step by step while scaling the old ReplicaSet down, minimizing disruption. The process is governed by two key parameters:

  • **maxUnavailable**: The maximum number of Pods, as an absolute number or a percentage of the desired replica count, that can be unavailable during the update. This setting protects application availability.
  • **maxSurge**: The maximum number of Pods, again as an absolute number or a percentage, that can be created above the desired replica count to facilitate the update.
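
The arithmetic behind these two parameters can be sketched in a few lines of Python. This is an illustrative model rather than Kubernetes code: the function names `resolve` and `rollout_bounds` are made up for this example. It relies on the documented rounding rules, where a percentage for maxUnavailable is rounded down and a percentage for maxSurge is rounded up.

```python
def resolve(value, replicas, round_up):
    """Turn an absolute count or a percentage string such as "25%" into a Pod count."""
    if isinstance(value, str) and value.endswith("%"):
        scaled = int(value[:-1]) * replicas
        # Integer arithmetic avoids floating-point surprises when rounding.
        return -(-scaled // 100) if round_up else scaled // 100
    return int(value)


def rollout_bounds(replicas, max_unavailable, max_surge):
    """Bounds a rolling update should stay within for a fixed desired replica count."""
    unavailable = resolve(max_unavailable, replicas, round_up=False)  # rounded down
    surge = resolve(max_surge, replicas, round_up=True)               # rounded up
    return {"min_available": replicas - unavailable, "max_total": replicas + surge}


print(rollout_bounds(10, 2, 2))          # {'min_available': 8, 'max_total': 12}
print(rollout_bounds(10, "25%", "25%"))  # {'min_available': 8, 'max_total': 13}
```

Both bounds are derived from the desired replica count, which matters later: if that count changes during a rollout, the bounds move with it.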

The Role of Autoscaling

Autoscaling in Kubernetes adjusts the number of Pod replicas according to current load. A Horizontal Pod Autoscaler (HPA) changes the replica count of its target workload based on metrics such as CPU and memory utilization, giving the system elasticity. This dynamism, however, complicates rolling updates.
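
As a rough illustration of how the HPA arrives at a replica count, the sketch below applies the scaling formula from the Kubernetes documentation, desiredReplicas = ceil(currentReplicas × currentMetricValue / desiredMetricValue). It is a simplified model: the function name and the sample metric values are assumptions for this example, the default 10% tolerance is hard-coded as a parameter, and real HPA behavior such as minReplicas, maxReplicas, and the scale-down stabilization window is left out.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric, tolerance=0.1):
    """Simplified HPA calculation for a single metric."""
    ratio = current_metric / target_metric
    # Within the tolerance band the HPA leaves the replica count alone.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    return math.ceil(current_replicas * ratio)


# Load spike: 10 replicas averaging 75% CPU against a 50% target.
print(desired_replicas(10, 75, 50))  # 15

# Load subsides: 15 replicas averaging 33% CPU against the same target.
print(desired_replicas(15, 33, 50))  # 10
```

Nothing in this calculation refers to a rollout in progress, which is the root of the interaction described in the rest of this article.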

The interplay between rolling updates and autoscaling can lead to situations where the maxUnavailable constraint is not respected as expected. This is most likely when resource demand changes during the rollout and the HPA adjusts the Deployment's replica count in response.

When maxUnavailable is Ignored

Illustration of the Problem

Suppose you have a Deployment with:

  • Desired replicas: 10
  • maxUnavailable: 2
  • maxSurge: 2

Under normal, static conditions, Kubernetes should ensure that no more than two Pods are unavailable during the update. However, in an autoscaled environment, consider the following scenario:

  1. An HPA scales replicas to 15 in response to high CPU usage before the update starts.
  2. The rolling update proceeds, mindful of the maxUnavailable parameter.
  3. As the load decreases, the HPA scales down, potentially triggering deletions that overlap with rolling update Pod terminations.

When a scale-down and a rolling update run concurrently, more Pods than expected can become unavailable, because the HPA has no knowledge of the rollout in progress and the maxUnavailable budget is evaluated against the Deployment's current desired replica count rather than the count at the start of the rollout. In extreme cases, capacity drops further than anticipated, defeating the purpose of specifying maxUnavailable.
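
The numbers in this scenario can be worked through with a small model. It assumes, as a simplification, that the availability floor during a rollout is the current desired replica count minus maxUnavailable. The helper names are hypothetical, and the model ignores Pod readiness timing and the order in which Pods are removed.

```python
def availability_floor(desired_replicas, max_unavailable):
    """Fewest available Pods the rollout tolerates for a given desired count."""
    return desired_replicas - max_unavailable


def worst_case_shortfall(replicas_at_start, replicas_now, max_unavailable):
    """Pods missing relative to the count the operator saw when the rollout began."""
    return replicas_at_start - availability_floor(replicas_now, max_unavailable)


floor_before = availability_floor(15, 2)     # HPA holds the Deployment at 15
floor_after = availability_floor(10, 2)      # HPA scales down to 10 mid-rollout
shortfall = worst_case_shortfall(15, 10, 2)

print(floor_before, floor_after, shortfall)  # prints: 13 8 7
```

Under these assumptions, an operator who expected at least 13 available Pods could observe as few as 8, which looks like 7 unavailable Pods even though each controller stayed within its own rules.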

Technical Explanation

Kubernetes separates the concerns of autoscaling and rolling updates, running them as independent controllers. This decomposition leads to non-coordinated actions:

  1. Deployment controller (RollingUpdate strategy): Scales the old and new ReplicaSets while applying maxSurge and maxUnavailable to the Deployment's current replica count. It does not distinguish a replica change made by the HPA from any other change, and it does not coordinate with HPA operations occurring at the same time.
  2. HPA controller: Acts on metric thresholds and updates the Deployment's replica count without insight into ongoing rolling update workflows.

Because the two controllers share no rollout state, their combined adjustments can break guarantees that each controller honors on its own. This is particularly problematic when a sudden scale-in occurs in the middle of an update.

Mitigating the Impact

While there's no built-in solution to perfectly coordinate autoscaled deployments with rolling updates, certain strategies may help mitigate the potential for increased downtime:

  • Prioritize Metrics Stabilization: Allow time for metrics driving the HPA to stabilize before initiating rolling updates, reducing the likelihood of aggressive scaling actions.
  • Pre-emptive Manual Interventions: Temporarily disable autoscaling during critical updates to maintain control over scaling behavior explicitly.
  • Custom Controllers: Develop custom controllers or deploy operators that can orchestrate between HPA and Deployment updates.
  • Segmented Updates: Use stages or canary updates to isolate potential disruptions to specific segments before a full rollout.
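
The first mitigation, waiting for metrics to stabilize, could be automated with a pre-rollout gate along the following lines. This is a hypothetical sketch: `replicas_stable` and the sample history are invented for illustration, and how the replica counts are collected (for example, by periodically reading the HPA's reported replica count) is left to the surrounding tooling.

```python
def replicas_stable(history, window=5):
    """Return True if the last `window` observed replica counts are identical.

    `history` is a list of replica counts sampled at a fixed interval,
    oldest first. Fewer than `window` samples counts as not yet stable.
    """
    recent = history[-window:]
    return len(recent) == window and len(set(recent)) == 1


samples = [10, 12, 15, 15, 15, 15, 15]  # made-up samples for illustration
if replicas_stable(samples):
    print("replica count is steady; proceed with the rollout")
else:
    print("HPA is still adjusting; delay the rollout")
```

A gate like this reduces the chance of starting a rollout during a scaling event, but it cannot prevent the HPA from scaling once the rollout is under way.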

Conclusion

The overlap between Kubernetes rolling updates and autoscaling needs careful management. While Kubernetes is architected for flexibility, the absence of cross-controller communication can skew expected behaviors such as adherence to maxUnavailable. Understanding how these controllers interact, and planning rollouts around that interaction, helps keep deployments reliable even in dynamically scaling environments.

Here's a table summarizing the key concepts and strategies discussed:

| Concept | Explanation |
| --- | --- |
| Rolling Update | Incremental upgrade method ensuring service availability |
| maxUnavailable | Limits Pods that can be removed during an update |
| maxSurge | Defines additional Pods beyond the original count |
| Autoscaling | Automatic adjustment of the number of replicas based on load metrics |
| HPA | Controller that modifies replica count using metric thresholds |
| Potential Issues | maxUnavailable might be ignored due to separate scaling operation dynamics |
| Suggested Mitigations | Metrics stabilization, custom workflows, partial disabling of autoscaling |

This article has delved into the complex interaction between rolling updates and autoscaling, highlighting areas where disparities arise. By refining understanding of these behaviors, administrators can tailor their orchestrations for maximum efficiency and reliability.
