Active-Passive Jenkins Setup in Kubernetes

Introduction

Jenkins is a stateful controller, so high availability for Jenkins looks different from high availability for a stateless web service. In Kubernetes, an active-passive design can reduce recovery time, but only if you are disciplined about how controller state is stored and promoted.

Decide Whether You Really Need Active-Passive

Before building two controllers, remember that Kubernetes already restarts failed pods and can reschedule them onto healthy nodes. For many teams, a single Jenkins controller backed by a persistent volume, backups, and configuration-as-code is enough.

An active-passive setup becomes more interesting when controller restart time is too slow, node maintenance windows are frequent, or recovery objectives require a warm standby plan. Even then, the passive instance should not be writing into the same live Jenkins home as the active controller.

Why Jenkins Is Not Active-Active

Jenkins keeps job definitions, plugins, credentials, and build records under JENKINS_HOME. Running two controllers against the same writable home directory is risky because each controller assumes it is the only writer: file locks, plugin writes, and build metadata updates are not designed for concurrent multi-writer access.

That is why the safer pattern is one active controller, one passive controller definition, and a clear failover process. The passive controller starts only when needed or stays scaled to zero until promotion.

A Safer Kubernetes Layout

A practical pattern looks like this:

Store controller configuration in Jenkins Configuration as Code and source control.
Keep build execution on external agents, not on the controller.
Back up JENKINS_HOME and the controller key material regularly.
Define a passive deployment that can be scaled up during failover.

yaml

# Active controller: a single replica that mounts the persistent Jenkins home.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: jenkins-active
spec:
  replicas: 1
  selector:
    matchLabels:
      app: jenkins
      role: active
  template:
    metadata:
      labels:
        app: jenkins
        role: active
    spec:
      containers:
        - name: jenkins
          image: jenkins/jenkins:lts-jdk17
          ports:
            - containerPort: 8080
          volumeMounts:
            - name: jenkins-home
              mountPath: /var/jenkins_home
      volumes:
        - name: jenkins-home
          persistentVolumeClaim:
            claimName: jenkins-home
---
# The Service selects only pods labeled role: active.
apiVersion: v1
kind: Service
metadata:
  name: jenkins
spec:
  selector:
    app: jenkins
    role: active
  ports:
    - port: 8080
      targetPort: 8080

That Service routes traffic only to the active controller, because its selector requires the role: active label. The passive definition can exist separately, using restored data or a fresh volume populated from backup when failover is triggered.
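A separate passive definition could be sketched as follows. This is only an illustration: the name jenkins-passive and the role: passive label match the failover commands used in this article, while the function name passive_manifest, the claim name jenkins-home-standby, and the Recreate strategy are assumptions added here. The standby starts at zero replicas and mounts its own claim, so it never shares a live Jenkins home with the active controller.

```shell
#!/usr/bin/env bash
# Sketch: print a manifest for the standby controller.
# passive_manifest and jenkins-home-standby are hypothetical names.
passive_manifest() {
  cat <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: jenkins-passive
spec:
  replicas: 0            # stays at zero until promotion
  strategy:
    type: Recreate       # never run an old and a new pod side by side
  selector:
    matchLabels:
      app: jenkins
      role: passive
  template:
    metadata:
      labels:
        app: jenkins
        role: passive
    spec:
      containers:
        - name: jenkins
          image: jenkins/jenkins:lts-jdk17
          ports:
            - containerPort: 8080
          volumeMounts:
            - name: jenkins-home
              mountPath: /var/jenkins_home
      volumes:
        - name: jenkins-home
          persistentVolumeClaim:
            claimName: jenkins-home-standby   # separate claim, populated from backup
EOF
}
```

One way to use it is `passive_manifest | kubectl apply -f -`, which creates the Deployment without starting a pod.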

Failover Procedure

The operational part matters more than the YAML. Your team should know exactly how to stop the failed active controller, restore or attach the most recent consistent Jenkins state, promote the passive controller, and repoint traffic.

bash

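# Stop the failed active controller so it can no longer write to its Jenkins home.
kubectl scale deployment/jenkins-active --replicas=0
# Start the standby, then repoint the Service selector at it.
kubectl scale deployment/jenkins-passive --replicas=1
kubectl patch service jenkins -p '{"spec":{"selector":{"app":"jenkins","role":"passive"}}}'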

The exact commands vary by implementation, but the process should be rehearsed. A failover that exists only on a diagram tends to fail when you actually need it.
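To make rehearsal easier, the promotion steps can be wrapped in a script that prints its commands by default. The sketch below assumes the jenkins-active and jenkins-passive Deployments and the jenkins Service from this article. The DRY_RUN variable, the run helper, and the timeout values are hypothetical choices. It also waits for the old pod to disappear before starting the standby, so that two controllers are never up at once.

```shell
#!/usr/bin/env bash
# Sketch of a promotion script. DRY_RUN, run, and promote_passive are
# illustrative names, not part of kubectl or Jenkins.
DRY_RUN="${DRY_RUN:-1}"   # default: print commands instead of running them

run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

promote_passive() {
  # Each step must succeed before the next one runs.
  run kubectl scale deployment/jenkins-active --replicas=0 &&
  run kubectl wait --for=delete pod -l app=jenkins,role=active --timeout=120s &&
  run kubectl scale deployment/jenkins-passive --replicas=1 &&
  run kubectl rollout status deployment/jenkins-passive --timeout=300s &&
  run kubectl patch service jenkins \
    -p '{"spec":{"selector":{"app":"jenkins","role":"passive"}}}'
}
```

Calling `promote_passive` prints the five commands in order. Setting `DRY_RUN=0` executes them. Depending on the kubectl version, waiting on a selector that already matches no pods may return an error, so that step is worth checking during a rehearsal.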

Reduce the Amount of State

The best active-passive setup is the one with the least controller state to recover. Use ephemeral build agents, store pipeline definitions in source control, and manage Jenkins configuration declaratively. The more you can recreate automatically, the less you depend on a fragile manual restore.
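As a small illustration of declarative configuration, the sketch below writes a minimal Jenkins Configuration as Code file that sets the controller's executor count to zero, so builds run only on agents. The function name write_casc_config and the target path are hypothetical, and a real configuration would contain much more than these two settings.

```shell
#!/usr/bin/env bash
# Sketch: write a minimal Configuration as Code file.
# write_casc_config is a hypothetical helper; keep the real file in source control.
write_casc_config() {
  local target="$1"
  cat > "$target" <<'EOF'
jenkins:
  systemMessage: "Managed by Configuration as Code. Manual changes may be overwritten."
  numExecutors: 0   # no builds on the controller; use external agents
EOF
}
```

The Configuration as Code plugin reads the file location from the CASC_JENKINS_CONFIG environment variable, so the same file can be mounted into both the active and the passive controller.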

It also helps to separate backups from failover. Backups protect you from corruption and operator mistakes. Failover protects you from availability loss. You usually need both.
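A backup step can be as simple as a timestamped archive of the Jenkins home. The following is a minimal sketch, assuming a tar-based backup. The function name backup_jenkins_home, the archive naming scheme, and the choice to exclude the workspace directory are all assumptions. Archiving a home directory that a running controller is still writing to can capture an inconsistent state, so this is better run against a volume snapshot or a quiesced controller.

```shell
#!/usr/bin/env bash
# Sketch: archive a Jenkins home directory into a timestamped tarball.
# backup_jenkins_home is a hypothetical helper name.
backup_jenkins_home() {
  local src="$1" dest_dir="$2" stamp archive
  stamp="$(date -u +%Y%m%dT%H%M%SZ)"
  archive="$dest_dir/jenkins-home-$stamp.tar.gz"
  mkdir -p "$dest_dir" &&
  # Skip build workspaces: they can be recreated and are often large.
  tar -czf "$archive" --exclude='./workspace' -C "$src" . &&
  printf '%s\n' "$archive"   # print the archive path for the caller
}
```

For example, `backup_jenkins_home /var/jenkins_home /backups` prints the path of the archive it created.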

Common Pitfalls

Mounting the same writable JENKINS_HOME into two live controllers can corrupt state.
Building an active-passive design without a tested promotion runbook creates false confidence.
Keeping builds on the controller makes recovery slower and riskier.
Treating a Kubernetes pod restart as equivalent to controller disaster recovery overlooks storage and configuration concerns.
Ignoring backup validation means the passive controller may start with unusable or incomplete data.
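Backup validation can start with a cheap structural check before a full restore test. The sketch below assumes the archive was created with tar from inside the Jenkins home, so member names begin with ./. The function name verify_backup and the two required files are illustrative, not a complete list. The real proof is still restoring the archive and booting a controller from it.

```shell
#!/usr/bin/env bash
# Sketch: check that a backup archive is readable and contains key files.
# verify_backup and the required-file list are illustrative assumptions.
verify_backup() {
  local archive="$1" listing required
  listing="$(tar -tzf "$archive")" || {
    echo "unreadable archive: $archive" >&2
    return 1
  }
  # config.xml holds global configuration; secrets/master.key is needed
  # to decrypt stored credentials.
  for required in ./config.xml ./secrets/master.key; do
    if ! printf '%s\n' "$listing" | grep -qxF -- "$required"; then
      echo "missing from backup: $required" >&2
      return 1
    fi
  done
  echo "backup looks complete: $archive"
}
```

Running this right after every backup job catches empty or truncated archives early, well before a failover depends on them.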

Summary

Start with the question of whether a single resilient controller already meets your recovery goals.
Jenkins should be active-passive, not active-active, because controller state is not safe for concurrent writers.
Use configuration as code, external agents, and tested backups to simplify failover.
Route traffic only to the active controller and promote the passive instance deliberately.
Rehearse the failover procedure so recovery is operationally real, not theoretical.