How to Manage Persistent Connections in Kubernetes

Introduction

Kubernetes is good at replacing Pods, rescheduling workloads, and routing traffic, but it does not preserve long-lived connections for you. If your application depends on database pools, WebSocket sessions, or other persistent TCP connections, you need to design for Pod churn explicitly.

The goal is not to keep one socket alive forever. The goal is to make connection loss rare, expected, and recoverable during rolling updates, scaling events, and node failures.

What Kubernetes Changes About Long-Lived Connections

A Pod is disposable. During a rollout or eviction, Kubernetes stops routing new traffic to the Pod, runs any preStop hook, sends SIGTERM, waits up to the termination grace period, and then kills the process with SIGKILL. Any open connection that the application has not drained by then is dropped.

That means persistent connections need help from three layers:

the application, which must retry and shut down gracefully
the Pod lifecycle, which must stop new traffic before termination
the client or load balancer, which must reconnect correctly

If you only solve one layer, connection handling stays fragile.

Use Readiness and Graceful Shutdown Together

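Readiness is the first control point. When a Pod is no longer ready, Kubernetes removes it from Service endpoints so new requests stop landing there. A preStop hook and a reasonable terminationGracePeriodSeconds give the process time to drain existing work. The grace period covers the preStop hook as well, so time spent in the hook is subtracted from what the process has left after SIGTERM.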

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: chat-api
spec:
  replicas: 3
  selector:
    matchLabels:
      app: chat-api
  template:
    metadata:
      labels:
        app: chat-api
    spec:
      terminationGracePeriodSeconds: 30
      containers:
        - name: api
          image: example/chat-api:1.0.0
          ports:
            - containerPort: 8080
          readinessProbe:
            httpGet:
              path: /ready
              port: 8080
          lifecycle:
            preStop:
              exec:
                command: ["/bin/sh", "-c", "sleep 10"]
```

The sleep 10 pattern is not magic. It simply creates a buffer so endpoint removal can propagate before the process exits. Your application still needs to stop accepting work and finish in-flight requests.

Make the Process Drain Connections Cleanly

Here is a minimal Go HTTP server that reacts to SIGTERM and shuts down without abruptly cutting active requests:

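```go
package main

import (
	"context"
	"errors"
	"log"
	"net/http"
	"os"
	"os/signal"
	"syscall"
	"time"
)

func main() {
	srv := &http.Server{Addr: ":8080"}

	// Readiness endpoint polled by the kubelet.
	http.HandleFunc("/ready", func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK)
		_, _ = w.Write([]byte("ok"))
	})

	// Simulated slow request that should be allowed to finish during shutdown.
	http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		time.Sleep(2 * time.Second)
		_, _ = w.Write([]byte("response"))
	})

	// done is closed once Shutdown returns, so main does not exit early.
	done := make(chan struct{})
	go func() {
		defer close(done)
		sigs := make(chan os.Signal, 1)
		signal.Notify(sigs, syscall.SIGTERM, syscall.SIGINT)
		<-sigs

		// Stay inside the grace period: 30s minus the 10s preStop sleep
		// leaves 20s, so 15s keeps a margin before SIGKILL.
		ctx, cancel := context.WithTimeout(context.Background(), 15*time.Second)
		defer cancel()
		if err := srv.Shutdown(ctx); err != nil {
			log.Printf("shutdown error: %v", err)
		}
	}()

	// ListenAndServe returns ErrServerClosed as soon as Shutdown is called,
	// before in-flight requests have finished.
	if err := srv.ListenAndServe(); err != nil && !errors.Is(err, http.ErrServerClosed) {
		log.Fatal(err)
	}
	<-done // wait for Shutdown to finish draining active requests
}
```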

This pattern matters more than any Kubernetes-specific trick. Without graceful shutdown in the process itself, readiness and preStop only reduce the odds of a broken connection.

Choose the Right Connection Strategy

For outbound connections such as PostgreSQL, Redis, or Kafka, keep a pool inside each Pod and let the client library reconnect. Do not try to share one connection across Pods.

For inbound sticky user sessions, decide whether you actually need stickiness. Many systems can move session state to Redis or a database and stay stateless at the HTTP layer. If you truly need affinity, configure it deliberately at the ingress or load balancer instead of assuming Kubernetes Services will preserve client-to-Pod routing.

Use StatefulSet only when Pod identity matters, such as clustered databases or brokers. It does not solve graceful draining by itself.

Common Pitfalls

A frequent mistake is assuming that keep-alive makes a connection survive rollouts. Keep-alive only holds a connection open while both endpoints remain alive; once the Pod behind it is terminated, the connection is gone.

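Another mistake is relying on a long terminationGracePeriodSeconds without making sure the Pod is taken out of rotation first. Endpoint updates take time to reach every proxy and load balancer, so if a Pod still looks ready to any of them, traffic keeps arriving while you are trying to shut it down. A longer grace period alone only delays the drop.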

Teams also overuse sticky sessions. Affinity can hide state-coupling problems and make scaling uneven. Prefer external session storage unless the protocol truly requires stable backend identity.

Finally, do not ignore client reconnect behavior. Even perfect Pod draining cannot prevent every dropped connection during node loss or network partitions.

Summary

Kubernetes does not preserve long-lived connections automatically.
Use readiness probes, preStop, and a realistic termination grace period to drain traffic.
Handle SIGTERM in the application so active requests and sockets close cleanly.
Keep outbound connections pooled per Pod and let clients reconnect.
Use stickiness or StatefulSet only when the protocol or architecture really requires stable identity.