Kubernetes Pod fails with CrashLoopBackOff
Introduction
CrashLoopBackOff means Kubernetes started a container, the container exited, and Kubernetes is backing off before trying again. The backoff is a symptom, not the root cause. To fix it, you need to find why the process inside the container keeps terminating.
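To make the backoff concrete: the Kubernetes documentation describes the restart delay as exponential, starting at 10 seconds, doubling after each exit, and capped at 300 seconds by default, with a reset once a container has run for 10 minutes without problems. The sketch below only prints that documented sequence. backoff_schedule is a hypothetical helper name, and clusters with non-default kubelet settings may behave differently.

```shell
# Print the documented default backoff sequence: 10s, doubling, capped at 300s.
# backoff_schedule is a hypothetical helper, not a kubectl feature.
backoff_schedule() {
  local steps="${1:-8}"   # how many consecutive exits to show
  local delay=10
  local i=1
  while [ "$i" -le "$steps" ]; do
    echo "backoff step $i: ${delay}s"
    delay=$((delay * 2))
    if [ "$delay" -gt 300 ]; then delay=300; fi
    i=$((i + 1))
  done
}

backoff_schedule 7
```

The practical consequence is that a pod that has crashed many times can sit idle for up to five minutes between attempts, which is why the status looks stuck even though nothing new is failing.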
Start with Logs and Events
Start with kubectl describe pod, kubectl logs, and kubectl logs --previous. The describe output shows events, restart counts, probe failures, and exit reasons. The --previous flag is especially important because the current container instance may be too new to show the crash that triggered the restart.
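The three commands can be wrapped in a small function so they are always run together. This is a minimal sketch: crashloop_triage is a hypothetical helper name, and the pod and namespace in the usage line are placeholders.

```shell
# Hypothetical helper; pod and namespace names are placeholders.
crashloop_triage() {
  local pod="$1" ns="${2:-default}"
  # Events, restart count, last state, and probe failures.
  kubectl describe pod "$pod" -n "$ns"
  # Output of the container instance that exists right now.
  kubectl logs "$pod" -n "$ns"
  # Output of the previous instance, i.e. the one that crashed.
  kubectl logs "$pod" -n "$ns" --previous
}

# Usage (needs access to a cluster):
#   crashloop_triage my-app-5d4f8c7b9-x2k4q production
```

For pods with more than one container, adding -c with the container name to the two logs commands is likely needed.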
Check the Container Exit Reason
Common patterns include:
- the application process throws an exception and exits
- the command or entrypoint is wrong
- a required file, secret, or environment variable is missing
- the liveness probe kills the container repeatedly
- the process is OOM-killed because memory limits are too low
kubectl describe pod usually shows clues such as OOMKilled, probe failures, or non-zero exit codes.
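The same clues can be read directly from the pod status. The sketch below assumes the previous container instance is recorded under lastState.terminated, which is where Kubernetes normally reports it. Both function names are hypothetical, and the exit-code hints are rules of thumb rather than a complete mapping.

```shell
# Hypothetical helpers; pod and namespace names are placeholders.

# Print name, termination reason, and exit code of each container's
# previous instance, read from the pod status.
last_exit() {
  local pod="$1" ns="${2:-default}"
  kubectl get pod "$pod" -n "$ns" -o jsonpath='{range .status.containerStatuses[*]}{.name}{"\t"}{.lastState.terminated.reason}{"\t"}{.lastState.terminated.exitCode}{"\n"}{end}'
}

# Translate an exit code into a first guess. Codes above 128 follow the
# convention 128 + signal number.
explain_exit_code() {
  case "$1" in
    0)   echo "exited cleanly; the process may not be meant to run forever" ;;
    126) echo "command found but not executable" ;;
    127) echo "command not found; check command, args, and image" ;;
    137) echo "SIGKILL (128+9); often OOMKilled" ;;
    143) echo "SIGTERM (128+15); the container was asked to stop" ;;
    *)   echo "application error; read the previous logs" ;;
  esac
}

# Usage (needs access to a cluster):
#   last_exit my-app-5d4f8c7b9-x2k4q production
explain_exit_code 137
```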
Example of a Probe-Induced Crash Loop
A misconfigured liveness probe can restart a healthy process before it finishes startup.
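The sketch below writes a hypothetical manifest of this kind to a file so it can be inspected. The pod name, image, path, port, and timing values are all illustrative assumptions, not taken from a real workload, and the timing formula is a rough approximation.

```shell
# Write a deliberately aggressive example to a file for inspection.
# The pod name, image, path, and port are hypothetical placeholders.
cat > aggressive-probe.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: slow-starter
spec:
  containers:
    - name: app
      image: registry.example.com/slow-starter:1.0
      ports:
        - containerPort: 8080
      livenessProbe:
        httpGet:
          path: /healthz
          port: 8080
        initialDelaySeconds: 3   # first check long before startup finishes
        periodSeconds: 5
        failureThreshold: 1      # a single failed check restarts the container
EOF

# Rough time of the earliest liveness-triggered restart, ignoring probe
# timeouts and kubelet jitter: initialDelay + (failureThreshold - 1) * period.
initial_delay=3
period=5
failure_threshold=1
earliest=$((initial_delay + (failure_threshold - 1) * period))
echo "earliest restart after about ${earliest}s"
```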
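If the application needs 20 seconds to boot, a liveness probe that can fail the container before then is too aggressive. A safer version adds a startupProbe, which holds off liveness checks until the application has started and so prevents premature liveness failures during slow startup.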
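One way to sketch a safer configuration is shown below. The names (slow-starter, /healthz, port 8080) and the timing values are hypothetical. The startup budget follows the rule from the Kubernetes documentation, failureThreshold multiplied by periodSeconds, and is chosen here to leave headroom over a 20 second boot.

```shell
# Write a more tolerant example to a file for inspection.
# The pod name, image, path, and port are hypothetical placeholders.
cat > safer-probe.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: slow-starter
spec:
  containers:
    - name: app
      image: registry.example.com/slow-starter:1.0
      ports:
        - containerPort: 8080
      startupProbe:
        httpGet:
          path: /healthz
          port: 8080
        periodSeconds: 2
        failureThreshold: 30     # startup budget: 30 * 2s = 60s
      livenessProbe:
        httpGet:
          path: /healthz
          port: 8080
        periodSeconds: 10
        failureThreshold: 3      # only checked once the startup probe succeeds
EOF

# Startup budget is failureThreshold * periodSeconds of the startup probe.
startup_budget=$((30 * 2))
echo "startup budget: ${startup_budget}s"
```

Keeping the liveness probe strict and moving the slack into the startup probe means a hung process is still detected quickly once the application is up.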
Verify the Command, Image, and Config
If the container exits instantly, inspect:
- command and args
- image tag correctness
- mounted config files and secrets
- required environment variables
- application working directory assumptions
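Most items on this checklist can be read from the pod spec in one call. The following is a sketch under the assumption that the pod object still exists. show_container_spec is a hypothetical helper name. Empty command or args fields mean the image's own entrypoint and default arguments are used.

```shell
# Hypothetical helper: show what each container is configured to run.
show_container_spec() {
  local pod="$1" ns="${2:-default}"
  kubectl get pod "$pod" -n "$ns" -o jsonpath='{range .spec.containers[*]}name={.name}{"\n"}image={.image}{"\n"}command={.command}{"\n"}args={.args}{"\n"}workingDir={.workingDir}{"\n"}env={.env[*].name}{"\n"}{end}'
}

# Usage (needs access to a cluster):
#   show_container_spec my-app-5d4f8c7b9-x2k4q production
```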
A pod can crash forever because of one missing environment variable. For example, if a container reads its database URL from the key database_url of a secret named app-secret and either one is wrong, the app may fail before it ever serves traffic.
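A sketch of how such a reference typically looks, plus a way to check the secret itself. The names app-secret and database_url come from the text. The variable name DATABASE_URL and the helper show_secret_key are assumptions made for illustration.

```shell
# Fragment of a container spec that reads an env var from the secret.
# DATABASE_URL as the variable name is an assumption.
cat > env-fragment.yaml <<'EOF'
env:
  - name: DATABASE_URL
    valueFrom:
      secretKeyRef:
        name: app-secret
        key: database_url
EOF

# Hypothetical helper: decode one key of a secret to confirm it exists
# and holds the expected value. This prints a secret value, so use with care.
show_secret_key() {
  local secret="$1" key="$2" ns="${3:-default}"
  kubectl get secret "$secret" -n "$ns" -o "jsonpath={.data.$key}" | base64 --decode
}

# Usage (needs access to a cluster):
#   show_secret_key app-secret database_url
```

A secret or key that does not exist at all usually surfaces as CreateContainerConfigError in the pod status rather than as a crash loop, whereas a wrong value lets the container start and then exit.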
Watch for Resource Problems
If the container's last state shows OOMKilled, raise the memory limit, reduce memory usage, or both.
Do not guess here. Check metrics and application behavior. If Java, Node.js, or Python workloads are close to the memory limit, brief spikes can trigger repeated crashes.
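The checks above can be sketched as three small helpers. The function names are hypothetical, the deployment, container, and limit values in the usage lines are placeholders, and kubectl top only works when metrics-server (or an equivalent metrics API) is installed in the cluster.

```shell
# Hypothetical helpers; deployment, container, and size values are placeholders.

# Show configured memory requests and limits per container.
show_memory_limits() {
  local pod="$1" ns="${2:-default}"
  kubectl get pod "$pod" -n "$ns" -o jsonpath='{range .spec.containers[*]}{.name}{"\t"}{.resources.requests.memory}{"\t"}{.resources.limits.memory}{"\n"}{end}'
}

# Show current usage per container (requires a metrics API in the cluster).
show_memory_usage() {
  local pod="$1" ns="${2:-default}"
  kubectl top pod "$pod" -n "$ns" --containers
}

# Raise the limit on a deployment once the numbers justify it.
raise_memory_limit() {
  local deploy="$1" container="$2" limit="$3" ns="${4:-default}"
  kubectl set resources deployment "$deploy" -n "$ns" -c "$container" --limits="memory=$limit"
}

# Usage (needs access to a cluster):
#   show_memory_limits my-app-5d4f8c7b9-x2k4q
#   show_memory_usage my-app-5d4f8c7b9-x2k4q
#   raise_memory_limit my-app app 512Mi
```

Comparing the configured limit with observed usage before changing anything keeps the fix grounded in data rather than guesswork.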
Use kubectl exec Only After the Container Stays Up Long Enough
Sometimes the natural instinct is to kubectl exec into the pod and inspect the filesystem or environment. That is useful only if the container remains alive long enough to attach. If it crashes instantly, use logs, pod events, and a debug container or a temporary command override instead of waiting for a shell that never becomes available.
That distinction saves time during incident response.
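Two ways to get a shell when the container will not stay up can be sketched with kubectl debug. The function names, the busybox image tag, and the -debug suffix are illustrative assumptions. The first variant needs ephemeral container support in the cluster, and the second needs a shell in the application image.

```shell
# Hypothetical helpers; image and names are placeholders.

# Attach an ephemeral debug container that targets the app container's
# process namespace, using its own image for tooling.
debug_attach() {
  local pod="$1" container="$2" ns="${3:-default}"
  kubectl debug -it "$pod" -n "$ns" --image=busybox:1.36 --target="$container"
}

# Copy the pod and replace the container's command with a shell, so the
# copy stays up for inspection instead of crashing.
debug_copy_with_shell() {
  local pod="$1" container="$2" ns="${3:-default}"
  kubectl debug "$pod" -n "$ns" -it --copy-to="${pod}-debug" --container="$container" -- sh
}

# Usage (needs access to a cluster):
#   debug_attach my-app-5d4f8c7b9-x2k4q app
#   debug_copy_with_shell my-app-5d4f8c7b9-x2k4q app
```

The copied pod is a separate object, so it should be deleted once the investigation is finished.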
Common Pitfalls
- Treating CrashLoopBackOff as the root cause instead of the restart symptom.
- Looking only at current logs and missing kubectl logs --previous.
- Misconfiguring liveness probes for slow-starting apps.
- Ignoring OOMKilled and focusing only on application logs.
- Shipping an image with the wrong entrypoint or missing runtime dependency.
- Assuming Kubernetes is broken when the real problem is application exit behavior.
Summary
- CrashLoopBackOff means the container keeps exiting and Kubernetes is backing off.
- Start with describe, current logs, and previous logs.
- Check exit reasons, probes, env vars, secrets, and entrypoints.
- Use startupProbe when boot time is longer than liveness timing.
- Fix the process exit cause, not the backoff symptom.

