Kubernetes - Pod which encapsulates DB is crashing
Introduction
When a Kubernetes pod that runs a database keeps crashing, the right question is not "Why is Kubernetes unstable?" but "What is the database process or its environment doing at startup?" Crash loops around stateful workloads usually come from storage, permissions, resource limits, probes, or database configuration mismatches.
Databases are more sensitive than stateless web containers because they care about durable storage, startup ordering, write permissions, and clean shutdown. That is why the debugging approach should be more like infrastructure diagnosis than like ordinary app log inspection.
Start with the Fastest Signals
The first commands should be kubectl describe pod <pod-name>, kubectl logs <pod-name>, and kubectl logs <pod-name> --previous.
These tell you:
- whether the pod is in CrashLoopBackOff,
- whether it was OOM-killed,
- whether a probe is failing,
- what the database process logged before exiting.
For stateful workloads, --previous is especially useful: it prints the logs of the last terminated container, whereas the current container may have restarted too recently to show the original failure.
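The same signals can be read programmatically. The sketch below is a minimal illustration, not a complete tool: it assumes the Pod object has already been loaded from the JSON that kubectl get pod <pod-name> -o json prints. The function name summarize_container_statuses and the sample data are hypothetical; the field names follow the Pod status schema.

```python
def summarize_container_statuses(pod: dict) -> list:
    """Return one summary dict per container in a parsed Pod object."""
    summaries = []
    for cs in pod.get("status", {}).get("containerStatuses", []):
        # "waiting" and "terminated" are only present in the matching state.
        waiting = cs.get("state", {}).get("waiting") or {}
        last_term = cs.get("lastState", {}).get("terminated") or {}
        summaries.append({
            "container": cs.get("name"),
            "restarts": cs.get("restartCount", 0),
            "crash_looping": waiting.get("reason") == "CrashLoopBackOff",
            "oom_killed": last_term.get("reason") == "OOMKilled",
            "last_exit_code": last_term.get("exitCode"),
        })
    return summaries


# Hypothetical sample, shaped like the output of `kubectl get pod ... -o json`.
sample_pod = {
    "status": {
        "containerStatuses": [
            {
                "name": "db",
                "restartCount": 7,
                "state": {"waiting": {"reason": "CrashLoopBackOff"}},
                "lastState": {"terminated": {"reason": "OOMKilled", "exitCode": 137}},
            }
        ]
    }
}

for summary in summarize_container_statuses(sample_pod):
    print(summary)
```

A summary like this only narrows the search; the previous container's logs are still where the database's own error message is found.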
Common Root Causes
The most common causes of database pod crashes are:
- missing or corrupted persistent volume data,
- filesystem permission problems on the mounted volume,
- resource limits that are too small,
- bad startup flags or environment variables,
- liveness probes that kill the database before it is ready.
These show up differently in logs, but they are the first places to look.
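As a rough illustration of how some of these causes surface in logs, the sketch below scans log text for a few standard operating-system error strings. The cause labels, the function name guess_causes, and the sample log line are assumptions made for this example; real database logs vary by engine and version, so treat a match as a hint only.

```python
# The substrings are standard OS error messages (EACCES, EROFS, ENOSPC).
# The cause labels on the right are this sketch's own wording.
LOG_HINTS = [
    ("permission denied", "volume permissions or ownership"),
    ("read-only file system", "volume mounted read-only"),
    ("no space left on device", "persistent volume is full"),
]


def guess_causes(log_text: str) -> list:
    """Return the likely causes whose error string appears in the log text."""
    lowered = log_text.lower()
    causes = []
    for needle, cause in LOG_HINTS:
        if needle in lowered and cause not in causes:
            causes.append(cause)
    return causes


# Hypothetical log line, not taken from any specific database.
sample_log = "FATAL: could not open data directory: Permission denied"
print(guess_causes(sample_log))
```

An empty result does not rule these causes out; it only means none of the listed strings appeared.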
Storage and Permissions
Many database images need write access to a specific data directory. If the mounted volume is owned by the wrong user or mounted read-only, the database may fail immediately.
Typical symptom:
- the pod starts,
- the entrypoint tries to initialize or open the data directory,
- the process exits with a permissions or filesystem error.
That is why securityContext, file ownership, and PVC health matter so much for stateful pods.
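One way to check the pod spec for these issues is sketched below. It assumes the Pod object was loaded from kubectl get pod <pod-name> -o json and that you know the data directory's mount path. The function name storage_warnings and the path /var/lib/db are hypothetical. A missing fsGroup is not necessarily wrong, so the output is a list of things to look at rather than errors.

```python
def storage_warnings(pod: dict, data_mount_path: str) -> list:
    """Flag spec settings that commonly cause data-directory write failures."""
    spec = pod.get("spec", {})
    warnings = []

    # Pod-level fsGroup is one common way to make a volume group-writable.
    if "fsGroup" not in (spec.get("securityContext") or {}):
        warnings.append("no pod-level fsGroup set")

    # A read-only mount on the data directory prevents initialization.
    for container in spec.get("containers", []):
        for mount in container.get("volumeMounts", []):
            if mount.get("mountPath") == data_mount_path and mount.get("readOnly"):
                warnings.append(
                    f"{container.get('name')}: {data_mount_path} is mounted read-only"
                )
    return warnings


# Hypothetical pod spec fragment for illustration.
sample_pod = {
    "spec": {
        "securityContext": {},
        "containers": [
            {
                "name": "db",
                "volumeMounts": [
                    {"name": "data", "mountPath": "/var/lib/db", "readOnly": True}
                ],
            }
        ],
    }
}

for warning in storage_warnings(sample_pod, "/var/lib/db"):
    print(warning)
```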
Resource Limits and OOMKills
Databases are memory-hungry. If the container memory limit is too small, the kernel's OOM killer terminates the container once it exceeds that limit, and Kubernetes reports the termination reason as OOMKilled.
Check the pod description for signals like:
- OOMKilled,
- restart count climbing rapidly,
- termination reason pointing to memory pressure.
A database that needs time to warm caches or replay logs can also look unhealthy if CPU limits are too strict and startup becomes too slow.
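A quick sanity check is to compare the container's memory limit with what the database is configured to use. The sketch below parses a subset of Kubernetes memory quantities and tests whether a configured cache size fits with some headroom. The function names and the 25% headroom default are assumptions chosen for illustration, not a sizing recommendation.

```python
# Binary and decimal suffixes used by Kubernetes quantities (subset only).
_UNITS = {
    "Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40,
    "k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12,
}


def parse_memory(quantity: str) -> int:
    """Convert a memory quantity such as '512Mi' or '2G' to bytes."""
    for suffix, factor in _UNITS.items():
        if quantity.endswith(suffix):
            return int(float(quantity[: -len(suffix)]) * factor)
    return int(quantity)  # a plain number is already in bytes


def cache_fits(limit_quantity: str, cache_bytes: int, headroom: float = 0.25) -> bool:
    """True if the cache plus the assumed headroom stays within the limit."""
    return cache_bytes * (1 + headroom) <= parse_memory(limit_quantity)


# Hypothetical numbers: a 768 MiB cache under 1Gi and 512Mi limits.
print(cache_fits("1Gi", 768 * 2**20))
print(cache_fits("512Mi", 768 * 2**20))
```

The limit string comes from resources.limits.memory in the container spec. The cache size is whatever the database's own configuration sets, which this sketch takes as an input.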
Probe Configuration
Bad liveness probes are a classic cause of database crash loops. If the database needs 40 seconds to initialize but the liveness probe starts killing it after 10, the pod never gets a chance to become healthy.
A safer pattern is:
- use a generous startup probe for slow database initialization,
- use readiness to gate traffic,
- keep liveness conservative.
That way Kubernetes does not mistake "still starting" for "broken forever."
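The timing argument can be checked with simple arithmetic. The sketch below approximates how long a container has before a probe's consecutive failures trigger a restart, as initialDelaySeconds + failureThreshold × periodSeconds. It ignores timeoutSeconds and the exact probe scheduling, so treat the result as an estimate. The probe values and the 40-second startup time are illustrative, matching the example above.

```python
def restart_budget_seconds(probe: dict) -> int:
    """Approximate seconds before repeated probe failures cause a restart."""
    return (
        probe.get("initialDelaySeconds", 0)
        # Kubernetes defaults: failureThreshold=3, periodSeconds=10.
        + probe.get("failureThreshold", 3) * probe.get("periodSeconds", 10)
    )


db_startup_seconds = 40  # assumed time the database needs to initialize

# An aggressive liveness probe with no startup probe protecting it.
aggressive_liveness = {"periodSeconds": 5, "failureThreshold": 2}

# A generous startup probe; liveness checks wait until it succeeds.
generous_startup = {"periodSeconds": 10, "failureThreshold": 30}

for name, probe in [("liveness", aggressive_liveness), ("startup", generous_startup)]:
    budget = restart_budget_seconds(probe)
    verdict = "enough" if budget >= db_startup_seconds else "too short"
    print(f"{name}: about {budget}s, {verdict}")
```

With these numbers the liveness probe allows roughly 10 seconds and the startup probe roughly 300, which is why the startup probe is the right place to absorb slow initialization.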
StatefulSet and Volume Design
For real databases, a StatefulSet is usually more appropriate than a plain Deployment. It gives stable identity and stable storage attachments, which match database expectations much better.
A simple debugging rule is: if your database pod is attached to persistent data, inspect the PVC and StatefulSet behavior just as seriously as the container logs.
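Inspecting the PVC can start with its phase. The sketch below assumes the claim list was loaded from kubectl get pvc -o json and returns the claims that are not Bound. The function name unbound_claims and the claim names in the sample are hypothetical.

```python
def unbound_claims(pvc_list: dict) -> list:
    """Return (name, phase) for every PVC whose phase is not 'Bound'."""
    problems = []
    for item in pvc_list.get("items", []):
        phase = item.get("status", {}).get("phase")
        if phase != "Bound":
            problems.append((item["metadata"]["name"], phase))
    return problems


# Hypothetical sample, shaped like the output of `kubectl get pvc -o json`.
sample_pvcs = {
    "items": [
        {"metadata": {"name": "data-db-0"}, "status": {"phase": "Bound"}},
        {"metadata": {"name": "data-db-1"}, "status": {"phase": "Pending"}},
    ]
}

for name, phase in unbound_claims(sample_pvcs):
    print(f"{name} is {phase}")
```

A claim stuck in Pending usually means the pod that mounts it cannot start at all, so the events shown by describe on the PVC are the next place to look.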
Common Pitfalls
- Looking only at kubectl get pods and never checking logs or pod events.
- Running a database in a plain Deployment without thinking through stable storage needs.
- Using aggressive liveness probes that kill slow startup sequences.
- Forgetting volume permissions and ownership requirements.
- Treating a database container like a stateless app and underallocating memory or disk.
Summary
- A crashing database pod is usually failing because of storage, permissions, probes, resources, or startup configuration.
- Start with describe, current logs, and previous logs.
- Check for OOM kills, PVC issues, and filesystem access problems early.
- Probe timing is critical for slow-starting databases.
- Stateful databases fit StatefulSet and persistent-volume patterns better than generic stateless deployment patterns.

