Kubernetes, simple SpringBoot app OOMKilled
Introduction
OOMKilled in Kubernetes means the Linux kernel terminated your container after its memory usage exceeded the configured memory limit. In Spring Boot services, this often happens because total JVM memory includes heap and non-heap regions, not just the heap capped by -Xmx. A durable fix combines Kubernetes resource sizing, JVM container tuning, and application-level memory discipline.
Confirm the OOM Signal First
Start with evidence, not assumptions. Check pod events and the last termination reason (kubectl describe pod), and read the previous container's logs (kubectl logs --previous).
Look for:
- Reason: OOMKilled
- restarts after traffic spikes
- memory usage near or above limit before restart
If the process exits for another reason, JVM tuning alone will not fix it.
Understand Java Memory Inside Containers
A common mistake is setting the heap too close to the container limit. Java process memory includes:
- heap
- metaspace
- thread stacks
- JIT code cache
- direct buffers and other native allocations
If the pod limit is 768Mi and the heap can grow close to that value, non-heap overhead will push the process over the limit and trigger an OOM kill.
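The arithmetic behind that 768Mi example can be sketched as follows. This is a minimal illustration, not a sizing rule: the class name MemoryBudget and every non-heap figure (metaspace, thread count, stack size, code cache, native memory) are assumptions chosen for the example and should be replaced with values measured for your service.

```java
// Rough memory budget for a JVM in a container.
// All non-heap figures below are illustrative assumptions, not measurements.
final class MemoryBudget {

    /** Returns the heap size (MiB) left after subtracting the estimated non-heap regions. */
    static long maxHeapMiB(long containerLimitMiB, long metaspaceMiB, int threads,
                           long stackPerThreadMiB, long codeCacheMiB, long nativeMiB) {
        long nonHeapMiB = metaspaceMiB
                + (long) threads * stackPerThreadMiB
                + codeCacheMiB
                + nativeMiB;
        long heapMiB = containerLimitMiB - nonHeapMiB;
        if (heapMiB <= 0) {
            throw new IllegalArgumentException("non-heap estimate exceeds the container limit");
        }
        return heapMiB;
    }

    public static void main(String[] args) {
        // Assumed: 768 MiB limit, 128 MiB metaspace, 100 threads with 1 MiB stacks,
        // 64 MiB JIT code cache, 64 MiB direct buffers and other native allocations.
        long heapMiB = maxHeapMiB(768, 128, 100, 1, 64, 64);
        System.out.println("Heap budget: " + heapMiB + " MiB");
    }
}
```

With these assumed figures, only a little over half of the limit is safely available to the heap, which is why a heap sized near 768Mi gets the container killed.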
Set Requests and Limits Explicitly
Avoid relying on defaults. Define both requests and limits in deployment manifests.
Requests tell the scheduler how much memory to reserve for the pod on a node. Limits set the hard cap that the kernel enforces.
Tune JVM for Container Budgets
Use container-aware JVM options such as -XX:MaxRAMPercentage so the heap leaves space for non-heap memory.
An alternative is fixed heap sizing with -Xms and -Xmx.
Choose one strategy and measure under realistic load. For simple services, percentage-based limits are often easier to maintain across environments.
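Whichever strategy you choose, it helps to confirm what heap ceiling the JVM actually picked inside the container. The following is a small sketch using only the JDK; the class name HeapCheck and the default 768 MiB limit are assumptions for illustration, and the limit must be passed in because the code does not read it from the cgroup.

```java
// Reports the effective max heap and its share of an assumed container limit.
final class HeapCheck {

    /** Max heap the JVM will attempt to use, in MiB, as reported by the runtime. */
    static long effectiveMaxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    /** Heap as a percentage of the container limit supplied by the caller. */
    static double heapShareOfLimit(long heapMiB, long limitMiB) {
        if (limitMiB <= 0) {
            throw new IllegalArgumentException("limitMiB must be positive");
        }
        return 100.0 * heapMiB / limitMiB;
    }

    public static void main(String[] args) {
        // Assumed pod limit in MiB; pass the real value as the first argument.
        long limitMiB = args.length > 0 ? Long.parseLong(args[0]) : 768;
        long heapMiB = effectiveMaxHeapMiB();
        System.out.printf("max heap = %d MiB (%.0f%% of a %d MiB limit)%n",
                heapMiB, heapShareOfLimit(heapMiB, limitMiB), limitMiB);
    }
}
```

Running it in the same image and with the same JVM options as the service (for example with -XX:MaxRAMPercentage set) shows whether the heap leaves the headroom you intended.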
Reduce Application Memory Pressure
Sometimes the app itself drives memory spikes. Common Spring Boot causes:
- large JSON payload buffering
- unbounded caches
- loading entire datasets into memory
- high thread counts
- expensive object mapping on hot paths
Practical mitigations:
- stream large responses where possible
- configure cache size and eviction policy
- process batch jobs in chunks
- tune Tomcat thread pool for expected concurrency
Small code and configuration adjustments like these can sometimes eliminate kills without increasing pod limits.
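Two of the mitigations above, bounded caches and chunked batch processing, can be sketched with the JDK alone. The names MemoryFriendly, BoundedCache, and forEachChunk are hypothetical helpers written for this example; a production service would more likely use a caching library and its framework's batch support.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

final class MemoryFriendly {

    /** LRU cache with a hard entry cap. Not thread-safe; guard it externally if shared. */
    static final class BoundedCache<K, V> extends LinkedHashMap<K, V> {
        private final int maxEntries;

        BoundedCache(int maxEntries) {
            super(16, 0.75f, true); // true = order entries by access, least recent first
            this.maxEntries = maxEntries;
        }

        @Override
        protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
            return size() > maxEntries; // evict the eldest entry once the cap is exceeded
        }
    }

    /** Feeds the source to the handler in lists of at most chunkSize items; returns the chunk count. */
    static <T> int forEachChunk(Iterable<T> source, int chunkSize, Consumer<List<T>> handler) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<T> buffer = new ArrayList<>(chunkSize);
        int chunks = 0;
        for (T item : source) {
            buffer.add(item);
            if (buffer.size() == chunkSize) {
                handler.accept(buffer);
                buffer = new ArrayList<>(chunkSize); // drop the reference to the processed chunk
                chunks++;
            }
        }
        if (!buffer.isEmpty()) {
            handler.accept(buffer);
            chunks++;
        }
        return chunks;
    }

    public static void main(String[] args) {
        BoundedCache<String, Integer> cache = new BoundedCache<>(2);
        cache.put("a", 1);
        cache.put("b", 2);
        cache.get("a");    // "a" becomes the most recently used entry
        cache.put("c", 3); // exceeds the cap, so "b" is evicted
        System.out.println(cache.keySet());

        int chunks = forEachChunk(List.of(1, 2, 3, 4, 5), 2,
                chunk -> System.out.println("processing " + chunk));
        System.out.println(chunks + " chunks");
    }
}
```

In both cases the point is that memory use is bounded by a configured size rather than by the size of the input.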
Add Observability Before and After Changes
Without metrics, tuning is guesswork. Track at minimum:
- JVM heap used and max
- non-heap used
- GC pause duration and frequency
- container memory working set
- pod restart count
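Spring Boot Actuator with the Micrometer Prometheus registry can expose the JVM-side metrics; container working set and restart counts come from cluster-level monitoring.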
After each change, compare restart frequency and memory headroom over similar traffic windows.
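For a quick look at the JVM-side numbers without any extra dependencies, the standard java.lang.management beans report heap, non-heap, and GC totals. This is a minimal sketch rather than a replacement for Actuator metrics; the class name JvmMemorySnapshot is an assumption, and container working set and pod restart count are not visible from inside the JVM.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;
import java.util.Locale;

final class JvmMemorySnapshot {

    private static final long MIB = 1024L * 1024L;

    /** One-line summary of heap, non-heap, and cumulative GC activity since JVM start. */
    static String snapshot() {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        MemoryUsage heap = memory.getHeapMemoryUsage();
        MemoryUsage nonHeap = memory.getNonHeapMemoryUsage();

        long gcCount = 0;
        long gcMillis = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Both getters return -1 when the collector does not report the value.
            gcCount += Math.max(0, gc.getCollectionCount());
            gcMillis += Math.max(0, gc.getCollectionTime());
        }

        // heap.getMax() is -1 when undefined, which prints as 0 after integer division.
        return String.format(Locale.ROOT,
                "heapUsedMiB=%d heapMaxMiB=%d nonHeapUsedMiB=%d gcCount=%d gcTimeMs=%d",
                heap.getUsed() / MIB, heap.getMax() / MIB, nonHeap.getUsed() / MIB,
                gcCount, gcMillis);
    }

    public static void main(String[] args) {
        System.out.println(snapshot());
    }
}
```

Logging this line periodically, or before and after a load test, gives a rough baseline to compare against once a change has been deployed.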
Diagnose Leak Versus Bad Sizing
Not every OOM is a memory leak. Leak patterns usually show monotonic growth with poor recovery after GC. Bad sizing patterns often show oscillation near the limit and kills during bursts.
If a leak is suspected:
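- capture a heap dump, for example with -XX:+HeapDumpOnOutOfMemoryError (this fires on a Java OutOfMemoryError, not on a kernel OOM kill)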
- inspect dominant object retainers
- check cache eviction behavior
If sizing is the issue, adjust pod limit and JVM percentages with load-test feedback.
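The distinction between the two patterns can be expressed as a rough heuristic over samples you already collect. The sketch below is hypothetical: the class HeapTrend, the minimum of three samples, and the 90% "near the limit" threshold are all assumptions for illustration, and real diagnosis should still rely on heap dumps and load tests.

```java
// Hypothetical heuristic: separates "leak-like" from "sizing-like" memory behavior.
final class HeapTrend {

    enum Pattern { LIKELY_LEAK, LIKELY_SIZING, INCONCLUSIVE }

    // Assumed threshold: peaks at or above 90% of the limit count as "near the limit".
    static final double NEAR_LIMIT_FRACTION = 0.9;

    /**
     * @param postGcUsedMiB heap used right after successive major GCs, oldest first
     * @param peakUsedMiB   highest memory usage observed in the same window
     * @param limitMiB      container memory limit
     */
    static Pattern classify(long[] postGcUsedMiB, long peakUsedMiB, long limitMiB) {
        if (postGcUsedMiB.length < 3) {
            return Pattern.INCONCLUSIVE; // too few samples to call a trend
        }
        boolean strictlyRising = true;
        for (int i = 1; i < postGcUsedMiB.length; i++) {
            if (postGcUsedMiB[i] <= postGcUsedMiB[i - 1]) {
                strictlyRising = false;
                break;
            }
        }
        if (strictlyRising) {
            return Pattern.LIKELY_LEAK; // the post-GC floor never recovers
        }
        if (peakUsedMiB >= NEAR_LIMIT_FRACTION * limitMiB) {
            return Pattern.LIKELY_SIZING; // floor is stable but bursts reach the limit
        }
        return Pattern.INCONCLUSIVE;
    }

    public static void main(String[] args) {
        System.out.println(classify(new long[] {100, 150, 210, 260}, 500, 768));
        System.out.println(classify(new long[] {200, 180, 210, 190}, 740, 768));
    }
}
```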
Rollout Strategy
Apply memory changes gradually. Use a canary or a single replica first, observe it for one full traffic cycle, then roll out to the rest.
A fast global rollout of untested memory settings can turn a partial problem into a full outage.
Common Pitfalls
- Setting heap too high relative to container memory limit.
- Increasing pod limits without checking application memory hotspots.
- Diagnosing OOM from current logs only and missing previous-container evidence.
- Ignoring non-heap memory when estimating Java process footprint.
- Rolling memory config changes to all replicas without staged validation.
Summary
- OOMKilled is a container memory cap event, not just a JVM exception.
- Size Kubernetes requests and limits explicitly for your workload.
- Leave headroom for non-heap memory when tuning JVM options.
- Reduce app-level memory spikes through streaming, bounded caches, and batch chunking.
- Validate each change with metrics and staged rollout before full deployment.

