Concurrent LRU cache implementation
Master System Design with Codemia
Enhance your system design skills with over 120 practice problems, detailed solutions, and hands-on exercises.
Introduction
An LRU cache needs two things at once: fast key lookup and fast recency updates. In a concurrent setting that becomes harder, because reads also mutate recency state, so a "thread-safe map plus thread-safe list" is not enough unless the operations across both structures stay consistent.
The Core Data Structure
The standard LRU design combines:
- a hash map from key to node
- a doubly linked list ordered by recency
The map gives O(1) lookup. The list lets you move a recently used node to the front and evict the least recently used node from the back.
In single-threaded code, the algorithm is straightforward:
get(key)looks up the node and moves it to the frontput(key, value)inserts or updates, then moves the node to the front- if capacity is exceeded, evict the tail node
Concurrency complicates this because map and list must be updated together.
Why ConcurrentHashMap Alone Is Not Enough
A common mistake is to think:
- use
ConcurrentHashMapfor the keys - use some concurrent deque for recency
- done
That is not enough for a real LRU policy. The problem is that recency and value updates must be atomic with respect to each other. If two threads both read and move the same key at the same time, or one thread evicts while another updates, the cache can become inconsistent.
For correctness, treat the map-plus-list pair as one shared structure.
A Practical Thread-Safe Design
The simplest correct design is:
- a normal hash map
- a normal doubly linked list
- one lock protecting the entire cache
This is not the most scalable design under very high contention, but it is easy to reason about and usually the right first implementation.
This keeps the map and list in sync because all state transitions happen under one lock.
Read Performance Tradeoff
The downside is that get is also a write to recency order, so it needs the same lock. That can limit throughput when the cache is read-heavy.
There are a few common strategies:
- accept the single lock because it is simple and correct
- relax exact LRU semantics and batch recency updates
- shard the cache into multiple smaller LRU segments
- use a production-grade library that already solved the concurrency tradeoffs
Exact concurrent LRU is harder than it looks. Many high-performance caches use approximations of LRU because strict recency under concurrency is expensive.
Eviction Race Conditions
Eviction is where broken implementations usually show up. If one thread decides to evict while another thread is still promoting or updating the same node, you can get:
- stale map entries
- list corruption
- lost updates
That is why any eviction path must hold the same synchronization that protects list movement and map mutation.
When to Use a Library
If you are building infrastructure code for production rather than learning the algorithm, prefer a mature cache library. A good library will handle:
- concurrency
- eviction correctness
- memory visibility
- performance tuning
Writing your own concurrent LRU cache is a valuable exercise, but it is rarely the first operational choice when a production library is available.
Common Pitfalls
- Combining a concurrent map with an independent queue or list and assuming the result is automatically a correct LRU cache.
- Treating
getas a read-only operation even though it changes recency state and therefore needs synchronization. - Implementing eviction without the same lock that protects map and list updates.
- Over-engineering a lock-free design before proving a simple locked design is insufficient.
- Reimplementing production cache infrastructure when a maintained library would provide better correctness and performance characteristics.
Summary
- A correct LRU cache needs both fast lookup and consistent recency ordering.
- In concurrent code, the map and linked-list updates must stay synchronized as one unit.
- A single-lock implementation is the simplest correct design.
- Exact LRU under high concurrency is expensive, so production systems often use approximations or mature libraries.
- If you implement your own, focus on correctness first, especially around
getand eviction races.

