
Hot Partitions and Hot Keys: The Tax You Pay for Going Viral

January 31, 2026

Hot partitions are what success looks like in a partitioned system. You hashed the key space evenly, the cluster runs cool, and then one user gets featured on the homepage or one product goes viral. Every request for that key hashes to the same shard. That shard's CPU pegs, its queue fills, retries pile on top of retries, and the other nine nodes in the cluster sit at 12 percent utilization watching the fire.

Partitioning by key promises two things: keys spread evenly across shards, and ordering is preserved per key. It promises nothing about how traffic is distributed across those keys. The moment one key carries a disproportionate share of reads or writes, an even spread of keys stops meaning an even spread of load.
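To make that gap concrete, here is a minimal sketch, assuming a ten-shard cluster and an md5-based placement function. Both are my own illustrative choices, as are the names shard_for and load_per_shard. It hashes ten thousand keys evenly, then lets one of them go viral.

```python
import hashlib
from collections import Counter

NUM_SHARDS = 10  # assumed cluster size, for illustration only


def shard_for(key, num_shards=NUM_SHARDS):
    """Map a key to a shard with a stable hash (md5 is an illustrative choice)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def load_per_shard(requests_per_key):
    """Sum request counts per shard for a {key: request_count} mapping."""
    load = Counter()
    for key, count in requests_per_key.items():
        load[shard_for(key)] += count
    return load


# 10,000 keys with one request each, then one key goes viral.
traffic = {f"user#{i}": 1 for i in range(10_000)}
traffic["user#42"] = 50_000

load = load_per_shard(traffic)
hot_shard = shard_for("user#42")

for shard in range(NUM_SHARDS):
    marker = "  <- hot" if shard == hot_shard else ""
    print(f"shard {shard}: {load[shard]:>6} requests{marker}")
```

The keys land roughly a thousand per shard, exactly as the hash promised. The requests do not.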

Concrete example. A social app keys timeline writes by user_id. A creator with eight million followers posts a video that goes viral. The fan-out for every follower's timeline hits the one shard that owns the creator's outbox. The write rate blows past that shard's commit-log throughput, the queue backs up, replication falls behind, the leader lease expires, and the partition fails over. The failover does nothing useful because the new leader inherits the same hot key. The site looks broken to everyone, not just to the creator's followers.

Four fixes, roughly in the order I reach for them.

Salt the key. Rewrite user#42 as user#42#0 through user#42#N-1 on the write path and pick a bucket at random. Writes now spread across up to N different partitions. Reads pay the cost: you have to scatter-gather across all N buckets and merge. Ordering is preserved only within a bucket, so the reader has to re-sort by timestamp or sequence number if it needs a single ordered view. This is the right call when the hot key is write-heavy and the read path can tolerate aggregation.
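A minimal sketch of salting, assuming an in-memory dict stands in for the partitioned store and that every write carries a sequence number the reader can merge on. NUM_BUCKETS, salted_key, write, and read_all are hypothetical names for illustration, not any particular database's API.

```python
import random
from collections import defaultdict

NUM_BUCKETS = 8  # assumed salt count; size it to the write rate you must absorb

# Stand-in for a partitioned store: physical key -> list of (sequence, value).
store = defaultdict(list)


def salted_key(key, bucket):
    """Build the physical key, e.g. user#42 -> user#42#3."""
    return f"{key}#{bucket}"


def write(key, seq, value):
    """Append to a random bucket so writes spread across physical keys."""
    bucket = random.randrange(NUM_BUCKETS)
    store[salted_key(key, bucket)].append((seq, value))


def read_all(key):
    """Scatter-gather: read every bucket, then merge by sequence number."""
    rows = []
    for bucket in range(NUM_BUCKETS):
        rows.extend(store.get(salted_key(key, bucket), []))
    rows.sort(key=lambda row: row[0])
    return [value for _, value in rows]


for seq in range(100):
    write("user#42", seq, f"event-{seq}")

events = read_all("user#42")
print(len(store), "physical keys,", len(events), "events read back in order")
```

The read path now costs NUM_BUCKETS lookups instead of one, which is the trade you are signing up for. In a real store each physical key would hash to its own partition; the dict here only shows the key rewriting and the merge.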

Cache the hot keys. Put a small in-memory cache in front of the data store, keyed on the few thousand items that actually matter. A hot key has high temporal locality by definition, so the hit rate will be excellent. This is the fastest win for read-heavy hot keys, and the only cost is invalidation discipline.
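A minimal sketch of such a front cache, assuming a small LRU with a short TTL and a hypothetical loader function that stands in for the real store. HotKeyCache and load_from_store are illustrative names, and this version is not thread-safe, so a real deployment would add a lock or use one cache per worker.

```python
import time
from collections import OrderedDict


class HotKeyCache:
    """Tiny LRU cache with a TTL, sitting in front of a slower loader."""

    def __init__(self, loader, max_items=1000, ttl_seconds=5.0, clock=time.monotonic):
        self._loader = loader            # hypothetical function: key -> value
        self._max_items = max_items
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = OrderedDict()    # key -> (expires_at, value)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            self._entries.move_to_end(key)      # mark as recently used
            return entry[1]
        value = self._loader(key)               # miss or expired: hit the store
        self._entries[key] = (now + self._ttl, value)
        self._entries.move_to_end(key)
        if len(self._entries) > self._max_items:
            self._entries.popitem(last=False)   # evict the least recently used
        return value

    def invalidate(self, key):
        """Call on every write to the key, or accept staleness up to the TTL."""
        self._entries.pop(key, None)


store_reads = []


def load_from_store(key):
    """Stand-in for the real data store; records each read that reaches it."""
    store_reads.append(key)
    return f"profile-for-{key}"


cache = HotKeyCache(load_from_store, max_items=2, ttl_seconds=60.0)
for _ in range(1000):
    cache.get("user#42")

print("requests: 1000, store reads:", len(store_reads))
```

The invalidate method is where the discipline lives: every write path that touches the key has to call it, or you live with values that are stale for up to one TTL.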

Coalesce duplicate reads. When 50 thousand requests for the same key arrive in the same second, only one of them needs to touch the store. A singleflight layer in front of the database (the pattern takes its name from Go's singleflight package) collapses the concurrent requests into one query and hands the result to every waiting caller.
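A minimal sketch of the coalescing idea using threads. This is my own simplified take on the pattern, not a port of any library; SingleFlight, slow_query, and the waiting helper are hypothetical names, and the release event exists only to hold the demo query open until every duplicate has arrived.

```python
import threading
import time


class _Call:
    """State shared between the leader and the callers waiting on it."""

    def __init__(self):
        self.done = threading.Event()
        self.result = None
        self.error = None
        self.waiters = 0


class SingleFlight:
    """Collapse concurrent calls for the same key into one execution."""

    def __init__(self):
        self._lock = threading.Lock()
        self._in_flight = {}  # key -> _Call

    def waiting(self, key):
        """Number of callers currently piggybacking on the in-flight call."""
        with self._lock:
            call = self._in_flight.get(key)
            return call.waiters if call else 0

    def do(self, key, fn):
        with self._lock:
            call = self._in_flight.get(key)
            leader = call is None
            if leader:
                call = self._in_flight[key] = _Call()
            else:
                call.waiters += 1
        if not leader:
            call.done.wait()              # follower: wait for the leader
            if call.error is not None:
                raise call.error
            return call.result
        try:
            call.result = fn()            # leader: the only call that runs fn
            return call.result
        except Exception as exc:
            call.error = exc
            raise
        finally:
            with self._lock:
                del self._in_flight[key]
            call.done.set()


db_queries = []
release = threading.Event()


def slow_query():
    """Stand-in for the database read; blocks until the demo releases it."""
    db_queries.append("user#42")
    release.wait(timeout=5)
    return {"id": 42}


flight = SingleFlight()
results = []
results_lock = threading.Lock()


def handle_request():
    value = flight.do("user#42", slow_query)
    with results_lock:
        results.append(value)


threads = [threading.Thread(target=handle_request) for _ in range(50)]
for t in threads:
    t.start()
deadline = time.monotonic() + 5
while flight.waiting("user#42") < 49 and time.monotonic() < deadline:
    time.sleep(0.001)
release.set()
for t in threads:
    t.join()

print("requests:", len(results), "database queries:", len(db_queries))
```

Note that coalescing only helps requests that overlap in time. Once the leader returns, the next request starts a fresh query, which is why this pairs well with a short-TTL cache.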

Split or move the partition. Detect persistent skew, then split the offending range or relocate it to a dedicated shard. This is expensive and operationally noisy, so save it for when the heat is structural rather than transient.
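Telling structural heat from a transient spike is the hard part. One hedged way to do it, assuming you can sample request keys per time window: flag a key only when its share of traffic stays above a threshold for several consecutive windows. The class name, the 20 percent threshold, and the three-window requirement below are all illustrative assumptions, not recommendations.

```python
from collections import Counter, deque


class SkewDetector:
    """Flag keys whose traffic share stays high across consecutive windows."""

    def __init__(self, share_threshold=0.2, windows_required=3):
        self._threshold = share_threshold
        self._required = windows_required
        self._current = Counter()
        # One set of hot keys per closed window, oldest dropped automatically.
        self._history = deque(maxlen=windows_required)

    def record(self, key):
        self._current[key] += 1

    def close_window(self):
        """End the current window and remember which keys were hot in it."""
        total = sum(self._current.values())
        hot = {
            key
            for key, count in self._current.items()
            if total and count / total >= self._threshold
        }
        self._history.append(hot)
        self._current = Counter()

    def persistent_hot_keys(self):
        """Keys that were hot in every one of the last N windows."""
        if len(self._history) < self._required:
            return set()
        return set.intersection(*self._history)


detector = SkewDetector(share_threshold=0.2, windows_required=3)
for window in range(3):
    for _ in range(500):
        detector.record("user#42")           # hot in every window
    for i in range(500):
        detector.record(f"user#{1000 + i}")  # background traffic
    if window == 1:
        for _ in range(400):
            detector.record("product#7")     # one-window spike only
    detector.close_window()

print(detector.persistent_hot_keys())
```

Here user#42 is flagged because it was hot in all three windows, while product#7 spiked once and is ignored. What you do with the flagged key, whether a range split or a move to a dedicated shard, depends on what your store supports.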

The mental model is short. Hot key causes hot partition. Hot partition causes localized overload. The fix is to spread the load, shield the data store, or smooth the arrival rate. Match the technique to whether the heat is reads or writes, and you will not melt one shard while the cluster idles.

Key takeaway

A hot partition is not a bug; it is the cost of one key being more interesting than the others. Spread writes with salted keys, shield the store with a small front cache and request coalescing, and split or relocate the partition when the heat is structural. Pick the fix that matches whether the heat is read-shaped or write-shaped.

Originally posted on LinkedIn.