Cache Miss Amplification: Why a 99% Hit Rate Can Still Take You Down

May 17, 2026

A cache hit is one read. A cache miss is, almost always, many. That asymmetry is the most important thing to internalize about caching. The hit path returns a serialized blob in about a hundred microseconds. The miss path, in a typical product API, opens a database connection, runs a query with three or four joins, hits a secondary index, fans out to an auth service, calls a recommendations service, and writes the result back to the cache. That can easily be 50ms of work and a dozen downstream calls for one missed read, roughly 500 times the latency of the hit.

Define miss amplification as the cost of the miss path divided by the cost of the hit path. At 10x, a 1% miss rate adds about 10% to your total serving work, and you will not feel it. At 100x, that same 1% of requests costs as much as the other 99% combined, and nearly all of that work lands on the origin. People forget this because the dashboards show hit rate as a nice flat 99%, and the conclusion is that the cache is working. The cache is working. The origin is still drowning.
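
To make the ratio concrete, here is a minimal sketch of the arithmetic. It measures cost in units of one hit, so a miss costs `amplification` units. The function name `miss_share_of_work` is made up for illustration, and the model assumes every miss costs the same, which real traffic will not.

```python
def miss_share_of_work(hit_rate: float, amplification: float) -> float:
    """Fraction of total serving work that is spent on cache misses.

    Costs are in units of one hit: a hit costs 1, a miss costs `amplification`.
    """
    miss_rate = 1.0 - hit_rate
    miss_work = miss_rate * amplification  # expected miss cost per request
    hit_work = hit_rate * 1.0              # expected hit cost per request
    return miss_work / (hit_work + miss_work)


# Same 99% hit rate, two different amplification factors.
share_at_10x = miss_share_of_work(hit_rate=0.99, amplification=10)
share_at_100x = miss_share_of_work(hit_rate=0.99, amplification=100)
print(f"10x:  {share_at_10x:.1%} of work is misses")
print(f"100x: {share_at_100x:.1%} of work is misses")
```

Under these assumptions, misses are under a tenth of the work at 10x and about half of it at 100x, with no change in the hit rate on the dashboard.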

This is a different failure mode from a cache stampede. Stampedes are about one key being rebuilt many times in parallel. Miss amplification is about every miss being structurally expensive, regardless of concurrency. You can have neither a stampede nor a cold cache and still take down the database if the miss path is heavy enough.

The production failure I want to name is one I have seen twice. A team ran a product API with a 99% hit rate, holding a steady 30k requests per second. A deploy shipped a small refactor that changed the cache key prefix from v2:product: to v3:product: for one rare code path, maybe 1% of requests. Hit rate moved from 99% to 98%, which looked fine on the dashboard. But the miss rate had doubled, from 1% to 2%, which at 30k requests per second means roughly 600 origin queries per second instead of 300, and each miss did a six-join query against an already-busy primary. The database connection pool saturated within two minutes, lock waits cascaded, and the API's p99 climbed from 40ms to 4 seconds while the cache hit rate stayed at 98%. Engineers stared at a green cache dashboard while the platform burned.
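
The numbers in that incident can be worked through in a few lines. This is a rough sketch, not a reconstruction: the request rate and hit rates come from the story above, but the 50ms miss latency is an assumption borrowed from the earlier "typical" miss path, and the function names are hypothetical. The in-flight estimate uses Little's law (average concurrency = arrival rate x time in system).

```python
def origin_queries_per_second(request_rate: float, hit_rate: float) -> float:
    """Requests per second that fall through the cache to the origin."""
    return request_rate * (1.0 - hit_rate)


def in_flight_origin_queries(queries_per_second: float, miss_latency_s: float) -> float:
    """Little's law: average number of origin queries running at once."""
    return queries_per_second * miss_latency_s


REQUEST_RATE = 30_000            # requests per second, from the incident
ASSUMED_MISS_LATENCY_S = 0.050   # assumption: 50ms per miss, not a measured value

before = origin_queries_per_second(REQUEST_RATE, hit_rate=0.99)
after = origin_queries_per_second(REQUEST_RATE, hit_rate=0.98)

print(f"origin qps before: {before:.0f}, after: {after:.0f}")
print(f"in-flight before: {in_flight_origin_queries(before, ASSUMED_MISS_LATENCY_S):.0f}, "
      f"after: {in_flight_origin_queries(after, ASSUMED_MISS_LATENCY_S):.0f}")
```

The in-flight figure is optimistic because it holds miss latency constant. Once the primary starts queuing, each miss takes longer, which holds connections longer, which is how a doubling of origin queries can exhaust a pool that had headroom on paper.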

The lesson: track origin load and miss path latency alongside hit rate. Treat any cache key change as a production migration, with shadow reads and a slow rollout. And before you trust a 99% hit rate, multiply your request rate by your miss rate and your miss cost, and ask whether your origin can hold that number on its worst day.
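
One way to get those extra signals is to record them at the point where the miss happens. The sketch below wraps a cache-aside read with counters for hits, misses, and time spent on the miss path. Everything here is illustrative: `InstrumentedCache`, `CacheStats`, and the in-memory dict stand in for whatever cache client and metrics system you actually run.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class CacheStats:
    hits: int = 0
    misses: int = 0
    miss_seconds: List[float] = field(default_factory=list)  # one entry per miss

    @property
    def hit_rate(self) -> float:
        total = self.hits + self.misses
        return self.hits / total if total else 0.0

    @property
    def origin_seconds(self) -> float:
        """Total wall-clock time spent on the miss path."""
        return sum(self.miss_seconds)


class InstrumentedCache:
    """Cache-aside read path that tracks miss cost, not just hit rate."""

    def __init__(self, load_from_origin: Callable[[str], str]) -> None:
        self._store: Dict[str, str] = {}  # stand-in for a real cache
        self._load = load_from_origin     # the expensive miss path
        self.stats = CacheStats()

    def get(self, key: str) -> str:
        if key in self._store:
            self.stats.hits += 1
            return self._store[key]
        start = time.perf_counter()
        value = self._load(key)
        self.stats.miss_seconds.append(time.perf_counter() - start)
        self.stats.misses += 1
        self._store[key] = value          # write the result back
        return value


cache = InstrumentedCache(load_from_origin=lambda key: f"row-for-{key}")
for key in ["a", "a", "b"]:
    cache.get(key)
print(cache.stats.hit_rate, cache.stats.misses, cache.stats.origin_seconds)
```

Exporting `misses` and `origin_seconds` as rates next to `hit_rate` puts the number the origin actually feels on the same dashboard as the number everyone is already watching.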

Key takeaway

Hit rate is the wrong metric on its own. What matters is amplification: the cost ratio between a miss and a hit. At 100x amplification, a 1% miss rate costs as much as all of your hits combined, and the origin pays for it.

Originally posted on LinkedIn.