Write Amplification: The Hidden Multiplier That Kills SSDs Early
February 6, 2026
Write amplification is the ratio nobody on the application side wants to look at. The definition is simple: bytes written to the underlying storage divided by bytes written by the application. A WA of 1 means your application wrote a megabyte and the disk wrote a megabyte. A WA of 20 means your application wrote a megabyte and the disk wrote 20 megabytes.
LSM trees are the worst offenders by design. Writes land in a memtable, flush to a level 0 SSTable, and then compaction rewrites them as they cascade through levels. Each level holds roughly 10x the data of the level above, and reaching level 6 means the same key bytes were rewritten six or more times. Production RocksDB tunings commonly measure write amplification between 10 and 30, depending on workload skew and compaction strategy.
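Why measured numbers land in the 10 to 30 range is easier to see with a rough bound. The sketch below is a simplified model, not RocksDB's actual accounting: the function name, the one WAL write, and the one flush write are assumptions added for illustration. The optimistic case rewrites each byte once per level; the pessimistic case assumes every merge into a level also rewrites about a fanout's worth of existing data.

```python
def leveled_wa_bounds(levels: int, fanout: int = 10, wal: bool = True) -> tuple[int, int]:
    """Return (optimistic, pessimistic) write amplification for leveled compaction.

    optimistic:  each byte is written once per level it passes through.
    pessimistic: each byte entering a level forces a rewrite of up to
                 `fanout` bytes already stored there.
    Both include one memtable flush, plus one WAL write if `wal` is True
    (assumptions of this sketch, not figures from the article).
    """
    if levels < 1 or fanout < 1:
        raise ValueError("levels and fanout must be at least 1")
    base = (1 if wal else 0) + 1  # WAL write + flush to level 0
    optimistic = base + levels
    pessimistic = base + levels * fanout
    return optimistic, pessimistic


low, high = leveled_wa_bounds(levels=6, fanout=10)
print(f"between {low}x and {high}x")  # prints "between 8x and 62x"
```

The 10 to 30 measured in production sits between those two bounds, because real compactions rarely hit the worst case at every level.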
B-trees do better at the storage layer, but not as well as they first appear. A typical B-tree write dirties a whole leaf page (usually 8 KB or 16 KB even if you wrote 50 bytes), is written a second time to the WAL, and can trigger a full-page write the first time the page changes after a checkpoint. The realistic WA is 4 to 6.
Then there is the layer most teams forget. SSDs internally relocate data through their flash translation layer to handle wear leveling and garbage collection. That FTL adds another 2x to 3x on top of whatever the database is already doing. The 20 the database thinks it has is closer to 50 at the flash cells.
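The layers multiply rather than add. A minimal sketch of that arithmetic, using the figures above (the helper name is hypothetical):

```python
from math import prod


def end_to_end_wa(*layer_factors: float) -> float:
    """Multiply per-layer write amplification factors into one end-to-end ratio."""
    if any(f <= 0 for f in layer_factors):
        raise ValueError("every layer factor must be positive")
    return prod(layer_factors)


engine_wa = 20.0                  # what the database reports
ftl_low, ftl_high = 2.0, 3.0      # the FTL range quoted above

low = end_to_end_wa(engine_wa, ftl_low)
high = end_to_end_wa(engine_wa, ftl_high)
print(f"{low:.0f}x to {high:.0f}x at the flash cells")  # prints "40x to 60x at the flash cells"
```

The "closer to 50" figure is the middle of that range.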
I watched a team learn this the hard way. They migrated from a B-tree-backed database to an LSM engine because the marketing said "built for write-heavy workloads." Their effective application-to-flash write amplification went from about 4 to about 22. The SSDs in the fleet were rated for four years at their previous duty cycle. They started crossing endurance thresholds at 11 months. The replacement bill for the fleet dwarfed any throughput gain the new engine had produced.
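That timeline is roughly what a back-of-the-envelope estimate predicts. A minimal sketch, assuming flash wear scales linearly with bytes written and that the application's write volume did not change after the migration (neither is stated in the story, and the function name is hypothetical):

```python
def projected_lifetime_months(rated_months: float, old_wa: float, new_wa: float) -> float:
    """Scale a rated lifetime by the change in write amplification.

    Assumes wear is proportional to bytes written and that the
    application's own write volume stays the same.
    """
    if old_wa <= 0 or new_wa <= 0:
        raise ValueError("write amplification must be positive")
    return rated_months * old_wa / new_wa


months = projected_lifetime_months(rated_months=48, old_wa=4, new_wa=22)
print(f"{months:.1f} months")  # prints "8.7 months"
```

The idealized figure is a little under nine months, the same order as the 11 months the fleet actually saw.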
Two things would have caught this in design review. First, measure. The ratio of iostat write bytes to application write bytes over a representative day tells you what you are paying down to the device interface; the FTL's share sits below that and only shows up in the drive's own counters, where it exposes them. Second, pick the compaction strategy that matches the constraint that actually matters. Leveled compaction trades write amp for lower space overhead. Tiered compaction trades space for lower write amp. If your bottleneck is SSD endurance, tiered is the answer. If your bottleneck is disk capacity, leveled might be.
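The measurement itself is a few lines. The sketch below takes two snapshots of Linux's /proc/diskstats (the same counters iostat reads) and divides the device's written bytes by the application's. It assumes you already track application bytes yourself; the function names and the sample device are hypothetical. It covers host-to-device writes only, so the FTL's multiplier is not included.

```python
SECTOR_BYTES = 512  # /proc/diskstats counts 512-byte sectors regardless of device sector size


def sectors_written(diskstats_text: str, device: str) -> int:
    """Return the cumulative sectors-written counter for `device`."""
    for line in diskstats_text.splitlines():
        fields = line.split()
        # Layout: major, minor, name, 4 read counters, then writes completed,
        # writes merged, sectors written (index 9).
        if len(fields) >= 10 and fields[2] == device:
            return int(fields[9])
    raise ValueError(f"device {device!r} not found in diskstats")


def write_amplification(before: str, after: str, device: str, app_bytes: int) -> float:
    """Device bytes written between two snapshots, divided by application bytes."""
    if app_bytes <= 0:
        raise ValueError("app_bytes must be positive")
    delta = sectors_written(after, device) - sectors_written(before, device)
    return delta * SECTOR_BYTES / app_bytes


# Example with made-up snapshots; in practice read /proc/diskstats at the
# start and end of a representative day.
before = "   8       0 sda 1000 0 8000 100 2000 0 1000 300 0 200 400"
after = "   8       0 sda 1200 0 9600 120 7000 0 41000 900 0 500 1020"
wa = write_amplification(before, after, device="sda", app_bytes=1_024_000)
print(f"WA = {wa:.1f}")  # prints "WA = 20.0"
```

Run it against the workload you actually serve, not a benchmark, since the ratio depends heavily on key skew and write size.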
Storage engine choices look like throughput decisions on the slide deck. In production they are endurance decisions. Measure WA before you migrate, not after the replacement invoice lands.
Write amplification compounds across layers. Application bytes become storage engine bytes become FTL bytes. Measure the ratio before you swap storage engines, or you will pay for the change in burned SSDs instead of throughput.
Originally posted on LinkedIn.