Databases

Animated explainers on databases.

May 16, 2026

Distributed Query Execution: One Slow Shard Owns Your P99

A SELECT across shards becomes a fanout plan with parallel scans, shuffles, and a coordinator merge. Tail latency is dominated by the slowest shard, and cross-shard joins quietly break linear scaling.
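
The tail-latency claim can be checked with a little arithmetic. This is a minimal sketch, assuming shards are independent and each one is slow (above its own p99) on 1% of requests; the function name and the 1% figure are illustrative, not taken from the article.

```python
def p_query_hits_slow_shard(shards: int, p_slow: float) -> float:
    """Probability that at least one of `shards` independent shards is slow.

    Assumes independence between shards, which real clusters only approximate.
    """
    return 1.0 - (1.0 - p_slow) ** shards


if __name__ == "__main__":
    # The coordinator waits for every shard, so one slow shard delays the query.
    for n in (1, 10, 100):
        print(n, round(p_query_hits_slow_shard(n, 0.01), 3))
```

Under these assumptions, a 100-shard fanout sees at least one slow shard on roughly 63% of queries, which is why the per-shard tail becomes the query's typical case.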

May 13, 2026

Why Your Database Choice Shapes Every Other Design Decision

Caching, queueing, replication, partitioning: every system design choice bends to your database. Here is why databases shape everything downstream of them.

May 4, 2026

Event Sourcing: Store the Events, Derive the State

Event sourcing turns your database into an append-only log of facts. State becomes a function of history. The cost is schema evolution forever and replay times that grow with your business.

Apr 29, 2026

The Six Database Questions Every System Design Eventually Asks

Writes, reads, indexing, transactions, replication, partitioning. The technology stack changes every five years, but the six core database questions never do.

Apr 12, 2026

Why One User Write Turns Into a Graph of Side Effects in Modern Backends

A single Place Order request looks like one write. In production it fans out into a graph of cache invalidations, index updates, events, and audit logs.

Apr 7, 2026

Pick the Database That Matches the Workload, Not the One on Your Resume

OLTP, OLAP, key-value, document, time-series, columnar, search. Each storage engine is optimized for a specific access pattern. Match the workload first, the brand name second.

Apr 4, 2026

Read After Write: When Your Own Update Vanishes on Refresh

Async replication scales reads but lets users see their own writes disappear. Read-your-writes, monotonic reads, session pinning, and the failures that follow.

Mar 26, 2026

Inverted Indexes: How Full-Text Search Stays Fast at a Billion Documents

An inverted index flips the problem: instead of scanning documents for terms, you look up terms and get the documents. That is the whole trick behind millisecond search.
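
The flip described here can be sketched in a few lines. This is a toy illustration, assuming whitespace tokenization and AND semantics across query terms; `build_index` and `search` are hypothetical names, and real engines add analyzers, compressed postings lists, and scoring.

```python
from collections import defaultdict


def build_index(docs: dict[int, str]) -> dict[str, set[int]]:
    """Map each term to the set of document ids that contain it."""
    index: dict[str, set[int]] = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def search(index: dict[str, set[int]], query: str) -> set[int]:
    """Return ids of documents containing every query term."""
    terms = query.lower().split()
    if not terms:
        return set()
    # Look up each term's postings, then intersect; no document is scanned.
    postings = [index.get(term, set()) for term in terms]
    return set.intersection(*postings)


if __name__ == "__main__":
    docs = {
        1: "fast full text search",
        2: "full table scan",
        3: "text search at scale",
    }
    index = build_index(docs)
    print(sorted(search(index, "text search")))
```

Query cost depends on the length of the postings lists for the query terms, not on the total number of documents.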

Mar 25, 2026

Multi-Leader Replication: The Conflict Problem Nobody Wants to Own

Multi-leader cuts cross-region write latency, then hands you conflicts. Last-write-wins (LWW) silently loses data, app-level merge takes work, and CRDTs only fit certain shapes.

Mar 22, 2026

Leader Follower Replication: The Default That Hides a Cliff

Leader-follower replication is the boring default for Postgres, MySQL, and MongoDB. Sync vs async, read scaling, failover, and the lag-driven data loss waiting in production.

Mar 21, 2026

Multi-Tenancy: Three Isolation Models and the Tenant That Stalls Them All

Shared schema, schema per tenant, database per tenant. Each model trades operational cost for blast radius. Row-level security alone does not stop the noisy neighbor.

Mar 19, 2026

Time-Series Databases: Why a Generic Row Store Falls Over at Metric Scale

Time-series data is append-only, write-heavy, and queried by time range. That access pattern demands columnar layout, delta-encoded timestamps, and downsampling, not a B-tree.

Mar 8, 2026

MongoDB: When to Embed, When to Reference

Embed for one-to-few, bounded, read-together data. Reference for shared or unbounded data. The 16MB document limit and write hotspots decide the rest.

Feb 24, 2026

Relational, Document, Graph: Pick the Database That Matches Your Relationships

Database choice is really a relationship-shape choice. Joins want relational. Aggregates want document. Traversal wants graph. Forcing the wrong one shows up as recursive CTE hell.

Feb 20, 2026

Database Indexes: What They Cost You for the Speed They Buy

B+tree indexes give O(log n) reads via sorted pages, with leftmost prefix rules for composite keys and covering indexes that skip the heap. Every index also taxes every write, so prune before you add.

Feb 19, 2026

The Three Join Algorithms Every Query Planner Picks Between

Nested loop, hash join, merge join. Each wins in a different shape of data. Reading EXPLAIN tells you which one the planner chose and whether it was right.

Feb 19, 2026

Time-Series Database Internals: Why TSDBs Look Nothing Like Postgres

How InfluxDB, TimescaleDB, and Prometheus actually store metrics. Time-chunked files, columnar layout, downsampling, retention, and why insert-only is the whole game.

Feb 16, 2026

Database Partitioning Strategies: Range, Hash, List, and the Composite You Wish You Had Picked

Range, hash, and list partitioning each optimize for different access patterns. The trap is picking a key that distributes writes well but destroys read locality.

Feb 14, 2026

Partition Pruning: The Optimization That Vanishes When You Touch the Key

Partition pruning lets the planner skip irrelevant partitions, but only when it can see the partition key in the predicate. Wrap the column in a function and the optimization disappears.

Feb 13, 2026

The Query Planner Is Guessing, and Stale Statistics Make It Guess Wrong

The cost-based optimizer picks scan type, join order, and join method from row count estimates. When the estimates are off by five orders of magnitude, it picks nested loops on a million rows and your query runs for 25 minutes.

Feb 12, 2026

Write-Ahead Logging: The Real Boundary Between Committed and Lost

WAL appends every change to a sequential log before touching data pages. fsync on that log is the durability line. Group commit, synchronous_commit, and battery-backed caches all live on this line.

Feb 11, 2026

Transaction Isolation Levels: The Anomaly Each One Actually Stops

Read Uncommitted, Read Committed, Repeatable Read, and Serializable each ban a specific anomaly. Postgres Repeatable Read is snapshot isolation and still allows write skew. Knowing which level stops what is the whole game.

Feb 11, 2026

B-Trees vs LSM-Trees: How the Write Path Differs

B-trees do in-place page updates with random I/O. LSM-trees buffer in a memtable and flush sorted SSTables. The write path is where the two structures diverge.

Feb 9, 2026

Postgres VACUUM and Bloat: Why Your Hot Table Quietly Gets Slow

MVCC keeps old row versions until no transaction can see them. VACUUM reclaims that space. Long transactions, idle-in-transaction sessions, and replication slots pin dead tuples and let tables bloat 10x or more.

Feb 4, 2026

Bloom Filters: A Probabilistic Permission Slip to Skip Work

K hash functions, one bit array, zero false negatives. Bloom filters answer maybe or definitely not, which is exactly what LSM reads, CDN caches, and crawlers need to skip expensive checks.
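
The "K hash functions, one bit array" structure might look like the sketch below. It assumes k positions derived from salted SHA-256 digests, which is a convenient stand-in rather than what any particular database uses; the class name, default sizes, and method names are illustrative.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter: k bit positions per item in one shared bit array."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 3) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item: str):
        # Assumption: salting one hash with the index i simulates k hash functions.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item: str) -> bool:
        # False means "definitely not present"; True only means "maybe".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item)
        )


if __name__ == "__main__":
    bf = BloomFilter()
    bf.add("sstable-key-42")
    print(bf.might_contain("sstable-key-42"))
```

An added item always reports `True`, so there are no false negatives. An item that was never added usually reports `False`, but can collide with set bits and report a false positive.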

Feb 1, 2026

LSM Trees: Why Writes Are Almost Free and Reads Pay the Bill

Memtable plus WAL plus SSTable levels. The LSM write path is O(1) sequential I/O, while reads climb levels guarded by bloom filters and sparse indexes. Compaction is the price.

Feb 1, 2026

MVCC: Why Readers Never Block Writers

MVCC gives every transaction its own snapshot by keeping multiple row versions. Postgres stores old tuples in the table and reclaims them with VACUUM; MySQL (InnoDB) rebuilds old versions from its undo log. Readers and writers stop fighting.

Jan 29, 2026

Partitioning vs Indexing: Two Layers, Two Problems, One Common Confusion

Partitioning bounds the working set, indexing accelerates lookups within it. You usually need both, and local vs global indexes is the question that decides whether your partitioned table is actually fast.

Jan 28, 2026

ACID Letter by Letter: Which Failure Each One Actually Prevents

Atomicity stops partial commits. Consistency is an app invariant, not a DB guarantee. Isolation stops concurrent transactions from corrupting each other. Durability survives the crash. Most apps lose money at the I, not the A.

Jan 24, 2026

LSM Compaction Strategies: Size-Tiered, Leveled, and the Workload That Picks for You

Size-tiered compaction trades space for write throughput. Leveled compaction trades write amplification for predictable reads. Universal sits in the middle. Pick by workload, not by default.

Jan 23, 2026

LSM Reads: Bloom Filters and Sparse Indexes Are What Make GETs Survivable

An LSM GET could scan every SSTable on every level. It does not, because two structures filter the work: a bloom filter per file and a sparse index inside each file. Both are non-negotiable.

Jan 21, 2026

B-Tree vs LSM: Stop Comparing Features, Start Comparing Disk Shapes

B-trees mutate pages in place. LSM trees append immutable files and merge later. Once you see the disk shape, write amplification and read latency stop being mysterious.

Jan 20, 2026

Write Amplification in B-Trees and LSM Trees: The Ratio Math

B-trees amplify writes through page rewrites and the WAL. LSM trees amplify through compaction across levels. The math behind each, and why neither is free.

Jan 14, 2026

Redis Cluster Slot Migration: How Traffic Keeps Flowing While Slots Move

Redis Cluster shards by 16384 hash slots, not by keys. Here is how MOVED and ASK redirects keep clients on the right node during a live resharding, and how one bad client setting can loop traffic forever.
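
How a key lands in one of the 16384 slots can be sketched as follows. This follows the Redis Cluster specification's rule (CRC16 of the key, modulo 16384, with hash tags in braces) and uses Python's `binascii.crc_hqx` with an initial value of 0 as the CRC16 variant; the `key_slot` helper is a hypothetical name, and a real client library should be preferred over this sketch.

```python
import binascii

NUM_SLOTS = 16384


def key_slot(key: str) -> int:
    """Compute the cluster hash slot for a key, honoring {hash tags}."""
    data = key.encode()
    start = data.find(b"{")
    if start != -1:
        end = data.find(b"}", start + 1)
        # Only a non-empty tag between the first '{' and the next '}' counts.
        if end != -1 and end > start + 1:
            data = data[start + 1:end]
    # CRC16 (XMODEM variant, initial value 0) reduced to the slot range.
    return binascii.crc_hqx(data, 0) % NUM_SLOTS


if __name__ == "__main__":
    # Keys sharing a hash tag map to the same slot, so they live on one node.
    print(key_slot("{user1000}.following") == key_slot("{user1000}.followers"))
```

Because routing depends on the slot rather than the key, a resharding moves whole slots between nodes, and redirects tell the client where a given slot currently lives.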

Jan 12, 2026

Redis Durability: What You Actually Lose on Crash

Redis is honest about durability. RDB, AOF, and replication solve different problems and lose different amounts of data. Here is exactly what is gone after a crash under each mode.

Jan 12, 2026

Redis Durability: RDB, AOF, and the Lie of 'Just a Cache'

RDB snapshots are fast but lossy. AOF with everysec can lose about a second of writes on a crash. Replication protects availability, not data. The mode you choose decides what you lose.

Dec 23, 2025

Elasticsearch Indexing vs Search: The Two Paths That Define the System

Elasticsearch has two completely different code paths. The index path builds inverted indexes through analyzer chains. The search path fans out, scores, and merges. The refresh interval is the seam between them.