System Design Fundamentals
SQL vs NoSQL
Relational databases have survived every hype cycle for 50 years because they solve one problem exceptionally well: they guarantee that your data is correct, even under concurrent access, even after crashes, even when business logic spans multiple tables. That guarantee is ACID.

Why ACID Matters
Atomicity means a transaction either fully completes or fully rolls back. Transferring $100 between accounts either debits one and credits the other, or does neither. There is no state where the money disappears.
Consistency means the database enforces constraints (foreign keys, unique indexes, check constraints) at the end of every transaction. You cannot insert an order referencing a user that does not exist.
Isolation (via MVCC in PostgreSQL) means concurrent transactions do not see each other's intermediate state. Two users buying the last item in stock both see "1 available," but only one transaction succeeds.
Durability means committed data survives crashes. Write-ahead logging ensures that even if the server loses power mid-transaction, committed changes are recovered on restart.
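The guarantees above are easiest to see in code. Here is a minimal sketch of atomicity using Python's built-in sqlite3 module; the accounts table, balances, and CHECK constraint are hypothetical stand-ins for real business rules:

```python
import sqlite3

# Hypothetical two-account ledger; the CHECK constraint forbids overdrafts.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (id TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 0)")
conn.commit()

def transfer(conn, src, dst, amount):
    """Debit src and credit dst in one transaction; roll back on any failure."""
    try:
        with conn:  # opens a transaction: commits on success, rolls back on exception
            conn.execute("UPDATE accounts SET balance = balance - ? WHERE id = ?",
                         (amount, src))
            conn.execute("UPDATE accounts SET balance = balance + ? WHERE id = ?",
                         (amount, dst))
        return True
    except sqlite3.IntegrityError:  # CHECK fires if src would go negative
        return False

assert transfer(conn, "alice", "bob", 60) is True
assert transfer(conn, "alice", "bob", 60) is False  # would overdraw: nothing applied
balances = dict(conn.execute("SELECT id, balance FROM accounts"))
assert balances == {"alice": 40, "bob": 60}
```

The failed second transfer leaves no partial state: the debit that ran before the constraint fired is rolled back along with everything else.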
Normalization and Joins
Relational databases store data in normalized form: each fact lives in exactly one place. A user's email is stored once in the users table, not duplicated across every order. This eliminates update anomalies (change the email in one place, not 10,000 order rows) but requires joins to reassemble the data at query time.
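For example, summing shipped revenue per user spans both tables in a single statement. A runnable sketch using SQLite and a hypothetical users/orders schema:

```python
import sqlite3

# Hypothetical normalized schema: each user's email lives only in users.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users  (id INTEGER PRIMARY KEY, email TEXT UNIQUE NOT NULL);
    CREATE TABLE orders (id INTEGER PRIMARY KEY,
                         user_id INTEGER REFERENCES users(id),
                         total_cents INTEGER NOT NULL, status TEXT NOT NULL);
    INSERT INTO users  VALUES (1, 'asha@example.com'), (2, 'ben@example.com');
    INSERT INTO orders VALUES (1, 1, 2500, 'shipped'), (2, 1, 1200, 'shipped'),
                              (3, 2,  900, 'cancelled'), (4, 2, 4000, 'shipped');
""")

# Join, filter, group, and aggregate in one statement:
# shipped order count and revenue per user email.
rows = conn.execute("""
    SELECT u.email, COUNT(*) AS n_orders, SUM(o.total_cents) AS revenue_cents
    FROM orders o
    JOIN users u ON u.id = o.user_id
    WHERE o.status = 'shipped'
    GROUP BY u.email
    ORDER BY revenue_cents DESC
""").fetchall()

assert rows == [('ben@example.com', 1, 4000), ('asha@example.com', 2, 3700)]
```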
This query touches two tables, filters, groups, and aggregates in a single statement. Relational databases are built for this kind of ad-hoc analytical query.
Scaling Relational Databases
The traditional scaling path is vertical first: bigger CPU, more RAM, faster disks. A well-tuned PostgreSQL instance handles tens of thousands of transactions per second on modern hardware. Beyond that, add read replicas for read-heavy workloads, then partitioning (range, hash, or list) to split large tables. True horizontal write scaling requires sharding, which introduces complexity (cross-shard joins, distributed transactions) that most applications never need.
The real advantage of SQL is not any single feature. It is the combination of ACID guarantees, ad-hoc query power, and a 50-year ecosystem of tooling. When you choose a relational database, you get transactions, joins, migrations, ORMs, monitoring, and backup tools that have been battle-tested across millions of production deployments.
Relational databases assume you know your data shape before you write the first row. That assumption breaks in two common scenarios: when the product iterates so fast that the schema changes weekly, and when write volume exceeds what a single server can handle.
NoSQL databases relax the relational model's constraints to solve these problems. They trade joins and strong consistency for schema flexibility and horizontal scalability.
Schema Flexibility
In a relational database, adding a field to the users table requires an ALTER TABLE migration coordinated with every deployment (and, depending on the database version and column definition, it may rewrite every row). In a document database, you just start writing documents with the new field. Old documents without the field continue to work. This is not "no schema": the schema still exists, but it lives in the application code, which eliminates the migration step that slows down fast-moving teams.
Horizontal Write Scaling
Relational databases scale reads with replicas but struggle to scale writes beyond a single primary. NoSQL databases like Cassandra distribute writes across a cluster of equal peers, and adding a node increases write capacity linearly. This matters for workloads like IoT telemetry (millions of writes per second) or event logging where a single primary becomes a bottleneck.
The ACID vs BASE Trade-off
SQL databases default to ACID: strong consistency, every transaction sees the latest committed state. NoSQL databases often default to BASE (Basically Available, Soft state, Eventually consistent): writes propagate asynchronously, and different replicas may temporarily disagree. The application must handle stale reads.
This is not a weakness. It is a deliberate design choice. For a social media feed, showing a post 500ms late is invisible to users but enables a system that handles millions of writes per second across global data centers. For a bank ledger, that same 500ms of inconsistency would be unacceptable.
When to Choose NoSQL
Choose NoSQL when your access pattern is simple (key lookups, append-only writes), your data shape is evolving, or your write volume demands horizontal scaling. Choose SQL when you need transactions, complex queries, or strong consistency. Many production systems use both; this is called polyglot persistence.
Redis implements the simplest NoSQL model, the key-value store: every operation maps a key to a value. But Redis is not just a key-value store. It is an in-memory data structure server that supports strings, hashes, lists, sets, sorted sets, streams, and more. This makes it far more versatile than a simple cache.

Why Redis Is Fast
Data lives entirely in RAM. A single Redis instance handles 100,000+ operations per second with sub-millisecond latency. The single-threaded event loop eliminates lock contention, so every operation is atomic by default. For durability, Redis optionally persists to disk via RDB snapshots (point-in-time dumps) or AOF (append-only file logging every write).
Data Structures and Use Cases
The power of Redis is matching the right data structure to the problem:
- Strings: cache a rendered HTML fragment, or store a session token with a TTL. `SET session:user:123 <token> EX 3600` creates a session that auto-expires in one hour.
- Hashes: store an object with multiple fields without serializing to JSON. `HSET user:123 name "Asha" plan "pro" logins 42` lets you read or update individual fields atomically.
- Sorted Sets (ZSET): maintain a ranked collection. `ZADD leaderboard 1520 "player:42"` inserts a score; `ZREVRANGE leaderboard 0 9` returns the top 10 players. Leaderboards, priority queues, and time-ordered feeds all use ZSETs.
- Lists: implement queues. `LPUSH queue:emails <msg>` enqueues; `BRPOP queue:emails 0` blocks until a message arrives. Simple job queues without Kafka overhead.
- Streams: append-only log with consumer groups. A lightweight alternative to Kafka for moderate-throughput event streaming.
Scaling Redis
A single Redis instance is bounded by RAM. For larger datasets, Redis Cluster shards keys across multiple nodes using hash slots (16,384 slots distributed across nodes). For high availability, each primary node has one or more replicas with automatic failover via Redis Sentinel or Cluster mode.
In interviews, the most common Redis mistake is treating it as a primary database without considering the RAM constraint. A 500GB dataset does not fit in one server's memory. Redis is ideal for hot data (sessions, caches, counters, leaderboards) where the working set fits in RAM. For cold data, use a disk-based database and cache the hot subset in Redis.
MongoDB stores data as JSON-like documents (internally BSON) where each document is a self-contained record with its own structure. Unlike relational tables where every row has the same columns, MongoDB documents in the same collection can have different fields. This flexibility is the core value proposition.

Document Model
A product document in MongoDB might look like this:
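Here is a sketch with hypothetical fields, written as the Python dict you would hand to pymongo's `insert_one` (MongoDB stores it as BSON):

```python
# Hypothetical product document; field names and values are illustrative.
product = {
    "_id": "sku-1042",
    "name": "Trail Running Shoe",
    "brand": "Acme",
    "price_cents": 12900,
    "tags": ["running", "outdoor"],
    "variants": [                       # embedded one-to-few relationship
        {"color": "black", "size": 42, "stock": 7},
        {"color": "blue",  "size": 43, "stock": 0},
    ],
}

# A single read returns the product with all its variants; no join needed.
in_stock_colors = [v["color"] for v in product["variants"] if v["stock"] > 0]
assert in_stock_colors == ["black"]
```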
The variants array is embedded directly in the product document. In a relational database, variants would be a separate table with a foreign key back to products. Embedding eliminates the join: a single read returns the product with all its variants.
Embedding vs Referencing
The most important modeling decision in MongoDB is when to embed and when to reference.
Embed when the related data is read together, has a one-to-few relationship, and does not grow unboundedly. Product variants, shipping addresses, and order line items are good candidates. Embedding gives you single-document reads and atomic writes (MongoDB guarantees atomicity at the document level).
Reference when the related data is shared across documents, has a one-to-many or many-to-many relationship, or grows without limit. User profiles referenced by orders, tags shared across products, and comment threads that grow to thousands of entries should use references (store the _id and query separately).
The rule of thumb: if it fits in 16MB (MongoDB's document size limit) and you always read it together, embed it. If it can grow unboundedly or is shared, reference it.
Indexing and Querying
MongoDB supports secondary indexes on any field, including fields inside nested documents and arrays. You can index variants.color to find all products available in black without scanning every document. The aggregation pipeline provides GROUP BY, JOIN-like lookups ($lookup), and transformations, though cross-collection operations are slower than relational joins.
Scaling MongoDB
MongoDB scales horizontally via sharding. A shard key (e.g., user_id) determines which shard stores each document. Reads and writes route through mongos (the query router), which directs operations to the correct shard. Each shard is a replica set (primary + secondaries) for high availability. Choosing the right shard key is critical: a key with low cardinality (e.g., country) creates hot shards; a high-cardinality key (e.g., user_id) distributes evenly.
The biggest MongoDB anti-pattern is treating it like a relational database. If you normalize everything into separate collections and use $lookup for every query, you lose all the benefits of the document model while gaining none of the relational guarantees (no cross-document transactions until MongoDB 4.0, and even then they are expensive). Design your documents around your access patterns: embed what you read together.
Cassandra is built for one thing: handling massive write volumes across globally distributed nodes with no single point of failure. It achieves this through a peer-to-peer architecture, tunable consistency, and a write-optimized storage engine.

Data Model: Partitions and Clustering
Cassandra organizes data around the partition key and clustering columns. The partition key determines which node stores the data (via consistent hashing). Clustering columns determine the sort order within a partition.
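This layout can be sketched as a CQL table; the `sensor_readings` schema below is hypothetical, chosen to match the IoT example that follows:

```sql
-- Hypothetical CQL schema: one partition per device per day,
-- rows sorted by reading_time within each partition.
CREATE TABLE sensor_readings (
    device_id    text,
    day          date,
    reading_time timestamp,
    temperature  double,
    PRIMARY KEY ((device_id, day), reading_time)
) WITH CLUSTERING ORDER BY (reading_time ASC);
```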
Here, (device_id, day) is the partition key and reading_time is the clustering column. All readings for device "sensor-42" on "2025-10-01" live in the same partition, sorted by timestamp. This makes range queries within a partition fast: "give me all readings for sensor-42 today between 2pm and 3pm" is a single sequential read.
Write Path: Why Cassandra Is Write-Optimized
Every write follows the same path: commit log (sequential append for durability) then memtable (in-memory sorted buffer). When the memtable fills, it flushes to an SSTable (Sorted String Table) on disk. SSTables are immutable. Updates and deletes write new entries (tombstones for deletes), and periodic compaction merges SSTables to reclaim space.
This append-only design means writes never require a disk seek or read-before-write. A single Cassandra node handles 10,000-50,000 writes per second, and a 10-node cluster scales linearly to 100,000-500,000 writes per second.
Tunable Consistency
Cassandra replicates data to N nodes (the replication factor, typically 3). For each read or write, you choose a consistency level:
- ONE: fastest, returns after one replica responds. Risk of stale reads.
- QUORUM: majority (2 of 3 replicas). Balances consistency and latency.
- ALL: slowest, waits for every replica. Strongest consistency but any node failure blocks the operation.
The formula: if W + R > N (where W = write consistency level, R = read consistency level, N = replication factor), every read quorum overlaps every write quorum, so reads always see the latest write. With N = 3, QUORUM is 2 replicas, and 2 + 2 > 3: quorum writes followed by quorum reads give strong consistency.
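The overlap condition is easy to check mechanically. A few lines of Python (just the arithmetic, not Cassandra code):

```python
def is_strongly_consistent(w: int, r: int, n: int) -> bool:
    """W + R > N guarantees the read and write quorums share at least one replica."""
    return w + r > n

# QUORUM for N replicas is a majority: n // 2 + 1.
n = 3
quorum = n // 2 + 1                                # 2 of 3
assert is_strongly_consistent(quorum, quorum, n)   # QUORUM + QUORUM: 2 + 2 > 3
assert not is_strongly_consistent(1, 1, n)         # ONE + ONE can return stale data
assert is_strongly_consistent(n, 1, n)             # write ALL, read ONE also overlaps
```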
Partition Key Design: The Critical Decision
A bad partition key creates hot partitions. If you partition IoT data by device_id alone and one device sends 100x more data, that partition becomes a hot spot. Compound partition keys like (device_id, day) bound partition size (one day of data per partition) and distribute writes across more partitions.
Wide partitions (millions of rows) degrade read performance because Cassandra must scan the entire partition to find specific rows. Keep partitions under 100MB as a rule of thumb.
Time-series data has a distinctive pattern: writes are almost always appends (new data points arriving continuously), reads are almost always range scans over a time window, and old data becomes less valuable over time. General-purpose databases can store time-series data, but purpose-built databases like InfluxDB optimize specifically for this access pattern.

The Write Pattern
Sensors, servers, and applications emit metrics at fixed intervals (every second, every 15 seconds). This creates a firehose of append-only writes with monotonically increasing timestamps. InfluxDB's storage engine (TSM, Time-Structured Merge tree) is optimized for this: like Cassandra's SSTable approach, it buffers writes in memory, then flushes sorted blocks to disk. Compression is highly effective because adjacent timestamps and similar values compress well (often 2-10x compression ratios).
Data Model
InfluxDB organizes data into measurements (like tables), tags (indexed metadata for filtering), and fields (the actual values):
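In InfluxDB's line protocol, a hypothetical CPU metric might be written as:

```text
cpu_usage,host=server-1,region=us-east value=0.64 1727787600000000000
```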
Tags (host, region) are indexed and support fast filtering. Fields (value) are the measured data. The timestamp is nanosecond precision. This model is optimized for queries like "average CPU usage for all servers in us-east over the last hour."
Retention Policies and Downsampling
Raw data at 1-second resolution generates 86,400 points per metric per day. Over months, this volume becomes enormous. InfluxDB solves this with retention policies that automatically delete data older than a configured duration, and continuous queries (or tasks in InfluxDB 2.x) that aggregate raw data into lower-resolution summaries.
A typical setup: keep raw 1-second data for 7 days, downsample to 1-minute averages and keep for 90 days, downsample to 1-hour averages and keep forever. This reduces storage by orders of magnitude while preserving the ability to zoom into recent data at full resolution.
Downsampling is the key insight in time-series database design. Raw data at 1-second resolution for 1,000 metrics over 1 year is 31.5 billion data points. Downsampled to 1-minute averages, that drops to 525 million, a 60x reduction. The recent data where you need full resolution is small enough to keep, and the historical data where you only need trends is compressed by orders of magnitude.
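The arithmetic behind those counts, as a quick check:

```python
# Raw 1-second resolution vs 1-minute downsampled averages.
SECONDS_PER_DAY = 86_400
metrics = 1_000

raw_points_per_year = metrics * SECONDS_PER_DAY * 365   # 1-second resolution
downsampled_per_year = raw_points_per_year // 60        # 1-minute averages

assert raw_points_per_year == 31_536_000_000            # ~31.5 billion
assert downsampled_per_year == 525_600_000              # ~525 million
assert raw_points_per_year // downsampled_per_year == 60  # 60x reduction
```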
Each database type makes different trade-offs. The comparison below distills the key dimensions that matter when choosing between them.
| Dimension | PostgreSQL | Redis | MongoDB | Cassandra | InfluxDB |
|---|---|---|---|---|---|
| Data model | Tables with fixed columns | Key-value with typed structures | JSON documents | Wide-column with partition keys | Time-series (measurement, tags, fields) |
| Schema | Rigid (DDL migrations) | None (app-defined) | Flexible (per-document) | Rigid per table (CQL) | Semi-rigid (measurement schema) |
| Consistency | Strong ACID | Strong (single instance) | Strong (single doc), tunable (multi-doc) | Tunable (ONE to ALL) | Strong (single node) |
| Write scaling | Vertical + partitioning | Cluster (hash slots) | Sharding (shard key) | Linear horizontal (peer-to-peer) | Vertical + clustering |
| Read pattern | Ad-hoc SQL, joins, aggregates | Key lookup, O(1) | Document lookup, aggregation pipeline | Partition-key lookup, range scans | Time-range scans, GROUP BY time |
| Sweet spot | Transactions, complex queries | Caching, sessions, leaderboards | Flexible catalogs, content management | High-volume writes, IoT, event logs | Metrics, monitoring, sensor data |
When Each Database Wins
PostgreSQL wins when you need multi-table transactions, complex ad-hoc queries, or regulatory compliance that demands strong consistency. Financial systems, inventory management, and any domain where "eventually correct" is not good enough.
Redis wins when you need sub-millisecond response times for simple access patterns. Session storage, caching, rate limiting, real-time leaderboards, anywhere the working set fits in RAM and the access pattern is key-based.
MongoDB wins when your data shape varies by entity type (product catalogs with different attributes per category), you need flexible schema evolution, and your access patterns align with document boundaries (read the whole document, not pieces across collections).
Cassandra wins when write volume overwhelms a single node and you need linear horizontal scaling with no single point of failure. IoT telemetry, event sourcing, activity feeds, all write-heavy workloads with predictable query patterns.
InfluxDB wins when the data is purely time-series with append-only writes, time-range reads, and a need for automatic downsampling and retention management.

The decision framework is not "SQL or NoSQL." It is "which database best fits each workload in my system." Start with these questions:
1. What are your consistency requirements?
If a stale read causes financial loss, data corruption, or safety issues, you need strong ACID consistency. PostgreSQL is the default choice. If stale reads are invisible to users (social feeds, analytics, cached content), eventual consistency is acceptable and unlocks horizontal scaling options.
2. What is your access pattern?
- Key lookups with sub-millisecond latency: Redis
- Document reads with flexible schema: MongoDB
- High-volume appends with time-range scans: InfluxDB or Cassandra
- Complex ad-hoc queries with JOINs: PostgreSQL
3. What is your write volume?
Under 10K writes per second, a single PostgreSQL node handles it. Between 10K and 100K, consider MongoDB sharding or Cassandra. Above 100K, you are in the range Cassandra's linear horizontal scaling was designed for.
4. How does your schema evolve?
If the schema is stable and well-understood, relational models give you the strongest guarantees. If it changes weekly with different entities having different shapes, document models avoid migration overhead.
5. Do you need multiple databases?
Polyglot persistence (using the right database for each workload) is the norm in production systems. PostgreSQL for transactions, Redis for caching, Cassandra for event logs, InfluxDB for metrics. The operational cost of running multiple databases is real (deployment, monitoring, backups), but it is usually less than the cost of forcing one database to serve all access patterns poorly.
In interviews, never say 'I would use MongoDB' or 'I would use PostgreSQL' without explaining the access pattern that drives the choice. The interviewer wants to see that you match the database to the workload, not that you have a favorite. Start with the access pattern and consistency requirement, then derive the database choice.
The SQL vs NoSQL debate is a false dichotomy. Production systems rarely use one database for everything. The real skill is matching each workload to the database designed for it.
PostgreSQL is the default starting point for most applications because ACID transactions, ad-hoc queries, and a mature ecosystem handle a wide range of workloads. You add specialized databases when PostgreSQL hits a wall: Redis when you need sub-millisecond latency, MongoDB when your schema evolves faster than migrations allow, Cassandra when write volume exceeds a single node, and InfluxDB when time-series data needs purpose-built storage and retention.
The decision framework is always the same: start with the access pattern (key lookups, document reads, range scans, complex joins), then consider the consistency requirement (strong ACID vs eventual), then evaluate the write volume (single node vs distributed). The database choice follows from these three inputs, not from team preference or industry trends.
Polyglot persistence (using multiple databases, each for its strength) is not over-engineering. It is the standard architecture for systems that serve diverse workloads at scale. The operational cost of running multiple databases is real, but it is almost always less than the cost of forcing one database to handle workloads it was not designed for.
The best database is the one that matches your access pattern. Not the newest, not the most popular, not the one your team already knows. Start with the workload, derive the requirements, and let the requirements choose the database.