Design Dropbox - System Design

Requirements

Functional Requirements:

Allow users to upload files to the system.
Enable users to download uploaded files.
Ensure synchronization of files between local and server storage.
Batch upload & download

Non-Functional Requirements:

app availability 99.99%, 52min downtime in a year
file/data eventual consistency
scalability, when a server fails, the failover
security, only user with permission can upload/view. encryption for file object
durability, needs replication for each component.

world 8B, 10% users from pop = 0.8B

system DAU will be 10% of users = 80M

R:W = 10:1

W DAU

API Design

1x1

POST /v1/file {file info}-> return {status, error code, location url}

batch

POST /v1/files {files info}-> return {status, error code, location url}

download a file

GET /v1/file/id

GET /v1/files/id={id1, id2,..}

get file info, to finalize upload

Get /v1/fileinfo/id

all above API using JWT as user security token to access endpoints

High-Level Design

fileInfo: fileId, name, author, chunkIds:[], updated, created

chunkInfo: fileId, chunkId, chunkUrl

Client → API Gateway

Client calls:

upload(fileId)

API Gateway responsibilities:
- Route request to available upload servers
- Use least-connections load balancing
- Handle failover when a node goes down

👉 Design intent

Prevent hotspot
Ensure high availability at entry point

2. Raw Object Storage (S3)

Upload server:
- Directly streams raw file → S3 (raw bucket)

👉 Why first write to S3?

Avoid coupling with metadata or processing
Durable storage immediately (no data loss)

3. Metadata Registration (FileInfo Service)

Client sends:

POST upload(fileInfo)

FileInfo Service:
- Stores metadata into DB

fileInfo {
fileId,
name,
author,
created,
updated
}

👉 Design intent

Metadata and file storage are decoupled
Enables independent scaling

4. Async Processing via Kafka

System publishes event:

topic: file-uploaded

👉 This triggers downstream processing pipeline

5. Object Processing Pipeline (Async Consumers)

A dedicated Object Handler Service consumes Kafka events:

Pipeline stages:

Raw S3 Object
↓
Split into chunks
↓
Fraud Detection
↓
Compression Service

👉 Why pipeline?

Each stage is independently scalable
Failures are isolated (retry per stage)

6. Chunk Storage

After processing:
- Each chunk uploaded to Chunk S3

👉 Guarantees:

All chunks successfully uploaded
Idempotent writes (important for retries)

7. Chunk Notification (Kafka)

Publish event:

topic: chunk-created

8. Chunk Metadata Service

Chunk Service consumes event
Stores chunk metadata:

chunkInfo {
fileId,
chunkId,
chunkUrl
}

👉 Design intent

Enables parallel download later
Avoids scanning S3 during reads

Client → API Gateway

2. CDN Check

First check:
- CDN cache (edge)

👉 If hit → return immediately

👉 If miss → go backend

3. File Metadata Lookup

Request goes to FileInfo Service
Service fetches:
- fileInfo DB
- chunkInfo DB

4. Return Chunk URLs

Response contains:

[
{chunkId, chunkUrl},
...
]

5. Client Parallel Download

Frontend:
- Downloads chunks in parallel
- Reconstructs file locally

👉 Why client-side merge?

Reduces backend load
Maximizes bandwidth usage

Detailed Component Design

Fault Tolerance

Raw file already in S3 → no data loss
Kafka enables retry
Chunk processing is idempotent

Scalability

Chunking enables:
- parallel processing
- parallel download
Each service scales independently
Faster uploads

Failover / Replication Design

1. Local-region design

Write path

Writes go to the primary node
The primary replicates data to local replicas
The write is considered successful only after:
- primary write succeeds, and
- at least one replica acknowledges the write

Read path

Reads are served from local replicas whenever possible
This reduces read latency and offloads traffic from the primary

Why this design

Compared with waiting for all replicas, waiting for one replica ack gives a better balance:
- better durability than primary-only ack
- lower latency than full synchronous replication

2. Multi-region design

Replication strategy

Use quorum-based replication across regions

For example:

total replicas = 5
write quorum = 3

Then:

upload is considered successful once 3 out of 5 replicas acknowledge the write

Why quorum

It allows the system to tolerate node or region failures
As long as quorum is preserved, the system can continue serving writes safely

3. Failover strategy

During failover, the system promotes the most up-to-date replica as the new primary, typically the one with the smallest replication lag or the latest committed log position.

Performance Optimization

CDN for hot files
Chunk-based parallel download

Extensibility

Pipeline can easily add:
- virus scan
- AI tagging
- preview generation