Design Dropbox - System Design

System requirements

Functional:

Users can upload and download files.
Users can modify file content.
Users can share files with permissions (read-only, read-write, etc.).

Non-Functional:

Availability.
Durability.
Scalability.
Consistency.
Security.
Low latency.

Capacity estimation

Blob Storage (File Content)

Average file size: 10 MB
Uploads per user per day: 10
Active users: 100,000
Daily upload volume: 10 MB × 10 uploads × 100,000 users = 10 TB/day
Annual storage needed: 10 TB/day × 365 days ≈ 3.65 PB/year
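
The blob storage arithmetic can be checked with a short script. This is only a sketch: it assumes decimal units (1 TB = 1,000,000 MB) and ignores replication, deduplication and compression, none of which are specified above.

```python
# Back-of-the-envelope blob storage estimate.
# Assumptions: decimal units, no replication or deduplication overhead.
AVG_FILE_SIZE_MB = 10
UPLOADS_PER_USER_PER_DAY = 10
ACTIVE_USERS = 100_000

daily_uploads = UPLOADS_PER_USER_PER_DAY * ACTIVE_USERS
daily_volume_tb = daily_uploads * AVG_FILE_SIZE_MB / 1_000_000
annual_volume_pb = daily_volume_tb * 365 / 1_000

print(f"Uploads/day:   {daily_uploads:,}")
print(f"Daily volume:  {daily_volume_tb:g} TB")
print(f"Annual volume: {annual_volume_pb:g} PB")
```

Any replication factor chosen for durability would multiply the annual figure accordingly.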

Metadata Database

a. Files table:

Rows/year: ~365 million (1 million uploads/day × 365 days)
Row size: ~300 bytes
Total: ~110 GB/year

b. File versions table:

Rows/year: ~550 million (assuming 1.5 versions per file)
Row size: ~400 bytes
Total: ~220 GB/year

c. Other tables

Estimated size: ~100–150 GB
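
A rough check of the metadata figures, assuming the row counts follow from the upload rate in the blob storage estimate (10 uploads per user per day, 100,000 users). Index overhead is not included; that omission is an assumption of this sketch.

```python
# Metadata sizing sketch; row sizes are the estimates stated above.
FILES_PER_YEAR = 10 * 100_000 * 365      # uploads/user/day * users * days
VERSIONS_PER_FILE = 1.5

files_gb = FILES_PER_YEAR * 300 / 1e9    # files table, ~300 bytes/row
version_rows = FILES_PER_YEAR * VERSIONS_PER_FILE
versions_gb = version_rows * 400 / 1e9   # file versions table, ~400 bytes/row

# "Other tables" is given only as a 100-150 GB range, so keep it a range.
total_low_gb = files_gb + versions_gb + 100
total_high_gb = files_gb + versions_gb + 150

print(f"Files table:    {files_gb:.1f} GB")
print(f"Versions table: {versions_gb:.1f} GB")
print(f"Total metadata: {total_low_gb:.0f}-{total_high_gb:.0f} GB")
```

The exact products are slightly below the rounded figures in the estimate, which round the version row count up to 550 million.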

API design

File APIs:

POST /files – Create a new file upload (get upload URL)

PUT /uploads/{upload_id}/chunks/{part_number} – Upload a chunk

POST /uploads/{upload_id}/complete – Finalize chunked upload

GET /files – List user’s files

GET /files/{file_id}/download – Download a file

DELETE /files/{file_id} – Delete a file
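
To illustrate how the chunked upload endpoints fit together, here is a sketch of the request sequence a client might issue once POST /files has returned an upload_id. The function name, the 1-based part numbering and the fixed chunk size are assumptions, not part of the API above.

```python
def plan_chunked_upload(upload_id: str, file_size: int, chunk_size: int) -> list[tuple[str, str]]:
    """Return the (method, path) pairs a client would send for one upload.

    Hypothetical helper: it only builds the request plan, it sends nothing.
    """
    if file_size <= 0 or chunk_size <= 0:
        raise ValueError("file_size and chunk_size must be positive")
    parts = (file_size + chunk_size - 1) // chunk_size  # ceiling division
    plan = [("PUT", f"/uploads/{upload_id}/chunks/{n}") for n in range(1, parts + 1)]
    plan.append(("POST", f"/uploads/{upload_id}/complete"))
    return plan


# A 25 MB file in 10 MB chunks needs three PUTs and one completion call.
for method, path in plan_chunked_upload("abc123", 25_000_000, 10_000_000):
    print(method, path)
```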

Versioning:

GET /files/{file_id}/versions – List versions of a file

GET /files/{file_id}/versions/{version_id} – Get a specific version

Sync & Devices:

POST /devices – Register a new device

GET /sync – Get list of files needing sync

POST /sync/ack – Acknowledge synced files
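
One possible model behind these three endpoints is a per-device set of pending file IDs. The class below is an in-memory sketch of that idea; the class name, method names and the rule of skipping the device that made the change are all assumptions.

```python
from collections import defaultdict


class SyncState:
    """In-memory stand-in for the sync service's per-device bookkeeping."""

    def __init__(self):
        self._devices = defaultdict(set)   # user_id -> device ids
        self._pending = defaultdict(set)   # device_id -> file ids awaiting sync

    def register_device(self, user_id, device_id):
        """Roughly what POST /devices would record."""
        self._devices[user_id].add(device_id)

    def file_changed(self, user_id, file_id, source_device=None):
        """Mark a file as pending on every device except the one that changed it."""
        for device_id in self._devices[user_id]:
            if device_id != source_device:
                self._pending[device_id].add(file_id)

    def pending(self, device_id):
        """Roughly what GET /sync would return."""
        return sorted(self._pending[device_id])

    def ack(self, device_id, file_ids):
        """Roughly what POST /sync/ack would do."""
        self._pending[device_id] -= set(file_ids)


state = SyncState()
state.register_device("u1", "laptop")
state.register_device("u1", "phone")
state.file_changed("u1", "f1", source_device="laptop")
print(state.pending("phone"))   # the phone still needs f1
state.ack("phone", ["f1"])
print(state.pending("phone"))   # nothing left after the ack
```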

Jobs:

GET /jobs/{job_id} – Get job (e.g. virus scan) status

Database design

Database tables:

Table users {

user_id <- key

}

Table files {

file_id <- key

user_id <- reference to user

name

type

size

version

created_at

updated_at

}

Table file_version {

version_id <- key

version

file_id <- reference to files

blob_path

status

checksum

uploaded_at

user_id <- reference to users

}

Table chunks {

chunk_id <- key

version_id <- reference to file_version

number

blob_path

created_at

size

}
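
The tables above can be written as runnable DDL. The sketch below uses Python's built-in sqlite3 module only so that it runs without a server; the column types, NOT NULL constraints and the uniqueness rule on (version_id, number) are assumptions, since the outline lists column names only.

```python
import sqlite3

SCHEMA = """
CREATE TABLE users (
    user_id INTEGER PRIMARY KEY
);
CREATE TABLE files (
    file_id    INTEGER PRIMARY KEY,
    user_id    INTEGER NOT NULL REFERENCES users(user_id),
    name       TEXT NOT NULL,
    type       TEXT,
    size       INTEGER,
    version    INTEGER,
    created_at TEXT,
    updated_at TEXT
);
CREATE TABLE file_version (
    version_id  INTEGER PRIMARY KEY,
    version     INTEGER NOT NULL,
    file_id     INTEGER NOT NULL REFERENCES files(file_id),
    blob_path   TEXT,
    status      TEXT,
    checksum    TEXT,
    uploaded_at TEXT,
    user_id     INTEGER NOT NULL REFERENCES users(user_id)
);
CREATE TABLE chunks (
    chunk_id   INTEGER PRIMARY KEY,
    version_id INTEGER NOT NULL REFERENCES file_version(version_id),
    number     INTEGER NOT NULL,
    blob_path  TEXT,
    created_at TEXT,
    size       INTEGER,
    UNIQUE (version_id, number)  -- assumed: one row per part of a version
);
"""

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(SCHEMA)
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'")]
print(tables)
```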

High-level design

CDN - caches and accelerates delivery of static assets and downloads.
API Gateway - load balancer, router, auth, rate limiter and SSL termination.
File uploader - writes file metadata to the Metadata DB and returns the Chunk uploader URL to the client.
Chunk uploader - writes chunk data to the Blob store.
Blob store - stores user content (AWS S3).
Metadata DB - stores file metadata (Postgres).
Sync service - synchronizes data between the server and client devices.
Async jobs - notify all devices of changes.

Request flows

Entry Point

Client interacts with the system.
CDN (Content Delivery Network) is used for caching and accelerating delivery of static assets or downloads.
API Gateway serves as the main entry point, routing requests to appropriate services.

Upload Flow

Client initiates an upload request via the API Gateway.
API Gateway routes the request to the File Upload Service.
File Upload Service:
- Registers the upload and generates an upload_id.
- Calculates chunking strategy based on file size.
- Returns chunk upload URLs (or endpoint) to the client.
Client uploads file chunks directly to the Chunk Uploader.
Chunk Uploader:
- Stores chunks in the Blob Store.
- Records upload metadata (e.g., file name, size, parts) in the Metadata DB.
Once upload is complete:
- Metadata and event notifications are sent to the Sync Service and/or published to the Message Queue.
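
A minimal sketch of the upload bookkeeping described above: picking a chunk size from the file size, accepting parts, and refusing to complete until every part has arrived. The 5 MiB floor and 10,000-part cap are assumptions chosen to mirror Amazon S3 multipart upload limits; the class, its in-memory part store and the SHA-256 checksum are likewise illustrative.

```python
import hashlib
import uuid

MIN_CHUNK = 5 * 1024 * 1024   # assumed floor: 5 MiB
MAX_PARTS = 10_000            # assumed cap on parts per upload


def choose_chunk_size(file_size: int) -> int:
    """Smallest chunk size >= MIN_CHUNK that keeps the part count <= MAX_PARTS."""
    needed = -(-file_size // MAX_PARTS)   # ceiling division
    return max(MIN_CHUNK, needed)


class UploadSession:
    """In-memory stand-in for one registered upload."""

    def __init__(self, file_size: int):
        self.upload_id = uuid.uuid4().hex
        self.chunk_size = choose_chunk_size(file_size)
        self.expected_parts = max(1, -(-file_size // self.chunk_size))
        self.parts = {}   # part_number -> bytes (stands in for the blob store)

    def put_chunk(self, part_number: int, data: bytes) -> None:
        if not 1 <= part_number <= self.expected_parts:
            raise ValueError(f"unexpected part {part_number}")
        self.parts[part_number] = data   # a retry simply overwrites the part

    def complete(self) -> str:
        """Return the SHA-256 of the whole file, or raise if parts are missing."""
        numbers = range(1, self.expected_parts + 1)
        missing = [n for n in numbers if n not in self.parts]
        if missing:
            raise ValueError(f"missing parts: {missing}")
        digest = hashlib.sha256()
        for n in numbers:
            digest.update(self.parts[n])
        return digest.hexdigest()
```

Keying parts by number makes chunk retries idempotent, which matters when clients resume uploads over unreliable connections.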

Download Flow

Client sends a download request via the API Gateway.
API Gateway routes the request to the File Download Service.
File Download Service:
- Fetches file metadata from the Metadata DB.
- Coordinates with Chunk Downloader to retrieve individual file chunks from the Blob Store.
Chunks are reassembled and streamed back to the client.
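
The reassembly step could look like the generator below: it orders chunk rows by number, streams each body, and verifies a checksum at the end. The dict-based blob store, the row shape and the use of SHA-256 are assumptions for illustration.

```python
import hashlib
from typing import Iterable, Iterator


def stream_file(chunk_rows: Iterable[dict], blob_store: dict,
                expected_checksum: str) -> Iterator[bytes]:
    """Yield chunk bodies in order; raise if the final SHA-256 does not match."""
    digest = hashlib.sha256()
    for row in sorted(chunk_rows, key=lambda r: r["number"]):
        data = blob_store[row["blob_path"]]   # stand-in for a blob store GET
        digest.update(data)
        yield data
    if digest.hexdigest() != expected_checksum:
        raise ValueError("checksum mismatch")


# Rows arrive out of order on purpose; the generator sorts them by number.
blob_store = {"b/2": b"world", "b/1": b"hello "}
rows = [{"number": 2, "blob_path": "b/2"}, {"number": 1, "blob_path": "b/1"}]
checksum = hashlib.sha256(b"hello world").hexdigest()
body = b"".join(stream_file(rows, blob_store, checksum))
print(body)
```

Because the data is streamed, a mismatch is only detected after the bytes have been sent, so the client would need to discard the download when the stream ends with an error.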

Syncing & Background Processing

Sync Service maintains consistency across devices, triggering updates or syncing actions as needed when files change.
Async Jobs perform background processing such as:
- Virus scanning
- Preview/thumbnail generation
- File indexing or OCR
These jobs are triggered via events published to the Message Queue, ensuring non-blocking and scalable task execution.
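
The event-driven job pattern can be sketched with Python's standard queue and a worker thread standing in for the Message Queue and a job runner. The handler table, job IDs and status strings are hypothetical; the status dict represents what GET /jobs/{job_id} might report.

```python
import queue
import threading

events = queue.Queue()   # stand-in for the Message Queue
job_status = {}          # job_id -> "queued" | "done" | "failed"

# Placeholder handlers; real ones would call a scanner, image library, etc.
HANDLERS = {
    "virus_scan": lambda file_id: "clean",
    "thumbnail": lambda file_id: f"thumb/{file_id}.png",
}


def publish(job_id, kind, file_id):
    """Enqueue a job without blocking the caller."""
    job_status[job_id] = "queued"
    events.put((job_id, kind, file_id))


def worker():
    while True:
        item = events.get()
        if item is None:          # sentinel: shut the worker down
            events.task_done()
            break
        job_id, kind, file_id = item
        try:
            HANDLERS[kind](file_id)
            job_status[job_id] = "done"
        except Exception:
            job_status[job_id] = "failed"
        finally:
            events.task_done()


thread = threading.Thread(target=worker, daemon=True)
thread.start()
publish("j1", "virus_scan", "f1")
publish("j2", "unknown_kind", "f1")   # no handler registered, so it fails
events.join()                         # wait until both jobs are processed
events.put(None)
thread.join()
print(job_status)
```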
