Design Dropbox - System Design

System requirements

Functional:

User can upload a file to the service
User can create folders, navigate folders
User can view uploaded files
User should be able to download a file
User can share files with others through links
File versioning should be supported so users can track changes and revert to a previous version if needed
Allow real-time collaboration for multiple users on a document

Non-Functional:

Support 100M DAU
Support both web and mobile clients
Latency should be as low as possible for uploading, viewing and collaboration
Favor availability over consistency; we'll ensure eventual consistency for the file during collaboration

Capacity estimation

Upload:

We assume that on average a user uploads 2 files per day. That gives 200M files per day in total. The file metadata needs to be stored in the DB, so QPS = 200M / 24 / 3600 ≈ 2315
We assume that 5% of users create a new folder each day, so the QPS for folder creation is 100M * 5% / 24 / 3600 ≈ 58
Assuming the average file size is 50MB, the daily storage consumption is 50MB * 200M = 10000 TB (10 PB)

View:

We assume each user views 5 files per day, so the QPS for reads is 5 * 100M / 24 / 3600 ≈ 5800

Collaboration:

We assume 10% of users collaborate on a file each day and each of them makes 20 edit updates
Then the QPS for edit updates is 100M * 10% * 20 / 24 / 3600 ≈ 2315

We also assume peak QPS is twice these numbers. Given the QPS and storage size needed, we'd use a distributed system architecture to design the system.
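
The arithmetic above can be re-derived with a few lines of Python. This is only a sketch under the stated assumptions (100M DAU, 2 uploads, 5 views, 5% folder creators, 10% collaborators with 20 edits each, 50MB files, peak = 2x average); the variable names are made up for illustration.

```python
SECONDS_PER_DAY = 24 * 3600
DAU = 100_000_000


def qps(events_per_day: float) -> float:
    """Average requests per second for a given number of events per day."""
    return events_per_day / SECONDS_PER_DAY


upload_qps = qps(DAU * 2)          # 2 uploads per user per day
folder_qps = qps(DAU * 0.05)       # 5% of users create a folder per day
view_qps = qps(DAU * 5)            # 5 file views per user per day
edit_qps = qps(DAU * 0.10 * 20)    # 10% of users, 20 edits each

# 50 MB per file, decimal units (1 TB = 1,000,000 MB)
daily_storage_tb = 50 * (DAU * 2) / 1_000_000

for name, value in [
    ("upload", upload_qps),
    ("create folder", folder_qps),
    ("view", view_qps),
    ("collab edit", edit_qps),
]:
    # Peak is assumed to be twice the average, as stated in the text.
    print(f"{name}: avg {value:,.0f} QPS, peak {2 * value:,.0f} QPS")
print(f"daily storage: {daily_storage_tb:,.0f} TB")
```

Keeping the assumptions as named constants makes it easy to rerun the estimate when one of them changes.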

API design

Upload

Upload file
post v1/upload
body { user_id, file_blob, path}
authorization: auth_token
Create folder
post v1/folder/create
body {user_id, folder_name, path}
authorization: auth_token

View files - navigation

Get file list for a given path
get v1/path/:user_id
authorization: auth_token

Share a file
get v1/share/:file_id?user_id=xxx
authorization: auth_token
Response: a share url

File versioning

Create new version of the file
post v1/version/new
body {file_id, user_id}
authorization: auth_token
View version list
get v1/version/:file_id
authorization: auth_token
Revert to a version
post v1/version/revert
body {file_id, user_id, version_id}
authorization: auth_token

Collaboration

Edit file
post v1/collab/edit/
body {file_id, user_id, file_position, content}
View others' positions
post v1/collab/users/:file_id
body {position}

Database design

user table

user_id
user_name

file table

file_id
file_name
description
folder_id
upload_status
file_blob_link
created_by
create_time
version_id
collab_users
share_link

file blob storage - Store the actual file blobs, accessed with file_blob_link

key: file_id

folder table

folder_id
folder_name
create_time
created_by
path_text

version table

version_id
file_id
created_by
create_time
diff_list
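
The tables above could be sketched as Python dataclasses. The source lists field names only, so every type, default and the shape of a diff_list entry below is an assumption made for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional, Tuple


@dataclass
class FileRecord:
    file_id: str
    file_name: str
    folder_id: str
    created_by: str                      # user_id of the uploader
    create_time: datetime
    file_blob_link: str                  # pointer into file blob storage
    upload_status: str = "uploading"     # becomes "uploaded" when done
    description: str = ""
    version_id: Optional[str] = None     # current version, if any
    collab_users: List[str] = field(default_factory=list)
    share_link: Optional[str] = None     # filled in lazily on first share


@dataclass
class FolderRecord:
    folder_id: str
    folder_name: str
    created_by: str
    create_time: datetime
    path_text: str


@dataclass
class VersionRecord:
    version_id: str
    file_id: str
    created_by: str
    create_time: datetime
    # Assumed shape: (start, end, replacement) edits; the source only
    # names the field.
    diff_list: List[Tuple[int, int, str]] = field(default_factory=list)
```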

High-level design

Client

End user client to navigate and upload file

Load balancer

Determines which server handles the request

API Gateway / webapp server

Webapp server builds the web page and returns it to the user
API gateway routes each API request to the corresponding service
API gateway also manages rate limiting
API gateway also handles authentication checks
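
The text does not say how the gateway limits rates. One common option is a token bucket per user or per token; the sketch below assumes that technique, and the class name, parameters and injectable clock are all hypothetical.

```python
import time


class TokenBucket:
    """Token-bucket limiter: refills at `rate_per_sec`, bursts up to `capacity`."""

    def __init__(self, rate_per_sec: float, capacity: int, clock=time.monotonic):
        self.rate = rate_per_sec
        self.capacity = capacity
        self.clock = clock              # injectable so it can be tested
        self.tokens = float(capacity)   # start with a full bucket
        self.last = clock()

    def allow(self) -> bool:
        """Return True if the request may pass, consuming one token."""
        now = self.clock()
        elapsed = now - self.last
        self.last = now
        # Refill for the elapsed time, never above capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

A gateway would typically keep one bucket per auth_token or user_id and reject the request (for example with HTTP 429) when allow() returns False.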

Upload service

Handle file upload request

UserDB

Store user table

File DB

Store file metadata table

Folder DB

Store folder table

File blob

Store the actual file blob

Upload worker

Handle upload messages posted by the upload service
Notify the upload service when an upload has finished

Upload message queue

Receive and publish file upload message from upload service
Receive and publish file upload succeeded message

Navigation service

Handle navigation request, return file list under a folder

Navigation cache

Cache file list under folders to reduce navigation latency

Share service

Handle share request, return share link to user as response

Version service

Handle version api request, add a new version into version DB for the file_id

Version DB

Store version table

Collab service

Handle collab request, also generate new version for the file during collab

Collab file cache

Cached file to reduce collab edit latency

Request flows

File upload

User initiates a file upload request from the client
The API request is routed through the load balancer and API gateway and lands on an upload service machine
The service writes the file metadata into the file table, marks the file status as 'uploading', then initiates the blob upload to blob storage
The service then publishes a file upload message to the message queue and returns a response to the client to acknowledge the upload, so the user doesn't need to wait on the client until the full upload finishes
The upload worker subscribes to file upload messages and continues handling the upload to blob storage
When the upload finishes, the worker updates the status in the file table to 'uploaded'
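
The upload flow above can be sketched with in-memory stand-ins. The dicts and queue.Queue below are assumptions replacing the File DB, blob storage and message queue. Passing the bytes through the queue message is a simplification for the sketch; a real system would more likely hand the worker a reference to staged data.

```python
import queue
import uuid

file_table = {}                 # file_id -> metadata row (stand-in for File DB)
blob_store = {}                 # file_blob_link -> bytes (stand-in for blob storage)
upload_queue = queue.Queue()    # stand-in for the upload message queue


def handle_upload(user_id: str, path: str, data: bytes) -> str:
    """Upload service: record metadata, enqueue the work, return immediately."""
    file_id = str(uuid.uuid4())
    file_table[file_id] = {
        "created_by": user_id,
        "path": path,
        "upload_status": "uploading",
        "file_blob_link": f"blob/{file_id}",
    }
    upload_queue.put({"file_id": file_id, "data": data})
    return file_id              # the client does not wait for the blob write


def run_upload_worker_once() -> None:
    """Upload worker: take one message, store the blob, mark it uploaded."""
    message = upload_queue.get()
    row = file_table[message["file_id"]]
    blob_store[row["file_blob_link"]] = message["data"]
    row["upload_status"] = "uploaded"
    upload_queue.task_done()
```

Calling handle_upload("u1", "/docs/a.txt", b"hello") leaves the row in 'uploading' until run_upload_worker_once() processes the message.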

Create folder

Client initiates a create folder request; the v1/folder/create request is sent to the upload service
The upload service then writes a new folder entry into the folder table
The service returns a success response to the client

Navigation

When the user clicks a folder in the UI, a navigation API request is sent to the navigation service
The service first checks whether the target folder_id is in the navigation cache; if it is, the service sends the cached file list back as the response
On a cache miss, the service queries the folder table and file table to get the file list for the folder and returns it to the client; it also puts the list into the cache. For cache eviction, the least recently used entry is evicted
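
The cache lookup with least-recently-used eviction described above could look like the following minimal sketch. The class name, the capacity parameter and the query_db callback are hypothetical; a production cache would more likely be an external store.

```python
from collections import OrderedDict


class NavigationCache:
    """LRU cache mapping folder_id -> file list."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self._entries = OrderedDict()   # oldest entry first

    def get(self, folder_id):
        if folder_id not in self._entries:
            return None                 # cache miss
        self._entries.move_to_end(folder_id)   # mark as recently used
        return self._entries[folder_id]

    def put(self, folder_id, listing):
        self._entries[folder_id] = listing
        self._entries.move_to_end(folder_id)
        if len(self._entries) > self.capacity:
            self._entries.popitem(last=False)  # evict least recently used


def list_folder(folder_id, cache, query_db):
    """Navigation service: serve from the cache, fall back to the DB on a miss."""
    listing = cache.get(folder_id)
    if listing is None:
        listing = query_db(folder_id)   # reads folder table and file table
        cache.put(folder_id, listing)
    return listing
```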

Share

When the user clicks the share button in the UI, a share API request is sent to the share service
The service checks the file table to see whether the file already has a share link; if so, it returns the link as the response
If there's no share link yet, the service generates a share link from a UUID and stores it in the share_link field; the UUID makes sure the generated share link is unique for each file
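
A minimal sketch of the share-link step, assuming the file row is available as a dict. The base URL and function name are hypothetical placeholders.

```python
import uuid

SHARE_BASE_URL = "https://example.com/s/"   # hypothetical base URL


def get_or_create_share_link(file_row: dict) -> str:
    """Return the file's existing share link, or create and store a new one."""
    if not file_row.get("share_link"):
        # uuid4 is random, so a collision between two files is extremely unlikely.
        file_row["share_link"] = SHARE_BASE_URL + uuid.uuid4().hex
    return file_row["share_link"]
```

Calling it twice on the same row returns the same link, which matches the "check first, generate only if missing" flow.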

Versioning

The user can create a new version of the file from the UI; this triggers a v1/version/new request to the version service
The version service queries the file table, version table and file blob, computes the diff between the current version and the previous version, then writes a new entry into the version table for the file to track the changes as a structured diff list
When reverting, the user chooses a version from the client, which triggers a v1/version/revert request to the version service. The service reads the chosen version and its diff list from the version table and applies it to the file blob; it also updates the version_id in the file table, then returns the response to the client
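
To make the "structured diff list" concrete, here is a sketch for text content using Python's standard difflib. The (start, end, replacement) tuple shape is an assumption, as is the choice to store a reverse diff (current back to previous) so that a revert is a single apply. Binary files would need a different approach.

```python
import difflib


def make_diff(old: str, new: str):
    """Return edits (start, end, replacement) that turn `old` into `new`."""
    matcher = difflib.SequenceMatcher(None, old, new, autojunk=False)
    return [
        (i1, i2, new[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"               # unchanged ranges are not stored
    ]


def apply_diff(text: str, diff) -> str:
    """Apply edits produced by make_diff to `text`."""
    out, pos = [], 0
    for start, end, replacement in diff:
        out.append(text[pos:start])     # copy the unchanged part
        out.append(replacement)         # substitute the changed range
        pos = end
    out.append(text[pos:])
    return "".join(out)


previous = "hello world"
current = "hello brave new world"
# Stored with the new version entry: how to get back to the previous content.
revert_diff = make_diff(current, previous)
restored = apply_diff(current, revert_diff)
```

A diff made this way only applies cleanly to the exact content it was computed from, so reverting across several versions means applying the stored diffs in order.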

Collab

During collab, the client periodically (e.g. every 10s) sends a v1/collab/users API request to get the other users' positions in the file and show them to the current user. The request also sends the current user's position to the service.
During collab, when a user updates the file, the position and content update is sent to the collab service with a v1/collab/edit request. The service first checks whether the file is in the collab file cache; if not, it brings the file into the cache. The edit is applied to the file, and the service also generates a new version for the file and stores it in the version table. It also updates the version_id in the file table.
The cache is configured to sync updates back to the file blob

Detailed component design

Collab

During collab, the client periodically (e.g. every 10s) sends a v1/collab/users API request to get the other users' positions in the file and show them to the current user. This is handled by the collab service. Since the position data is small and only needed during a collab session, it can be stored in the collab service's memory. This makes the service stateful for the collab session, but since the number of users in a single file's collab session is not large (normally fewer than 100), it won't become a bottleneck. If the collab machine goes down, users may temporarily lose the other users' positions; once a new machine is up to serve the collab session, the positions are recovered as clients keep reporting them.
If the collab cache goes down, the in-flight edits for the session may be lost. To prevent data loss, we use master-slave replication for the cache: if the master goes down, a slave replica is promoted to master and keeps serving the session
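
The in-memory position tracking described above might be sketched as follows. The class name and the TTL used to drop users who stop polling are assumptions not stated in the text.

```python
import time


class PresenceStore:
    """Per-file cursor positions kept in the collab service's memory."""

    def __init__(self, ttl_seconds: float = 30.0, clock=time.monotonic):
        self.ttl = ttl_seconds          # assumed expiry for silent users
        self.clock = clock              # injectable so it can be tested
        self._sessions = {}             # file_id -> {user_id: (position, last_seen)}

    def report_and_fetch(self, file_id: str, user_id: str, position: int) -> dict:
        """Record the caller's position and return everyone else's."""
        now = self.clock()
        session = self._sessions.setdefault(file_id, {})
        session[user_id] = (position, now)
        # Forget users who have not polled within the TTL.
        stale = [u for u, (_, seen) in session.items() if now - seen > self.ttl]
        for u in stale:
            del session[u]
        return {u: pos for u, (pos, _) in session.items() if u != user_id}
```

Because every poll also reports the caller's position, a replacement machine that starts with an empty store is repopulated within one polling interval.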

DB Sharding

Since the system handles high QPS and a large amount of data, we can shard the databases to spread the load and increase availability
For the user table, folder table, file table and version table, we can shard based on user_id (the created_by field in the folder, file and version tables).
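
A minimal sketch of routing by user_id, assuming simple hash-modulo placement. The shard count and function name are hypothetical, and the text does not specify the placement scheme.

```python
import hashlib

NUM_SHARDS = 16   # hypothetical shard count


def shard_for_user(user_id: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a user_id to a shard index with a stable hash."""
    # Python's built-in hash() is randomized per process, so use a digest
    # that gives the same answer on every machine.
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards
```

Since all rows for a user land on the same shard, listing a user's folders and files touches a single shard. The trade-off of plain modulo placement is that changing the shard count remaps most users; consistent hashing is a common alternative.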

Trade offs/Tech choices

File merge during collab

When multiple users update the same position in the file, it causes a merge conflict. There are different types of merge conflict resolution. We choose the last write wins strategy to handle conflicts, as it's simple, straightforward and works well for most scenarios. With last write wins, the collab service compares the edit timestamps of the conflicting edits and applies the edit with the latest timestamp to the file.
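
The last write wins rule can be sketched as below. The Edit type mirrors the fields of the collab edit request plus a timestamp; breaking timestamp ties by user_id is an assumption added so that the outcome is deterministic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edit:
    user_id: str
    position: int
    content: str
    timestamp: float


def resolve_conflicts(edits):
    """Keep, for each position, only the edit with the latest timestamp."""
    winners = {}
    for edit in edits:
        current = winners.get(edit.position)
        # Ties on timestamp are broken by user_id (assumed rule).
        if current is None or (edit.timestamp, edit.user_id) > (
            current.timestamp,
            current.user_id,
        ):
            winners[edit.position] = edit
    return [winners[position] for position in sorted(winners)]
```

The losing edit is silently dropped, which is the data loss this strategy accepts in exchange for simplicity. It also relies on reasonably synchronized clocks.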

Failure scenarios/bottlenecks

If the collab cache goes down, the in-flight edits for the session may be lost. To prevent data loss, we use master-slave replication for the cache: if the master goes down, a slave replica is promoted to master and keeps serving the session
If one of the navigation cache machines goes down, the navigation service will see increased latency as the cache miss rate goes up

Future improvements

File merge during collab

We can improve the merge by using Google Docs' merge strategy (operational transformation), which is publicly documented; it prevents data loss better than last write wins
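
Google Docs is generally described as using operational transformation (OT). Assuming that is the strategy meant here, the toy sketch below shows the core idea for two concurrent inserts only. It is not Google's actual algorithm, and the names and the site-id tie-break are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Insert:
    pos: int      # index in the document where text is inserted
    text: str
    site: str     # id of the originating client, used to break ties


def transform(op: Insert, against: Insert) -> Insert:
    """Shift `op` so it can be applied after `against` was already applied."""
    lands_before = against.pos < op.pos or (
        against.pos == op.pos and against.site < op.site
    )
    if lands_before:
        return Insert(op.pos + len(against.text), op.text, op.site)
    return op


def apply_op(doc: str, op: Insert) -> str:
    return doc[: op.pos] + op.text + doc[op.pos :]


doc = "abc"
a = Insert(1, "X", "a")     # client a inserts at index 1
b = Insert(1, "Y", "b")     # client b concurrently inserts at index 1

# Each side applies its own edit first, then the transformed remote edit.
at_a = apply_op(apply_op(doc, a), transform(b, a))
at_b = apply_op(apply_op(doc, b), transform(a, b))
```

Both replicas end with the same text and neither insert is discarded, which is the improvement over last write wins. A full implementation also has to handle deletes and longer operation histories.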