Design Dropbox - System Design

System requirements

Functional:

List functional requirements for the system (Ask interviewer if stuck)...

Users can upload a file
Users can download their file
Users can delete their file
Users can modify their file (versioning, potentially revert).
Users can search their files.
Users get notified for file updates to their own data.

Non-Functional:

List non-functional requirements for the system...

Highly durable data
Eventual consistency since immediate read after write is not required
High availability, service should always be running.

Capacity estimation

Estimate the scale of the system you are going to design...

100 uploads per user per month

200 downloads per user per month

5 GB max file size

Average file size is 10 MB

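UploadsPerSecond: 100 uploads per user per month * 100,000 users = 10,000,000 uploads/month; 10,000,000 / 30 days / 100,000 seconds per day (86,400 rounded up) = 3.33 writes per second (WPS)

DownloadsPerSecond: 200 downloads per user per month * 100,000 users = 20,000,000 downloads/month; 20,000,000 / 30 days / 100,000 seconds per day = 6.67 reads per second (RPS)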
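
The arithmetic above can be checked with a short script. This is only a sketch: the constant names are made up for illustration, and the storage and bandwidth figures are derived from the stated averages rather than taken from the original estimate. Using the exact 86,400 seconds per day instead of the rounded 100,000 raises the rates slightly.

```python
USERS = 100_000
UPLOADS_PER_USER_PER_MONTH = 100
DOWNLOADS_PER_USER_PER_MONTH = 200
AVG_FILE_SIZE_MB = 10
DAYS_PER_MONTH = 30
SECONDS_PER_DAY = 86_400  # the estimate above rounds this up to 100,000


def per_second(per_user_per_month: int, seconds_per_day: int = SECONDS_PER_DAY) -> float:
    """Convert a per-user monthly rate into a system-wide per-second rate."""
    return USERS * per_user_per_month / DAYS_PER_MONTH / seconds_per_day


uploads_per_second = per_second(UPLOADS_PER_USER_PER_MONTH)
downloads_per_second = per_second(DOWNLOADS_PER_USER_PER_MONTH)

# Derived figures (decimal units: 1 TB = 1,000,000 MB).
new_storage_tb_per_month = USERS * UPLOADS_PER_USER_PER_MONTH * AVG_FILE_SIZE_MB / 1_000_000
upload_bandwidth_mb_per_second = uploads_per_second * AVG_FILE_SIZE_MB

print(f"uploads/s:   {uploads_per_second:.2f}")
print(f"downloads/s: {downloads_per_second:.2f}")
print(f"new storage: {new_storage_tb_per_month:.0f} TB/month")
print(f"upload bandwidth: {upload_bandwidth_mb_per_second:.1f} MB/s")
```

Even at these low request rates, the average file size means roughly 100 TB of new data per month, so storage growth rather than request volume is the number to watch.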

API design

Define what APIs are expected from the system...

Users can upload a file

ID WriteFile(file, name)

Users can download their file

File ReadFile(ID)

Users can delete their file

Status DeleteFile(ID)

Users can modify their file (versioning, potentially revert).

Status ModifyFile(ID, new_file)

Users can search their files.

[File] SearchForFiles(string)

Users get notified for file updates to their own data.

Can consider polling, long polling, server-sent events (SSE), etc.
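
As a rough illustration, the calls listed above could be written down as a typed interface. The Python method names, the use of bytes for file contents, and the in-memory implementation are all assumptions made for this sketch; a real service would sit behind RPC and return richer status types.

```python
import uuid
from typing import Dict, List, Protocol


class FileService(Protocol):
    """Hypothetical interface mirroring the API list above."""

    def write_file(self, data: bytes, name: str) -> str: ...
    def read_file(self, file_id: str) -> bytes: ...
    def delete_file(self, file_id: str) -> bool: ...
    def modify_file(self, file_id: str, new_data: bytes) -> bool: ...
    def search_for_files(self, query: str) -> List[str]: ...


class InMemoryFileService:
    """Toy implementation, only here to make the interface concrete."""

    def __init__(self) -> None:
        self._names: Dict[str, str] = {}
        self._data: Dict[str, bytes] = {}

    def write_file(self, data: bytes, name: str) -> str:
        file_id = uuid.uuid4().hex
        self._names[file_id] = name
        self._data[file_id] = data
        return file_id

    def read_file(self, file_id: str) -> bytes:
        return self._data[file_id]

    def delete_file(self, file_id: str) -> bool:
        existed = file_id in self._data
        self._data.pop(file_id, None)
        self._names.pop(file_id, None)
        return existed

    def modify_file(self, file_id: str, new_data: bytes) -> bool:
        if file_id not in self._data:
            return False
        self._data[file_id] = new_data
        return True

    def search_for_files(self, query: str) -> List[str]:
        # Case-insensitive substring match on the file name.
        needle = query.lower()
        return [fid for fid, name in self._names.items() if needle in name.lower()]
```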

Database design

Defining the system data model early on will clarify how data will flow among different components of the system. Also you could draw an ER diagram using the diagramming tool to enhance your design...

We agreed that eventual consistency is OK and that we don't need ACID guarantees. With continuous data growth, NoSQL is worth considering. At the current WPS, SQL would probably also be an OK option for some time.

A document or column oriented NoSQL DB would probably lend itself well here. Since users often open a page that contains their user information and uploaded files, a document-based DB probably makes more sense than a column-oriented one (which is better optimized for cases where we only query certain columns).

User
ID string
files [File]
File
ID string
name string
current_version int
cloud_storage_path string
past_versions [File]

Storing file metadata inside the user record implies that each file belongs to exactly one user. This may not always be true down the road: what if a file could be owned by multiple users? In that case we should have a separate table for files and keep only a list of file ID pointers in the users table. This is probably the better DB schema. It would look something like:

User
ID string - primary index
file_ids [string]
File
ID string - primary index
name string
current_version int
cloud_storage_path string
past_versions [File]
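
The second schema can be sketched with plain dataclasses to show how a shared file is referenced from several users. The FileVersion type is an assumption introduced here: the schema above lists past_versions as [File], and a smaller record holding only the version number and storage path is one possible way to avoid nesting whole File records.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class FileVersion:
    # Hypothetical helper type: one entry per superseded version.
    version: int
    cloud_storage_path: str


@dataclass
class File:
    id: str  # primary index
    name: str
    current_version: int
    cloud_storage_path: str
    past_versions: List[FileVersion] = field(default_factory=list)


@dataclass
class User:
    id: str  # primary index
    file_ids: List[str] = field(default_factory=list)


# Two dictionaries stand in for the files and users tables.
files: Dict[str, File] = {}
users: Dict[str, User] = {}


def files_for_user(user_id: str) -> List[File]:
    """Resolve a user's file ID pointers into file records."""
    return [files[fid] for fid in users[user_id].file_ids if fid in files]


# One file shared by two users: only the ID is duplicated, not the metadata.
files["f1"] = File("f1", "budget.xlsx", 1, "objects/f1/v1")
users["alice"] = User("alice", ["f1"])
users["bob"] = User("bob", ["f1"])
```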

High-level design

You should identify enough components that are needed to solve the actual problem from end to end. Also remember to draw a block diagram using the diagramming tool to augment your design...

Load Balancer: A LB is probably not necessary for this design at all considering the WPS/RPS but considering there's consistent user growth, eventually it may be needed.

FrontendService: Handles all RPC requests.

MessageQueue: Allows synchronization of clients to happen asynchronously

ObjectStorage: Allows for durable object storage

Database: NoSQL DB that stores metadata for files stored for a particular user. Contains pointers to object storage for files.

Request flows

Explain how the request flows from end to end in your high level design. Also you could draw a sequence diagram using the diagramming tool to enhance your explanation...

WriteFile:

1. Request gets to the frontend service.
2. Writes the file to object storage.
3. On success, writes metadata to the DB.
4. On success, asynchronously sends a request to the MQ for sync processing. This will contain the file ID that was updated so the client can fetch the latest data.
5. Returns success to the client.

Note: it's possible for a client to get out of sync with the service if we fail at step 4. Metadata will have been successfully written, but not sent to the sync service. We should have a polling system in the client that queries for metadata state to catch up in these failure cases.

If step 3 fails, we could have data stored in object storage with no metadata attached. We should have a job that searches for these cases and garbage collects them.
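
The write path and the orphan cleanup job can be sketched as follows. Everything here is a stand-in: dictionaries play the role of object storage and the metadata DB, notify is a hypothetical callback for the message queue, and the path layout is invented. The notification is also sent inline rather than asynchronously to keep the sketch short.

```python
import uuid
from typing import Callable, Dict, List


def write_file(
    data: bytes,
    name: str,
    objects: Dict[str, bytes],
    metadata: Dict[str, dict],
    notify: Callable[[str], None],
) -> str:
    file_id = uuid.uuid4().hex
    path = f"objects/{file_id}/v1"
    objects[path] = data  # write bytes to object storage first
    metadata[file_id] = {  # then record metadata pointing at the object
        "name": name,
        "current_version": 1,
        "cloud_storage_path": path,
        "past_paths": [],
    }
    try:
        notify(file_id)  # best effort: tell the sync pipeline what changed
    except Exception:
        pass  # a lost notification is tolerated; clients catch up by polling
    return file_id


def collect_orphans(objects: Dict[str, bytes], metadata: Dict[str, dict]) -> List[str]:
    """Delete objects that no metadata record points at and return their paths."""
    referenced = set()
    for record in metadata.values():
        referenced.add(record["cloud_storage_path"])
        referenced.update(record["past_paths"])
    orphans = [path for path in objects if path not in referenced]
    for path in orphans:
        del objects[path]
    return orphans
```

A real cleanup job would also need a grace period (for example, only deleting objects older than some threshold), otherwise it could remove an upload whose metadata write is still in flight.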

ReadFile:

Request gets to the frontend service
Queries DB for metadata for the ID. Returns the object storage link
Frontend service downloads the bytes from object storage and streams the bytes back to the client as read. Could be done through a message buffer in code.
Returns success when all data sent back.
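
Streaming through a bounded buffer can be sketched with a generator, so the frontend never holds a whole 5 GB file in memory. The 64 KiB chunk size and the function name are arbitrary choices for illustration, and an in-memory io.BytesIO stands in for the object storage response.

```python
import io
from typing import BinaryIO, Iterator

CHUNK_SIZE = 64 * 1024  # assumed buffer size; tune for the real network path


def stream_object(source: BinaryIO, chunk_size: int = CHUNK_SIZE) -> Iterator[bytes]:
    """Yield the object in fixed-size chunks until the source is exhausted."""
    while True:
        chunk = source.read(chunk_size)
        if not chunk:
            return
        yield chunk


payload = b"x" * 150_000
received = b"".join(stream_object(io.BytesIO(payload)))
```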

ModifyFile:

Request gets to the frontend service
Writes new file to object storage
Updates metadata to point to the new file: updates the version of the file metadata, and changes the object storage pointer.
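
A minimal sketch of the versioned update, including the revert mentioned in the requirements. The FileMeta record, the path layout, and the choice to implement revert as a new version that points back at an old object are assumptions for illustration, not something the flow above prescribes.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class FileMeta:
    id: str
    name: str
    current_version: int
    cloud_storage_path: str
    # (version, storage path) pairs for superseded versions.
    past_versions: List[Tuple[int, str]] = field(default_factory=list)


def modify_file(meta: FileMeta, new_data: bytes, objects: Dict[str, bytes]) -> None:
    new_version = meta.current_version + 1
    new_path = f"objects/{meta.id}/v{new_version}"
    objects[new_path] = new_data  # write the new object before touching metadata
    meta.past_versions.append((meta.current_version, meta.cloud_storage_path))
    meta.current_version = new_version
    meta.cloud_storage_path = new_path


def revert_file(meta: FileMeta, version: int) -> bool:
    """Make an older version current again without deleting any history."""
    for past_version, path in meta.past_versions:
        if past_version == version:
            meta.past_versions.append((meta.current_version, meta.cloud_storage_path))
            meta.current_version += 1
            meta.cloud_storage_path = path
            return True
    return False
```

Old objects are never overwritten, so a failed metadata update leaves the previous version intact and readable.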

SearchFile:

MongoDB offers search indexing on fields. We could have one for the file name filtered by user.
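
To make the idea concrete without depending on a database, here is an in-memory sketch of a per-user index over file names. It is not MongoDB's API; the class and method names are invented, and it only illustrates the lookup that a name index filtered by user would serve.

```python
import re
from collections import defaultdict
from typing import DefaultDict, List, Set


class FileNameIndex:
    """Maps user -> name token -> file IDs."""

    def __init__(self) -> None:
        self._index: DefaultDict[str, DefaultDict[str, Set[str]]] = defaultdict(
            lambda: defaultdict(set)
        )

    @staticmethod
    def _tokens(text: str) -> List[str]:
        return re.findall(r"[a-z0-9]+", text.lower())

    def add(self, user_id: str, file_id: str, name: str) -> None:
        for token in self._tokens(name):
            self._index[user_id][token].add(file_id)

    def search(self, user_id: str, query: str) -> Set[str]:
        """Return the user's files whose names contain every query token."""
        tokens = self._tokens(query)
        if not tokens or user_id not in self._index:
            return set()
        user_index = self._index[user_id]
        result = None
        for token in tokens:
            ids = user_index.get(token, set())
            result = set(ids) if result is None else result & ids
        return result
```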

Detailed component design

Dig deeper into 2-3 components and explain in detail how they work. For example, how well does each component scale? Any relevant algorithm or data structure you like to use for a component? Also you could draw a diagram using the diagramming tool to enhance your design...

Performance Optimizations

If certain objects are read more often than others, we can introduce a cache in front of object storage, or even consider a CDN depending on needs. For the cache or CDN approach, we should consider an LFU eviction policy (so only the most frequently retrieved objects stay cached). Since we want to maintain high throughput, we might want to use a write-back approach for the access counts, which trades some reliability for speed: if an object is read many times, it's OK if a few counts are dropped.
Even though the frontend service has a low WPS, Dropbox serves traffic around the world, and we should consider a global deployment to maintain high write and read throughput. If we do this, we should consider whether objects should be stored in the location of the original write or replicated around the world (more expensive). We probably don't need to globally replicate objects that are only read from one location, so maybe we can let users choose where an object is stored and pay for faster read throughput themselves. Or, if we didn't want to expose this to the user (which I think Dropbox doesn't), we could figure out an algorithm to determine where traffic comes from most often and then distribute the data to those hotter locations retroactively. For objects with a lot of traffic, the system will eventually replicate the data to hotspots.
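
An LFU eviction policy can be sketched in a few lines. This version is an assumption-laden toy: it scans all counters to find the eviction victim (linear time), breaks ties by insertion order, and keeps counts in process memory. A production cache would use a constant-time LFU structure or an existing cache product.

```python
from typing import Dict, Optional


class LFUCache:
    def __init__(self, capacity: int) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self._values: Dict[str, bytes] = {}
        self._hits: Dict[str, int] = {}

    def get(self, key: str) -> Optional[bytes]:
        if key not in self._values:
            return None
        self._hits[key] += 1
        return self._values[key]

    def put(self, key: str, value: bytes) -> None:
        if key not in self._values and len(self._values) >= self.capacity:
            # Evict the least frequently used entry to make room.
            victim = min(self._hits, key=self._hits.get)
            del self._values[victim]
            del self._hits[victim]
        self._values[key] = value
        self._hits[key] = self._hits.get(key, 0) + 1
```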
You could directly expose a download or upload link from cloud storage to the client with signed URLs and skip a middle layer where frontend service streams bytes back to the client.
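
The signed URL idea can be illustrated with an HMAC over the method, path, and expiry time. This is a generic sketch, not any cloud provider's actual signing scheme: the host storage.example.com, the query parameter names, and the hard-coded key are all placeholders.

```python
import hashlib
import hmac
import time
from typing import Optional
from urllib.parse import parse_qs, urlencode, urlsplit

SECRET_KEY = b"demo-secret"  # placeholder; a real key would come from a secret store


def _signature(method: str, path: str, expires: int) -> str:
    message = f"{method}\n{path}\n{expires}".encode()
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


def sign_url(method: str, path: str, ttl_seconds: int, now: Optional[float] = None) -> str:
    """Build a URL that authorizes one method on one path until it expires."""
    now = time.time() if now is None else now
    expires = int(now) + ttl_seconds
    query = urlencode({"expires": expires, "signature": _signature(method, path, expires)})
    return f"https://storage.example.com{path}?{query}"


def verify_url(method: str, url: str, now: Optional[float] = None) -> bool:
    """Check the expiry and signature the way the storage side would."""
    now = time.time() if now is None else now
    parts = urlsplit(url)
    params = parse_qs(parts.query)
    try:
        expires = int(params["expires"][0])
        signature = params["signature"][0]
    except (KeyError, ValueError):
        return False
    if now > expires:
        return False
    return hmac.compare_digest(signature, _signature(method, parts.path, expires))
```

With this approach the frontend service only checks permissions and hands out a short-lived link; the bytes flow directly between the client and object storage.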

Trade offs/Tech choices

Explain any trade offs you have made and why you made certain tech choices...

See conversation on replication of objects and global vs regionalized deployment of frontend service. Same conversations could be had for the database though we probably want the data to be globally distributed in this case since costs are likely lower.
The client could also just poll (pull-only model) instead of using the sync service's push + pull model. I believe the push + pull model means fewer pull requests in exchange for more push requests, which is more efficient: with a pull-only model many polls would return no updates, especially given the low WPS.
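
The push + pull model can be sketched from the client's side: a push notification triggers a targeted fetch, and an occasional full reconcile catches any notifications that were lost. The SyncClient class is hypothetical, and a shared dictionary of file versions stands in for the metadata API.

```python
from typing import Dict, List


class SyncClient:
    def __init__(self, server_versions: Dict[str, int]) -> None:
        self._server = server_versions  # stand-in for the metadata service
        self.local: Dict[str, int] = {}
        self.fetches = 0

    def on_push(self, file_id: str) -> None:
        """Push path: the notification names the file that changed."""
        self._refresh(file_id)

    def reconcile(self) -> List[str]:
        """Pull path: compare all versions and refresh anything stale."""
        stale = [
            fid for fid, version in self._server.items()
            if self.local.get(fid) != version
        ]
        for fid in stale:
            self._refresh(fid)
        return stale

    def _refresh(self, file_id: str) -> None:
        self.local[file_id] = self._server[file_id]
        self.fetches += 1
```

Counting fetches per client, as the fetches field does here, is one simple way to measure how often the reconcile path is actually needed.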

Failure scenarios/bottlenecks

Try to discuss as many failure scenarios/bottlenecks as possible.

Already addressed above, bottlenecks are likely in how data is replicated around the world and where the service is deployed.

Future improvements

What are some future improvements you would make? How would you mitigate the failure scenario(s) you described above?

We need logging in the frontend service for debugging

We need an analysis of the push + pull model: how often do clients get out of sync and have to re-pull data? This can be done through metrics.

We need an analysis of how effective the caching mechanism introduced is.

We need an analysis of latency at each component to identify the slow ones. For example, if DB queries get slow, do we need to consider indexing on keys?