System requirements


Functional:

  1. Upload File: Users can upload files and receive a URL for the file.
  2. Download File: Users can download files using the provided URL.
  3. List Files with Pagination: Users can list all their uploaded files, supporting pagination for better performance and usability.
  4. Search for a File: Users can search for specific files by name or metadata.
  5. Share File: Users can share a file with other users or with the general public.
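
A minimal sketch of how the paginated listing in requirement 3 could work, assuming cursor-based pagination over file names that are unique per user. The function name `list_files`, the `page_size` parameter, and the choice of the last returned name as the cursor are illustrative assumptions, not part of the stated requirements.

```python
import bisect
from typing import List, Optional, Tuple


def list_files(
    names: List[str], page_size: int, cursor: Optional[str] = None
) -> Tuple[List[str], Optional[str]]:
    """Return one page of file names in sorted order plus the next cursor.

    Assumption: names are unique per user, so the last name on a page
    is enough to identify where the next page starts.
    """
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    ordered = sorted(names)
    # Resume strictly after the last name returned on the previous page.
    start = 0 if cursor is None else bisect.bisect_right(ordered, cursor)
    page = ordered[start:start + page_size]
    has_more = start + page_size < len(ordered)
    next_cursor = page[-1] if page and has_more else None
    return page, next_cursor
```

A cursor keeps paging stable when files are added or removed between requests, which an offset-based scheme does not guarantee. That fits the eventually consistent list view described under the non-functional requirements.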



Non-Functional:

  1. Highly available
  2. Eventually consistent for list and search views. Uploads should be atomic. Restricting permissions should be strongly consistent.
  3. Durable - data should be replicated geographically to avoid loss
  4. Retention - based on policy, keep for X days post deletion
  5. Bandwidth capped at 1 GBps per user
  6. Latency under 1s
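
One common way to enforce a per-user bandwidth cap like the one in item 5 is a token bucket. The sketch below is an assumption about how the cap might be applied, not a stated design decision. The class name `TokenBucket` and its parameters are hypothetical, and time is passed in explicitly so the behaviour is deterministic.

```python
class TokenBucket:
    """Per-user byte budget that refills at a fixed rate (hypothetical sketch)."""

    def __init__(self, rate_bytes_per_s: float, capacity_bytes: float) -> None:
        self.rate = rate_bytes_per_s
        self.capacity = capacity_bytes
        self.tokens = capacity_bytes  # start with a full bucket
        self.last = 0.0               # timestamp of the last refill, in seconds

    def allow(self, nbytes: int, now: float) -> bool:
        """Return True and consume tokens if nbytes fits in the current budget."""
        elapsed = max(0.0, now - self.last)
        # Refill for the elapsed time, never exceeding the bucket capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if nbytes <= self.tokens:
            self.tokens -= nbytes
            return True
        return False
```

With the rate set to the cap and the capacity set to one second of traffic, short bursts are allowed but sustained throughput stays at or below the cap.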


Capacity estimation

1B active users

avg 100MB

1PB storage


300 downloads/s


Aggregate bandwidth for upload/download 300 GBps
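
As a sanity check, the arithmetic behind these figures can be sketched as below. The inputs are the ones listed above. The two interpretations (100 MB as the average stored per user, and 100 MB as the average size of a downloaded file) are assumptions, since the notes do not say which is meant.

```python
# Inputs taken from the estimates above (decimal units).
USERS = 1_000_000_000          # 1B active users
AVG_BYTES = 100 * 10**6        # 100 MB average
DOWNLOADS_PER_S = 300          # 300 downloads/s
LISTED_BANDWIDTH = 300 * 10**9 # 300 GBps aggregate

GB, PB = 10**9, 10**15

# Assumption A: 100 MB is the average amount stored per user.
storage_pb_if_per_user = USERS * AVG_BYTES / PB

# Assumption B: 100 MB is the average size of a downloaded file.
download_gb_per_s = DOWNLOADS_PER_S * AVG_BYTES / GB

# Transfers per second needed to reach the listed aggregate bandwidth
# if every transfer moves 100 MB.
transfers_per_s_for_listed_bw = LISTED_BANDWIDTH / AVG_BYTES

print(storage_pb_if_per_user, download_gb_per_s, transfers_per_s_for_listed_bw)
```

Under these readings the inputs give 100 PB of storage and 30 GB/s of download traffic, and the listed 300 GBps would correspond to about 3,000 transfers per second of 100 MB each. The listed 1 PB and 300 GBps therefore presumably rest on inputs not stated here, such as a different per-user storage figure or upload traffic on top of the 300 downloads/s.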




API design

Define what APIs are expected from the system...






Database design

Defining the system data model early on will clarify how data will flow among different components of the system. Also you could draw an ER diagram using the diagramming tool to enhance your design...






High-level design

You should identify enough components that are needed to solve the actual problem from end to end. Also remember to draw a block diagram using the diagramming tool to augment your design.






Request flows

Explain how the request flows from end to end in your high level design. Also you could draw a sequence diagram using the diagramming tool to enhance your explanation...






Detailed component design

Multi-part uploads for large files
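
A minimal sketch of the multi-part idea, assuming fixed-size parts that each carry a SHA-256 checksum so the server can verify and reassemble them in any arrival order. The function names `split_into_parts` and `assemble` are hypothetical, and a real client would stream parts from disk rather than hold the whole file in memory.

```python
import hashlib
from typing import Iterator, List, Tuple

Part = Tuple[int, bytes, str]  # (part number, payload, SHA-256 hex digest)


def split_into_parts(data: bytes, part_size: int) -> Iterator[Part]:
    """Yield numbered parts of at most part_size bytes, each with a checksum."""
    if part_size <= 0:
        raise ValueError("part_size must be positive")
    for number, offset in enumerate(range(0, len(data), part_size), start=1):
        chunk = data[offset:offset + part_size]
        yield number, chunk, hashlib.sha256(chunk).hexdigest()


def assemble(parts: List[Part]) -> bytes:
    """Verify each part and join them by part number, whatever order they arrived in."""
    out = bytearray()
    for number, chunk, digest in sorted(parts, key=lambda p: p[0]):
        if hashlib.sha256(chunk).hexdigest() != digest:
            raise ValueError(f"part {number} failed checksum")
        out += chunk
    return bytes(out)
```

Because each part is verified on its own, a failed or corrupted part can be retried without resending the rest of the file.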

Smart client that scales threads to upload multiple files in parallel
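
A rough sketch of such a client, assuming a thread pool whose size follows the number of files up to a fixed ceiling. `upload_all` and the `upload_one` callback are hypothetical names, and the callback stands in for whatever call actually sends a file and returns its URL.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def upload_all(
    paths: List[str],
    upload_one: Callable[[str], str],
    max_workers: int = 8,
) -> Dict[str, str]:
    """Upload files in parallel and return a mapping of path to URL.

    upload_one is a caller-supplied function (hypothetical here) that
    uploads a single file and returns the URL it was stored under.
    """
    if not paths:
        return {}
    # Scale the pool with the workload, but never past the configured ceiling.
    workers = min(max_workers, len(paths))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() preserves input order, so results line up with paths.
        urls = list(pool.map(upload_one, paths))
    return dict(zip(paths, urls))
```

Threads suit this workload because uploads are I/O bound. A fuller client might also adjust the ceiling based on observed throughput, which this sketch does not attempt.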

Checkpoint storage
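
One reading of checkpoint storage is that the client records which parts of an upload have completed so that an interrupted upload can resume instead of restarting. The sketch below assumes that reading. The JSON layout and the function names are hypothetical.

```python
import json
import os
from typing import List, Set


def load_checkpoint(path: str) -> Set[int]:
    """Return the set of completed part numbers, or an empty set if none is saved."""
    if not os.path.exists(path):
        return set()
    with open(path, "r", encoding="utf-8") as fh:
        return set(json.load(fh)["done_parts"])


def save_checkpoint(path: str, done: Set[int]) -> None:
    """Write the checkpoint via a temp file and rename so a crash cannot truncate it."""
    tmp = path + ".tmp"
    with open(tmp, "w", encoding="utf-8") as fh:
        json.dump({"done_parts": sorted(done)}, fh)
    os.replace(tmp, path)


def remaining_parts(total_parts: int, done: Set[int]) -> List[int]:
    """List the part numbers (1-based) that still need to be uploaded."""
    return [n for n in range(1, total_parts + 1) if n not in done]
```

On restart the client would call `load_checkpoint`, upload only what `remaining_parts` returns, and save the checkpoint after each part completes.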


Archive infrequently accessed files
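
A small sketch of how archive candidates might be selected, assuming the metadata service tracks a last-access timestamp per file and that "infrequently accessed" means idle for longer than some threshold. The function name and the `max_idle_days` parameter are illustrative assumptions.

```python
from datetime import datetime, timedelta
from typing import Dict, List


def files_to_archive(
    last_access: Dict[str, datetime], now: datetime, max_idle_days: int
) -> List[str]:
    """Return the names of files not accessed within the last max_idle_days days."""
    cutoff = now - timedelta(days=max_idle_days)
    return sorted(name for name, ts in last_access.items() if ts < cutoff)
```

A background job could run this periodically and move the selected files to a cheaper storage tier.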


Scaling metadata service
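
One way the metadata service could scale is by sharding on user ID with consistent hashing, so that adding or removing a shard moves only a fraction of the keys. This is an assumed approach, not one the notes commit to. `HashRing`, the shard names, and the virtual-node count are hypothetical.

```python
import bisect
import hashlib
from typing import List


def _hash(key: str) -> int:
    """Stable 64-bit hash derived from SHA-256 (Python's built-in hash() is salted)."""
    return int.from_bytes(hashlib.sha256(key.encode("utf-8")).digest()[:8], "big")


class HashRing:
    """Consistent-hash ring mapping user IDs to metadata shards."""

    def __init__(self, shards: List[str], vnodes: int = 64) -> None:
        if not shards:
            raise ValueError("at least one shard is required")
        # Each shard gets several points on the ring to even out the load.
        self._ring = sorted(
            (_hash(f"{shard}#{i}"), shard) for shard in shards for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    def shard_for(self, user_id: str) -> str:
        """Return the shard owning the first ring point after the key's hash."""
        idx = bisect.bisect_right(self._points, _hash(user_id)) % len(self._ring)
        return self._ring[idx][1]
```

Keying on user ID keeps all of one user's file metadata on a single shard, which makes per-user listing a single-shard query.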


Handling deletes and permission changes in a consistent fashion - use leases on metadata servers
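
A minimal sketch of the lease idea, assuming each server that caches a file's metadata holds a time-limited lease, and a delete or permission change is applied only once every outstanding lease has expired or been released. The `LeaseTable` class and its methods are hypothetical, and timestamps are passed in explicitly to keep the example deterministic.

```python
from typing import Dict


class LeaseTable:
    """Tracks leases on one file's metadata (hypothetical sketch)."""

    def __init__(self, lease_seconds: float) -> None:
        self.lease_seconds = lease_seconds
        self._expiry: Dict[str, float] = {}  # holder -> lease expiry time

    def grant(self, holder: str, now: float) -> float:
        """Grant or renew a lease and return its expiry time."""
        expiry = now + self.lease_seconds
        self._expiry[holder] = expiry
        return expiry

    def release(self, holder: str) -> None:
        """Give up a lease early, for example after an explicit invalidation."""
        self._expiry.pop(holder, None)

    def can_apply_change(self, now: float) -> bool:
        """A delete or permission change is safe once no lease is still valid."""
        return all(expiry <= now for expiry in self._expiry.values())
```

The lease duration bounds how long a stale permission can be served, so a revocation takes effect within at most one lease period even if a cache server is unreachable. A fuller version would also refuse new leases while a change is pending.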



Trade offs/Tech choices

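Trade off performance vs. reliability vs. cost: use RAID 10 for enterprise, RAID 0 for retail. RAID 0 has no redundancy, so durability for retail data would depend entirely on the geographic replication listed under the non-functional requirements.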
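
The cost side of this trade-off can be illustrated with usable capacity per array, assuming identical disks and two-way mirroring for RAID 10. The function name and the example disk counts are hypothetical.

```python
def raid_usable_tb(level: str, disks: int, disk_tb: float) -> float:
    """Usable capacity in TB for an array of identical disks.

    RAID 0 stripes across all disks with no redundancy.
    RAID 10 stripes across two-way mirrored pairs, so half the raw capacity is usable.
    """
    if level == "RAID0":
        if disks < 2:
            raise ValueError("RAID 0 needs at least 2 disks")
        return disks * disk_tb
    if level == "RAID10":
        if disks < 4 or disks % 2:
            raise ValueError("RAID 10 needs an even number of disks, at least 4")
        return disks * disk_tb / 2
    raise ValueError(f"unsupported level: {level}")
```

For the same eight 4 TB disks, RAID 0 yields 32 TB and RAID 10 yields 16 TB, so the enterprise tier pays roughly double per usable terabyte in exchange for surviving a disk failure locally.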







Failure scenarios/bottlenecks

Try to discuss as many failure scenarios/bottlenecks as possible.






Future improvements

What are some future improvements you would make? How would you mitigate the failure scenario(s) you described above?