System requirements


Functional:


User needs to be able to upload files

User needs to be able to download files

User needs to be able to share files with other users and view shared files


Non-Functional:



System prioritizes consistency over availability: once a shared file has been edited, no user should be able to download the stale, pre-edit version
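One way to read this requirement is that a download must never return a version older than the latest committed edit. The sketch below illustrates that on a single node; `VersionedStore` and its methods are hypothetical names, and the lock only stands in for whatever coordination a distributed deployment would need.

```python
import threading


class VersionedStore:
    """Hypothetical single-node store used only to illustrate the rule."""

    def __init__(self):
        self._lock = threading.Lock()
        self._files = {}  # file_id -> (version, content)

    def write(self, file_id, content):
        # An edit is committed atomically and bumps the version number.
        with self._lock:
            version, _ = self._files.get(file_id, (0, b""))
            self._files[file_id] = (version + 1, content)
            return version + 1

    def read(self, file_id):
        # Reads take the same lock, so they always see the latest commit.
        with self._lock:
            return self._files[file_id]


store = VersionedStore()
store.write("report", b"draft")
latest = store.write("report", b"edited")
version, content = store.read("report")
```

The cost of this choice is availability: while an edit is being committed, readers wait instead of being served the old copy.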


System must scale to hundreds of millions of users


System needs to have low latency for upload, download, and file viewing




Capacity estimation



~700 million users

50GB of files for each user


~35,000,000,000 GB total (about 35 EB)
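The arithmetic behind that total, as a quick check. It assumes decimal units (1 EB = 10^9 GB) and that every user fills the full 50 GB, so it is an upper bound rather than a measured figure.

```python
USERS = 700_000_000   # ~700 million users (from the estimate above)
GB_PER_USER = 50      # 50 GB of files per user

total_gb = USERS * GB_PER_USER
total_pb = total_gb / 1_000_000       # 1 PB = 10^6 GB (decimal units)
total_eb = total_gb / 1_000_000_000   # 1 EB = 10^9 GB (decimal units)

print(f"{total_gb:,} GB = {total_pb:,.0f} PB = {total_eb:,.0f} EB")
```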




API design



uploadFile(file, userID)

viewFile(fileID, userID)

shareFile(fileID, shareWith)

downloadFile(fileID)
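A minimal in-memory sketch of how these four calls could fit together. `FileService`, `StoredFile`, the return types, and the permission rule are assumptions for illustration, not part of the API as listed.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class StoredFile:
    owner: str
    content: bytes
    shared_with: set = field(default_factory=set)


class FileService:
    """Hypothetical in-memory stand-in for the four API calls."""

    def __init__(self):
        self._files = {}  # file_id -> StoredFile

    def upload_file(self, file: bytes, user_id: str) -> str:
        file_id = str(uuid.uuid4())
        self._files[file_id] = StoredFile(owner=user_id, content=file)
        return file_id

    def share_file(self, file_id: str, share_with: str) -> None:
        self._files[file_id].shared_with.add(share_with)

    def view_file(self, file_id: str, user_id: str) -> bytes:
        stored = self._files[file_id]
        # Assumed rule: only the owner or a shared-with user may view.
        if user_id != stored.owner and user_id not in stored.shared_with:
            raise PermissionError(f"{user_id} cannot view {file_id}")
        return stored.content

    def download_file(self, file_id: str) -> bytes:
        # The listed signature has no userID, so no access check is possible.
        return self._files[file_id].content


svc = FileService()
fid = svc.upload_file(b"hello", "alice")
svc.share_file(fid, "bob")
```

As listed, downloadFile(fileID) carries no userID, so it cannot enforce sharing permissions; adding a userID parameter, as viewFile has, would be one way to close that gap.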




Database design



Amazon S3 (blob storage) for videos, images, and other unstructured data


Amazon EFS for regular (non-media) files



High-level design




The client makes a request, which first goes through a CDN for regional support. It then reaches a load balancer, which distributes the load across our fleet of servers/API gateways.
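The design does not say which balancing strategy is used. As an illustration only, round-robin is the simplest option; `RoundRobinBalancer` and the server names below are hypothetical.

```python
import itertools


class RoundRobinBalancer:
    """Hypothetical balancer that hands requests to servers in turn."""

    def __init__(self, servers):
        self._cycle = itertools.cycle(servers)

    def route(self, request):
        # The request content is ignored; each call picks the next server.
        return next(self._cycle)


balancer = RoundRobinBalancer(["api-1", "api-2", "api-3"])
routes = [balancer.route(f"req-{i}") for i in range(4)]
```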


The backend uses a microservice architecture, with separate services for uploading, downloading, and viewing.


For file uploading, we will have a Kafka stream to queue upload requests. This buffers bursts when millions of users try to upload files at the same time, which keeps the system scalable and helps support our low-latency non-functional requirement.
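The buffering idea can be sketched with Python's standard-library `queue.Queue` standing in for the Kafka stream. This is not Kafka client code; `upload_queue`, `upload_worker`, and the `stored` dict are hypothetical names for the queue, the upload service, and the storage layer.

```python
import queue
import threading

upload_queue = queue.Queue(maxsize=100)  # stand-in for the Kafka stream
stored = {}                              # stand-in for the storage layer


def upload_worker():
    # The upload service drains the queue at its own pace.
    while True:
        job = upload_queue.get()
        if job is None:  # sentinel: shut the worker down
            upload_queue.task_done()
            break
        file_id, content = job
        stored[file_id] = content
        upload_queue.task_done()


worker = threading.Thread(target=upload_worker, daemon=True)
worker.start()

# Producers enqueue uploads and return without waiting for storage.
for i in range(5):
    upload_queue.put((f"file-{i}", b"data"))

upload_queue.put(None)
upload_queue.join()  # wait until every queued job has been processed
worker.join()
```

The bounded `maxsize` means producers block when the queue is full, so a burst slows callers down instead of overwhelming the storage layer.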


For downloading, we will also have Kafka in between. When millions of users request downloads, it queues the requests efficiently so the database isn't overwhelmed. Files will go through Kafka and be returned to the user.


We will use S3 (blob storage) because it best supports unstructured data, since users are able to upload videos and pictures. We will use Amazon EFS for file storage.
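A small sketch of how an upload could be routed between the two stores. It assumes the choice is made from the file's MIME type, which the design does not specify; `choose_backend` is a hypothetical helper that returns a label and makes no AWS calls.

```python
import mimetypes


def choose_backend(filename):
    """Return "s3" for images and videos, "efs" for everything else."""
    mime, _ = mimetypes.guess_type(filename)
    if mime and mime.startswith(("image/", "video/")):
        return "s3"
    return "efs"


backends = {name: choose_backend(name)
            for name in ["clip.mp4", "photo.jpg", "notes.txt"]}
```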




Request flows

Explain how the request flows from end to end in your high level design. Also you could draw a sequence diagram using the diagramming tool to enhance your explanation...






Detailed component design

Dig deeper into 2-3 components and explain in detail how they work. For example, how well does each component scale? Any relevant algorithm or data structure you like to use for a component? Also you could draw a diagram using the diagramming tool to enhance your design...






Trade offs/Tech choices

Explain any trade offs you have made and why you made certain tech choices...






Failure scenarios/bottlenecks

Try to discuss as many failure scenarios/bottlenecks as possible.






Future improvements

What are some future improvements you would make? How would you mitigate the failure scenario(s) you described above?