System requirements


Functional:



1) Users should be able to upload files from any device.

2) Users should be able to download files from any device.

3) Users should be able to share files with other users and view the files shared with them.

4) Users should be able to automatically sync files across their devices.


Non-Functional:


1) The system should favor availability over consistency; strong (immediate) consistency is not a requirement here.

2) The system should support large file uploads, for example 50 GB.

3) The system should be as secure as possible with respect to file sharing, and should support recovery of corrupt or lost files.

4) Upload/Download/Sync latency should be as low as possible.
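
One common way to approach requirement 2 is to split a large file into fixed-size chunks so that each chunk can be uploaded, retried, and verified on its own. The design above does not specify this, so the sketch below is only an illustration under that assumption. The function name `iter_chunks`, the tiny chunk size, and the use of SHA-256 as a per-chunk checksum are all hypothetical choices.

```python
import hashlib
import io
from typing import BinaryIO, Iterator, Tuple

# Assumption: a tiny chunk size for the demo; a real client would likely use megabytes.
DEMO_CHUNK_SIZE = 4


def iter_chunks(stream: BinaryIO, chunk_size: int) -> Iterator[Tuple[int, bytes, str]]:
    """Yield (index, data, sha256_hex) for each chunk read from the stream."""
    index = 0
    while True:
        data = stream.read(chunk_size)
        if not data:
            break
        # The checksum lets the receiver detect a corrupted chunk.
        yield index, data, hashlib.sha256(data).hexdigest()
        index += 1


# Usage: a 10-byte payload split into chunks of 4 bytes.
chunks = list(iter_chunks(io.BytesIO(b"abcdefghij"), DEMO_CHUNK_SIZE))
reassembled = b"".join(data for _, data, _ in chunks)
```

Because the chunks are read lazily from a stream, the client never needs to hold the whole file in memory, which matters at sizes like 50 GB.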



Capacity estimation

Estimate the scale of the system you are going to design...






API design


Initial API design (subject to change):


POST /v1/files/upload


Request:

{

File,

FileMetadata

}


GET /v1/files/download/{file_id} : File


POST /v1/files/share/{file_id}

Request:

{

User[]

}


Fetching changes in file


GET /v1/files/changes/{file_id}: FileMetadata[]
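
To make the shape of these four endpoints concrete, here is a minimal in-memory sketch. It is not a real server: the class name `InMemoryFileService`, the method names, and the use of plain dicts for metadata are assumptions made for illustration, and the change history is modelled simply as a list of metadata snapshots.

```python
import uuid
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class InMemoryFileService:
    """Toy stand-in for the four endpoints; all storage is in-process dicts."""

    blobs: Dict[str, bytes] = field(default_factory=dict)
    changes: Dict[str, List[dict]] = field(default_factory=dict)
    shared_with: Dict[str, Set[str]] = field(default_factory=dict)

    def upload(self, content: bytes, metadata: dict) -> str:
        # POST /v1/files/upload
        file_id = str(uuid.uuid4())
        self.blobs[file_id] = content
        # Record the first metadata snapshot for this file.
        self.changes[file_id] = [dict(metadata, file_id=file_id, size=len(content))]
        self.shared_with[file_id] = set()
        return file_id

    def download(self, file_id: str) -> bytes:
        # GET /v1/files/download/{file_id}
        return self.blobs[file_id]

    def share(self, file_id: str, user_ids: List[str]) -> None:
        # POST /v1/files/share/{file_id}
        self.shared_with[file_id].update(user_ids)

    def get_changes(self, file_id: str) -> List[dict]:
        # GET /v1/files/changes/{file_id}
        return list(self.changes[file_id])
```

A caller would upload bytes, keep the returned `file_id`, and pass it to the other three methods, mirroring how `{file_id}` appears in the URL paths above.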





Database design


Three core entities:

- File

- User

- FileMetadata


FileMetadata:

file_id

size in bytes

file_name

uploaded_at

uploaded_by(user)

mime_type

status
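
The FileMetadata fields above could be expressed as a typed record, sketched here as a Python dataclass. The field types and the `FileStatus` values (`PENDING`, `UPLOADED`, `FAILED`) are assumptions; the design lists a `status` field but does not enumerate its values.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from enum import Enum


class FileStatus(Enum):
    # Hypothetical status values; the design does not enumerate them.
    PENDING = "pending"
    UPLOADED = "uploaded"
    FAILED = "failed"


@dataclass
class FileMetadata:
    file_id: str
    size_bytes: int
    file_name: str
    uploaded_at: datetime
    uploaded_by: str  # id of the user who uploaded the file
    mime_type: str
    status: FileStatus = FileStatus.PENDING


# Usage: a record created when an upload starts, before the bytes are stored.
meta = FileMetadata(
    file_id="f-1",
    size_bytes=1024,
    file_name="report.pdf",
    uploaded_at=datetime.now(timezone.utc),
    uploaded_by="u-1",
    mime_type="application/pdf",
)
```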




High-level design



1) UploadClient -> Client responsible for calling the upload API to upload files to the backend.

2) DownloadClient -> Client responsible for calling the download API to download files from the backend.

3) API Gateway and Load Balancer -> A load balancer such as AWS Elastic Load Balancing routes each request to a healthy API gateway instance. The API gateway is responsible for auth, rate limiting, and routing requests to a file server instance.

4) File Server -> Horizontally scalable file servers responsible for uploading, downloading, and syncing files.

5) FileMetadata DB -> Stores the metadata for each uploaded file, as described in the entities above.

6) BlobStorage DB -> Object storage such as AWS S3 or Google Cloud Storage, where the actual file contents are stored.

7) CDN -> A content delivery network such as AWS CloudFront that caches recently fetched files so they can be served from locations geographically close to users, at lower latency.
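
The relationship between the CDN (component 7) and blob storage (component 6) on the download path can be sketched as a read-through cache. This is a simplified in-memory model, not how any particular CDN is configured; the class names `BlobStore` and `CdnCache` are hypothetical, and eviction and expiry are left out.

```python
from typing import Dict


class BlobStore:
    """Stand-in for object storage; counts reads so cache hits are visible."""

    def __init__(self) -> None:
        self._objects: Dict[str, bytes] = {}
        self.reads = 0

    def put(self, key: str, data: bytes) -> None:
        self._objects[key] = data

    def get(self, key: str) -> bytes:
        self.reads += 1
        return self._objects[key]


class CdnCache:
    """Read-through cache: serve from the cache, fall back to the origin on a miss."""

    def __init__(self, origin: BlobStore) -> None:
        self._origin = origin
        self._cache: Dict[str, bytes] = {}

    def fetch(self, key: str) -> bytes:
        if key not in self._cache:
            # Cache miss: pull from blob storage and keep a copy.
            self._cache[key] = self._origin.get(key)
        return self._cache[key]


# Usage: the second fetch of the same file does not touch blob storage.
store = BlobStore()
store.put("f-1", b"file-bytes")
cdn = CdnCache(store)
first = cdn.fetch("f-1")
second = cdn.fetch("f-1")
```

In this sketch only the first request for a file reaches the origin, which is the effect the CDN is meant to have for recently fetched files.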





Request flows

Explain how the request flows from end to end in your high level design. Also you could draw a sequence diagram using the diagramming tool to enhance your explanation...






Detailed component design

Dig deeper into 2-3 components and explain in detail how they work. For example, how well does each component scale? Any relevant algorithm or data structure you like to use for a component? Also you could draw a diagram using the diagramming tool to enhance your design...






Trade offs/Tech choices

Explain any trade offs you have made and why you made certain tech choices...






Failure scenarios/bottlenecks

Try to discuss as many failure scenarios/bottlenecks as possible.






Future improvements

What are some future improvements you would make? How would you mitigate the failure scenario(s) you described above?