System requirements


Functional:

Upload files (text files, photos, videos)

Share files (each file has its own shareable link) - only consider whether the user can view the file or not.


Collaborative editing

Multi-versioning

Multi-level access control (view, comment, edit)


Non-Functional:

High availability

Low latency

Data storage backup - no data loss.



Capacity estimation

Assume 100,000 DAU (daily active users), each uploading one file every minute

100,000 / 60 ~= 1,667 TPS ~= 2K TPS

QPS = 5 * TPS ~= 8,333 QPS ~= 10K QPS

File size: 100KB max

1,667 * 60 * 60 * 24 ~= 144M files / day; at 100KB each = 14.4 TB / day ~= 5.3 PB / year - huge storage - BLOB storage for static files.
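A short script makes the estimate easy to re-run with different inputs. This is only a back-of-the-envelope sketch: the constant names are illustrative, decimal units are assumed (1 TB = 10^12 bytes), and every file is assumed to hit the 100KB maximum, so the storage figures are upper bounds.

```python
# Back-of-the-envelope capacity estimate. All inputs are assumptions.
DAILY_ACTIVE_USERS = 100_000
UPLOADS_PER_USER_PER_MINUTE = 1
READS_PER_WRITE = 5             # QPS = 5 * TPS
MAX_FILE_SIZE_BYTES = 100_000   # 100KB, decimal units

# Writes per second: uploads per minute across all users, divided by 60.
write_tps = DAILY_ACTIVE_USERS * UPLOADS_PER_USER_PER_MINUTE / 60
read_qps = READS_PER_WRITE * write_tps

# Storage growth, assuming every file is at the maximum size.
files_per_day = DAILY_ACTIVE_USERS * UPLOADS_PER_USER_PER_MINUTE * 60 * 24
bytes_per_day = files_per_day * MAX_FILE_SIZE_BYTES
bytes_per_year = bytes_per_day * 365

print(f"write TPS     : {write_tps:,.0f}")
print(f"read QPS      : {read_qps:,.0f}")
print(f"files per day : {files_per_day:,}")
print(f"TB per day    : {bytes_per_day / 1e12:.1f}")
print(f"PB per year   : {bytes_per_year / 1e15:.2f}")
```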



API design

GET openFile {file_link, user_id}:

return the file if the user has access to it.

return 403 Forbidden if the user doesn't have access to this file.

return 500 if the server is down or any other internal error occurs.


POST uploadFile {file(BLOB), user_id}:

return the file_link if the file is uploaded successfully.

return 413 if the file is larger than the limit.

return 500 if the server is down, the file failed to upload, or any other internal error occurs.
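The uploadFile contract can be sketched as a plain function that returns a status code and a link. This is a minimal illustration, not a real service: `blob_store` and `file_owner` are hypothetical in-memory stand-ins for BLOB storage and the metadata table, the `/files/<hex>` link format is invented for the example, and 201 as the success status is an assumption (the design only says the link is returned).

```python
import uuid
from http import HTTPStatus
from typing import Dict, Optional, Tuple

MAX_FILE_SIZE_BYTES = 100_000  # 100KB limit from the capacity estimate

# Hypothetical in-memory stand-ins for BLOB storage and file metadata.
blob_store: Dict[str, bytes] = {}
file_owner: Dict[str, int] = {}


def upload_file(file: bytes, user_id: int) -> Tuple[int, Optional[str]]:
    """Return (status_code, file_link); file_link is None on failure."""
    # Reject oversized payloads before touching storage.
    if len(file) > MAX_FILE_SIZE_BYTES:
        return HTTPStatus.REQUEST_ENTITY_TOO_LARGE, None  # 413
    try:
        file_link = f"/files/{uuid.uuid4().hex}"
        blob_store[file_link] = file
        file_owner[file_link] = user_id
    except Exception:
        # Any storage failure is reported as an internal error.
        return HTTPStatus.INTERNAL_SERVER_ERROR, None  # 500
    return HTTPStatus.CREATED, file_link  # 201 (assumed success status)


status, link = upload_file(b"hello world", user_id=7)
print(int(status), link)
```

Checking the size before writing anything keeps rejected uploads from consuming storage; a real service would likely also enforce the limit at the load balancer or gateway.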


PUT updateAccess {file_link, access(Object)}:

return 200 if the access is updated successfully.

return 500 if the server is down or any other internal error occurs.


access: {

type: ACCOUNT, TEAM, ALL

type_id: id

}
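One way to model the access object is a small validated type. The sketch below is an assumption about how the payload might be parsed, not part of the API definition: the class names, the `from_dict` helper, and the rule that `type_id` may be omitted only for ALL are all illustrative.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class AccessType(Enum):
    ACCOUNT = "ACCOUNT"  # a single user account
    TEAM = "TEAM"        # every member of one team
    ALL = "ALL"          # anyone with the link


@dataclass(frozen=True)
class Access:
    type: AccessType
    type_id: Optional[int] = None  # account or team id; not needed for ALL

    def __post_init__(self) -> None:
        # ACCOUNT and TEAM grants must say which account or team they target.
        if self.type is not AccessType.ALL and self.type_id is None:
            raise ValueError(f"type_id is required when type is {self.type.value}")

    @classmethod
    def from_dict(cls, payload: dict) -> "Access":
        # AccessType(...) raises ValueError for an unknown type string.
        return cls(AccessType(payload["type"]), payload.get("type_id"))


team_access = Access.from_dict({"type": "TEAM", "type_id": 42})
print(team_access)
```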


Database design

BLOB storage: {

url: String

file: BLOB

}


SQL:

file_metadata: {

file_url: String (PK)

owner_id: Long (FK)

created_at: timestamp

updated_at: timestamp

name: String (Search Index)

}


user: {

user_id: Long (PK)

name: String

team_id: Long (FK)

}


team: {

team_id: Long (PK)

team_name: String

}
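The three SQL tables can be tried out with Python's built-in sqlite3 module. This is a sketch under assumptions: SQLite stands in for whatever SQL database is chosen, `owner_id` is typed as an integer so that it matches `user.user_id`, the index name is invented, and a plain B-tree index on `name` (exact and prefix lookups only) stands in for the search index.

```python
import sqlite3

SCHEMA = """
CREATE TABLE team (
    team_id   INTEGER PRIMARY KEY,
    team_name TEXT NOT NULL
);
CREATE TABLE "user" (
    user_id INTEGER PRIMARY KEY,
    name    TEXT NOT NULL,
    team_id INTEGER REFERENCES team(team_id)
);
CREATE TABLE file_metadata (
    file_url   TEXT PRIMARY KEY,
    owner_id   INTEGER NOT NULL REFERENCES "user"(user_id),
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    name       TEXT NOT NULL
);
CREATE INDEX idx_file_metadata_name ON file_metadata(name);
"""

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
conn.executescript(SCHEMA)

# Sample rows (made-up values) to exercise the foreign keys.
conn.execute("INSERT INTO team (team_id, team_name) VALUES (?, ?)", (1, "storage"))
conn.execute('INSERT INTO "user" (user_id, name, team_id) VALUES (?, ?, ?)', (7, "ada", 1))
conn.execute(
    "INSERT INTO file_metadata (file_url, owner_id, name) VALUES (?, ?, ?)",
    ("/files/abc123", 7, "design-notes.txt"),
)

# Look a file up by name, which uses the index on file_metadata(name).
owner = conn.execute(
    "SELECT owner_id FROM file_metadata WHERE name = ?", ("design-notes.txt",)
).fetchone()[0]
print(owner)
```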


NoSQL:

access_graph: {

file_url: String (Search Index)

user_id: Long (Search Index)

}
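Indexing access_graph on both fields means it can answer two questions cheaply: "who can see this file?" and "which files can this user see?". The class below is an in-memory illustration of that idea only; the class and method names are hypothetical, and a real NoSQL store would provide the two indexes itself.

```python
from collections import defaultdict
from typing import DefaultDict, Set


class AccessGraph:
    """In-memory stand-in for the access_graph store (illustrative only)."""

    def __init__(self) -> None:
        # One lookup table per search index.
        self._users_by_file: DefaultDict[str, Set[int]] = defaultdict(set)
        self._files_by_user: DefaultDict[int, Set[str]] = defaultdict(set)

    def grant(self, file_url: str, user_id: int) -> None:
        self._users_by_file[file_url].add(user_id)
        self._files_by_user[user_id].add(file_url)

    def revoke(self, file_url: str, user_id: int) -> None:
        self._users_by_file[file_url].discard(user_id)
        self._files_by_user[user_id].discard(file_url)

    def has_access(self, file_url: str, user_id: int) -> bool:
        # .get avoids creating empty entries for unknown files.
        return user_id in self._users_by_file.get(file_url, ())

    def files_for(self, user_id: int) -> Set[str]:
        return set(self._files_by_user.get(user_id, ()))


graph = AccessGraph()
graph.grant("/files/abc123", 7)
print(graph.has_access("/files/abc123", 7), graph.has_access("/files/abc123", 8))
```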



High-level design

Please see High-Level Diagram





Request flows

openFile:

After the request arrives at the load balancer, it is routed to the access service according to the load-balancing algorithm. The access service checks whether the current user_id has access to this file_url. If it does, the access service asks the file service to download the file. If it does not, the request is rejected with 403 Forbidden.
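The openFile flow can be sketched as two calls in sequence: an access check, then a download. Everything here is a stand-in: `ACCESS` and `BLOBS` are hypothetical in-memory replacements for the access service's store and BLOB storage, the function names are invented, and load balancing is left out.

```python
from http import HTTPStatus
from typing import Optional, Tuple

# Hypothetical data: (file_url, user_id) pairs with view access, and stored blobs.
ACCESS = {("/files/report", 1)}
BLOBS = {"/files/report": b"quarterly numbers"}


def access_service_can_view(file_url: str, user_id: int) -> bool:
    return (file_url, user_id) in ACCESS


def file_service_download(file_url: str) -> bytes:
    return BLOBS[file_url]


def open_file(file_url: str, user_id: int) -> Tuple[int, Optional[bytes]]:
    """Return (status_code, file_bytes); file_bytes is None on failure."""
    try:
        # Step 1: the access service decides whether the user may view the file.
        if not access_service_can_view(file_url, user_id):
            return HTTPStatus.FORBIDDEN, None  # 403
        # Step 2: only then is the file service asked for the content.
        return HTTPStatus.OK, file_service_download(file_url)
    except Exception:
        # Any failure in either service is reported as an internal error.
        return HTTPStatus.INTERNAL_SERVER_ERROR, None  # 500


print(open_file("/files/report", 1))
print(open_file("/files/report", 2))
```

Running the access check first means the file service never has to load content for a caller who will be rejected.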


uploadFile:





Detailed component design

Dig deeper into 2-3 components and explain in detail how they work. For example, how well does each component scale? Any relevant algorithm or data structure you like to use for a component? Also you could draw a diagram using the diagramming tool to enhance your design...






Trade offs/Tech choices

Explain any trade offs you have made and why you made certain tech choices...






Failure scenarios/bottlenecks

Try to discuss as many failure scenarios/bottlenecks as possible.






Future improvements

What are some future improvements you would make? How would you mitigate the failure scenario(s) you described above?