System requirements
Functional:
- Users should be able to create and manage directories and subdirectories. Every user has a root directory by default.
- Users should be able to create, read, update, and delete (CRUD) files inside directories
- Users should be able to share files with other users
- Users might have many devices, so we need to synchronize files between devices
- Clients: mobile apps or web
- User tiers: free, paid, enterprise
- Quota:
  - free: 10 GB
  - paid/enterprise: based on payment (up to 2 TB)
- File size: up to 100 GB
- Conflict resolution
- Versioning
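The tier and size limits above can be sketched as a small pre-upload check. This is only an illustration: the `Account` type, its `purchased_quota_bytes` field, and the `can_upload` helper are hypothetical names that do not come from the requirements, and decimal units (1 GB = 10^9 bytes) are assumed.

```python
from dataclasses import dataclass

GB = 10**9  # assumption: decimal units
TB = 10**12

MAX_FILE_SIZE = 100 * GB   # "File size: up to 100 GB"
FREE_QUOTA = 10 * GB       # free tier quota
MAX_PAID_QUOTA = 2 * TB    # upper bound for paid/enterprise


@dataclass
class Account:
    """Hypothetical account record used only for this sketch."""
    tier: str                       # "free", "paid" or "enterprise"
    used_bytes: int                 # storage already consumed
    purchased_quota_bytes: int = 0  # hypothetical field; ignored for the free tier


def quota_bytes(account: Account) -> int:
    """Return the storage quota that applies to the account's tier."""
    if account.tier == "free":
        return FREE_QUOTA
    if account.tier in ("paid", "enterprise"):
        # Quota depends on payment but never exceeds 2 TB.
        return min(account.purchased_quota_bytes, MAX_PAID_QUOTA)
    raise ValueError(f"unknown tier: {account.tier}")


def can_upload(account: Account, file_size: int) -> bool:
    """Check the per-file limit first, then the remaining quota."""
    if file_size > MAX_FILE_SIZE:
        return False
    return account.used_bytes + file_size <= quota_bytes(account)


if __name__ == "__main__":
    print(can_upload(Account("free", 0), 5 * GB))
    print(can_upload(Account("paid", 0, 1 * TB), 150 * GB))
```

Note that under these limits a free user can never upload a file larger than 10 GB, because the quota is smaller than the per-file limit.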
Non-Functional:
- Durability: once a file is saved, it must not be lost and must be readable afterwards
- Scalable
- High Availability
Capacity estimation
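100 million total users globally
1 million daily active users (DAU)
Each active user saves 1 file per day on average
2 devices per user on average: each saved file needs 1 write (upload) and 1 read (download to the other device)
10 GB stored per user on average
10 MB per file on average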
---
Disk space needed:
100 million users * 10 GB = 1 billion GB = 1 million TB (about 1 EB) for assets
Plus some storage for metadata
Traffic/Bandwidth:
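1 million DAU * 1 file per day / (24*60*60 sec) ≈ 12 write requests/sec (rps)
12 rps * 10 MB = 120 MB/sec, or 960 Mbit/sec, for writes; the matching reads to the second device add the same amount again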
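The arithmetic in this section can be reproduced with a short script. It is a rough sketch that takes the stated averages at face value and assumes decimal units (1 TB = 1000 GB, 1 MB/sec = 8 Mbit/sec); the constant and function names are illustrative only.

```python
# Back-of-the-envelope capacity estimate using the figures from this section.

TOTAL_USERS = 100_000_000      # total users globally
DAILY_USERS = 1_000_000        # daily active users
FILES_PER_USER_PER_DAY = 1     # average files saved per active user
AVG_STORAGE_PER_USER_GB = 10   # average stored data per user
AVG_FILE_SIZE_MB = 10          # average file size
SECONDS_PER_DAY = 24 * 60 * 60


def storage_tb() -> float:
    """Total asset storage in TB (metadata not included)."""
    return TOTAL_USERS * AVG_STORAGE_PER_USER_GB / 1000


def write_rps() -> float:
    """Average write requests per second, assuming uniform load over the day."""
    return DAILY_USERS * FILES_PER_USER_PER_DAY / SECONDS_PER_DAY


def write_bandwidth_mb_per_s(rps: float) -> float:
    """Average upload bandwidth in MB/sec for a given request rate."""
    return rps * AVG_FILE_SIZE_MB


if __name__ == "__main__":
    rps = write_rps()
    mb_s = write_bandwidth_mb_per_s(rps)
    print(f"storage: {storage_tb():,.0f} TB")
    print(f"writes: {rps:.1f} rps")
    print(f"write bandwidth: {mb_s:.0f} MB/s = {mb_s * 8:.0f} Mbit/s")
```

Without rounding, the rate comes out slightly below 12 rps, so the 12 rps and 120 MB/sec figures are mild over-estimates. These are daily averages; peak traffic is likely to be several times higher.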
API design
GET /api/v1/assets?dir=
Response:
{
dirs: [{"id", "name", "created", "size"}],
files: [{"id", "name", "created", "updated"}]
}
POST /api/v1/assetPutRequest
POST /api/v1/assetPut
PUT /api/v1/assetPutRequest/{assetId}
DELETE /api/v1/assetPutRequest/{assetId}
POST /api/v1/dir
PUT /api/v1/dir/{dirId}
DELETE /api/v1/dir/{dirId}
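The listing response of GET /api/v1/assets can be modelled on the client roughly as follows. The field names come from the response shape above; the field types (string ids, ISO 8601 timestamps, size in bytes) and the names `DirEntry`, `FileEntry`, `Listing`, and `parse_listing` are assumptions made for this sketch.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass
class DirEntry:
    id: str
    name: str
    created: str  # assumed ISO 8601 timestamp
    size: int     # assumed total size in bytes


@dataclass
class FileEntry:
    id: str
    name: str
    created: str  # assumed ISO 8601 timestamp
    updated: str  # assumed ISO 8601 timestamp


@dataclass
class Listing:
    dirs: List[DirEntry]
    files: List[FileEntry]


def parse_listing(body: str) -> Listing:
    """Parse the JSON body of a directory listing into typed entries."""
    data = json.loads(body)
    return Listing(
        dirs=[DirEntry(**d) for d in data.get("dirs", [])],
        files=[FileEntry(**f) for f in data.get("files", [])],
    )


# Made-up sample payload, for illustration only.
SAMPLE_BODY = json.dumps({
    "dirs": [{"id": "d1", "name": "photos",
              "created": "2024-01-01T00:00:00Z", "size": 1024}],
    "files": [{"id": "f1", "name": "notes.txt",
               "created": "2024-01-02T00:00:00Z",
               "updated": "2024-01-03T00:00:00Z"}],
})

if __name__ == "__main__":
    listing = parse_listing(SAMPLE_BODY)
    print(listing.dirs[0].name, listing.files[0].name)
```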
Database design
Defining the system data model early on will clarify how data will flow among different components of the system. Also you could draw an ER diagram using the diagramming tool to enhance your design...
High-level design
You should identify enough components that are needed to solve the actual problem from end to end. Also remember to draw a block diagram using the diagramming tool to augment your design. If you are unfamiliar with the tool, you can simply describe your design to the chat bot and ask it to generate a starter diagram for you to modify...
Request flows
Explain how the request flows from end to end in your high level design. Also you could draw a sequence diagram using the diagramming tool to enhance your explanation...
Detailed component design
Dig deeper into 2-3 components and explain in detail how they work. For example, how well does each component scale? Any relevant algorithm or data structure you like to use for a component? Also you could draw a diagram using the diagramming tool to enhance your design...
Trade offs/Tech choices
Explain any trade offs you have made and why you made certain tech choices...
Failure scenarios/bottlenecks
Try to discuss as many failure scenarios/bottlenecks as possible.
Future improvements
What are some future improvements you would make? How would you mitigate the failure scenario(s) you described above?