System requirements
Functional:
- Web interface
- Users have accounts
- Can create folders
- Upload any files into folders
- Download file
- Delete files / folders
- Only the owning user can view and download the file
- Handle parallel uploads with conflicting file names -- just rename (e.g. add a suffix)
- Descoped
  - Permission for sharing
  - Versioning of files (upload same name file, keep older versions)
Non-Functional:
- Durability -- uploaded files should not be lost
- Security -- ensure files stay private and only owner can access
- Not latency sensitive
- Low cost to store and upload/download large amounts of files
Capacity estimation
Each user uses 50GB
10,000,000 users
500,000,000 GB total storage --> 500 Petabytes
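10% of users active per day, reading and uploading --> 1,000,000 active users
1,000,000 uploads per day (~1 per active user) --> ~41.7k per hour --> ~694 per minute --> ~12 per second
10,000,000 reads per day (~10 per active user) --> ~116 per second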
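The arithmetic above can be checked with a short script. This is only a back-of-envelope sketch: it assumes decimal units (1 PB = 1,000,000 GB), and the variable names are illustrative.

```python
# Back-of-envelope check of the capacity numbers (decimal units assumed).
GB_PER_USER = 50
USERS = 10_000_000
UPLOADS_PER_DAY = 1_000_000
READS_PER_DAY = 10_000_000
SECONDS_PER_DAY = 24 * 60 * 60

total_gb = GB_PER_USER * USERS
total_pb = total_gb / 1_000_000          # 1 PB = 1,000,000 GB

uploads_per_hour = UPLOADS_PER_DAY / 24
uploads_per_minute = UPLOADS_PER_DAY / (24 * 60)
uploads_per_second = UPLOADS_PER_DAY / SECONDS_PER_DAY
reads_per_second = READS_PER_DAY / SECONDS_PER_DAY

print(f"storage: {total_gb:,} GB = {total_pb:,.0f} PB")
print(f"uploads: {uploads_per_hour:,.0f}/hour, "
      f"{uploads_per_minute:,.0f}/minute, {uploads_per_second:.1f}/second")
print(f"reads:   {reads_per_second:.1f}/second")
```

These are daily averages; peak traffic would be some multiple of them.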
API design
CreateFolder
- parent_folder_id
ListFolder
UpdateFolder
DeleteFolder
/* RequestUpload can handle one file at a time rather than a batch, b/c the limiting factor is the upload itself. The client can limit how many parallel uploads it runs. */
RequestUpload
- returns signed S3 URLs that the client uses for a multipart upload of the file (assuming we don't need to pre-process the upload on our server)
CreateFile
- parent_folder_id
- uploaded_s3_key
GetFile
- returned object includes a signed S3 URL that can be used to access the actual file
UpdateFile
DeleteFile
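RequestUpload and GetFile both hand the client a signed URL. The idea behind a signed, expiring URL can be sketched with an HMAC over the method, object key, and expiry time. This is a conceptual stand-in only, not S3's actual signing scheme; a real deployment would use the object store SDK's presigning call. `SECRET`, `sign_url`, `verify_url`, and the `files.example.com` host are all hypothetical.

```python
import hashlib
import hmac
from urllib.parse import parse_qs, urlencode, urlparse

# Hypothetical shared secret; a real object store keeps its own signing keys.
SECRET = b"demo-secret"

def _signature(method: str, key: str, expires_at: int) -> str:
    message = f"{method}\n{key}\n{expires_at}".encode()
    return hmac.new(SECRET, message, hashlib.sha256).hexdigest()

def sign_url(method: str, key: str, now: int, ttl_seconds: int = 900) -> str:
    """Return a URL that authorizes `method` on `key` until now + ttl_seconds."""
    expires_at = now + ttl_seconds
    query = urlencode({"expires": expires_at,
                       "sig": _signature(method, key, expires_at)})
    return f"https://files.example.com/{key}?{query}"

def verify_url(method: str, url: str, now: int) -> bool:
    """Check the signature and expiry the way the storage side would."""
    parsed = urlparse(url)
    params = parse_qs(parsed.query)
    try:
        expires_at = int(params["expires"][0])
        sig = params["sig"][0]
    except (KeyError, ValueError):
        return False
    key = parsed.path.lstrip("/")
    expected = _signature(method, key, expires_at)
    return now <= expires_at and hmac.compare_digest(sig.encode(), expected.encode())

url = sign_url("PUT", "uploads/3f2a9c", now=1_700_000_000)
print(verify_url("PUT", url, now=1_700_000_100))   # within the TTL
print(verify_url("GET", url, now=1_700_000_100))   # wrong method
print(verify_url("PUT", url, now=1_700_001_000))   # after expiry
```

Because the method is part of the signed message, an upload URL cannot be reused to download, and the short expiry limits the damage if a URL leaks.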
Database design
folders table
_id
parent_folder_id
user_id
name
metadata
primary key _id
items table
_id
parent_folder_id
user_id
name
other metadata
unique index on (parent_folder_id, name)
primary key _id
users table
_id
auth0_user_id
...
Database used for object metadata. Actual files stored in Cloud Object Store (something like S3).
Common queries
- List folder and files in a folder
- Get a file by id
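The tables and the two common queries can be sketched with SQLite from Python's standard library. This is an illustration only: the column types, the `s3_key` column (holding the `uploaded_s3_key` from CreateFile), the NULL parent for a root folder, and the helper names `list_folder` and `get_file` are assumptions.

```python
import sqlite3

SCHEMA = """
CREATE TABLE users (
    _id            INTEGER PRIMARY KEY,
    auth0_user_id  TEXT NOT NULL UNIQUE
);
CREATE TABLE folders (
    _id              INTEGER PRIMARY KEY,
    parent_folder_id INTEGER REFERENCES folders(_id),  -- NULL for a root folder (assumption)
    user_id          INTEGER NOT NULL REFERENCES users(_id),
    name             TEXT NOT NULL
);
CREATE TABLE items (
    _id              INTEGER PRIMARY KEY,
    parent_folder_id INTEGER NOT NULL REFERENCES folders(_id),
    user_id          INTEGER NOT NULL REFERENCES users(_id),
    name             TEXT NOT NULL,
    s3_key           TEXT NOT NULL   -- assumed column for the uploaded object key
);
CREATE UNIQUE INDEX items_parent_name ON items (parent_folder_id, name);
"""

def list_folder(db, user_id, folder_id):
    """Return (subfolders, files) directly inside one folder owned by user_id."""
    folders = db.execute(
        "SELECT _id, name FROM folders "
        "WHERE parent_folder_id = ? AND user_id = ? ORDER BY name",
        (folder_id, user_id)).fetchall()
    files = db.execute(
        "SELECT _id, name FROM items "
        "WHERE parent_folder_id = ? AND user_id = ? ORDER BY name",
        (folder_id, user_id)).fetchall()
    return folders, files

def get_file(db, user_id, file_id):
    """Return the file row, or None if it is missing or owned by someone else."""
    return db.execute(
        "SELECT _id, name, s3_key FROM items WHERE _id = ? AND user_id = ?",
        (file_id, user_id)).fetchone()

db = sqlite3.connect(":memory:")
db.executescript(SCHEMA)
db.execute("INSERT INTO users (_id, auth0_user_id) VALUES (1, 'auth0|alice'), (2, 'auth0|bob')")
db.execute("INSERT INTO folders (_id, parent_folder_id, user_id, name) VALUES (1, NULL, 1, 'root')")
db.execute("INSERT INTO folders (_id, parent_folder_id, user_id, name) VALUES (2, 1, 1, 'photos')")
db.execute("INSERT INTO items (_id, parent_folder_id, user_id, name, s3_key) "
           "VALUES (10, 1, 1, 'notes.txt', 'k-10')")

print(list_folder(db, user_id=1, folder_id=1))
print(get_file(db, user_id=1, file_id=10))
print(get_file(db, user_id=2, file_id=10))   # not the owner
```

Filtering every query by `user_id` is one simple way to enforce the "only the owning user can view and download" requirement at the data layer.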
High-level design
Use an authN/authZ service (like auth0). On successful client token creation, create a record in our own users table if one doesn't exist.
- User clicks login in frontend
- Redirect to auth0 for authN
- User goes through auth flow and gets back some authorization token that can be used to authorize API requests
- Our API endpoints accept the authorization token, validate it, and can load user information from DB or auth0
Load balancer for horizontal scaling and zero downtime deploys
- In multiple AZs
- Single region
(potentially separate service) API gateway to validate authorization token.
- Separate if using services under the hood (e.g. Stripe has different services per API endpoint)
- Also useful for multi-region routing
Application server
- business logic
- Single region, but in multiple AZs
Database
- Sharded (shard by parent_folder_id; could also use user_id, but that is more at risk of a hot shard)
- Each shard has its own replica set for availability
  - One primary, 2 or more secondaries
  - Secondaries can also offload read load -- if we are ok with eventually consistent reads
  - Writes go to the primary (for consistency, esp. w/ network partitions, can require acknowledgement from a majority -- sacrifices latency)
  - Reads can be from a single secondary (if ok with eventually consistent) or from a majority to be strongly consistent
  - Probably ok to be eventually consistent -- may have slight UI quirks if the user refreshes the page immediately after creating/deleting a file and before replication to the secondary occurs
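Routing a request to a shard by `parent_folder_id` can be sketched as a stable hash modulo the shard count. `NUM_SHARDS` and `shard_for` are hypothetical, and plain modulo hashing is an assumption made for brevity: it forces most keys to move when the shard count changes, which consistent hashing or a lookup directory would avoid.

```python
import hashlib

NUM_SHARDS = 16   # hypothetical shard count

def shard_for(parent_folder_id: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a folder id to a shard index with a hash that is stable across processes.

    Python's built-in hash() is randomized per process for strings, so a
    cryptographic digest is used instead.
    """
    digest = hashlib.sha256(parent_folder_id.encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_shards

# Everything with the same parent folder lands on the same shard, so listing a
# folder and enforcing the (parent_folder_id, name) unique index stay single-shard.
print(shard_for("folder-123"))
print(shard_for("folder-123") == shard_for("folder-123"))
```

A single user's folders spread across shards under this scheme, which is what reduces the hot-shard risk compared with sharding by user_id.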
Cloud object store (e.g. S3) for cheap storage
- If our servers don't need to process the files (maybe virus scanning?), then can have browser directly upload to S3 to avoid costs on our servers (network and storage).
No CDN for file uploads
But can use CDN for webapp resources
Request flows
CreateFile
- Webapp makes API request to requestUpload
- Load balancer sends it to an available application server
- Authorize API request with auth token
- Application server will
  - generate a unique key for S3 -- can store all files in a flat bucket
  - create a signed S3 URL that the frontend can upload to
- Webapp starts multipart upload to S3 with the signed URL and manages upload status / retries
- Once upload completes, make API request to createFile
- App server will
  - create new record in DB
  - if there is a clash in file name, rename by appending a suffix and retrying
- Return new file metadata
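The rename-on-clash step can lean on the unique index on (parent_folder_id, name): attempt the insert, and on a uniqueness violation retry with a numbered suffix. A minimal sketch using SQLite, where `create_file`, the `s3_key` column, the ` (n)` suffix format, and the retry limit are all assumptions:

```python
import os
import sqlite3

def create_file(db, parent_folder_id, user_id, name, s3_key, max_attempts=100):
    """Insert a file row, appending ' (n)' before the extension on name clashes."""
    stem, ext = os.path.splitext(name)
    for attempt in range(max_attempts):
        candidate = name if attempt == 0 else f"{stem} ({attempt}){ext}"
        try:
            with db:  # commits on success, rolls back on error
                cur = db.execute(
                    "INSERT INTO items (parent_folder_id, user_id, name, s3_key) "
                    "VALUES (?, ?, ?, ?)",
                    (parent_folder_id, user_id, candidate, s3_key))
            return {"_id": cur.lastrowid, "name": candidate}
        except sqlite3.IntegrityError:
            continue  # this name is already taken; try the next suffix
    raise RuntimeError(f"could not find a free name for {name!r}")

db = sqlite3.connect(":memory:")
db.execute("""CREATE TABLE items (
    _id INTEGER PRIMARY KEY, parent_folder_id INTEGER NOT NULL,
    user_id INTEGER NOT NULL, name TEXT NOT NULL, s3_key TEXT NOT NULL)""")
db.execute("CREATE UNIQUE INDEX items_parent_name ON items (parent_folder_id, name)")

first = create_file(db, 1, 1, "report.pdf", "key-a")
second = create_file(db, 1, 1, "report.pdf", "key-b")
other_folder = create_file(db, 2, 1, "report.pdf", "key-c")
print(first["name"], second["name"], other_folder["name"])
```

Letting the database reject the duplicate, rather than checking for the name first and then inserting, is what keeps this correct when two uploads with the same name finish at the same time.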
What happens if createFile fails?
- Orphaned S3 item.
- Create a background job that occasionally scans through the S3 bucket and deletes orphaned items (not in DB)
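The cleanup job reduces to a set difference between the keys in the bucket and the keys referenced by the DB. The sketch below works on in-memory data; `find_orphans` and the 24-hour grace period are assumptions. The grace period matters because an object whose upload just finished, but whose createFile call has not yet landed, would otherwise look orphaned.

```python
from datetime import datetime, timedelta, timezone

GRACE_PERIOD = timedelta(hours=24)   # assumed; should exceed the longest expected upload

def find_orphans(bucket_objects, db_keys, now, grace_period=GRACE_PERIOD):
    """Return keys in the bucket that are not in the DB and are past the grace period.

    bucket_objects: iterable of (key, last_modified) pairs, as a bucket listing would yield.
    db_keys: set of object keys referenced by rows in the items table.
    """
    cutoff = now - grace_period
    return sorted(key for key, last_modified in bucket_objects
                  if key not in db_keys and last_modified < cutoff)

now = datetime(2024, 1, 10, tzinfo=timezone.utc)
bucket = [
    ("k1", now - timedelta(days=5)),     # referenced by the DB
    ("k2", now - timedelta(days=5)),     # old and unreferenced: orphan
    ("k3", now - timedelta(minutes=5)),  # unreferenced but recent: may be in flight
]
print(find_orphans(bucket, db_keys={"k1"}, now=now))
```

At the scale estimated earlier, the job would page through the bucket listing and look keys up in batches rather than hold either set fully in memory.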