System requirements
Functional:
- Users can upload and download files.
- Users can modify file content.
- Users can share files with permissions (read-only, write-only, etc.).
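One possible way to model the sharing permissions is a set of flags per (file, user) pair. The sketch below is only an illustration: the `Permission` flags, the `ShareTable` class, and its in-memory dictionary are hypothetical names, not part of the design above.

```python
from enum import Flag, auto


class Permission(Flag):
    """Hypothetical permission flags; READ | WRITE would mean read-write."""
    READ = auto()
    WRITE = auto()


class ShareTable:
    """In-memory stand-in for a table of sharing grants (assumption)."""

    def __init__(self):
        # (file_id, user_id) -> Permission
        self._grants = {}

    def share(self, file_id, user_id, permission):
        """Grant (or overwrite) a user's permission on a file."""
        self._grants[(file_id, user_id)] = permission

    def allows(self, file_id, user_id, needed):
        """Return True if every flag in `needed` has been granted."""
        granted = self._grants.get((file_id, user_id), Permission(0))
        return (granted & needed) == needed


shares = ShareTable()
shares.share(file_id=1, user_id=10, permission=Permission.READ)
shares.share(file_id=1, user_id=11, permission=Permission.READ | Permission.WRITE)
```

Here user 10 holds a read-only grant and user 11 a read-write grant; a user with no grant is denied by default.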
Non-Functional:
- Availability.
- Durability.
- Scalability.
- Consistency.
- Security.
- Low latency.
Capacity estimation
Estimate the scale of the system you are going to design...
API design
Define what APIs are expected from the system...
Database design
Database tables:
Table users {
user_id <- key
}
Table files {
file_id <- key
user_id <- reference to users
name
type
size
version
created_at
updated_at
}
Table file_version {
version_id <- key
version
file_id <- reference to files
blob_path
status
checksum
uploaded_at
user_id <- reference to users
}
Table chunks {
chunk_id <- key
version_id <- reference to file_version
number
blob_path
created_at
size
}
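The tables above can be sketched as executable DDL. The design names Postgres for metadata; the example below uses Python's built-in `sqlite3` module only so that it runs without a server. The column types, the `NOT NULL` constraints, and the `UNIQUE (version_id, number)` constraint are assumptions, since the table definitions do not specify them.

```python
import sqlite3

# Column names follow the tables above; types and constraints are assumptions.
SCHEMA = """
CREATE TABLE users (
    user_id INTEGER PRIMARY KEY
);
CREATE TABLE files (
    file_id    INTEGER PRIMARY KEY,
    user_id    INTEGER NOT NULL REFERENCES users(user_id),
    name       TEXT NOT NULL,
    type       TEXT,
    size       INTEGER,
    version    INTEGER,
    created_at TEXT,
    updated_at TEXT
);
CREATE TABLE file_version (
    version_id  INTEGER PRIMARY KEY,
    version     INTEGER NOT NULL,
    file_id     INTEGER NOT NULL REFERENCES files(file_id),
    blob_path   TEXT,
    status      TEXT,
    checksum    TEXT,
    uploaded_at TEXT,
    user_id     INTEGER NOT NULL REFERENCES users(user_id)
);
CREATE TABLE chunks (
    chunk_id   INTEGER PRIMARY KEY,
    version_id INTEGER NOT NULL REFERENCES file_version(version_id),
    number     INTEGER NOT NULL,
    blob_path  TEXT,
    created_at TEXT,
    size       INTEGER,
    UNIQUE (version_id, number)  -- assumption: one chunk per position
);
"""


def create_schema(conn):
    """Create the four metadata tables with foreign keys enforced."""
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)


conn = sqlite3.connect(":memory:")
create_schema(conn)
```

With foreign keys enabled, inserting a chunk whose `version_id` does not exist is rejected, which keeps orphaned chunk rows out of the metadata store.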
High-level design
- CDN - caches and accelerates delivery of static assets and downloads.
- API Gateway - load balancer, router, auth, rate limiter and SSL termination.
- File uploader - writes file metadata to the Metadata DB and returns the chunk uploader URL to the client.
- Chunk uploader - uploads data to the blob store.
- Blob store - stores user content (AWS S3).
- Metadata DB - stores file metadata (Postgres).
- Sync service - synchronizes data between the server and the client devices.
- Async jobs - notify all devices of changes.
Request flows
Entry Point
- Client reaches the system through the CDN and the API Gateway.
- CDN (Content Delivery Network) is used for caching and accelerating delivery of static assets or downloads.
- API Gateway serves as the main entry point, routing requests to appropriate services.
Upload Flow
- Client initiates an upload request via the API Gateway.
- API Gateway routes the request to the File Upload Service.
- File Upload Service:
  - Registers the upload and generates an uploadID.
  - Calculates chunking strategy based on file size.
  - Returns chunk upload URLs (or endpoint) to the client.
- Client uploads file chunks directly to the Chunk Uploader.
- Chunk Uploader:
  - Stores chunks in the Blob Store.
  - Records upload metadata (e.g., file name, size, parts) in the Metadata DB.
- Once upload is complete:
  - Metadata and event notifications are sent to the Sync Service and/or published to the Message Queue.
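The registration step of the upload flow (generate an upload ID, pick a chunking strategy, return chunk URLs) can be sketched as follows. This is a minimal sketch under stated assumptions: the fixed 4 MiB chunk size, the `plan_chunks` and `register_upload` helpers, and the `/uploads/{upload_id}/chunks/{number}` URL layout are all hypothetical and not defined by the design above.

```python
import uuid
from dataclasses import dataclass

CHUNK_SIZE = 4 * 1024 * 1024  # assumption: fixed 4 MiB chunks


@dataclass(frozen=True)
class ChunkPlan:
    number: int  # position of the chunk within the file, starting at 0
    offset: int  # byte offset where the chunk starts
    size: int    # chunk length in bytes (the last chunk may be shorter)


def plan_chunks(file_size, chunk_size=CHUNK_SIZE):
    """Split a file of `file_size` bytes into fixed-size chunk ranges."""
    if file_size < 0 or chunk_size <= 0:
        raise ValueError("file_size must be >= 0 and chunk_size must be > 0")
    return [
        ChunkPlan(number=i, offset=off, size=min(chunk_size, file_size - off))
        for i, off in enumerate(range(0, file_size, chunk_size))
    ]


def register_upload(file_name, file_size, chunk_size=CHUNK_SIZE):
    """Generate an upload ID and one (hypothetical) upload URL per chunk."""
    upload_id = uuid.uuid4().hex
    return {
        "upload_id": upload_id,
        "file_name": file_name,
        "chunks": [
            {
                "number": c.number,
                "offset": c.offset,
                "size": c.size,
                "url": f"/uploads/{upload_id}/chunks/{c.number}",
            }
            for c in plan_chunks(file_size, chunk_size)
        ],
    }
```

For example, `plan_chunks(10, chunk_size=4)` yields three chunks of 4, 4, and 2 bytes. A real service might instead vary the chunk size with the file size or return pre-signed blob store URLs.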
Download Flow
- Client sends a download request via the API Gateway.
- API Gateway routes the request to the File Download Service.
- File Download Service:
  - Fetches file metadata from the Metadata DB.
  - Coordinates with Chunk Downloader to retrieve individual file chunks from the Blob Store.
- Chunks are reassembled and streamed back to the client.
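The reassembly step can be sketched as ordering the chunks by their number and verifying the result against the stored checksum. The `reassemble` helper is hypothetical, and SHA-256 is an assumption, since the design does not say which checksum algorithm the `checksum` column holds. For brevity the sketch joins the chunks in memory; a real service would more likely stream them.

```python
import hashlib


def reassemble(chunks, expected_sha256):
    """Join (number, data) chunks in order and verify the checksum.

    `chunks` may arrive in any order; numbers are expected to be 0..n-1.
    """
    ordered = sorted(chunks, key=lambda c: c[0])
    numbers = [number for number, _ in ordered]
    if numbers != list(range(len(ordered))):
        raise ValueError("missing or duplicate chunk")

    digest = hashlib.sha256()
    parts = []
    for _, data in ordered:
        digest.update(data)
        parts.append(data)

    if digest.hexdigest() != expected_sha256:
        raise ValueError("checksum mismatch")
    return b"".join(parts)


# Usage: chunks fetched out of order are still reassembled correctly.
content = b"hello world"
checksum = hashlib.sha256(content).hexdigest()
restored = reassemble([(2, b"rld"), (0, b"hell"), (1, b"o wo")], checksum)
```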
Syncing & Background Processing
- Sync Service maintains consistency across devices, triggering updates or syncing actions as needed when files change.
- Async Jobs perform background processing such as:
  - Virus scanning
  - Preview/thumbnail generation
  - File indexing or OCR
- These jobs are triggered via events published to the Message Queue, ensuring non-blocking and scalable task execution.
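The event-driven triggering described above can be illustrated with an in-process queue standing in for the Message Queue. This is a sketch only: the `EventBus` class, the `file.uploaded` event name, and the two handlers are hypothetical, and a production system would use a real broker with separate worker processes.

```python
import queue
from collections import defaultdict


class EventBus:
    """In-process stand-in for a message queue (assumption for illustration)."""

    def __init__(self):
        self._queue = queue.Queue()
        self._handlers = defaultdict(list)

    def subscribe(self, event_type, handler):
        """Register a background job for an event type."""
        self._handlers[event_type].append(handler)

    def publish(self, event_type, payload):
        """Enqueue an event and return immediately (non-blocking for the caller)."""
        self._queue.put((event_type, payload))

    def drain(self):
        """Worker side: process all queued events, return how many were handled."""
        processed = 0
        while True:
            try:
                event_type, payload = self._queue.get_nowait()
            except queue.Empty:
                return processed
            for handler in self._handlers[event_type]:
                handler(payload)
            processed += 1


log = []
bus = EventBus()
bus.subscribe("file.uploaded", lambda e: log.append(("virus_scan", e["file_id"])))
bus.subscribe("file.uploaded", lambda e: log.append(("thumbnail", e["file_id"])))

# The upload path only publishes; no job has run yet at this point.
bus.publish("file.uploaded", {"file_id": 42})
```

The jobs run only when a worker calls `bus.drain()`, which is what keeps the upload request itself from waiting on scanning or thumbnail generation.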
Detailed component design
Dig deeper into 2-3 components and explain in detail how they work. For example, how well does each component scale? Any relevant algorithm or data structure you like to use for a component? Also you could draw a diagram using the diagramming tool to enhance your design...
Trade offs/Tech choices
Explain any trade offs you have made and why you made certain tech choices...
Failure scenarios/bottlenecks
Try to discuss as many failure scenarios/bottlenecks as possible.
Future improvements
What are some future improvements you would make? How would you mitigate the failure scenario(s) you described above?