System requirements


Functional:

  • Users can upload and download files.
  • Users can modify file content.
  • Users can share files with permissions (read-only, read-write, etc.).
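One way the sharing requirement could be modelled is a per-user grant that is checked before every read or write. The sketch below is illustrative only: the Permission flags, the in-memory shares table, and the is_allowed helper are all hypothetical names, and a real system would keep grants in the metadata database.

```python
from enum import Flag, auto


class Permission(Flag):
    """Hypothetical permission bits for a shared file."""
    READ = auto()
    WRITE = auto()


# Hypothetical share table: (file_id, user_id) -> granted permissions.
shares = {
    ("file-1", "alice"): Permission.READ | Permission.WRITE,
    ("file-1", "bob"): Permission.READ,
}


def is_allowed(file_id: str, user_id: str, needed: Permission) -> bool:
    """Return True if the user holds every permission bit in `needed`."""
    granted = shares.get((file_id, user_id), Permission(0))
    return (granted & needed) == needed
```

Here bob can read file-1 but cannot write to it, and a user with no entry in the table is denied everything.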



Non-Functional:

  • Availability.
  • Durability.
  • Scalability.
  • Consistency.
  • Security.
  • Low latency.




Capacity estimation

Estimate the scale of the system you are going to design...






API design

Define what APIs are expected from the system...






Database design

Defining the system data model early on will clarify how data will flow among different components of the system. Also you could draw an ER diagram using the diagramming tool to enhance your design...






High-level design


  • CDN - caches and accelerates delivery of static assets and downloads.
  • API Gateway - load balancing, routing, authentication, rate limiting, and SSL/TLS termination.
  • File uploader - writes file metadata to the Metadata DB and returns the chunk uploader URL to the client.
  • Chunk uploader - uploads file chunks to the blob store.
  • Blob store - stores user content (AWS S3).
  • Metadata DB - stores file metadata (Postgres).
  • Sync service - keeps data on the server and on client devices in sync.
  • Async job - notifies all devices of changes.
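The rate limiter listed under the API Gateway could be implemented in several ways. A token bucket is one common choice, and the sketch below assumes it. The class name, the capacity and refill parameters, and the caller-supplied clock are all assumptions made for illustration, not part of the design above.

```python
class TokenBucket:
    """Token-bucket limiter; time is passed in so behaviour is deterministic."""

    def __init__(self, capacity: float, refill_per_sec: float, now: float = 0.0):
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.tokens = capacity  # start with a full bucket
        self.last = now

    def allow(self, now: float, cost: float = 1.0) -> bool:
        """Refill for the elapsed time, then spend `cost` tokens if available."""
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.last = max(self.last, now)
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

A gateway would typically keep one bucket per user or API key and reject requests (for example with HTTP 429) when allow returns False.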




Request flows

Entry Point

  • Client interacts with the system.
  • CDN (Content Delivery Network) is used for caching and accelerating delivery of static assets or downloads.
  • API Gateway serves as the main entry point, routing requests to appropriate services.


Upload Flow

  1. Client initiates an upload request via the API Gateway.
  2. API Gateway routes the request to the File Upload Service.
  3. File Upload Service:
    • Registers the upload and generates an uploadID.
    • Calculates chunking strategy based on file size.
    • Returns chunk upload URLs (or endpoint) to the client.
  4. Client uploads file chunks directly to the Chunk Uploader.
  5. Chunk Uploader:
    • Stores chunks in the Blob Store.
    • Records upload metadata (e.g., file name, size, parts) in the Metadata DB.
  6. Once upload is complete:
    • Metadata and event notifications are sent to the Sync Service and/or published to the Message Queue.
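Step 3 of the upload flow can be sketched as a single function that registers the upload, picks a chunk size from the file size, and returns one URL per chunk. The size thresholds, the URL layout, the upload.example.com host, and the UploadPlan type are assumptions for illustration. A real deployment might instead hand out presigned blob store URLs.

```python
import math
import uuid
from dataclasses import dataclass

MIB = 1024 * 1024


@dataclass
class UploadPlan:
    upload_id: str
    chunk_size: int
    chunk_urls: list[str]


def choose_chunk_size(file_size: int) -> int:
    """Pick a chunk size from the file size (thresholds are assumptions)."""
    if file_size <= 64 * MIB:
        return 4 * MIB
    if file_size <= 1024 * MIB:
        return 16 * MIB
    return 64 * MIB


def register_upload(file_size: int,
                    base_url: str = "https://upload.example.com") -> UploadPlan:
    """Generate an upload ID and one (hypothetical) URL per chunk."""
    upload_id = uuid.uuid4().hex
    chunk_size = choose_chunk_size(file_size)
    # An empty file still gets one chunk so the upload can be completed.
    count = max(1, math.ceil(file_size / chunk_size))
    urls = [f"{base_url}/uploads/{upload_id}/chunks/{i}" for i in range(count)]
    return UploadPlan(upload_id, chunk_size, urls)
```

For a 10 MiB file this yields a 4 MiB chunk size and three chunk URLs, which the client would then use in step 4.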


Download Flow

  1. Client sends a download request via the API Gateway.
  2. API Gateway routes the request to the File Download Service.
  3. File Download Service:
    • Fetches file metadata from the Metadata DB.
    • Coordinates with Chunk Downloader to retrieve individual file chunks from the Blob Store.
  4. Chunks are reassembled and streamed back to the client.
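The download steps above can be sketched with in-memory stand-ins for the Blob Store and Metadata DB. The dictionaries, the key format, and the per-chunk SHA-256 check are assumptions added for illustration. The flow itself only requires that chunks are fetched in the order recorded in the metadata and streamed back.

```python
import hashlib
from typing import Iterator

# In-memory stand-ins (assumptions) for the Blob Store and the Metadata DB.
blob_store: dict[str, bytes] = {}
metadata_db: dict[str, list[dict]] = {}


def put_file(file_id: str, data: bytes, chunk_size: int) -> None:
    """Split data into chunks, store them, and record their order and hashes."""
    parts = []
    for index, start in enumerate(range(0, len(data), chunk_size)):
        chunk = data[start:start + chunk_size]
        key = f"{file_id}/{index}"
        blob_store[key] = chunk
        parts.append({"key": key, "sha256": hashlib.sha256(chunk).hexdigest()})
    metadata_db[file_id] = parts


def stream_file(file_id: str) -> Iterator[bytes]:
    """Yield chunks in metadata order, verifying each one before streaming it."""
    for part in metadata_db[file_id]:
        chunk = blob_store[part["key"]]
        if hashlib.sha256(chunk).hexdigest() != part["sha256"]:
            raise ValueError(f"corrupt chunk: {part['key']}")
        yield chunk
```

Streaming chunk by chunk, rather than reassembling the whole file first, keeps the download service's memory use bounded by the chunk size.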


Syncing & Background Processing

  • Sync Service maintains consistency across devices, triggering updates or syncing actions as needed when files change.
  • Async Jobs perform background processing such as:
    • Virus scanning
    • Preview/thumbnail generation
    • File indexing or OCR
  • These jobs are triggered via events published to the Message Queue, ensuring non-blocking and scalable task execution.
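The event-driven triggering described above can be sketched with a small publish/subscribe loop. Python's queue.Queue stands in for the Message Queue here, and the event name file.uploaded, the subscribe/publish/drain helpers, and the payload shape are hypothetical. A production system would use a durable broker and separate worker processes.

```python
import queue
from collections import defaultdict
from typing import Callable

# In-process stand-in (assumption) for the Message Queue.
event_queue: "queue.Queue[dict]" = queue.Queue()
handlers: "dict[str, list[Callable[[dict], None]]]" = defaultdict(list)


def subscribe(event_type: str, handler: Callable[[dict], None]) -> None:
    """Register a background job to run for a given event type."""
    handlers[event_type].append(handler)


def publish(event_type: str, payload: dict) -> None:
    """Enqueue an event and return immediately, so the caller never blocks."""
    event_queue.put({"type": event_type, "payload": payload})


def drain() -> int:
    """Worker loop body: process queued events and return how many were handled."""
    processed = 0
    while True:
        try:
            event = event_queue.get_nowait()
        except queue.Empty:
            return processed
        for handler in handlers[event["type"]]:
            handler(event["payload"])
        processed += 1
```

The upload path would only call publish. Virus scanning, thumbnail generation, and indexing would each be registered with subscribe and run when a worker calls drain.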







Detailed component design

Dig deeper into 2-3 components and explain in detail how they work. For example, how well does each component scale? Any relevant algorithm or data structure you like to use for a component? Also you could draw a diagram using the diagramming tool to enhance your design...






Trade-offs/Tech choices

Explain any trade-offs you have made and why you made certain tech choices...






Failure scenarios/bottlenecks

Try to discuss as many failure scenarios/bottlenecks as possible.






Future improvements

What are some future improvements you would make? How would you mitigate the failure scenario(s) you described above?