System requirements


Functional:

  • Users can upload and download files.
  • Users can modify file content.
  • Users can share files with permissions (read-only, write-only, etc.).



Non-Functional:

  • Availability.
  • Durability.
  • Scalability.
  • Consistency.
  • Security.
  • Low latency.




Capacity estimation


Blob Storage (File Content)

  • Average file size: 10 MB
  • Uploads per user per day: 10
  • Active users: 100,000
  • Daily upload volume: 100,000 users × 10 uploads × 10 MB = 10 TB/day
  • Annual storage needed: 10 TB/day × 365 days ≈ 3.65 PB/year
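
The blob storage figures above can be reproduced with a few lines of arithmetic. This is only a sketch of the estimate: the constant names are illustrative, and decimal units (1 TB = 10^12 bytes) are assumed.

```python
# Inputs taken from the estimates above. Decimal units are an assumption:
# 1 MB = 10**6 bytes, 1 TB = 10**12 bytes, 1 PB = 10**15 bytes.
AVG_FILE_SIZE_BYTES = 10 * 10**6
UPLOADS_PER_USER_PER_DAY = 10
ACTIVE_USERS = 100_000

daily_uploads = UPLOADS_PER_USER_PER_DAY * ACTIVE_USERS
daily_bytes = daily_uploads * AVG_FILE_SIZE_BYTES
annual_bytes = daily_bytes * 365

print(f"uploads/day:   {daily_uploads:,}")
print(f"daily volume:  {daily_bytes / 10**12:.2f} TB")
print(f"annual volume: {annual_bytes / 10**15:.2f} PB")
```

Changing any of the three inputs shows how sensitive the yearly total is to the average file size, which is the least certain of the assumptions.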

Metadata Database

a. Files table:

  • Rows/year: ~365 million (1 million uploads/day × 365 days)
  • Row size: ~300 bytes
  • Total: ~110 GB/year

b. File versions table:

  • Rows/year: ~550 million (365 million files × 1.5 versions per file, assumed)
  • Row size: ~400 bytes
  • Total: ~220 GB/year

c. Other tables

  • Estimated size: ~100–150 GB
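
As a rough cross-check of the metadata numbers above, the sketch below recomputes the per-table sizes and adds them up. The row sizes and the 1.5 versions-per-file ratio come from the estimates above; the variable names and the assumption that the "other tables" figure is also per year are illustrative only.

```python
# One million uploads per day follows from 100,000 users x 10 uploads.
UPLOADS_PER_DAY = 1_000_000
GB = 10**9  # decimal gigabyte, assumed

file_rows = UPLOADS_PER_DAY * 365
file_bytes = file_rows * 300            # ~300 bytes per files row

version_rows = int(file_rows * 1.5)     # assumed 1.5 versions per file
version_bytes = version_rows * 400      # ~400 bytes per version row

# "Other tables" is given above as a 100-150 GB range (assumed per year).
other_low, other_high = 100 * GB, 150 * GB

total_low = file_bytes + version_bytes + other_low
total_high = file_bytes + version_bytes + other_high

print(f"files:    {file_rows:,} rows, {file_bytes / GB:.1f} GB")
print(f"versions: {version_rows:,} rows, {version_bytes / GB:.1f} GB")
print(f"total:    {total_low / GB:.1f} to {total_high / GB:.1f} GB")
```

Under these assumptions the metadata grows by well under 1 TB per year, several orders of magnitude less than the blob storage.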






API design

File APIs:

POST /files – Create a new file upload (get upload URL)

PUT /uploads/{upload_id}/chunks/{part_number} – Upload a chunk

POST /uploads/{upload_id}/complete – Finalize chunked upload

GET /files – List user’s files

GET /files/{file_id}/download – Download a file

DELETE /files/{file_id} – Delete a file


Versioning:

GET /files/{file_id}/versions – List versions of a file

GET /files/{file_id}/versions/{version_id} – Get a specific version


Sync & Devices:

POST /devices – Register a new device

GET /sync – Get list of files needing sync

POST /sync/ack – Acknowledge synced files


Jobs:

GET /jobs/{job_id} – Get job (e.g. virus scan) status








Database design


Database tables:

Table users {
  user_id <- key
  email
}


Table files {
  file_id <- key
  user_id <- reference to users
  name
  type
  size
  version
  created_at
  updated_at
}


Table file_version {
  version_id <- key
  version
  file_id <- reference to files
  blob_path
  status
  checksum
  uploaded_at
  user_id <- reference to users
}


Table chunks {
  chunk_id <- key
  version_id <- reference to file_version
  number
  blob_path
  created_at
  size
}
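
To make the table definitions concrete, here is one possible translation into SQL, run against an in-memory SQLite database so the example is self-contained. The design names Postgres for metadata; SQLite is swapped in purely for illustration. The column types, defaults, NOT NULL rules, and UNIQUE constraints are assumptions, since the tables above list only column names and references.

```python
import sqlite3

# Column types and constraints are assumptions; only the names and
# references come from the table definitions above.
SCHEMA = """
CREATE TABLE users (
    user_id INTEGER PRIMARY KEY,
    email   TEXT NOT NULL UNIQUE
);
CREATE TABLE files (
    file_id    INTEGER PRIMARY KEY,
    user_id    INTEGER NOT NULL REFERENCES users(user_id),
    name       TEXT NOT NULL,
    type       TEXT,
    size       INTEGER,
    version    INTEGER NOT NULL DEFAULT 1,
    created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
    updated_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE file_version (
    version_id  INTEGER PRIMARY KEY,
    version     INTEGER NOT NULL,
    file_id     INTEGER NOT NULL REFERENCES files(file_id),
    blob_path   TEXT,
    status      TEXT NOT NULL DEFAULT 'pending',
    checksum    TEXT,
    uploaded_at TEXT,
    user_id     INTEGER NOT NULL REFERENCES users(user_id),
    UNIQUE (file_id, version)
);
CREATE TABLE chunks (
    chunk_id   INTEGER PRIMARY KEY,
    version_id INTEGER NOT NULL REFERENCES file_version(version_id),
    number     INTEGER NOT NULL,
    blob_path  TEXT NOT NULL,
    created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
    size       INTEGER NOT NULL,
    UNIQUE (version_id, number)
);
"""


def create_schema(conn):
    """Create the four metadata tables with foreign keys enforced."""
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves this off by default
    conn.executescript(SCHEMA)


conn = sqlite3.connect(":memory:")
create_schema(conn)
tables = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
]
print(tables)
```

The UNIQUE (version_id, number) constraint is one way to make chunk uploads idempotent: retrying the same part number cannot create a second row.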







High-level design


  • CDN
  • API Gateway - load balancer, router, auth, rate limiter, and SSL termination.
  • File uploader - writes file metadata to the Metadata DB and returns the chunk uploader URL to the client.
  • Chunk uploader - writes file chunks to the blob store.
  • Blob store - stores user content (AWS S3).
  • Metadata DB - stores file metadata (Postgres).
  • Sync service - keeps data on the server and on client devices in sync.
  • Async jobs - run background tasks and notify all devices of changes.




Request flows

Entry Point

  • Clients reach the system through the CDN and the API Gateway.
  • CDN (Content Delivery Network) is used for caching and accelerating delivery of static assets or downloads.
  • API Gateway serves as the main entry point, routing requests to appropriate services.


Upload Flow

  1. Client initiates an upload request via the API Gateway.
  2. API Gateway routes the request to the File Upload Service.
  3. File Upload Service:
    • Registers the upload and generates an upload_id.
    • Calculates chunking strategy based on file size.
    • Returns chunk upload URLs (or endpoint) to the client.
  4. Client uploads file chunks directly to the Chunk Uploader.
  5. Chunk Uploader:
    • Stores chunks in the Blob Store.
    • Records upload metadata (e.g., file name, size, parts) in the Metadata DB.
  6. Once upload is complete:
    • Metadata and event notifications are sent to the Sync Service and/or published to the Message Queue.
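
The upload steps above can be sketched as a small in-memory simulation. Everything here is a stand-in: `UploadService`, `upload_file`, the 4 MiB chunk size, and the SHA-256 check on completion are assumptions made for illustration, and plain dictionaries replace the Blob Store and Metadata DB.

```python
import hashlib
import uuid

CHUNK_SIZE = 4 * 1024 * 1024  # assumed chunk size; the design does not fix one


class UploadService:
    """In-memory stand-in for the File Upload Service and Chunk Uploader."""

    def __init__(self):
        self.blob_store = {}  # (upload_id, part_number) -> bytes
        self.uploads = {}     # upload_id -> metadata dict

    def create_upload(self, name, size):
        """Register the upload and derive the number of parts from the size."""
        upload_id = str(uuid.uuid4())
        parts = max(1, -(-size // CHUNK_SIZE))  # ceiling division
        self.uploads[upload_id] = {
            "name": name, "size": size, "parts": parts, "status": "uploading",
        }
        return upload_id, parts

    def put_chunk(self, upload_id, part_number, data):
        """Store one chunk; re-sending a part simply overwrites it."""
        meta = self.uploads[upload_id]
        if not 1 <= part_number <= meta["parts"]:
            raise ValueError("part_number out of range")
        self.blob_store[(upload_id, part_number)] = data

    def complete(self, upload_id, expected_sha256):
        """Check that all parts exist and match the client's checksum."""
        meta = self.uploads[upload_id]
        digest = hashlib.sha256()
        total = 0
        for number in range(1, meta["parts"] + 1):
            chunk = self.blob_store.get((upload_id, number))
            if chunk is None:
                raise ValueError(f"missing part {number}")
            digest.update(chunk)
            total += len(chunk)
        if total != meta["size"] or digest.hexdigest() != expected_sha256:
            raise ValueError("size or checksum mismatch")
        meta["status"] = "complete"
        meta["checksum"] = digest.hexdigest()
        return meta


def upload_file(service, name, data):
    """Client side: create the upload, send each chunk, then finalize."""
    upload_id, parts = service.create_upload(name, len(data))
    for number in range(1, parts + 1):
        start = (number - 1) * CHUNK_SIZE
        service.put_chunk(upload_id, number, data[start:start + CHUNK_SIZE])
    return service.complete(upload_id, hashlib.sha256(data).hexdigest())


service = UploadService()
meta = upload_file(service, "video.bin", b"a" * (2 * CHUNK_SIZE + 5))
print(meta["parts"], meta["status"])
```

In a real deployment the completion step would also be the natural place to publish the "file uploaded" event, since that is the first moment the file is known to be whole.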


Download Flow

  1. Client sends a download request via the API Gateway.
  2. API Gateway routes the request to the File Download Service.
  3. File Download Service:
    • Fetches file metadata from the Metadata DB.
    • Coordinates with Chunk Downloader to retrieve individual file chunks from the Blob Store.
  4. Chunks are reassembled and streamed back to the client.
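
A minimal sketch of the reassembly step, assuming chunk rows shaped like the chunks table (number, blob_path) and a dictionary standing in for the Blob Store. The function name `stream_file` and the end-of-stream checksum check are illustrative choices, not part of the design above.

```python
import hashlib


def stream_file(chunk_rows, blob_store, expected_sha256):
    """Yield chunk bytes in part order, then verify the overall checksum.

    Because data is streamed, a mismatch can only be reported after the
    last chunk has been yielded.
    """
    digest = hashlib.sha256()
    for row in sorted(chunk_rows, key=lambda r: r["number"]):
        data = blob_store[row["blob_path"]]
        digest.update(data)
        yield data
    if digest.hexdigest() != expected_sha256:
        raise ValueError("checksum mismatch")


# Hypothetical blob paths; rows arrive out of order on purpose.
blob_store = {"blobs/v1/1": b"hello ", "blobs/v1/2": b"world"}
chunk_rows = [
    {"number": 2, "blob_path": "blobs/v1/2"},
    {"number": 1, "blob_path": "blobs/v1/1"},
]
checksum = hashlib.sha256(b"hello world").hexdigest()

content = b"".join(stream_file(chunk_rows, blob_store, checksum))
print(content)  # b'hello world'
```

Using a generator keeps memory use bounded by one chunk at a time, which matters once files are much larger than the 10 MB average.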


Syncing & Background Processing

  • Sync Service maintains consistency across devices, triggering updates or syncing actions as needed when files change.
  • Async Jobs perform background processing such as:
    • Virus scanning
    • Preview/thumbnail generation
    • File indexing or OCR
  • These jobs are triggered via events published to the Message Queue, ensuring non-blocking and scalable task execution.
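
The event-driven job flow above might look roughly like this, with Python's standard `queue.Queue` standing in for the Message Queue. The job types, the `job_status` dictionary (a possible backing store for GET /jobs/{job_id}), and the placeholder handlers are all hypothetical.

```python
import queue

events = queue.Queue()  # stand-in for Kafka/SQS
job_status = {}         # job_id -> status string

# Placeholder handlers; real ones would call a scanner, renderer, or indexer.
HANDLERS = {
    "virus_scan": lambda file_id: "clean",
    "thumbnail": lambda file_id: f"thumbs/{file_id}.png",
    "index": lambda file_id: "indexed",
}


def publish_file_uploaded(file_id):
    """Enqueue one job per handler so the upload request never blocks."""
    for job_type in HANDLERS:
        job_id = f"{job_type}:{file_id}"
        job_status[job_id] = "queued"
        events.put((job_id, job_type, file_id))


def run_worker():
    """Drain the queue, recording success or failure for each job."""
    while True:
        try:
            job_id, job_type, file_id = events.get_nowait()
        except queue.Empty:
            return
        try:
            result = HANDLERS[job_type](file_id)
            job_status[job_id] = f"done:{result}"
        except Exception as exc:  # one failing job must not stop the worker
            job_status[job_id] = f"failed:{exc}"
        finally:
            events.task_done()


publish_file_uploaded("file-1")
run_worker()
print(job_status["virus_scan:file-1"])  # done:clean
```

Recording a status per job, rather than per file, lets a failed thumbnail be retried without repeating the virus scan.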







Detailed component design

  1. API Gateway – Routes traffic, handles SSL, rate limiting.
  2. Auth Service – Verifies users/devices via JWT.
  3. File Upload Service – Manages file creation, versions, metadata.
  4. Chunk Uploader – Handles actual file uploads to blob storage.
  5. Blob Store (S3) – Stores raw file data (blobs).
  6. Metadata DB – Stores file info, versions, sync state, jobs.
  7. Sync Service – Tracks device sync state, lists deltas.
  8. Message Queue (Kafka/SQS) – Handles async jobs (scan, sync).
  9. Worker Services – Process background tasks like virus scanning.
  10. Device Manager (optional) – Manages user devices.
  11. Notification Service (optional) – Pushes updates to devices.
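
One way the Sync Service could track per-device state and list deltas is a change log with a cursor per device, sketched below. The class, its method names, and the sequence-number scheme are assumptions; the methods loosely mirror POST /devices, GET /sync, and POST /sync/ack.

```python
class SyncService:
    """Tracks an append-only change log and a cursor for each device."""

    def __init__(self):
        self.change_log = []     # list of (seq, file_id, version), seq starts at 1
        self.device_cursor = {}  # device_id -> last acknowledged seq

    def register_device(self, device_id):
        """New devices start at cursor 0, so they see the full history."""
        self.device_cursor.setdefault(device_id, 0)

    def record_change(self, file_id, version):
        """Append a change and return its sequence number."""
        seq = len(self.change_log) + 1
        self.change_log.append((seq, file_id, version))
        return seq

    def pending(self, device_id):
        """Return the latest unacknowledged change per file, oldest first."""
        cursor = self.device_cursor[device_id]
        latest = {}
        # Entry with sequence number s lives at index s - 1, so slicing at
        # `cursor` yields exactly the entries with seq > cursor.
        for seq, file_id, version in self.change_log[cursor:]:
            latest[file_id] = (seq, version)
        ordered = sorted(latest.items(), key=lambda item: item[1][0])
        return [
            {"file_id": file_id, "version": version, "seq": seq}
            for file_id, (seq, version) in ordered
        ]

    def ack(self, device_id, seq):
        """Advance the cursor; stale acknowledgements are ignored."""
        self.device_cursor[device_id] = max(self.device_cursor[device_id], seq)


sync = SyncService()
sync.register_device("laptop")
sync.record_change("a.txt", 1)
sync.record_change("b.txt", 1)
sync.record_change("a.txt", 2)

deltas = sync.pending("laptop")
print(deltas)
sync.ack("laptop", deltas[-1]["seq"])
print(sync.pending("laptop"))  # []
```

Collapsing multiple changes to the same file means a device that was offline for a while downloads only the newest version, not every intermediate one.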






Trade offs/Tech choices

Explain any trade offs you have made and why you made certain tech choices...






Failure scenarios/bottlenecks

Try to discuss as many failure scenarios/bottlenecks as possible.






Future improvements

1. Delta Sync (Block-Level)

  • Sync only the changed parts of large files (e.g., using the rsync algorithm or a binary diff).
  • Greatly reduces upload/download costs.

2. End-to-End Encryption

  • Encrypt files on client side before upload.
  • Improves privacy even from backend access.

3. Content Deduplication

  • Avoid storing duplicate file content using hash-based checks (e.g. SHA-256).
  • Saves storage and bandwidth.

4. Cold Storage Tiering

  • Move old versions to cheaper, slower storage (e.g., S3 Glacier).
  • Reduces cost for long-term retention.

5. Preview & Thumbnail Generation

  • Auto-generate image previews, PDF pages, video snapshots.
  • Improves UX on web/mobile clients.

6. Real-Time Sync with WebSockets

  • Use WebSockets or gRPC streams for instant device updates instead of polling.

7. Multi-Region Sync Support

  • Replicate blob data and metadata across regions for better latency and availability.

8. User & Team Sharing

  • Add sharing permissions, folder collaboration, and public links.

9. Audit Logging

  • Log user activity: uploads, downloads, deletes, syncs.
  • Useful for enterprise use cases.

10. Admin Dashboard & Analytics

  • File stats, usage trends, sync failures, storage consumption.
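
As an illustration of improvement 3 (Content Deduplication), the sketch below stores blobs under their SHA-256 digest and counts references, so identical content is kept only once. The class `DedupBlobStore` and its reference-counting scheme are assumptions for this example, not a description of any existing component.

```python
import hashlib


class DedupBlobStore:
    """Content-addressed store: identical bytes share one stored blob."""

    def __init__(self):
        self.blobs = {}     # sha256 hex digest -> bytes
        self.refcount = {}  # sha256 hex digest -> number of references

    def put(self, data):
        """Store data if unseen and return its content key."""
        key = hashlib.sha256(data).hexdigest()
        if key not in self.blobs:
            self.blobs[key] = data
        self.refcount[key] = self.refcount.get(key, 0) + 1
        return key

    def get(self, key):
        return self.blobs[key]

    def release(self, key):
        """Drop one reference; delete the blob when none remain."""
        self.refcount[key] -= 1
        if self.refcount[key] == 0:
            del self.refcount[key]
            del self.blobs[key]


store = DedupBlobStore()
key_a = store.put(b"same content")
key_b = store.put(b"same content")  # second upload of identical bytes
print(key_a == key_b, len(store.blobs))  # True 1

store.release(key_a)
print(len(store.blobs))  # 1, still referenced by the second upload
store.release(key_b)
print(len(store.blobs))  # 0
```

Reference counting is what makes deletes safe here: removing one user's file must not remove content that another file version still points to.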