System requirements


Functional:

  1. Upload and download files from any device
  2. Share files or folders with other users
  3. Automatic synchronization between devices
  4. Support files up to 1 GB
  5. ACID guarantees for file operations
  6. Offline editing


Non-Functional:

  1. System must be highly available
  2. System must have low latency


Extended requirements: Snapshot file data so that users can go back to any previous version of a file


Capacity estimation

  1. Assume the read-to-write ratio is 100:1
  2. Assume that in 1 month there are 5M new files and 500M reads
  3. Traffic
     • New files: 5M / (30 * 24 * 60) ≈ 116 files a minute
     • Reads: 500M / (30 * 24 * 60) ≈ 11.6K reads a minute
  4. Storage
     • 5M * 1 GB = 5 PB a month (worst case, every file at the 1 GB limit)
  5. Bandwidth
     • Writes: 116 files * 1 GB ≈ 116 GB a minute
     • Reads: 11.6K files * 1 GB ≈ 11.6 TB a minute
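
The estimate can be reproduced with a short script. This is a rough sketch that assumes a 30-day month, decimal units (1 TB = 1,000 GB), and the worst case where every file is the full 1 GB; the variable names are illustrative only.

```python
# Back-of-envelope capacity check.
# Assumptions: 30-day month, decimal units, every file at the 1 GB maximum.
MINUTES_PER_MONTH = 30 * 24 * 60  # 43,200 minutes

new_files_per_month = 5_000_000
reads_per_month = 500_000_000
max_file_size_gb = 1

# Traffic
writes_per_minute = new_files_per_month / MINUTES_PER_MONTH
reads_per_minute = reads_per_month / MINUTES_PER_MONTH

# Storage: 5M GB per month, expressed in PB (1 PB = 1,000,000 GB)
storage_pb_per_month = new_files_per_month * max_file_size_gb / 1_000_000

# Bandwidth
write_gb_per_minute = writes_per_minute * max_file_size_gb
read_tb_per_minute = reads_per_minute * max_file_size_gb / 1_000

print(f"writes/min:   {writes_per_minute:,.0f}")
print(f"reads/min:    {reads_per_minute:,.0f}")
print(f"storage:      {storage_pb_per_month:.1f} PB/month")
print(f"write bw:     {write_gb_per_minute:,.0f} GB/min")
print(f"read bw:      {read_tb_per_minute:.1f} TB/min")
```

Real files will usually be far smaller than 1 GB, so these storage and bandwidth figures are upper bounds rather than expected values.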


API design

  1. createFile
     • Params:
       • userKey: string
       • fileName: string
       • body: file content
     • Returns:
       • 200: fileId, fileUrl
       • 400: Bad request
  2. updateFile
     • Params:
       • fileId: int
       • body: file content
     • Returns:
       • 200: Successful update
       • 400: Bad request
  3. getFile
     • Params:
       • fileId: int
       • versionId: int
     • Returns:
       • 200: fileUrl: string
       • 400: Bad request
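
As an illustration of the createFile contract, here is a minimal sketch of the request and response shapes with basic validation. The class names, the specific validation rules, and the `files.example.com` URL scheme are assumptions made for the example; the design above only fixes the parameters and the 200/400 outcomes.

```python
from dataclasses import dataclass
from typing import Optional

MAX_FILE_SIZE = 1 * 1000**3  # 1 GB limit from the functional requirements


@dataclass
class CreateFileRequest:
    user_key: str
    file_name: str
    body: bytes


@dataclass
class ApiResponse:
    status: int                     # 200 on success, 400 on a bad request
    file_id: Optional[int] = None
    file_url: Optional[str] = None
    error: Optional[str] = None


def validate_create_file(req: CreateFileRequest) -> Optional[str]:
    """Return an error message for a bad request, or None if it is valid."""
    if not req.user_key:
        return "userKey is required"
    if not req.file_name or "/" in req.file_name:
        return "fileName must be a non-empty name without path separators"
    if len(req.body) > MAX_FILE_SIZE:
        return "file exceeds the 1 GB limit"
    return None


def create_file(req: CreateFileRequest, next_file_id: int) -> ApiResponse:
    """Validate the request and build the response (storage is omitted here)."""
    error = validate_create_file(req)
    if error is not None:
        return ApiResponse(status=400, error=error)
    # Hypothetical URL scheme; the real one depends on the blob store.
    url = f"https://files.example.com/{next_file_id}/1"
    return ApiResponse(status=200, file_id=next_file_id, file_url=url)


ok = create_file(CreateFileRequest("user-key", "notes.txt", b"hello"), 7)
bad = create_file(CreateFileRequest("", "notes.txt", b"hello"), 8)
print(ok.status, ok.file_id, bad.status, bad.error)
```

updateFile and getFile would follow the same pattern, each returning 400 with a reason when validation fails.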


Database design

  1. File storage DB - Blob storage
     • We will use blob storage for file contents because it scales horizontally with little effort.
  2. User and file metadata DB - SQL
     • We will use a relational database so that metadata updates get ACID guarantees.
     • It lets us query for all of a user's files and get accurate last-updated timestamps.


File table

  • fileId:int
  • fileName:string
  • userId:int
  • createDate:datetime
  • lastUpdateDate:datetime
  • parentFolderId:int


Version table

  • versionId:int
  • fileId:int
  • updatedDate:datetime
  • fileUrl: string


User table

  • userId: int
  • createDate: datetime


Device table

  • deviceId:int
  • ipAddress:string
  • lastOnline:datetime
  • userId:int
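
The four tables can be tried out with an in-memory SQLite database. This is only a sketch: the plural table names, the foreign keys, the indexes, the ISO-8601 text timestamps, and the `blob://` URLs are assumptions added for the example, and a production system would likely use a different relational database.

```python
import sqlite3

SCHEMA = """
CREATE TABLE users (
    userId      INTEGER PRIMARY KEY,
    createDate  TEXT NOT NULL
);
CREATE TABLE files (
    fileId          INTEGER PRIMARY KEY,
    fileName        TEXT NOT NULL,
    userId          INTEGER NOT NULL REFERENCES users(userId),
    createDate      TEXT NOT NULL,
    lastUpdateDate  TEXT NOT NULL,
    parentFolderId  INTEGER
);
CREATE TABLE versions (
    versionId    INTEGER PRIMARY KEY,
    fileId       INTEGER NOT NULL REFERENCES files(fileId),
    updatedDate  TEXT NOT NULL,
    fileUrl      TEXT NOT NULL
);
CREATE TABLE devices (
    deviceId    INTEGER PRIMARY KEY,
    ipAddress   TEXT,
    lastOnline  TEXT,
    userId      INTEGER NOT NULL REFERENCES users(userId)
);
-- Assumed indexes for the two lookups the design relies on.
CREATE INDEX idx_files_user ON files(userId);
CREATE INDEX idx_versions_file ON versions(fileId, updatedDate);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# Sample rows (made up for the example).
conn.execute("INSERT INTO users VALUES (1, '2024-01-01T00:00:00')")
conn.execute(
    "INSERT INTO files VALUES "
    "(10, 'notes.txt', 1, '2024-01-01T00:00:00', '2024-01-02T00:00:00', NULL)"
)
conn.executemany(
    "INSERT INTO versions VALUES (?, ?, ?, ?)",
    [
        (1, 10, "2024-01-01T00:00:00", "blob://10/v1"),
        (2, 10, "2024-01-02T00:00:00", "blob://10/v2"),
    ],
)

# All files owned by a user.
user_files = conn.execute(
    "SELECT fileName FROM files WHERE userId = ?", (1,)
).fetchall()

# Latest version of a file; ISO-8601 strings sort chronologically.
latest = conn.execute(
    "SELECT versionId, fileUrl FROM versions "
    "WHERE fileId = ? ORDER BY updatedDate DESC LIMIT 1",
    (10,),
).fetchone()

print(user_files, latest)
```

Keeping one row per version is what makes the snapshot requirement possible: getFile with a versionId becomes a lookup in the versions table.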


High-level design

  1. Client -> Application service
  2. Application service -> Metadata table
  3. Application service -> File storage
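
To make the wiring concrete, here is a small sketch in which the application service sits between the client and the two stores. Both stores are in-memory stand-ins, and every class and method name here is hypothetical; the point is only to show which component talks to which.

```python
class BlobStore:
    """In-memory stand-in for the file storage."""

    PREFIX = "blob://"

    def __init__(self):
        self._blobs = {}

    def put(self, key, data):
        self._blobs[key] = data
        return self.PREFIX + key

    def get(self, url):
        return self._blobs[url[len(self.PREFIX):]]


class MetadataStore:
    """In-memory stand-in for the metadata DB: fileId -> list of version URLs."""

    def __init__(self):
        self._versions = {}
        self._next_id = 1

    def new_file(self):
        file_id = self._next_id
        self._next_id += 1
        self._versions[file_id] = []
        return file_id

    def version_count(self, file_id):
        return len(self._versions[file_id])

    def add_version(self, file_id, url):
        self._versions[file_id].append(url)

    def version_url(self, file_id, version_id=None):
        versions = self._versions[file_id]
        # versionId is 1-based; None means "latest" (an assumption).
        return versions[-1] if version_id is None else versions[version_id - 1]


class ApplicationService:
    """The only component the client talks to."""

    def __init__(self, blobs, metadata):
        self.blobs = blobs
        self.metadata = metadata

    def create_file(self, content):
        file_id = self.metadata.new_file()
        self.update_file(file_id, content)
        return file_id

    def update_file(self, file_id, content):
        version_id = self.metadata.version_count(file_id) + 1
        # Write the blob first so metadata never points at a missing blob.
        url = self.blobs.put(f"{file_id}/v{version_id}", content)
        self.metadata.add_version(file_id, url)
        return version_id

    def get_file(self, file_id, version_id=None):
        return self.metadata.version_url(file_id, version_id)


service = ApplicationService(BlobStore(), MetadataStore())
file_id = service.create_file(b"first draft")
service.update_file(file_id, b"second draft")
print(service.get_file(file_id), service.get_file(file_id, 1))
```

The service returns a URL rather than the bytes, so the client fetches file content from the file storage in a separate request.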




Request flows

  1. Uploading files:
     • The client makes a createFile request with the file in the body. The API responds with 200 if the upload succeeded, or 400 if it failed.
  2. Downloading files:
     • The client makes a getFile request with the fileId in the params. The API returns the URL of the file, and the client makes a second request to that URL to fetch the content.
  3. Sharing files or folders:
     • The client makes a getFile request to get the URL of the file or folder, which it can then pass to other users.
  4. Automatic synchronization:
     • When the client downloads a file, it checks the file metadata for the last update time and reads the latest version from the file storage.
  5. Offline editing:
     • Once the user comes back online, we check the current device for changed files. For each one, we make an updateFile request and upload the file to the server, which updates the file storage and the metadata.
     • The client also fetches any files that were updated while it was offline. We can capture when the user last went offline and look for files updated since then.
     • What happens if a file was updated both in the file storage and on an offline device? We keep the version that was updated last, based on the timestamp of the upload.
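
The offline-editing rules can be sketched as two small functions: one that finds server files changed since the device was last online, and one that applies the last-write-wins rule. The `FileState` type and the tie-breaking choice (the server copy wins on equal timestamps) are assumptions for the example.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class FileState:
    file_id: int
    last_update: datetime  # upload timestamp, as described above
    content: bytes


def files_changed_since(server_files, last_online):
    """Server files the device should pull after coming back online."""
    return [f for f in server_files if f.last_update > last_online]


def resolve_conflict(local, remote):
    """Last-write-wins: keep the copy with the later timestamp.

    On a tie the server copy is kept (an assumption, not from the design).
    """
    return local if local.last_update > remote.last_update else remote


went_offline = datetime(2024, 1, 1, 12, 0)
server_files = [
    FileState(1, datetime(2024, 1, 1, 9, 0), b"old"),
    FileState(2, datetime(2024, 1, 1, 15, 0), b"server edit"),
]
local_edit = FileState(2, datetime(2024, 1, 1, 18, 0), b"offline edit")

to_pull = files_changed_since(server_files, went_offline)
winner = resolve_conflict(local_edit, server_files[1])
print([f.file_id for f in to_pull], winner.content)
```

Last-write-wins silently discards the older edit. If the losing copy is also stored as a row in the Version table, the user could still recover it, though the flow above does not say whether that happens.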



