System requirements


Functional:

List functional requirements for the system (Ask the chat bot for hints if stuck.)...

  • For the purposes of this mock, we will assume there is a fully fleshed-out auth system for now. We can dive deeper into this if it affects the design.
  • For now, we will also skip the "local storage" part of these kinds of applications, and focus mainly on the cloud storage part.
  • Users can upload files
  • Users can replace files
  • Users can delete files
  • Users can download files available to them
  • Users can specify who this file is shared with - private (only them), specific users, or publicly available. Public files do not show up in a particular user's list of files, but can be accessed by the URL by anyone.
  • Users can remove/change shared settings for particular files.
  • When a file is shared with a user, it is added to their list of files in the application, and they can download it.
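
To make the sharing rules above concrete, here is a minimal sketch of a download permission check. The names (ShareType, File, can_download) are hypothetical, and it assumes that owners always keep access to their own files and that an anonymous caller is represented by None.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional, Set


class ShareType(Enum):
    PRIVATE = "private"
    USERS = "users"
    PUBLIC = "public"


@dataclass
class File:
    id: str
    owner_id: str
    share_type: ShareType = ShareType.PRIVATE
    shared_with: Set[str] = field(default_factory=set)  # only consulted for USERS


def can_download(file: File, user_id: Optional[str]) -> bool:
    """Return True if user_id (None means anonymous) may download the file."""
    if file.share_type is ShareType.PUBLIC:
        return True  # anyone with the URL can fetch a public file
    if user_id is None:
        return False
    if user_id == file.owner_id:
        return True  # assumption: owners always keep access
    return file.share_type is ShareType.USERS and user_id in file.shared_with


report = File("f1", owner_id="alice", share_type=ShareType.USERS, shared_with={"bob"})
```

Checking share_type before the share list means that switching a file back to private takes effect even if old share entries have not been cleaned up yet.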



Non-Functional:

List non-functional requirements for the system...

  • The system should prioritize availability over consistency - if it takes extra time for a file to get uploaded, or for a shared file to show up in the shared user's space, that is okay.
  • The system should be able to handle a high volume of upload and download traffic.


Capacity estimation

Estimate the scale of the system you are going to design...

  • We'll assume 100M DAU for now, but we won't focus too much on this unless it ends up affecting our design. We just need to know we need to handle "a lot" of users.
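
As a rough back-of-the-envelope sketch of what "a lot" could mean: only the 100M DAU figure comes from the notes above. The uploads per user, downloads per user, and average file size are placeholder assumptions picked purely for illustration, and should be replaced with real numbers if capacity ends up mattering.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def estimate_load(dau, uploads_per_user, downloads_per_user, avg_file_mb):
    """Convert daily usage assumptions into average request rates and storage growth."""
    uploads_per_day = dau * uploads_per_user
    downloads_per_day = dau * downloads_per_user
    return {
        "upload_qps": uploads_per_day / SECONDS_PER_DAY,
        "download_qps": downloads_per_day / SECONDS_PER_DAY,
        # 1 TB = 1,000,000 MB (decimal units)
        "new_storage_tb_per_day": uploads_per_day * avg_file_mb / 1_000_000,
    }


estimate = estimate_load(
    dau=100_000_000,        # from the notes above
    uploads_per_user=2,     # assumption
    downloads_per_user=10,  # assumption
    avg_file_mb=1,          # assumption
)
```

These are daily averages; peak traffic would be higher.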




API design

Define what APIs are expected from the system...

  • getFiles()
  • getSignedUrl(data)
  • uploadFile(fileData)
    • This will actually be a combination of two calls: getSignedUrl() to get an upload URL from our blob storage, and then uploadFile() with that URL once the upload to blob storage is complete.
  • replaceFile(fileId, fileData)
    • Similarly, the client first uploads the new file using a URL from getSignedUrl(), and then calls replaceFile().
  • deleteFile(fileId)
  • shareFileToUser(fileId, userId)
  • changeFileShareType(fileId, shareType)
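
The two-step upload described above could look roughly like the sketch below. FakeBlobStore is an in-memory stand-in rather than a real storage SDK, and blob.example.com, the "sig" query parameter, and the method names are all assumptions made for illustration.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class FakeBlobStore:
    """In-memory stand-in for blob storage (hypothetical, not a real SDK)."""
    pending: dict = field(default_factory=dict)  # signed URL -> blob URL
    blobs: dict = field(default_factory=dict)    # blob URL -> bytes

    def get_signed_url(self, filename: str) -> str:
        blob_url = f"https://blob.example.com/{uuid.uuid4()}/{filename}"
        signed_url = f"{blob_url}?sig={uuid.uuid4().hex}"
        self.pending[signed_url] = blob_url
        return signed_url

    def put(self, signed_url: str, data: bytes) -> None:
        blob_url = self.pending.pop(signed_url)  # KeyError if never issued
        self.blobs[blob_url] = data


@dataclass
class FileService:
    blob_store: FakeBlobStore
    files: dict = field(default_factory=dict)  # file id -> metadata row

    def get_signed_url(self, filename: str) -> str:
        return self.blob_store.get_signed_url(filename)

    def upload_file(self, owner_id: str, signed_url: str) -> str:
        blob_url = signed_url.split("?", 1)[0]
        if blob_url not in self.blob_store.blobs:
            raise ValueError("upload to blob storage has not completed")
        file_id = str(uuid.uuid4())
        self.files[file_id] = {"owner_id": owner_id, "url": blob_url,
                               "share_type": "private"}
        return file_id


# Client-side flow: ask for a URL, upload the bytes, then register the file.
service = FileService(FakeBlobStore())
signed = service.get_signed_url("report.pdf")
service.blob_store.put(signed, b"file contents")
file_id = service.upload_file("user-1", signed)
```

Keeping the file bytes out of FileService means the service only handles small metadata requests, while the heavy upload traffic goes straight to blob storage.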



Database design

Defining the system data model early on will clarify how data will flow among different components of the system. Also you could draw an ER diagram using the diagramming tool to enhance your design...


We'll use a relational database (RDB), since we will have tightly coupled relationships between users and their files.


User

  • userId: uuid


File

  • id: uuid
  • owner_id: fk(user)
  • url: blob URL
  • share_type: enum (private | users | public)


File_Share

  • file_id: fk(file)
  • user_id: fk(user)
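
As one possible rendering of the tables above, the sketch below creates them in an in-memory SQLite database and shows a query that could back getFiles(). The column types, the CHECK constraint, the sample rows, and the choice to list a user's own files plus files explicitly shared with them are assumptions, not a settled schema.

```python
import sqlite3

SCHEMA = """
CREATE TABLE "user" (user_id TEXT PRIMARY KEY);
CREATE TABLE file (
    id         TEXT PRIMARY KEY,
    owner_id   TEXT NOT NULL REFERENCES "user"(user_id),
    url        TEXT NOT NULL,
    share_type TEXT NOT NULL
               CHECK (share_type IN ('private', 'users', 'public'))
);
CREATE TABLE file_share (
    file_id TEXT NOT NULL REFERENCES file(id),
    user_id TEXT NOT NULL REFERENCES "user"(user_id),
    PRIMARY KEY (file_id, user_id)
);
"""

# Files the user owns, plus files shared with them while share_type = 'users'.
LIST_FILES = """
SELECT id FROM file WHERE owner_id = :uid
UNION
SELECT f.id FROM file AS f
JOIN file_share AS s ON s.file_id = f.id
WHERE s.user_id = :uid AND f.share_type = 'users'
ORDER BY 1
"""


def get_files(conn, user_id):
    return [row[0] for row in conn.execute(LIST_FILES, {"uid": user_id})]


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.executemany('INSERT INTO "user" VALUES (?)', [("alice",), ("bob",)])
conn.executemany("INSERT INTO file VALUES (?, ?, ?, ?)", [
    ("f1", "alice", "https://blob.example.com/f1", "private"),
    ("f2", "alice", "https://blob.example.com/f2", "users"),
    ("f3", "alice", "https://blob.example.com/f3", "public"),
])
# The f1 row simulates a stale share left over from before f1 became private.
conn.executemany("INSERT INTO file_share VALUES (?, ?)",
                 [("f2", "bob"), ("f1", "bob")])
```

With this data, get_files(conn, "bob") returns only f2: the public file f3 is reachable by URL but not listed, and the stale share on f1 is ignored because the file is private.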


We will also use blob storage for the file contents - when a file is uploaded, the bytes go to blob storage and the File row stores the resulting blob URL.




High-level design

You should identify enough components that are needed to solve the actual problem from end to end. Also remember to draw a block diagram using the diagramming tool to augment your design. If you are unfamiliar with the tool, you can simply describe your design to the chat bot and ask it to generate a starter diagram for you to modify...


FileService

  • Holds the base logic for handling our CRUD operations for files
  • Will handle basic auth for who can perform these actions, but we won't focus on that in this interview.


FeedCache

  • This will cache the files available to a particular user.
  • When a new file is added, we will prioritize updating the uploader's feed, and then process it for any shared users.


Events

  • File Shared
  • File ShareType Changed


Feed Worker

  • We will have an async worker that will process events from a message queue to update the "feeds" of shared users. If a file gets added to or removed from a user's "feed", this worker will go and update their permission.




Request flows

Explain how the request flows from end to end in your high level design. Also you could draw a sequence diagram using the diagramming tool to enhance your explanation...






Detailed component design

Dig deeper into 2-3 components and explain in detail how they work. For example, how well does each component scale? Any relevant algorithm or data structure you like to use for a component? Also you could draw a diagram using the diagramming tool to enhance your design...


FeedWorker

  • This will populate the FeedCache, which will update the list of files for our users
  • When a file is created, shared, updated, or removed, an async message is sent to this worker, which updates the in-memory cache for the affected users.
  • For the caller, we will apply the change in real time by going directly to the DB, so that the UI reflects it immediately.
  • For "shared" users, the file will show up in their list of files based on its share list, and the async process will update it with eventual consistency.
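
A minimal sketch of the worker loop described above, using Python's standard queue module as a stand-in for the message queue and a dict of sets as the feed cache. The event classes and their fields are assumptions; in particular, this sketch assumes the share-type event carries the list of affected users, whereas a real worker might look them up in the DB instead.

```python
import queue
from collections import defaultdict
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class FileShared:
    file_id: str
    user_id: str


@dataclass(frozen=True)
class FileShareTypeChanged:
    file_id: str
    share_type: str                  # "private" | "users" | "public"
    shared_user_ids: Tuple[str, ...]


class FeedWorker:
    def __init__(self, events, feed_cache):
        self.events = events
        self.feed_cache = feed_cache  # user id -> set of file ids

    def handle(self, event):
        if isinstance(event, FileShared):
            self.feed_cache[event.user_id].add(event.file_id)
        elif isinstance(event, FileShareTypeChanged):
            for user_id in event.shared_user_ids:
                if event.share_type == "users":
                    self.feed_cache[user_id].add(event.file_id)
                else:
                    # Private and public files are not listed for shared users.
                    self.feed_cache[user_id].discard(event.file_id)

    def drain(self):
        """Process everything currently queued; a real worker would block and loop."""
        processed = 0
        while True:
            try:
                event = self.events.get_nowait()
            except queue.Empty:
                return processed
            self.handle(event)
            processed += 1


events = queue.Queue()
feed_cache = defaultdict(set)
worker = FeedWorker(events, feed_cache)
events.put(FileShared("f1", "bob"))
worker.drain()
```

Both handlers are idempotent (adding or discarding the same file id twice has no extra effect), which matters if the queue ever redelivers a message.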


Trade offs/Tech choices

Explain any trade offs you have made and why you made certain tech choices...


One trade-off is that if the user refreshes their feed right after uploading a file, the feed worker may not have processed the update yet, so the feed will show the old list. We can mitigate this by updating or invalidating the uploader's entry in the feed cache at write time, so that a refresh fetches the most up-to-date information.
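
The invalidation idea can be sketched as a small read-through cache, where the write path drops the uploader's entry so that the next read reloads it from the DB. FeedCache, its methods, and the dict standing in for the DB are hypothetical names used only for this illustration.

```python
class FeedCache:
    """Read-through cache: misses are loaded from the DB via load_from_db."""

    def __init__(self, load_from_db):
        self._load_from_db = load_from_db
        self._entries = {}

    def get(self, user_id):
        if user_id not in self._entries:  # miss: fall back to the DB
            self._entries[user_id] = list(self._load_from_db(user_id))
        return self._entries[user_id]

    def invalidate(self, user_id):
        self._entries.pop(user_id, None)


db = {"alice": ["f1"]}  # stand-in for the File table, keyed by owner
cache = FeedCache(lambda user_id: db.get(user_id, []))


def upload_file(user_id, file_id):
    db.setdefault(user_id, []).append(file_id)  # synchronous DB write
    cache.invalidate(user_id)                   # uploader reloads on next read


cache.get("alice")            # warms the cache with ["f1"]
upload_file("alice", "f2")
```

After the upload, cache.get("alice") reloads from the DB and includes f2, while other users' entries are left for the async worker to update.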




Failure scenarios/bottlenecks

Try to discuss as many failure scenarios/bottlenecks as possible.


  • MessageQueue goes down
    • Here, this can cause the feed cache to be outdated quickly. We could mitigate this by having a shorter TTL on the cache, so that if the cache gets outdated, we will refresh it occasionally.
    • We can also use an "outbox" pattern, which stores the events in our DB. That way, any "missed" messages can be published and processed once the queue comes back online.
  • Cache goes down
    • We have already addressed this by falling back to the DB, but the problem would be the large spike in calls to the DB if this occurs. We can mitigate this by having multiple cache fallbacks, as well as DB infrastructure that can handle the resulting load.
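
The outbox idea mentioned under "MessageQueue goes down" could be sketched as follows, using an in-memory SQLite database. The table layout, the function names, and the use of ConnectionError to signal an unavailable queue are assumptions for illustration only.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE file_share (
    file_id TEXT, user_id TEXT, PRIMARY KEY (file_id, user_id)
);
CREATE TABLE outbox (
    id        INTEGER PRIMARY KEY AUTOINCREMENT,
    payload   TEXT NOT NULL,
    published INTEGER NOT NULL DEFAULT 0
);
""")


def share_file_to_user(file_id, user_id):
    event = {"type": "FileShared", "file_id": file_id, "user_id": user_id}
    with conn:  # one transaction: both rows commit, or neither does
        conn.execute("INSERT INTO file_share VALUES (?, ?)", (file_id, user_id))
        conn.execute("INSERT INTO outbox (payload) VALUES (?)",
                     (json.dumps(event),))


def relay_outbox(publish):
    """Publish unpublished events in order; stop at the first failure."""
    rows = conn.execute(
        "SELECT id, payload FROM outbox WHERE published = 0 ORDER BY id"
    ).fetchall()
    sent = 0
    for row_id, payload in rows:
        try:
            publish(json.loads(payload))
        except ConnectionError:
            break  # queue still down; the row stays unpublished for a retry
        with conn:
            conn.execute("UPDATE outbox SET published = 1 WHERE id = ?",
                         (row_id,))
        sent += 1
    return sent


def queue_is_down(event):
    raise ConnectionError("message queue unavailable")


delivered = []  # stand-in for a healthy queue
share_file_to_user("f1", "bob")
```

Calling relay_outbox(queue_is_down) leaves the event in the outbox, and a later relay_outbox(delivered.append) publishes it. Because a crash between publishing and marking the row could cause a resend, this gives at-least-once delivery, so the feed worker's handlers should be idempotent.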




Future improvements

What are some future improvements you would make? How would you mitigate the failure scenario(s) you described above?


We could consider putting a CDN in front of publicly available files, so that they can be served quickly from locations close to users in every region.