System requirements


Functional:

We need to support the following functionality:

  • user creates and posts a tweet
  • user follows or unfollows another user
  • user likes a tweet
  • user views a home feed with tweets from the users they follow


Registration, authorisation and notifications are also very important, but we'll leave them out of scope for now.



Non-Functional:

We want the system to

  • have high availability
  • be scalable and handle peak loads
  • have low latency



Capacity estimation

  • 100k DAU; peak values can reach 10x or more
  • 5 tweets per user per day on average; this may scale along with the DAU (so also 10x or more)


Let's assume a tweet is ~70 characters long and takes about 140 B of storage (2 bytes per character), and that about 1/5 of all tweets include a photo (~5 MB). Then we will need:

10 ^ 5 * 5 * 140 B = 7 * 10 ^ 7 B = 70 MB of storage per day for text content right now, and up to 100x that (about 7 GB per day) once both the DAU base and the per-user activity have grown 10x.

10 ^ 5 * 5 * 1/5 * 5 * 10 ^ 6 B = 5 * 10 ^ 11 B = 500 GB of storage per day for photos, which we can reduce by preprocessing and optimising the original files.
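
The estimate can be re-checked with a few lines of Python. This is only a sketch: it plugs in the figures stated in the assumptions (100k DAU, 5 tweets per user per day, 140 B per tweet, one photo of ~5 MB per five tweets), and the constant and function names are made up for illustration.

```python
# Back-of-the-envelope storage estimate from the stated assumptions.
DAU = 100_000                # daily active users
TWEETS_PER_USER = 5          # average tweets per user per day
TWEET_BYTES = 140            # ~70 characters, assumed 2 bytes each
PHOTO_EVERY_N_TWEETS = 5     # 1/5 of all tweets carry a photo
PHOTO_BYTES = 5 * 10**6      # ~5 MB per photo


def daily_storage(dau, tweets_per_user):
    """Return (text_bytes, photo_bytes) stored per day."""
    tweets = dau * tweets_per_user
    text_bytes = tweets * TWEET_BYTES
    photo_bytes = (tweets // PHOTO_EVERY_N_TWEETS) * PHOTO_BYTES
    return text_bytes, photo_bytes


text_now, photos_now = daily_storage(DAU, TWEETS_PER_USER)
# Growth scenario: both the DAU and the per-user activity grow 10x.
text_later, photos_later = daily_storage(DAU * 10, TWEETS_PER_USER * 10)
```

Changing a single constant (for example the average photo size after compression) shows directly how sensitive the total is to that assumption; photos dominate the text content by several orders of magnitude.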




API design

  • POST /api/v1/tweets/new - returns a status code with some metadata about the new tweet, or an identifier the client can use to request the processing status
  • PUT /api/v1/tweets/{tweet_id}/like - returns a status code
  • POST /api/v1/users/{user_id}/follow - returns a status code along with some metadata about the followed user
  • GET /api/v1/tweets/user_feed/{user_id} - returns a paginated collection of tweets for a specific user, ordered according to business logic (e.g. top K most popular or newest)
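
To make the endpoint list concrete, here is a minimal sketch of how the four paths could be matched to handlers. It is framework-free on purpose; the handler names (create_tweet, like_tweet, follow_user, get_user_feed) and the resolve helper are hypothetical and only illustrate how the {tweet_id} and {user_id} placeholders are extracted.

```python
import re

# (HTTP method, path template, hypothetical handler name)
ROUTES = [
    ("POST", "/api/v1/tweets/new", "create_tweet"),
    ("PUT", "/api/v1/tweets/{tweet_id}/like", "like_tweet"),
    ("POST", "/api/v1/users/{user_id}/follow", "follow_user"),
    ("GET", "/api/v1/tweets/user_feed/{user_id}", "get_user_feed"),
]


def _compile(template):
    # Turn each "{name}" placeholder into a named regex group.
    pattern = re.sub(r"\{(\w+)\}", r"(?P<\1>[^/]+)", template)
    return re.compile(f"^{pattern}$")


COMPILED = [(method, _compile(tpl), handler) for method, tpl, handler in ROUTES]


def resolve(method, path):
    """Return (handler_name, path_params), or None if nothing matches."""
    for route_method, regex, handler in COMPILED:
        if route_method != method:
            continue
        match = regex.match(path)
        if match:
            return handler, match.groupdict()
    return None
```

For example, resolve("PUT", "/api/v1/tweets/17/like") yields the like handler together with the extracted tweet_id, while an unknown method and path combination yields None, which a real service would translate into a 404 or 405 response.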


Database design

We'll have the following entities:
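
One possible set of entities, inferred from the API and the request flows rather than taken from a finished schema, is sketched below. Every field name here is an assumption; the point is only to show which relations the services would need to persist.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class User:
    user_id: int
    handle: str                      # assumed display/login name


@dataclass(frozen=True)
class Tweet:
    tweet_id: int
    author_id: int                   # references User.user_id
    text: str
    created_at: datetime
    photo_url: Optional[str] = None  # set only for tweets with a photo


@dataclass(frozen=True)
class Follow:
    follower_id: int                 # the user who follows
    followee_id: int                 # the user being followed


@dataclass(frozen=True)
class Like:
    user_id: int
    tweet_id: int
```

Follow and Like are pure relations, so in a relational store their two columns would naturally form a composite primary key; the frozen dataclasses mirror that by being hashable, so duplicates collapse in a set.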




High-level design








Request flows

  1. User posts a tweet
     • a request with the tweet text is sent to the Tweet service
     • the service either adds the tweet to the database, or sends the photo content for processing and returns an identifier that the user can use to check the processing status
     • [Optional] a notification about the new tweet is sent via the Notification service
  2. User follows another user
     • a follow request is sent to the Follow service
     • the relation between the users is updated in the DB
     • [Optional] a notification about the new follower is sent via the Notification service
  3. User likes a tweet
     • a request is sent to the Likes service
     • the likes state is updated in the DB
     • [Optional] a notification about the new like is sent
  4. User requests their feed
     • a request is sent to the Feed service
     • the Feed service constructs the feed and returns it to the user
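
The feed flow does not say how the Feed service constructs the feed. One simple option, shown here only as a sketch, is to build it at read time by merging the timelines of the followed users, newest first. The in-memory dictionaries stand in for the DB, tweets are modelled as (timestamp, author_id, text) tuples, and build_feed is a hypothetical name.

```python
import heapq
from itertools import islice


def build_feed(user_id, follows, tweets_by_author, limit=10):
    """Merge the followees' timelines into one newest-first feed.

    follows: dict mapping user_id -> set of followed user ids
    tweets_by_author: dict mapping author_id -> list of
        (timestamp, author_id, text) tuples, each list sorted newest first
    """
    followees = follows.get(user_id, set())
    timelines = (tweets_by_author.get(author, []) for author in followees)
    # heapq.merge lazily merges already-sorted inputs; reverse=True expects
    # every input to be sorted from largest to smallest timestamp.
    merged = heapq.merge(*timelines, key=lambda tweet: tweet[0], reverse=True)
    return list(islice(merged, limit))


follows = {1: {2, 3}}
tweets_by_author = {
    2: [(30, 2, "third"), (10, 2, "first")],
    3: [(20, 3, "second")],
    4: [(40, 4, "not followed")],
}
feed = build_feed(1, follows, tweets_by_author)
```

Merging at read time keeps writes cheap but makes every feed request touch all followed timelines. Precomputing feeds when a tweet is posted shifts that cost to the write path, which is a trade-off the design would still have to decide on.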




Detailed component design







Trade offs/Tech choices







Failure scenarios/bottlenecks







Future improvements
