System requirements


Functional:

  • Get a feed of tweets aggregated from the users you follow, plus popular posts
  • Follow and unfollow a user
  • Post pictures or videos (potentially)
  • Search for tweets
  • Secure login



Non-Functional:

  • high availability
  • quick reads
  • eventual consistency on posts



Capacity estimation

  • Assume 100 million daily active users
  • Assume 1 out of 10 users posts a tweet each minute, i.e. about 10 million tweets per minute (about 14.4 billion per day)
  • Each tweet is at most 160 B, so tweet text comes to roughly 1.6 GB per minute, or about 2.3 TB per day
  • Each tweet carries another 160 B or so of metadata (uid, likes, retweets, etc.), roughly doubling storage to about 4.6 TB per day
  • Bandwidth: we would want at least 1 Mbps per user to load multiple tweets and their metadata
  • If we support photos or videos, we would need blob storage, sized according to the limits we set on media posts
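
The storage arithmetic can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: it takes the "1 out of 10 users per minute" posting rate literally, and the constant names are made up for illustration.

```python
# Back-of-the-envelope storage estimate from the assumptions listed above.
DAILY_ACTIVE_USERS = 100_000_000
POSTERS_PER_MINUTE = DAILY_ACTIVE_USERS // 10   # 1 in 10 users posts each minute
TWEET_BYTES = 160                               # max size of the tweet text
METADATA_BYTES = 160                            # uid, likes, retweets, etc.
MINUTES_PER_DAY = 24 * 60

tweets_per_day = POSTERS_PER_MINUTE * MINUTES_PER_DAY
text_bytes_per_day = tweets_per_day * TWEET_BYTES
total_bytes_per_day = tweets_per_day * (TWEET_BYTES + METADATA_BYTES)

print(f"tweets/day:          {tweets_per_day:,}")
print(f"text storage/day:    {text_bytes_per_day / 1e12:.2f} TB")
print(f"text + metadata/day: {total_bytes_per_day / 1e12:.2f} TB")
```

Changing the posting rate (for example, one tweet per posting user per day instead of per minute) changes the result by three orders of magnitude, so this assumption is worth stating precisely.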



API design

  • postTweet(uid, content): POST request
  • getTweetFeed(uid): GET request
  • postTweetMedia(uid, content, fileType): POST request
  • searchTweet(content): GET request
  • getFollowers(uid): GET request
  • followUser(uid, followingUid): PUT request
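
To make the intended behaviour of these calls concrete, here is a minimal in-memory sketch. The `TweetService` class, the `Tweet` record, the snake_case method names, and the substring search are all assumptions for illustration; a real service would sit behind HTTP handlers and a database. Media upload is left out.

```python
from collections import defaultdict
from dataclasses import dataclass
from itertools import count


@dataclass
class Tweet:
    tweet_id: int
    uid: str
    content: str


class TweetService:
    """In-memory stand-in; each method mirrors one endpoint listed above."""

    def __init__(self):
        self._ids = count(1)
        self._tweets = []                    # all tweets, oldest first
        self._following = defaultdict(set)   # uid -> set of uids they follow

    def post_tweet(self, uid, content):      # POST
        if len(content.encode("utf-8")) > 160:
            raise ValueError("tweet exceeds 160 bytes")
        tweet = Tweet(next(self._ids), uid, content)
        self._tweets.append(tweet)
        return tweet

    def follow_user(self, uid, following_uid):   # PUT: repeating it is harmless
        self._following[uid].add(following_uid)

    def get_followers(self, uid):            # GET
        return sorted(u for u, f in self._following.items() if uid in f)

    def get_tweet_feed(self, uid, limit=20):     # GET: newest first
        followed = self._following.get(uid, set())
        feed = [t for t in reversed(self._tweets) if t.uid in followed]
        return feed[:limit]

    def search_tweet(self, content):         # GET: naive substring match
        needle = content.lower()
        return [t for t in self._tweets if needle in t.content.lower()]


svc = TweetService()
svc.follow_user("alice", "bob")
svc.post_tweet("bob", "hello world")
svc.post_tweet("carol", "unrelated")
feed = svc.get_tweet_feed("alice")
print([t.content for t in feed])
```

Modelling followUser as a set insertion is what makes PUT a reasonable verb here: sending the same request twice leaves the state unchanged.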



Database design

  • Followers table: uid | followingUid
  • Posts table: uid | tweetid | content | media link to blob storage
  • Likes & retweets table: uid | likes | retweets
  • User profile table: uid | email | age
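
One way to sketch these tables is with SQLite from Python's standard library. The column types, constraints, table and index names, and sample rows below are assumptions. In particular, the likes and retweets table is keyed by tweet id here (the outline lists uid) so that counts attach to individual tweets.

```python
import sqlite3

SCHEMA = """
CREATE TABLE users (
    uid   INTEGER PRIMARY KEY,
    email TEXT NOT NULL UNIQUE,
    age   INTEGER
);
CREATE TABLE followers (
    uid           INTEGER NOT NULL REFERENCES users(uid),
    following_uid INTEGER NOT NULL REFERENCES users(uid),
    PRIMARY KEY (uid, following_uid)      -- also serves as the index on uid
);
CREATE TABLE posts (
    tweet_id   INTEGER PRIMARY KEY,
    uid        INTEGER NOT NULL REFERENCES users(uid),
    content    TEXT NOT NULL,
    media_link TEXT                       -- URL into blob storage, if any
);
CREATE TABLE engagement (
    tweet_id INTEGER PRIMARY KEY REFERENCES posts(tweet_id),
    likes    INTEGER NOT NULL DEFAULT 0,
    retweets INTEGER NOT NULL DEFAULT 0
);
CREATE INDEX posts_by_author ON posts(uid, tweet_id DESC);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.executemany(
    "INSERT INTO users VALUES (?, ?, ?)",
    [(1, "a@example.com", 30), (2, "b@example.com", 25)],
)
conn.execute("INSERT INTO followers VALUES (1, 2)")   # user 1 follows user 2
conn.execute("INSERT INTO posts (uid, content) VALUES (2, 'hello')")

# Feed query: latest posts from everyone user 1 follows.
rows = conn.execute(
    """
    SELECT p.tweet_id, p.content
    FROM posts p
    JOIN followers f ON f.following_uid = p.uid
    WHERE f.uid = ?
    ORDER BY p.tweet_id DESC
    LIMIT 20
    """,
    (1,),
).fetchall()
print(rows)
```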





High-level design

An algorithm will determine, once an hour, which posts have the highest engagement and load them into a cache that users' feeds can pull from. For each user we will also cache the list of accounts they follow, backed by a uid index on the followers table. That way we can quickly figure out who they follow and pull the latest posts from those accounts.
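
As a rough illustration of the hourly engagement ranking, the sketch below picks the top posts with `heapq.nlargest`. The scoring formula (likes plus twice the retweets), the `PostStats` record, and the sample numbers are assumptions, not part of the design.

```python
import heapq
from dataclasses import dataclass


@dataclass(frozen=True)
class PostStats:
    tweet_id: int
    likes: int
    retweets: int


def engagement_score(post, retweet_weight=2):
    # Hypothetical score: retweets count more than likes.
    return post.likes + retweet_weight * post.retweets


def top_posts(last_hour, k):
    """Return the k most engaging posts without sorting the whole list."""
    return heapq.nlargest(k, last_hour, key=engagement_score)


stats = [
    PostStats(1, likes=10, retweets=0),
    PostStats(2, likes=3, retweets=5),
    PostStats(3, likes=1, retweets=1),
    PostStats(4, likes=50, retweets=20),
]
hot_cache = [p.tweet_id for p in top_posts(stats, k=2)]
print(hot_cache)
```

A job like this would run once an hour and replace the contents of the popular-posts cache with its result.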


For popular users with many followers, we can also cache their posts for fan-out to their followers. Because so many people will request these posts, serving them from the cache saves a lot of time when those followers log on and pull their feeds.
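
A minimal sketch of this idea, assuming a follower-count threshold decides who counts as "popular". The threshold value, the dictionaries standing in for the cache and the posts table, and the function names are all hypothetical.

```python
from collections import deque

POPULAR_THRESHOLD = 3      # hypothetical cutoff; a real one would be far larger
RECENT_PER_AUTHOR = 100    # how many recent posts to keep cached per author

follower_counts = {"celebrity": 5, "regular": 1}
post_store = {}   # author -> list of all posts (stand-in for the posts table)
post_cache = {}   # author -> deque of recent posts, popular authors only


def publish(author, content):
    """Always write to the store; also cache the post if the author is popular."""
    post_store.setdefault(author, []).append(content)
    if follower_counts.get(author, 0) >= POPULAR_THRESHOLD:
        cache = post_cache.setdefault(author, deque(maxlen=RECENT_PER_AUTHOR))
        cache.append(content)


def recent_posts(author, limit=10):
    """Return (posts newest first, whether they were served from the cache)."""
    cached = post_cache.get(author)
    if cached is not None:
        return list(cached)[-limit:][::-1], True
    return post_store.get(author, [])[-limit:][::-1], False


publish("celebrity", "big news")
publish("regular", "lunch")
print(recent_posts("celebrity"))
print(recent_posts("regular"))
```

The bounded deque keeps memory per popular author fixed; older posts fall out of the cache and are read from the store instead.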




Request flows

Explain how the request flows from end to end in your high level design. Also you could draw a sequence diagram using the diagramming tool to enhance your explanation...






Detailed component design

Dig deeper into 2-3 components and explain in detail how they work. For example, how well does each component scale? Any relevant algorithm or data structure you like to use for a component? Also you could draw a diagram using the diagramming tool to enhance your design...






Trade offs/Tech choices

The main trade-off is fast reads versus fast writes. Since most of our users will be reading tweets rather than creating them, we optimize for reads. It's fine if a tweet isn't shown to everyone right after it is posted.




Failure scenarios/bottlenecks

Try to discuss as many failure scenarios/bottlenecks as possible.






Future improvements

What are some future improvements you would make? How would you mitigate the failure scenario(s) you described above?