System requirements


Functional:

  • post tweet
  • view updates of other users
  • like
  • comment
  • notifications
  • follow user



Non-Functional:

  • reads faster than writes (read-heavy workload)
    • cache for timeline
  • availability > consistency => eventual consistency
    • asynchronous updates for writes
  • scalability
    • load balancer
  • fault tolerance
    • replication of server and db



Capacity estimation

  • 1M users
  • 280 bytes per tweet (280 characters, assuming 1 byte per character)
  • 10 tweets per user per day
  • 1M * 10 * 280 bytes = 2.8 GB/day
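
The estimate above can be checked with a few lines of Python. This is a minimal sketch that assumes one byte per character, no per-tweet metadata, and decimal gigabytes; the yearly figure is an extrapolation that is not part of the original estimate.

```python
# Back-of-the-envelope storage estimate using the figures listed above.
USERS = 1_000_000             # 1M users
TWEETS_PER_USER_PER_DAY = 10  # 10 tweets per user per day
BYTES_PER_TWEET = 280         # assumption: 1 byte per character, no metadata


def daily_storage_bytes(users: int, tweets_per_user: int, bytes_per_tweet: int) -> int:
    """Raw tweet bytes written per day."""
    return users * tweets_per_user * bytes_per_tweet


daily = daily_storage_bytes(USERS, TWEETS_PER_USER_PER_DAY, BYTES_PER_TWEET)
daily_gb = daily / 1e9             # decimal gigabytes
yearly_tb = daily * 365 / 1e12     # extrapolation, not in the original notes

print(f"{daily_gb:.1f} GB/day, about {yearly_tb:.2f} TB/year")
```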


API design

POST /create/tweet

{

"content" : String,

"createdAt" : timestamp,

"userID" : String

}


GET /get/timeline?userID=


POST /like/tweet

{

"postID" : String,

"userID" : String

}


POST /comment/tweet

{

"postID" : String,

"userID" : String,

"comment" : String

}


POST /follow/user

{

"follower" : String,

"followee" : String

}
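
The request bodies above could be modelled and validated before they reach the server logic. The sketch below is only an illustration: the class names, the `validate` methods, the `to_json` helper, and the 280-character limit (borrowed from the capacity estimate) are assumptions, and the timestamp is left for the server to assign.

```python
import json
from dataclasses import asdict, dataclass

MAX_TWEET_LENGTH = 280  # assumption: taken from the capacity estimate


@dataclass
class CreateTweetRequest:
    """Hypothetical body for POST /create/tweet (timestamp set server-side)."""
    content: str
    userID: str

    def validate(self) -> None:
        if not self.content:
            raise ValueError("content must not be empty")
        if len(self.content) > MAX_TWEET_LENGTH:
            raise ValueError("content exceeds 280 characters")
        if not self.userID:
            raise ValueError("userID is required")


@dataclass
class FollowRequest:
    """Hypothetical body for POST /follow/user."""
    follower: str
    followee: str

    def validate(self) -> None:
        if not self.follower or not self.followee:
            raise ValueError("follower and followee are required")
        if self.follower == self.followee:
            raise ValueError("a user cannot follow themselves")


def to_json(request) -> str:
    """Validate a request object and serialize it as a JSON body."""
    request.validate()
    return json.dumps(asdict(request))
```

For example, `to_json(CreateTweetRequest("hello", "u1"))` produces a JSON object with the `content` and `userID` fields, while an over-long tweet or a self-follow raises `ValueError`.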



Database design

[User]

ID, String, primary key

name, string

address, string


[tweet]

ID, String, primary key

content, String

createdBy, String

createdAt, timestamp

updatedAt, timestamp

like, integer


[comment]

ID, String, primary key

tweetID, String

content, String

createdBy, String

createdAt, timestamp

updatedAt, timestamp


[follow]

follower, String

followee, String
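
The tables above can be tried out with an in-memory SQLite database. This is a sketch under assumptions: the plural table names, the snake_case columns, `like_count` (renamed because `LIKE` is an SQL keyword), the composite key on the follow table, and the `timeline` query are illustrative choices rather than part of the original design.

```python
import sqlite3

SCHEMA = """
CREATE TABLE users (
    id      TEXT PRIMARY KEY,
    name    TEXT,
    address TEXT
);
CREATE TABLE tweets (
    id         TEXT PRIMARY KEY,
    content    TEXT,
    created_by TEXT REFERENCES users(id),
    created_at TEXT,               -- ISO 8601 timestamp
    updated_at TEXT,
    like_count INTEGER DEFAULT 0
);
CREATE TABLE comments (
    id         TEXT PRIMARY KEY,
    tweet_id   TEXT REFERENCES tweets(id),
    content    TEXT,
    created_by TEXT REFERENCES users(id),
    created_at TEXT,
    updated_at TEXT
);
CREATE TABLE follows (
    follower TEXT REFERENCES users(id),
    followee TEXT REFERENCES users(id),
    PRIMARY KEY (follower, followee)   -- assumption: one row per pair
);
"""


def create_db() -> sqlite3.Connection:
    """Create an in-memory database with the sketch schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def timeline(conn: sqlite3.Connection, user_id: str, limit: int = 20):
    """Tweets from accounts that user_id follows, newest first."""
    return conn.execute(
        """
        SELECT t.id, t.content, t.created_by, t.created_at
        FROM tweets AS t
        JOIN follows AS f ON f.followee = t.created_by
        WHERE f.follower = ?
        ORDER BY t.created_at DESC
        LIMIT ?
        """,
        (user_id, limit),
    ).fetchall()
```

Joining at read time keeps writes cheap but makes every timeline read a query over the follow table, which is one reason the notes pair it with a timeline cache.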


High-level design


I drew this in the high-level diagram.





Request flows

read flow

  • user -> load balancer -> server -> cache -> database (on a cache miss)

write flow

  • user -> load balancer -> server -> worker -> database and notification service -> user
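
The write flow above can be sketched with an in-process queue standing in for the hand-off between server and worker. Everything here is an assumption made for illustration: the `TweetService` class, its method names, the dictionaries that stand in for the database and the notification service, and the notification text.

```python
import queue
from collections import defaultdict


class TweetService:
    """Toy model of the asynchronous write path."""

    def __init__(self):
        self.pending = queue.Queue()             # server -> worker hand-off
        self.database = {}                       # tweet ID -> tweet (stand-in for the DB)
        self.followers = defaultdict(set)        # followee -> set of followers
        self.notifications = defaultdict(list)   # user ID -> messages

    def follow(self, follower, followee):
        self.followers[followee].add(follower)

    def post_tweet(self, tweet_id, user_id, content):
        """Server side: enqueue the write and return without touching the DB."""
        self.pending.put({"ID": tweet_id, "createdBy": user_id, "content": content})
        return "accepted"

    def run_worker_once(self):
        """Worker side: persist one queued tweet, then notify the author's followers."""
        try:
            tweet = self.pending.get_nowait()
        except queue.Empty:
            return False
        self.database[tweet["ID"]] = tweet
        for follower in self.followers[tweet["createdBy"]]:
            self.notifications[follower].append(f"{tweet['createdBy']} posted a new tweet")
        return True
```

Because `post_tweet` returns before the worker runs, a read that happens in between will not see the new tweet yet, which is the eventual consistency accepted in the non-functional requirements.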





Detailed component design







Trade offs/Tech choices

cache aside

  • users may see stale data until the cache TTL expires
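
The staleness window can be seen in a small cache-aside sketch. The `CacheAsideTimeline` class, the 60-second default TTL, and the injectable clock are assumptions made for illustration; a real deployment would more likely use an external cache than an in-process dictionary.

```python
import time


class CacheAsideTimeline:
    """Cache-aside read path: check the cache, fall through to the DB on a miss."""

    def __init__(self, load_from_db, ttl_seconds=60.0, clock=time.monotonic):
        self._load_from_db = load_from_db   # callable: user ID -> timeline
        self._ttl = ttl_seconds
        self._clock = clock                 # injectable so the TTL can be tested
        self._entries = {}                  # user ID -> (expires_at, timeline)

    def get(self, user_id):
        now = self._clock()
        entry = self._entries.get(user_id)
        if entry is not None and now < entry[0]:
            return entry[1]                 # cache hit, possibly stale
        # Miss or expired entry: read from the database and repopulate.
        timeline = self._load_from_db(user_id)
        self._entries[user_id] = (now + self._ttl, timeline)
        return timeline
```

A tweet written 30 seconds after the timeline was cached stays invisible to `get` until the entry expires; a shorter TTL narrows that window at the cost of more database reads.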






Failure scenarios/bottlenecks

worker failure

  • writes can't be applied to the database
  • the user must not lose their content

cache failure

  • reads can fall back to the database
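
The fallback can be sketched as a read function that treats a cache outage like a cache miss. `CacheUnavailable`, `read_timeline`, and the two callables passed in are hypothetical names for illustration, not part of any particular cache client.

```python
class CacheUnavailable(Exception):
    """Hypothetical error raised when the cache cannot be reached."""


def read_timeline(user_id, cache_get, db_get):
    """Return the timeline from the cache, or from the database if the cache fails or misses."""
    try:
        cached = cache_get(user_id)
    except CacheUnavailable:
        cached = None            # treat an outage like a miss
    if cached is not None:
        return cached
    return db_get(user_id)
```

While the cache is down every read reaches the database, so the database has to be able to absorb that extra load for the fallback to hold.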

cache bottlenecks

  • a high volume of requests hits the cache, and the stored data can grow too large for it


Future improvements
