System requirements


Functional:

send tweets

check feeds

add likes to tweets



Non-Functional:

availability

scalability

low latency



Capacity estimation

1 billion users

10% are daily active users (DAU), so 100 million DAU

each active user sends 2 tweets per day, so 200 million tweets per day

each user follows 1,000 users, so 1,000 billion (1 trillion) follow relationships

each active user likes 10 tweets per day, so 1 billion likes per day
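
The estimates above can be checked with a few lines of arithmetic. This is only a sketch: the inputs are the assumptions listed in this section, the follow count is taken as total users times follows per user, and the per-second averages are derived figures that ignore peak traffic.

```python
# Back-of-the-envelope capacity numbers; every input is an assumption
# taken from the list above, not a measured value.
TOTAL_USERS = 1_000_000_000
DAU_PERCENT = 10
TWEETS_PER_ACTIVE_USER_PER_DAY = 2
FOLLOWS_PER_USER = 1_000
LIKES_PER_ACTIVE_USER_PER_DAY = 10
SECONDS_PER_DAY = 24 * 60 * 60

daily_active_users = TOTAL_USERS * DAU_PERCENT // 100
tweets_per_day = daily_active_users * TWEETS_PER_ACTIVE_USER_PER_DAY
likes_per_day = daily_active_users * LIKES_PER_ACTIVE_USER_PER_DAY
follow_edges = TOTAL_USERS * FOLLOWS_PER_USER

# Average write rates, derived from the daily totals.
tweets_per_second = tweets_per_day / SECONDS_PER_DAY
likes_per_second = likes_per_day / SECONDS_PER_DAY

print(f"DAU:            {daily_active_users:,}")
print(f"tweets per day: {tweets_per_day:,}")
print(f"likes per day:  {likes_per_day:,}")
print(f"follow edges:   {follow_edges:,}")
print(f"avg tweets/s:   {tweets_per_second:,.0f}")
print(f"avg likes/s:    {likes_per_second:,.0f}")
```

On average this works out to roughly 2,300 tweet writes and 11,600 like writes per second, so likes are the heavier write path in this estimate.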



API design

POST v1/user_id/tweets/tweet_id (send a tweet)

GET v1/user_id/feeds (read the user's feed)

POST v1/user_id/tweet_id/is_liked (like a tweet)

POST v1/user_id/following_user_id/is_following (follow another user)
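
To make the behaviour behind these endpoints concrete, here is a minimal in-memory sketch. The class `TwitterService`, its method names, and the `Tweet` fields are hypothetical and chosen only for illustration. It also assumes the server generates the tweet ID and that a feed is simply the tweets of followed users, newest first.

```python
import itertools
from dataclasses import dataclass, field


@dataclass
class Tweet:
    tweet_id: int
    author_id: int
    text: str
    liked_by: set = field(default_factory=set)


class TwitterService:
    """Hypothetical in-memory stand-in for the handlers behind the API."""

    def __init__(self):
        self._next_id = itertools.count(1)
        self.tweets = {}     # tweet_id -> Tweet
        self.following = {}  # user_id -> set of followed user_ids

    def post_tweet(self, user_id, text):
        # POST v1/user_id/tweets/tweet_id (ID generated here: an assumption)
        tweet = Tweet(next(self._next_id), user_id, text)
        self.tweets[tweet.tweet_id] = tweet
        return tweet.tweet_id

    def get_feed(self, user_id):
        # GET v1/user_id/feeds
        followed = self.following.get(user_id, set())
        feed = [t for t in self.tweets.values() if t.author_id in followed]
        # IDs increase over time, so sorting by ID gives newest first.
        return sorted(feed, key=lambda t: t.tweet_id, reverse=True)

    def like(self, user_id, tweet_id):
        # POST v1/user_id/tweet_id/is_liked
        self.tweets[tweet_id].liked_by.add(user_id)

    def follow(self, user_id, following_user_id):
        # POST v1/user_id/following_user_id/is_following
        self.following.setdefault(user_id, set()).add(following_user_id)


service = TwitterService()
first = service.post_tweet(1, "hello")
second = service.post_tweet(1, "second tweet")
other = service.post_tweet(2, "not followed")
service.follow(3, 1)
service.like(3, first)
feed_ids = [t.tweet_id for t in service.get_feed(3)]
```

Scanning every tweet to build a feed is only workable at toy scale. It is used here to keep the sketch short.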




Database design

Use a graph database to store the follow relationships between users

Use a relational database to store user metadata and tweet metadata

Use a cache for popular tweets
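
A small sketch of how the two stores might be laid out. SQLite stands in for the relational database and a dictionary of sets stands in for the graph database. The table names, column names, and the `followers_of` helper are all assumptions for illustration, not a proposed production schema.

```python
import sqlite3
from collections import defaultdict

# Relational side: user metadata and tweet metadata.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (
    user_id INTEGER PRIMARY KEY,
    handle  TEXT NOT NULL UNIQUE
);
CREATE TABLE tweets (
    tweet_id   INTEGER PRIMARY KEY,
    author_id  INTEGER NOT NULL REFERENCES users(user_id),
    body       TEXT NOT NULL,
    like_count INTEGER NOT NULL DEFAULT 0
);
""")
conn.executemany(
    "INSERT INTO users (user_id, handle) VALUES (?, ?)",
    [(1, "alice"), (2, "bob")],
)
conn.execute(
    "INSERT INTO tweets (tweet_id, author_id, body) VALUES (?, ?, ?)",
    (10, 1, "hello"),
)
conn.execute(
    "UPDATE tweets SET like_count = like_count + 1 WHERE tweet_id = ?",
    (10,),
)
row = conn.execute(
    "SELECT body, like_count FROM tweets WHERE tweet_id = ?", (10,)
).fetchone()

# Graph side: edges point from follower to followed user.
follows = defaultdict(set)
follows[2].add(1)  # bob follows alice


def followers_of(user_id):
    """Return the users that follow user_id (reverse edge lookup)."""
    return {src for src, targets in follows.items() if user_id in targets}
```

A real graph store would index edges in both directions. The reverse lookup here scans every user and is only meant to show the shape of the query.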





High-level design

A request first passes through a load balancer before reaching an application server. For reads, the server checks the cache first and falls back to the database on a miss; static content is served from a CDN. For writes, the server writes to the database.
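
The read and write paths described above follow the cache-aside pattern, sketched below. The LRU eviction policy, the cache capacity, and invalidating the cache entry on write are assumptions added for the example. The design itself only says that reads check the cache first and writes go to the database.

```python
from collections import OrderedDict


class LRUCache:
    """Tiny least-recently-used cache; the eviction policy is an assumption."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._items = OrderedDict()

    def get(self, key):
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # evict least recently used

    def delete(self, key):
        self._items.pop(key, None)


database = {}            # stand-in for the tweet store
cache = LRUCache(capacity=2)
stats = {"db_reads": 0}  # counts how often reads fall through to the database


def read_tweet(tweet_id):
    cached = cache.get(tweet_id)
    if cached is not None:
        return cached
    stats["db_reads"] += 1
    value = database.get(tweet_id)
    if value is not None:
        cache.put(tweet_id, value)  # populate the cache on a miss
    return value


def write_tweet(tweet_id, text):
    database[tweet_id] = text
    cache.delete(tweet_id)  # assumption: invalidate so readers see new data


write_tweet(1, "hi")
first_read = read_tweet(1)   # miss: goes to the database
second_read = read_tweet(1)  # hit: served from the cache
```

Invalidating on write keeps the sketch simple. Updating the cache in place is another option, at the cost of caching tweets that may never be read.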



Request flows

Read: client -> load balancer -> application server -> cache. On a cache miss, the server reads from the database. Static content is fetched from the CDN.

Write: client -> load balancer -> application server -> database.
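
The first hop of every request is the load balancer. A minimal sketch of that hop follows, assuming a round-robin policy. The design does not name a balancing policy, and the class and server names here are hypothetical.

```python
import itertools


class RoundRobinBalancer:
    """Forwards each request to the next server in a fixed rotation."""

    def __init__(self, servers):
        self._servers = itertools.cycle(servers)

    def route(self, request):
        server = next(self._servers)
        return server(request)


def make_server(name):
    # Each "server" is just a function that reports who handled the request.
    def handle(request):
        return f"{name} handled {request}"
    return handle


balancer = RoundRobinBalancer([make_server("app-1"), make_server("app-2")])
responses = [balancer.route(f"GET feed #{i}") for i in range(3)]
```

With two servers, the three requests go to app-1, app-2, and app-1 again.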




















