System requirements
Functional:
send tweets
check feeds
add likes to tweets
Non-Functional:
availability
scalability
low latency
Capacity estimation
1 billion total users
10% are daily active users (DAU), so 100 million DAU
each active user sends 2 tweets per day, so 200 million tweets per day
each user follows 1,000 users, so 1,000 billion (1 trillion) follow relationships
each active user likes 10 tweets per day, so 1 billion likes per day
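The arithmetic above can be reproduced with a short script. This is a rough sketch: the constant names are made up for illustration, the per-second figures are plain daily averages (peak traffic would be higher), and the follow count assumes all 1 billion users follow 1,000 accounts each.

```python
# Back-of-the-envelope capacity estimation (illustrative names, not from the notes).
TOTAL_USERS = 1_000_000_000
DAU_PERCENT = 10                      # 10% of users are active daily
TWEETS_PER_ACTIVE_USER_PER_DAY = 2
FOLLOWS_PER_USER = 1_000              # assumption: applies to every user, not only DAU
LIKES_PER_ACTIVE_USER_PER_DAY = 10
SECONDS_PER_DAY = 24 * 60 * 60

dau = TOTAL_USERS * DAU_PERCENT // 100
tweets_per_day = dau * TWEETS_PER_ACTIVE_USER_PER_DAY
follow_edges = TOTAL_USERS * FOLLOWS_PER_USER
likes_per_day = dau * LIKES_PER_ACTIVE_USER_PER_DAY

# Average write rates; real peaks are typically a multiple of the average.
avg_tweet_writes_per_sec = tweets_per_day / SECONDS_PER_DAY
avg_like_writes_per_sec = likes_per_day / SECONDS_PER_DAY

print(f"DAU:            {dau:,}")
print(f"tweets/day:     {tweets_per_day:,}")
print(f"follow edges:   {follow_edges:,}")
print(f"likes/day:      {likes_per_day:,}")
print(f"avg tweets/sec: {avg_tweet_writes_per_sec:,.0f}")
print(f"avg likes/sec:  {avg_like_writes_per_sec:,.0f}")
```

On average this works out to roughly 2,300 tweet writes and 11,600 like writes per second, so likes dominate the write load.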
API design
POST v1/user_id/tweets/tweet_id
GET v1/user_id/feeds
POST v1/user_id/tweet_id/is_liked
POST v1/user_id/following_user_id/is_following
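To make the four endpoints concrete, here is a minimal in-memory sketch of the operations behind them. The class and method names are hypothetical, and the feed is assembled at read time from the followed users' tweets, which is only one possible approach; the notes do not say how feeds are built.

```python
from collections import defaultdict
from itertools import count


class InMemoryTwitter:
    """Toy stand-in for the service behind the four endpoints (hypothetical)."""

    def __init__(self):
        self._tweets = {}                          # tweet_id -> (author_id, text)
        self._tweets_by_user = defaultdict(list)   # author_id -> [tweet_id, ...]
        self._following = defaultdict(set)         # user_id -> {followed user_id}
        self._likes = defaultdict(set)             # tweet_id -> {user_id}
        self._next_id = count(1)                   # increasing ids double as ordering

    def post_tweet(self, user_id, text):
        """POST v1/user_id/tweets/tweet_id: store a tweet, return its id."""
        tweet_id = next(self._next_id)
        self._tweets[tweet_id] = (user_id, text)
        self._tweets_by_user[user_id].append(tweet_id)
        return tweet_id

    def follow(self, user_id, following_user_id):
        """POST v1/user_id/following_user_id/is_following."""
        self._following[user_id].add(following_user_id)

    def like(self, user_id, tweet_id):
        """POST v1/user_id/tweet_id/is_liked: idempotent, returns the like count."""
        if tweet_id not in self._tweets:
            raise KeyError(f"unknown tweet {tweet_id}")
        self._likes[tweet_id].add(user_id)
        return len(self._likes[tweet_id])

    def get_feed(self, user_id, limit=10):
        """GET v1/user_id/feeds: newest tweet ids from followed users."""
        ids = [
            tweet_id
            for followee in self._following[user_id]
            for tweet_id in self._tweets_by_user[followee]
        ]
        ids.sort(reverse=True)                     # larger id means newer tweet
        return ids[:limit]
```

Storing likes as a set per tweet makes a repeated like from the same user a no-op, which keeps the endpoint safe to retry.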
Database design
Use a graph database to store the follow relationships between users
Use a relational database to store user metadata and tweet metadata
Use a cache for popular tweets
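As an illustration of the relational part, the sketch below creates user and tweet tables in an in-memory SQLite database. The table and column names are assumptions made for this example, SQLite merely stands in for whatever relational database is chosen, and the graph store for follow relationships is not shown.

```python
import sqlite3

# Hypothetical schema for user and tweet metadata.
SCHEMA = """
CREATE TABLE users (
    user_id    INTEGER PRIMARY KEY,
    handle     TEXT NOT NULL UNIQUE,
    created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE tweets (
    tweet_id   INTEGER PRIMARY KEY,
    author_id  INTEGER NOT NULL REFERENCES users(user_id),
    body       TEXT NOT NULL,
    like_count INTEGER NOT NULL DEFAULT 0,
    created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
-- Supports "latest tweets by author" lookups.
CREATE INDEX idx_tweets_author ON tweets (author_id, tweet_id DESC);
"""


def open_db():
    """Create an in-memory database with the sketch schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


conn = open_db()
conn.execute("INSERT INTO users (user_id, handle) VALUES (1, 'alice')")
conn.execute("INSERT INTO tweets (author_id, body) VALUES (1, 'hello world')")
conn.execute("UPDATE tweets SET like_count = like_count + 1 WHERE tweet_id = 1")
row = conn.execute(
    "SELECT body, like_count FROM tweets WHERE author_id = ?", (1,)
).fetchone()
print(row)
```

Keeping a denormalized like_count on the tweet row avoids counting likes on every read, at the cost of an extra write per like.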
High-level design
A request first goes through a load balancer before reaching a server. For reads, the server checks the cache first and reads from the database only on a miss; static content is looked up in the CDN. For writes, the server writes to the database.
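The notes do not say how the load balancer picks a server. As one simple possibility, a round-robin policy can be sketched as follows; the class name and server labels are made up for illustration.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hands out servers in a fixed rotating order (illustrative only)."""

    def __init__(self, servers):
        servers = list(servers)
        if not servers:
            raise ValueError("at least one server is required")
        self._rotation = cycle(servers)

    def pick(self):
        """Return the next server in the rotation."""
        return next(self._rotation)


lb = RoundRobinBalancer(["app-1", "app-2", "app-3"])
picks = [lb.pick() for _ in range(4)]
print(picks)  # ['app-1', 'app-2', 'app-3', 'app-1']
```

Round robin ignores server load and health; a production balancer would usually add health checks and possibly a least-connections policy.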
Request flows
Read: client -> load balancer -> server -> cache; on a cache miss, the server reads from the database
Static content: looked up in the CDN
Write: client -> load balancer -> server -> database
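The read and write paths can be sketched with a cache-aside pattern. This is a minimal illustration using plain dictionaries in place of the cache and the database; the names are hypothetical, and invalidating the cached entry on write is an added assumption, since the notes only say that writes go to the database.

```python
class TweetStore:
    """Cache-aside sketch: dicts stand in for the cache and the database."""

    def __init__(self):
        self.db = {}        # tweet_id -> body (stand-in for the database)
        self.cache = {}     # tweet_id -> body (stand-in for the cache)
        self.db_reads = 0   # counts how often reads fall through to the database

    def write(self, tweet_id, body):
        """Write path: persist to the database, then drop any stale cache entry."""
        self.db[tweet_id] = body
        self.cache.pop(tweet_id, None)   # assumption: invalidate on write

    def read(self, tweet_id):
        """Read path: check the cache first, fall back to the database on a miss."""
        if tweet_id in self.cache:
            return self.cache[tweet_id]
        self.db_reads += 1
        body = self.db.get(tweet_id)
        if body is not None:
            self.cache[tweet_id] = body  # populate the cache for later reads
        return body


store = TweetStore()
store.write(1, "hi")
first = store.read(1)    # miss: served from the database, then cached
second = store.read(1)   # hit: served from the cache
print(first, second, store.db_reads)  # hi hi 1
```

Without the invalidation step, a read after an update could return the old body until the cached entry expired.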
Detailed component design
Dig deeper into 2-3 components and explain in detail how they work. For example, how well does each component scale? Any relevant algorithm or data structure you like to use for a component? Also you could draw a diagram using the diagramming tool to enhance your design...
Trade offs/Tech choices
Explain any trade offs you have made and why you made certain tech choices...
Failure scenarios/bottlenecks
Try to discuss as many failure scenarios/bottlenecks as possible.
Future improvements
What are some future improvements you would make? How would you mitigate the failure scenario(s) you described above?