Requirements
Functional Requirements:
- Allow users to tweet messages up to 140 characters.
- Enable users to follow other users.
- Allow users to like tweets from other users.
- Display tweets from followed users in the home feed.
- Show top K popular tweets in the home feed based on likes and followers.
Non-Functional Requirements:
- High availability
- Low latency for critical user interactions
- Scalability
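The "top K popular tweets" requirement can be illustrated with a small heap-based selection. This is only a sketch: the `TweetStats` type, the `popularity` function, and the follower weight of 0.001 are assumptions made for illustration, since the notes do not say how likes and followers should be combined.

```python
import heapq
from dataclasses import dataclass


@dataclass
class TweetStats:
    tweet_id: str
    likes: int
    author_followers: int


def popularity(t: TweetStats, follower_weight: float = 0.001) -> float:
    # Hypothetical score: likes dominate, the author's follower count adds a small boost.
    return t.likes + follower_weight * t.author_followers


def top_k(tweets, k):
    # heapq.nlargest runs in O(n log k), avoiding a full sort of all candidates.
    return heapq.nlargest(k, tweets, key=popularity)


sample = [
    TweetStats("t1", likes=10, author_followers=1_000),
    TweetStats("t2", likes=50, author_followers=200),
    TweetStats("t3", likes=5, author_followers=100_000),
]
print([t.tweet_id for t in top_k(sample, 2)])  # ['t3', 't2']
```

With this weighting, a tweet from an author with many followers can outrank a tweet with more likes, so the weight would need tuning against real data.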
Capacity Estimation
Let's say we have 100M daily active users.
Assume 5% of the users tweet, averaging 1 tweet per day each.
That gives 5M tweets posted per day.
Write throughput for tweets is 5M / (24*60*60) ≈ 58, or roughly 60 tweets per second.
Storage growth per day = 5M * 100 bytes per tweet = 500 MB = 0.5 GB per day.
Let's say each tweet receives on average 5 replies, 20 favorites, and 100 reads per day.
Replies throughput = 5*60 = 300 per second
Favorites throughput = 20*60 = 1200 per second
Reads throughput = 100 reads per tweet * 60 tweets per second = 6000 per second = 6k qps
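The estimates above can be reproduced with a few lines of arithmetic. This sketch uses only the figures stated in this section; the variable names are illustrative. Note that 100 reads per tweet at 60 tweets per second works out to 6,000 reads per second.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

daily_active_users = 100_000_000
tweets_per_day = daily_active_users * 5 // 100  # 5% of users, 1 tweet each
bytes_per_tweet = 100

exact_tweets_per_sec = tweets_per_day / SECONDS_PER_DAY
storage_gb_per_day = tweets_per_day * bytes_per_tweet / 1e9

# The per-tweet rates below use the rounded figure of 60 tweets per second.
tweets_per_sec = 60
replies_per_sec = 5 * tweets_per_sec
favorites_per_sec = 20 * tweets_per_sec
reads_per_sec = 100 * tweets_per_sec

print(f"tweets/day:         {tweets_per_day:,}")             # 5,000,000
print(f"tweets/sec (exact): {exact_tweets_per_sec:.1f}")     # 57.9
print(f"storage/day (GB):   {storage_gb_per_day}")           # 0.5
print(f"replies/sec:        {replies_per_sec}")              # 300
print(f"favorites/sec:      {favorites_per_sec}")            # 1200
print(f"reads/sec:          {reads_per_sec}")                # 6000
```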
API Design
Authentication:
/login
/signup
/verify
Tweet service:
/post
/{userId}/getTweets
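As a rough illustration of what the /post endpoint might check before storing a tweet, here is a minimal validation sketch. The function name `handle_post`, the request fields `userId` and `content`, and the status codes are assumptions; the notes only name the endpoint and the 140-character limit.

```python
from typing import Any, Dict, Tuple

MAX_TWEET_LENGTH = 140  # limit from the functional requirements


def handle_post(body: Dict[str, Any]) -> Tuple[int, Dict[str, Any]]:
    """Validate a hypothetical /post request body and return (status, payload)."""
    user_id = body.get("userId")
    content = body.get("content")
    if not isinstance(user_id, str) or not user_id:
        return 400, {"error": "userId is required"}
    if not isinstance(content, str) or not content.strip():
        return 400, {"error": "content is required"}
    if len(content) > MAX_TWEET_LENGTH:
        return 400, {"error": f"content exceeds {MAX_TWEET_LENGTH} characters"}
    # A real service would persist the tweet here; this sketch only echoes it back.
    return 201, {"userId": user_id, "content": content}


print(handle_post({"userId": "u1", "content": "hello"})[0])    # 201
print(handle_post({"userId": "u1", "content": "x" * 141})[0])  # 400
```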
High-Level Design
Database Design
Since we need high scalability and do not need transactional support, we are better off with NoSQL.
Within NoSQL, we need a database that scales easily and handles heavy read throughput well.
Options:
MongoDB - requires sharding to scale write throughput, but read throughput is very easy to scale. A single primary node should handle 60 tweets per second, but data storage scaling is a concern.
Amazon DynamoDB - scaling is managed by the cloud provider.
Cassandra - scales very easily, but is optimized more for write throughput.
If we have access to AWS, I would go with DynamoDB; otherwise we go with MongoDB and shard manually.
Let's go with MongoDB.
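To make the manual sharding idea concrete, here is a minimal sketch of application-level routing that maps a user ID to one of a fixed number of shards by hashing. The shard count of 4 and the function name `shard_for` are assumptions for illustration. MongoDB also offers hashed sharding natively, so hand-written routing like this would only be needed if the application manages shards itself.

```python
import hashlib

NUM_SHARDS = 4  # assumed shard count, for illustration only


def shard_for(user_id: str, num_shards: int = NUM_SHARDS) -> int:
    # Use a stable hash (not Python's built-in hash(), which varies per process)
    # so that every application server routes the same user to the same shard.
    digest = hashlib.md5(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


for uid in ["alice", "bob", "carol"]:
    print(uid, "-> shard", shard_for(uid))
```

One drawback of plain modulo routing is that changing the shard count remaps most keys; consistent hashing is a common way to reduce that movement.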
Query patterns:
Find tweets posted by a specific user
Get all followers of a user
Mark tweet X as favorited by user U
Get number of favorites
Collections:
User collection:
{
id,
name,
metadata : {},
following:[], // list of userIds
followers:[], // list of userIds
tweets:[] // list of tweetIds
}
Tweets collection:
{
id,
createdDate,
modifiedDate,
content,
likedBy:[] // list of userIds
}
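The query patterns can be checked against the two document shapes with a small in-memory model. This is a sketch using plain Python dictionaries in place of MongoDB collections; the helper function names are hypothetical, and only the field names come from the schemas above.

```python
from datetime import datetime, timezone

users = {}   # userId -> user document
tweets = {}  # tweetId -> tweet document


def add_user(user_id, name):
    users[user_id] = {"id": user_id, "name": name, "metadata": {},
                      "following": [], "followers": [], "tweets": []}


def follow(follower_id, followee_id):
    # Both sides of the relationship are updated to keep the two lists in sync.
    if followee_id not in users[follower_id]["following"]:
        users[follower_id]["following"].append(followee_id)
        users[followee_id]["followers"].append(follower_id)


def post_tweet(tweet_id, user_id, content):
    now = datetime.now(timezone.utc)
    tweets[tweet_id] = {"id": tweet_id, "createdDate": now, "modifiedDate": now,
                        "content": content, "likedBy": []}
    users[user_id]["tweets"].append(tweet_id)


def tweets_by_user(user_id):      # Find tweets posted by a specific user
    return [tweets[t] for t in users[user_id]["tweets"]]


def followers_of(user_id):        # Get all followers of a user
    return list(users[user_id]["followers"])


def favorite(tweet_id, user_id):  # Mark tweet X as favorited by user U
    if user_id not in tweets[tweet_id]["likedBy"]:
        tweets[tweet_id]["likedBy"].append(user_id)


def favorite_count(tweet_id):     # Get number of favorites
    return len(tweets[tweet_id]["likedBy"])


add_user("u1", "Ann")
add_user("u2", "Ben")
follow("u2", "u1")
post_tweet("t1", "u1", "hello world")
favorite("t1", "u2")
favorite("t1", "u2")  # a repeated favorite is ignored
print(followers_of("u1"), favorite_count("t1"))  # ['u2'] 1
```

One trade-off worth noting: arrays such as followers and likedBy grow without bound for popular accounts and tweets, and MongoDB caps a single document at 16 MB, so these lists may need to move into separate collections at scale.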
Detailed Component Design