System requirements
Functional:
- Post new tweet
- Comment on a tweet, and reply to comments
- Like/dislike tweet
- Follow someone
- Create/delete/login account, manage account profile.
- Support multimedia tweets: text, emoji, video, image
- Personal discovery page: ranked top tweets from followed accounts
- Ranking of comments
- Notification system for new tweets, new followers, new comments
- Private message between users.
Optional & Advanced:
- Identity verification
- Fraud account detection
- Fake news & suspicious activity detection and action
- Ads: personal profile, ads platform, recommendation system, billing, ....
- Hate speech and other violence detection.
- Web client, iOS client, Android client
- Support re-tweet
- Support sharing
- Support Hashtag
Non-Functional:
- High availability
- Eventual consistency
- High scalability
- Logging
- Performance monitoring metrics: latency, utilization, traffic monitoring, DAU/MAU
Capacity estimation
The design should serve billions of users globally, so scalability and regional deployment are key considerations.
Assumption: about 1% of users (celebrities) have very large follower counts and contribute most of the tweets. The workload is read-heavy, comment-heavy, and like-heavy.
API design
Request: POST twtr/v1/tweet
{tweet_content, user}
Response:
{tweet_id, status}
Error: too long; compliance_error
Request: POST twtr/v1/comment
{tweet_id, comment_content}
Response: {comment_id, status}
Error: too long; compliance_error
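The two error cases listed for tweets and comments can be sketched as a validation step that runs before the content is stored. This is a minimal sketch, not the actual service logic: the 280-character limit, the `BLOCKED_TERMS` set, and the names `ApiResult` and `validate_content` are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional

MAX_CONTENT_CHARS = 280            # assumed limit; the notes only say "too long"
BLOCKED_TERMS = {"blocked-term"}   # hypothetical stand-in for a compliance service


@dataclass
class ApiResult:
    status: str                    # "ok" or "error"
    error: Optional[str] = None    # "too_long" or "compliance_error"


def validate_content(content: str) -> ApiResult:
    """Check a tweet or comment body before it is stored."""
    if len(content) > MAX_CONTENT_CHARS:
        return ApiResult(status="error", error="too_long")
    lowered = content.lower()
    if any(term in lowered for term in BLOCKED_TERMS):
        return ApiResult(status="error", error="compliance_error")
    return ApiResult(status="ok")
```

In a real deployment the compliance check would likely be a call to a separate moderation service rather than a term list.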
Request: POST twtr/v1/like
{tweet_id, comment_id}
Request: PUT twtr/v1/follow
{follower_id, followee_id}
Request: GET twtr/v1/GetUpdates
{user_id}
Response: a list of tweets
Request: GET twtr/v1/GetUser
{user_id}
Response: details about the user: number of followers, number of followees, recent activities, status: followed or
Request: GET twtr/v1/GetTweet
{tweet_id}
Request: GET twtr/v1/Search
{keyword, user_id, hashtag}
Database design
We can use Amazon DynamoDB, which serves as both a key-value store and a document database. It is highly available, highly scalable, and easy to deploy globally.
We can also use PostgreSQL to store the user table.
Use Elasticsearch to index tweets and comments for keyword search.
Tables:
Tweets:
tweet_id unique primary key
tweet_content
Comment:
comment_id: primary key
tweet_id FK
commented_id: FK
Like:
user_id: FK
target_id: FK (comment_id or tweet_id)
target_type: enum (comment, tweet)
reaction: enum (like, dislike)
(user_id, target_id, target_type) unique
User:
user_id: primary key
phone_number
Following:
follower FK
followee FK
(follower, followee) unique
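The Following table and its uniqueness constraint can be sketched as runnable DDL. This is only an illustration: it uses Python's built-in `sqlite3` as a stand-in for whichever store is actually chosen, and the `follow` helper and the sample phone numbers are hypothetical.

```python
import sqlite3

# In-memory SQLite database, used here only as a stand-in for the real store.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE user (
    user_id      INTEGER PRIMARY KEY,
    phone_number TEXT
);
CREATE TABLE following (
    follower INTEGER NOT NULL REFERENCES user(user_id),
    followee INTEGER NOT NULL REFERENCES user(user_id),
    UNIQUE (follower, followee)
);
""")


def follow(follower: int, followee: int) -> bool:
    """Insert a follow edge.

    Returns False if the edge is a duplicate or references an unknown user.
    """
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO following (follower, followee) VALUES (?, ?)",
                (follower, followee),
            )
        return True
    except sqlite3.IntegrityError:
        return False


# Two sample users with made-up phone numbers.
conn.executemany(
    "INSERT INTO user (user_id, phone_number) VALUES (?, ?)",
    [(1, "555-0100"), (2, "555-0101")],
)
conn.commit()
```

The unique constraint makes a repeated follow request harmless: the second insert fails and no duplicate row is created.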
High-level design
Request flows
Detailed component design
GetUserDetail: periodically refresh follower counts for the top 1% of users. The counts can be stale for a while, which is acceptable because it avoids expensive DB aggregation on every read.
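One way to picture the stale-but-cheap follower count is a small cache with a time-to-live. Note that this sketch refreshes lazily on read, which is a simplification of the periodic background update described above. The class name `FollowerCountCache`, the `count_from_db` callback, and the injectable `clock` are assumptions for illustration.

```python
import time
from typing import Callable, Dict, Tuple


class FollowerCountCache:
    """Serve follower counts from memory and recompute only after the TTL expires."""

    def __init__(
        self,
        count_from_db: Callable[[str], int],   # hypothetical expensive aggregation
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._count_from_db = count_from_db
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[int, float]] = {}  # user_id -> (count, fetched_at)

    def get(self, user_id: str) -> int:
        now = self._clock()
        entry = self._entries.get(user_id)
        if entry is not None and now - entry[1] < self._ttl:
            return entry[0]                    # possibly stale, but cheap
        count = self._count_from_db(user_id)   # expensive path, taken rarely
        self._entries[user_id] = (count, now)
        return count
```

The TTL is the knob that trades freshness against database load.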
GetUpdates: offline (precomputed) calculation for the most active users; real-time calculation for users who read infrequently.
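The split between precomputed and on-demand timelines can be sketched in memory as follows. This is an illustration under assumptions: it precomputes by fanning out at post time rather than in an offline batch job, and the `TimelineService` class, its method names, and the fixed `active_users` set are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

Tweet = Tuple[int, str, str]  # (timestamp, author_id, text)


class TimelineService:
    def __init__(self, active_users: Set[str]) -> None:
        self.active_users = active_users
        self.followees: Dict[str, Set[str]] = defaultdict(set)
        self.followers: Dict[str, Set[str]] = defaultdict(set)
        self.tweets_by_author: Dict[str, List[Tweet]] = defaultdict(list)
        self.precomputed: Dict[str, List[Tweet]] = defaultdict(list)

    def follow(self, follower: str, followee: str) -> None:
        self.followees[follower].add(followee)
        self.followers[followee].add(follower)

    def post(self, author: str, timestamp: int, text: str) -> None:
        tweet = (timestamp, author, text)
        self.tweets_by_author[author].append(tweet)
        # Precompute timelines only for followers who read often.
        for follower in self.followers[author]:
            if follower in self.active_users:
                self.precomputed[follower].append(tweet)

    def get_updates(self, user: str, limit: int = 20) -> List[Tweet]:
        if user in self.active_users:
            candidates = self.precomputed[user]          # cheap read
        else:
            # Infrequent readers pay the merge cost at read time.
            candidates = [
                tweet
                for author in self.followees[user]
                for tweet in self.tweets_by_author[author]
            ]
        return sorted(candidates, reverse=True)[:limit]  # newest first
```

The sketch ignores tweets posted before a follow was created and users moving between the active and inactive groups; a real system would need to handle both.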
Ranking of tweets: many signals, such as number of views and number of likes and dislikes.
Personal signals: clustering based on similar users; personal correlation with hashtags; previous activity with the followee.
PostProcessTweet: use Elasticsearch for keyword search; add system labels; add signals and relationships to the social graph.
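As a rough illustration of how the global and personal signals might be combined, the sketch below uses a weighted sum. The weights, the log damping of raw counts, and the names `TweetSignals`, `rank_score`, and `rank` are all assumptions; a production ranker would more likely be a learned model.

```python
import math
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class TweetSignals:
    views: int
    likes: int
    dislikes: int
    hashtag_affinity: float    # assumed 0..1: viewer's correlation with the tweet's hashtags
    followee_affinity: float   # assumed 0..1: viewer's past activity with the author


def rank_score(
    s: TweetSignals,
    w_engagement: float = 1.0,  # illustrative weights, not tuned values
    w_hashtag: float = 2.0,
    w_followee: float = 3.0,
) -> float:
    net_likes = max(s.likes - s.dislikes, 0)
    # log1p damps very large counts so viral tweets do not swamp personal signals.
    engagement = math.log1p(s.views) + math.log1p(net_likes)
    return (
        w_engagement * engagement
        + w_hashtag * s.hashtag_affinity
        + w_followee * s.followee_affinity
    )


def rank(tweets: Dict[str, TweetSignals]) -> List[str]:
    """Return tweet ids ordered from highest to lowest score."""
    return sorted(tweets, key=lambda tweet_id: rank_score(tweets[tweet_id]), reverse=True)
```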
Likes:
Likes are counted periodically; the count does not need to be exact.
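Periodic counting can be sketched as buffering like events and folding them into the stored counters in batches. This is a minimal in-memory sketch: the `deque` stands in for a durable queue, the `counts` dict stands in for the database, and the `LikeAggregator` name and its methods are hypothetical.

```python
from collections import Counter, deque
from typing import Deque, Dict, Tuple

LikeEvent = Tuple[str, int]  # (tweet_id, delta): +1 for a like, -1 for an unlike


class LikeAggregator:
    def __init__(self) -> None:
        self.pending: Deque[LikeEvent] = deque()  # stand-in for a durable queue
        self.counts: Dict[str, int] = {}          # stand-in for stored counters

    def record(self, tweet_id: str, delta: int = 1) -> None:
        """Cheap write path: just enqueue the event."""
        self.pending.append((tweet_id, delta))

    def flush(self) -> int:
        """Fold all pending events into the counters; return how many were processed."""
        batch: Counter = Counter()
        processed = 0
        while self.pending:
            tweet_id, delta = self.pending.popleft()
            batch[tweet_id] += delta
            processed += 1
        # One counter update per tweet instead of one per like.
        for tweet_id, delta in batch.items():
            self.counts[tweet_id] = self.counts.get(tweet_id, 0) + delta
        return processed
```

Calling `flush` on a timer gives counts that lag by at most one interval, which matches the relaxed accuracy requirement.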
Notification:
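For users who have turned on notifications: use SSE (Server-Sent Events) to push new updates to web clients. SSE is one-way (server to client) over plain HTTP, which is all notifications need; WebSocket adds a bidirectional channel that is not required here, and long polling repeats request overhead for every update. On mobile, use Android or iOS system push notifications.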
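For reference, an SSE message is plain text made of `id:`, `event:`, and `data:` fields terminated by a blank line, sent on a response with content type `text/event-stream`. The helper below is a small sketch of that framing; the function name `format_sse` and the `new_tweet` event name are assumptions.

```python
import json
from typing import Optional


def format_sse(data: dict, event: Optional[str] = None, event_id: Optional[str] = None) -> str:
    """Serialize one notification in the text/event-stream wire format."""
    lines = []
    if event_id is not None:
        lines.append(f"id: {event_id}")
    if event is not None:
        lines.append(f"event: {event}")
    # Compact JSON never contains raw newlines, so a single data line is enough.
    payload = json.dumps(data, separators=(",", ":"))
    lines.append(f"data: {payload}")
    return "\n".join(lines) + "\n\n"  # the blank line ends the message
```

A browser client would receive these with `EventSource` and can resume after a disconnect using the last `id` it saw.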
Video processing:
Video understanding; add labels; generate different versions of each video for different network and client environments.
Trade offs/Tech choices
DynamoDB: flexible schema, well suited to document storage and processing; key-value storage; horizontally scalable; highly available
PostgreSQL: users table
Video/Image: S3, CDN
Cache: Redis, LRU eviction
Elasticsearch: index tweets and comments
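The LRU eviction named for the cache can be illustrated with a small in-process version. This is a sketch of the policy only, not of how Redis implements it (Redis is configured with an eviction policy rather than coded against); the `LRUCache` class is hypothetical.

```python
from collections import OrderedDict
from typing import Any, Hashable, Optional


class LRUCache:
    """Fixed-capacity cache that evicts the least recently used entry."""

    def __init__(self, capacity: int) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self._items: "OrderedDict[Hashable, Any]" = OrderedDict()

    def get(self, key: Hashable) -> Optional[Any]:
        if key not in self._items:
            return None
        self._items.move_to_end(key)         # mark as most recently used
        return self._items[key]

    def put(self, key: Hashable, value: Any) -> None:
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # drop the least recently used entry
```

LRU suits this workload because a small set of hot tweets receives most reads and therefore stays resident.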
Failure scenarios/bottlenecks
For top tweets, use caching and a CDN to make them available globally.
Many likes can arrive concurrently from around the world. We can publish like events to a Kafka topic and aggregate them asynchronously.
Hotspots (some very popular tweets): consistent hashing, Amazon Elastic Load Balancing to spread traffic across more servers, auto scaling, global replicas.
DB or server is down: fail over to replicas and recover.
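The consistent hashing mentioned for hotspots can be sketched as a hash ring with virtual nodes. This is a minimal sketch: the `ConsistentHashRing` class, the choice of SHA-256, and the default of 100 virtual nodes per server are assumptions for illustration.

```python
import bisect
import hashlib
from typing import Iterable, List, Tuple


class ConsistentHashRing:
    def __init__(self, nodes: Iterable[str], vnodes: int = 100) -> None:
        self.vnodes = vnodes                     # virtual nodes per server (assumed value)
        self._ring: List[Tuple[int, str]] = []   # sorted list of (position, node)
        for node in nodes:
            self.add_node(node)

    @staticmethod
    def _hash(key: str) -> int:
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def add_node(self, node: str) -> None:
        for i in range(self.vnodes):
            bisect.insort(self._ring, (self._hash(f"{node}#{i}"), node))

    def remove_node(self, node: str) -> None:
        self._ring = [(pos, n) for pos, n in self._ring if n != node]

    def get_node(self, key: str) -> str:
        if not self._ring:
            raise LookupError("ring is empty")
        # First ring position clockwise from the key's hash, wrapping around.
        idx = bisect.bisect_right(self._ring, (self._hash(key), ""))
        return self._ring[idx % len(self._ring)][1]
```

When a server is added or removed, only the keys adjacent to its ring positions move, so most cached tweets stay where they were. Consistent hashing alone does not relieve a single extremely hot key; that case still relies on caching and replicas.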
Future improvements