Design Twitter - System Design

System requirements

Functional:

Post new tweet
Add comments to tweets, including comments on comments
Like/dislike tweet
Follow someone
Create/delete/login account, manage account profile.
Support multimedia tweets: text, emoji, video, image
Personal discovery page: ranking of top tweets from followed accounts
Ranking of comments
Notification system for new tweet, new follower, new comments
Private message between users.

Optional & Advanced:

Identity verification
Fraud account detection
Fake news & suspicious activity detection and action
Ads: personal profile, ads platform, recommendation system, billing, ....
Hate speech and other violence detection.
Web client, iOS client, Android client
Support re-tweet
Support sharing
Support Hashtag

Non-Functional:

High availability
Eventual consistency
High scalability
Logging
Performance monitoring metrics: latency, utilization, traffic monitoring, DAU/MAU

Capacity estimation

The design should serve on the order of a billion users globally, so scalability and multi-region deployment must be considered.

Assumption: 1% of users (the celebrities) have a lot of followers and contribute most of the tweets. The workload on their tweets is read-heavy, comment-heavy, and like-heavy.
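A rough back-of-envelope calculation helps size the system. Only the billion-user scale and the 1% share come from the notes above; the daily-active ratio, tweets per user, and tweet size below are placeholder assumptions to replace with real numbers.

```python
# Back-of-envelope capacity sketch. Inputs marked ASSUMED are placeholders,
# not measured values.
TOTAL_USERS = 1_000_000_000          # from the requirement: about a billion users
HEAVY_ACCOUNT_PERCENT = 1            # from the assumption: 1% have many followers
DAILY_ACTIVE_PERCENT = 20            # ASSUMED
TWEETS_PER_ACTIVE_USER_PER_DAY = 2   # ASSUMED
BYTES_PER_TWEET = 1_000              # ASSUMED: text plus metadata, no media
SECONDS_PER_DAY = 86_400


def estimate():
    """Return rough daily volume figures derived from the inputs above."""
    heavy_accounts = TOTAL_USERS * HEAVY_ACCOUNT_PERCENT // 100
    daily_active = TOTAL_USERS * DAILY_ACTIVE_PERCENT // 100
    tweets_per_day = daily_active * TWEETS_PER_ACTIVE_USER_PER_DAY
    return {
        "heavy_accounts": heavy_accounts,
        "tweets_per_day": tweets_per_day,
        "avg_write_qps": tweets_per_day / SECONDS_PER_DAY,
        "text_storage_gb_per_day": tweets_per_day * BYTES_PER_TWEET / 1e9,
    }


if __name__ == "__main__":
    for name, value in estimate().items():
        print(f"{name}: {value:,.0f}")
```

Peak traffic is usually a multiple of the average, so the write QPS here is a lower bound for provisioning.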

API design

Request: POST twtr/v1/tweet

{tweet_content, user}

Response:

{tweet_id, status}

Error: too long; compliance_error
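The two error cases can be sketched as a validation step that runs before the tweet is stored. The 280-character limit, the `too_long` error code spelling, the `created`/`rejected` status values, and the blocked-term set (a stand-in for a real compliance service) are all assumptions made for illustration.

```python
import uuid
from typing import Optional

MAX_TWEET_LENGTH = 280            # ASSUMED limit
BLOCKED_TERMS = {"blocked-term"}  # hypothetical stand-in for a compliance check


def validate_tweet(tweet_content: str) -> Optional[str]:
    """Return an error code, or None when the tweet is acceptable."""
    if len(tweet_content) > MAX_TWEET_LENGTH:
        return "too_long"
    lowered = tweet_content.lower()
    if any(term in lowered for term in BLOCKED_TERMS):
        return "compliance_error"
    return None


def handle_post_tweet(user: str, tweet_content: str) -> dict:
    """Build the {tweet_id, status} response for a post-tweet request."""
    error = validate_tweet(tweet_content)
    if error is not None:
        return {"tweet_id": None, "status": "rejected", "error": error}
    # A real service would persist the tweet here; the sketch only assigns an ID.
    return {"tweet_id": uuid.uuid4().hex, "status": "created", "error": None}
```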

Request: POST twtr/v1/comment

{tweet_id, comment_content}

Response: {comment_id, status}

Error: too long; compliance_error

Request: POST twtr/v1/like

{tweet_id, comment_id}

Request: PUT twtr/v1/follow

{follower_id, followee_id}

Request: GET twtr/v1/GetUpdates

{user_id}

Response: a list of tweets

Request: GET twtr/v1/GetUser

{user_id}

Response: details about the user: number of followers, number of followees, recent activities, status: followed or

Request: GET twtr/v1/GetTweet {tweet_id}

Request: GET twtr/v1/Search {keyword, user_id, hashtag}

Database design

We can use Amazon DynamoDB, which works as both a key-value store and a document database. It is highly available, highly scalable, and easy to deploy globally.

We can also use PostgreSQL to store the user table.

Use Elasticsearch to index tweets and comments for keyword search.

Tables:

Tweets:

tweet_id unique primary key

tweet_content

Comment:

comment_id: primary key

tweet_id FK

commented_id: FK (the parent comment, when this is a comment on a comment)

Like:

comment_id or tweet_id: primary key, FK

enum_type: like, dislike

enum: comment, tweet (which kind of ID the key refers to)

User:

user_id:

phone number

Following:

follower FK

followee FK

(follower, followee) unique
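The Following table and its uniqueness rule can be sketched with SQLite from the Python standard library. SQLite is only a convenient stand-in here, and the `users`/`following` table names and the `follow` helper are assumptions for illustration; the same constraint carries over to PostgreSQL.

```python
import sqlite3

SCHEMA = """
CREATE TABLE users (
    user_id      INTEGER PRIMARY KEY,
    phone_number TEXT
);
CREATE TABLE following (
    follower INTEGER NOT NULL REFERENCES users(user_id),
    followee INTEGER NOT NULL REFERENCES users(user_id),
    PRIMARY KEY (follower, followee)   -- enforces (follower, followee) unique
);
"""


def open_db() -> sqlite3.Connection:
    """Create an in-memory database with the sketch schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def follow(conn: sqlite3.Connection, follower: int, followee: int) -> bool:
    """Insert a follow edge. Returns False when the edge already exists."""
    cursor = conn.execute(
        "INSERT OR IGNORE INTO following (follower, followee) VALUES (?, ?)",
        (follower, followee),
    )
    conn.commit()
    return cursor.rowcount == 1
```

Because a repeated follow is ignored rather than rejected, the follow request stays idempotent.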

High-level design


Request flows


Detailed component design

GetUserDetail: periodically refresh follower counts for the top 1% of users. The counts can be stale for some time, which is acceptable because it avoids expensive DB aggregation.
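One way to sketch the stale-but-cheap follower count is a small cache that reuses a stored value until it is older than a refresh interval. The class name, the five-minute default, and the refresh-on-read behavior (standing in for a scheduled background job) are assumptions for illustration.

```python
import time
from typing import Callable, Dict, Tuple


class CachedFollowerCount:
    """Serve follower counts from memory, recomputing only when they are stale."""

    def __init__(
        self,
        count_fn: Callable[[str], int],      # the expensive DB aggregation
        refresh_seconds: float = 300.0,      # ASSUMED staleness budget
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._count_fn = count_fn
        self._refresh_seconds = refresh_seconds
        self._clock = clock
        self._cache: Dict[str, Tuple[int, float]] = {}  # user_id -> (count, fetched_at)

    def get(self, user_id: str) -> int:
        now = self._clock()
        cached = self._cache.get(user_id)
        if cached is not None and now - cached[1] < self._refresh_seconds:
            return cached[0]                 # possibly stale, but cheap
        count = self._count_fn(user_id)      # recompute at most once per interval
        self._cache[user_id] = (count, now)
        return count
```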

GetUpdates: offline calculation for the most active users; real-time calculation for users who don't read often.
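The split between precomputed and on-demand feeds can be sketched in memory as follows. The `FeedService` class and its method names are hypothetical, and the write-time precompute stands in for the offline job; ordering is by timestamp only, without ranking.

```python
from collections import defaultdict


class FeedService:
    """Hybrid feed: precomputed for active users, built on read for the rest."""

    def __init__(self, active_users):
        self.active_users = set(active_users)       # users with a precomputed feed
        self.followees = defaultdict(set)           # user -> accounts they follow
        self.followers = defaultdict(set)           # user -> accounts following them
        self.tweets_by_author = defaultdict(list)   # author -> [(timestamp, tweet_id)]
        self.precomputed = defaultdict(list)        # active user -> [(timestamp, tweet_id)]

    def follow(self, follower, followee):
        self.followees[follower].add(followee)
        self.followers[followee].add(follower)

    def post_tweet(self, author, tweet_id, timestamp):
        self.tweets_by_author[author].append((timestamp, tweet_id))
        # Push the tweet only into the feeds of active followers.
        for follower in self.followers[author]:
            if follower in self.active_users:
                self.precomputed[follower].append((timestamp, tweet_id))

    def get_updates(self, user_id, limit=20):
        if user_id in self.active_users:
            candidates = self.precomputed[user_id]
        else:
            # Infrequent readers: gather followees' tweets at request time.
            candidates = [
                entry
                for author in self.followees[user_id]
                for entry in self.tweets_by_author[author]
            ]
        newest_first = sorted(candidates, reverse=True)
        return [tweet_id for _, tweet_id in newest_first[:limit]]
```

Precomputing for every follower of a celebrity account is expensive, which is why only frequent readers get a stored feed in this sketch.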

Ranking of tweets uses many signals: number of views, number of likes and dislikes.

Personal signals: clustering based on similar users; personal correlation with hashtags; previous activity with the followee.
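A minimal scoring sketch shows how these signals could be combined into one number. The weights, the log damping, and the 0 to 1 affinity scales are assumptions, not tuned values, and the similar-user clustering signal is left out for brevity.

```python
import math
from dataclasses import dataclass


@dataclass
class TweetSignals:
    views: int
    likes: int
    dislikes: int
    hashtag_affinity: float   # 0..1, viewer's correlation with the tweet's hashtags
    followee_affinity: float  # 0..1, viewer's past activity with the author


# ASSUMED weights for illustration only.
W_VIEWS, W_LIKES, W_DISLIKES = 0.5, 1.0, 1.0
W_HASHTAG, W_FOLLOWEE = 2.0, 3.0


def score(signals: TweetSignals) -> float:
    """Combine global engagement and personal signals into one score."""
    engagement = (
        W_VIEWS * math.log1p(signals.views)
        + W_LIKES * math.log1p(signals.likes)
        - W_DISLIKES * math.log1p(signals.dislikes)
    )
    personal = (
        W_HASHTAG * signals.hashtag_affinity
        + W_FOLLOWEE * signals.followee_affinity
    )
    return engagement + personal


def rank(candidates: dict) -> list:
    """Return tweet IDs ordered from highest to lowest score."""
    return sorted(candidates, key=lambda tweet_id: score(candidates[tweet_id]), reverse=True)
```

The log damping keeps a tweet with millions of views from drowning out the personal signals entirely.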

PostProcessTweet: index the tweet in Elasticsearch for keyword search; add system labels; add signals and relationships to the social graph.

Likes:

Likes are counted periodically. The count does not need to be exact.
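Periodic counting can be sketched as buffering events and folding them into totals in batches. The in-memory list stands in for a message queue and the `Counter` for stored counts; the `LikeAggregator` name and its methods are hypothetical.

```python
from collections import Counter


class LikeAggregator:
    """Buffer like/dislike events and apply them to totals in batches."""

    VALID_REACTIONS = ("like", "dislike")

    def __init__(self):
        self.pending = []        # stand-in for queued events
        self.totals = Counter()  # (target_id, reaction) -> count

    def record(self, target_id, reaction):
        if reaction not in self.VALID_REACTIONS:
            raise ValueError(f"unknown reaction: {reaction}")
        self.pending.append((target_id, reaction))

    def flush(self):
        """Fold all pending events into the totals; run this on a timer."""
        batch, self.pending = self.pending, []
        self.totals.update(batch)
        return len(batch)

    def count(self, target_id, reaction="like"):
        # Reads only see events that were already flushed, so they can lag.
        return self.totals[(target_id, reaction)]
```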

Notification:

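For users who turned on notifications: use SSE (Server-Sent Events) to push new updates to web clients. SSE is one-way (server to client) over plain HTTP, which is all notifications need. WebSocket adds bidirectional machinery that is not needed here, and long polling pays for a new request on every update. For mobile clients, use Android or iOS system push notifications.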

Video process:

Video understanding; add labels; transcode each video into different versions for different network/client environments.

Trade offs/Tech choices

DynamoDB: flexible schema; well suited to document storage and processing; key-value storage; horizontally scalable; highly available.

PostgreSQL: users table

Video/Image: S3, CDN

Cache: Redis, LRU eviction
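LRU eviction drops the entry that has gone unused the longest once the cache is full. Redis handles eviction itself (through its `maxmemory-policy` setting); the class below is only an in-process sketch of the policy, with a hypothetical `LRUCache` name.

```python
from collections import OrderedDict


class LRUCache:
    """Fixed-capacity cache that evicts the least recently used key."""

    def __init__(self, capacity: int) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self._items = OrderedDict()  # oldest entry first, newest last

    def get(self, key):
        if key not in self._items:
            return None
        self._items.move_to_end(key)      # mark as most recently used
        return self._items[key]

    def put(self, key, value) -> None:
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # evict the least recently used entry
```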

Elasticsearch: index tweets and comments

Failure scenarios/bottlenecks

For top tweets, use the cache and CDN to make them available globally.

There could be many concurrent likes globally. We can publish the like events to Kafka and aggregate them asynchronously.

Hotspot (some very popular tweets): use consistent hashing to spread keys across servers, put the servers behind Amazon Elastic Load Balancing with auto scaling to add capacity, and keep global replicas.
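Consistent hashing maps keys and servers onto the same ring, so adding a server moves only the keys that now fall to it. The sketch below assumes SHA-256 for hashing and 100 virtual nodes per server; the class and node names are hypothetical. It spreads keys evenly but does not by itself relieve a single hot key, which is where replicas and caching come in.

```python
import bisect
import hashlib


class ConsistentHashRing:
    """Hash ring with virtual nodes for smoother key distribution."""

    def __init__(self, nodes=(), vnodes: int = 100) -> None:
        self.vnodes = vnodes
        self._ring = []  # sorted list of (hash, node)
        for node in nodes:
            self.add_node(node)

    @staticmethod
    def _hash(key: str) -> int:
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def add_node(self, node: str) -> None:
        for i in range(self.vnodes):
            bisect.insort(self._ring, (self._hash(f"{node}#{i}"), node))

    def remove_node(self, node: str) -> None:
        self._ring = [(h, n) for h, n in self._ring if n != node]

    def get_node(self, key: str) -> str:
        if not self._ring:
            raise LookupError("ring has no nodes")
        # First ring position at or after the key's hash, wrapping around.
        index = bisect.bisect_left(self._ring, (self._hash(key), ""))
        if index == len(self._ring):
            index = 0
        return self._ring[index][1]
```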

DB or server is down: fail over to replicas, then recover the failed node.

Future improvements
