Design Twitter - System Design

System requirements

Functional:

post tweet
follow user
share tweet
like tweet
tweets feed

Non-Functional:

highly available
highly scalable
low latency

Capacity estimation

10M DAU

10 M tweets created per day

each tweet 100 bytes

10^3 MB = 1GB data

5 years ~= 200GB to store tweets

1M favorite tweet per day, 16 bytes to record userid + tweeid

16 MB per day

365 * 16 ~= 5 GB per year to store favorite tweets

5 year ~= 30 GB

API design

POST api/relationship/follow

request:

{

auth-header,

body: {

followeeId

}

GET api/relationship/followers

response {

[{userId, name}]

}

GET api/relationship/followee

response {

[{userId, name}]

}

POST api/tweets

request:

{

auth-header,

body: {

content,

media

}

response:

200

{

tweetId,

tweet: {

content

likes: 0,

}

GET api/tweets/{userId}/?page=x&offeset=y

response 200

{

[

tweetId,

tweet:

{content, likes, shares}

]

}

POST api/tweets/like

request {

auth header,

tweetId

}

response 200

{

tweetId,

liked,

likes

}

GET api/tweets/share?tweetId=xxx

response

{

shareUrl

}

GET api/tweets/feed

response

{

[

{

tweetId,

tweet: {

content,

likes,

}

]

}

Database design

Friends Table

| User Id | Follower Id | Created At

User Tweets Table

Liked/Shared Tweets Table

TweetId | UserId | Created At

User Feed Table

UserId | TweetId | Content | Created

High-level design

We use microservice structure to so we could scale individual service separately and separate concerns. It also has better fault tolerant as single service down won't affect the whole system.

Top level, we could have three servives.

Tweet Service is for posting tweet, share or like tweet. Fetch list of tweets (for single user or tweets user liked).

Feed Service is for fetching user tweets feed.

Friend Service is for follow & unfollow users.

On DB level,

Tweets Table where we store all user's tweets.
Liked Tweets to store all tweet ids a user liked.
Tweets Metadata Table to store tweet info of likes count and shared count.
User Tweets Feed Table to store feed tweets
Friends Table is to store user and followers

Request flows

Create Tweet and retrieve tweets

When user creates, they post tweet with content to backend, the tweets service will record tweet in the Tweets Table. And to retrieve tweets, we return user tweets in pagination.

When we store the tweet in Tweets Table, the tweet will also fan out to all followers' and record in user tweets feed table.

For Like Tweet

User clicks like button and it's reflected on the UI part immediately. We will insert new record in the Likes Tweet Table and asynchronously update the Tweet Table's likes count.

For Share Tweet

The process is similar to like Tweet. For the share url or share to app, we could either have fixed url format in app or have server generated url, depending wether we need to track source and the access control.

For follow & unfollow

User could send follow/unfollow request to friends service. Would add/remove record to the Friends DB.

For user tweets feed

We retrieve feed from the user tweets feed DB.

Detailed component design

For Friends Table, we could use SQL DB (e.g PostgreSQL). Even we don't require ACID to store friend relationship record, SQL is still useful when we try to perform some complex query. It also has mature ecosystem.

For User Tweets Table and User Tweets Feed Table, it needs high write and read output. We could use NoSQL DB like Cassandra, use UserId as partition key and Tweet Id as cluster key. We could use timestamp + UUID to composite the Tweet Id, this way the tweets for each users are sorted by time.

For Liked/Shared Tweets Table, we could use Dynamo DB, primary key as Tweet Id as Parition Key + (Create_At, User Id) as Sort Key.

For Cache, we could Redis.

Friends relationship data we could use Redis Set.

User Tweets and Feeds could use Zset.

For User Tweet Fan out, since the user's followers list could be big, we could use MQ. The message is basically {userId, tweetId, content} and consumer would get the message and add the tweet to each follower's Feed Table. If the follower list is long, we could split the task, set up another MQ where the msg would be {userId, tweetId, content, followerIds: []}, each task would contain 100 follower ids and gradually update feed for all followers.

Trade offs/Tech choices

We choose SQL DB (PostgreSQL) to store the friends relationship as we assume the write is relatively low on Friends Table. We could also track DAU and friends data easily. But if we want to scale to support more users, we need to shard and manipulate manually.

For Tweets related table, we use NoSQL DB to support high read & write volume. They suport scale by nature. It also means we need to accept eventual consistency.

For Tweets Table & User Feed Tweets Table, we use user id as parition key, it makes it easier to fetch all feeds/tweets for a user, but could also lead to hot partition. We need to mark the user as celebrity and divide their tweets into multiple paritions. Each parition has a new partition key as UserId_ShardX and when query, we dynamically add shards depending on the user's tweets count in the newest parition.

For the Tweet Feeds, we use a push model where we store user's feeds and update whenever followee post new tweets. But the process would be very time consuming if the user has a lot of followers. We could combine push and pull model for this case. For normal users, they push the new tweets to follower's feed table. For celebrity, user needs to pull from their tweets.

Failure scenarios/bottlenecks

DB could go down. For SQL DB, we could add replicas, and use slave to catch up and upgrade to new master when master is down.

For NoSQL DB, data is replicated to multiple nodes by nature.

Redis cache could also be broken, we could also add replicas to increase fault tolerant.

For hot partition issue in Tweet Table and celebrity fan out tweets issue, it has been covered in Trade offs/Tech choices.

Future improvements

We haven't covered case when user unfollow someone, for that case, we would also need update User Feed Table, but that would be an async process. We could use some client filter while backend is hiding the unfollowed person's tweets.
For disaster recovery, we could have multiple data centers to ensure we could restore from backups.
We could add monitor system includes:
- Performance Monitoring (response time, request rate, and throughput)
- Error Monitoring (API error rates, service availability, and exception logs)
- Resource Monitoring (CPU, memory, disk, and bandwidth utilization)