Design Twitter - System Design

System requirements

Functional:

1 User can post tweets on their on timeline

2 User can view other users' tweets in a generated news feed, or other users' home page

3 Users can follow/unfollow each other

4 Users can like tweets

5 User should be able to search tweets

Non-Functional:

1 The application should be highly available and the latency should be minimized as much as possible

2 Eventual consistency should be sufficient for tweet data, since we will prioritize availability over consistency

3 Scalability, the application should be easy to scale up and down to meet the change of daily active user.

4 Fault tolerance.

Capacity estimation

Assume 10 million daily active users, each users post 10 tweets per day on average, and the read to write ratio is 10 to 1.

Write QPS: 100*10^6/24/3600 = 1157

Read QPS: 10*1157 = 11570

Storage required for storing tweets for 10 years (assume each tweet is 100 bytes): 100*10*365*10^7*10/(10^12) = 36.5 TB

API design

1 Post tweet

[POST] /tweet

with JSON body about the post/user information, the end point will return the newly posted tweet's unique id on success.

2 Get tweet

[GET] /tweet/tweet_id

return the detail of a given tweet in JSON format.

3 Get tweets

[GET] /tweets?user_id={0}&max_result={1}&tweet_user_id={2}

return the timeline for a given user with results pagination.

4 Like tweet

[POST] /tweet/like?tweet_id={0}&user_id={1}&tweet_user_id={2}

5 Follow/unfollow user

[POST] /users/follow?user_id={0}&followed_user_id={1}

[POST] /users/unfollow?user_id={0}&followed_user_id={1}

6 Search tweets

[POST] /tweets/search

with payload about searched term, searched user id, sort order, pagination, etc

returns a list of matched tweet ids

Database design

Choose relational database for user data since it's easier to maintain the data integraty.

Choose column based no-sql data for tweet data, because it suits our need for scalability and large write QPS.

Choose a key value storage for storing the indexed file generated for searching tweets.

Tweet:

tweet_ID: primary key
created_by: user ID, index
posted_time: index
content
medialink:
number_of_likes

User:

user_ID: primary key
other personal information

High-level design

A hybrid approach for generating the news feed: for average users with not so many followers, push model was used since pre-computed news feed minimizes the latency while the follower number is not large enough to cause a hotkey issue. For celebrity users (users with many followers), pull model was used to allow their follower pull their tweets on demand without causing a system overload.

The posted tweets will be fanned out to for generating newfeeds, as well as for generating inverted index documents to serve tweet search functionality. Sharded counters are applied to handle the tweet like count.

Request flows

User post a tweet:

1) The load balancer routes the request to available instances.

2) The post service instance will write the tweet record to the wide column database, as well as updating the tweet cache.

3) The indexing service instance will tokenize the tweet, going throught the map reduce process and convert it to inverted index documents. The documents will be stored for search tweet purpose.

4) For average user with push data model, The fanout service instance will relay the message of new tweet on the message queue. A fanout worker will later pick up the message, get the user graph and insert the new tweet into given timelines in the newsfeed cache.

User like a tweet:

1) The load balancer routes the request to available instances

2) The like count was updated in the shared counter.

3) The shared counter will save the like count into the wide column database periodically

User search a tweet/timeline, follow/unfollow users

The load balancer will route the request to available service instances, after looking into the cache, updating cache and database, the result will be returned to the users.

Detailed component design

For the hybrid aproach to generate newsfeed, the push model (fanout on write when someone post a tweet) can enable user fetch timeline in a near real time way with minimum latency, since the newsfeed is precomputed, but it will be a waste of resource to precompute the timeline for inactive users. Also for users with many followers precomputing the newsfeed for every follower is too expensive to be feasible. On the other hand, the pull model (fanout on read when someone requested the newsfeed timeline) will avoid the hotkey issue for celebrity users and the resources waste on inactive users, but it comes with slow timeline fetching and might not be appropriate for average user's experience. So pull model is used for handling inactive users or celebrity users.

A message queue layer was added in the newsfeed generating service part. The reason for it is that if we choose to push the update to every follower's timeline immediately when a new tweet is posted, it is easy to create a traffic spike and overload the services. An asynchronous layer can help smoot the traffic spike.

For cache and databases we can choose consistent hashing to make the data distribute as evenly as possible. For shared counters, the sharded key needs to be chosen carefully so we don't risk a hot spot issue for the incoming update requests.

Trade offs/Tech choices

We prioritized availability and low latency first, and chose eventual consistency over stronng consistency. Wide column DB was chosen to store the tweet data because it is easy to scale horizontally, and supports large write QPS well.

Message queue was added as a layer for generating new feed. The asynchronous communication can help smooth the traffic spike and reduce the chance to overwhelm the newsfeed generating services. It will take some time for the user to see the new timeline after the tweet posted due to this asynchronous layer, but it should be acceptable for user experience.

Failure scenarios/bottlenecks

To avoid uneven data distribution/hotkey issue, consistent hashing can also be used for our cache and database clusters.

Shared counter will need to be resharded, if the data distribution between shards are not even and cause a hot spot issue.

The service usage, services health needs to be carefully monitor, so a failed instance won't cause the service to be overloaded and we can decide to scale it up/down when needed.

Future improvements

Add user authentication and authorization, as well as content encryption and multi-media supports.

Data synchronization decision within /between data centers accross different regions.