System requirements
Functional:
entities: user, post (textual, image, video), comment, like
users can post tweets
users can follow other users
users can view tweets from all the other users that they are following
users can like the tweets they see
users can comment on the tweets they see
users can retweet
availability
scalability:
latency: 1-2 seconds end to end
Non-Functional:
consistency
search
recommendation
Capacity estimation
assume DAU: 1M
every active user tweet 1 time per day, so 1M tweets per day, 360M per year
text: average size of 256bytes
image: 10%, 256KB
video: 10%, 10MB
total size = 360M*(0.25 * 0.8 + 256 * 0.1 + 10*1024*0.1)KB = 378TB
with high scalability the system should be able to accommodate twice of the estimation, 720M tweets / 756TB a year
API design
user.Tweet(text, image, video, tweetId)
user.Follow(userId)
user.View(datetime)
user.Like(tweetId)
user.Comment(text, tweetId)
Database design
Both RDBMS and ObjectStorage should be used
RDBMS:
USER: userId, password, registeredDate, email, phone, avatar
OS:
USER: tweetId1, tweetId2, ... tweetIdN
TWEET: tweetId, createtime, text, imageURL, videoURL, retweetId, likedByList(userId1, userId2, ... userIdN), CommentList(<userId1, comment1>, ... <userIdN, commentN>)
USERFOLLOW: userId, followlist(userId1, userId2, .... userIdN)
High-level design
Both PC and mobile clients are supported. They will use the same APIs to access appserver
The appservers are responsible for all user logics. They read/write to RDBMS and OS, send update request to cache, and send mirroring request to CDN
RDBMS keeps user metadata
OS keeps all other data, including user following relationship, tweet information
Cache is in memory and key'ed by tweetId, uses LRU strategy.
CDN keeps a mirror of images and videos that user uploaded
Request flows
Tweet: client sends a request with his/her id, text, image, video or retweetId to appserver. Appserver will create a new entry in OS, persist the tweet into OS, get the new tweetId, and append to USER's tweetId list. If it has image or video the appserver will also send mirror request to CDNs
View: client sends a request with his/her id and a datetime. If the client is pulling the latest the datetime will be now. If it is pulling old tweets then the datetime will be the createtime of the oldest tweet it was showing. Appserver then lookup OS to get the followees' Ids, and get the tweets from all of them that is older than datetime. In the end appserver merge all the tweets into a timeline and send back to client. Client merge the timeline and contents pulled from CDN to generate readable tweets.
Follow/Like/Comment: similar procedure, that appserver update corresponding table with id and content from the client.
Detailed component design
The appserver is the most critical component is this system. It should be able to handle all 5 kinds of APIs. As such, it should be highly available and scalable. The appservers are stateless, so they can be scaled up vertically, with a load balancer ahead of them to dispatch workloads evenly. Tweeter is a read intensive system, with #read being about 10 times of #writes. To improve system performance a cache should be added to each appserver. The cache caches hot tweets that users accessed the most in memory to reduce the number of read appservers conduct to OS.
Trade offs/Tech choices
Tweeter does not guarantee high data consistency, meaning if some tweets missing occasionally it is acceptable. In this case if the content of a tweet cannot be read from OS correctly, appserver will continue with rest of the tweets that are available.
Failure scenarios/bottlenecks
OS storage has to be scalable. The best way to scale is scale vertically with sharding by userId. However, some of the top users may have a long tweet list. This could cause imbalance and hurt scalability.
Future improvements
To solve the data imbalance problem, a better sharding strategy should be introduced. For example, a server can be added to maintain the list of users and their shardId, and number of tweets. It will periodically recalculate the tweets in each server, and ask the OS to rebalance.