System requirements
Functional:
Users should be able to publish a new post.
Users should be able to read their friends' posts.
Users should be able to connect with each other.
Users should be able to like (thumbs up) and comment on a post.
Users should be able to upload photos.
Users should be able to register and log in.
Non-Functional:
The system should have low latency.
The system should be resilient and fault-tolerant.
The system should support high-throughput traffic.
The system should be scalable, so capacity can grow with traffic.
Capacity estimation
Assume the system is read heavy, with a write-to-read ratio of 1:10. If 1 million users are online in an average second and 1% of them write a new post, the write QPS is 10k/s. If each user has 10 friends online who read the post, the read QPS is 100k/s. Assuming one host can handle 1k connections, we need at least 10 hosts for writes and 100 hosts for reads.
Assume each write averages 1 MB, including any photo or video.
At 10k writes/s over 86,400 seconds per day, one day of writes needs about 864 TB of storage.
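The arithmetic above can be checked with a short sketch. The class and method names are hypothetical, the inputs are the assumptions stated in the text, and storage uses decimal units (1 TB = 1,000,000 MB).

```java
// Back-of-the-envelope capacity numbers. All inputs are assumptions from the text.
final class CapacityEstimate {
    // Writes per second: the fraction of online users posting in a given second.
    static long writeQps(long onlineUsers, double writerFraction) {
        return Math.round(onlineUsers * writerFraction);
    }

    // Reads per second, derived from the write rate and the reads-per-write ratio.
    static long readQps(long writeQps, int readsPerWrite) {
        return writeQps * readsPerWrite;
    }

    // Hosts needed if each host handles a fixed number of connections (ceiling division).
    static long hostsNeeded(long qps, long connectionsPerHost) {
        return (qps + connectionsPerHost - 1) / connectionsPerHost;
    }

    // Daily storage in TB, using decimal units (1 TB = 1,000,000 MB).
    static long dailyStorageTb(long writeQps, long mbPerWrite) {
        long secondsPerDay = 24L * 60 * 60; // 86,400
        long mbPerDay = writeQps * secondsPerDay * mbPerWrite;
        return mbPerDay / 1_000_000;
    }

    public static void main(String[] args) {
        long w = writeQps(1_000_000, 0.01);
        long r = readQps(w, 10);
        System.out.println("write QPS = " + w + ", read QPS = " + r);
        System.out.println("write hosts = " + hostsNeeded(w, 1_000)
                + ", read hosts = " + hostsNeeded(r, 1_000));
        System.out.println("storage per day (TB) = " + dailyStorageTb(w, 1));
    }
}
```

Changing any single assumption (for example the average payload size) shows quickly how sensitive the storage figure is to it.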
API design
a. API for a new user to register an account: public boolean register(User userInformation) throws UserAlreadyExistsException
b. API for a user to log in: public boolean login(username, password) throws UnauthenticatedException
c. API for a user to post a new tweet: public int postTweets(uuid userId, Tweet tweet);
d. API for a user to delete an existing tweet: public boolean deleteTweets(uuid userId, uuid tid);
e. API for a user to read friends' tweets: public List
f. API for a user to connect with a new friend: public boolean connect(uuid user1, uuid user2)
g. API for a user to disconnect from an existing friend: public boolean disconnect(uuid user1, uuid user2)
Database design
user table{
uuid userid(PK)
string username
string password
Set
string email
string phonenumber
string zipcode
long registeredTime
}
tweet table {
uuid tweetsid (PK)
uuid userid (FK)
string content
uuid photoid(FK)
uuid videoid(FK)
string comment
long createdTime
int forwardCount
}
file table {
uuid fileid(PK)
byte[] content
long uploadTime
}
High-level design
- Load balancer: balances traffic among hosts.
- API gateway: handles authentication, authorization, and throttling by username and by IP.
- Database: stores user information, tweet information, and files. For files, we can store the file location URL in the database and keep the actual file in a storage service such as S3 to save cost.
- Hosts: handle the traffic, including user requests, read requests, write requests, and batch jobs.
- Cache: for the most popular tweets, we can use caching to absorb the high volume of read requests.
- Hourly batch job: monitors tweet popularity so that we know the hourly and daily trends and which tweets have a high forward count. The batch job also updates the cache by writing the popular tweets into it.
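The text says only that the gateway throttles by username and by IP; it does not name an algorithm. As one possible illustration, here is a minimal fixed-window counter. The class name, the key format, and the limits are all assumptions, and a production gateway might prefer a token bucket or a sliding window.

```java
import java.util.HashMap;
import java.util.Map;

// Fixed-window throttle: at most maxRequests per key per window.
// The key can be a username or an IP address (hypothetical "user:..." / "ip:..." prefixes).
final class FixedWindowThrottle {
    private final int maxRequests;
    private final long windowMillis;
    // key -> {windowStartMillis, requestCountInWindow}
    private final Map<String, long[]> state = new HashMap<>();

    FixedWindowThrottle(int maxRequests, long windowMillis) {
        if (maxRequests < 1 || windowMillis < 1) {
            throw new IllegalArgumentException("limits must be positive");
        }
        this.maxRequests = maxRequests;
        this.windowMillis = windowMillis;
    }

    // Returns true if the request is allowed. The clock is passed in for testability.
    synchronized boolean allow(String key, long nowMillis) {
        long[] s = state.get(key);
        if (s == null || nowMillis - s[0] >= windowMillis) {
            state.put(key, new long[] {nowMillis, 1}); // start a new window
            return true;
        }
        if (s[1] < maxRequests) {
            s[1]++;
            return true;
        }
        return false; // over the limit for this window
    }
}
```

To throttle by both dimensions, the gateway would call `allow` once with the username key and once with the IP key and reject the request if either call returns false.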
Request flows
- A user sends a register or login request to the load balancer, which forwards it to the API gateway. The gateway calls the authentication/authorization function, which queries the user table. For registration, we check whether the user already exists; if so, we return a "user already exists" message and redirect to the login page. For login, we check the username and password against the database. If they are correct, we generate a JWT bearer token containing the user information and an expiration time. The user sends this bearer token with every API call, and the API gateway checks on each request that the token has not expired; if it has expired, we redirect the user to the login page.
- When a user sends a tweet, the request passes through the LB and API gateway to a host, which saves the tweet to the database. If the user also uploads photos or videos, the host sends the files to S3 and stores their URLs in the table.
- When a user sends a friend request, we notify the other user. If that user accepts, we update the user table by adding each user's uuid to the other's friend set. This should use WebSocket so that it happens in real time. If a user sends a message to another user, we should also use WebSocket.
- When a user deletes a tweet, we set the tweet's deletiondate to today, and a batch job later removes all deleted tweets at once.
- Our hourly batch job reads the forward count of each tweet, determines which tweets are the most popular, and writes them into the cache so that reads are served with lower latency.
- Our daily batch job runs every night and removes from the database the tweets whose deletiondate is today.
- When a user posts a new tweet, we notify all of that user's friends in real time using WebSocket.
- When a user reads tweets, we pull friends' most recent tweets in descending order of creation time, paginated with a limit of 10; if the user keeps reading, we load the next page.
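The paginated read in the last flow can be sketched as follows. This is an in-memory illustration under stated assumptions: the `FeedPager` and `Tweet` names are hypothetical, and the cursor is the createdTime of the last tweet on the previous page, whereas a real system would run the equivalent query against the database.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class FeedPager {
    // Minimal stand-in for a row of the tweet table.
    static final class Tweet {
        final String id;
        final long createdTime;

        Tweet(String id, long createdTime) {
            this.id = id;
            this.createdTime = createdTime;
        }
    }

    // Returns up to `limit` tweets strictly older than `beforeTime`, newest first.
    // Pass Long.MAX_VALUE as beforeTime to fetch the first page.
    static List<Tweet> page(List<Tweet> friendTweets, long beforeTime, int limit) {
        List<Tweet> sorted = new ArrayList<>(friendTweets);
        sorted.sort(Comparator.comparingLong((Tweet t) -> t.createdTime).reversed());
        List<Tweet> out = new ArrayList<>();
        for (Tweet t : sorted) {
            if (t.createdTime < beforeTime && out.size() < limit) {
                out.add(t);
            }
        }
        return out;
    }
}
```

A caller loads the first page with `page(tweets, Long.MAX_VALUE, 10)` and the next page by passing the createdTime of the last tweet it received. Because the cursor is strict, tweets sharing the exact same createdTime as the cursor would be skipped; a tie-breaker such as the tweet id would be needed to handle that case.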
Detailed component design
For the database, we can use a NoSQL database such as MongoDB for better performance and easier scaling. We can run a database cluster in which one primary instance handles write requests (new tweets, new friends) and replica instances serve reads. We can also add indexes on frequently queried columns and shard the database to scale out, splitting the data into multiple subsets.
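The text does not say how rows are assigned to shards. One simple possibility, shown here purely as an assumption, is to hash the user id and take it modulo the shard count, so that all of a user's data lands on the same shard.

```java
import java.util.UUID;

final class ShardRouter {
    // Maps a user id to a shard index in [0, shardCount).
    // floorMod keeps the result non-negative even when hashCode() is negative.
    static int shardFor(UUID userId, int shardCount) {
        return Math.floorMod(userId.hashCode(), shardCount);
    }
}
```

Modulo hashing is easy to reason about, but changing the shard count remaps most keys; consistent hashing is a common alternative when shards are added or removed often.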
For userid and tweet ids we should generate UUIDs, which are unique across a distributed system without any coordination between hosts. Note that version 1 UUIDs are built from a timestamp and the MAC address and are therefore predictable; if ids must be hard to guess, use random version 4 UUIDs instead.
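In Java, a random version 4 UUID is available from the standard library. The wrapper class below is a hypothetical name used only for illustration.

```java
import java.util.UUID;

final class IdGenerator {
    // UUID.randomUUID() returns a version 4 (randomly generated) UUID
    // backed by a cryptographically strong random number generator.
    static UUID newId() {
        return UUID.randomUUID();
    }
}
```

Each host can call `IdGenerator.newId()` independently for a new userid or tweet id, with no central sequence to coordinate.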
When a user logs in, we generate a JWT bearer token and send it back to the user. The user attaches the token to every request, and the API gateway checks its expiration date. If the token has expired but the user holds an unexpired refresh token, we generate a new bearer token, send it back, and let the request pass; otherwise we send a 401 response and the user has to log in again.
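The gateway's decision can be sketched as a small function. This covers only the expiry logic described above: the names are hypothetical, times are plain epoch values, and a real gateway would also have to verify the token signature before trusting any claim in it.

```java
final class TokenGate {
    enum Decision { ALLOW, REFRESH_AND_ALLOW, REJECT_401 }

    // accessExpiresAt and refreshExpiresAt are expiry instants on the same clock as `now`.
    static Decision check(long accessExpiresAt, long refreshExpiresAt, long now) {
        if (now < accessExpiresAt) {
            return Decision.ALLOW; // bearer token still valid
        }
        if (now < refreshExpiresAt) {
            return Decision.REFRESH_AND_ALLOW; // issue a new bearer token, let the request pass
        }
        return Decision.REJECT_401; // both expired: the user must log in again
    }
}
```

Keeping the decision separate from token parsing makes the three outcomes easy to test in isolation.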
Our batch job calculates the most popular tweets on an hourly basis and puts them into the cache. To rank tweets, we can use a data structure such as a priority queue and evaluate multiple dimensions, such as forward count, thumbs-up count, and reply count. When a user posts a tweet with large files such as photos or videos, a batch job uploads those files to S3, so the user does not have to wait for the upload to finish. A batch job also runs daily to clean up deleted tweets and deleted friends.
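The priority-queue ranking could look like the sketch below, which keeps a min-heap of size k so that the lowest-scoring candidate is evicted first. The class names and the scoring weights are assumptions for illustration; the text does not specify how the dimensions are combined.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

final class PopularTweets {
    static final class TweetStats {
        final String tweetId;
        final int forwards;
        final int likes;
        final int replies;

        TweetStats(String tweetId, int forwards, int likes, int replies) {
            this.tweetId = tweetId;
            this.forwards = forwards;
            this.likes = likes;
            this.replies = replies;
        }

        // Hypothetical weighting of the three dimensions.
        int score() {
            return 3 * forwards + 2 * replies + likes;
        }
    }

    // Returns the ids of the k highest-scoring tweets, best first.
    static List<String> topK(List<TweetStats> all, int k) {
        List<String> ids = new ArrayList<>();
        if (k <= 0) {
            return ids;
        }
        Comparator<TweetStats> byScore = Comparator.comparingInt(TweetStats::score);
        PriorityQueue<TweetStats> heap = new PriorityQueue<>(byScore); // min-heap on score
        for (TweetStats t : all) {
            heap.offer(t);
            if (heap.size() > k) {
                heap.poll(); // evict the current lowest score
            }
        }
        while (!heap.isEmpty()) {
            ids.add(heap.poll().tweetId); // ascending by score
        }
        Collections.reverse(ids); // highest score first
        return ids;
    }
}
```

With n tweets this runs in O(n log k) time and O(k) memory, which matters when the hourly job scans far more tweets than it caches.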
Trade offs/Tech choices
Adding a cache reduces read latency for popular tweets, but maintaining the cache with batch jobs adds overhead.
Choosing NoSQL boosts performance, but complex queries and aggregation functions can be harder to run.
For real-time messages and friend requests we choose WebSocket, but WebSocket connections are long-lived and therefore more expensive to maintain.
Failure scenarios/bottlenecks
If a user deletes a popular tweet, it still remains in the cache, which causes a data inconsistency: the hourly batch job will not remove it from the cache until the next run, up to 1 hour later.
If multiple users comment on one tweet at the same time, there could be a data inconsistency in which some users don't see other people's comments.
Future improvements
We can improve the batch jobs, for example by running them more frequently. Also, when a user deletes a tweet, we can update both the cache and the database to keep the data consistent.
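A minimal sketch of a delete path that touches both the cache and the database is shown below. Two in-memory maps stand in for the real cache and tweet table, all names are hypothetical, and the database entry is removed outright for brevity, where the design above would instead set deletiondate and leave the physical removal to the nightly job.

```java
import java.util.HashMap;
import java.util.Map;

final class TweetStore {
    private final Map<String, String> db = new HashMap<>();    // stand-in for the tweet table
    private final Map<String, String> cache = new HashMap<>(); // stand-in for the popular-tweet cache

    void save(String tweetId, String content) {
        db.put(tweetId, content);
    }

    // What the hourly batch job would do for a popular tweet.
    void cachePopular(String tweetId) {
        String content = db.get(tweetId);
        if (content != null) {
            cache.put(tweetId, content);
        }
    }

    // Delete evicts the cache entry first, then removes the database entry.
    boolean delete(String tweetId) {
        cache.remove(tweetId);
        return db.remove(tweetId) != null;
    }

    // Reads go to the cache first and fall back to the database.
    String read(String tweetId) {
        String hit = cache.get(tweetId);
        return hit != null ? hit : db.get(tweetId);
    }
}
```

With this path a deleted popular tweet stops being served immediately, instead of lingering until the next hourly cache refresh.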
For the issue of multiple users commenting on one tweet, we can use WebSocket to push new comments in real time so that every user sees other people's comments as they arrive.