System requirements
Functional:
- Publish tweets:
  - Text length limit: 200 bytes.
  - Media can be attached; size limit to be decided, average size 1 MB.
- Read tweets:
  - Read in reverse-chronological order.
  - Load only recent tweets, page by page.
  - Click on a user to see all of that user's historical tweets.
- Save favorite tweets.
- Optional: follow / unfollow?
- Optional: edit / delete?
Non-Functional:
- High availability and scalability
- Performance: low latency
- Consistency:
  - For a user's own tweets: strong read-after-write consistency; the author should see a tweet immediately after publishing it.
  - For friends' tweets: eventual consistency; some delay before they appear is acceptable.
- Storage: how far back can tweets be viewed? Maybe store for 3 years.
Capacity estimation
Total users: 100M
DAU: 10M
Publish per user per day: 2
Read per user per day: 10
Each user follows 500 other users
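publish/write QPS: 2 * 10^7 tweets per day / 10^5 seconds per day (86,400 rounded up) = 200
read QPS: 10 * 10^7 reads per day / 10^5 seconds per day = 1,000
storage:
size per tweet = 200 B + 1 * 10^6 B * 0.1 = ~0.1 MB (the 0.1 factor presumably means about 10% of tweets carry a 1 MB media file)
size per day = 0.1 MB * 2 * 10^7 = 2 TB
size per year = 2 TB * 365 = ~700 TB
size per 10 years = ~7 PB (about 2 PB for the 3-year retention mentioned above)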
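The estimates above can be reproduced with a few lines of Python. This is only a rough sketch: the variable names are illustrative, and `media_fraction = 0.1` is an assumption about what the 0.1 factor in the per-tweet size stands for (roughly one tweet in ten carrying a 1 MB attachment). It uses the exact 86,400 seconds per day rather than the rounded 10^5, so the QPS figures come out slightly above the rounded ones.

```python
SECONDS_PER_DAY = 86_400          # the notes round this to 10^5

dau = 10_000_000                  # daily active users
publishes_per_user_per_day = 2
reads_per_user_per_day = 10

text_bytes = 200                  # text size limit per tweet
media_bytes = 1_000_000           # average media size (1 MB)
media_fraction = 0.1              # assumption: ~10% of tweets carry media

write_qps = dau * publishes_per_user_per_day / SECONDS_PER_DAY
read_qps = dau * reads_per_user_per_day / SECONDS_PER_DAY

bytes_per_tweet = text_bytes + media_bytes * media_fraction
bytes_per_day = bytes_per_tweet * dau * publishes_per_user_per_day
bytes_per_year = bytes_per_day * 365

TB = 10**12
print(f"write QPS   ~ {write_qps:.0f}")
print(f"read QPS    ~ {read_qps:.0f}")
print(f"per day     ~ {bytes_per_day / TB:.1f} TB")
print(f"per year    ~ {bytes_per_year / TB:.0f} TB")
print(f"per 3 years ~ {3 * bytes_per_year / TB / 1000:.1f} PB")
```

With the exact day length, write QPS is near 230 and read QPS near 1,160, which agrees with the rounded 200 and 1,000 to within the precision of a back-of-the-envelope estimate.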
API design
- publish tweet: POST /publish/$userId/$tweet ($tweet would normally travel in the request body rather than the path)
- read tweets: GET /read/$userId
- fav tweet: POST /fav/$userId/$tweetId
Database design
Start with relational DB.
Tables:
- User Table
  - user id
  - user name / profile
  - following id
- Tweet Table
  - tweet id
  - publisher id
  - published timestamp
  - tweet content
  - url (for large media content in object storage)
- Fav Tweet Table
  - tweet id
  - fav user id
Publish a tweet flow:
User Table 1 -> n Tweet Table (publisher id references user id)
Read tweets flow:
User Table 1 -> n Tweet Table (look up tweets by the publisher ids the user follows)
Fav tweets flow:
User Table 1 -> n Fav Tweet Table n -> 1 Tweet Table
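The tables and the three flows above can be sketched with Python's built-in `sqlite3` module. This is an illustration under assumptions, not a production schema: the table and column names are made up for the example, and the separate `follows` table is an assumption that replaces the single "following id" column, since one user follows many others.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (
    user_id   INTEGER PRIMARY KEY,
    user_name TEXT NOT NULL
);
-- Assumption: follow edges get their own table (one row per follower/followee pair).
CREATE TABLE follows (
    follower_id INTEGER NOT NULL REFERENCES users(user_id),
    followee_id INTEGER NOT NULL REFERENCES users(user_id),
    PRIMARY KEY (follower_id, followee_id)
);
CREATE TABLE tweets (
    tweet_id     INTEGER PRIMARY KEY,
    publisher_id INTEGER NOT NULL REFERENCES users(user_id),
    published_at INTEGER NOT NULL,   -- Unix timestamp
    content      TEXT NOT NULL,
    media_url    TEXT                -- points into object storage, may be NULL
);
CREATE INDEX idx_tweets_publisher_time ON tweets (publisher_id, published_at DESC);
CREATE TABLE fav_tweets (
    user_id  INTEGER NOT NULL REFERENCES users(user_id),
    tweet_id INTEGER NOT NULL REFERENCES tweets(tweet_id),
    PRIMARY KEY (user_id, tweet_id)
);
""")

# Publish flow: one user row, many tweet rows.
conn.executemany("INSERT INTO users VALUES (?, ?)", [(1, "alice"), (2, "bob")])
conn.execute("INSERT INTO follows VALUES (?, ?)", (1, 2))  # alice follows bob
conn.executemany(
    "INSERT INTO tweets VALUES (?, ?, ?, ?, ?)",
    [(10, 2, 1000, "first", None), (11, 2, 2000, "second", None)],
)

# Read flow: tweets from everyone the reader follows, newest first.
timeline = conn.execute("""
    SELECT t.tweet_id, t.content
    FROM tweets t
    JOIN follows f ON f.followee_id = t.publisher_id
    WHERE f.follower_id = ?
    ORDER BY t.published_at DESC
    LIMIT 20
""", (1,)).fetchall()

# Fav flow: write one (user, tweet) row, read it back through a join.
conn.execute("INSERT INTO fav_tweets VALUES (?, ?)", (1, 10))
favs = conn.execute("""
    SELECT t.tweet_id, t.content
    FROM fav_tweets fv
    JOIN tweets t ON t.tweet_id = fv.tweet_id
    WHERE fv.user_id = ?
""", (1,)).fetchall()
```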
High-level design
Client
Load Balancer: balances traffic across the proxy servers.
Proxy servers: act as the API gateway; route requests to the right service and handle rate limiting, user authentication, and security.
Publish service: writes a new tweet to the tweet table, uploads large media to object storage if needed, and notifies the fanout service internally so the tweet is pushed to followers' caches.
Read service: reads recent tweets from the cache first; if the user reads past what the cache holds, it queries the user table and tweet table.
Fav service: saves (user_id, tweet_id) into the fav table; also serves fav read requests, which pull data from the fav table and tweet table.
Fanout service: when notified of a new tweet, pushes it to followers' caches.
Cache: LRU-evicted; keeps the most recent 1-2 days of tweets.
DB layer: contains the tweet table, user table, and fav table.
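The notes do not say how the proxy servers enforce rate limiting. One common option is a token bucket per user, sketched below as an assumption rather than a chosen design. The class name, capacity, and refill rate are all hypothetical, and the clock is passed in explicitly so the behavior is easy to follow.

```python
class TokenBucket:
    """Per-user rate limiter: allows short bursts up to `capacity`,
    then a steady `refill_per_sec` requests per second."""

    def __init__(self, capacity: int, refill_per_sec: float, now: float = 0.0):
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.tokens = float(capacity)   # start with a full bucket
        self.last = now

    def allow(self, now: float) -> bool:
        # Add the tokens earned since the last call, capped at capacity.
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0          # spend one token for this request
            return True
        return False                    # caller would respond "too many requests"


bucket = TokenBucket(capacity=2, refill_per_sec=1.0)
decisions = [bucket.allow(0.0), bucket.allow(0.0), bucket.allow(0.0), bucket.allow(1.0)]
```

In this run the first two requests pass, the third is rejected because the bucket is empty, and the fourth passes after one second of refill.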
Request flows
Publish flow:
- Client sends a publish request.
- Proxy server authenticates the client and checks the rate limit.
- Publish service adds the tweet to the tweet table and starts uploading any large media file to object storage. On success, it notifies the fanout service and returns a success response to the client. Any of these steps can fail, so each needs its own error handling.
- Fanout service pushes the new tweet to followers' caches.
- For hot tweets, the media file can also be placed on a CDN for fast retrieval.
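A minimal in-memory sketch of the publish service steps, assuming a dict stands in for the tweet table, another dict for object storage, and a `queue.Queue` for the notification to the fanout service. All names here are hypothetical, and real error handling (retries, cleanup of partially written state) is left out.

```python
import itertools
import queue
import time
from dataclasses import dataclass
from typing import Dict, Optional

MAX_TEXT_BYTES = 200                       # text limit from the requirements


@dataclass
class Tweet:
    tweet_id: int
    publisher_id: int
    published_at: float
    content: str
    media_url: Optional[str] = None


tweet_table: Dict[int, Tweet] = {}         # stand-in for the tweet table
object_store: Dict[str, bytes] = {}        # stand-in for object storage
fanout_queue: "queue.Queue[int]" = queue.Queue()   # notifications to the fanout service
_tweet_ids = itertools.count(1)


def publish(publisher_id: int, content: str, media: Optional[bytes] = None) -> Tweet:
    if len(content.encode("utf-8")) > MAX_TEXT_BYTES:
        raise ValueError("tweet text exceeds the 200-byte limit")
    tweet = Tweet(next(_tweet_ids), publisher_id, time.time(), content)
    tweet_table[tweet.tweet_id] = tweet            # 1. persist the tweet row
    if media is not None:                          # 2. upload media, keep only its URL
        url = f"media/{tweet.tweet_id}"
        object_store[url] = media
        tweet.media_url = url
    fanout_queue.put(tweet.tweet_id)               # 3. notify fanout only after success
    return tweet                                   # 4. success response to the client


published = publish(publisher_id=7, content="hello world", media=b"\x00" * 16)
```

Because the notification is enqueued last, an exception in the persist or upload step means the fanout service never hears about a tweet that was not fully stored.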
Read flow:
- Client sends a read request.
- Proxy server authenticates the client and checks the rate limit.
- Read service first reads from the cache and returns the tweet feed to the client; hot media content is fetched by the client directly from the CDN.
- If the user scrolls further back than the cache holds, the read service queries the DB for older tweets.
- Error handling:
  - too many requests / server busy
  - download failure
Fav flow: similar to the publish flow, but only the fav table is updated. Fav reads go directly to the DB, since they are less frequent.
Detailed component design
Fanout service:
- Given a new tweet, first query the user table to get all followers of the publisher. If the publisher is a celebrity, either hold off on the fan-out (followers pull the tweet at read time instead) or dedicate more fan-out servers to it.
- For a normal publisher, fan the tweet out to all followers. This work is asynchronous, so it can go through a message queue with pub-sub support.
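The fan-out decision described above can be sketched as follows. The follower threshold, the cache size, and all names are assumptions made for the example; the per-follower cache is a simple bounded deque (newest first) rather than a real LRU cache, and the message queue is omitted so the logic stays visible.

```python
from collections import defaultdict, deque
from typing import DefaultDict, Deque, List, Set

CELEBRITY_THRESHOLD = 10_000    # hypothetical follower count that disables push
FEED_CACHE_SIZE = 1_000         # hypothetical per-user feed length

followers: DefaultDict[int, Set[int]] = defaultdict(set)       # publisher -> follower ids
feed_cache: DefaultDict[int, Deque[int]] = defaultdict(
    lambda: deque(maxlen=FEED_CACHE_SIZE)                       # follower -> tweet ids
)
celebrity_tweets: DefaultDict[int, List[int]] = defaultdict(list)  # pulled at read time


def fan_out(publisher_id: int, tweet_id: int) -> int:
    """Push a tweet to follower caches; return the number of caches written."""
    fans = followers[publisher_id]
    if len(fans) >= CELEBRITY_THRESHOLD:
        # Celebrity: skip the push and let readers pull these tweets instead.
        celebrity_tweets[publisher_id].append(tweet_id)
        return 0
    for follower_id in fans:
        # Newest tweet goes to the front; the oldest falls off when the deque is full.
        feed_cache[follower_id].appendleft(tweet_id)
    return len(fans)


followers[1] = {2, 3}
followers[9] = set(range(100, 100 + CELEBRITY_THRESHOLD))
normal_writes = fan_out(1, tweet_id=100)
celebrity_writes = fan_out(9, tweet_id=200)
```

In a real deployment each call to `fan_out` would be a job consumed from the message queue, so a slow or failed push does not block the publish response.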
Read service and cache:
- A read request first reads from the cache. Because of the celebrity problem, the read service may also need to pull celebrity tweets directly from the DB.
- The cache is LRU-evicted, so a user can request more than the cache holds; in that case the read service has to query the DB. This is slow, so results are fetched with pagination (loaded page by page, which also supports infinite scroll).
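One way to combine the cached feed with pulled celebrity tweets, and to page through the result, is cursor-based pagination over a merged, newest-first stream. The sketch below is an assumption about how that could look: feeds are plain lists of `(published_at, tweet_id)` tuples already sorted newest first, and the function name and cursor shape are hypothetical.

```python
import heapq
from itertools import islice
from typing import List, Optional, Sequence, Tuple

Entry = Tuple[int, int]   # (published_at, tweet_id)


def read_page(
    cached_feed: Sequence[Entry],
    celebrity_feeds: Sequence[Sequence[Entry]],
    cursor: Optional[Entry] = None,
    page_size: int = 20,
) -> Tuple[List[Entry], Optional[Entry]]:
    """Return one page of entries older than `cursor`, plus the next cursor."""
    # Merge the pushed feed with the pulled celebrity feeds, newest first.
    merged = heapq.merge(cached_feed, *celebrity_feeds, reverse=True)
    if cursor is not None:
        # Keep only entries strictly older than the last one already shown.
        merged = (entry for entry in merged if entry < cursor)
    page = list(islice(merged, page_size))
    next_cursor = page[-1] if len(page) == page_size else None
    return page, next_cursor


cached = [(50, 5), (30, 3), (10, 1)]
celebrity = [[(40, 4), (20, 2)]]
page1, cursor1 = read_page(cached, celebrity, page_size=2)
page2, cursor2 = read_page(cached, celebrity, cursor=cursor1, page_size=2)
page3, cursor3 = read_page(cached, celebrity, cursor=cursor2, page_size=2)
```

A cursor based on the last entry seen stays stable when new tweets arrive at the top of the feed, which an offset-based page number would not.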
For large / hot media files:
- Large media can be stored cost-efficiently in object storage: recent media goes in hot storage, and historical media can move to cold storage.
- Hot media (like a video from a celebrity) can be served from a CDN. CDNs are expensive, so only hot media should go there, possibly with a size cap.
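The tiering rules above amount to two small decisions, sketched here. Every threshold is a hypothetical placeholder (the notes give no numbers), and the function names are made up for the example.

```python
from datetime import timedelta

HOT_STORAGE_WINDOW = timedelta(days=30)   # hypothetical: "recent" means under 30 days
CDN_MIN_DAILY_VIEWS = 100_000             # hypothetical popularity cutoff for the CDN
CDN_MAX_BYTES = 50 * 1024 * 1024          # hypothetical size cap for CDN objects


def storage_tier(age: timedelta) -> str:
    """Recent media stays in hot object storage; older media moves to cold."""
    return "hot" if age <= HOT_STORAGE_WINDOW else "cold"


def should_serve_from_cdn(daily_views: int, size_bytes: int) -> bool:
    """Only popular media under the size cap is worth the CDN cost."""
    return daily_views >= CDN_MIN_DAILY_VIEWS and size_bytes <= CDN_MAX_BYTES


recent_tier = storage_tier(timedelta(days=2))
old_tier = storage_tier(timedelta(days=400))
viral_small = should_serve_from_cdn(daily_views=2_000_000, size_bytes=5 * 1024 * 1024)
viral_huge = should_serve_from_cdn(daily_views=2_000_000, size_bytes=500 * 1024 * 1024)
```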
Trade offs/Tech choices
New tweets, push vs. pull:
- Reads are more frequent than writes, so reads need to be fast. The push model suits this, because each follower's feed is already in the cache at read time.
- However, the push model may be inefficient for celebrity tweets; for those we may switch to the pull model.
SQL vs. NoSQL:
- We can start with SQL, since the data (user/tweet) is structured. However, we need to monitor DB-layer performance: if read latency is high, we may need to switch to NoSQL and denormalize the data into a key-value datastore.
Object storage and CDN: instead of building from scratch, we can use an existing provider, such as Amazon S3 or a CDN provider.
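The trade-off can be made concrete with the numbers from the capacity estimation. This is a rough sketch under stated assumptions: the average follower count is taken to equal the 500 followees per user, the pull model is assumed to do one lookup per followee per read, and the celebrity follower count is purely hypothetical.

```python
write_qps = 200                     # from the capacity estimation
read_qps = 1_000                    # from the capacity estimation
avg_followees = 500                 # each user follows 500 others
avg_followers = 500                 # assumption: same average on the follower side
celebrity_followers = 10_000_000    # hypothetical celebrity audience

# Push: every published tweet is written into each follower's cache.
push_cache_writes_per_sec = write_qps * avg_followers

# Pull: every feed read looks up recent tweets for each followee (naive upper bound).
pull_lookups_per_sec = read_qps * avg_followees

# Push cost of a single celebrity tweet.
celebrity_writes_per_tweet = celebrity_followers

print(f"push: {push_cache_writes_per_sec:,} cache writes/s")
print(f"pull: {pull_lookups_per_sec:,} lookups/s on the read path")
print(f"one celebrity tweet under push: {celebrity_writes_per_tweet:,} cache writes")
```

Under these assumptions push costs about 100,000 background cache writes per second, while naive pull costs up to 500,000 lookups per second on the latency-sensitive read path. A single celebrity tweet, however, would trigger as many cache writes as roughly 100 seconds of normal push traffic, which is why celebrities are handled by pull.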
Failure scenarios/bottlenecks
Scalability: every component should scale horizontally. The proxy servers and the services are stateless, so they can be scaled out easily. For the DB, we can use a single leader (write) server with read replicas for faster reads. The data may also need to be partitioned across servers, with user_id as the partition key.
The fanout service will see high traffic, so it needs more capacity than the other services, plus a message queue to handle the large volume of async fan-out jobs reliably.
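Partitioning by user_id can be as simple as hashing the id and taking it modulo the number of shards. The sketch below assumes 16 shards and a SHA-256 based hash; both are illustrative choices, not something the notes prescribe.

```python
import hashlib

NUM_SHARDS = 16   # hypothetical shard count


def shard_for(user_id: int, num_shards: int = NUM_SHARDS) -> int:
    """Map a user_id to a shard index in [0, num_shards)."""
    # A stable hash (unlike Python's built-in hash for strings) gives the
    # same shard on every server and across restarts.
    digest = hashlib.sha256(str(user_id).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


shard_of_user_42 = shard_for(42)
shards_used = {shard_for(uid) for uid in range(10_000)}
```

Plain modulo hashing moves most keys when the shard count changes; consistent hashing is the usual alternative if shards are expected to be added often.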
Failure/error handling:
- A failed persist needs to be retried.
- Failed upload/download: retry using resumable upload/download.
- Server down: API servers are stateless, so simply start a new one. For DB servers, replace a failed read replica with a new one; if the leader fails, fail over to one of the read replicas.
- Read-after-write consistency: the publish service can push the new tweet directly to the read service/cache before returning the success response to the client.
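Retrying a failed persist or upload is commonly done with exponential backoff. The helper below is a minimal sketch, with the attempt count and delays chosen arbitrarily; the `sleep` parameter is injectable only so the example can record delays instead of actually waiting.

```python
import time
from typing import Callable, List, TypeVar

T = TypeVar("T")


def retry(
    operation: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 0.1,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `operation`, retrying on any exception with doubling delays."""
    for attempt in range(attempts):
        try:
            return operation()
        except Exception:
            if attempt == attempts - 1:
                raise                            # out of attempts: surface the error
            sleep(base_delay * 2 ** attempt)     # 0.1s, 0.2s, 0.4s, ...
    raise RuntimeError("attempts must be at least 1")


calls = {"count": 0}
delays: List[float] = []


def flaky_persist() -> str:
    # Fails twice, then succeeds, to exercise the retry path.
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("transient DB error")
    return "persisted"


result = retry(flaky_persist, attempts=3, base_delay=0.1, sleep=delays.append)
```

Retrying a write is only safe if the write is idempotent, for example by keying the insert on a tweet id generated before the first attempt.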
Future improvements
- Add an analytics service/component to analyze user patterns and server performance/reliability, and feed the results back to improve the service.
- Functions: edit/delete tweets, tweet censorship, spam detection, generative AI, etc.