Design Twitter - System Design

System requirements

Functional:

publish tweets:
1. Text length limit? 200B.
2. Can attach media? Media size limit? Average 1MB.
read tweets:
1. read in reverse-chronological order
2. only load recent, page by page
3. click on user to see all user historical tweets
save favorite tweets
optional: follow / un-follow?
optional: edit / delete?

Non-Functional:

High availability / scalability
Performance / low latency
Consistency:
1. For self-published tweet, strong read-after-write consistency, should see it immediately after publishing
2. For friends tweets, eventual consistency, allow some latency before seeing it
Storage: see how-long ago tweets? Maybe store for 3 years

Capacity estimation

Total users: 100M

DAU: 10M

Publish per user per day: 2

Read per user per day: 10

Each user follows 500 other users

publish/write QPS: 2 * 10^7 / 10^5 = 200

read QPS: 1000

storage:

size per tweet = 200 + 1*10^6 * 0.1 = 0.1MB

size per day = 0.1MB * 2 * 10^7 = 2 TB

size per year = 700 TB

size per 10 year = 7 PB

API design

publish tweet: POST/publish/$userId/$tweet
read tweet: GET/read/$userId
fav tweet: POST/fav/$userId/$tweetId

Database design

Start with relational DB.

Tables:

User Table
- user Id
- user name/profile
- following id
Tweet Table
- tweet id
- publisher id
- published timestamp
- tweet content
- url (for large media content in object storage)
Fav Tweet Table
- tweet id
- fav user id

Publish a tweet flow:

User Table 1 -> n Tweet table

Read tweets flow:

User Table 1 -> n Tweet table

Fav tweets flow:

User Table 1 -> n Fav table 1 -> Tweet table

High-level design

Client

Load Balancer: balancing traffic to proxy servers

Proxy servers: API gateway, route request to different services, rate limiting, user authentication, security

Publish service: publish a tweet, update tweet table, potentially upload large media to object storage, notify fanout service internally to push new tweets to followers cache

Read service: read recent tweets from cache first, if read more than cache capacity, query from user table and tweet table

Fav service: save (user_id, tweet_id) into fav table, also serves fav read request, which pull data from fav table, tweet table.

Fanout service: upon notified about new tweet, push new tweet to followers cache

Cache: LRU-evicted, keep recent 1-2 days tweets

DB layer contains tweet table, user table and fav table

Request flows

Publish flow:

Client sends publish request
Proxy server authenticate client, check rate limiting
Publish service add tweet to tweet table, initiate uploading large media file to object storage, upon success, notify fanout service to fanout new tweets, returns success response to client (could fail at each step, needs further error handling)
Fanout service push new tweets to followers' cache
For some hot tweet, the media file can also be saved to CDN for fast retrieval

Read flow:

Client sends read request
Proxy server authenticate client, check rate limiting
Read service first read from cache to get tweet feeds back to the client, for some hot media content, client directly gets from CDN
If user retrieves further back than the cache capacity, read service will query DB to get further tweets
Error handling:
1. too many request, server busy
2. download failure

Fav flow: similar to publish flow, but just update fav table. When read, read directly from DB since fav read is a less frequent case.

Detailed component design

Fanout service:

Given a new tweet, first query user table to get all followers for the current publisher. If publisher is a celebrity, either hold on fan out tweets, or use more fan out servers
For normal publisher, fan out the tweet to all followers. This is an async work, so can be done through a message queue with pub-sub functionality.

Read service and cache:

Read request first reads from cache. Due to celebrity problem, may also need to pull celebrity tweets from DB directly.
Cache is LRU-evicted. So user can fetch more than cache capacity, in that case, read service needs to query DB. This will be slow, so we need to fetch with pagination (load page by page and can scroll infinitely).

For large / hot media files:

large media can be stored in object storage in a cost efficient way: recent media is stored in hot storage, historical media can be stored in cold storage
For hot media (like a video from a celebrity), it can be stored in CDN. CDN is expensive, so we need to be careful, only store hot media, and maybe set a size cap.

Trade offs/Tech choices

New tweets, push v.s. pull:

Read is more often than write, so we need to make read fast. In that case, push model works better.
However, push model may not be efficient for celebrity tweets, in that case, we may switch to pull model.

SQL vs No-sql:

we can start with sql since the data (user/tweet) is structural. However, we need to monitor the DB layer performance, if read request latency is high, we may need to switch to no-sql, denormalize data into key-value pair datastore.

Object storage and CDN: instead of start from scratch, can outsource to existing provider, like amazon S3, or CDN providers.

Failure scenarios/bottlenecks

Scalability: each component should be horizontally scaled. Proxy servers and each service is stateless, so can be scaled out easily. For DB, we can provide single leader/write server with read replicas for faster read. And may need to further partition into different servers, partition key can be user_id.

Fanout service will have high traffic, we need to scale it more and use message queue to handle the large amount of async fan out jobs reliably.

Failure/error handling:

Failed persist needs to be retried
Failed upload/download: retry with resumable upload/download
Server down: for api servers, just use new, since it is stateless. For DB servers, use new for read replicas, for leader, fail over to one of the read replica.
Read after write consistency: publish server can directly push new tweet to reader server/cache before return success response to client.

Future improvements

Add analytics service/component, analyze user patterns, server performance/reliability. Feedback to improve service performance/reliability.
Functions: edit/delete tweets, tweet censorship, spam detection, gen ai, etc.