Design Twitter - System Design

System requirements

Functional:

The ability to create a tweet
The reverse chronological feed of tweets of users I follow
The ability to list users and subscribe/unsubscribe
The ability to register

Non-Functional:

We care about scalability to make the platform ready for growth. Eventual consistency is enough for viewing tweets. Delays of even seconds are acceptable. The API should have low latency for good user experience of both creating and listing tweets. The system should be able to elastically scale up and down since there may be events, such as elections or concerts where the usage is significantly higher than the standard usage. Since the storage requirements are significant since tweets may contain images and videos, we should optimize storage cost for using older images and videos. The system should tolerate situations such as some internal service is down or not available.

Capacity estimation

500M DAU
2 new posts per user daily
100 read posts per user daily
storage per post: 140 bytes for text, 50kb media (on average, taking into account that most posts don't include media), 100 bytes metadata, total still around 50kb
storage daily: 50kb * 1B
50 kb * 1 000 000 000 = 50 * 1 000 000 mb = 50 * 1 000 gb = 50 tb for 5 years
writes: 1b day = approx 10 000 per second
reads: 50B per day = approx 500 000 reads per second

API design

For a current user, UI will call the following backend HTTPs API based on JWT auth.

Managing your own tweets:

POST /v1/tweets - to create a tweet

GET /v1/tweets - to list tweets the user created

DELETE /v1/tweets/{id} - to delete the tweet

Browsing tweets that I'm subscribed to:

GET /v1/feed - to list N most recent tweets of my subscribers, the identity is based on the JWT auth; in future iterations we may consider an alternative web socket based API to get real life updates

Browsing users:

GET /v1/users - lists all users with pagination with the relevance ranking

POST /v1/users/{id}/subscribe - to add a user to the list of subscribers

POST /v1/users/{id}/unsubscribe - to unsubscribe

Registering:

POST /v1/register

This is just enough API for the MVP.

Database design

user db is a standard relational database like pgql. in case we want to fetch user metadata for some flows, we'll have to extend it with a key/value store such as redis for fast popular user lookup.

the user db database is light because it offloads the friends lookup to the adjacency list key/value store that optimizes reads from current user ID to the list of friends. redis again.

tweets balance both reads and writes so we should go with dynamodb or sylladb. and again redis on top of them.

fanout workers populate feed cache for each recipient.tweet_id to recipient_id pairs.

High-level design

The entry point is some UI that web/app serve which actually belongs to the identity provider. The web is stateless single page app that loads assets from the CDN (based on S3). The mobile app is apple/android store APP. Once the identity provider flow succeeds, the user makes it to the app and has a corresponding JWT token that web/app forward to the backend. Both of them don't reach out directly to the backend but to the load balancer. The load balancer uses consistent hashing per user ID to load balance. Moreover, the load balancer inspects the path and based on matching rules distributes the traffic to different backend services. Each backend service is stateless and uses the auth JWT to understand which user it is and verifies the JWT. The types of backend services are based on the domain. The simplest service is the user service. It is stateless and connects to a relational database. It supports all the API related to registering, listing users, and subscribing. In the future, we may add a distributed cache even for this service to get the load off the database. Another service is a tweet service. It's responsible for managing my own tweets. Since the volume is huge, it has a distributed cache like memcached or redis before reaching out to the database. Once added, the cache is populated with some TTL to free up memory for not popular tweets. The cache is partitioned by user ID. The DB has user ID as an index. Once the tweet is deleted, the tweet service deletes from both the cache and the database. The tweet service puts the post to the pipeline for the fanout service to process (about it later). The most complex piece is the feed service. It has a single API that loads tweets in the paged manner for some user. The only way to insure the speed is for this service to reach to feed cache. Now, how the feed cache is populated. First, it should load the list of connections for the user. For that it reaches out to the connections DB. It can't be done by complex queries against user db because it would make it a bottleneck. It only reaches out to adjacency list key/value store in RAM such as rocksdb or Cassandra with userid sharding and userid as a key with the adjacency list of user ids as the value. The user service updates this store based on subscribe/unsubscribe operations. Now the service is ready to fetch all the subscribers in no time. The fanout service takes tweets from the pipeline. On the new tweet arrival it fetches the list of subscribers and creates jobs for workers that update feed cache. The feed cache is post ID and the user ID of the user that should see the post. Them the feed service takes all the items from the cache and reaches out to feed cache (with the fallback to the tweet db) to fetch all the information based on the data and ranks. It may optionally reach out to the user service cache (in this case we need the db)

Request flows

this is a fanout on write model where the feed is effectively build for all the recipients right after posting. and the read is just from cache. so fast reads!

Detailed component design

N/A

Trade offs/Tech choices

n/a

Failure scenarios/bottlenecks

n/a

Future improvements

we may cover the celebrity problem with the fanout on read. we may support reading old data or before the subscription by some fallback solutions based on the tweet storage and some optimized storage on top of it.