System requirements
Functional:
User can post tweets (text only for now).
User can follow/unfollow other users.
User can like/favorite tweets.
User can view their feed, which shows tweets from people they follow in reverse chronological order.
Non-Functional:
High Availability → Users should be able to tweet and view feed anytime.
Low Latency Reads → Feed should load fast (~100-200ms).
Scalability → System should scale to 500 million users and 1 billion tweets/day.
Durability → Tweets and likes should never be lost.
Capacity estimation
Daily Tweets → 1 billion per day / 86,400 sec → ~11,600 tweets/sec.
Peak Load → Assume ~2x average → ~23,000 tweets/sec; plan for ~25,000 tweets/sec to leave headroom.
Followers per User → 500 on average → Celebrities can have 10M+.
Feed Reads → Read-heavy workload → Assume ~5-10 billion reads/day (~58,000-116,000 reads/sec).
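The arithmetic behind these estimates can be checked with a short script. This is only a back-of-the-envelope sketch: the 2x peak factor and the 500-follower average come from the estimates above, and the fan-out figure assumes a pure push model in which every tweet is copied into each follower's feed.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

tweets_per_day = 1_000_000_000
avg_followers = 500                      # average followers per user
peak_factor = 2                          # assumed peak-to-average ratio
feed_reads_per_day = (5_000_000_000, 10_000_000_000)  # low and high estimate

avg_tweets_per_sec = tweets_per_day / SECONDS_PER_DAY
peak_tweets_per_sec = avg_tweets_per_sec * peak_factor

# Under a pure push model, each tweet becomes one feed-cache write per follower.
fanout_writes_per_sec = avg_tweets_per_sec * avg_followers

feed_reads_per_sec = tuple(r / SECONDS_PER_DAY for r in feed_reads_per_day)

print(f"avg tweets/sec:        {avg_tweets_per_sec:,.0f}")     # 11,574
print(f"peak tweets/sec:       {peak_tweets_per_sec:,.0f}")    # 23,148
print(f"fan-out writes/sec:    {fanout_writes_per_sec:,.0f}")  # 5,787,037
print(f"feed reads/sec (low):  {feed_reads_per_sec[0]:,.0f}")  # 57,870
print(f"feed reads/sec (high): {feed_reads_per_sec[1]:,.0f}")  # 115,741
```

The fan-out figure (several million cache writes per second on average) is the main reason very large accounts are usually handled differently from normal users.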
API design
| API | Description |
| --- | --- |
| POST /tweet | Post a new tweet |
| GET /user/{id}/feed | Get user timeline |
| POST /user/{id}/follow | Follow a user |
| POST /user/{id}/unfollow | Unfollow a user |
| POST /tweet/{id}/like | Like a tweet |
| GET /tweet/{id}/likes | Get likes count |
Database design
High-level design
Request flows
Posting Tweet
- Client → API Gateway → Tweet Service → Tweets DB
- Tweet Service → Fan-out to followers → Feed Cache updated.
Viewing Feed
- Client → API Gateway → Feed Service → Feed Cache → Tweets shown.
Follow/Unfollow
- Client → API Gateway → Follow Service → Follow DB → Update Feed Cache.
Like Tweet
- Client → API Gateway → Like Service → Like DB → Update like count.
Detailed component design
Feed Generation
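- Use Push Model (fan-out on write) for normal users → Push each new tweet directly into every follower's Feed Cache (Redis List).
- Use Pull Model (fan-out on read) for celebrities (10M+ followers) → Pull their tweets from Tweets DB on demand and merge them into the feed at read time.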
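The push/pull split above could look roughly like the following in-memory sketch. The class and method names (`FeedSystem`, `post_tweet`, `get_feed`) are hypothetical, plain dictionaries stand in for the Tweets DB, Follow DB, and Feed Cache, and the celebrity threshold is lowered to 3 in the demo so the pull path is exercised.

```python
from collections import defaultdict
from itertools import count

CELEBRITY_THRESHOLD = 10_000_000  # follower count from the text


class FeedSystem:
    def __init__(self, celebrity_threshold=CELEBRITY_THRESHOLD):
        self.celebrity_threshold = celebrity_threshold
        self.followers = defaultdict(set)          # author -> follower ids
        self.following = defaultdict(set)          # user -> followed author ids
        self.tweets_by_author = defaultdict(list)  # stand-in for Tweets DB
        self.feed_cache = defaultdict(list)        # stand-in for Redis lists
        self._clock = count(1)                     # logical timestamps

    def follow(self, follower, author):
        self.followers[author].add(follower)
        self.following[follower].add(author)

    def is_celebrity(self, author):
        return len(self.followers[author]) >= self.celebrity_threshold

    def post_tweet(self, author, text):
        tweet = (next(self._clock), author, text)
        self.tweets_by_author[author].append(tweet)  # always persist first
        if not self.is_celebrity(author):
            # Push model: copy the tweet into every follower's cached feed.
            for follower in self.followers[author]:
                self.feed_cache[follower].append(tweet)
        return tweet

    def get_feed(self, user, limit=20):
        # Start from the pre-computed feed, then pull celebrity tweets on demand.
        # A set removes duplicates if an author crossed the threshold mid-way.
        candidates = set(self.feed_cache[user])
        for author in self.following[user]:
            if self.is_celebrity(author):
                candidates.update(self.tweets_by_author[author][-limit:])
        return sorted(candidates, reverse=True)[:limit]  # newest first


system = FeedSystem(celebrity_threshold=3)  # lowered only for the demo
system.follow("alice", "bob")
for fan in ("alice", "u1", "u2"):
    system.follow(fan, "star")

system.post_tweet("bob", "hello")      # pushed to alice's cache
system.post_tweet("star", "big news")  # not pushed; pulled at read time
system.post_tweet("bob", "again")

feed = system.get_feed("alice")
print([text for _, _, text in feed])
```

The read path pays a small merge cost for each followed celebrity, in exchange for skipping millions of cache writes per celebrity tweet.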
Feed Cache
- Store user timelines in Redis Sorted Sets / Lists → Fast read and pagination.
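To illustrate why a sorted structure makes pagination cheap, here is a minimal in-process stand-in for one user's timeline. It is not Redis code: it only mimics what `ZADD`, a reverse range-by-score query with a limit, and trimming by rank would do. The `Timeline` class, the `max_len` cap, and the cursor format are assumptions for illustration, and scores are assumed to be unique (for example, a monotonically increasing tweet ID).

```python
import bisect


class Timeline:
    """In-process stand-in for one user's sorted-set timeline."""

    def __init__(self, max_len=800):
        self.max_len = max_len  # hypothetical cap on cached entries
        self._entries = []      # (score, tweet_id), kept ascending by score

    def add(self, score, tweet_id):
        bisect.insort(self._entries, (score, tweet_id))
        # Trim the oldest entries once the cap is exceeded.
        overflow = len(self._entries) - self.max_len
        if overflow > 0:
            del self._entries[:overflow]

    def page(self, max_score=float("inf"), limit=20):
        """Return up to `limit` entries with score < max_score, newest first,
        plus a cursor to pass as `max_score` for the next page (or None)."""
        # (max_score,) sorts before any (max_score, tweet_id), so this finds
        # the first entry whose score is >= max_score.
        hi = bisect.bisect_left(self._entries, (max_score,))
        lo = max(0, hi - limit)
        items = self._entries[lo:hi][::-1]
        next_cursor = items[-1][0] if lo > 0 else None
        return items, next_cursor


timeline = Timeline(max_len=5)
for i in range(1, 8):
    timeline.add(i, f"tweet-{i}")  # tweets 1 and 2 are trimmed

page1, cursor1 = timeline.page(limit=2)
page2, cursor2 = timeline.page(max_score=cursor1, limit=2)
page3, cursor3 = timeline.page(max_score=cursor2, limit=2)
print(page1, page2, page3)
```

Cursor-based paging (by score) stays stable when new tweets arrive between requests, which offset-based paging does not.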
Tweets DB
- Sharded by user ID → distribute write load.
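One simple way to route a user's tweets to a shard is hash-then-modulo, sketched below. The shard count and the function name `shard_for_user` are hypothetical, and this is only one of several possible schemes.

```python
import hashlib
from collections import Counter

NUM_SHARDS = 64  # hypothetical shard count


def shard_for_user(user_id: int, num_shards: int = NUM_SHARDS) -> int:
    """Map a user ID to a shard index using a stable hash."""
    # A cryptographic hash is used only for its even, stable distribution;
    # Python's built-in hash() is salted per process for strings.
    digest = hashlib.sha256(str(user_id).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


# Rough balance check over sequential user IDs.
load = Counter(shard_for_user(uid) for uid in range(64_000))
print(min(load.values()), max(load.values()))
```

Plain modulo routing moves most keys when the shard count changes, so a design that expects frequent resharding might prefer consistent hashing or a lookup table. Sharding by user ID also concentrates a very active user's writes on a single shard.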
Like Service
- Simple table, sharded by tweet_id → scalable like counter.
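A like counter sharded by tweet_id might be sketched as follows. The names `LikeShard` and `LikeService` are hypothetical, in-memory sets and dicts stand in for the Like DB, and tweet IDs are assumed to be integers. Treating the (tweet_id, user_id) pair as a unique key keeps a retried request from double-counting.

```python
class LikeShard:
    """Stand-in for one Like DB shard."""

    def __init__(self):
        self.likes = set()  # (tweet_id, user_id) rows; the pair is the unique key
        self.counts = {}    # tweet_id -> like count

    def like(self, tweet_id, user_id):
        key = (tweet_id, user_id)
        if key in self.likes:
            return False    # duplicate like: no-op, so retries are safe
        self.likes.add(key)
        self.counts[tweet_id] = self.counts.get(tweet_id, 0) + 1
        return True

    def count(self, tweet_id):
        return self.counts.get(tweet_id, 0)


class LikeService:
    def __init__(self, num_shards=4):
        self.shards = [LikeShard() for _ in range(num_shards)]

    def _shard(self, tweet_id):
        # All likes for one tweet land on the same shard, so the count
        # can be read from a single place.
        return self.shards[tweet_id % len(self.shards)]

    def like(self, tweet_id, user_id):
        return self._shard(tweet_id).like(tweet_id, user_id)

    def like_count(self, tweet_id):
        return self._shard(tweet_id).count(tweet_id)


service = LikeService()
first = service.like(42, "alice")
retry = service.like(42, "alice")  # same user again
second = service.like(42, "bob")
print(first, retry, second, service.like_count(42))
```

A viral tweet still sends all of its likes to one shard, so a production design might additionally buffer or batch counter updates.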
Trade offs/Tech choices
| Choice | Rationale / trade-off |
| --- | --- |
| Push model for feeds | Fast reads, but more fan-out write load |
| Pull model for celebs | Avoid massive fan-outs |
| Redis for Feed Cache | Low latency read |
| SQL for tweets | Strong consistency and ordering |
| NoSQL/Redis for follow | Fast relationship queries |
Failure scenarios/bottlenecks
Future improvements
Add Media support (images/videos) using Object Storage (S3).
Add Notification Service for likes/follows.
Add Search Service to find users/tweets.
Improve Feed with ML-based ranking and relevance.
Introduce Geo-based sharding for global distribution.