System requirements


Functional:

List functional requirements for the system (Ask the chat bot for hints if stuck.)...

registry

login

profile management

post twitter, length = 140

comment tweet

forward tweet

delete tweet

follow other user

trends

home feed + recommendation ranking

like tweet

user activity log reporting

Notifications can include: that user can decide which type of notificaiont they want

    • New followers
    • Likes on tweets
    • Retweets
    • Replies or mentions in tweets
    • Direct messages, if implemented later


Non-Functional:

eventual consistency: A delay of a few seconds to a minute is often tolerable.

availability: 99.9% uptime, 8.76 hours of downtime per year. Not over 1 hour for each incident.

performance: write a tweet (post it) should ideally be within 200 milliseconds. This ensures a smooth and quick experience when users send tweets.

Opening a tweet should be very quick, targeting around 100 milliseconds to display the tweet content.

performance:Retrieving and serving the home feed with recommended tweets should ideally take less than 500 milliseconds. Pre-generating recommendations can help achieve this.




Capacity estimation

Estimate the scale of the system you are going to design..

Basic data:

DAU: 500Million

create tweet: 2 tweet every day = 1,000 Million tweets every day

read tweet: 100 /user / day = 100 * 500 M = 50,000 M views every day

years: 5

Maximum followers: 10,000

Maximum following: 5,000 (to prevent spammy behaviors)

Each tweet has a maximum length of 140 characters


QPS:

read:

50,000 M /24/3600 = 0.5 Million = 578K QPS

backend server: 115

sql db: ~1000

peak hour: backend server 230

sql db ~ 2000


write:

1000M / 24/ 3600 = 0.01 Million = 11K write per sec

peak hour: 2 times average, 22K write per sec

assuming backend server handle 5K QPS, need 5 backend server

Assuming DB handle 500QPS, need 50 db server, some db server is better than this, we can talk about this later section


Data storage:

data for storage = 1,000 M * 140Bytes * replica = 3 = 420,000 MB = 420 TB every day

5 years = 420TB * 365 * 5 = 766,000 T = 766 PB


Network bandwidth:

read: 578K * 140/2 = 40,000K = 40MB per sec

write 11K * 70 = 700K per sec




API design

/registry

/login

/post,input: user_id, content

/delete, input: user_id, tweet_id

/like, input: user_id, tweet_id

/follow, input: follower_id, followee_id

/unfollow, input: follower_id, followee_id

/forward,input: user_id, tweet_id

/comment,input: user_id, tweet_id, content

/home_feed, input:user_id, return array[tweet_id]

/notification

/trends, return array[hashtag]






Database design

tweets

|tweet_id

|content|varchar

|cretead_at|

|status| enum['deleted', 'posted', ]


user

|user_id| int|

|description| varchar|

|State| varchar(2)|

|Country| varchar|


graph db

following_relation

follower

followee

follow_time


user_view_relation

user_id

tweet_id

status | enum['seen']


in memory db for recommendation:

user_id, array[tweet_id]



High-level design

Load balancer

API gateway

Backend server for different endpoint

in memory cache for reading tweet

sql server for storing user info and tweet info

cache for storing recommendation result, will discuss tradeoff here

kakfa server

recommendation calculation engine, spark, flink, mapreduce etc

influxdb to store monitoring

grafana to show metrics


Request flows

tweet related: user -> load balancer- > API Gateway -> backend server -> update db, also send data to kafka and computation engine will calculate recommendation result daily/hourly

notification: user <-> push/poll notification server

home_feed: user -> recommendation backend sever -> query from cache,




Detailed component design

Dig deeper into 2-3 components and explain in detail how they work. For example, how well does each component scale? Any relevant algorithm or data structure you like to use for a component? Also you could draw a diagram using the diagramming tool to enhance your design...



2. Backend Services

  • Tweet Service (Handles /post/delete/forward):
    • Write Path: Tweets are stored in sharded MySQL (shard key = user_id % 1024). Async replication to followers' timelines via Kafka.
    • Delete: Soft delete (mark status=deleted in DB) + cache invalidation.
  • Social Graph Service (Handles /follow/unfollow):
    • Uses Neo4j (graph DB) for follower/following relationships. Limits: 5k following/user (enforced at API layer).
  • Feed Service (/home_feed):
    • Precomputed timelines stored in Redis Sorted Sets (score = tweet timestamp). Mix of:
      • Followed users' tweets (70%)
      • Recommendations (30%, from precomputed Spark jobs).
  • Notification Service:
    • Fan-out via Kafka. Users subscribe to event types (e.g., "likes only"). Notifications stored in Cassandra (TTL=30 days).

3. Caching

  • Tweet Cache: Redis cluster (LRU eviction). Key: tweet:{id}, Value: JSON with content, likes, retweets.
  • Recommendation Cache: Precomputed feed fragments stored in Redis. Refreshed every 5min by Spark jobs.

4. Data Storage

  • Tweets: MySQL sharded by user_id. TTL after 5 years (archived to S3).
  • User Metadata: MySQL (non-sharded, 500M users × 1KB = 500GB, fits in RAM with read replicas).
  • Activity Logs: Elasticsearch for reporting + InfluxDB for metrics.

5. Recommendation Engine

  • Batch Layer (Spark): Nightly job computes user similarity (collaborative filtering) and trending hashtags.
  • Real-Time Layer (Flink): Processes Kafka streams for click-through events to adjust recommendations.




Trade offs/Tech choices

Explain any trade offs you have made and why you made certain tech choices...


  1. Consistency vs. Latency
    • Choice: Eventual consistency for feeds (precomputed timelines).
    • Tradeoff: Users might see tweets 1-5min late, but meet 500ms read SLA.
  2. Cache Freshness vs. Cost
    • Choice: 5min TTL for recommendation cache.
    • Tradeoff: Saves 80% compute costs vs. real-time updates, but recommendations lag.
  3. SQL vs. NoSQL for Tweets
    • Choice: Sharded MySQL for ACID compliance on deletes.
    • Tradeoff: Complex sharding vs. NoSQL's easy scaling.
  4. Fan-out Timing
    • Choice: Async fan-out via Kafka (not inline).
    • Tradeoff: Delayed feed updates but ensures write QPS <22k.




Failure scenarios/bottlenecks

Try to discuss as many failure scenarios/bottlenecks as possible.

  1. Hot Partitions in MySQL
    • Scenario: Celebrities (10M followers) cause write hotspots when tweeting.
    • Mitigation: Separate "celebrity" shard pool with higher throughput.
  2. Cache Stampede
    • Scenario: Cache miss on viral tweet (e.g., 1M req/s).
    • Mitigation: Probabilistic early expiration + request coalescing.
  3. Kafka Backlog
    • Scenario: Network outage causes 1hr backlog in feed updates.
    • Mitigation: Auto-scale Flink consumers + prioritize recent events.
  4. Geo-Unbalanced Traffic
    • Scenario: 70% users in India cause regional API overload.
    • Mitigation: Geo-sharded API gateways + CDN caching for static assets.
  5. Graph DB Slow Queries
    • Scenario: "Followers of followers" query times out for influencers.
    • Mitigation: Materialized views for common traversal depths.





Future improvements

What are some future improvements you would make? How would you mitigate the failure scenario(s) you described above?

  1. Global Sharding
    • Split MySQL/Redis by region (NA, EU, APAC) to reduce cross-DC latency.
  2. Edge Caching
    • Deploy Redis clusters at edge locations (Cloudflare Workers) for faster feed access.
  3. ML-Driven Recommendations
    • Replace Spark batch jobs with real-time TensorFlow Serving (user embedding models).
  4. Cost Optimization
    • Tiered storage: Keep hot tweets in Redis (7 days), warm in MySQL (1 year), cold in S3 Glacier.
  5. Security Enhancements
    • Add rate limiting per IP/ASN to prevent bot attacks on /registry.
  6. Multi-Active DCs
    • Use CRDTs for like counters to enable conflict-free multi-region writes.
  7. JIT Feed Generation
    • Hybrid model: Precompute 80% of feed + real-time append for recent tweets.