Requirements


Functional Requirements:


  • Allow users to tweet messages up to 140 characters.
  • Enable users to follow other users.
  • Allow users to like tweets from other users.
  • Display tweets from followed users in the home feed.
  • Show top K popular tweets in the home feed based on likes and followers.



Non-Functional Requirements:


  • High availability - 99.99 avail - users expect to always be able to read/write tweets
  • Low latency - home feed reads should be <200ms p99. Tweet posting should feel instant (<500ms)
  • Eventually consistent: Feed delivery can tolerate slight delays (seconds). Likes/follower counts can be eventually consistent.
  • Partition tolerant: Given CAP - we choose AP (avail + partition tolerant) over CP (consistency) for feed reads
  • Durability: Tweets, once posted, must never be lost.


API Design


Write path:


POST /v1/tweets

Body: {"user_id": string, "content": string (max 140 chars), "media_ids": [string]}

Response: {"tweet_id": string, "created_at": timestamp}


POST /v1/users/{user_id}/follow

Body: {"target_user_id": string}

Response: 204 No Content


Post /v1/tweets/{tweet_id}/like

Body: {"user_id": string}

Response: 204 No Content


Read path:


GET /v1/feed?user_id={id}&cursor={cursor}&limit=20

Response: {"tweets": [...], "next_cursor": string}


GET /v1/feed/trending?limit=K

Response: {"tweets": [...]}



High-Level Design

The system splits into two core paths:


  1. Write path (Tweet Ingestion): User posts a tweet -> stored in Tweets DB -> fan-out service pushes to followers' feed caches
  2. Read path (Feed Generation): User requests feed -> read from precomputed feed cache (for normal users) or pull-based merge (for celebrity followers).


Key components:


  • Tweet service: Handles CRUD for tweets
  • Fan-out Service: Pushes tweets to follower timelines (hybrid push/pull model)
  • Feed service: serves the home timeline from pre-computed caches
  • Social graph service: managed follow relationships
  • Trending service: Computes top-k popular tweets using a streaming count algorithm





Detailed Component Design

1. Fan-out Service (Hybrid push/pull)


Problem: A celebrity with 50m followers posting a tweet cannot fan out to 50m feed caches synchronously


Solution: hybrid aproach

  • Push (fan-out on write): For users with <10k followers, when they tweet, asynchronously write the tweet_id in each follower's feed cache (Redis sorted set, scored by timestamp)
  • Pull (fan-out on read): For celebrities (>10k followers), do NOT fan-out. Instead, at read time, the Feed Service merges the user's precomputed feed with recent tweets from followed celebrities.


Data structure: Each user's feed is a Redis sorted set: feed: {user_id} -> {tweet_id: timestamp}. Capped at 800 entries.


Scaling: Fan-out workers are horizontally scaled behind a message queue (Kafka). Partitioned by target user_id for ordering guarantees.


2. Trending/Top-K Service


Problem: Compute the top K tweets globally based on a scoring function: score = likes * wl + retweets * w2 + recency_decay.


Solution:

  • Use a streaming approximate count approach (Count-Min sketch -> Min Heap of size K)
  • Every like/retweet event is published to Kafka -> consumed by the Trending Service.
  • The service maintains a sliding window (e.g. last 24h) and a minheap of K tweets by score
  • Results are cached in Redis wit ha TTL of 60s and served directly.


Tradeoff: Approximate counts are acceptable - trending doesn't need exact precision.


3. Social Graph Service


Storage: Follow relationships stored in a graph-optimized store (adj list in Cassandra or dedicated graph store like TAO)


Schema (Cassandra):

  • Following: partition key = user_id, clustering key = followed_user_id
  • Followers: Partition key = user_id, clustering key = follower_user_id

both tables maintained in a dual-write pattern for O(1) lookups in either direction. The followers table is critical for the fan-out service to know who to push to.


dual write - use a transactional outbox or event sourcing to ensure durability


Key Tradeoffs summary:


Decision: Hybrid fan-out: complexity vs handling celebrity problem


Redis sorted sets for feeds: memory cost vs read latency

Approximate trending (Count-Min Sketch) precision vs throughput


Eventually consistent likes/counts: Consistency vs Availability


Dual-write follower/following tables: Storage cost vs O(1) lookups in both direcitons