Design Twitter - System Design

Requirements

Functional Requirements:

Allow users to tweet messages up to 140 characters.
Enable users to follow other users.
Allow users to like tweets from other users.
Display tweets from followed users in the home feed.
Show top K popular tweets in the home feed based on likes and followers.

Non-Functional Requirements:

High availability - 99.99 avail - users expect to always be able to read/write tweets
Low latency - home feed reads should be <200ms p99. Tweet posting should feel instant (<500ms)
Eventually consistent: Feed delivery can tolerate slight delays (seconds). Likes/follower counts can be eventually consistent.
Partition tolerant: Given CAP - we choose AP (avail + partition tolerant) over CP (consistency) for feed reads
Durability: Tweets, once posted, must never be lost.

API Design

Write path:

POST /v1/tweets

Body: {"user_id": string, "content": string (max 140 chars), "media_ids": [string]}

Response: {"tweet_id": string, "created_at": timestamp}

POST /v1/users/{user_id}/follow

Body: {"target_user_id": string}

Response: 204 No Content

Post /v1/tweets/{tweet_id}/like

Body: {"user_id": string}

Response: 204 No Content

Read path:

GET /v1/feed?user_id={id}&cursor={cursor}&limit=20

Response: {"tweets": [...], "next_cursor": string}

GET /v1/feed/trending?limit=K

Response: {"tweets": [...]}

High-Level Design

The system splits into two core paths:

Write path (Tweet Ingestion): User posts a tweet -> stored in Tweets DB -> fan-out service pushes to followers' feed caches
Read path (Feed Generation): User requests feed -> read from precomputed feed cache (for normal users) or pull-based merge (for celebrity followers).

Key components:

Tweet service: Handles CRUD for tweets
Fan-out Service: Pushes tweets to follower timelines (hybrid push/pull model)
Feed service: serves the home timeline from pre-computed caches
Social graph service: managed follow relationships
Trending service: Computes top-k popular tweets using a streaming count algorithm

Detailed Component Design

1. Fan-out Service (Hybrid push/pull)

Problem: A celebrity with 50m followers posting a tweet cannot fan out to 50m feed caches synchronously

Solution: hybrid aproach

Push (fan-out on write): For users with <10k followers, when they tweet, asynchronously write the tweet_id in each follower's feed cache (Redis sorted set, scored by timestamp)
Pull (fan-out on read): For celebrities (>10k followers), do NOT fan-out. Instead, at read time, the Feed Service merges the user's precomputed feed with recent tweets from followed celebrities.

Data structure: Each user's feed is a Redis sorted set: feed: {user_id} -> {tweet_id: timestamp}. Capped at 800 entries.

Scaling: Fan-out workers are horizontally scaled behind a message queue (Kafka). Partitioned by target user_id for ordering guarantees.

2. Trending/Top-K Service

Problem: Compute the top K tweets globally based on a scoring function: score = likes * wl + retweets * w2 + recency_decay.

Solution:

Use a streaming approximate count approach (Count-Min sketch -> Min Heap of size K)
Every like/retweet event is published to Kafka -> consumed by the Trending Service.
The service maintains a sliding window (e.g. last 24h) and a minheap of K tweets by score
Results are cached in Redis wit ha TTL of 60s and served directly.

Tradeoff: Approximate counts are acceptable - trending doesn't need exact precision.

3. Social Graph Service

Storage: Follow relationships stored in a graph-optimized store (adj list in Cassandra or dedicated graph store like TAO)

Schema (Cassandra):

Following: partition key = user_id, clustering key = followed_user_id
Followers: Partition key = user_id, clustering key = follower_user_id

both tables maintained in a dual-write pattern for O(1) lookups in either direction. The followers table is critical for the fan-out service to know who to push to.

dual write - use a transactional outbox or event sourcing to ensure durability

Key Tradeoffs summary:

Decision: Hybrid fan-out: complexity vs handling celebrity problem

Redis sorted sets for feeds: memory cost vs read latency

Approximate trending (Count-Min Sketch) precision vs throughput

Eventually consistent likes/counts: Consistency vs Availability

Dual-write follower/following tables: Storage cost vs O(1) lookups in both direcitons