Define the APIs expected from the system. This is your chance to analyze and define the read and write paths so that you can come up with the high-level design...
POST /shorten
Headers: Authorization: Bearer
Request: { "longUrl": "https://example.com/very/long/url" }
Response: { "shortUrl": "https://tinyurl.com/ab3Xy9", "expiresAt": "2027-03-16" }
Status: 201 Created
GET /{shortCode}
Response: 301 or 302 redirect to longUrl
Status: 301 (permanent, browser caches — less server load)
302 (temporary, hits server every time — better for analytics)
Describe the overall system architecture. Identify the main components needed to solve the problem end-to-end. Use the diagramming tool to create a block diagram.
The system uses an event-driven architecture with two completely separate paths for reads and writes, chosen because read traffic (redirects) will outnumber write traffic (URL creation) by roughly 100:1.
Write path: Client sends POST /shorten over HTTPS to the API Gateway, which applies rate limiting using a token bucket algorithm and routes the request to the URL Shortening Service via POST /shorten. The service generates a Base62 short code, immediately returns it to the client, then asynchronously publishes a url.created event to a Kafka topic. This async approach means the client never waits for the DB write, keeping write latency low. Two independent Kafka consumer groups — db-writers and cache-writers — consume the event independently, writing to Cassandra and Redis respectively.
Read path: Client sends GET /{shortCode} to the API Gateway which routes to the URL Redirecting Service. The service first checks Redis cache. On a cache hit, the long URL is returned as a redirect in milliseconds. On a cache miss, consistent hashing is used to locate the correct Cassandra shard — hash(shortCode) maps to the nearest clockwise node — the long URL is fetched, populated back into Redis with a TTL, and returned as a redirect.
Why separation of services: The URL Shortening Service and URL Redirecting Service are intentionally decoupled so they can scale independently. Redirect traffic is 100x higher so those pods can be scaled horizontally without touching the write service.
Deep dive into 2-3 key components. Explain how they work, how they scale, discuss tradeoffs, capacity, and any relevant algorithms or data structures.
1. Base62 Encoding Service
Converts a long URL into a unique 7 character short code using characters [a-z, A-Z, 0-9].
Called synchronously inside the URL Shortening Service pod before publishing to Kafka. Takes longURL as input, returns shortCode as output.
Base62 chosen over MD5/SHA because cryptographic hashing produces long outputs and has higher collision probability at short lengths. Base62 with 7 characters gives 62^7 ≈ 3.5 trillion unique combinations — sufficient for decades of growth. UUID was rejected because it produces non-URL-friendly characters.
Collision handling — before publishing to Kafka, the service checks if the generated shortCode already exists in Cassandra. If it does, an incrementing counter is appended to the original URL and re-encoded until a unique code is found. At scale, collision probability is extremely low given 3.5 trillion combinations but the check is mandatory for correctness.
2. Kafka + Consumer Groups
Decouples the URL Shortening Service from storage concerns, allowing async fan-out to multiple consumers.
URL Shortening Service publishes a url.created event containing {shortCode, longUrl, createdAt, ttl} to the url-events Kafka topic. The topic has multiple partitions — the shortCode is used as the partition key ensuring all events for the same key land in the same partition, maintaining ordering. Two separate consumer groups — db-writers and cache-writers — each maintain independent offsets, meaning both groups receive every message without the producer writing twice.
Kafka chosen over SNS+SQS because Kafka retains messages after consumption, allowing offset replay if a consumer crashes mid-write. SQS deletes messages after consumption making the two-consumer-group pattern impossible without extra complexity. Kafka also enables future consumers like an analytics service at zero additional cost to the producer.
If db-writers consumer crashes, Kafka retains all unconsumed messages at the last committed offset. On restart, consumption resumes from exactly where it stopped — no data loss. If a partition becomes unavailable, Kafka rebalances partitions across remaining consumer pods automatically.
3. Redis Cache Layer
Serves as the primary read path for all redirect requests, targeting >90% cache hit ratio.
URL Redirecting Service checks Redis synchronously before touching Cassandra. Key is shortCode, value is longUrl. Redis is populated asynchronously by the cache-writers consumer group from Kafka — not by the redirect service itself — keeping the read path clean.
LRU eviction policy chosen because URL access follows a power law distribution — top 20% of URLs receive 80% of traffic. LRU naturally retains hot URLs and evicts cold ones. TTL set to 24 hours per entry to handle URL expiry and prevent serving stale redirects after a URL is deleted. Redis chosen over Memcached because Redis supports TTL natively per key.
On cache miss, redirect service falls back to Cassandra — system remains correct, just slower. If Redis goes down entirely, 100% of traffic falls through to Cassandra — system stays available but latency degrades. To mitigate, Redis can be run in cluster mode with replicas so a single node failure doesn't take down the entire cache.
4. Cassandra + Consistent Hashing
Primary durable storage for all shortCode → longUrl mappings, sharded across multiple nodes for horizontal scalability.
Written to asynchronously by db-writers consumer group. Read from synchronously only on Redis cache miss. Schema is simple key-value: shortCode (partition key) → {longUrl, createdAt, expiresAt}.
Cassandra chosen over MySQL because the access pattern is purely key-value — there are no joins, no transactions, no relational queries. Cassandra's native support for horizontal sharding and its partition key based routing aligns perfectly with consistent hashing. MySQL would require a separate sharding layer and struggle to scale writes at this volume.
Virtual nodes in consistent hashing ensure no single shard gets overwhelmed — each physical node owns multiple small virtual nodes spread across the hash ring. When a new Cassandra node is added, only the keys that fall in its virtual node ranges are migrated — not a full reshuffle. During rebalancing, reads fall back to the previous node if the new node hasn't received the migrated key yet, maintaining availability at the cost of brief eventual consistency.