Designing A Simple Url Shortening Service A TinyURL Approach - System Design

Requirements

Functional Requirements:

Create a short URL for a given long URL.
Return the long URL associated with a given short URL.

Non-Functional Requirements:

Handle 1000 QPS
Handle global traffic
99.99% availability
Store billions of URL mappings
Round trip P99 latency of <100ms for createShortUrl API

API Design

High level functions:

String createShortUrl(String longUrl);
- Takes a long URL and returns a short URL from our domain
- If the URL is already mapped, we return the existing short URL
- If the URL is malformed or banned, we return an error
String getLongUrl(String shortUrl):
- Return the long URL associated with a short URL

Now let's map that to actual HTTP endpoints:

POST /create&url=<url>
- URL is URL-encoded in query string parameter "url"
- Returns JSON with short URL
- Web app renders page with short URL short and "copy me!" button
GET /<shortId>
- A short URL is just our domain + "/"+ a short ID
- The short ID is base-62 to be URL-safe (digits, lowercase/uppercase letters)
- The short ID is length 8, giving us 200 trillion possible IDs
- Returns 404 if the short link is not mapped or is banned

High-Level Design

Describe the overall system architecture. Identify the main components needed to solve the problem end-to-end. Use the diagramming tool to create a block diagram.

Database
- A document DB works well here (e.g. DocumentDB)
- Stores bidirectional mapping between long and short URLs
- Also stores a ban list so admins can ban certain domains
- Also supports an ID block checkout system, where application servers can check out ranges of contiguous short IDs (blocks) and hand them out
- Essentially we have a range of all of the possible IDs. A server will check out a range of contiguous IDs (say, a million IDs) and mark that block as IN_USE. When the server returns the block or the timer expires, the block is back to READY. The block also contains a counter of how many IDs have been used. The IDs are hashed and salted with a per-block salt to make them extra random, instead of just monotonically increasing. We check each ID against the actual short ID index in the main collection before handing them out to ensure uniqueness at a small latency cost.
- Readily handles billions of documents
- <10ms latency roundtrip for reads and writes
Cache
- To reduce load on DB and serve low-latency results while tolerating some staleness, we can front with Elasticache (Redis mode)
- <10ms latency for roundtrip and decreases load on DB
Application Server
- Let's use dedicated EC2 servers. This avoids the issue with cold starts
- The server checks out blocks from the block store and hands out IDs from that block (I'll go into more detail in the detailed component design)
API Gateway / Reverse Proxy
- Route to Lambda functions
- Can later easily swap over to dedicated application servers if we go that route (once we have sustained traffic)

Detailed Component Design

Deep dive into 2-3 key components. Explain how they work, how they scale, discuss tradeoffs, capacity, and any relevant algorithms or data structures.

First let's design the database schema.

Collection 1: URLs
- Key: short ID (length 8 string in base-62)
- Field: long URL (full URL)
  - We create an index on the long URL field so we can do a bidirectional mapping
- Field: created
- Field: lastAccessed
  - Optional - for garbage collection feature, if we want to implement it later
Collection 2: Banned domains
- Key: UUID
- Field: Normalized domain (string, indexed)
- Field: Reason (human-readable ban reason string)
- Field: Audited By (email of admin who banned the domain)
Collection 3: Blocks
- Key: Block start (always a multiple of the configured block size)
  - Don't need to store block_end, since it's just block_start + block_size
- Field: allocationPointer, integer, increments with each allocation
- Status: AVAILABLE, IN_USE, FULL, NOT_RETURNED

Block allocation:

This is an allocator. We need a central repository of blocks with strong consistency.
Each server asks for a block at startup. When it fills a block, it asks for another.
Each block has a status. We mark a block as IN_USE while we're allocating with it, so another server doesn't use the same block.
The server writes back the incremented allocationPointer whenever it hands out a URL. It will write this back periodically. However, to avoid congesting the ~20k writes/second for a MongoDB shard, which likely will contain all of the blocks since it's a small collection, we want to durably store the write pointer in a Redis cluster, and periodically write it back to the database (say, every 5 seconds).
- We still write back the URLs immediately, since this is a large collection and writes will go to different shards.
- If a server crashes, we can recover the allocation pointer state from Redis and fall back to rebuilding it from the URL database.
Blocks have a timeout. If a server crashes and a block isn't returned, it is marked as NOT_RETURNED, meaning we have to do recovery.