Requirements
Functional Requirements:
- Create a short URL for a given long URL.
- Return the long URL associated with a given short URL.
Non-Functional Requirements:
- Scalability
- Messaging queues for creating shortened URLs
- NoSQL so horizontal scaling can be supported
- Base62 Counter for shortened URL IDs
- Low latency
- CDN network for frequently accessed short URLs
- Redis caching layer with LRU
- High availability
API Design
Define the APIs expected from the system. This is your chance to analyze and define the read and write paths so that you can come up with the high-level design...
Write
- POST /shorten
- Long URL is sent and stored
- Returns a short URL
Read
- GET /shortenedId
- Short URL is sent
- Returns 302 and a redirect to the long URL page
High-Level Design
Describe the overall system architecture. Identify the main components needed to solve the problem end-to-end. Use the diagramming tool to create a block diagram.
Load balancer + API Gateway
- Directs requests to either shorten the URL or get the shortened URL
URL Shortening Service
- Creates the shortened URL using UUIDv4 + Base62 for ID generation
- The UUIDv4 IDs will help distribute the mappings across the database
Redirect Service
- Looks up the shortenedId get the long URL for the client
- CDN, Redis, then NoSQL
NoSQL Database
- Stores long URL to short URL mappings
- Easy to scale horizontally
Caching Layer
- CDN Network
- Redis Cache
Detailed Component Design
Deep dive into 2-3 key components. Explain how they work, how they scale, discuss tradeoffs, capacity, and any relevant algorithms or data structures.
Load Balancer
- Distributes incoming traffic across multiple stateless instances of the URL Shortening Service using round-robin or least-connections.
- Horizontal scaling: Add more service instances behind the load balancer as traffic grows.
- Read path (redirects) and write path (shorten) are handled by the same service pool, but can be routed separately via path-based rules if needed.
Redis
- Redis handles read traffic for high frequency URLs
- Use consistent hashing to spread URLs across instances
- Eviction based on LRU, volatile-lru, based on TTL set to 24 hours
URL Shortening Service
- Input: long URL
- Output: short URL
- First looks for a mapping for the long URL sent in the caching layers, to avoid attempting to map a popular URL
- For new long URLs:
- Generate short ID.
- Attempt to insert into NoSQL DB (using shortId as primary key).
- On collision (ID already exists): Generate the next ID and retry (1-2 times max). Probability is extremely low with 6-7+ character Base62 IDs.
- Using UUIDv4 will help distribute the mapping evenly across the database shards
Redirect Service
- Input: shortened URL ID
- Output:
- Found: 302 and a redirect to the full URL
- Not found: Returns a 404
- First checks the CDN for the mapping, then Redis, and finally the database
- No mapping is found
- Return 404
- If mapping is found in Cache
- Updates the last accessed timestamp and return redirect
- If a mapping is not cached, but in DB
- Add it to the Redis cache and return redirect