Requirements
Functional Requirements:
- Create a short URL for a given long URL.
- Return the long URL associated with a given short URL.
Non-Functional Requirements:
- Low latency
- High consistency
- Scalable to 100,000 (1 lakh) requests/s
API Design
GET /urls/?url=
Status code: 302 redirect
{
"redirected_url":
}
POST /urls/
{
"url": "long format url",
"short_code":
"expiry":
}
return {
"url":
}
1. API Architecture & Request Lifecycle
POST /api/v1/shorten (URL Creation)
- Ingress & Traffic Control: Requests land on the API Gateway, which handles global rate limiting using a Token Bucket Algorithm to mitigate multi-IP abuse. Traffic is then distributed via a Load Balancer to horizontally scalable application nodes.
- ID Generation Strategy:
  - Custom Short Codes: If a user provides a custom short_code, the application checks it against a distributed Cuckoo Filter (which allows deletions) to quickly test whether the code already exists. If the check passes, a database unique constraint acts as the final guard against concurrent inserts of the same code.
  - System-Generated Codes: If no custom code is provided, the system uses a coordinated, lock-free Snowflake-inspired generator optimized for short lengths and converts the output with Base62 encoding ([a-zA-Z0-9]), so that codes generated on different nodes do not collide.
- Storage Write: The record is written to the primary database while concurrently seeding the hot cache layer.
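The system-generated code path above can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual generator: the bit widths, the custom epoch, the alphabet ordering, and the names ShortIdGenerator, base62_encode, and base62_decode are all hypothetical. It also uses a lock for simplicity, whereas the text describes a lock-free design.

```python
import threading
import time

# Assumed ordering of the [a-zA-Z0-9] alphabet; any fixed ordering works.
ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"


def base62_encode(n: int) -> str:
    """Encode a non-negative integer as a Base62 string."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n:
        n, rem = divmod(n, 62)
        digits.append(ALPHABET[rem])
    return "".join(reversed(digits))


def base62_decode(s: str) -> int:
    """Inverse of base62_encode."""
    n = 0
    for ch in s:
        n = n * 62 + ALPHABET.index(ch)
    return n


class ShortIdGenerator:
    """Snowflake-style ID: [timestamp ms | worker id | sequence].

    Bit widths are assumptions chosen to keep codes short.
    """

    WORKER_BITS = 5   # up to 32 workers (assumption)
    SEQ_BITS = 7      # up to 128 IDs per worker per millisecond (assumption)

    def __init__(self, worker_id, epoch_ms=1_700_000_000_000, clock=None):
        if not 0 <= worker_id < (1 << self.WORKER_BITS):
            raise ValueError("worker_id out of range")
        self.worker_id = worker_id
        self.epoch_ms = epoch_ms  # arbitrary custom epoch (assumption)
        self._clock = clock or (lambda: int(time.time() * 1000))
        self._lock = threading.Lock()  # simplification; not lock-free
        self._last_ms = -1
        self._seq = 0

    def next_id(self) -> int:
        with self._lock:
            # Never let the logical timestamp move backwards.
            now = max(self._clock(), self._last_ms)
            if now == self._last_ms:
                self._seq = (self._seq + 1) & ((1 << self.SEQ_BITS) - 1)
                if self._seq == 0:
                    # Sequence exhausted: borrow the next millisecond.
                    now = self._last_ms + 1
            else:
                self._seq = 0
            self._last_ms = now
            shift = self.WORKER_BITS + self.SEQ_BITS
            return (
                ((now - self.epoch_ms) << shift)
                | (self.worker_id << self.SEQ_BITS)
                | self._seq
            )

    def next_code(self) -> str:
        return base62_encode(self.next_id())
```

Uniqueness here relies on each node holding a distinct worker_id, which is the coordination step; within a node, the (timestamp, sequence) pair never repeats. The database unique constraint remains the final guard for custom codes.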
GET /{short_code} (URL Resolution)
- Edge Caching (CDN): Viral and highly repetitive links are cached and served directly from location-based CDNs to offload traffic from core infrastructure.
- Cache Penetration Mitigation: On a CDN miss, the request hits a localized Cuckoo Filter. If the filter reports that the code is absent, the system rejects the request immediately with 404 Not Found, protecting the downstream database from floods of malicious invalid links.
- Hot Cache Layer: Requests that pass the filter check a Redis cluster holding a sliding window of recent links (24-hour TTL). This layer handles an estimated 70% of standard traffic.
- Database Fallback: Cache misses query indexed read-replicas.
- Redirection Status Code: The system responds with a 302 Found (temporary redirect) status code instead of a 301. Because browsers do not cache a 302 by default, clients check with the server on every hit, so URL expirations, metrics tracking, and rate limits are enforced in real time.
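The resolution steps after a CDN miss (filter check, hot cache, database fallback, 302 response) can be sketched as below. This is only an illustration: a plain Python set stands in for the Cuckoo Filter, dictionaries stand in for Redis and the read-replicas, and the names Resolver and Response are hypothetical.

```python
import time
from dataclasses import dataclass
from typing import Optional


@dataclass
class Response:
    status: int
    location: Optional[str] = None


class Resolver:
    def __init__(self, known_codes, cache, db, cache_ttl_s=24 * 3600, clock=time.time):
        self.known_codes = known_codes  # stand-in for the Cuckoo Filter (a set)
        self.cache = cache              # stand-in for Redis: code -> (url, cached_until)
        self.db = db                    # stand-in for read-replicas: code -> url
        self.cache_ttl_s = cache_ttl_s  # 24-hour TTL, as in the text
        self.clock = clock

    def resolve(self, code: str) -> Response:
        # 1. Filter check: a negative answer is definitive, so reject early.
        if code not in self.known_codes:
            return Response(404)

        # 2. Hot cache: serve if the entry exists and has not aged out.
        now = self.clock()
        entry = self.cache.get(code)
        if entry is not None and entry[1] > now:
            return Response(302, entry[0])

        # 3. Database fallback. A real Cuckoo Filter can return false
        #    positives, so the database may still have no such row.
        url = self.db.get(code)
        if url is None:
            return Response(404)

        # 4. Re-seed the cache and redirect with 302 (not 301).
        self.cache[code] = (url, now + self.cache_ttl_s)
        return Response(302, url)
```

The ordering matters: the cheapest check runs first, and the database is reached only by codes that the filter considers possibly present and the cache does not hold.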
2. High Availability, Data Partitioning & Expiration
Data Partitioning & Global Replication
To handle viral, cross-region traffic without cross-continent database round trips, the system avoids strict geo-location IP pinning. Instead, it uses a Single-Leader, Multi-Region Replication topology. Writes are processed in a primary region and asynchronously replicated to read-replicas worldwide, so reads are served from a replica close to the user.
URL Expiration Mechanics
- Soft Expiration: Every URL record contains an expires_at timestamp. Read operations evaluate this field inline; if the current time exceeds the expiration threshold, a 404 is returned immediately, and the associated Redis key is purged.
- Hard Cleanup (Storage Management): To avoid heavy PostgreSQL table bloat, dead-tuple fragmentation, and intensive background VACUUM locks caused by mass row deletions, the database is partitioned by time (e.g., daily or weekly tables). Expired data blocks are cleanly removed using low-overhead DROP TABLE commands on older partitions.
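Both expiration mechanisms can be sketched in a few lines. The sketch assumes daily partitions keyed by expiry date and named urls_YYYYMMDD; that naming scheme, and the helper names is_expired, status_for, partition_name, and drop_statements, are hypothetical and not taken from the design above.

```python
from datetime import date, datetime
from typing import List, Optional


def is_expired(expires_at: Optional[datetime], now: datetime) -> bool:
    """Soft expiration: a record with no expires_at never expires."""
    return expires_at is not None and now > expires_at


def status_for(code: str, expires_at: Optional[datetime], cache: dict, now: datetime) -> int:
    """Return 404 and purge the cached key if the link has expired, else 302."""
    if is_expired(expires_at, now):
        cache.pop(code, None)  # stand-in for deleting the Redis key
        return 404
    return 302


def partition_name(day: date) -> str:
    # Hypothetical naming scheme for daily partitions.
    return f"urls_{day:%Y%m%d}"


def drop_statements(existing: List[date], today: date) -> List[str]:
    """Hard cleanup: every row in a partition for a day before today has
    already expired (assuming partitioning by expiry date), so the whole
    table can be dropped instead of deleting rows one by one."""
    return [
        f"DROP TABLE IF EXISTS {partition_name(d)};"
        for d in sorted(existing)
        if d < today
    ]
```

A scheduled job could run the generated statements once per day; the soft check keeps reads correct in the window before the partition is physically removed.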
Key System Highlights
- Base62 Encoding: Keeps generated tokens highly readable and compact.
- Cuckoo Filter Optimization: Used on both GET and POST paths to block invalid queries and handle dynamic URL updates/deletions seamlessly.
- High Availability: Achieved via stateless, horizontally autoscaling application layers backed by distributed Redis caching.
- Strong Consistency: Enforced on the write-path via a coordinated Snowflake-based structure to completely eliminate collision vectors across distributed data nodes.
Detailed Component Design
Availability perspective:
Since a Redis layer sits in front of the database, the read path can scale to 1M req/s.
A CDN cannot be introduced here, since we want strong consistency.
Tradeoffs:
No tradeoffs, since an extension is in place to handle expiration.
A Snowflake package handles unique short-code generation.
Concurrency handling:
Concurrent calls will not usually collide, since Snowflake-style packages build each ID from several fields (typically a timestamp, a machine ID, and a per-machine sequence number).
Thundering herd problem:
Many concurrent requests can miss the Redis cache at the same time and fall through to the database, so we also introduce rate limiting so that the LB rejects requests above a set limit.
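The rate limiting mentioned here, and the Token Bucket Algorithm named for the API Gateway, can be sketched as a small in-process class. This is a single-node illustration only; a gateway would typically keep bucket state in shared storage, and the TokenBucket name and its parameters are hypothetical.

```python
import time


class TokenBucket:
    """Allows bursts up to `capacity` and a sustained `rate` of requests/s."""

    def __init__(self, rate: float, capacity: float, clock=time.monotonic):
        self.rate = rate            # tokens added per second
        self.capacity = capacity    # maximum burst size
        self._clock = clock
        self._tokens = float(capacity)  # start with a full bucket
        self._last = clock()

    def allow(self, cost: float = 1.0) -> bool:
        # Refill based on elapsed time, capped at capacity.
        now = self._clock()
        elapsed = now - self._last
        self._last = now
        self._tokens = min(self.capacity, self._tokens + elapsed * self.rate)

        # Spend tokens if enough are available; otherwise reject.
        if self._tokens >= cost:
            self._tokens -= cost
            return True
        return False
```

A rejected call would map to an HTTP 429 response at the gateway. Keeping one bucket per client key (for example per API key or IP) limits individual abusers, while a single global bucket caps the total load that can reach the database during a cache-miss burst.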