POST api/v1/shorten/:longUrl
-> 201 Created
GET api/v1/long/:shortUrl
-> 301 Moved Permannet
Here we need to focus on two aspects mainly shorten the url and get the url
We assume 10k writes per second and peak 100k
Then
10,000 * 3600 * 24 = 864 millon url/day
26 billon/month
315 billion/ year
URL shortener:
We will take 58 char left the confusing. We know we need roughly 1T url per year. So taking 58^8 , 8 char shorten url which is 128 T url combination sufficient enough
Storage and DB:
Lets assume temporary and permanent url split. Assume that around 250b urls max length 1 year and 75b max 10 year
Assuming 500 byte per url
So for a single year we need storage of,
250 + 750 = 1T url storage ,
which is 500 TB
We assume 5TB of of per db node so 100 db node needed
Based on number of visit and time created we will create a ranking list and used caching. We assume 1% url mostly visited . Which would be 50 gb . Can be handle with 4*16 shared redis cluster
We assume 100k read per second
Unique Id Generator
There can be generally two ideas for url shortening thing 1. Url encoding and 2. Key generation and check for duplicate in the db
As we already said we will have 58 characters so we will go for 58 base encoding.
Counter and Fault Tolerance
We need to have a unique number in decimal and just go for 58 base encoding. For scale we may need multiple unique id generator . For simplicity lets assume we have 10 unique id generator that can have the id of unique id % 10 = id generator num. It will be increase one by one periodically
We need counter nodes to be fault tolerance as it is crucial and possibility of collision. So we will use Quorum consensus system like zookeper that will asign ranges to counter nodes and if any of the counter node down it can assign a new one
We can map these unique id generator nodes to db nodes . Because we can decode short url from fetch and can tell which db nodes the data in
Deep dive into 2-3 key components. Explain how they work, how they scale, discuss tradeoffs, capacity, and any relevant algorithms or data structures.
To ensure low latency, high availability, and massive horizontal scalability, the system relies on the following core components:
* **API Gateway / Load Balancer:** Layer 7 routing and rate limiting.
* **Range Manager (etcd/Zookeeper):** A quorum-based consensus system to manage ID generation and prevent split-brain scenarios.
* **Caching Layer:** Redis for distributed caching, augmented by Local In-Memory Caches (e.g., Guava) on App Servers for viral hotspots.
* **Database (ScyllaDB):** A high-throughput, masterless NoSQL database chosen for its C++ performance and native horizontal scaling capabilities.
---
### 2. The Entry (Write Path) & Burst Handling
*Goal: Generate a unique short URL quickly while protecting the system from abuse.*
1. **Rate Limiting & Burst Handling:** When a user `POST`s to `api/v1/shorten`, the request hits the API Gateway. We use a **Token Bucket** algorithm based on the user's IP or API key. If a user bursts past their limit, the gateway immediately returns an **HTTP 429 (Too Many Requests)**. Critically, it includes a `Retry-After: 60` header to provide a backoff hint to clients.
2. **ID Generation (Avoiding Split-Brain):** The request routes to an App Server. To guarantee no URL collisions without bottlenecking the database, we use a block-allocation strategy.
* The App Server checks its local memory for an available ID.
* If depleted, it requests a new range (e.g., `100,000 - 200,000`) from the **Zookeeper/etcd** cluster. The quorum ensures that even if two servers request a range simultaneously, they are guaranteed mutually exclusive blocks.
* **Outage Handling:** If an App Server crashes, its unused IDs are lost. Because our ID space is massive, this gap is an acceptable trade-off for high performance.
3. **Encoding & Storage:** The App Server takes the unique base-10 ID, converts it to **Base62** (creating the short code), and writes the mapping to ScyllaDB. It then returns a `201 Created`.
---
### 3. The Exit (Read Path) & Hotspot Mitigation
*Goal: Redirect users to the long URL with absolute minimum latency, handling viral traffic spikes safely.*
When a user `GET`s `api/v1/long/:shortUrl`, the system prioritizes speed through multiple fallback layers:
1. **CDN & Edge Caching:** Geographical DNS routing directs the user to the nearest edge node. If the short URL is heavily requested in that region, the CDN handles the 301 redirect entirely (0-10ms latency).
2. **App Server Local Cache (Hotspot Handling):** For hyper-viral links that bypass the CDN, querying Redis might create network bottlenecks. The App Servers maintain a small, LRU (Least Recently Used) local memory cache (e.g., Caffeine/Guava) for the top 1% most popular URLs, allowing instant redirection.
3. **Distributed Cache (Redis):** If not in local memory, the server checks the Redis cluster.
4. **Database Fallback & Cache Miss Path:** On a Redis cache miss, the App Server queries ScyllaDB.
* **Thundering Herd Protection:** If a popular URL's TTL expires, 10,000 concurrent requests might miss the cache simultaneously and crash the database. To prevent this, we implement a **Distributed Mutex Lock**. The first request acquires the lock, queries ScyllaDB, and repopulates Redis. The remaining 9,999 requests wait a few milliseconds for the cache to refresh, protecting the database layer.
5. **Redirect:** The server returns an **HTTP 301 (Moved Permanently)** to reduce future server load, or a **302 (Found)** if granular click analytics are explicitly required for that URL.
---
### 4. Horizontal Scalability & High Availability
* **Database Sharding Strategy:** We **do not** use range-based sharding for the database, as that causes write hotspots. Instead, we use ScyllaDB’s native **Hash-Based Partitioning**. The `shortUrl` string is hashed, and the data is distributed evenly across the masterless ring. To scale horizontally, we simply add more nodes to the ScyllaDB ring, and data rebalances automatically.
* **High Availability:** We configure ScyllaDB with a Replication Factor of 3 (`RF=3`). Every URL is copied to three separate physical nodes. If a database node suffers an outage, reads and writes seamlessly route to the surviving replicas.
* **Consistency:** We accept **Eventual Consistency** for the read path. It is acceptable if a newly created URL takes a few milliseconds to propagate across all database replicas and global cache nodes.
### 5. Data Cleanup
For temporary URLs, we utilize a Time-To-Live (TTL) attribute. Instead of running heavy batch deletion jobs that consume resources, we utilize ScyllaDB and Redis's native row-level TTL features. The database automatically tombstonnes and reclaims the storage of expired URLs during its standard background compaction processes.
Limitation and Future Scope
We know that here we use incremental based counter and then based conversation that gives us some advantage in a distributed system but there are some limitation as well. Like incremental based id can easily be predictable so security concern is there.
In future like we devide temporary url and permanent url. We can have a separate system like secure url . The api can be like api/v1/secure/shorten/:longUrl
The thing is for the secure url things instead of using incremental counter we can use random unique id -> base conversation -> check for duplicate and go on. We can also explore some hashing based techniques there