My Solution for Designing a Simple URL Shortening Service: A TinyURL Approach

by whisper3162

Requirements


Functional Requirements:


  1. Shorten URL: given a long_url, the generated short_url must be compact and unique.
  2. Redirect: for a given short_url, redirect to the long_url (original URL). Latency should be as low as possible.
  3. Link expiration based on TTL, so links auto-expire after a set duration.



Non-Functional Requirements:


  • Low latency: redirects must complete in under 10 ms at P95, since any delay is perceived as a broken or suspicious link. URL creation can take longer (~50 ms).
  • High availability: the redirect path should target five nines (99.999%) of availability. URL creation can sustain brief outages, but redirects cannot.
  • High scalability: the system should store billions of URLs and serve thousands of redirects per second.
  • Durability: once a mapping is created in the DB, it must never be lost. Short URLs are embedded into documents, so a lost mapping leads to errors without recovery.
  • Trade-off between consistency and availability (CAP theorem): availability over consistency.


COST ESTIMATION:

Assume: 200 WPS (writes per second) => 200 * 86,400 writes/day

read : write = 100 : 1

therefore number of redirects: 200 * 100 = 20,000/sec


Storage:


short_code: 7 chars => 7 bytes

long_url: about 100 bytes

user_id: 20 bytes

created timestamp: 8 bytes

expiry timestamp: 8 bytes

total per record = 7 + 100 + 20 + 8 + 8 = 143 bytes

≈ 256 bytes (with metadata and index overhead)

therefore daily storage => 256 * 200 * 86,400 ≈ 4.4 GB/day


Bandwidth:


Write: ~50 KB/sec (256 bytes * 200 writes/sec) => negligible

Read (redirect): ~5 MB/sec (256 bytes * 20,000 redirects/sec)
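The back-of-envelope numbers above can be reproduced directly. The record size (256 bytes) and write rate (200/sec) are the assumptions from this design, not measured values:

```python
# Capacity-estimation sketch using the assumptions from this design.
WRITES_PER_SEC = 200
READ_WRITE_RATIO = 100
RECORD_BYTES = 256          # 143 raw bytes padded for metadata/indexes
SECONDS_PER_DAY = 86_400

reads_per_sec = WRITES_PER_SEC * READ_WRITE_RATIO                       # redirects/sec
daily_storage_gb = WRITES_PER_SEC * SECONDS_PER_DAY * RECORD_BYTES / 1e9
write_bw_kb = WRITES_PER_SEC * RECORD_BYTES / 1e3                       # KB/sec
read_bw_mb = reads_per_sec * RECORD_BYTES / 1e6                         # MB/sec

print(reads_per_sec)               # 20000
print(round(daily_storage_gb, 1))  # 4.4
```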


Caching:

Storing the last 30 days of fetched mappings yields an estimated 80-90% cache hit rate.


API Design

  1. POST /api/v1/url

req payload:

long_url: string

expires_at: timestamp

auth : Api key in header


Response: 201 Created

short_code: string,

long_url: string (for validation)

createdAt: timestamp

expiresAt: timestamp


Authentication: API key sent in the request headers.

Rate limit: 100 req/s per user on the free tier; paid subscriptions can have higher limits.

Return 429 with a Retry-After header when the limit is exceeded.


Validation: the server should verify that long_url is a well-formed URL with a supported scheme (HTTP, HTTPS) and reject javascript:, ftp:, and data: schemes to prevent abuse.

Optionally, check that the URL is reachable with a HEAD request. This adds latency to creation, but that is acceptable.
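A minimal sketch of the scheme/well-formedness check described above, using the standard library. The optional HEAD reachability check is left as a comment since it involves a network call:

```python
from urllib.parse import urlparse

ALLOWED_SCHEMES = {"http", "https"}

def is_valid_long_url(url: str) -> bool:
    """Reject malformed URLs and unsafe schemes (javascript:, ftp:, data:)."""
    parsed = urlparse(url)
    # A valid absolute URL needs both an allowed scheme and a host.
    return parsed.scheme in ALLOWED_SCHEMES and bool(parsed.netloc)
    # Optional reachability check (adds latency to the create path):
    # urllib.request.urlopen(Request(url, method="HEAD"), timeout=2)

print(is_valid_long_url("https://example.com/page"))  # True
print(is_valid_long_url("javascript:alert(1)"))       # False
```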


  2. GET /api/v1/url

params: short_url

success response: 302 Found

response header: Location: long_url

The redirect happens automatically, without authentication.

If the short_url has expired or does not exist, return 404.

For an expired link, include a message field in the response stating that the link has expired.


Other headers to include: Cache-Control and X-Robots-Tag, to tell browsers and search engines whether to cache or index the short_url and the original URL.


301 Moved Permanently: the browser caches the mapping, which helps for high read volume, but it prevents analytics and stale-data validation, so 302 is preferred here.
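The redirect logic above (302 on success, 404 for missing or expired links) can be sketched framework-free as a function that builds the status, headers, and message. The `mapping` record shape is a hypothetical stand-in for the DB row:

```python
from typing import Optional, Tuple, Dict

def redirect_response(mapping: Optional[dict], now: float) -> Tuple[int, Dict[str, str], str]:
    """Build the redirect response described above.

    `mapping` is a hypothetical record {"long_url": ..., "expires_at": ...}.
    302 (not 301) keeps every redirect on our servers, preserving
    analytics and the expiry check."""
    if mapping is None:
        return 404, {}, "short URL does not exist"
    if now > mapping["expires_at"]:
        return 404, {}, "link has expired"
    headers = {
        "Location": mapping["long_url"],
        "Cache-Control": "private, max-age=300",  # assumed 5-minute client cache
        "X-Robots-Tag": "noindex",
    }
    return 302, headers, ""
```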


High-Level Design

  1. POST req path.

The user sends a POST request to generate a short_url for a given long_url. The user can optionally send a namespace, to prefix the short_url with the organisation name.

The request flows through the Load Balancer and the API Gateway, which handle TLS termination and authentication respectively.

The request finally reaches the Shortening Service, which uses a hash-based algorithm: it hashes the long_url and takes the first 7 characters of the hash to generate a short_id.

This short_id is unique across shards and is stored asynchronously in the KV store via a message queue and queue workers.
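A minimal sketch of the hash-based short_id generation, assuming SHA-256 and base62 encoding (the specific hash and alphabet are my assumptions; the design only says "hash-based, first 7 characters"). The `salt` parameter is a hypothetical knob for regenerating on collision:

```python
import hashlib
import string

BASE62 = string.digits + string.ascii_lowercase + string.ascii_uppercase

def short_id(long_url: str, salt: str = "") -> str:
    """Derive a 7-character base62 code from the URL's hash."""
    digest = hashlib.sha256((salt + long_url).encode()).digest()
    n = int.from_bytes(digest[:8], "big")   # take the leading hash bytes
    code = []
    for _ in range(7):                      # 62^7 ≈ 3.5 trillion codes
        n, r = divmod(n, 62)
        code.append(BASE62[r])
    return "".join(code)
```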


  2. GET request path


When the user requests a redirect using the short_url, the request first goes to the CDN, which caches the most frequently used URLs. On a hit, the request does not reach the backend and the redirect is returned directly.

On a miss, the request reaches the Mapping Service, which checks the cache for the original_url given the short_id; if found, it is returned and stored in the CDN.

If the cache also misses, a DB call is made and the cache is updated (read-through policy).
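The read-through lookup above can be sketched with plain dicts standing in for the cache and KV store (the class and names are illustrative, not part of the design):

```python
class ReadThroughCache:
    """Check the cache first; fall back to the DB and populate on a miss."""

    def __init__(self, db: dict):
        self.db = db         # stand-in for the KV store
        self.cache = {}      # stand-in for Redis/Memcached
        self.hits = self.misses = 0

    def get(self, short_id: str):
        if short_id in self.cache:
            self.hits += 1
            return self.cache[short_id]
        self.misses += 1
        long_url = self.db.get(short_id)      # the real DB call
        if long_url is not None:
            self.cache[short_id] = long_url   # read-through: fill the cache
        return long_url
```

With 80-90% of requests hitting the cache, only the cold tail generates DB reads.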


Caching Strategy:

  1. The cache holds the URLs accessed in the last 30 days, targeting an 80-90% hit rate.
  2. Cache keys have a TTL of 5 minutes.
  3. An LRU eviction policy removes the least recently used keys.
  4. A CDN caches the most popular URLs at edge locations, so those requests never reach the backend, which helps scalability.
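Points 2 and 3 combine into one structure. A sketch of an LRU cache with the 5-minute TTL, using `OrderedDict` (the class is illustrative; in production this would be Redis with `maxmemory-policy allkeys-lru` or similar):

```python
import time
from collections import OrderedDict

class LRUTTLCache:
    """LRU eviction plus a per-entry TTL, as in the caching strategy above."""

    def __init__(self, capacity: int, ttl_seconds: float = 300.0):
        self.capacity = capacity
        self.ttl = ttl_seconds
        self.items = OrderedDict()   # key -> (value, expires_at)

    def put(self, key, value, now=None):
        now = time.monotonic() if now is None else now
        self.items[key] = (value, now + self.ttl)
        self.items.move_to_end(key)
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)    # evict least recently used

    def get(self, key, now=None):
        now = time.monotonic() if now is None else now
        entry = self.items.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if now > expires_at:
            del self.items[key]               # TTL expired
            return None
        self.items.move_to_end(key)           # mark as recently used
        return value
```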



Detailed Component Design

  1. Load Balancer: a Layer 4 load balancer with a round-robin algorithm is a good fit, ensuring even traffic distribution across instances. Operating at Layer 4 lets it handle more traffic, and it also performs TLS termination.
  2. API Gateway: operates at Layer 7 and forwards requests based on the route. Handles rate limiting using the token bucket algorithm for protection against DDoS, and handles authentication and authorization of users via RBAC. Responses can also be cached for lower latency.
  3. Shortening Service: generates a short_id from the first 6-8 characters of the long_url's hash. This is used as the unique key mapping to the original URL in the KV DB. To ensure uniqueness, the hash can be combined with a counter tracking the next index to be filled in the DB. The short_id can also be prefixed with a namespace, such as the org name sent in the request payload.
  4. Mapping Service: checks the DB for a mapping between the short_id and the long_url. If none exists, it returns 404. If the current timestamp is past the expiry timestamp, it also returns 404, with the error message "Link Expired", to deal with stale links.
  5. Message Queue (Bull Queue, SQS, Kafka): acts as a buffer between the Shortening Service and the DB. If 20,000 create requests arrive in one second, the queue absorbs the burst and workers drain it at a steady 200 req/sec, keeping DB write load constant and low.
  6. Database Design: a key-value NoSQL DB such as DynamoDB is used. It is reliable and highly available thanks to replicas across 3 AZs.
  7. Cache: uses an LRU eviction policy to remove keys. Mappings used in the past 30 days are cached so that 80-90% of requests hit the cache.
  8. CDN: caches the most frequently used short-to-long URL mappings, so requests for the most popular links never reach the backend.
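The token bucket mentioned for the API Gateway (point 2) is simple to sketch. Tokens refill at a fixed rate up to a burst capacity; a request that finds no token gets a 429. The free-tier numbers (100 req/s) come from the rate-limit section above:

```python
class TokenBucket:
    """Token-bucket rate limiter, as assumed at the API Gateway."""

    def __init__(self, rate: float, capacity: float):
        self.rate = rate             # tokens refilled per second
        self.capacity = capacity     # maximum burst size
        self.tokens = capacity       # start full
        self.last = 0.0

    def allow(self, now: float) -> bool:
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False   # caller responds 429 with a Retry-After header
```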


Problems that might occur:

  1. Hash collisions: two long_urls may produce the same short_id. In that case, first look up the DB to see whether a mapping already exists for the candidate short_id; if it does, change the hash function's parameters (e.g. add a salt) and generate a new one. Alternatively, keep a counter of the next index to be filled, and fold its value into the short_id calculation to guarantee uniqueness.
  2. Split-brain scenario: use a master-replica DB architecture (1 master, 3 replicas), where all writes go to the master and the replicas are eventually synced with recent updates. When the master node fails, one of the replicas is promoted to master. The microservices are deployed on EC2 instances in auto-scaling groups with minimum and maximum capacity configured, ensuring high availability and scalability. For transient errors, retries with exponential backoff are implemented in the services when calling the message queue and database, and in the queue workers while polling for jobs. Failed jobs can be re-enqueued, or routed to a dead-letter queue (DLQ) for async reprocessing.
  3. Single point of failure: the load balancer can be a single point of failure. To avoid this, run standby load balancer instances that receive traffic only when the primary goes down, ensuring availability.
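The collision-handling loop from point 1 can be sketched as below. A plain dict stands in for the KV store, and `code_for` is a hypothetical helper (here a truncated SHA-256 hex digest) standing in for whatever hash the Shortening Service uses:

```python
import hashlib

def code_for(url: str, salt: str) -> str:
    """Hypothetical 7-char code derived from a salted hash of the URL."""
    return hashlib.sha256((salt + url).encode()).hexdigest()[:7]

def assign_short_id(long_url: str, db: dict, max_attempts: int = 5) -> str:
    """Look the candidate up in the DB; rehash with a new salt if taken."""
    for attempt in range(max_attempts):
        candidate = code_for(long_url, str(attempt))
        existing = db.get(candidate)
        if existing is None:
            db[candidate] = long_url   # in production: a conditional/atomic write
            return candidate
        if existing == long_url:
            return candidate           # this URL was already shortened
    raise RuntimeError("could not find a free short_id")
```

Note that in a real KV store the check-then-write must be a single conditional put (e.g. "insert if absent") to avoid a race between two concurrent creators.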


Schema

short_url: string

long_url: string

userId: uuid

created_at: timestamp

expires_at: timestamp


PS: This is not the exact solution, I'm learning like y'all. Open for suggestions.

