Requirements


Functional Requirements:


  • Create a short URL for a given long URL.
  • Return the long URL associated with a given short URL.



Non-Functional Requirements:


  • low latency in redirect
  • horizontal scalability
  • CAP, consistency for url shortener
  • high availability for creating short-url


API Design

Define the APIs expected from the system. This is your chance to analyze and define the read and write paths so that you can come up with the high-level design...


for creating short url:

POST /urls/?alias

{

long_url:

}


response:{

status: 200 ok

short_url:

}



for redirect using short url:

GET /urls

{

short_url:

}


response:{

status: 302 temporary

message: redirected to orginal url

}



here status: 302 temporary, this will ensure that caching will increase redirect speed, as db need not be queried every time


temporary storage will ensure querying db after cache is evicted, so that the logs during redirect using db , can be used for monitoring and analyti


High-Level Design

Describe the overall system architecture. Identify the main components needed to solve the problem end-to-end. Use the diagramming tool to create a block diagram.


components:

client

CDN

API-Gateway

Load Balancer

Rate Limiter

url-shortener-service

unique-id generator

redis cluster to store counter for unique-id

postgresql-db for url details

cache using redis

redirect service

analytics service


url-shortening service:

client: requests for short url, using POST /urls/?alias, and received response 200 ok, if request is valid

CDN: if the requested url is present in CDN, data is returned to client from CDN itself

else the request is routed via API gateway to url-shortenter service.

API-Gateway: here authentication and authorization as well as request routing to the desired service takes place

load balancer: if we scale our services horizontally, we would need a load balancer, for request distribution

rate-limiting: to prevent servers from getting overwhelmed with large number of requests, we will need a rate limiter. if rate limit exceeds we return the error saying too many requests please retry later

else the request finally reaches url-shortening service.


the server checks for cache and postgresql-db if that particular url has already been requested for shortening, if so we return same data from db, and update cache if it was a cache miss


otherwise we use

unique-id generator service to generate a unique id and update the postgres db and also set ttl on data, to allow room for newer urls, by removinf expired data form db


redirect-service:

client: requests for redirect using GET request

CDN: if the requested url is present in CDN, data is returned to client from CDN itself

API-Gateway: here authentication and authorization as well as request routing to the desired service takes place

load balancer, for request distribution

rate-limiting: to prevent servers from getting overwhelmed with large number of requests, we will need a rate limiter. if rate limit exceeds we return the error saying too many requests please retry later

else the request finally reached redirect service

redirect service checks for long or original url from cache.

if cache hit, redirect to url using that data

else if cache miss, we get data from postgresql-db and update cache as well


in case of failure, we send the logs to analytics service to analyze the failures and monitor the application.






Detailed Component Design

Deep dive into 2-3 key components. Explain how they work, how they scale, discuss tradeoffs, capacity, and any relevant algorithms or data structures.


url shortening:

let's say there are 10^9 i.e. 1 billion req/day => 10^9/ (24*3600)>10^9/ (25*4000) > 10^9/10^5 > 10^4 req/s


say reads: writes = 100:1

=> redirect requests: 10^6 req/s


say 20% traffic req are served by CDN itself

so req to (U)url-shortening-service: 2000req/s && (R)redirect-service: 2x10^5 req/s


these many requests cannot be handled by 1 server,

1req => 8byte data

U => 16000 byte/s => 0.016 mb/s

R => 16 x 10^5 byte/s => 1.6 mb/s


to limit this we need horizontal scaling and rate limiting


also postgresql-db can only handle 2-5k connections at a moment, so choosing redis cache over postgresql-db would make more sence, as it can handle upto 10 k connectinos at a time


but still for redirect we have 100x more req, so this would still need db to be horizontally scaled, even if caching reduces the load by say 10%


so redis needs to have replicas, and to avoid write conflicts we can have write on one node and rest can handle read requests.

say we distribute load among 10 read replicas, to prevent b from overwhelming with connections we can set rate limit of total 10^5 req/s for redirect


for url-shortening service, there's a catch. even though server would have enough capacity to handle the load, it may not have sufficient hashes to address all requests.


say we maintain a counter, which we increment each time a new req is made and then hash it if we take size 5 and 64 bits => 64^5 counters in total around 10^7 to 10^8


to allow this we can set a ttl of around 2 days on each req to allow room for more urls

and to be able accomodate req for 2 days => 2* 10^5 s


around 10 req/s



now if the limits have reached, we will graceful handle the failure by saying retry later

if one service is failing due to limit issue, we will use bulkhead pattern to mnimize impact on overall service as well as use circuit breakers.


Apart from this to handle transient failures, we can add exponential retries with some jiitter.


algo for unique id generation:

possible char in encoded string: [a-z], [A-Z], [0-9] i.e. 26+26+10 = 62

hence we will maintain a counter and store it in redis cluster(with replicas to avoid single point of failure), then we encode it in bas62, similarly for redirect we can decode string back from base62 to the num.


since anyone can easily encode an int to base62, for security purpose, we will append a random string as a prefix to the encoded string and store that random string in db


bottleneck here is that there are stillk chances of collison, in that case we can do retries and try increasing the counter, even after the retry limit exceeds, if we face collision, then we return a meaningful error to client that the request cannot be currently fulfilled pelase retry later.


hence it becomes to evict cache and remove old data from db,

for cache eviction, we can use Least recently used cache eviction straegy.


also for reduced latency, we can use write behind strategy for url-shortener and read through for redirect service.

the trade-off here is that, we are favouring eventual consistency and in case cache db goes down or crashes, db stays inconsistent.

to avoid this, and ensure db inconsistency, we will have multiple replicas of cache, so that we have a replica to failover on, in case one replica goes down.


for security and to prevent DDOs attacks, the data in cache and our db where all info is stored will be encrypted and the protocol use for the requests would to https.

Apart from this, for making the servers more secure, we will add a reverse-proxy before original server, so that the identity of original servers remain hidden.


also to reduce latency further, we can add a separate cache-db for most popular urls or push the viral urls data to CDNs to prevent fanout services for other urls from getting overwhelmed.