Designing a Simple URL Shortening Service: A TinyURL Approach - System Design

Requirements

Functional Requirements:

Create a short URL for a given long URL.
Return the long URL associated with a given short URL.

Non-Functional Requirements:

Low latency, scalability, reliability, and high availability.

API Design

The service exposes a REST API. A POST request to v1/makeItTiny takes a long URL and responds with a Base64-encoded ID for it, such as a245B. The original URL can then be reached at ourDomainName.com/Base64encodedString.

We may also add another API that returns the original URL for a given Base64-encoded string: for example, a GET request to /encodedString responds with the original URL plus its expiry timestamp. This is not strictly necessary, since the only requirement is to redirect to the original URL; such an API would be used only internally, for example to clean up old, expired URLs.

v1/base64encodedTinyUrl -> responds with a 3xx redirect to the original URL

v1/tinyUrl/analytics -> GET request that returns analytics for the URL (possibly internal use only)
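
The create and redirect endpoints above can be sketched as plain Python functions. This is a minimal sketch under assumptions, not the service's actual implementation: the function names make_it_tiny and redirect, the in-memory dict standing in for the database, the placeholder short codes, and the specific status codes 201, 302, 400 and 404 are all illustrative choices (the notes only say the redirect is a 3xx).

```python
from itertools import count
from urllib.parse import urlsplit

# Assumption: an in-memory dict stands in for the database.
_store: dict[str, str] = {}   # short code -> long URL
_ids = count(1)               # placeholder for the ID generator service


def is_valid_url(url: str) -> bool:
    """Accept only absolute http(s) URLs that have a host."""
    try:
        parts = urlsplit(url)
    except ValueError:        # e.g. a malformed IPv6 host
        return False
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def make_it_tiny(long_url: str) -> tuple[int, dict]:
    """POST v1/makeItTiny: return (status code, response body)."""
    if not is_valid_url(long_url):
        return 400, {"error": "invalid url"}
    code = f"c{next(_ids)}"   # placeholder code, not the real encoding
    _store[code] = long_url
    return 201, {"code": code}


def redirect(code: str) -> tuple[int, dict]:
    """GET v1/<code>: return (status code, response headers)."""
    long_url = _store.get(code)
    if long_url is None:
        return 404, {}
    return 302, {"Location": long_url}
```

A caller would POST a long URL, read the code from the response body, and later follow the Location header returned for that code.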

High-Level Design

At a high level, a client connects to the service either to create a short URL or to follow one. The request first goes to the CDN, which serves it directly if the response is cached. Otherwise it reaches the load balancer, which routes it to one of the API gateways based on the client's location and the current server load (each region has its own API gateways or services). Each API gateway has two caches: an L1 Redis cache and an L0 cache local to the API service. On a cache miss, the API reads the record from the database, populates the caches, and returns the response. In short: request -> CDN -> load balancer -> API gateways (L1 Redis cache + L0 local API cache) -> database.
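
The read path through the two API-side caches can be sketched as follows. This is an illustration under assumptions: plain dicts stand in for Redis (l1) and the database (db), the names LRUCache and lookup are hypothetical, the L0 cache is checked before L1, and LRU eviction is used for the local cache as mentioned in the caching notes further down.

```python
from collections import OrderedDict


class LRUCache:
    """Small in-process LRU cache (the L0 local cache in this sketch)."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self._items = OrderedDict()   # key order = least to most recently used

    def get(self, key):
        if key not in self._items:
            return None
        self._items.move_to_end(key)          # mark as most recently used
        return self._items[key]

    def put(self, key, value) -> None:
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)   # evict the least recently used


def lookup(code, l0, l1, db):
    """Return the long URL for `code` or None, filling caches on the way back."""
    url = l0.get(code)
    if url is not None:
        return url                  # L0 hit
    url = l1.get(code)
    if url is None:
        url = db.get(code)          # both caches missed: go to the database
        if url is None:
            return None             # unknown short code
        l1[code] = url              # populate L1 (Redis stand-in)
    l0.put(code, url)               # populate L0
    return url
```

After the first lookup for a code, later lookups are answered from L0 without touching L1 or the database until the entry is evicted.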

The database can be PostgreSQL, a relational database.

The API service also connects to an ID generator service, which can be scaled independently. The ID generator takes an ID from the database and turns it into a unique Base64-encoded string. If the system is highly distributed, the ID generator can instead combine regionID + machineID + timestamp to produce a unique string.
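
Both ID strategies can be sketched in a few lines. Everything here is an assumption made for illustration: encode_id writes the numeric ID in base 64 using the URL-safe alphabet from RFC 4648 (one reading of "Base64 encoded", not byte-wise Base64), and SnowflakeGenerator uses a hypothetical bit layout of 41 bits of milliseconds, 5 bits of region, 5 bits of machine and a 12-bit sequence counter. The sequence counter is not in the notes above; it is added so that two IDs issued in the same millisecond still differ.

```python
import threading
import time

# URL-safe 64-character alphabet (same characters as RFC 4648 base64url).
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_"


def encode_id(n: int) -> str:
    """Write a non-negative integer in base 64 using ALPHABET."""
    if n < 0:
        raise ValueError("id must be non-negative")
    if n == 0:
        return ALPHABET[0]
    chars = []
    while n:
        n, rem = divmod(n, 64)
        chars.append(ALPHABET[rem])
    return "".join(reversed(chars))


def decode_id(s: str) -> int:
    """Inverse of encode_id."""
    n = 0
    for ch in s:
        n = n * 64 + ALPHABET.index(ch)
    return n


class SnowflakeGenerator:
    """Snowflake-style IDs: timestamp | region | machine | sequence."""

    def __init__(self, region_id: int, machine_id: int,
                 epoch_ms: int = 1_700_000_000_000) -> None:
        if not (0 <= region_id < 32 and 0 <= machine_id < 32):
            raise ValueError("region_id and machine_id must fit in 5 bits")
        self._region = region_id
        self._machine = machine_id
        self._epoch = epoch_ms        # assumed custom epoch (late 2023)
        self._lock = threading.Lock()
        self._last_ms = -1
        self._seq = 0

    @staticmethod
    def _now_ms() -> int:
        return time.time_ns() // 1_000_000

    def next_id(self) -> int:
        with self._lock:
            # Never go backwards, even if the wall clock does.
            now = max(self._now_ms(), self._last_ms)
            if now == self._last_ms:
                self._seq = (self._seq + 1) & 0xFFF
                if self._seq == 0:
                    # 4096 IDs already issued this millisecond: wait for the next.
                    while now <= self._last_ms:
                        now = self._now_ms()
            else:
                self._seq = 0
            self._last_ms = now
            return (((now - self._epoch) << 22) | (self._region << 17)
                    | (self._machine << 12) | self._seq)
```

A database-issued ID would go straight through encode_id, while a distributed deployment would call encode_id(generator.next_id()) inside each API service.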

We'll also have an async analytics service that collects analytics for individual URLs and for the system as a whole, such as URL hit rate.

Scaling horizontally is easy: we can add as many API services as we need and reconfigure the load balancer to distribute load across them.

Detailed Component Design

CDN -> The first line of defence against massive traffic, malicious actors, and so on. A content delivery network serves repeated requests from the cache nearest to the user, which keeps latency in check. It also shields our APIs and database from heavy load and helps with scaling.
Load balancer -> We run multiple servers and services (our API) in different locations so that users are served with acceptable latency. For this to work, a load balancer routes each request to the nearest service, based on load and on the geographic distance from where the request originated.
APIs -> The heart of the system. They hold the logic that converts a URL into a Base64-encoded string and the logic that redirects a tiny URL to the original URL. Each API service has an L1 Redis cache and an L0 local cache, which keeps latency low, helps with scalability, and means only a few requests reach the database.
Database -> We choose PostgreSQL, a relational database, because writes are rare and reads dominate in this service. With most reads absorbed by the cache layers, a relational database can scale in our case to billions of users.
ID generator -> As explained above, it uses an ID generated by the database to produce a unique Base64-encoded string. If the ID generator is down, or the system becomes highly distributed, the API services can generate unique IDs themselves with a Snowflake-style regionID + machineID + timestamp scheme (ideally with a centralized time source). This scheme yields unique IDs across a highly distributed system without any central authority or service.
Caching - We have multiple cache layers: first the CDN, then the Redis layer in front of the APIs, then the local API caches. If all of them miss, we hit the database. We use an LRU eviction policy so that recently used URLs stay in cache longer. If we support TTLs, an async worker can invalidate cache entries and remove expired URL entries from the database (moving them to an archive database of old records if required).
Error handling - If a user submits an invalid URL, the API returns an appropriate error code, such as 400 Bad Request for a malformed URL or 404 Not Found for an unknown short URL.
Failures - If the CDN fails, we can fall back to another CDN or let requests go straight to the load balancer / API gateway. If Redis fails, we serve from the local API cache; if the API cache fails, we serve from the database; if the database fails, we serve from a replica or a backup database. For hot keys, we try to serve most requests from cache and use request coalescing so that a burst of requests for a single URL triggers at most one database fetch.
Scaling - This is easy in this design, as we can add as many API services behind the load balancer as we need.
Handling hot keys or burst traffic - We can use request coalescing: if a million requests arrive for a single URL, the first request fetches from the DB if required and populates the cache, and the remaining million - 1 requests are served from cache. We must also use rate limiting with some backoff to protect the system from malicious actors or unusual burst traffic.
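
Request coalescing, as described for hot keys, can be sketched with threads. This is one possible shape under assumptions rather than a definitive implementation: the class name SingleFlight, its do method, and the load callback (standing in for the database fetch that also populates the cache) are hypothetical names. The first caller for a key performs the load; callers that arrive while it is in flight wait and share its result.

```python
import threading


class _Call:
    """State shared by everyone waiting on one in-flight load."""

    def __init__(self) -> None:
        self.done = threading.Event()
        self.value = None
        self.error = None


class SingleFlight:
    """Coalesce concurrent loads of the same key into a single call."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._inflight = {}   # key -> _Call

    def do(self, key, load):
        with self._lock:
            call = self._inflight.get(key)
            leader = call is None
            if leader:
                call = _Call()
                self._inflight[key] = call
        if leader:
            try:
                call.value = load(key)        # only the leader hits the DB
            except Exception as exc:
                call.error = exc              # followers see the same failure
            finally:
                with self._lock:
                    del self._inflight[key]
                call.done.set()               # wake up the followers
        else:
            call.done.wait()                  # follower: wait for the leader
        if call.error is not None:
            raise call.error
        return call.value
```

Requests that arrive after the load has finished are not coalesced; they are expected to hit the cache that the first request populated.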