Requirements


Functional Requirements:


  • Create a short URL for a given long URL.
  • Return the long URL associated with a given short URL.



Non-Functional Requirements:


  • Scalable
  • Highly available
  • Low latency (responses within about 200 ms)
  • Zero data loss
  • Fault tolerant



API Design

POST tinyurl.com/create/long_url

DELETE tinyurl.com/delete/short_url

GET tinyurl.com/short_url
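
As a rough illustration of how the three endpoints above could be told apart, here is a minimal routing sketch. The function name `route` and the action labels are hypothetical, and it assumes the long URL is percent-encoded when it is placed in the path.

```python
from typing import Tuple
from urllib.parse import unquote


def route(method: str, path: str) -> Tuple[str, str]:
    """Map an HTTP method and path to an (action, argument) pair."""
    path = path.lstrip("/")
    if method == "POST" and path.startswith("create/"):
        # Assumption: the long URL arrives percent-encoded in the path.
        return "create", unquote(path[len("create/"):])
    if method == "DELETE" and path.startswith("delete/"):
        return "delete", path[len("delete/"):]
    if method == "GET" and path and "/" not in path:
        # Anything that looks like a bare key is treated as a short URL.
        return "redirect", path
    return "not_found", path
```

For example, `route("GET", "/abc1234")` yields `("redirect", "abc1234")`.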


High-Level Design

The system consists of an API Gateway, the tinyURL Service, a cache, an Encoder Service, and a database. The API Gateway handles load balancing, request forwarding, authentication, and rate limiting. The tinyURL Service is the main service: it hosts the APIs and acts as the bridge between users and the backend system. The Encoder Service encodes a long URL into a short URL. For the create API, the flow is API Gateway, tinyURL Service, cache, encoder, database, and then the short URL is returned to the user. For a GET request, the flow is API Gateway, tinyURL Service, cache, database, and the long URL is returned with a 303 HTTP code.




Detailed Component Design

1) Create API (Write) Flow: The request arrives at the API Gateway, which acts as the load balancer and performs authentication and rate limiting. It is then forwarded to the tinyURL Service, which checks whether a mapping for the long URL already exists in the cache or the database; if so, it returns the existing short URL. Otherwise, the service passes the request to the encoder, which encodes the long URL and returns the short URL. The tinyURL Service saves the short URL and original URL mapping, along with details such as user_id, created_dt, and expiry, into the database, and can add it to the cache as well.
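
The write path described above can be sketched roughly as follows. This is a minimal in-memory illustration, not a definitive implementation: the class names `TinyUrlService` and `Mapping`, the `ttl_days` parameter, and the use of plain dictionaries in place of a real cache and database are all assumptions. The encoder is injected as a callable so that any encoding scheme can be plugged in.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone
from typing import Callable, Dict


@dataclass
class Mapping:
    short_url: str
    long_url: str
    user_id: str
    created_dt: datetime
    expiry: datetime


@dataclass
class TinyUrlService:
    encode: Callable[[str], str]  # stand-in for the Encoder Service
    cache: Dict[str, str] = field(default_factory=dict)         # long_url -> short_url
    database: Dict[str, Mapping] = field(default_factory=dict)  # long_url -> stored row

    def create(self, long_url: str, user_id: str, ttl_days: int = 365) -> str:
        # Step 1: reuse an existing mapping, checking the cache before the database.
        if long_url in self.cache:
            return self.cache[long_url]
        row = self.database.get(long_url)
        if row is not None:
            self.cache[long_url] = row.short_url
            return row.short_url
        # Step 2: no mapping yet, so ask the encoder for a short URL.
        short_url = self.encode(long_url)
        # Step 3: persist the mapping with its metadata, then warm the cache.
        now = datetime.now(timezone.utc)
        self.database[long_url] = Mapping(
            short_url, long_url, user_id, now, now + timedelta(days=ttl_days)
        )
        self.cache[long_url] = short_url
        return short_url
```

Calling `create` twice with the same long URL invokes the encoder only once; the second call is served from the cache.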

2) Get API (Read) Flow: The request arrives at the API Gateway and is forwarded to the tinyURL Service, which checks the cache first. On a hit, the long URL is returned with a redirect HTTP code; on a miss, the service reads the mapping from the database and then redirects. The tinyURL Service also writes an event to a message queue so that analytics such as click counts and user region can be built. An analytics service consumes these events, builds the analytics, and saves them into an OLAP DB.
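
A minimal sketch of the read path, assuming a cache-aside lookup and an in-process `queue.Queue` standing in for the message queue. The function name `resolve`, the event fields, and the 404 response for an unknown short URL are assumptions made for illustration; the 303 status is the one named in the high-level design.

```python
import queue
from typing import Dict, Optional, Tuple

REDIRECT_STATUS = 303  # redirect code named in the high-level design


def resolve(
    short_url: str,
    cache: Dict[str, str],
    database: Dict[str, str],
    events: "queue.Queue[dict]",
    region: str = "unknown",
) -> Tuple[int, Optional[str]]:
    """Return (status, long_url) for a short URL and emit an analytics event."""
    long_url = cache.get(short_url)
    if long_url is None:
        # Cache miss: fall back to the database.
        long_url = database.get(short_url)
        if long_url is None:
            return 404, None  # assumption: unknown keys get a 404
        cache[short_url] = long_url  # populate the cache for later reads
    # Fire-and-forget event for the analytics service to consume.
    events.put({"short_url": short_url, "region": region})
    return REDIRECT_STATUS, long_url
```

Publishing the event to a queue keeps analytics work off the redirect path, so a slow analytics consumer does not add latency to reads.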

3) Encoder: The encoder converts a long URL into a short URL either by taking one of a pool of pre-generated fixed-length strings or by hashing the long URL and base62-encoding the result. It is also responsible for handling key collisions. Base62 uses the characters [a-z, A-Z, 0-9] for encoding.
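
One way the hash-then-base62 option could look is sketched below. The key length of 7, the choice of SHA-256, the `taken` set standing in for a uniqueness check against the database, and the retry-with-salt collision strategy are all assumptions for illustration, not something the design above fixes.

```python
import hashlib
import string
from typing import Set

# a-z, A-Z, 0-9: 62 symbols in total
ALPHABET = string.ascii_lowercase + string.ascii_uppercase + string.digits
KEY_LENGTH = 7  # assumed short-key length


def base62(n: int) -> str:
    """Encode a non-negative integer using the 62-character alphabet."""
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n > 0:
        n, rem = divmod(n, 62)
        digits.append(ALPHABET[rem])
    return "".join(reversed(digits))


def short_key(long_url: str, taken: Set[str]) -> str:
    """Derive a fixed-length key, re-hashing with a counter on collision."""
    attempt = 0
    while True:
        material = f"{long_url}#{attempt}".encode("utf-8")
        digest = int.from_bytes(hashlib.sha256(material).digest(), "big")
        # Keep the last KEY_LENGTH symbols; pad in the unlikely short case.
        key = base62(digest)[-KEY_LENGTH:].rjust(KEY_LENGTH, ALPHABET[0])
        if key not in taken:
            taken.add(key)
            return key
        attempt += 1  # collision: change the hashed material and retry
```

With 7 characters drawn from 62 symbols there are 62^7 = 3,521,614,606,208 possible keys, roughly 3.5 trillion.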

4) Cache: Mappings are cached with a TTL, and an eviction policy such as LRU or LFU can be configured.
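
To make the TTL plus LRU combination concrete, here is a small in-process sketch. A production setup would more likely rely on the equivalent settings of a dedicated cache server; the class name `TtlLruCache` and the injectable `clock` parameter are assumptions for illustration.

```python
import time
from collections import OrderedDict
from typing import Callable, Optional


class TtlLruCache:
    """Bounded cache: entries expire after a TTL, and the least recently
    used entry is evicted when the capacity is exceeded."""

    def __init__(self, capacity: int, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self.capacity = capacity
        self.ttl = ttl_seconds
        self.clock = clock
        self._items: "OrderedDict[str, tuple]" = OrderedDict()  # key -> (value, expires_at)

    def get(self, key: str) -> Optional[str]:
        entry = self._items.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self.clock() >= expires_at:
            del self._items[key]  # expired: drop it and report a miss
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return value

    def put(self, key: str, value: str) -> None:
        self._items[key] = (value, self.clock() + self.ttl)
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # evict the least recently used entry
```

LRU suits workloads where recently created links get most of the traffic, while LFU tends to favor a stable set of very popular links.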


5) Database: We can use a SQL database server with a primary and replica setup. If the scale is very high, the database can be sharded.
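
A rough sketch of how requests might be routed once the database is sharded, assuming the short URL is the shard key, writes go to the shard's primary, and reads rotate across its replicas. The names `Shard`, `shard_index`, and the node labels are hypothetical.

```python
import hashlib
import itertools
from dataclasses import dataclass
from typing import List


@dataclass
class Shard:
    primary: str
    replicas: List[str]

    def __post_init__(self) -> None:
        # Round-robin over replicas; fall back to the primary if there are none.
        self._next_replica = itertools.cycle(self.replicas or [self.primary])

    def node_for(self, write: bool) -> str:
        """Writes go to the primary; reads are spread across replicas."""
        return self.primary if write else next(self._next_replica)


def shard_index(short_url: str, shard_count: int) -> int:
    """Pick a shard by hashing the short URL (simple modulo placement)."""
    digest = hashlib.sha256(short_url.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count
```

Modulo placement is the simplest option, but changing the shard count remaps most keys; consistent hashing is a common alternative when shards are added or removed often.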